I recently ran across what I thought was an interesting thought exercise:
A conveyor belt transports oranges and bananas. A downward facing camera takes digitized pictures of the fruit. There is one image per piece of fruit on the belt. The images are 2 bit. You get a "0" for the conveyor belt, and a "1" for the fruit. The task is to determine the number of oranges and bananas that pass under the camera.Solution #1
The most obvious solution is to base your answer on the whether the polygon represented by the "1" bits posesses any concavity. If it's convex, then it's an orange. If it has any concavity, then it's a banana. My best guess on how to do this would to be determine if the line drawn between any boundary point goes outside of the area of the fruit. To do this, you'd first need to find the boundary points; the edges of the fruit. You could find these by say taking Y=0...n and for each value of Y, increment X=0 until you hit a fruit point. Then, do the same thing with X=m...0 decrementing X (visually, coming in from each side of the fruit). You could step Y by 2 or 5 or 10 or something to speed things up and still get an approximation of the polygon representing he fruit:
You can generate a vector image of the fruit's outline by connecting successive boundary points. Now, for each boundary point, draw a line to every other boundary point. If any of those lines intersect the lines defining the edges of the polygon, then the object has concavity and is banana:
The above solution in O(n squared). We'd like to improve that.Solution #2
What if the banana is not concave like some of the more exotic banana types that are less common in American grocery stores (I think these are called Chiquita bananas)? What if the object is not a banana, but say a mango (or any other not quite round, but not convex fruit)? Again, assume that we have the vector representation of the object. What if we knew the length and width of the object? We know that for a roughly spherical object, l / w = ~1. We can use this and say that if the ratio of the length over the width is equal to one within some tolerance, that the object is an orange, otherwise, it's a banana. How do we define the length and width of a polygon? Imagine we can draw two parallel lines over the bitmap image. We step them towards each other until they both intersect the polygon. The length is the maximum distance between parallel lines we can achieve if we repeat this process rotating the polygon 360 degrees (or changing the starting position of the lines 360 degrees around the polygon). The width is the minimum value. We can now compute the length and width, and the ratio of the two. We can say if l / w < 1.3 (or something), that the object is an orange, else it's a banana.
This solutiion is O(n), assuming we rotate the polygon 360 degrees in n steps.Solution #3
Another solution contributed by a reader ...
Count the number of pixels consumed by the object. Now make a copy of the image, and rotate it 90 degrees. XOR the bits in the original with the rotated copy. The result is an image that contains the bits that are 1's in the original but not in the rotated. If the object is a perfect circle, then there should be no 1's in the XOR'd image. Of course an orange is not a perfect circle, so you'd want to compare with some tolerance (some small percent of the original 1 bits remain). A non-circular object would contain a much larger proportion of 1 bits in the XOR'd image.
This assumes that the object is perfectly centered in the larger image. This is not likely, so before attempting this you'd need to find the boundary square for the object, and only rotate and XOR that subset of the larger image.Solution #4
Also submitted by a reader ...
Start with the assumption that object is roughly circular, and try to prove that it's not. Find the actual
area consumed by the object by counting the 1 bits in the image. Find the bounding rectangle. If the object is circular, then the bounding rectangle is actually a square. If so, we know the diameter d of the circle is equal to the length of the sides of the square. We know that the area of the circle is pi\*r squared, so we can compute the expected
area based on the assumption that the object is circular. If the object is truly circular, then the actual area should be equal to the expected area within some tolerance. If so, then we assume it's an orange, else it's a banana.
Any corrections or better solutions would be appreciated.