Thursday, April 27, 2006
Honduras Report
Here's a short video from the Honduras trip: [ fish5.avi ]
Watch it, and you will see why I haven't gotten any major new results in shape classification this semester. Changes in lighting, camera jitter, the small size of the fish relative to the image, and an abundance of texture (but not color) on which to segment make extracting fish contours an extremely challenging problem. Color segmentation (e.g. with pyramidal flood-filling) does a terrible job: underwater, most objects are some shade of aquamarine, and the texture varies so much when a fish is seen against the sea floor that the resulting contours are based almost exclusively on local texture patterns.


Background subtraction also does a bad job, since the camera is moving too quickly:

Even motion subtraction doesn't help us find the fish, which should be moving differently from the rest of the image. Here I affine-warp the previous frame to fit the current frame as closely as possible (using the flow field from sparse pyramidal Lucas-Kanade optical flow) and subtract. Because of the large range of scales across the image and the high degree of occlusion, many pixels other than the fish stand out in the difference image.
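To make the idea concrete, here is a minimal sketch of the "fit one global affine motion, then look at what disagrees with it" step. It is not my OpenCV code: it is plain Python, it works on sparse point correspondences rather than whole frames, and the function names and the threshold are made up for illustration. It assumes you already have tracked points from the previous and current frames.

```python
import math


def solve3(a, b):
    """Solve a 3x3 linear system a*x = b by Gaussian elimination with pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            factor = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in (2, 1, 0):
        s = m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / m[r][r]
    return x


def fit_affine(prev_pts, curr_pts):
    """Least-squares affine map from prev_pts to curr_pts.

    Returns (row_x, row_y) such that x' = row_x . (x, y, 1) and
    y' = row_y . (x, y, 1).
    """
    # Normal equations (A^T A) p = A^T b, where each row of A is (x, y, 1).
    ata = [[0.0] * 3 for _ in range(3)]
    atb_x = [0.0] * 3
    atb_y = [0.0] * 3
    for (x, y), (u, v) in zip(prev_pts, curr_pts):
        row = (x, y, 1.0)
        for i in range(3):
            atb_x[i] += row[i] * u
            atb_y[i] += row[i] * v
            for j in range(3):
                ata[i][j] += row[i] * row[j]
    return solve3(ata, atb_x), solve3(ata, atb_y)


def motion_residuals(prev_pts, curr_pts, affine):
    """Distance between where the affine model predicts each point and where it went."""
    row_x, row_y = affine
    out = []
    for (x, y), (u, v) in zip(prev_pts, curr_pts):
        pred_u = row_x[0] * x + row_x[1] * y + row_x[2]
        pred_v = row_y[0] * x + row_y[1] * y + row_y[2]
        out.append(math.hypot(u - pred_u, v - pred_v))
    return out


def independently_moving(prev_pts, curr_pts, threshold):
    """Indices of points whose motion disagrees with the global affine fit."""
    affine = fit_affine(prev_pts, curr_pts)
    res = motion_residuals(prev_pts, curr_pts, affine)
    return [i for i, r in enumerate(res) if r > threshold]
```

On clean data the fish would be the only points with large residuals. The trouble with the Honduras footage is that parallax and occlusion break the single-affine assumption, so plenty of sea-floor points end up over any reasonable threshold too.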


At this point you may be saying to yourself: "but there aren't any fish in this image." In fact, there are two:

Humans can detect the fish only when they see them move. Thus, more sophisticated techniques for motion segmentation are needed, such as:
"Hierarchical Image-Motion Segmentation by Swendsen-Wang Cuts"
http://civs.stat.ucla.edu/Barbu_Research/Motion/index.html
I have also been playing around with some texture metrics for segmenting out the seabed and for finding and classifying coral. However, for the time being I am giving up on the Honduras videos because they are simply too difficult to process. Instead, I will generate shape models from the Cousteau video, where fish usually appear against the solid backdrop of the open sea and where lighting conditions are much more favorable. I also have a Hokuyo laser range finder (http://www.hokuyo-aut.jp/products/urg/urg.htm) to play with for 3D shape analysis--I worked out the math for 3D Procrustean shape analysis while we were in Honduras.
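For reference, the textbook form of the 3D Procrustes alignment looks roughly like the following. This is the standard SVD-based solution, not necessarily the derivation I worked out on the trip, and the symbols (X and Y as n-by-3 landmark matrices with corresponding rows) are my notation for this sketch.

```latex
% X, Y: n x 3 matrices of corresponding 3D landmarks (one row per point).
% Step 1: remove translation and scale from each shape.
% Step 2: find the rotation R minimizing \lVert \bar{X} R - \bar{Y} \rVert_F.
\begin{align*}
\mu_X &= \tfrac{1}{n} X^\top \mathbf{1}, &
\bar{X} &= \frac{X - \mathbf{1}\mu_X^\top}{\lVert X - \mathbf{1}\mu_X^\top \rVert_F}
  \quad \text{(and likewise for } \bar{Y}\text{)} \\
\bar{X}^\top \bar{Y} &= U \Sigma V^\top && \text{(singular value decomposition)} \\
R &= U \,\mathrm{diag}\bigl(1,\, 1,\, \det(U V^\top)\bigr)\, V^\top && \text{(proper rotation, no reflection)} \\
d(X, Y) &= \lVert \bar{X} R - \bar{Y} \rVert_F && \text{(partial Procrustes distance)}
\end{align*}
```

The determinant term only matters when the unconstrained optimum would be a reflection; otherwise R reduces to U V^T.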
While in Honduras, the main application I worked on was optical flow/visual odometry. I had an implementation in OpenCV before we left; unfortunately we were using a Blackfin DSP, so OpenCV had to be ported. This turned out to be much more difficult than anticipated, even though there is at least one reference on the web claiming to have ported OpenCV to the Blackfin (or at least there was a reference...a Google search for "opencv blackfin" turns up nothing now). The main problem is that the Blackfin has no hardware support for floating-point arithmetic, so all of OpenCV's math functions were extremely slow and had to be changed to use fixed point. Aside from over/underflow problems, some of the algorithms themselves were too slow--pyramidal sparse Lucas-Kanade optical flow was a little too pyramidal for the Blackfin's taste, it seems. In the end, after rewriting many of the image processing functions in OpenCV, the frame rate was down to a blazingly fast 1.5 Hz through the JTAG debugging interface (*sarcasm*). Ironically, when we tried to run the program without the JTAG, the Blackfin operating system complained. The most likely explanation I got from the hardware people on the trip was that the program was too big. *Sigh*
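For anyone who hasn't fought with fixed point before, here is a small sketch of what the conversion involves and where the over/underflow problems come from. It uses the common 16-bit Q15 format (1 sign bit, 15 fractional bits) purely as an assumption for illustration; it is Python rather than DSP code, and the helper names are made up.

```python
Q = 15                      # number of fractional bits (Q15 format, assumed)
ONE = 1 << Q                # the integer that represents 1.0
Q15_MAX = (1 << 15) - 1     # largest 16-bit signed value, just under 1.0
Q15_MIN = -(1 << 15)        # smallest 16-bit signed value, exactly -1.0


def saturate(v):
    """Clamp to the 16-bit range instead of silently wrapping around."""
    return max(Q15_MIN, min(Q15_MAX, v))


def to_q15(x):
    """Convert a float in roughly [-1, 1) to its Q15 integer representation."""
    return saturate(int(round(x * ONE)))


def from_q15(q):
    """Convert a Q15 integer back to a float (for checking results)."""
    return q / ONE


def q15_add(a, b):
    """Saturating add: 0.75 + 0.75 cannot be represented, so it clamps."""
    return saturate(a + b)


def q15_mul(a, b):
    """Multiply two Q15 values.

    The raw product has 30 fractional bits, so add half an LSB for rounding
    and shift right by 15 to get back to Q15.
    """
    return saturate((a * b + (1 << (Q - 1))) >> Q)
```

Every multiply needs the shift, every add needs a range check, and every intermediate value has to be rescaled so it stays inside [-1, 1). Doing that by hand across a library the size of OpenCV is where the time went.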
Finally, on the last night of the trip, I gave up on OpenCV and hacked together a non-optical-flow-based image registration algorithm which almost, sorta-kinda worked. More processing power (to increase the searchable motion space) and smarter image processing techniques (registering edge images, for example) should improve results, but I haven't had a chance to work on it further.
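The general flavor of that kind of registration is an exhaustive search over candidate shifts, as in the sketch below. To be clear, this is a reconstruction of the general technique and not the code from that night: it assumes translation-only motion, grayscale images stored as lists of rows, and a mean-absolute-difference score, and the function names are invented for the example.

```python
def mean_abs_diff(prev, curr, dx, dy):
    """Score a candidate shift: compare curr[y][x] with prev[y - dy][x - dx].

    Only pixels where both images are defined are used, and the sum is
    divided by the overlap size so large shifts are not favored unfairly.
    """
    h, w = len(prev), len(prev[0])
    total = 0
    count = 0
    for y in range(max(0, dy), min(h, h + dy)):
        for x in range(max(0, dx), min(w, w + dx)):
            total += abs(curr[y][x] - prev[y - dy][x - dx])
            count += 1
    return total / count if count else float("inf")


def register_translation(prev, curr, max_shift):
    """Try every integer shift within +/- max_shift and keep the best one.

    Returns (dx, dy) such that the content of prev appears in curr moved
    dx pixels right and dy pixels down.
    """
    best_score = float("inf")
    best_shift = (0, 0)
    for dy in range(-max_shift, max_shift + 1):
        for dx in range(-max_shift, max_shift + 1):
            score = mean_abs_diff(prev, curr, dx, dy)
            if score < best_score:
                best_score = score
                best_shift = (dx, dy)
    return best_shift
```

The cost grows with the square of `max_shift`, which is why the searchable motion space is limited by processing power. Running the same search on edge images instead of raw intensities would only change what is passed in as `prev` and `curr`.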
That's all for now! Now I remember why I usually don't post to my blog...it takes so much time!
