Monday, May 15, 2006

 

New fish video

fish_classification_pseg_bkd_et025.mpg


Found: one bug (in Mahalanobis distance computation)

Implemented: contour classification on background subtraction output (blue contours in the video; red contours still come from color segmentation)

Reduced: number of principal components (by about half)

Realized: for noisy data, you still need a couple of high-frequency principal components in order to perform robust classification.
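
One way to make that reduction repeatable is an eigenvalue-ratio cutoff. The sketch below is only an illustration, not the code used for the video: `keep_components` and `min_ratio` are hypothetical names, the first four eigenvalues are the ones reported in the May 13 post, and the fifth value is a made-up stand-in for a noise component at roughly 1/100th of the largest.

```python
def keep_components(eigenvalues, min_ratio):
    """Return indices of components whose eigenvalue is at least
    min_ratio times the largest eigenvalue."""
    if not eigenvalues:
        return []
    top = max(eigenvalues)
    return [i for i, lam in enumerate(eigenvalues) if lam >= min_ratio * top]


# First four values are from the May 13 post; the last one is hypothetical.
eigs = [0.084449, 0.048138, 0.031195, 0.016441, 0.00085]
print(keep_components(eigs, 0.15))  # -> [0, 1, 2, 3]
```

The cutoff is still a knob. Given the point above about noisy data, setting it too high would throw away the high-frequency components that robust classification seems to need.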

Saturday, May 13, 2006

 

Fish Finding

After porting the shape modeling code from MATLAB to OpenCV, I finally have results from the Cousteau video:

fish_classification_pseg.mpeg

The main limitation at this point is contour extraction: nearly all of the human-discernible fish-like contours that the algorithm extracted were classified correctly. Contours were extracted using a pyramid color segmentation algorithm; fish contours in the first third of the video were then labelled by hand to train the fish shape model.

Many of the contours were quite noisy:














Tangent space PCA was performed to get a set of principal components. Here are the mean shape and the effects of the first four principal components:


























The corresponding eigenvalues were:
eigenvalue 1 = 0.084449
eigenvalue 2 = 0.048138
eigenvalue 3 = 0.031195
eigenvalue 4 = 0.016441
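
For readers unfamiliar with the tangent space step that comes before the PCA, here is a minimal sketch of one common formulation (partial Procrustes tangent coordinates), not necessarily the exact one used here. It assumes each contour has already been resampled to the same number of points in corresponding order and is stored as complex numbers x + iy. `to_preshape` and `tangent_coords` are hypothetical names.

```python
import math


def to_preshape(points):
    """Remove translation and scale: centre the contour and give it unit norm."""
    n = len(points)
    centroid = sum(points) / n
    centred = [p - centroid for p in points]
    norm = math.sqrt(sum(abs(p) ** 2 for p in centred))
    if norm == 0:
        raise ValueError("degenerate contour: all points coincide")
    return [p / norm for p in centred]


def tangent_coords(points, mean_preshape):
    """Rotate the contour's pre-shape onto the mean, then project it onto
    the tangent space at the mean (the part orthogonal to the mean)."""
    z = to_preshape(points)
    # Hermitian inner product <z, mean>
    inner = sum(a * b.conjugate() for a, b in zip(z, mean_preshape))
    if abs(inner) == 0:
        raise ValueError("contour is orthogonal to the mean; rotation undefined")
    rotation = inner.conjugate() / abs(inner)  # unit complex number
    aligned = [rotation * a for a in z]
    # After alignment <aligned, mean> equals abs(inner), a real number.
    return [a - abs(inner) * m for a, m in zip(aligned, mean_preshape)]


# A rotated, scaled, shifted copy of the mean maps to (nearly) the origin.
rect = [0 + 0j, 2 + 0j, 2 + 1j, 0 + 1j]
mean = to_preshape(rect)
spin = complex(math.cos(0.7), math.sin(0.7))
moved = [3 * spin * p + (5 - 2j) for p in rect]
print(max(abs(c) for c in tangent_coords(moved, mean)))
```

PCA would then be run on the real and imaginary parts of these tangent vectors, stacked over all training contours.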

To my surprise, when I looked at the dataset after running the algorithm, there were 18 principal components in use, with the lowest eigenvalue being 1/100th of the highest eigenvalue. All but the top four principal components were simply modeling high-frequency noise. For example, the 18th principal component was:












Despite this massive overfitting, the model still performed quite well on the dataset. The classification rule was a thresholded Mahalanobis distance from the origin to the new shape (normalized and projected into the tangent space), with a penalty for projecting onto the eigenspace that was inversely proportional to the lowest eigenvalue. The threshold was set by hand to about 1/10th of the largest Mahalanobis distance in the training set, so not all of the hand-labelled fish in the training set were recognized as fish by the classifier.
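
The rule above can be sketched roughly as follows. This is one reading of it, not the original code: it assumes the components are orthonormal real vectors, and that the penalty is the squared residual left over after projecting onto the eigenspace, divided by the lowest eigenvalue. `mahalanobis_with_penalty` and `is_fish` are hypothetical names.

```python
import math


def mahalanobis_with_penalty(v, components, eigenvalues):
    """Distance of tangent vector v from the origin: Mahalanobis inside the
    eigenspace, plus residual energy weighted by 1 / (lowest eigenvalue)."""
    # Coefficients of v along each (assumed orthonormal) principal component
    coeffs = [sum(a * b for a, b in zip(v, comp)) for comp in components]
    in_space = sum(c * c / lam for c, lam in zip(coeffs, eigenvalues))
    # Reconstruct v from the eigenspace and measure what is left over
    recon = [sum(c * comp[i] for c, comp in zip(coeffs, components))
             for i in range(len(v))]
    residual_sq = sum((a - b) ** 2 for a, b in zip(v, recon))
    return math.sqrt(in_space + residual_sq / min(eigenvalues))


def is_fish(v, components, eigenvalues, threshold):
    """Accept the shape when its penalized distance is within the threshold."""
    return mahalanobis_with_penalty(v, components, eigenvalues) <= threshold


# Toy example: two axis-aligned components in a 3-D tangent space.
comps = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
lams = [4.0, 1.0]
print(mahalanobis_with_penalty([2.0, 1.0, 0.5], comps, lams))  # -> 1.5
```

Dividing the residual by the smallest eigenvalue treats everything outside the model as if it had the variance of the weakest component, so shapes far from the eigenspace are rejected quickly.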

In addition to reducing the number of principal components in the model, there are a few other improvements that can be made to the algorithm. First, learning a generative model of noise would enable an ML or MAP classifier, eliminating the need for a knob to control the maximum Mahalanobis distance. Second, contours from the thresholded background subtraction images should be analyzed to increase the detection rate. Finally, the mirror image of each contour should be classified, in order to detect flipped, non-symmetric shapes.
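
The mirror-image idea is simple to sketch. The snippet below assumes a contour is a list of (x, y) tuples and that `classify` is whatever single-contour classifier is in use. Both function names are hypothetical.

```python
def mirror_contour(points):
    """Reflect a contour about the vertical line through its centroid.
    The point order is reversed so the traversal direction is unchanged."""
    cx = sum(x for x, _ in points) / len(points)
    flipped = [(2 * cx - x, y) for x, y in points]
    flipped.reverse()
    return flipped


def classify_with_mirror(points, classify):
    """Accept the contour if either it or its mirror image is accepted."""
    return classify(points) or classify(mirror_contour(points))


triangle = [(0, 0), (2, 0), (4, 3)]
print(mirror_contour(triangle))
```

Reversing the order matters if the shape model depends on the direction in which contours are traced, since a reflection alone turns a counter-clockwise contour into a clockwise one. The starting point also changes, so a model that relies on a fixed starting point would need the mirrored contour re-indexed.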

More videos will be forthcoming!
