Ruxandra's log
22 May 2007
Played with SVM for a while. It seems to take a really long time to run on as many pollen samples as we have, so I settled on a compromise: I kept all the positive samples but reduced the negative samples to match the total number of positive samples, which is about 10% of what I had for the previous experiments (I kept 10,000 negative samples out of 100,000). I used the SVM to separate the patches into pollen and non-pollen and, by matching the resulting 'pollen' samples against the ground truth, I split them into their respective categories (the ones without a match became false alarms) and fed them into Marc'Aurelio's classifier. Below is the confusion matrix I obtained.
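For the record, here is a minimal Matlab sketch of the subsampling step (not the actual code; posFeat and negFeat are hypothetical feature matrices): keep every positive sample and draw a random subset of the negatives of the same size.

    % posFeat: Npos x d feature matrix of positive (pollen) patches
    % negFeat: Nneg x d feature matrix of negative (non-pollen) patches
    Npos = size(posFeat, 1);
    idx  = randperm(size(negFeat, 1));
    negSub = negFeat(idx(1:Npos), :);          % random subset, as large as the positives

    % Balanced set and labels (+1 pollen, -1 non-pollen) for training the SVM.
    feat  = [posFeat; negSub];
    label = [ones(Npos, 1); -ones(Npos, 1)];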
I also tried a different experiment, just to get a feel for how SVM alone works on all the categories: I trained and tested an SVM directly on the 15 categories of pollen, starting from the already obtained patches. It hasn't finished running yet, so I am still waiting for the results. Talking with Greg, however, it seems a better idea might be to first classify the patches into pollen and non-pollen and then take the resulting pollen and classify it into the 15 categories. After the first experiment is done, depending on the results, I would like to try this second one as well.
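To make the two-stage idea concrete, here is a minimal Matlab sketch, assuming the Statistics and Machine Learning Toolbox functions fitcsvm (binary stage) and fitcecoc (15-way stage); the variable names are made up, and this is not the code I am actually running.

    % feat / isPollen:       features and 0/1 labels for pollen vs. non-pollen
    % pollenFeat / category: features and 1..15 labels for the pollen patches only
    stage1 = fitcsvm(feat, isPollen);             % binary pollen / non-pollen SVM
    stage2 = fitcecoc(pollenFeat, category);      % multiclass SVM over the 15 categories

    % At test time only the patches accepted by stage 1 reach stage 2.
    accepted = predict(stage1, testFeat) == 1;
    testCategory = predict(stage2, testFeat(accepted, :));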
2 May 2007
Corrected the last experiment to include only the false alarms from nearest neighbor as test false alarms for the classifier. Below is the resulting confusion matrix (on the left), compared with the confusion matrix (on the right) of a validation set drawn from the training set and comparable in size to the test set.
25 Apr 2007
Progress
I ran nearest neighbor on the extended dataset and then ran the results through the trained classifier. Below is the confusion matrix obtained. It is similar to the initial confusion matrix both in notation and in the numbers themselves. This is mostly because of the way I fed the nearest-neighbor results to the trained classifier: it had only slightly fewer images than before, and the images were largely the same.
10 Apr 2007
Progress
Ran the trained classifier on the images spit out by the nearest-neighbor selection algorithm. Below is the confusion matrix obtained. On the left side, the numbers are given in percentages and on the right in the number of samples. In the folder I used for training the nearest-neighbor selection algorithm, the recognized types of pollen were: cypress, grass, liq. amber, mulberry, oak, olive, walnut, and, in very small quantities, asterac., birch, eucalyp., palm, pine, poplar, and sycam. The classifier was trained on the following types of pollen: alder, ash, chenop, chinese elm, cypress, eucalyp., grass, jacaran., mulberry, oak, olive, pine, sageb., and walnut. The intersection of the two leaves only: cypress, eucalyp., grass, mulberry, oak, olive, pine, and walnut. That is why some of the fields in the confusion matrix have no bubbles associated with them.
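As an aside, a minimal Matlab sketch of how the two versions of the matrix (counts and row percentages) can be tabulated; trueLab and predLab are hypothetical vectors of integer category labels.

    % trueLab, predLab: N x 1 vectors of category indices in 1..K
    K = max([trueLab; predLab]);
    counts = accumarray([trueLab predLab], 1, [K K]);   % raw sample counts

    % Row-normalize so that each true category sums to 100%.
    rowSums = max(sum(counts, 2), 1);                   % avoid division by zero
    perc = 100 * counts ./ repmat(rowSums, 1, K);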
14 Mar 2007
Progress
- False alarms from the last nearest-neighbor experiment, with pine included:
- False alarms from the last nearest-neighbor experiment, without pine:
28 Feb 2007
Experiment including pine
- Found a bug in the code that dealt with permuting the samples, which I think might have affected the results pretty significantly.
- Repeated the last experiment and went through the samples manually again to have better control over the images I was throwing away from the initially good samples. Basically, I now have different labels for good samples, false alarms, and false alarms that came out of the good samples. I decided not to include these last ones in the false alarms used for either training or testing.
This time I ended up with 1625 good positive samples, out of which I used half for training and half for testing. I used 1000 negative samples for testing. Below are the ROC curves for this experiment.
Experiment excluding pine
Out of the 1625 good positive samples, only 46 were pine. By putting pine in the negative samples and repeating the above experiment, we get the following ROC curves.
Other experiments
I have also tried other experiments after finding the bug. I tried including the initially good samples that I had manually thrown away first in the positive samples (to check whether the manual selection helped) and then in the negative samples. Both experiments had results similar to those obtained a couple of weeks ago, so they are not worth detailing here.
Minutes
Present: Perona, Paun
- Looked at the last curves and decided to try using Marc'Aurelio's classifier with nearest-neighbor selection instead of Marc'Aurelio's detector, then look at the numbers there and at the samples that go through.
- Also look at the false alarms that go through when just using nearest neighbor.
20 Feb 2007
Added a few new features to nearest neighbor: the percentiles of the image convolved with [-1,1] (horizontal differences) and [-1;1] (vertical differences). I then manually went through 3370 positive samples, selected the best ones, and ended up with 1614 positive samples. I ran the last experiment using these samples and got the following ROC curves.
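For the record, a minimal sketch of how these features can be computed for a single patch; the percentile levels and the nearest-rank percentile shown are assumptions for illustration (the same scheme applies to the pixel-value percentiles from the earlier experiments).

    % patch: grayscale image patch, as a double matrix
    dx = conv2(double(patch), [-1 1],  'valid');   % horizontal differences
    dy = conv2(double(patch), [-1; 1], 'valid');   % vertical differences

    % Nearest-rank percentiles of the filter responses.
    levels = [5 25 50 75 95];
    sdx = sort(dx(:));
    sdy = sort(dy(:));
    featX = arrayfun(@(p) sdx(max(1, ceil(p/100 * numel(sdx)))), levels);
    featY = arrayfun(@(p) sdy(max(1, ceil(p/100 * numel(sdy)))), levels);
    convFeat = [featX featY];    % appended to the existing nearest-neighbor features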
Getting to this point took a little longer than I expected, because the program kept running out of memory and I had to change tactics several times just to get the matching patches in the first place. After I manually selected them, I realized I hadn't accounted for the fact that I might also want to remove pine later, and to do that I will have to go through the images manually again. I will have this done by the end of the week.
6 Feb 2007
Progress
I repeated experiment 5 (i.e. I used the normalized coordinates, the percentiles, and the fixed testing-set size), but I completely eliminated the filters in Marc'Aurelio's code, the ones associated with his automatic detector. I only kept the overlap filter, which is necessary to avoid counting a pollen particle more than once. I used Marc'Aurelio's code to build the boxes around the particles of interest and then compared these boxes with the expert-labeled boxes. I ran the experiment on the April06 folder with a total of 1639 pollen grains; of these, 1411 were detected and 228 were missed. Since I managed to get more positive samples this time, I also changed the training and testing set sizes. I used 705 positive samples for both training and testing, 2000 negative samples for testing, and varied the number of negative samples used for training among 1000, 2000, 3000, 4000, 5000, 6000, and 7000.
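For reference, a minimal sketch of the kind of box matching I mean: an expert box counts as detected if some detector box overlaps it by more than a threshold (intersection area over union area). The 0.5 threshold and the [x y w h] box format are my assumptions for illustration, not necessarily what Marc'Aurelio's code does.

    % det, gt: detector and expert boxes, one [x y w h] row per box
    thresh = 0.5;
    found = false(size(gt, 1), 1);
    for i = 1:size(det, 1)
        for j = 1:size(gt, 1)
            ix = max(0, min(det(i,1) + det(i,3), gt(j,1) + gt(j,3)) - max(det(i,1), gt(j,1)));
            iy = max(0, min(det(i,2) + det(i,4), gt(j,2) + gt(j,4)) - max(det(i,2), gt(j,2)));
            inter = ix * iy;
            uni = det(i,3) * det(i,4) + gt(j,3) * gt(j,4) - inter;
            if inter / uni > thresh
                found(j) = true;       % this expert-labeled grain has been detected
            end
        end
    end
    fprintf('%d detected, %d missed\n', sum(found), sum(~found));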
Below are the ROC curves for the nearest neighbor classification without randomizing the interest points.
And here are the curves when randomizing the selection of interest points.
I don't know how to explain the last curves. We are doing worse when randomizing the selected points, and the false-alarm percentage went way up in both experiments.
Below are samples of the false alarms.
I also tried looking at the positive samples used by the classifier: the ones that overlap the expert labelings. Below are a few examples.
2-3 Feb 2007
Progress
Experiment 1
Used the last implementation of the nearest-neighbor classifier, changing the classification reference labels from the automatic detector labels to the expert labels. I also dropped the overlap feature from the classification and added pixel-value features: the 5th, 25th, 50th, 75th, and 95th percentiles. This time I used a total of 3113 raw images and ended up with 1115 correct detections and 789031 false alarms (according to the matching between expert labels and automatic detections). Of these I used 557 positive samples for both training and testing, and I varied the number of negative samples from 557 to 1000 to 2000 to 4000 for both testing and training.
Below are the ROC curves for all the different numbers of negative samples used.
And here are all the ROC curves on one plot. It seems that the fewer negative samples we use, the better the performance.
Experiment 2
As a sanity check I repeated the experiment without using the various percentiles. Below are the detailed ROC curves and the plot with all of them together. The performance seems a little better with the percentiles included as features for nearest neighbor, but it is barely noticeable.
Experiment 3
This time I repeated the first experiment (the one that uses the percentiles for pixel values), but I normalized each coordinate to unit variance. The results are below.
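Concretely, the normalization is just a per-coordinate rescaling estimated on the training set and applied to both sets (the mean does not matter for nearest-neighbor distances); a minimal sketch with hypothetical variable names:

    % trainFeat, testFeat: N x d and M x d feature matrices
    sigma = std(trainFeat, 0, 1);                  % per-coordinate standard deviation
    sigma(sigma == 0) = 1;                         % leave constant coordinates alone
    trainFeat = trainFeat ./ repmat(sigma, size(trainFeat, 1), 1);
    testFeat  = testFeat  ./ repmat(sigma, size(testFeat, 1),  1);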
Experiment 4
Next, I repeated the second experiment (the one that doesn't use the percentiles for pixel values), but with the coordinates normalized to unit variance. The results are below.
Experiment 5
Here I repeated the third experiment (the one that uses the percentiles and normalizes the coordinates), but I kept the testing set size constant: 557 positive samples and 1000 negative samples. I only changed the number of negative training samples to get the different ROC curves. The results are below. We finally seem to do better with more training samples, but this is clear only when we use 4000 negative samples.
Experiment 6
Here I repeated the fourth experiment (the one that doesn't use percentiles and normalizes the coordinates), but, just as in experiment 5, I kept the testing set size constant: 557 positive samples and 1000 negative samples. The results are below and similar to the results of experiment 5.
Looking at false alarms
Interest points filtered out by the automatic detector
These boxes contain the initial interest points that are filtered out by the detector because of size, overlap, or mean pixel value. These are the boxes considered not to contain any pollen grains or, if they do, to overlap with other boxes containing the same grains. Below are a few samples, grouped into 10 by 10 tables for easier viewing. The images I cropped them from are normalized, hence the white background.
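To make the filtering concrete, here is a minimal sketch of the size and mean-pixel-value tests (the thresholds and the [x y w h] box format are made up for illustration, and the overlap filter, which suppresses boxes covering the same grain, is omitted):

    % boxes: M x 4 candidate boxes [x y w h]; img: grayscale image as a double matrix
    minSide = 20;  maxSide = 200;  minMean = 10;    % illustrative thresholds
    keep = false(size(boxes, 1), 1);
    for i = 1:size(boxes, 1)
        w = boxes(i, 3);  h = boxes(i, 4);
        crop = img(boxes(i,2) : boxes(i,2)+h-1, boxes(i,1) : boxes(i,1)+w-1);
        okSize = w >= minSide && w <= maxSide && h >= minSide && h <= maxSide;
        okMean = mean(crop(:)) >= minMean;          % reject nearly empty boxes
        keep(i) = okSize && okMean;
    end
    boxes = boxes(keep, :);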
Actual false alarms
These are the boxes containing the actual false alarms, i.e. the boxes not filtered out by the automatic detector that do not match the expert-labeled boxes. Below are a few samples, grouped into 10 by 10 tables for easier viewing. The images I cropped them from are the actual raw images, not normalized, hence the difference in background.
30 Jan 2007
Progress
- I implemented the nearest-neighbor classifier to decide whether a given location in a pollen image is likely to contain a pollen grain. I used 3 of Marc'Aurelio's features to characterize each detection: size, overlap with other detections, and mean pixel value. 1000 negative samples and 1000 positive samples were used for both training and testing. By varying the threshold of acceptance of nearest neighbors from 1 out of 9 to 9 out of 9, I got the ROC curve below.
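A minimal sketch of how the curve can be traced (not the exact code): for each test sample, find its 9 nearest training samples by Euclidean distance and count how many are positive; sweeping the required number of positive votes from 1 to 9 gives one (false-alarm rate, detection rate) point per threshold.

    % trainFeat (N x d), trainLab (N x 1, +1/-1), testFeat (M x d), testLab (M x 1)
    k = 9;
    M = size(testFeat, 1);
    votes = zeros(M, 1);
    for i = 1:M
        d2 = sum((trainFeat - repmat(testFeat(i, :), size(trainFeat, 1), 1)).^2, 2);
        [~, order] = sort(d2);
        votes(i) = sum(trainLab(order(1:k)) == 1);   % positive neighbors among the 9
    end

    % One ROC point per vote threshold t = 1..9.
    detRate = zeros(k, 1);
    faRate  = zeros(k, 1);
    for t = 1:k
        accept = votes >= t;
        detRate(t) = mean(accept(testLab == 1));     % detection rate
        faRate(t)  = mean(accept(testLab == -1));    % false-alarm rate
    end
    plot(faRate, detRate, '-o'); xlabel('false alarm rate'); ylabel('detection rate');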
10 Jan 2007
- Present: House, Paun, Perona
From the displays Ruxandra prepared during the last week, it is apparent that it should be possible to decrease the rate of false alarms: those particles are not similar to pollen particles. It would be interesting at this point to know exactly what our trade-off is between false alarms and false rejects when we do particle detection.
TO DO:
- Implement nearest-neighbor classification to decide whether a given location in a pollen image is likely to contain a pollen grain. Use Marc'Aurelio's 7 features to characterize each detection. Use 1000 images as training examples (i.e. all the interest points in those 1000 images are grouped into `false alarm' and `correct detection' and their 7 features are used as training examples). Use another 1000 images to compute the ROC curve. Vary the criterion of acceptance of nearest neighbors (e.g. 9 out of 9 neighbors in `correct detection' are required to validate an interest point, 8 out of 9, 7 out of 9 etc).
- Prepare ROC diagram summarizing such results
- In due time, implement the classifier with SVM and/or AdaBoost rather than nearest neighbor, mainly for speed.
9 Jan 2007
Progress
- Criteria for unknown vs. false alarms: For each particle of interest, the classifier tries to estimate a category for it, the category being either a type of pollen grain or a false alarm. If the calculated probability of belonging to this category is greater than 0.5, the particle is classified as belonging to the estimated category; otherwise, it falls into the unknown category. (A minimal sketch of this rule appears after the examples below.)
- To visualize the errors in the classification I had to run the classifier again. To save time, though, instead of running 100 experiments as I did before, I only ran one. The confusion matrices from this experiment are shown below.
- Examples of false alarms classified as various types of pollen
- Examples of pollen grains from each category classified as false alarms
- Examples of pollen grains from each category classified as unknowns
- Examples of misclassifications in between categories (I used here the ones that were misclassified by 10% or more)
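A minimal sketch of the rule in the first bullet above; postProb stands for the hypothetical per-category probabilities the classifier outputs, and the 0.5 cutoff is the one mentioned there.

    % postProb: N x K matrix of estimated probabilities over the K categories
    %           (the pollen types plus a 'false alarm' category)
    [maxProb, predCat] = max(postProb, [], 2);
    predCat(maxProb <= 0.5) = 0;     % 0 stands for 'unknown': best probability too low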
5 Jan 2007
- Present: Flagan, House, Paun, Perona

Minutes
- Reviewed confusion tables
- Decided to meet every Wednesday at 3pm until March 15
- Ruxandra will work on the project in the Vision lab on Mondays and Fridays. We will revisit this decision in a couple of weeks.
- Jim House will review the Wiki every week. He will edit and send feedback to Ruxandra as appropriate.
Action items for Wed next week (see Notes)
- Generate pictures showing examples of false alarms confused for various types of pollen
- Ditto showing examples of pollen grains classified as unknowns and as false alarms
- Clarify criteria for false alarm and unknown classification
29 Sept 2006
- Present: House, Flagan, Paun, Perona
Action items
- Ruxandra gets images from J. House
- Ruxandra posts images in /common/Image_Datasets on the vision clusters. She writes a Matlab script to count how many images are in each category and to produce 10x10 mosaics with a random sample of 100 images from each category. She posts those stats on this wiki. (A sketch of such a script appears after this list.)
- Ruxandra gets in touch with M. A. Ranzato to obtain a copy of the latest software for (a) training classifiers, (b) classifying images, and (c) selecting `interesting' patches from large microscope images.
- Ruxandra repeats M.A. Ranzato's experiments using his software and new data. She charts performance when using 100, 200, 400, 800 training examples.
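A minimal sketch of what the counting/mosaic script could look like, assuming one sub-folder per category under the posted location, JPEG files, and the Image Processing Toolbox function montage; the folder layout and file pattern are assumptions.

    root = '/common/Image_Datasets';               % assumes one sub-folder per category
    cats = dir(root);
    cats = cats([cats.isdir] & ~ismember({cats.name}, {'.', '..'}));
    for c = 1:numel(cats)
        files = dir(fullfile(root, cats(c).name, '*.jpg'));
        fprintf('%-15s %d images\n', cats(c).name, numel(files));

        % 10 x 10 mosaic of 100 randomly chosen images (assumes >= 100 per category).
        pick = randperm(numel(files));
        names = fullfile(root, cats(c).name, {files(pick(1:100)).name});
        figure; montage(names, 'Size', [10 10]); title(cats(c).name);
    end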