Greg's ICCV 2009 Notebook, Fall 2008
SPM Roundup
Standard SPM
>> cd ~/101/hist4/200; scansvm
ntrain  ntest  nword  performance
     5      5    200       42.8 %
    15     15    200       57.1 %
Modified SPM
Instead of building a kernel by matching individual images, match against the sum of all histograms for each class.
This gives a direct comparison to NBNN as published. I previously tried modifying NBNN to match files first, à la SPM, and that was very slow. So this time, go the other way and make SPM more like NBNN instead. If it works, it should be much faster, because
- Compute time no longer depends on ntrain
- You may not even need to use an SVM
Plus you have something you can directly cluster, possibly useful for
- Unsupervised learning
- Constructing taxonomies
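A minimal Python sketch of the class-sum idea above (not the notebook's MATLAB code). It assumes a single-level histogram intersection as the match score, which is a simplification of the full spatial pyramid match, and the helper names and toy histograms are made up for illustration.

```python
def normalize(h):
    """Scale a histogram so its bins sum to 1 (all-zero input is returned as-is)."""
    total = float(sum(h))
    return [x / total for x in h] if total > 0 else list(h)

def intersection(a, b):
    """Histogram intersection: sum of bin-wise minima."""
    return sum(min(x, y) for x, y in zip(a, b))

def merge_by_class(hists, labels):
    """Sum all training histograms that share a label, then normalize each sum."""
    merged = {}
    for h, lab in zip(hists, labels):
        acc = merged.setdefault(lab, [0.0] * len(h))
        for i, x in enumerate(h):
            acc[i] += x
    return {lab: normalize(acc) for lab, acc in merged.items()}

def classify(h, class_hists):
    """Pick the class whose merged histogram best matches h."""
    q = normalize(h)
    return max(class_hists, key=lambda lab: intersection(q, class_hists[lab]))

# Toy data (made up): 3-word vocabulary, two classes.
train = [[8, 1, 1], [6, 3, 1], [1, 2, 7], [0, 3, 7]]
labels = ["a", "a", "b", "b"]
class_hists = merge_by_class(train, labels)
print(classify([5, 2, 1], class_hists))
```

The number of matches per test image here equals the number of classes, not the number of training images, which is the speed argument made above.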
Ok, but does it actually work?
cd ~/101/hist4/200
mergehist([5 15],[5 15],200)
makematch([5 15],[5 15],200,'prefix','merge');
load match005_005_200.mat
[m,ctest]= max(mtest,[],2);
truth= tags(1:100,5);
c= confuse(ctest,truth);
mean(diag(c))
ans = 0.3040
load match015_015_200.mat;
[m,ctest]= max(mtest,[],2);
truth= tags(1:100,15);
c= confuse(ctest,truth);
mean(diag(c))
ans = 0.3580
Hmm... not as good. That's a little disappointing.
Am I failing to normalize properly?
Tried briefly adding a "renormalize" flag to makematch, and various normalization schemes within mergehist itself. No improvement.
Using Categories We Know To Quickly Learn Categories We Don't
We throw valuable information away when we just apply N 1-vs-all classifiers and simply pick the winner. We throw away the margin scores.
Can a fixed (small) number of category margin scores be mapped (via an SVM) to a much larger number of categories?
For example, goat test images applied to the classifiers dog, horse, car, toaster
yield margin scores 0.5, 0.3, -1.1, -0.9.
We can then classify such an image as a goat based on a second learning phase, which showed that things that were half-dog, half-horse tended to be goats.
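A small Python sketch of that second learning phase, treating each image's margin scores as a feature vector. Nearest-centroid matching stands in for the second-level SVM here purely to keep the example short. Only the goat row 0.5, 0.3, -1.1, -0.9 comes from the example above; every other number and the class names "truck" and "kettle" are made up.

```python
import math

def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def nearest_class(scores, centroids):
    """Label whose centroid is closest (Euclidean) to the margin-score vector."""
    return min(centroids, key=lambda c: math.dist(scores, centroids[c]))

# Columns: margins from the dog, horse, car, toaster classifiers.
training = {
    "goat":   [[0.5, 0.3, -1.1, -0.9], [0.4, 0.4, -1.0, -1.0]],
    "truck":  [[-0.9, -0.8, 0.7, -0.2], [-1.0, -0.7, 0.9, -0.1]],
    "kettle": [[-1.0, -1.1, -0.3, 0.8], [-0.9, -1.0, -0.2, 0.6]],
}
centroids = {name: centroid(rows) for name, rows in training.items()}
print(nearest_class([0.45, 0.25, -1.2, -0.8], centroids))
```

None of the three target classes has its own first-level classifier, yet the pattern of margins is enough to separate them in this toy setting.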
Canonical Procedure
- Training
  - Result: N=256 1-vs-255 classifiers C1...CN
- Testing
  - Apply above classifiers to each test image
  - Result: scores c1...cN
  - Simply pick the classifier with the highest score
Proposed procedure
- Bootstrapping
  - Train n=32 1-vs-31 classifiers (first 128 categories)
  - Result: Classifiers C1...Cn
- Training
  - Classify ntrain images per category
  - Each training image → set of n margin scores
  - SVM: learn 1 of N training classes based on n margin scores
- Testing
  - Apply above classifiers to each test image (but now there are only n!)
  - Feed n margin scores into SVM
  - Predict N
- Far less complexity
  - Match scores generated against a fixed set of n training category images
- Additional complexity is in 2nd-level training
  - Which is super-fast
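A rough Python count of the test-time saving claimed above. It assumes the cost is dominated by one match per stored training image. N=256 and n=32 come from the lists above; the per-category image counts (15 for the canonical case, 30 for the bootstrap set) are illustrative assumptions.

```python
def matches_per_test_image(n_categories, images_per_category):
    """Number of image-to-image matches needed to score one test image."""
    return n_categories * images_per_category

canonical = matches_per_test_image(256, 15)  # all N categories
proposed = matches_per_test_image(32, 30)    # only the n bootstrap categories
print(canonical, proposed, canonical / proposed)
```

Under these assumptions the proposed procedure needs a quarter of the matches per test image, and the ratio grows if N increases while n stays fixed.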
A Real-World Example
>> cd ~/256/spm/80x60; scansvm
ntrain  ntest  nword  performance
    15     15    200       24.6 %
- Use same exact files for training/testing
- But we need an additional set of nboot=30 disjoint files from each of Nboot=32 randomly chosen categories (which are hopefully representative of the categories overall).
rand_seed(1); randomize;
indx=randperm(256);
iboot= indx(1:32);
cboot= getcategories(iboot);
showstrings(cboot,3);
cd ~/256/images; [ftrain,ftest]= pickfiles(getcategories,15,15);
How do we make fboot disjoint from ftrain and ftest?
Started writing pickfiles3, but this is just simpler:
pick ntest=45 and then only use the first ntest=15 in each category. That leaves 30 per category available for bootstrapping.
cd ~/256/sift/80x60; rand_seed(1); makehist('~/256/spm/80x60/200',getcategories,15,45,200);
Hmmm... need to rethink this a little. I only need histograms for the nboot files, so the above procedure isn't the right one.
For right now, it is easiest to simply relax the disjoint requirement, knowing that the probability of overlap is fairly small esp. for the larger categories.
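For reference, the oversample-then-split idea mentioned earlier (pick 45, keep the first 15 for testing, leave 30 for bootstrapping) can be sketched in Python as below. This is not pickfiles or pickfiles3; the function name and file names are hypothetical, and it only covers the test/bootstrap split within one category.

```python
import random

def pick_test_and_boot(files, ntest, nboot, seed=1):
    """Sample ntest + nboot files once, without replacement, then split them."""
    rng = random.Random(seed)
    picked = rng.sample(files, ntest + nboot)
    return picked[:ntest], picked[ntest:]

files = ["img_%04d.jpg" % i for i in range(1, 81)]  # hypothetical file names
ftest, fboot = pick_test_and_boot(files, ntest=15, nboot=30)
print(len(ftest), len(fboot), set(ftest) & set(fboot))
```

Because both sets come from one draw without replacement, they cannot overlap. The call raises ValueError for a category with fewer than ntest + nboot files, so small categories would need separate handling.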
Alternative (part 2)
- Prepare for 1st-level training
  - Remove all but rows and columns 1:32 from mtrain, all but rows 1:32 from mtest.
    - mtrain is now (32x30)x(32x30)
    - mtest is now (32x30)x(256x30)
  - Modify [...] as well
- 1st-level training
  - Run [...]
    - Making sure classes are correct
  - Classify all the "test" images represented by [...]
    - Keeping the margin scores, not just final classifications!!
- 2nd-level training
  - Use 15-per-class from [...] for training
  - Margin scores labeled 1:256 → train 256 new classifiers
- 2nd-level testing
  - Use 15-per-class from [...] for testing
  - Margin scores unlabeled → classify!
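The row/column trimming in the preparation step can be sketched in Python as below. It assumes images are stored in category-major order with a fixed number of images per category, and that mtest has bootstrap images along rows and test images along columns, as the stated dimensions suggest. Small stand-in sizes replace 256, 32 and 30, and the helper names are hypothetical.

```python
def category_rows(cats, per_cat):
    """0-based indices for the given 1-based categories, category-major order."""
    return [(c - 1) * per_cat + k for c in cats for k in range(per_cat)]

def submatrix(m, rows, cols):
    """Copy of m restricted to the given row and column indices."""
    return [[m[r][c] for c in cols] for r in rows]

ncat, keep, per_cat = 6, 2, 3   # small stand-ins for 256, 32 and 30
size = ncat * per_cat
mtrain = [[r * size + c for c in range(size)] for r in range(size)]
mtest = [[r * size + c for c in range(size)] for r in range(size)]

idx = category_rows(range(1, keep + 1), per_cat)
mtrain_small = submatrix(mtrain, idx, idx)        # (keep*per_cat) x (keep*per_cat)
mtest_small = submatrix(mtest, idx, range(size))  # (keep*per_cat) x (ncat*per_cat)
print(len(mtrain_small), len(mtrain_small[0]), len(mtest_small), len(mtest_small[0]))
```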
1st-Level Training
cd ~/256/spm/80x60; makeboot(1:32,30,30,200);
2nd-level Training
Final Examples
cd ~/256/spm/80x60;
[svm2,svm1]=makeboot(1:64,10,[100 2e-2],30,30,200);
indx=randperm(256); indx(64),
cd ~/256/spm/80x60;
[svm2,svm1]=makeboot(sort(indx(1:64)),10,[100 2e-2],30,30,200);
Result So Far
cd ~/256/spm/80x60;
d={'32_5','32_15','64_5','64_10','64_15','128_10'};
for i=1:length(d),
  cd(['ncats' d{i}]); disp(pwd); scanboot; cd ..; printf;
end
Which results in the following:
/common/greg/caltech256_final/spm/80x60/ncats32_5
  boot1e1.mat   9.4
  boot1e2.mat   9.8
  boot2e2.mat  10.4
/common/greg/caltech256_final/spm/80x60/ncats32_15
  boot1e2.mat  14.3
  boot3e2.mat  15.7
/common/greg/caltech256_final/spm/80x60/ncats64_5
  boot2e2.mat  14.3
/common/greg/caltech256_final/spm/80x60/ncats64_10
  boot2e2.mat  17.9
/common/greg/caltech256_final/spm/80x60/ncats64_15
  boot1e2.mat  20.2
  boot1e3.mat  17.6
  boot2e2.mat  21.2
  boot3e2.mat  21.1
  boot3e3.mat  17.8
/common/greg/caltech256_final/spm/80x60/ncats128_10
  boot1e2.mat  21.8
Tuesday Talk Feedback
There are really two stories here: how to use stage-1 learning to
- increase performance in the low-ntrain regime
- speed up overall test performance, and improve scalability
I've been curious to see how learning on Caltech-101 categories in stage 1 could improve Caltech-256 performance in stage 2. Piotr suggests concentrating on one-shot learning in this scenario, in order to give the paper a clear focus. I really like this idea. As it is, the paper's focus is too divided.
Merrielle wonders if the super-categories created via confusion matrix clustering could be good categories to use in stage-1 learning. Piotr's counter-point is that these categories might be so vague that their margin scores are useless for identifying specific categories.
Stage 1 is essentially just generating a new kind of feature vector, which is then used in stage 2. So several people suggested appending the match scores (on each image) to the margin scores (for each category) to see if stage 2 learning can do anything with it. This might help, but it does significantly increase the dimensionality of the feature vector. So I'm guessing it would provide some small benefit, but you would get diminishing returns?
Caltech-101/256 Pre-processing
Since we now have the memory to do this (on the 64-bit machines), go ahead and pre-calculate the entire match kernel for all relevant images. Then we can slice and dice it later over randomized training and testing sets, and the overall analysis will be much faster.
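A Python sketch of the precompute-once, slice-many-times pattern described above. Histogram intersection stands in for the real match function, the histograms are made up, and the split ignores per-category balancing, so this only illustrates the indexing idea.

```python
import random

def intersect(a, b):
    """Histogram intersection, used here as a stand-in match score."""
    return sum(min(x, y) for x, y in zip(a, b))

def full_kernel(feats, match):
    """Evaluate match() once per unordered pair and mirror the result."""
    n = len(feats)
    k = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i, n):
            k[i][j] = k[j][i] = match(feats[i], feats[j])
    return k

def split_kernel(k, ntrain, seed):
    """Random train/test split expressed purely as index lookups into k."""
    order = list(range(len(k)))
    random.Random(seed).shuffle(order)
    tr, te = order[:ntrain], order[ntrain:]
    ktrain = [[k[i][j] for j in tr] for i in tr]  # train x train
    ktest = [[k[i][j] for j in tr] for i in te]   # test x train
    return ktrain, ktest

feats = [[3, 1, 0], [2, 2, 0], [0, 1, 3], [1, 0, 3], [1, 1, 2]]  # made up
K = full_kernel(feats, intersect)
for seed in (1, 2, 3):
    ktrain, ktest = split_kernel(K, ntrain=3, seed=seed)
    print(seed, len(ktrain), len(ktest))
```

The expensive match calls happen once in full_kernel; each additional trial costs only list indexing.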
For Caltech-256 1-shot learning with Caltech-101 pretraining
cd ~/101/sift/80x60; cats101= getcategories; files101= pickfiles(cats101);
cd ~/256/sift/80x60; cats256= getcategories; files256= pickfiles(cats256);
cd ~/357/sift/80x60;
makehist('~/357/spm/80x60',{},'101_256_200.mat',files101,files256);
For Caltech-256 1-shot learning without pretraining
Not entirely sure if we'll need the mtest generated below, but I'm generating it just for symmetry's sake.
cd ~/357/sift/80x60; makehist('~/357/spm/80x60',{},'256_101_200.mat',files256,files101);
Caltech-101/256 Post-processing
Current Syntax
Part 2 will be like this:
1. Use postmatch to generate multiple trials from one of the massive match files above, e.g.
2. Use makeboot to analyze these as before, e.g.
cd boot
[svm2,svm1]=makeboot(clist,5,25,[100 2e-2],30,30,1:10);
Remaining issues:
- Need to find [...] which only include the non-overlapping 101 categories.
- Currently no output file saved by [...]
New Improved Syntax
This is more polymorphic and easier to remember:
- Make syntax between both routines more consistent.
- First 3 arguments describe input file
- Last 3 describe output file
- Additional arguments are function-specific
postmatch(101, 256, 200, 30, 30, 1:10, 'boot');
cd boot;
makeboot ( 30, 30, 1:10, 5, 25, 1:10, clist );
Here the 3rd suffix number refers to the random number seeds 1..10.
Non-overlapping categories
Bottom line: bad categories are:
cd ~/101/images; cats101= getcategories;
Cbad= [1 2 3 4 6 13 14 16 20 24 27 34 36 37 40 47 48 51 52 55 56 58 59 64 76 81 87 90 94 95];
Cgood= setdiff(1:102,Cbad)
 1 BACKGROUND_Google   2 Faces             3 Leopards          4 Motorbikes
 5 airplanes           6 bonsai            7 brain             8 buddha
 9 car_side           10 chandelier       11 crab             12 dolphin
13 electric_guitar    14 elephant         15 ewer             16 grand_piano
17 hawksbill          18 helicopter       19 ibis             20 kangaroo
21 ketch              22 laptop           23 llama            24 menorah
25 revolver           26 scissors         27 starfish         28 strawberry
29 trilobite          30 umbrella
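As a quick sanity check on the index arithmetic, the setdiff above can be reproduced in Python. The Cbad values are copied from the MATLAB line; the rest is plain set subtraction over 1..102.

```python
cbad = [1, 2, 3, 4, 6, 13, 14, 16, 20, 24, 27, 34, 36, 37, 40,
        47, 48, 51, 52, 55, 56, 58, 59, 64, 76, 81, 87, 90, 94, 95]

# Equivalent of setdiff(1:102, Cbad): sorted indices not listed in cbad.
bad = set(cbad)
cgood = [c for c in range(1, 103) if c not in bad]
print(len(cbad), len(cgood), cgood[:5])
```

This gives 30 bad and 72 good indices, so the list of 30 names above matches the length of Cbad.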
Some snippets of code I am no longer using
Baseline for Comparison
This activates a 2nd mode of postmatch for use with the conventional old analysis:
postmatch(101, 256, 200, 15, 15, 1:10, 'baseline' ,2);
cd baseline
makesvm(15,15,1:10);
Note: need to make the postmatch input arguments more robust. Just a hack now.
There was a bug in postmatch where it did not symmetrize(mtrain) before slicing it up into training and testing sets.
This has not affected my previous experiments, since those were on Caltech 101 vs 256.
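To illustrate why the missing symmetrize call matters, here is a Python sketch. It assumes (this is a guess, not confirmed by the notes) that only the upper triangle of the match matrix is filled in and that symmetrize mirrors it into the lower triangle; the numbers are made up.

```python
def symmetrize(m):
    """Copy of square matrix m with the upper triangle mirrored into the lower."""
    n = len(m)
    out = [row[:] for row in m]
    for i in range(n):
        for j in range(i + 1, n):
            out[j][i] = out[i][j]
    return out

upper = [[1.0, 0.4, 0.2],
         [0.0, 1.0, 0.7],
         [0.0, 0.0, 1.0]]
full = symmetrize(upper)
print(full[2][1], upper[2][1])
```

Under that assumption, slicing rows out of the un-mirrored matrix would hand the classifier zeros in place of real match scores for roughly half of the image pairs.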
Now we're getting sensible (~55%) performance on Caltech-101 for ntrain=15. That's still a little low: is it because I include the clutter category? Regardless, my main goal here is to sort categories by performance, so I don't care.
Category Confound
Here's another problem I've been tracking down: getcategories would always report 102 categories in Caltech101. But then at the end of the day there were only 101. Why? file2class does the right thing by ignoring thumbdir but getcategories does not.
- How has this affected my results so far?
  - Need to fix [...] and 357 and sift
  - How does this affect [...]
    - categories after "t" may be offset by 1
- Is the Caltech-101 web page still showing images? unaffected
  - If not, point it towards .thumbdir instead of thumbdir. Deleted .thumbdir: it was empty
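The off-by-one effect can be reproduced with a short Python sketch. The directory listing is abbreviated and the function is a hypothetical stand-in for getcategories, assuming categories are indexed by sorted directory name.

```python
def categories(dirnames, ignore=("thumbdir", ".thumbdir")):
    """Sorted category names, skipping helper directories."""
    return sorted(d for d in dirnames if d not in ignore)

dirs = ["airplanes", "starfish", "thumbdir", "umbrella", "watch"]  # abbreviated
naive = sorted(dirs)      # counts thumbdir as a category
fixed = categories(dirs)  # ignores it

# Categories whose index differs between the two listings.
shifted = [d for d in fixed if naive.index(d) != fixed.index(d)]
print(len(naive), len(fixed), shifted)
```

Only names that sort after "thumbdir" move, which matches the note that categories after "t" may be offset by 1.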
After solving all these issues, I should be able to
- Restrict [...] to the (now properly) indexed Cgood
- Limit [...] to, say, just the best 50 categories
- Generate [...] using postmatch and makeboot
- Are the results better?
April 13, 2009
I wish to repeat the above results except, instead of picking pre-training categories at random, pick the ones which are the most accurate.
In other words, hand-pick some sort of optimal clist.
Random Category Choice (Baseline)
I'm the only one on the machine, so:
cd 357/spm/80x60/ncat32;
for i=1:10,
Don't forget to run matlabpool
Using 32 Best Categories
cd ~/101/images;
cats= getcategories;
cd ~/357/spm/80x60/baseline;
for i=1:10,
clear svm
if i==1, c=svm.conf;
else c=c+svm.conf;
[d,indx]= sort(-diag(c));
indx= indx(1:32);
1 car_side 2 Motorbikes 3 accordion 4 pagoda
5 Faces 6 minaret 7 trilobite 8 dollar_bill
9 laptop 10 windsor_chair 11 cellphone 12 revolver
13 grand_piano 14 airplanes 15 metronome 16 inline_skate
17 watch 18 okapi 19 pizza 20 Leopards
21 stop_sign 22 wheelchair 23 rooster 24 euphonium
25 dalmatian 26 ferry 27 sunflower 28 joshua_tree
29 hedgehog 30 scissors 31 garfield 32 yin_yang
These are the easiest categories, so use these for pre-training:
cd ~/357/spm/80x60/ncat32;
for i=1:10,
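The ranking step used above to pick the best categories (sum the confusion matrices over trials, sort by the negated diagonal, keep the top entries) can be sketched in Python as follows. The confusion values are made up and the function names are hypothetical; this mirrors the sort(-diag(c)) idiom rather than the notebook's actual code.

```python
def add_matrices(a, b):
    """Element-wise sum of two equally sized matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def best_categories(conf_matrices, names, k):
    """Names of the k categories with the largest summed diagonal entries."""
    total = conf_matrices[0]
    for c in conf_matrices[1:]:
        total = add_matrices(total, c)
    diag = [total[i][i] for i in range(len(total))]
    order = sorted(range(len(diag)), key=lambda i: -diag[i])
    return [names[i] for i in order[:k]]

names = ["car_side", "crab", "pagoda"]
trial1 = [[0.9, 0.1, 0.0], [0.3, 0.4, 0.3], [0.1, 0.1, 0.8]]  # made up
trial2 = [[0.8, 0.1, 0.1], [0.2, 0.5, 0.3], [0.2, 0.1, 0.7]]  # made up
print(best_categories([trial1, trial2], names, k=2))
```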
Production Runs
cd ~/101/images;
cats= getcategories;
cd ~/357/spm/80x60/baseline;
for i=1:10,
clear svm
if i==1, c=svm.conf;
else c=c+svm.conf;
[d,indx]= sort(-diag(c));
% on vision402
cd ~/357/spm/80x60/ncat32
for i=[1 2:2:12],
makeboot(30,30,1,i,15, 1,sort(cats(1:32)));
% on vision401
cd ~/357/spm/80x60/ncat64
for i=[1 2:2:12],
makeboot(30,30,1,i,15, 1,sort(cats(1:64)));
% on vision309
cd ~/357/spm/80x60/ncat16
for i=1:100,
cats= randperm(101);
% on vision402
cd ~/357/spm/80x60/ncat16
for i=1:899,
cats= randperm(101);
% on vision402
cd ~/101/images; cats101= getcategories;
Cbad= [1 2 3 4 6 13 14 16 20 24 27 34 36 37 40 47 48 51 52 55 56 58 59 64 76 81 87 90 94 95];
Cgood= setdiff(1:102,Cbad)
cd ~/357/spm/80x60/ncat72b
for i=2:10,