Greg Paper 01 ICCV07

Overview of hierarchical cascade structure, Dec 27


Update of the CVPR07 Page from Nov 2006

What follows is based on our original CVPR07 paper outline.

Pietro,
I've been doing some experiments and can now address at least some of our previous questions (below).
-Greg

Message

  1. It is possible to generate visual taxonomies from a labeled training set
  2. The taxonomy is useful for visual tasks
In this paper we keep the taxonomies simple: binary 1-, 2-, and 3-level trees.

Why taxonomies?

  • Help build better discriminative classifiers: faster training, faster classification (does it scale to log(N) cost?), perhaps (but not necessarily) higher accuracy
This is now the only topic covered in this paper, and it is a fairly simple first attempt.
  • Provide a good metric for evaluating a classifier: some errors are sillier than others
  • `Never lost' property: should something new come along, which one of the existing categories does it relate to?
  • Sharing front-end features?
We'll cover these last three topics in a separate paper:
Using Phylogenetic Techniques to Learn Taxonomies of Visual Categories (ICCV 2007)
Otherwise the current paper will be too broad and unfocused.

Technical approach

Building taxonomies

Start from a pairwise affinity matrix and generate a tree-like taxonomy. Many methods to generate the pairwise affinity matrix:

  • Start from the confusion matrix once discriminative classifiers have been trained.
  • Start from the affinity matrix between pairs of classes: Lana's method gives you pairwise affinity between pictures, so how should one define the pairwise affinity between categories (e.g. the median affinity between pairs, the 95th-percentile affinity between pairs, or the fifth-highest pairwise affinity, as done for galaxies)?
I have had the best luck with the confusion matrix, not affinity matrices. See Nov 17 for details. Affinity matrices may, however, still work with kernel PCA?
I'm also keen to try using the SVM margins since these are softer, i.e. they take into account not just the final classification but the strength of each individual classification. Another option: use pairwise SVMs and the number of pairs per category.
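To make the confusion-matrix route concrete, here is a minimal sketch of the tree-building step (a Python illustration under my own simplifying assumptions, not the exact code from the notebook entries): symmetrize the confusion matrix in the ad-hoc way noted under Experiments below, then recursively split the categories in two with plain two-way spectral clustering (the Fiedler vector of the graph Laplacian).

```python
# Minimal sketch: grow a binary taxonomy by recursive two-way spectral
# clustering of a symmetrized confusion matrix. All names are illustrative.
import numpy as np

def spectral_split(A, classes):
    """Two-way cut of `classes` via the Fiedler vector of the affinity submatrix."""
    sub = A[np.ix_(classes, classes)]
    L = np.diag(sub.sum(axis=1)) - sub        # unnormalized graph Laplacian
    _, vecs = np.linalg.eigh(L)               # eigenvectors, ascending eigenvalues
    fiedler = vecs[:, 1]
    left = [c for c, v in zip(classes, fiedler) if v < 0]
    right = [c for c, v in zip(classes, fiedler) if v >= 0]
    return left, right

def build_taxonomy(A, classes):
    """Return the taxonomy as nested tuples over class indices."""
    if len(classes) == 1:
        return classes[0]
    left, right = spectral_split(A, classes)
    if not left or not right:                 # degenerate cut: fall back to a halving
        mid = len(classes) // 2
        left, right = classes[:mid], classes[mid:]
    return (build_taxonomy(A, left), build_taxonomy(A, right))

# Usage: C is a confusion matrix (rows = true class, columns = predicted class).
C = np.random.dirichlet(np.ones(8), size=8)   # stand-in for a real confusion matrix
A = (C + C.T) / 2.0                            # the ad-hoc symmetrization
np.fill_diagonal(A, 0.0)                       # ignore self-confusion
print(build_taxonomy(A, list(range(8))))
```

Averaging C with its transpose is the ad-hoc symmetrization mentioned in the summary under Experiments; a softer confusion matrix built from SVM margins could be dropped in for C unchanged.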

Building better classifiers from a taxonomy

  • Method 1: at each split in the taxonomy tree build a discriminative classifier. Let classification happen coarse-to-fine down the tree (a toy sketch appears after this list).
The Dec 27 notebook entry describes the method I'm currently using.
  • Method 2: generative classifiers at the coarser level, discriminative at the finer level
  • Method 3: think of the taxonomy tree not as a decision tree, but in the same way as in Fleuret and Geman: a node is a test on whether to continue down or not (low rate of false rejects). One could reach multiple leaves (and what would we do then?).
  • Method 4: Use DAGSVM (Cristianini and others)
This also deserves its own paper, I think:
Exploiting Taxonomies of Visual Categories Using Hierarchical DAG-SVM (NIPS 2007)
We're not sure yet what method is best. Trying simple methods first.
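As a toy illustration of Method 1 (not the actual Dec 27 implementation), the sketch below hangs a hypothetical linear SVM on every internal node of the nested-tuple tree from the previous sketch and descends coarse-to-fine at test time:

```python
# Toy Method 1: one binary "left subtree vs. right subtree" classifier per
# internal node; test samples descend the tree until they hit a leaf.
import numpy as np
from sklearn.svm import LinearSVC

def leaves(tree):
    """All category labels under a nested-tuple tree node."""
    return [tree] if not isinstance(tree, tuple) else leaves(tree[0]) + leaves(tree[1])

def train_tree(tree, X, y):
    """Attach one binary SVM to every internal node; returns {node id: model}."""
    if not isinstance(tree, tuple):
        return {}
    left, right = tree
    in_left = np.isin(y, leaves(left))
    keep = in_left | np.isin(y, leaves(right))   # samples belonging to this subtree
    models = {id(tree): LinearSVC().fit(X[keep], in_left[keep].astype(int))}
    models.update(train_tree(left, X[keep], y[keep]))
    models.update(train_tree(right, X[keep], y[keep]))
    return models

def classify(tree, models, x):
    """Descend coarse-to-fine, following each node classifier's decision."""
    while isinstance(tree, tuple):
        go_left = models[id(tree)].predict(x.reshape(1, -1))[0] == 1
        tree = tree[0] if go_left else tree[1]
    return tree
```

With a balanced binary tree this needs only about log2(N) node evaluations per test image instead of N one-vs-rest classifiers, which is where the log(N) hope above comes from.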

Experiments

  1. Implement and test two methods for generating taxonomies: (a) confusion matrix and (b) affinity matrix
  2. Quantitative assessment of performance improvement for different classifiers
So far the biggest experiments seem to be
  1. measuring speed vs. performance
  2. evaluating different methods of constructing trees
  3. asking whether binary classifiers are best: what about a variable number of branches, based on spectral clustering's preferred number of clusters?
Some discussion and testing of this can be found on Dec 17, Dec 20, Jan 12, and other places.
To summarize: spectral clustering is usually better than the Monte Carlo clustering method that I was using previously. The main disadvantages are that spectral clustering
  1. requires you to symmetrize the confusion matrix in an ad-hoc fashion
  2. has no incentive to balance the decision tree with equal-sized groupings, which sometimes hurts performance, particularly on Caltech-101.

Experiments that are not well defined yet

  1. Compare the visual taxonomy to WordNet. Compare to a `logical' taxonomy built by people.
  2. Why is this working? Is it cheating by using the background?
Update: yes, sometimes it does seem to be cheating by using the background. Should we apply some sort of segmentation? Then again, using the background is not so crazy, is it? It's kind of smart of the algorithm to exploit this. For now I am not using the phylogenetic trees and am instead saving those for separate papers. All the trees created in this paper come from simple groupings of the confusion matrix. With the simple binary trees in this paper, I'm not sure we can address this complicated issue.
There is also now a separate paper for the clutter detector:
Attentional Cascade for Rapidly Identifying Hundreds of Object Categories (NIPS 2007)


Open Questions

Information
Define hierarchical entropy in a way that takes taxonomic relationships into account.
Learning And Taxonomies
Where an object lands in the tree may give the computer some insight as to what sort of thing it is.
Then it can come back later and ask intelligent questions to incorporate the new information.
Alex's project would be highly relevant here.
Clutter and Taxonomies
Instead of training on a special clutter category, define clutter using the taxonomy. What if the best guesses aren't clustered in the tree, but spread around many different branches (high hierarchical entropy)? If the classification is spread throughout the tree, it is probably "uninteresting"; if it is focused on a few small branches (low hierarchical entropy), it is probably "interesting". That could be powerful: we're defining clutter based only on a set of non-clutter categories! We're saying: clutter consists of things that do not make any sense within the hierarchy. This could be the "2nd-level" in the picture below:

[Image: Cascade Feb19.png]
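One plausible way to formalize hierarchical entropy (an assumption on my part, not a settled definition): weight the entropy of each node's left/right split by the posterior mass that reaches that node, and sum over the tree. Posteriors spread across many branches then score high (clutter), while posteriors focused on a small branch score low:

```python
# Sketch of one possible hierarchical entropy, using the nested-tuple trees
# from the taxonomy sketch above. `posterior` maps category label -> probability.
import numpy as np

def leaves(tree):
    return [tree] if not isinstance(tree, tuple) else leaves(tree[0]) + leaves(tree[1])

def hierarchical_entropy(tree, posterior):
    if not isinstance(tree, tuple):
        return 0.0
    left, right = tree
    p_left = sum(posterior.get(c, 0.0) for c in leaves(left))
    p_right = sum(posterior.get(c, 0.0) for c in leaves(right))
    mass = p_left + p_right
    if mass <= 0.0:
        return 0.0
    split = 0.0
    for p in (p_left / mass, p_right / mass):
        if p > 0.0:
            split -= p * np.log2(p)
    # Weight each split's entropy by the probability mass reaching the node.
    return (mass * split
            + hierarchical_entropy(left, posterior)
            + hierarchical_entropy(right, posterior))

tree = ((0, 1), (2, 3))
print(hierarchical_entropy(tree, {0: 0.9, 1: 0.1}))             # low: focused
print(hierarchical_entropy(tree, {c: 0.25 for c in range(4)}))  # high: spread out
```

Clutter would then be flagged by thresholding this score, using only the non-clutter categories.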