Welinder Meeting Log

From Vision Wiki

Wed 16 Dec 2009

A* Search in the Constellation Model

  • Discussed comparison plots between "Expected" and "Best" heuristics
    • Change plots to log scale on x-axis
    • Compare against an "Expected" run that is terminated by a stopping criterion
    • Use the "Best" heuristic to find the global optimum, and then use the global max hypothesis as ground truth for the "Expected" heuristic
  • Ideas for a stopping criterion for the "Expected" heuristic:
    • Stop if the ratio between the score of the chosen leaf hypothesis and the top estimate in the priority queue reaches a threshold value
  • Ideas for speeding up the A* search
    • At a leaf node, drop from the priority queue all hypotheses that agree with the leaf hypothesis in all but one part (which is occluded)
    • At a leaf node, drop all hypotheses that share at least one node with the leaf hypothesis
    • At a leaf node, drop all hypotheses that share some parts with the leaf hypothesis but have different, spatially nearby parts for the unshared ones
    • Add a Hough-transform step: spend at most 1 day on this for the toy data, to initialize with good hypotheses. Does it help at all?
  • Get to work on real images
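
A minimal sketch of the ratio-based stopping rule discussed above, on a toy part-assignment problem. Everything here is an assumption made for illustration: the names `search`, `unary`, `penalty` and `ratio` are hypothetical, scores are assumed positive, and the optimistic bound (sum of the best remaining part scores) is a stand-in rather than the "Best" or "Expected" heuristic from the plots.

```python
import heapq
from itertools import count


def search(unary, penalty, ratio=1.0):
    """Best-first search over part assignments (toy sketch).

    unary[p][c]         : score of candidate c for part p (assumed positive)
    penalty(q, d, p, c) : non-negative cost between part q=d and part p=c
    ratio               : stop once best_leaf >= ratio * top queue estimate
    Returns (best score, best assignment, number of expanded nodes).
    """
    n_parts = len(unary)
    # bound[p] = optimistic score still obtainable from parts p..end.
    bound = [0.0] * (n_parts + 1)
    for p in reversed(range(n_parts)):
        bound[p] = bound[p + 1] + max(unary[p])

    tie = count()  # tie-breaker so the heap never compares tuples of parts
    queue = [(-bound[0], next(tie), 0.0, ())]  # max-heap via negated estimate
    best_score, best_assign = float("-inf"), None
    expanded = 0

    while queue:
        top_estimate = -queue[0][0]
        # Stopping criterion: leaf score relative to the top queue estimate.
        if best_assign is not None and best_score >= ratio * top_estimate:
            break
        _, _, score, assign = heapq.heappop(queue)
        expanded += 1
        p = len(assign)
        for c, u in enumerate(unary[p]):
            s = score + u - sum(penalty(q, assign[q], p, c) for q in range(p))
            child = assign + (c,)
            if p + 1 == n_parts:
                # Complete (leaf) hypothesis: keep the best one seen so far.
                if s > best_score:
                    best_score, best_assign = s, child
            else:
                heapq.heappush(queue, (-(s + bound[p + 1]), next(tie), s, child))
    return best_score, best_assign, expanded


# Toy data: 3 parts, 2 candidates each; mismatched candidates are penalised.
toy_unary = [[5, 4], [3, 6], [2, 7]]
toy_penalty = lambda q, d, p, c: 0.0 if c == d else 3.0
exact = search(toy_unary, toy_penalty, ratio=1.0)
early = search(toy_unary, toy_penalty, ratio=0.8)
```

With `ratio=1.0` and an optimistic bound the search only stops once no queued hypothesis can beat the best leaf; a smaller `ratio` trades optimality for earlier termination.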

Possible Paper

We need to figure out which method to use in Visipedia. Here we compare the CM against Hough-based models and Felzenszwalb's model, to see which one to proceed with.

  • Use 2 standard datasets:
    • (1) Rigid: Hough-based or Discriminative-Parts work, and CM works.
    • (2) Flexible: H or DP don't work, but CM still works.
  • Is CM as fast as the others if we use the A* algorithm?

MTurk Annotations

  • Anything <$100, don't ask. $100-$1000, maybe not. >$1000, ask Pietro.
  • GUI
    • Superimpose a grid on an image when the mouse hovers over it so that the Turker can evaluate size.
    • Use at least 5 Turkers per HIT so that we get enough data
    • Have Turkers answer a quiz of bird questions before they can complete our tasks
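
One way the 5-Turkers-per-HIT rule could feed into a label is sketched below. Majority voting is an assumption here, not something decided in the meeting, and `aggregate_hit` and `min_workers` are hypothetical names.

```python
from collections import Counter


def aggregate_hit(labels, min_workers=5):
    """Majority-vote label for one HIT plus the fraction of workers agreeing.

    Assumption: each entry of `labels` is one Turker's answer for the HIT.
    """
    if len(labels) < min_workers:
        raise ValueError("need at least %d answers per HIT" % min_workers)
    winner, votes = Counter(labels).most_common(1)[0]
    return winner, votes / len(labels)


label, agreement = aggregate_hit(["hawk", "hawk", "eagle", "hawk", "owl"])
```

A low agreement value could be used to flag HITs that need more Turkers.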

Thurs 20 August 2009

Interesting problems:

  1. Efficient categorization: given hundreds of category models, some related (e.g. the different birds) and some unrelated (e.g. bicycles, birds, bottles ...) how do we come up with a sublinear search strategy for a constellation-like (or part-based) model?
  2. Part-based models for families of categories: can we come up with part-based models that will accommodate both broad categories (birds) and sub-categories and sub-sub-categories as well (raptors, hawks, ...)?
  3. Viewpoint-invariant, deformation-invariant, pose-invariant category models: combine multiple view-specific, pose-specific models, or have genuine 3D flexible models?
  4. Active learning. It would be fun to build a complete system that can ask AMT workers intelligent questions and learn categories from huge image databases. Is this realistic? Do we see much to be done beyond what Grauman et al., Burl et al. and Freund et al. have done?


Different flavors of constellation models

  1. Full-covariance constellation (as in Weber et al.)
  2. Triangulated graphs (as in Grenander, Amit et al., von der Malsburg et al., Carneiro and Lowe)
  3. K-fans (Felzenszwalb and Huttenlocher)
  4. Star models: complex (as in Moreels et al, where the common reference frame could encode the pose, deformation, ...) or simple, with just a shared position and scale.
  5. Bounding-box models: this is the simplest version of a star model, where one imposes that the parts are inside a box with a given position and scale
  6. Bag of features: no condition on position and scale of the features. Just counting them within the image.
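
To make the difference between the two simplest flavors concrete, here is a toy sketch. The feature format `(kind, x, y)` and the function names are assumptions for illustration only; scale is reduced to a single box side length.

```python
from collections import Counter


def bag_of_features(features):
    """Bag of features: count feature types, ignoring position and scale."""
    return Counter(kind for kind, _x, _y in features)


def in_box(features, cx, cy, size):
    """Keep features inside a square box centred at (cx, cy) with side `size`."""
    half = size / 2.0
    return [f for f in features
            if abs(f[1] - cx) <= half and abs(f[2] - cy) <= half]


def box_model_matches(features, required_parts, cx, cy, size):
    """Bounding-box model: every required part must occur inside the box."""
    present = bag_of_features(in_box(features, cx, cy, size))
    return all(present[p] > 0 for p in required_parts)


# Hypothetical detections: (part type, x, y)
detections = [("beak", 10, 10), ("wing", 12, 14), ("tail", 80, 90), ("wing", 11, 9)]
counts = bag_of_features(detections)
```

The bag-of-features count sees the far-away tail; the box model only accepts it if the box is large enough to contain it.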

Conversation with Rob Fergus on goods and bads of old constellation model

  • Code is complicated because we were too careful about mutual dependencies and occlusions
  • Currently star models seem more efficient (Felzenszwalb, Ramanan ...). They do not model occlusion explicitly, but the cost function saturates when parts go too far from their nominal position. They used the distance transform to make matching very efficient.
  • Something we did not do well (and Pedro and Deva did) was coming up with good representations/detectors for the individual parts. They carefully trained robust detectors for the individual parts.
  • Perhaps Pedro and Deva used supervised training, while we were insisting on unsupervised.
  • Rob likes Leo Zhu's work with Alan Yuille as well. They have hierarchies of parts. They get interesting compositions of corners and parts. What looks weak is that they only use the edges, rather than the pixels. Yann LeCun and Geoff Hinton like hierarchical models too.
  • Likes Stefan Roth's work with Bernt Schiele on pedestrian detection. They get good results also with occlusion.
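
For reference, a minimal 1D sketch of the distance-transform trick mentioned above: for every nominal position q it finds the best part placement p under a quadratic deformation cost in linear time (the lower-envelope method of Felzenszwalb and Huttenlocher). The optional `cap` is an assumption about how a saturating cost could be added; `dt1d` and `cap` are hypothetical names, and `f` is assumed to hold finite part-detector costs.

```python
def dt1d(f, cap=None):
    """d[q] = min over p of f[p] + (q - p)**2, optionally saturated at `cap`.

    With cap set, the deformation term becomes min((q - p)**2, cap).
    """
    n = len(f)
    if n == 0:
        return []
    v = [0] * n            # positions of parabolas in the lower envelope
    z = [0.0] * (n + 1)    # boundaries between neighbouring parabolas
    k = 0
    z[0], z[1] = float("-inf"), float("inf")
    for q in range(1, n):
        while True:
            # Intersection of the parabola rooted at q with the one at v[k].
            s = ((f[q] + q * q) - (f[v[k]] + v[k] * v[k])) / (2 * q - 2 * v[k])
            if s <= z[k]:
                k -= 1     # parabola v[k] is hidden; drop it
            else:
                break
        k += 1
        v[k] = q
        z[k], z[k + 1] = s, float("inf")

    d = [0.0] * n
    k = 0
    for q in range(n):
        while z[k + 1] < q:
            k += 1
        d[q] = (q - v[k]) ** 2 + f[v[k]]

    if cap is not None:
        # Saturating cost: far-away placements never cost more than `cap`.
        floor = min(f) + cap
        d = [min(x, floor) for x in d]
    return d


costs = [4.0, 1.0, 9.0, 0.0, 5.0]   # toy part-detector costs along one axis
plain = dt1d(costs)
saturated = dt1d(costs, cap=2)
```

In 2D the same pass is applied along rows and then columns, which is what makes star-model matching cheap.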

Conversation with Piotr Dollar on potential thesis topics

Models with parts

  • Part models
  • Geometry
  • 2D or 3D

Very many categories, some highly related

  • describing relationships between related categories
  • learning both broad models ("bird") and more specific ones
    • attribute learning, learning shared parts -- new and interesting problem, although we should avoid clashing with Steve's project
  • efficient detection/classification
  • learning quickly -> transfer learning, active learning -- very interesting, but hard to make lasting contribution
  • generative vs discriminative -- maybe not so new and interesting anymore

Human in the loop / Active Learning of Constellation models

  • one source of infinite data
  • there is a cost associated with training on the data (money and time)
  • there is also a cost for getting labels
  • in the learning process, take these costs into consideration
  • use a standard model (maybe Piotr's pedestrian code, or the constellation model)

Tues. 11 August 2009

Discussed Visipedia: what the page looks like at the moment, and what problems we want to address.

  • Picked random images from Wikipedia and searched with them using Visipedia. In about 50% of the cases, we got good results.
  • Created a new category: Japanese bush warbler.
  • Examined pages with taxonomies of birds.
  • Possible interface: provide a list of bird species, and the system searches Flickr for examples, filters them through AMT and adds them to the website.
    • Is this what we want to do next?

Mon. 10 August 2009: 10 min

1. Visipedia

    • Talk
    • Serge collaboration
    • What do we do?

2. Families

    • Card Database
    • Theory

3. Benchmark (Mohamed's project)

    • Journal Paper
    • Experiments
    • Code

4. Qualcomm

Visipedia:

    • Software System
    • How to probe Human annotators
    • Segmentation
    • ...

Talk about the above topics tomorrow.

Mon. 15 December 2008

  • Discussed results from benchmarking the Philbin system on the 5K Oxford data set.
  • Will write a brief Tech Report on the results.
    • Reimplemented the system based on the following papers, but using slightly different components.
    • Details missing in the original papers.
    • Results
    • New datasets: other objects/buildings (Caltech Cannon, Beckman Institute, Beckman Auditorium, Henry Moore Statues, Pieta Rondanini statue)
  • Need to acquire a large dataset (download Flickr images or ask Zisserman's group)