Nissan
TO DO
- (DONE) May-July 07 -- Define project goal, write statement of work (Dollar, Watanabe, Shimomura, Perona)
- (DONE) July-August 07 -- Collect datasets (part acquired in Japan, part in US if necessary)
- (NOT DONE) July-Aug 07 -- Annotate datasets
- (DONE) August 07 -- No cost extension letter sent to Nissan by Caltech, together with new statement of work and with new confidentiality agreement (written by S. Dasgupta)
- September 07 - Signed progress report to be sent before end of September 2007 (report written by Dollar and Watanabe, signed by Perona)
- September 07 - Seigo and P. Dollar will take traffic video together. Seigo will instruct Dollar on how to replicate the Nissan camera setup. Seigo will produce a description of how to install the camera in a car, and how to gather good quality video.
- October 07 - Fast annotation of datasets
- November 07 - Dollar and Perona implement and test 3 promising methods for pedestrian detection. Benchmark on Nissan US data. If ready by end of November, to be submitted to CVPR 2008.
- December 07 - Detailed annotation of 1/4 of dataset
- December 07 / January 08 - Possible visit by Perona to Nissan
- January 08 - Second progress report from Caltech (written by Dollar)
- January 08 - Discuss a 1- or 2-year follow-up of the project.
- September 08 - Third progress report from Caltech, written by Dollar, sent to Nissan by 15 Sept
Added on Sep. 5, 07:
- Dr. Perona or Dr. Dollar will let Nissan know to which conferences papers are submitted, as soon as submission is decided or registration is done.
- Dr. Perona or Dr. Dollar will submit a copy of each proposed publication to Nissan prior to submission to the conference (non-technical parts to be reviewed mainly by Nissan)
- Dr. Dollar will email Nissan a short progress update every one to two months.
Priorities for collaboration (updated 4 Sept 07)
The scenario is a narrow urban road with many pedestrians, poor lane markers, clutter of various sorts (parked vehicles, garbage dumpsters, buildings ...)
- Pedestrian detection
If it helps pedestrian recognition, also consider:
- road region recognition (without lane markers)
- improved lane marker recognition (for all lane markers)
Camera characteristics:
- ~640x480, 30Hz (i.e. NTSC),
- probably grayscale,
- preferably 1 camera rather than 2,
- 30-40 degree horizontal field of view
(In May 07 Seigo Watanabe obtained for Caltech a Panasonic camera, identical to the one used by Nissan).
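For reference, a minimal Python sketch of these camera parameters and the focal length they imply under a pinhole model. The class name and the 35-degree midpoint are illustrative assumptions; only the specs listed above come from Nissan.

    import math
    from dataclasses import dataclass

    @dataclass
    class CameraConfig:
        width_px: int = 640       # NTSC-like resolution
        height_px: int = 480
        fps: float = 30.0         # ~NTSC frame rate
        grayscale: bool = True
        hfov_deg: float = 35.0    # assumed midpoint of the 30-40 degree range

        def focal_px(self) -> float:
            # Pinhole model: f = (width / 2) / tan(hfov / 2)
            return (self.width_px / 2) / math.tan(math.radians(self.hfov_deg) / 2)

    print(f"focal length ~ {CameraConfig().focal_px():.0f} px")   # ~1015 px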
Pedestrian detection specs:
- detect pedestrians at near and medium distance (up to 80m): from 480 down to 25 pixels tall
- main focus is improved accuracy - no need to worry about computation time for now (goal for Nissan: 10Hz or better)
- need to keep track of multiple (10-20) pedestrians
- partially occluded pedestrians are important
- should consider all weather conditions and all times of day, except night
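As a quick sanity check of the size range above, a pinhole-model sketch, assuming a 1.7m pedestrian and the 30-degree end of the field-of-view range (both assumptions are mine, not Nissan specs):

    import math

    f_px = 320 / math.tan(math.radians(15))   # focal length in pixels (~1194)
    ped_h = 1.7                               # assumed pedestrian height (m)

    def height_px(dist_m: float) -> float:
        # Projected pedestrian height in pixels at a given distance
        return f_px * ped_h / dist_m

    for d in (5, 10, 20, 40, 80):
        print(f"{d:3d} m -> {height_px(d):6.1f} px")

At 80m this gives ~25 px, matching the lower bound quoted above; 480 px corresponds to a pedestrian roughly 4m from the camera.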
Data Collection (updated 4 Sept 07)
- Video: 10h of video, i.e. ~1.08M frames at 30Hz. Annotate one frame in every 100 (approx. 100h of annotation time at 100 frames/h; see the check after this list). If the video is annotated at UCSD, then we will share the video and the annotations with Professor Belongie's group.
- Collected 10h of video in Los Angeles County:
Seigo Watanabe. Route: Pasadena -> LA downtown -> Hollywood (Walk of Fame) -> Beverly Hills (Rodeo Dr) -> Santa Monica (5th) -> LAX
- Collected 10h of video in Japan
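A back-of-envelope check of the annotation budget above, using only the figures already stated (30Hz, 10h, one frame in 100, 100 frames/h):

    hours_video  = 10
    fps          = 30
    total_frames = hours_video * 3600 * fps   # 1,080,000 frames
    annotated    = total_frames // 100        # 10,800 frames to annotate
    hours_needed = annotated / 100            # 108 h, i.e. the ~100h estimate
    print(total_frames, annotated, hours_needed)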
Annotation (updated 4 Sept 07)
Fast Annotation
Fast annotation is a videogame in which users view videos taken from a car, played back at full speed. Users are asked to click on every pedestrian as soon as they detect it. Priority should be given to pedestrians in a dangerous position with respect to the vehicle. Multiple clicks on the same pedestrian are encouraged, especially when the pedestrian's state changes (e.g. a pedestrian starts crossing the road).
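A minimal sketch of such a click-logging tool, using OpenCV. The library choice, the placeholder file name, and the output format are illustrative assumptions, not part of the plan:

    import cv2

    clicks = []   # (frame_index, x, y), one entry per pedestrian click

    def on_click(event, x, y, flags, param):
        if event == cv2.EVENT_LBUTTONDOWN:
            clicks.append((param["frame"], x, y))

    cap = cv2.VideoCapture("drive.avi")       # placeholder video file
    state = {"frame": 0}
    cv2.namedWindow("fast-annotation")
    cv2.setMouseCallback("fast-annotation", on_click, state)

    while True:
        ok, img = cap.read()
        if not ok:
            break
        state["frame"] += 1
        cv2.imshow("fast-annotation", img)
        if cv2.waitKey(33) == 27:             # ~30Hz playback; Esc quits
            break

    cap.release()
    cv2.destroyAllWindows()
    print(f"logged {len(clicks)} pedestrian clicks")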
Accurate Annotation
- Goal: bounding boxes on pedestrians. Will need to estimate the time needed to achieve this, using fast annotation to select the most promising frames. Need to schedule the annotation. (A sketch of one possible record format follows this section.)
- Annotation Specifications
- pedestrian bounding box
- Examples of good bounding boxes (not too large, just right fit)
- Examples of bounding boxes for occluded pedestrians
- additional pedestrian info: orientation, height or distance, "danger level", male/female?, etc.
- image info: weather, day/night, fog, etc.
- other information?
- In the future may consider additional annotation:
- road information: road boundaries, lanes, sidewalks, crossing regions
- rough image region annotation: road, buildings, sky, trees, vanishing point etc.
- other objects: cars, signs, ?
- Unlabeled data
- labeled frames should not be consecutive; ideally one frame every 3-4 seconds
- partially labeled data also useful (with say only bounding boxes, or even just centroid location)
- consider active learning to speed up labeling
- Public availability of the data:
- Nissan data will not leave the Caltech vision lab
- US traffic sequences may be shared with other groups
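For concreteness, one possible record format for the accurate annotations listed above. This is a sketch; the field names and the catch-all extras dictionary are my assumptions, not an agreed specification:

    from dataclasses import dataclass, field

    @dataclass
    class PedestrianBox:
        frame: int                # frame index in the source video
        x: int                    # tight bounding box, in pixels
        y: int
        w: int
        h: int
        occluded: bool = False    # partially occluded pedestrian
        extras: dict = field(default_factory=dict)  # orientation, distance, "danger level", ...

    @dataclass
    class FrameAnnotation:
        frame: int
        weather: str = "unknown"  # image-level info: weather, day/night, fog
        boxes: list = field(default_factory=list)

    ann = FrameAnnotation(frame=300, weather="sunny")
    ann.boxes.append(PedestrianBox(frame=300, x=120, y=80, w=40, h=96))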
Deliverables (updated 4 Sept 07)
- Statistics of pedestrian frequency, size, position, velocity in the US and Japan datasets
- Implementation of 3 existing methods (Triggs/Forsyth, Viola/Jones, shapelets) for pedestrian detection
- Benchmark of the 3 methods from the literature (ROC curves; see the evaluation sketch after this list)
- Use of position prior to improve detection performance
- Detection rates vs resolution
- Analysis of causes of false rejects (occlusion, resolution, motion blur, ...)
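One possible way the ROC benchmark could be computed: sweep a score threshold, greedily match detections to ground truth at IoU >= 0.5, and report detection rate vs. false positives per frame. This is an illustrative protocol of my choosing, not the project's agreed evaluation code:

    def iou(a, b):
        # Intersection over union of two (x, y, w, h) boxes
        x1, y1 = max(a[0], b[0]), max(a[1], b[1])
        x2 = min(a[0] + a[2], b[0] + b[2])
        y2 = min(a[1] + a[3], b[1] + b[3])
        inter = max(0, x2 - x1) * max(0, y2 - y1)
        return inter / (a[2] * a[3] + b[2] * b[3] - inter)

    def roc_point(frames, thresh):
        # frames: list of (gt_boxes, [(score, box), ...]) pairs, one per frame
        tp = fp = n_gt = 0
        for gts, dets in frames:
            n_gt += len(gts)
            unmatched = list(gts)
            for score, box in sorted(dets, reverse=True):
                if score < thresh:
                    continue
                hit = next((g for g in unmatched if iou(box, g) >= 0.5), None)
                if hit is not None:
                    unmatched.remove(hit)
                    tp += 1
                else:
                    fp += 1
        # (detection rate, false positives per frame) at this threshold
        return tp / max(n_gt, 1), fp / max(len(frames), 1)

Sweeping thresh over the range of detector scores traces out the full curve.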