ICCV2005 Computer Vision Contest

"Where Am I?"

Quick links: Registration, Data (latest), Evaluation, Results.

Latest updates:  The formal contest is now over, but I will be happy to let interested people continue submitting results [Nov 2, 2005].
Final results and contest winners on Test5 posted, final data set Test5 posted, ground truth *_test.txt files added to Test4 and Test5 [Nov 2, 2005].
Final results on Test4 posted, finalist selected, prize money announced [Oct 4, 2005].
New test results posted, calibration data posted [Sept 28, 2005].
New data set posted (Test 4), final rules released, results and teams updated [Sep 12, 2005].
First results posted, new teams added, contest page updated [July 31, 2005].
New data sets (2 & 3) posted, data set 1 retired [Jun 5, 2005].

Goals

The goal of this contest was to provide a fun framework for vision students to try their hand at solving a challenging vision-related task and to improve their knowledge and understanding by competing against other teams from around the world.  The contest results were announced in October 2005 at ICCV 2005 in Beijing, but people are welcome to continue refining their algorithms on the provided data or use this data as the basis of class projects.

The task

For the first Computer Vision Contest, we selected the following location recognition task, which we called "Where Am I?" for short.

Contestants are given a collection of color images taken by a calibrated digital camera.  The photographs have been taken at various locations and often share overlapping fields of view or certain objects in common.  The GPS locations for a subset of the images are provided.  The goal of the contest is to guess, as accurately as possible, the GPS locations of the un-labeled images.

Contestants are free to use whatever combination of programs they wish, including existing imaging libraries.  The compiled executable must read a descriptor file that contains a list of images and associated GPS locations (for a subset of the images), and must open and process the JPEG images listed in this file.  Its output must be a similar descriptor file, with the missing GPS locations filled in, which is then sent to the evaluation system for scoring.

Format of the competition

Competitors are provided with one or more "training" test sets, which can be submitted for evaluation to a Web-based scoring system.  This scoring system helps the contest organizers fine-tune the difficulty of the tasks, and helps competitors tune their programs and see how they are doing relative to other contestants.  The actual data set used for the final competition will not be released ahead of the final evaluation.

The exact formula for scoring depends on the expected reliability of the location estimates.  We have not yet decided if there will be a time limit for the execution of the program, or if a standard platform will be provided to run the students' programs against new data sets.

The schedule for the competition is as follows:

	Date	Event
	June 2, 2005	Contest announced on ICCV 2005 home page and initial data sets released
	July 31, 2005	First results posted, registrations keep coming in
	Sept 12, 2005	Final test data set posted, final schedule posted
	Sept 31, 2005	Team registration closes
	Oct 3, 2005	Preliminary round ends, top five entries advance to final
	Oct 5, 2005	Binary executables from top five teams due
	Oct 19, 2005	Final results announced and prizes awarded at ICCV 2005

Teams wishing to advance to the final round and be eligible for the contest prizes must register their team by Sept 31 and send in their results by October 3.  The ranking of the teams in the initial round will be announced the next day, and the top five teams who wish to participate in the final round must send in their binaries (or ZIP files that contain binaries and associated libraries) by October 5.  Note that in order to qualify, selected teams' binaries must exactly reproduce their submitted entries.

The rationale for having teams submit binaries is that it is extremely hard to verify that algorithms aren't being "tweaked" or given human assistance when they are run remotely.  The submitted binaries for the final round must execute on either Windows (2000 or XP) or Linux, and must have a single command file/script or executable that accepts the input text file and outputs a single text file, as described in the Task section above.  If your programs use Matlab, PERL, Python, or other external languages or libraries, these must be included in the ZIP/tar/rar file with the executable.  (See, e.g., http://www.mathworks.com/access/helpdesk/help/toolbox/compiler/deployment_process6.html for instructions on how to include a Matlab runtime with an executable.)

The final round will be run in a "double blind" fashion on identical platforms (one for each operating system), disconnected from network access.  The identity of the teams will be unknown to the people running the code and evaluation software, and the identity of the winning teams will be unknown to the contest organizers until the final award announcement, which will take place during the ICCV awards banquet.

Team Registration

Before you can submit your results to the scoring system, you must first register your team.  Please visit the Mock Web Page, which also serves as the registration form.  (Please note that your team name should have only valid filename characters and no whitespace, and your password should also contain no whitespace, to help with the automated result parsing.)  Save the page to your local disk, edit / fill in the required fields, and e-mail the resulting Web page to the ICCV 2005 Computer Vision Contest e-mail alias.

The password you submit with your registration will be used to validate subsequent contest entries and will be removed from the team Web page before it is posted.  The current e-mail based system for registration and scoring may get replaced at a later date with a completely automated Web-based system.

The list of teams currently registered can be found on the Teams page.

The data

We strongly encourage anyone who is interested in participating in the contest to e-mail the ICCV 2005 Computer Vision Contest alias so that we can advise them when the contents of this Web site change.

The contest data sets consist of ZIP files that contain the JPEG images and the descriptor files.  For convenience, we also provide a viewable page containing thumbnails of the images and direct links to the descriptor files.

Descriptor files

The Input.txt descriptor file lists the name of every image in a given dataset, along with its Longitude and Latitude values:

	Dataset:	Test1
	Name Longitude Latitude
	PIC_0196.JPG 0 0
	PIC_0270.JPG -71.063449 42.355426
	...
"Test" images for which the longitude and latitude values must be inferred by the contestants' programs are tagged with (0,0) values.

The Output.txt result file (produced by the contestants' programs) must list the team name, password, date generated, computation time (which is not currently used in the evaluation), data set name, and name/longitude/latitude triplets (all entries are case sensitive!):

	Team:		MSR1
	Password:	Secret1
	Date_Time:	2005-06-01T13:23:45+08:00
	Elapsed_Time:	01:23:45
	Dataset:	Test1
	Name	Longitude	Latitude
	PIC_0196.JPG	-71.012345	42.123456
	PIC_0270.JPG	-71.063449	42.355426
	...
The Team name and Password must match those submitted at registration time, and the Dataset name must match that given in the Input.txt file.  The Date_Time string indicates the date (and optionally time) when the experiment was run (or submitted), preferably in ISO 8601 format.  Elapsed_Time indicates how long it took to generate the results.  This value is not currently used in computing the final score, but there will be a hard cutoff on the run time in the final round (probably 8 hours), so you must report it.

The result file may or may not contain the names and locations of the "training" images whose true values were given in the Input.txt file; these are ignored in the scoring of the entry (only the "unknown" locations flagged with (0,0) long/lat in the Input.txt file are counted during the scoring).  Also, note that the long/lat values given in the sample Output.txt file are nonsense values, and are for illustrative purposes only.
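
For concreteness, here is a matching Python sketch that writes an Output.txt file in this format.  The function name and all header values are placeholders; substitute the team name, password, and times from your own run:

	# Write an Output.txt result file in the format shown above.
	def write_output(path, dataset, estimates, team, password, date_time, elapsed):
	    with open(path, "w") as f:
	        f.write(f"Team:\t{team}\n")
	        f.write(f"Password:\t{password}\n")
	        f.write(f"Date_Time:\t{date_time}\n")
	        f.write(f"Elapsed_Time:\t{elapsed}\n")
	        f.write(f"Dataset:\t{dataset}\n")
	        f.write("Name\tLongitude\tLatitude\n")
	        for name, (longitude, latitude) in sorted(estimates.items()):
	            f.write(f"{name}\t{longitude:.6f}\t{latitude:.6f}\n")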

In some of the directories/ZIP files (Test4 and Final5), I have also included (at the participants' request) the "ground truth" (answers) for the Test data in a file called *_test.txt.  You can use these results to see how close your algorithm comes to getting the "correct" answer.

Calibration data

You can download ZIP files of images for the Canon S400 and Canon SD500 cameras that I used.  The SD500 was used for Data Sets 2 and 3 and the S400 for Data Set 4.  The final test data set will be shot with the SD500.  (If unsure which images are which, you can always look for "Canon SD500" or "Canon S400" in the image headers.)

The calibration images contain both an image of a checkerboard (for calibrating radial distortion) and panoramas, for calibrating the focal length (and/or radial distortion).  It's up to you how you want to use them.

From previous experience (taking 360° panoramas) with my Canon SD500 and S400, the horizontal field of view is approximately 49 degrees.
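
If you want to turn this field of view into a focal length in pixels, a simple pinhole-model calculation suffices.  The 1600-pixel image width below is only an assumption for the 2MPixel images; use the actual width of the images you process:

	from math import radians, tan

	# Pinhole model: f = (W/2) / tan(hfov/2).  For a 1600-pixel-wide image
	# and a 49 degree horizontal field of view, f is roughly 1755 pixels.
	def focal_length_pixels(image_width=1600, hfov_deg=49.0):
	    return (image_width / 2.0) / tan(radians(hfov_deg) / 2.0)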

Contest data

(Final) Data Set 5 (November 2, 2005):  38 2MPixel images + input and sample output descriptor files (21 MB).
    Note:  This data set is more typical of regular street-side imagery, but involves matching sequences of images to one another.
The top five teams ran their algorithms on this data and their results are reported here.

Data Set 4 (September 12, 2005):  29 2MPixel images + input and sample output descriptor files (19 MB).
    Note:  This data set is more typical of regular street-side imagery, but involves matching sequences of images to one another.
All contestants must produce results on this data set and include their running times in order to qualify for the final round.

Data Set 2 (Jun 5, 2005):  37 2MPixel images + input and sample output descriptor files (25 MB).
    Note:  This data set is reasonably small but is still fairly challenging because of the large number of repeated structures (windows).

Data Set 3 (Jun 5, 2005):  96 2MPixel images + input and sample output descriptor files (59 MB).
    Note:  This data set is quite challenging and should only be attempted after mastering Data Set 2.

Data Set 1 (May 31, 2005):  97 2MPixel images + input and sample output descriptor files (59 MB).
    Note:  This data set had some bugs in it (repeated file names) and is no longer being used.  The same images (minus one mislabeled image) are in Data Set 3, but with different file names. [Rick Szeliski, June 5, 2005]

Evaluation and Scoring

The current evaluation and scoring system is based on e-mail submissions.  Once you have calculated an appropriate Output.txt file, please e-mail the results to the ICCV 2005 Computer Vision Contest alias.  Approximately once a week, the submitted results will be compared against the secret "gold standard" set of locations, and a histogram of distances (errors) will be computed.  [See the note in the Hints section on the computation of these distances.]  A submission's score will be a function of this histogram of location errors, e.g.,

	Team	Date	Elapsed Time	Histogram of Errors (score)	Avg. Score
			< 2 m (5 pts)	< 5 m (4 pts)	< 10 m (3 pts)	< 25 m (2 pts)	< 50 m (1 pt)	> 50 m (0 pts)
	MSR1	2005-06-01	01:23:45	1	4	3	7	2	80	0.47
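
As a sketch (this is not the official scoring code), the rule above can be computed from a list of per-image location errors like this:

	# Each test image earns points according to the error bin it falls into;
	# the score is the average number of points per image.
	BINS = [(2.0, 5), (5.0, 4), (10.0, 3), (25.0, 2), (50.0, 1)]  # (meters, points)

	def average_score(errors_m):
	    total = 0
	    for e in errors_m:
	        total += next((pts for cutoff, pts in BINS if e < cutoff), 0)  # > 50 m scores 0
	    return float(total) / len(errors_m)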

In the future, the e-mail based system may be replaced by an automated Web-based system.

Prizes

Cash prizes in the amounts of $750US, $500US, and $250US were awarded to the first, second, and third place teams at ICCV 2005.

Results

Final results of the competition can be found here, along with the PowerPoint presentation of the awards at ICCV.  Congratulations to the winners and runners up!

I will be happy to start a new table for updated post-competition results for those people who want to continue improving their algorithms.  Please e-mail the ICCV 2005 Computer Vision Contest alias.

Final results on the qualifying round can be found here.  The current results for the development and qualifying data sets can be found by clicking on the appropriate links: Results 2, Results 3, and Results 4.

References

While contestants are free to use whatever algorithms they wish, the following papers may be useful as sources of ideas:

  1. Schaffalitzky, F. and Zisserman, A., Multi-view Matching for Unordered Image Sets, or "How Do I Organize My Holiday Snaps?", ECCV 2002, vol I, pp. 414-431.
  2. James Randerson, Photo recognition software gives location, New Scientist, April 2004 http://www.newscientist.com/article.ns?id=dn4857
  3. B. Johansson and R. Cipolla. A system for automatic pose-estimation from a single image in a city scene. In IASTED Int. Conf. Signal Processing, Pattern Recognition and Applications, Crete, Greece, June 2002. http://mi.eng.cam.ac.uk/~cipolla/Publications.html (Roberto Cipolla's publications page: search for the word “city”)
  4. T. Yeh, K. Tollmar, and T. Darrell.  Searching the Web with Mobile Images for Location Recognition, CVPR'2004, pp. 76-81. http://csdl.computer.org/comp/proceedings/cvpr/2004/2158/02/215820076abs.htm.
  5. Joint affine region detector evaluation (Linux binaries available),  http://www.robots.ox.ac.uk/%7Evgg/research/affine/index.html
  6. Matthew Brown and David G. Lowe, "Unsupervised 3D object recognition and reconstruction in unordered datasets," International Conference on 3-D Digital Imaging and Modeling (3DIM 2005), Ottawa, Canada (June 2005), http://www.cs.ubc.ca/~mbrown/sam/sam.html.

Hints

You don't have to process the full-resolution images if you don't want to.  You can subsample them after reading them in and work with lower-resolution data.

Please remember that Longitude and Latitude refer to spherical coordinates on the Earth, not Euclidean coordinates.

The formula for converting a distance in Long/Lat to meters is given by:
    dlong = R cos((Lat0+Lat1)/2) sin(Long1-Long0)
    dlat  = R sin(Lat1-Lat0)
    d     = hypot(dlong, dlat)
where R = 6371.3 km.  Note that Long/Lat are given in degrees, so these need to be converted into radians before evaluating the above functions.
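
In code, this formula reads as follows (a direct transcription in Python, with the degree-to-radian conversion included; geo_distance_m is just an illustrative name):

	from math import cos, sin, hypot, radians

	R = 6371.3e3  # Earth radius in meters (6371.3 km)

	# Distance in meters between two (Long, Lat) points given in degrees.
	def geo_distance_m(long0, lat0, long1, lat1):
	    lat0, lat1 = radians(lat0), radians(lat1)
	    long0, long1 = radians(long0), radians(long1)
	    dlong = R * cos((lat0 + lat1) / 2.0) * sin(long1 - long0)
	    dlat = R * sin(lat1 - lat0)
	    return hypot(dlong, dlat)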

Daniel Eaton has written a short Matlab script for converting long/lat coordinates to Google Maps URLs, which he has posted on his Web page.

History

Click here to see a list of other ideas that were proposed for the contest.

Acknowledgements

The initial idea of holding a Computer Vision Contest was proposed by P. Anandan in the Fall of 2003.  Since then, many people have contributed to these ideas, including Bill Triggs, Cordelia Schmid, Andrew Zisserman, Jiri Matas, and many others.  (See the history link above for more details.)

Last updated 2-Nov-2005 by Richard Szeliski.  Please e-mail your comments or queries to the ICCV 2005 Computer Vision Contest alias.