Hila Becker


Upcoming Dataset

The Upcoming dataset of Flickr photo metadata used in:

Hila Becker, Mor Naaman, Luis Gravano, "Learning Similarity Metrics for Event Identification in Social Media", in Proceedings of the Third ACM International Conference on Web Search and Data Mining (WSDM'10).

is available here.

Please note that this is the raw data, collected via the Flickr API. The tags column contains machine tags (e.g., "upcoming:event=12345", "lastfm:event=67890") that can be used to identify the photo's Upcoming event ID. We removed these machine tags from the list of tags for each photo for the purpose of our experiments.
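To reproduce the tag cleanup described above, the machine tags have to be separated from the ordinary tags. The following is a minimal sketch, not the code used in the paper. It assumes that each photo's tags have already been read into a list of strings and that machine tags follow the namespace:predicate=value shape of the two examples above. The function name split_tags and the regular expression are illustrative assumptions.

```python
import re

# Assumed machine-tag shape: namespace:predicate=value,
# e.g. "upcoming:event=12345" or "lastfm:event=67890".
MACHINE_TAG = re.compile(r"^([a-z0-9_]+):([a-z0-9_]+)=(.+)$", re.IGNORECASE)


def split_tags(tags):
    """Separate machine tags from ordinary tags (hypothetical helper).

    Returns (plain_tags, event_ids). plain_tags keeps the ordinary tags
    in their original order. event_ids maps a namespace such as
    "upcoming" or "lastfm" to the list of event IDs found for it.
    """
    plain_tags = []
    event_ids = {}
    for tag in tags:
        match = MACHINE_TAG.match(tag.strip())
        if match is None:
            # Not a machine tag: keep it as an ordinary tag.
            plain_tags.append(tag)
            continue
        namespace, predicate, value = match.groups()
        # Every machine tag is dropped from plain_tags; only those with
        # an "event" predicate are recorded as event IDs.
        if predicate.lower() == "event":
            event_ids.setdefault(namespace.lower(), []).append(value)
    return plain_tags, event_ids


if __name__ == "__main__":
    plain, ids = split_tags(["concert", "upcoming:event=12345", "nyc"])
    print(plain, ids)
```

How the tags column is delimited in the raw file is not stated here, so splitting that column into a list is left to the reader.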

Additional details can be found in the paper.

If you wish to use this dataset in a publication, please cite the above paper (BibTeX).