Capturing Photos and Videos with Tagged Pixels
Digital cameras, from professional SLRs to cellphone cameras, have become ubiquitous in daily life. Today, cameras are used not only for photography but also to access information. For example, some cellphone cameras enable a user to take images of barcodes and obtain information about the objects they are attached to. The appearance of a barcode in an image depends on its distance and inclination with respect to the camera as well as the illumination conditions. Given the limited resolution, dynamic range, and depth of field of a camera, it is difficult to reliably detect barcodes in images taken from arbitrary viewpoints and distances.

In this project, we propose the use of active illumination to create optical (“virtual”) tags in a scene. As a key advantage, our method does not require making physical contact with, or alterations to, the objects in the scene, as existing tagging methods (barcodes or LED tags) do. Our basic idea is simple: We use an infrared (IR) projector to project temporally-coded (blinking) dots onto selected points in a scene. These tags are invisible to the human eye, but detected as time-varying codes by an IR-sensitive photo detector. As a proof-of-concept, we have implemented a prototype camera system (consisting of an off-the-shelf camcorder and a co-located IR video camera) to simultaneously capture visible and IR images. In a tagged scene, this camera can acquire photos as well as videos (taken as the camera moves) with tagged pixels.
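The temporal decoding step can be sketched as follows. This is a hypothetical illustration, assuming each tag blinks a fixed-length binary code, one bit per IR frame; the actual coding scheme used by the system may differ.

```python
import numpy as np

def decode_tag(intensities, code_length=8):
    """Decode a temporal tag ID from per-frame IR intensities at one pixel.

    Hypothetical sketch: assumes each tag blinks a fixed-length binary
    code, one bit per IR frame (the system's real coding may differ).
    """
    samples = np.asarray(intensities, dtype=float)
    if len(samples) < code_length:
        return None  # not enough frames observed to decode a full code
    # Threshold halfway between the dimmest and brightest observations.
    thresh = 0.5 * (samples.min() + samples.max())
    bits = (samples[:code_length] > thresh).astype(int)
    # Pack the bits (MSB first) into an integer tag ID.
    tag_id = 0
    for b in bits:
        tag_id = (tag_id << 1) | int(b)
    return tag_id

# Example: a dot blinking the pattern 1,0,1,1,0,0,1,0
print(decode_tag([220, 30, 210, 200, 25, 40, 215, 35]))  # -> 178 (0b10110010)
```

In practice the detector would also need to synchronize to the start of each code word and reject pixels whose intensity never crosses the threshold; those details are omitted here.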

In the illustration above, we show three examples of images with tagged pixels captured using our system. Each detected tag (green dot) carries information about the 3D location of the tag and the identity of the object it falls on. By using the 3D positions of the detected tags in a single image, a user camera can robustly and efficiently estimate its pose. This pose information is used to compute the 2D coordinates of the invisible tags in the scene on the captured image. The invisible tags include ones that are occluded in the scene (red dots) as well as ones that lie outside the field of view of the camera (blue dots). Such functionality is very difficult, if not impossible, to achieve by using traditional physical tags (e.g., LED tags) that do not convey 3D information.
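Once the camera pose has been estimated from the visible tags (e.g., with a PnP solver), mapping every stored 3D tag location, including occluded and out-of-view ones, to 2D image coordinates is a standard pinhole projection. The sketch below assumes a known rotation `R`, translation `t`, and intrinsic matrix `K`; these names and the function itself are illustrative, not the system's actual implementation.

```python
import numpy as np

def project_tags(points_3d, R, t, K, image_size):
    """Project 3D tag locations into a camera with pose (R, t) and intrinsics K.

    Hypothetical sketch: given a pose estimated from the detected tags,
    every stored tag can be mapped to 2D pixel coordinates, and tags that
    fall outside the frame can be flagged as out of view.
    """
    pts = np.asarray(points_3d, dtype=float)      # (N, 3) world coordinates
    cam = (R @ pts.T + t.reshape(3, 1)).T         # transform to camera frame
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]                   # perspective divide
    w, h = image_size
    in_view = (uv[:, 0] >= 0) & (uv[:, 0] < w) & (uv[:, 1] >= 0) & (uv[:, 1] < h)
    return uv, in_view

# Identity pose, simple intrinsics: a point on the optical axis projects to
# the principal point; a point far to the side falls outside the frame.
K = np.array([[500.0, 0, 320], [0, 500.0, 240], [0, 0, 1]])
uv, vis = project_tags([[0, 0, 2], [5, 0, 2]], np.eye(3), np.zeros(3), K, (640, 480))
print(uv[0], vis)  # -> [320. 240.] [ True False]
```

A depth test against the scene geometry would additionally distinguish occluded tags (red dots) from visible ones; that check is not shown here.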

We demonstrate several applications of our system, including photo-browsing, e-commerce, augmented reality, and object localization.


"Capturing Images with Sparse Informational Pixels using Projected 3D Tags,"
L. Zhang, N. Subramaniam, R. Lin, S. K. Nayar, and R. Raskar,
Proceedings of IEEE Virtual Reality,
Mar. 2008.
[PDF] [bib] [©]

PhotoTags Interactive Demonstration

  PhotoTags - An Interactive Demo:
The PhotoTags program demonstrates several applications of the system, such as photo-browsing and information search. In this web-based demo, you can explore a museum that showcases various use cases of the system.

Macromedia Flash Required.
Also, depending on your connection speed, the demo may take up to a minute to load.


  The 3D Tagging System and the Authoring Procedure:
The 3D tagging system consists of an IR projector with a co-located color camera and an IR camera.
  A Tag-Enabled Hybrid Camera:
To capture tagged photos, the user camera must be able to acquire a color photo and an IR video simultaneously. As a proof-of-concept, we have constructed a hybrid camera for this purpose. This camera consists of a consumer camcorder and an IR camera.
  Museum Browsing Example:
We have developed a novel interactive viewer called PhotoTags for browsing collections of tagged photographs. In this example, we show the use of PhotoTags to browse a collection of tagged photos taken in a museum.
  Toy Store Browsing Example:
Here, we show the use of PhotoTags to browse a collection of tagged photos taken in a toy store.


  VR 2008 Full Video:
This video introduces the tagging system and the user camera, summarizes the authoring procedure, and demonstrates several applications of our method, including photo-browsing, augmented video, and object localization. (With narration)
  Photo Browsing:
This video shows how PhotoTags can be used to enhance the photo browsing experience. (With narration)
  Video Augmentation:
This video shows how our tagging method can be used to augment videos with textual information. (With narration)
  Object Search:
This video shows the use of the tagging method to quickly search for an object by taking a tagged photo. (With narration)


  PowerPoint Slide Presentation:
The presentation given at the IEEE Virtual Reality 2008 Conference in Reno, Nevada. Note that the zipped file also includes all associated videos.

[ .zip | 27 MB ]

Related Projects

Project Anywhere: Radiometric and Geometric Compensation

Project Anywhere: Focus and Pixelation