Timetable and Summary
November 4 (Thu) Proposal due
November 9 (Tue) Teaming agreements fixed
December 14 (Tue) Paper or project due

General Rules
The course paper and/or project will be the major requirement of the course. Thus, a quality job commensurate with three course credits is expected. Although size is not a measure of quality, the approximate amount of effort should be comparable to that of writing a major term paper of about 30 pages. Implicit in such a paper is library research of a dozen or so articles. Projects should show a similar amount of dedication, with a final writeup of similar size, although about half of the document can be program listings and results. If the project is done as a team (teams are limited to two people, except by special permission), the result should show proportionate effort (that is, 60 pages or the equivalent).

If you intend to submit your paper or project to another course as well, you must obtain permission. The understanding is that such a joint investigation is for the sake of depth, and the result should show a proportionate increase in the amount of work (that is, again about 60 pages). A joint effort across courses should truly be a work of art.

For a Paper
Investigate and report on the state of the art in human and/or machine visual and/or spatial perception. You may have to use libraries other than Engineering and Science. What is important in the paper is scholarly research and reflection, not breadth of coverage. You are not required to make an original contribution, although a well-written paper might see publication. You must show some evidence of an attempt at a synthesis: the field is disparate, so your work should help tie it together, at least in your own mind. Thus, the last five to seven pages or so (15-25%) should serve future readers in the area as a personal guide. Prior course papers have included: analyses of current commercial systems (such as iris verification, fingerprint analysis, blood cell analysis, robot spacecraft), analyses of current research systems (gesture or sign language analysis, face recognition, gestural control of robots), and designs for new visual interface application areas (psychology of visual interfaces, musical instrument simulation, robotic delivery vehicles).

For a Project
Program up a small version of some visual input processor of human data. Pick any one that has appeared in the scientific literature, or invent your own. Stay small: this is difficult work, even if someone has gone before you. Ensure that you complete the project by building in stages, with well-defined bail-out points. Two-dimensional data are acceptable. You can use a real-time camera (hard!), or just a real camera (not so hard), or input that has been digitized off-line (easy: for example, faces, fingerprints, photographs of parking lots, etc.). The emphasis should be on the assertion of a small symbol: a binary decision (authorization) or a token from a small set (gesture class, etc.), perhaps with modifying scalar data (certainty factors, positions, speeds, etc.). Prior course projects have included: a visual burglar alarm (demonstrated in my own office!), a system which detected class changing times from a video of campus pedestrian traffic, and two separate systems using "visual passwords" for computer logins.
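
As one illustration of what "asserting a small symbol" might look like, the following Python sketch compares an off-line digitized grayscale frame against a stored background frame and emits a binary presence decision with a certainty factor. Everything named here (Assertion, changed_fraction, assert_presence), the two thresholds, and the certainty formula are assumptions made for this example, not a prescribed design. Frames are assumed to be plain lists of rows of 0-255 intensities.

```python
from dataclasses import dataclass
from typing import List

Frame = List[List[int]]  # grayscale image: rows of 0..255 intensities (assumed format)


@dataclass(frozen=True)
class Assertion:
    symbol: str       # token from a small set: "present" or "absent"
    certainty: float  # modifying scalar in [0.0, 1.0]


def changed_fraction(frame: Frame, background: Frame, pixel_threshold: int = 30) -> float:
    """Fraction of pixels that differ from the background by more than the threshold."""
    total = 0
    changed = 0
    for row, bg_row in zip(frame, background):
        for pixel, bg_pixel in zip(row, bg_row):
            total += 1
            if abs(pixel - bg_pixel) > pixel_threshold:
                changed += 1
    return changed / total if total else 0.0


def assert_presence(frame: Frame, background: Frame, area_threshold: float = 0.10) -> Assertion:
    """Binary decision plus a certainty that grows with distance from the threshold."""
    fraction = changed_fraction(frame, background)
    if fraction >= area_threshold:
        # 0.5 exactly at the threshold, rising to 1.0 when every pixel has changed.
        certainty = 0.5 + 0.5 * (fraction - area_threshold) / (1.0 - area_threshold)
        return Assertion("present", certainty)
    # 0.5 just below the threshold, rising to 1.0 when nothing has changed.
    certainty = 0.5 + 0.5 * (area_threshold - fraction) / area_threshold
    return Assertion("absent", certainty)
```

A staged build might stop here as a first bail-out point, then add a camera front end or a larger symbol set only if time allows.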

Proposal Structure

Since there will be a lot of work involved in an area that is not well understood, it will be important to forestall any major catastrophes early. You are required to submit a paper/project proposal, which will be critiqued and returned within a week. It should address the following checklist:

  • The phenomena or program to be investigated.
  • A description of the limits you have placed on the investigation. This is most important for programs, but even papers can get out of hand. In past experience, most initial project proposals are approximately ten times too ambitious.
  • At least two major references: textbooks, journal articles, or (possibly) computer code. This must be a part of the proposal: start now!
  • A two-page sketch of the anticipated results. For a paper, this would usually be the sort of personal synthesis you hope to be able to get out of it. For a program, this would be program performance. You are not held to this as a promise, but it will help you decide among topics.
  • Any special help that may be required, such as exotic references or equipment. If such help is necessary but unattainable, the proposal will have to be denied.

Sample Topics

The following are intended only as suggestions to get you started. If you are interested in something that deals with visual (input) interfaces, please feel free to suggest it in the proposal, and something will be worked out.

For a Paper

The following domains each appear to offer multiple topics, with varying degrees of development and commercialization.

  • The medical domain is a rich area for visually significant human data. Blood cells, cancer cells, chromosomes, iris patterns, retinal patterns, thumb or fingerprints, palm prints, lip prints, faces, and gross body structure have all been considered and/or commercialized in systems that provide identity and/or health information via visual means.
  • There are many security and surveillance systems being developed, which include room, parking lot, freeway, and prison monitoring, sometimes using mobile robot platforms.
  • The sports world reportedly has systems to help analyze performance such as more effective use of the baseball bat, golf club, vault pole, etc. Several systems for virtual sports based on visual analysis of the athlete also exist.
  • The art world has some automated interaction systems for "performance" art.
  • Some initial attempts at coordinating facial analysis with speech understanding, and even at reading "mental state" (emotions, etc.), have been demonstrated.

For a Project

The following simple systems should be possible.

  • An automatic screensaver or login/logout control, based on detecting the presence of a user at the keyboard, even if only one-way: the user goes away, which turns the screensaver on or logs out; or the user appears, which turns the screensaver off or logs in; or both. Possibly, the detected head direction might also serve as a trigger: the user looks away, etc.
  • A simple gestural password system: a gross pose or motion of the user serves as the password, which, like baseball signals, might be embedded in decoy gestures. Or, the "password" could be used to trigger some favorite application, or terminate it (a visual "here comes the boss!" switch, for example).
  • A simple activity sensor ("lots" of change over "short" intervals of time) could be used to do things like control the volume of music from the system, which fades away as the activity calms down.
  • The gross sizing of a window keyed to the perspective effect of a looming hand: hand nearer makes window bigger, and vice versa.

All of these require some operating system hacking, but not very much sophisticated visual processing.
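
To suggest how little visual processing the activity sensor above might need, here is a minimal Python sketch based on simple frame differencing. It is only one possible approach: the function names (frame_change, volume_levels) and the gain and decay parameters are hypothetical choices for this example, frames are assumed to be lists of rows of 0-255 grayscale intensities, and the operating system side (actually setting the music volume) is left out entirely.

```python
from typing import Iterable, Iterator, List, Optional

Frame = List[List[int]]  # grayscale image: rows of 0..255 intensities (assumed format)


def frame_change(previous: Frame, current: Frame) -> float:
    """Mean absolute intensity difference between two frames, scaled to 0..1."""
    total = 0
    count = 0
    for prev_row, cur_row in zip(previous, current):
        for prev_pixel, cur_pixel in zip(prev_row, cur_row):
            total += abs(cur_pixel - prev_pixel)
            count += 1
    return total / (255 * count) if count else 0.0


def volume_levels(frames: Iterable[Frame], gain: float = 4.0, decay: float = 0.5) -> Iterator[float]:
    """Yield one volume level in 0..1 for each consecutive pair of frames.

    The volume jumps up as soon as activity is seen and is multiplied by
    `decay` on each later frame, so it fades as the scene calms down.
    """
    volume = 0.0
    previous: Optional[Frame] = None
    for frame in frames:
        if previous is not None:
            activity = min(1.0, gain * frame_change(previous, frame))
            volume = max(activity, volume * decay)
            yield volume
        previous = frame
```

For example, feeding it one dark frame followed by three identical bright frames yields a level that jumps to full volume and then halves on each quiet frame. The gain and decay values would need tuning against real camera noise.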

If you are considering camera input, you should consider the cameras and interfaces available in the CLIC lab. There are also some low-cost cameras available for purchase on the net: try starting at www.connectix.com for some ideas, or investigate the code behind CUSeeMe.