COMS E6998: Advanced Topics in Computational Learning Theory
Spring 2005

General Information | Motivation | Prerequisites | Topics | Class Requirements | Readings

ANNOUNCEMENT: There will be no class on Thurs April 21. Project presentations will take place in class on Thurs April 28 and Thurs May 5. On Thurs April 28 class will be held in from 5:30-7:45 in 415 CEPSR, and on Thurs May 8 class will be held from 5:30-7:45 in 825 Mudd. Contact Andrew with any questions about AV setup or scheduling for presentations.

General Information

Instructor: Rocco Servedio (517 CSC, office hours Tues 2-4). Email: rocco at cs dot columbia dot edu. Week of May 2-6: Rocco's office hours will be Thurs 10-12.
TA: Andrew Wan (516 CSC, office hours Mon 3-4 and Wed 2-3). Email: atw12 at cs dot columbia dot edu
Room: 327 Seeley W. Mudd building
Time: Thurs 5:25-7:25pm

COMS E6998 is an advanced graduate course on efficient algorithms in computational learning theory. The course can be used as a theory elective for the Ph.D. program in computer science, or as an track elective course for MS students in the "Foundations of Computer Science" track or the "Machine Learning" track. Auditing the class requires the approval of the instructor. CVN students should contact the instructor before registering for the class.

Motivation

The question "Can machines learn from experience?" is one that has fascinated people for a long time. Over the past few decades, many researchers in computer science have studied this question from a range of applied and theoretical perspectives. We will study this question from the perspective of theoretical computer science, and thus our focus will be on computationally efficient and provably correct algorithms for learning Boolean functions.

There is no single "right way" to model a complex phenomenon such as learning from a theoretical point of view. In this course we will cover several different models of learning: these will include online learning, learning from uniformly distributed random examples (the "uniform distribution PAC learning model"), and exact learning from queries. For each of these models we will study efficient algorithms for learning decision trees, Boolean formulas in Disjunctive Normal Form, intersections of halfspaces, and other interesting classes of functions.

Much of the course will have a complexity-theoretic flavor, and a central theme of the course will be the close connections which exist between techniques and ideas from computational complexity theory and efficient algorithms in computational learning theory.

Prerequisites

COMS 4252 (Computational Learning Theory), or its prior incarnation as COMS 4995, is ideal preparation. Students who have not taken COMS 4252 but who have taken some related coursework (such as Machine Learning, COMS 4236, or COMS 4231) may enroll with the instructor's permission; contact me if you have questions.

List of Topics

This is a rough list of topics which is subject to change. Only a subset of these topics will be covered, and topics not listed here may be covered as well as determined by time and the interest/background of the couse participants. Some topics will take less than one lecture and some will take more.

The course will be roughly divided into three main units corresponding to three different learning models. As we will see, each of these models has its own "flavor" and set of techniques that are useful for designing efficient learning algorithms in that model. These models are the online mistake-bound model, the model of learning from uniform random examples, and the model of exact learning from membership and equivalence queries.

First unit: online mistake bound learning.

Second unit: learning from uniform random examples over {0,1}^n.

Unit #3: exact learning from membership and equivalence queries:

Class Requirements

The requirements of the course are as follows:

Readings

There is no required textbook for the course. Pointers to required and recommended readings (papers and occasional book chapters), or hard copies when appropriate, will be made available to students as the semester progresses. Click here for the readings page.

Here are scribe notes for:
Lecture 1 (1/20/2005)
Lecture 2 (1/27/2005)
Lecture 3 (2/3/2005)
Lecture 4 (2/10/2005)
Lecture 5 (2/17/2005)
Lecture 6 (2/24/2005)
Lecture 7 (3/3/2005)
Lecture 8 (3/10/2005)
Lecture 9 (3/24/2005)
Lecture 10 (3/31/2005)
Lecture 11 (4/7/2005)

Academic Honesty Policy

By taking the course, students are presumed to consent to the Computer Science departmental policy on academic honesty. This policy has been passed by the faculty of the Department and approved by the school deans. The policy will be enforced for all courses, including this one. You can find the text of the policy at http://www.cs.columbia.edu/academics/honesty. For this course in particular, students should be sure to provide appropriate citations for all sources used in their project reports and scribe notes.