COMS 6998-006 Network Theory

Call Number 98461

Spring 2008

Thursdays 6:10-8 PM

Place 233 Mudd

Dragomir R. Radev

Goal of the course

To allow graduate students to catch up with recent developments in network theory, focusing on existing networks such as the web and protein interaction networks and the methods and algorithms for analysing them.

Brief description

The course is about naturally occurring networks such as the Web, social networks, citation networks, protein interaction networks, lexical networks, movie actors, etc. The course will cover the mathematical and computational models that explain the behavior of these networks. Specific topics include random graphs, small worlds, scale-free networks, random walks and harmonic functions, spectral methods, descriptive analysis of networks, information diffusion and learning on graphs, etc.

Format

The class will meet once a week for 2 hours in order for MS students to be able to attend regularly. The students will be responsible for scribing, 3 assignments, including a final project, and a final exam.

Assignments

  1. Project 1
    Analysis of a network (6 PR-style pages), assignment, examples, list of analyzed datasets
  2. Project 2
    Replicating an existing paper (6 pages), assignment, examples, list of replicated papers
  3. Project 3
    Final project (10 pages) and presentation, assignment, datasets, list of final projects
  4. Course Book
    Compilations of all lectures and projects. assignment, sample chapters, table of contents

Final project

An open-ended research project in one of two categories:
  1. Research paper - using the SIGIR format. Students will be in charge of problem formulation, literature survey, hypothesis formulation, experimental design, implementation, and possibly submission to a conference like SIGIR, Sunbelt, ICWSM, or WWW.
  2. Software system - develop a working, useful system (including an API). Students will be responsible for identifying a niche problem, implementing it and deploying it, either on the Web or as an open-source downloadable tool.

Contact Information/Office Hours

Sara Stolbach, email: ss3067 [at] columbia.edu - Thursday, 2-4, TA room

Resources

Readings

Assigned Reading

Jan 31Albert/Barabasi 2002, Statistical Mechanics of Complex Networks sections I, II, III (pages 48-59)
Newman 2003, The structure and function of complex networks sections I, II, IV (pages 1-9 and 20-26)
Feb 7Lada A. Adamic Zipf, Power-laws, and Pareto - a ranking tutorial, Information Dynamics Lab, HP Labs
Newman 2002, Random graphs as models of networks pages 1-11 (required), pages 12-19 (optional), Kalevi Kilkki, A practical model for analyzing long tails (optional)
Feb 14D. J. Watts and S. H. Strogatz. Collective dynamics of 'small-world' networks, Nature, 393:440-442 (1998)
A. Barrat and M. Weigt, On the properties of small-world network models, Europ. Phys. J. B 13, 547 (2000) pages 1-8
Feb 21Doyle/Snell Random Walks and Electric Networks pages 1-14, 29-37 (required), 37-51 (optional)
Feb 28David Austin How Google Finds Your Needle in the Web's Haystack
Amy N. Langville and Carl D. Meyer Deeper Inside PageRank. Internet Mathematics, Vol. 1(3):335-380, 2005. pages 1-20
Mar 6 Ferrer i Cancho, Sole, Koehler, Patterns in syntactic dependency networks, Phys Rev E. 69, 051915 (2004)
Dorogovtsev and Mendes, Language as an evolving word web, Proc. Royal Soc. London B 268, 2603-2606 (2001) (optional)
Motter et al., Topology of the conceptual network of language, Phys Rev E. 65, 065102 (2002)
Masucci and Rodgers, Network properties of written human language, Phys Rev E.74. 026102, (2006)
Sole et al., Language networks: their structure, function, and evolution, Trends in Cognitive Sciences (2005)
Sigman and Cecchi, Global organization of the Wordnet lexicon, Proc Natl Acad Sci U S A, 99(3):1742-7, February 2002
Caldeira et al., The network of concepts in written texts (2005)
Mar 13Xiaojin Zhu SSL Survey pages 1-30, 33-34
M. Seeger Learning with Labeled and Unlabeled Data (optional)
Mar 27McPherson et al. Birds of a feather - homophily in social network (2001)
Kossinets and Watts, Empirical Analysis of an Evolving Social Network (2006)
Apr 10Derek de Solla Price, Networks of Scientific papers
Epidemics and percolation in small-world networks (2000)
J.E. Hirsch An index to quantify an individual's scientific research output
W. Glanzel, BIBLIOMETRICS AS A RESEARCH FIELD (2003)
Apr 17Inderjit Dhillon, Co-clustering documents and Words using Bipartite Spectral Graph Partitioning, KDD (2001)
Martin Gardner, The fantastic combinations of John Conway's new solitaire game "life" Scientific American 223 (October 1970): 120-123.
Jon Kleinberg, Navigation in a small world, Nature 2000

Syllabus

Jan 24
[ppt]
Introduction to networks [notes], Real networks [notes]
Jan 31
[ppt]
Random graphs [notes (draft)], Software for network analysis [notes (draft)]
Feb 7
[ppt]
Properties of networks [notes (draft)], Power law degree distributions [notes (draft)]
Feb 14
[ppt]
Power Law (cont.) [notes (draft)], Self Similarity [notes (draft)], Small world networks [notes (draft)]
Feb 21
[ppt]
Random Walks and Electrical Networks [nodes (draft)], Method of relaxations and other methods for computing harmonic functions, Eigenvector methods[notes (draft)]
Feb 28
[ppt]
Community Identification [notes (draft)], Spectral clustering [notes (draft)]
Mar 6
[ppt]
Lexical networks [notes (draft)], Eigenvectors, Eigenvalues and Stochastic Matrices [notes (draft)]
Mar 13
[ppt]
Applications to information retrieval and NLP [notes (draft)],[notes (draft)]
Mar 27
[ppt]
Social networks, Cascading Behavior, Strength of weak ties, Assortativity and homophily [notes (draft)],[notes (draft)]
Apr 3
[ppt]
Biological, Technological and other networks [notes (draft)], Network traversal
Apr 10
[ppt]
Bibliometrics, The Ising model, Percolation on Graphs [notes (draft)]
Apr 17
[ppt]
Cellular Automata, Network evolution, Search on Networks, Co-clustering [notes (draft)]
Apr 24
[ppt1][ppt2]
Dependancy Parsing using Graphs (Guest Lecturer: Ryan McDonald) [notes (draft)]
Semisupervised learning on graphs (Guest Lecturer: Gunes Erkan) [notes (draft)]
Time Permitting:
  1. Diffusion on graphs and the heat kernel
  2. Finding motifs and graph mining
  3. Generating functions
  4. Graphs with predefined degree sequences
  5. Relational learning

Miscellaneous

Instructor

Dragomir R. Radev obtained a PhD in Computer Science from Columbia in 1999. He is currently an associate professor at the University of Michigan with a joint appointment in the Department of Electrical Engineering and Computer Science and the School of Information. He has taught numerous courses in Natural Language Processing, Databases, and Information Retrieval. He has received a number of awards including the Columbia CS department award for graduate student teaching, the University of Michigan UROP award for outstanding research mentorship, as well as the Gosnell Prize for excellence in political methodology. As undergraduate he has been on a team that finished in the top 10 at the ACM international computer programming contest and, later, has been the coach of Columbia's team for 4 years, taking it to three international finals. Dragomir has worked in different capacities for places like Microsoft Research, IBM Research, MITRE, AT&T Bell Labs, and Yahoo!

Related courses elsewhere