Computational Biology Research Projects

Protein Classification Project

The goal of this project is to develop a computationally efficient protein classifier. The classifier will be implemented in Java and will use machine learning methods including support vector machines, bagging, and boosting.

We plan to build a classification system, accessible through a web browser, that enables molecular biologists to classify protein sequences into functional or structural classes and to identify evolutionarily conserved motif regions within them. Students will have the opportunity to learn about state-of-the-art machine learning techniques and gain insight into one of the fundamental problems in computational biology.

The main prerequisite for participation in this project is proven experience in Java programming. Other desired (but not required) qualifications are:

  1. A course in machine learning.
  2. A course in probability and statistics.
  3. A course in bioinformatics.
  4. Experience with MATLAB.
  5. Experience in a software development team.

Students can count their project work for up to 3 points of research credit per semester, with permission from the faculty supervisors. Students who are interested in participating in the project should contact either

or