Broad Area Colloquium For AI-Geometry-Graphics-Robotics-Vision
(CS 528)


Semi-supervised learning

Tommi Jaakkola
MIT Artificial Intelligence Laboratory
Monday, January 12, 2004, 4:15PM
TCSeq 200
http://graphics.stanford.edu/ba-colloquium/

Abstract

Many modern prediction tasks involve very limited explicit guidance about the task to be solved, such as labeled instances (documents, images, molecules, etc.). In contrast, a large number of unlabeled instances, the examples to be classified, may be readily available. The set of unlabeled examples provides information about the structure, properties, and distribution of examples. By incorporating this additional source of information we hope to (and sometimes can) substantially improve the prediction accuracy. With few exceptions, the benefit is derived from a combination of prior constraints and assumptions pertaining to "natural distinctions" that can be made over the data points. While current methods of incorporating unlabeled examples often yield better predictions, they can also lead to a dramatic loss of accuracy with little forewarning. I will discuss some of the methods proposed in this context, explain our current understanding of how and why they work, and discuss a new information-theoretic principle aiming to solve the problem in general. I will also briefly outline open problems in this context.

About the Speaker

Tommi Jaakkola is an associate professor of Electrical Engineering and Computer Science at the Massachusetts Institute of Technology. He received an M.Sc. in theoretical physics from Helsinki University of Technology in 1992 and a Ph.D. in computational neuroscience from MIT in 1997. Following a postdoctoral position in computational molecular biology (DOE/Sloan postdoctoral fellow), he joined the MIT EECS faculty in 1998. He is currently a Sloan Research Fellow in Computer Science, a member of the editorial board of Artificial Intelligence Research, and an action editor of Machine Learning Research. Prof. Jaakkola's research covers machine learning, information retrieval, and computational biology.


Contact: bac-coordinators@cs.stanford.edu
