Broad Area Colloquium For AI-Geometry-Graphics-Robotics-Vision
(CS 528)


Robust Textual Inference

Christopher Manning
April 17, 2006, 4:15PM
TCSeq 200
http://graphics.stanford.edu/ba-colloquium/

Abstract

The big problem in natural language understanding today is that we still lack a technology for arbitrary-domain language understanding that offers semantic matching at a higher level of fidelity than keyword-based web search or text categorization.

The task of robust textual inference focuses on this problem. It tests the ability to draw conclusions from an arbitrary piece of text as a human would, bringing to bear any necessary common-sense knowledge. For example, given the text:

Last week, Romano Prodi met the US President George Bush to discuss the abolition of farming subsidies in Europe and the US.

you would want to be able to conclude that 'George Bush has met Romano Prodi' and 'there are farming subsidies in the US', but it would be a big mistake to conclude that 'George Bush abolished farming subsidies in Europe'. Such reasoning can be used as the basis for an information retrieval system providing semantic search.

Over the last eighteen months, a bunch of colleagues at Stanford and I have attempted to build systems for this task. In this talk I will focus on the possible roles and interactions between machine learning, logical inference, and linguistic knowledge that this problem presents. I advocate a new architecture for textual inference in which finding a good alignment is separated from evaluating entailment. Entailment is evaluated by extracting features of a text match, which mark whether it is a valid or invalid pattern of inference (somewhat in the tradition of syllogistic reasoning). These features are then fed into a statistical classifier, trained on development data.
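
As a rough illustration of the two-stage idea (alignment first, entailment evaluation second), here is a minimal Python sketch. It is not the system described in the talk. The exact-match aligner, the three features, and the hand-set weights are all assumptions made for illustration; a real system would use a richer aligner and learn the weights from development data.

```python
import math
import re


def tokenize(sentence):
    """Lowercase the sentence and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", sentence.lower())


def align(text, hypothesis):
    """Stage 1 (toy aligner, an assumption): pair each hypothesis token
    with the index of the first identical token in the text, or None."""
    first_position = {}
    for index, token in enumerate(tokenize(text)):
        first_position.setdefault(token, index)
    return [(token, first_position.get(token)) for token in tokenize(hypothesis)]


def extract_features(alignment):
    """Stage 2a: describe the match with a few illustrative features."""
    total = len(alignment)
    aligned = sum(1 for _, index in alignment if index is not None)
    return {
        "bias": 1.0,
        "coverage": aligned / total if total else 0.0,
        "unaligned": float(total - aligned),
    }


# Hypothetical weights chosen by hand; a trained classifier would supply these.
WEIGHTS = {"bias": -2.0, "coverage": 4.0, "unaligned": -0.5}


def entailment_probability(features, weights=WEIGHTS):
    """Stage 2b: logistic model over the extracted features."""
    score = sum(weights[name] * value for name, value in features.items())
    return 1.0 / (1.0 + math.exp(-score))


if __name__ == "__main__":
    text = ("Last week, Romano Prodi met the US President George Bush to discuss "
            "the abolition of farming subsidies in Europe and the US.")
    for hypothesis in ("George Bush has met Romano Prodi",
                       "George Bush abolished farming subsidies in Europe"):
        features = extract_features(align(text, hypothesis))
        print(hypothesis, features, entailment_probability(features))
```

With this toy aligner, the valid hypothesis and the invalid one each leave exactly one token unaligned ('has' and 'abolished' respectively), so word-overlap features alone cannot tell them apart. That is the gap the features marking valid or invalid patterns of inference are meant to fill.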

About the Speaker

Christopher Manning is an assistant professor of computer science and linguistics at Stanford University. Previously, he held faculty positions at Carnegie Mellon University and the University of Sydney. His research interests include probabilistic natural language parsing, syntax, information extraction and text mining. He is the author of three books, including Foundations of Statistical Natural Language Processing (MIT Press, 1999, with Hinrich Schuetze).


Contact: bac-coordinators@cs.stanford.edu
