Broad Area Colloquium For AI-Geometry-Graphics-Robotics-Vision
(CS 528)
Robust Textual Inference
Christopher Manning
April 17, 2006, 4:15PM
TCSeq 200 http://graphics.stanford.edu/ba-colloquium/
Abstract
The big problem in natural language understanding today is that we still
do not have a technology to do arbitrary-domain language understanding
which can offer semantic matching at a higher level of fidelity than
keyword-based web search or text categorization.
The task of robust textual inference focuses on this problem. It tests
the ability to draw conclusions from an arbitrary piece of text as a
human would, bringing to bear any necessary common sense knowledge. For
example, given the text:
Last week, Romano Prodi met the US President George Bush to discuss
the abolition of farming subsidies in Europe and the US.
you would want to be able to conclude that 'George Bush has met
Romano Prodi' and 'there are farming subsidies in the U.S.' but it would
be a big mistake to conclude that 'George Bush abolished farming
subsidies in Europe'. Such reasoning can be used as the basis for an
information retrieval system providing semantic search.
Over the last eighteen months, several colleagues at Stanford and I
have attempted to build systems for this task. In this talk I will
focus on the possible roles and interactions between machine learning,
logical inference, and linguistic knowledge that this problem presents.
I advocate a new architecture for textual inference in which finding a
good alignment is separated from evaluating entailment. Entailment is
evaluated by extracting features of a text match, which mark whether it
is a valid or invalid pattern of inference (somewhat in the tradition of
syllogistic reasoning). These features are then fed into a statistical
classifier, trained on development data.
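The two-stage idea above can be illustrated with a minimal sketch. This is not the system described in the talk: the exact-match aligner, the two hand-picked features (token coverage and a negation mismatch flag), the toy training pairs, and every function name here are assumptions made purely for illustration. The point is only the shape of the pipeline: align first, then extract features of the match, then let a trained statistical classifier judge entailment.

```python
import math

NEGATIONS = {"not", "no", "never"}  # assumed, deliberately tiny negation lexicon

def tokenize(sentence):
    """Lowercase, split on whitespace, and strip surrounding punctuation."""
    tokens = (word.strip(".,;:'\"").lower() for word in sentence.split())
    return [tok for tok in tokens if tok]

def align(text_tokens, hyp_tokens):
    """Stage 1: link each hypothesis token to the first identical text token, else None."""
    return [text_tokens.index(tok) if tok in text_tokens else None
            for tok in hyp_tokens]

def extract_features(text_tokens, hyp_tokens, alignment):
    """Stage 2a: summarize the match as [bias, coverage, negation_mismatch]."""
    coverage = sum(i is not None for i in alignment) / max(len(hyp_tokens), 1)
    text_neg = bool(NEGATIONS & set(text_tokens))
    hyp_neg = bool(NEGATIONS & set(hyp_tokens))
    return [1.0, coverage, float(text_neg != hyp_neg)]

def featurize(text, hypothesis):
    """Run alignment, then feature extraction, for one text/hypothesis pair."""
    t, h = tokenize(text), tokenize(hypothesis)
    return extract_features(t, h, align(t, h))

def predict(weights, features):
    """Stage 2b: logistic classifier giving an estimated probability of entailment."""
    z = sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))

def train(examples, epochs=3000, lr=0.2):
    """Fit the classifier weights by batch gradient ascent on the log-likelihood."""
    weights = [0.0, 0.0, 0.0]
    for _ in range(epochs):
        grad = [0.0, 0.0, 0.0]
        for features, label in examples:
            error = label - predict(weights, features)
            for j, x in enumerate(features):
                grad[j] += error * x
        weights = [w + lr * g for w, g in zip(weights, grad)]
    return weights

# Hypothetical development data: (text, hypothesis, 1 if entailed else 0).
TOY_PAIRS = [
    ("The cat sat on the mat", "The cat sat", 1),
    ("John bought a new car yesterday", "John bought a car", 1),
    ("It rained in Paris on Monday", "It rained in Paris", 1),
    ("The cat sat on the mat", "The dog barked loudly", 0),
    ("John bought a new car yesterday", "John did not buy a car", 0),
    ("Mary did not attend the meeting", "Mary attended the meeting", 0),
]

weights = train([(featurize(t, h), y) for t, h, y in TOY_PAIRS])

text = ("Last week, Romano Prodi met the US President George Bush to discuss "
        "the abolition of farming subsidies in Europe and the US.")
for hypothesis in ("George Bush has met Romano Prodi",
                   "George Bush abolished farming subsidies in Europe"):
    score = predict(weights, featurize(text, hypothesis))
    print(f"{score:.2f}  {hypothesis}")
```

With features this crude, the valid and invalid hypotheses from the example above look almost the same to the classifier, since both share most of their words with the text. That is the motivation for features that mark valid and invalid patterns of inference rather than word overlap alone.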
About the Speaker
Christopher Manning is an assistant professor of computer science and
linguistics at Stanford University. Previously, he held faculty
positions at Carnegie Mellon University and the University of
Sydney. His research interests include probabilistic natural language
parsing, syntax, information extraction and text mining. He is the
author of three books, including Foundations of Statistical Natural
Language Processing (MIT Press, 1999, with Hinrich Schuetze).