Broad Area Colloquium for Artificial Intelligence,
Geometry, Graphics, Robotics and Vision
From grep to graphical models: three decades of finite-state language
processing
Fernando Pereira
University of Pennsylvania
Monday, January 28, 2002, 4:15PM
Gates B01
http://robotics.stanford.edu/ba-colloquium/
Abstract
In early 1973, Ken Thompson wrote "grep" at the request of Doug
McIlroy, who needed a convenient tool to implement phonetic rules for
a speech synthesizer. Almost thirty years later, finite-state methods
still dominate practical text and speech processing, and also play a
central role in biological sequence analysis. I will discuss several
advances in finite-state techniques that have contributed to this
remarkable longevity, with particular focus on probabilistic
extensions used in speech recognition and information extraction. I
will conclude with a brief discussion of whether we are ready to move
beyond finite state.
About the Speaker
Fernando Pereira is the Andrew and Debra Rachleff Professor and chair
of the Department of Computer and Information Science at the University
of Pennsylvania. He received a Ph.D. in Artificial Intelligence from the
University of Edinburgh in 1982. Before joining Penn, he held
industrial research and management positions at SRI International, at
AT&T Labs, where he led the machine learning and information retrieval
research department from September 1995 to April 2000, and most
recently at WhizBang Labs, a Web information extraction company. His
main research areas are computational linguistics and machine
learning, and he is a major contributor to several advances in
finite-state models for speech and text processing that are in everyday
industrial use. He has 73 research publications on computational
linguistics, speech recognition, machine learning and logic
programming, and several issued and pending patents on speech and
language processing, and on human-computer interfaces. He was elected
Fellow of the American Association for Artificial Intelligence in 1991
for his contributions to computational linguistics and logic
programming, and he is a past president of the Association for
Computational Linguistics.