Taming the Giants and the Monsters:
Recent Developments in Data Mining
Usama Fayyad
Microsoft Research
Abstract
Knowledge Discovery in Databases (KDD) and Data Mining are concerned
with the extraction of interesting structure from databases, especially
large stores. Following a brief overview of this rapidly growing area
of research and applications, I'll focus on data mining methods. These
methods have their origins in statistics, pattern recognition,
learning, visualization, databases, optimization, and parallel computing.
I'll discuss some classification and clustering methods and how they are
scaled to large databases. I'll present results from our recent work
to demonstrate that the methods can be effectively scaled to work with
large databases with only limited memory resources. I'll outline the
research challenges and opportunities posed by the problem of
extracting models from massive data sets. Operating under such
scalability constraints poses interesting problems for how
models can be built and what methods are practical. Some
applications will be used to motivate and illustrate the
techniques.
BIO OF USAMA FAYYAD:
Usama Fayyad is a Senior Researcher at Microsoft Research
(http://research.microsoft.com/~fayyad). His research interests
include scaling data mining algorithms to large databases,
learning algorithms, and statistical pattern recognition, especially
classification and clustering. After receiving the Ph.D. degree from
The University of Michigan, Ann Arbor in 1991, he joined the Jet
Propulsion Laboratory (JPL), California Institute of Technology,
where (until 1996) he headed the Machine Learning Systems Group and
developed data mining systems for automated science data analysis. He
received the 1994 NASA Exceptional Achievement Medal and the JPL 1993
Lew Allen Award for Excellence in Research for his work on developing
data mining systems to solve challenging science analysis problems in
astronomy and remote sensing. He remains affiliated with JPL as a
Distinguished Visiting Scientist. He is a co-editor of Advances
in Knowledge Discovery and Data Mining (AAAI/MIT Press, 1996) and is an
Editor-in-Chief of the journal Data Mining and Knowledge Discovery. He was
program co-chair of KDD-94 and KDD-95 (the First International Conference
on Knowledge Discovery and Data Mining) and is general chair of KDD-96 and
KDD-99. He co-chaired the 1997 Workshops on the role of KDD in
Visualizations held at the KDD-97 and IEEE Vis-97 conferences.
Eyal Amir
Last modified: Thu Mar 18 15:51:55 PST 1999