Andrew Moore -- Talk Abstract

Broad Area Colloquium For AI-Geometry-Graphics-Robotics-Vision
(CS 528)

Second Generation Cached-Sufficient Statistics for efficient statistical queries

Andrew Moore, Carnegie Mellon University
February 7, 2005, 4:15PM
TCSeq 200
http://graphics.stanford.edu/ba-colloquium/

Abstract

This talk is about recent work on new ways to exploit preprocessed views of data tables for tractably solving big statistical queries. We'll describe deployments of these new algorithms in the realms of detecting killer asteroids and unnatural disease outbreaks.

In recent years, several groups have looked at methods for pre-storing general sufficient statistics of the data in spatial data structures such as kd-trees and ball-trees so that statistical operations involving aggregation, convolution and contingency tables become fast for large datasets. In this talk we will look at two other classes of optimization required in important statistical queries. The first involves iterating over all spatial regions (big and small). The second involves detection of tracks from noisy intermittent observations separated far apart in time and space. We will also discuss the implications that have arisen from making these operations tractable. We will focus particularly on:

Detecting all asteroids in the solar system larger than Pittsburgh's Cathedral of Learning (data to be collected over 2006-2010).
Early detection of emerging diseases based on national monitoring of health-related transactions.

Joint work with Jeremy Kubica, Ting Liu, and Daniel Neill.

About the Speaker

Andrew Moore is a Professor of Robotics and Computer Science at the School of Computer Science, Carnegie Mellon University. Andrew began his career writing video games for an obscure British personal computer (http://www.oric.org/index.php?page=software&fille=detail&num_log=2). He rapidly became a thousandaire and retired to academia, where he received a PhD from the University of Cambridge in 1991. He researched robot learning as a postdoc working with Chris Atkeson, and then moved to CMU.

His main research interest is data mining: statistical algorithms for finding all the potentially useful and statistically meaningful patterns in massive sources of data. His research group, the Auton Lab (http://www.autonlab.org), has devised several new ways of performing massive statistical operations efficiently, in several cases accelerating the state of the art by several orders of magnitude. Members of the Auton Lab collaborate closely with many kinds of scientists, government agencies, technology companies and engineers in a constant quest to determine some of the most urgent unresolved questions at the border of computation and statistics. Auton Lab algorithms are now in use in dozens of commercial, university and government applications. Andrew serves on several editorial boards, and in industrial, government and academic advisory roles. In his non-work life he has no hobbies or talents of any significance.

Contact: bac-coordinators@cs.stanford.edu
