2009 Fall STA 218-01

Bulletin Course Description
Introduction to data mining, including multivariate nonparametric regression, classification, and cluster analysis. Topics include the Curse of Dimensionality, the bootstrap, cross-validation, search (especially model selection), smoothing, the backfitting algorithm, and boosting. Emphasis on regression methods (e.g., neural networks, wavelets, the LASSO, and LARS), classification methods (e.g., CART, support vector machines, and nearest-neighbor methods), and cluster analysis (e.g., self-organizing maps, k-means clustering, and minimum spanning trees). Theory illustrated through analysis of classical data sets. Instructor: Banks
(Instructor named in bulletin description above may not be current. For current instructor, see listing below.)

Title STATISTICAL DATA MINING
Department STA
Course Number 2009 Fall 218
Section Number 01
Primary Instructor Banks, David L
Prerequisites Statistics 114.


Synopsis of course content
Data mining is a field that has grown up at the intersection of statistics and computer science. This course covers the material in Hastie, Tibshirani, and Friedman's _The Elements of Statistical Learning_ and then proceeds to special topics. Graded material in the course includes research projects and a presentation.



