DS4140-CS5140-CS6140

DS4140/CS5140/CS6140 Data Mining

Class Hours: Monday/Wednesday 3:00pm-4:20pm, IVC (Zoom Link on Canvas)

Instructor

Qingyao Ai (aiqy[at]cs[dot]utah[dot]edu)

Office Hours: Monday 4:30pm-5:30pm, IVC (Zoom Link on Canvas)

TA/TM

Prerequisites

Textbooks

Resources

Grading

The final grade weights the assessments in the following proportions (tentative and subject to change):

Tentative Class Schedule

Week, Date | Topic | Textbook and Resources
Week 01, 01/20 | Class Overview |
Week 02, 01/25 | Statistics Principles | M4D 2.2-2.3, MMDS 1.2, FoDS 12.4
Week 02, 01/27 | Similarity: Jaccard + k-Grams | M4D 4.3-4.4, MMDS 3.1+3.2, FoDS 7.3
Week 03, 02/01 | Similarity: Min Hashing | M4D 4.6.6, MMDS 3.3
Week 03, 02/03 | Similarity: LSH | M4D 4.6, MMDS 3.4
Week 04, 02/08 | Similarity: Distances | M4D 4-4.3, MMDS 3.5+7.1, FoDS 8.1
Week 04, 02/10 | Similarity: Word Embed + ANN vs. LSH | M4D 4.4, MMDS 3.7+7.1.3
Week 05, 02/15 | Presidents Day |
Week 05, 02/17 | Clustering: Hierarchical | M4D 8.5+8.2, MMDS 7.2, FoDS 7.7
Week 06, 02/22 | Clustering: K-Means | M4D 8-8.3, MMDS 7.3, FoDS 7.2-3
Week 06, 02/24 | Clustering: Spectral | M4D 10.3, MMDS 10.4, FoDS 7.5
Week 07, 03/01 | Streaming: Model and Misra-Gries | M4D 11.1-11.2.2, FoDS 6.2.3, MMDS 6
Week 07, 03/03 | Streaming: Count-Min Sketch, Count Sketch, and Apriori | M4D 11.2.3-4, FoDS 6.2.3, MMDS 4.3
Week 08, 03/08 | Regression: Basics and Ridge Regression | M4D 5-5.3
Week 08, 03/10 | Regression: Lasso + MP + Comp. Sensing | M4D 5.5, FoDS 10.2
Week 09, 03/15 | Regression: Cross-Validation and p-values | M4D 5.5
Week 09, 03/17 | Mid-term Exam |
Week 10, 03/22 | Dim Reduce: SVD + PCA | M4D 7-7.3+7.5, FoDS 4
Week 10, 03/24 | Dim Reduce: Matrix Sketching | M4D 11.3, MMDS 9.4, FoDS 2.7+7.2.2
Week 11, 03/29 | Dim Reduce: Metric Learning | M4D 7.6-7.8
Week 11, 03/31 | Noise: Random Projections and Noise in Data | M4D 7.10+8.6, MMDS 9.1, FoDS 2.9
Week 12, 04/05 | Non-Instructional Day |
Week 12, 04/07 | Noise: Calibration and Counterfactual Learning | Cascade Model: Craswell et al. (WSDM 2008); DBN: Chapelle and Zhang (WWW 2009); UBM: Dupret and Piwowarski (SIGIR 2008); IPW: Joachims et al. (WSDM 2017); Regression EM: Wang et al. (WSDM 2018); DLA: Ai et al. (SIGIR 2018)
Week 13, 04/12 | Graph Analysis: Markov Chains | M4D 10.1, MMDS 10.1+5.1, FoDS 5
Week 13, 04/14 | Graph Analysis: PageRank | M4D 10.2, MMDS 5.1+5.4
Week 14, 04/19 | Graph Analysis: MapReduce or Communities | M4D 10.4, MMDS 2+10.2+5.5, FoDS 8.8+3.4
Week 14, 04/21 | Class Project Presentation |
Week 15, 04/26 | Final Exam |

Acknowledgements

Special thanks to Prof. Jeff M. Phillips of the University of Utah. Teaching materials are borrowed from his course CS5140/6140: Data Mining.