Statistical Methods in Data Science and Machine Learning (DSML)
Statistics plays a central role in both data science and machine learning. We will introduce a set of the most popular statistical methods in data science and machine learning for various tasks, such as classification and regression. These will include modern techniques such as ensemble tree-based methods (including bagging, random forests, and boosting such as gbm and xgboost) and deep neural networks, as well as the ever-popular classical techniques such as nearest neighbor classifiers, discriminant analysis, and naïve Bayes. We will also discuss important concepts such as the curse of dimensionality and the bias-variance tradeoff. Students will learn not only what the methods are, how they work, and why they work, but also how to code them and apply them to real data.
- An academically rigorous program that incorporates 28 classroom contact hours with faculty, 9 hours of workshops, and 4 hours with teaching assistants.
- Live and synchronous classes with a University of Notre Dame professor in the Department of Applied and Computational Mathematics and Statistics.
- A specially designed schedule that accommodates time differences for students in China/Asia, with classes in the evening and workshops & events in the morning.
- Live workshops on Data Storytelling, Data Ethics, Applying to Grad School in the U.S. & at Notre Dame, Academic English Presentation Skills, etc.
- Panel discussion with Notre Dame’s current international Ph.D. students.
- Access to the University of Notre Dame’s student learning platform, Sakai, and all classroom materials.
For your reference only. A finalized schedule will be provided to all participants in late July.
100% online and synchronous (live)
15 days of classes
Dates & Times
Monday, August 2 - Thursday, August 19, 2021. No classes on weekends.
The core classes (Statistical Methods in Data Science and Machine Learning) meet from 8am to 10am on weekday mornings, US Eastern Standard Time. Workshops are often conducted in the evenings, US Eastern Standard Time.
- International students enrolled at any accredited institution, internationally or in the United States.
- Minimum GPA: 2.75
- Preferred English proficiency:
- TOEFL iBT "My Best Score" 80
- IELTS 6
- Duolingo English Test 105
- Chinese English Test 4 (CET4) 500
- Chinese English Test 6 (CET6) 450
Preparation for This Course
- Basic knowledge of applied probability and applied statistics, especially linear regression.
- R programming. Suggested reading: click here (Chinese version).
Submit your application using the link below.
PLEASE NOTE: You will be required to upload a copy of your English proficiency test report and a copy of your university transcript. Both documents can be unofficial, but they must be legible and show your name.
July 15, 2021
Payment deadline is July 20, 2021
Admitted students will receive detailed instructions on making the program fee payment (via credit card) once their application is submitted and reviewed.
- Withdraw on or before July 25th - full refund.
- Withdraw between July 26th and 31st - $100 withdrawal fee.
- Withdraw on or after August 1st - no refund.
Enrolled participants are expected to attend all classes and mandatory workshops. Participation requirements are detailed on the schedule. Absences due to personal emergencies, such as illness or technological issues, can be excused, but program staff must be notified.
Upon successful completion of the program, each participant will receive an official program completion certificate.
Some students might wish to apply for scholarships at their home universities to cover their program fee. For this purpose, an official program invoice can be provided upon request. Please email email@example.com to request your invoice.
Please email firstname.lastname@example.org for any questions.