This course provides a foundational understanding of selected mathematics topics in machine learning and data science. Module 1 of the course focuses on key concepts and principles in statistical inference. Students will learn to apply basic inference principles of estimation and hypothesis testing, as well as use concentration inequalities and asymptotic theory to evaluate the properties of statistical methods. Module 1 will cover topics from both frequentist statistics and Bayesian statistics, as well as fundamentals of statistical decision theory.

Course type:

  • AS track: elective
  • AI track: elective
  • Joint curriculum: foundational

Time: Given yearly

Teachers: Shaobo Jin (UU), Pierre Nyquist (CTH)

Examiner: Pierre Nyquist (CTH)

The participants are assumed to have a mathematical background corresponding to the contents of the WASP course “Introduction to Mathematics for Machine Learning”.

After completing the course, students should be able to:

Module 1:

  • Apply basic principles of estimation and hypothesis testing.
  • Apply concentration inequalities and asymptotic theory to analyze the properties of different methods.
  • Apply statistical inference principles to construct optimal estimators and tests.
  • Describe the basics of statistical decision theory.
  • Describe the basics of Bayesian statistics.

Module 2:

  • Define and describe some basic properties of a Markov chain.
  • Define and describe some standard computational methods for statistical inference, such as Markov chain Monte Carlo methods and regularized regression methods, and their properties.
  • Describe the bootstrap and random forests.
  • Describe the boosting methodology.
  • Describe some standard methods for stochastic optimization.
  • Apply the above methods to different computational tasks.

In Module 1, we cover principles of statistical inference, concentration inequalities, asymptotic statistics, confidence intervals, risk, and statistical decision theory.

In Module 2, we cover the basics of computer-intensive methods, including Markov chain Monte Carlo with applications in Bayesian statistics, the bootstrap, the Lasso, random forests, boosting, and stochastic optimization.

Course literature: Larry Wasserman (2004), All of Statistics: A Concise Course in Statistical Inference. Springer.

There will be two sets of homework assignments, each coupled to the contents of one of the two on-site meetings. To pass the course, a minimum requirement for each module must be met, and the homework assignments must be completed within the given timeframe.