The purpose of this course is to give a thorough introduction to deep machine learning, also known as deep learning or deep neural networks. Over the last few years, deep machine learning has dramatically improved the state-of-the-art performance in various fields, including speech recognition, computer vision and reinforcement learning (used, e.g., to learn how to play Go).
The first module starts with the fundamental concepts of supervised machine learning, deep learning and convolutional neural networks (CNNs). This includes the main components of feed-forward networks and CNNs, commonly used loss functions, training paradigms, architectures, and strategies to help the network generalize well to unseen data.
The second module covers algorithms used for training neural networks, such as stochastic gradient descent (SGD) and Adam. We will introduce the methods and discuss their implicit regularizing properties in connection with the generalization of neural networks in the overparameterized setting.
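As a rough, self-contained illustration of the kind of update rules discussed in this module, the NumPy sketch below contrasts a plain SGD step with an Adam step on a toy quadratic loss. The toy problem and hyperparameter values are illustrative assumptions only and are not taken from the course material.

```python
import numpy as np

def sgd_step(theta, grad, lr=1e-2):
    """Plain (stochastic) gradient descent: move against the gradient."""
    return theta - lr * grad

def adam_step(theta, grad, m, v, t, lr=1e-3, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam step: running averages of the gradient and its square
    rescale each coordinate of the update."""
    m = beta1 * m + (1 - beta1) * grad        # first-moment estimate
    v = beta2 * v + (1 - beta2) * grad ** 2   # second-moment estimate
    m_hat = m / (1 - beta1 ** t)              # bias correction
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (np.sqrt(v_hat) + eps)
    return theta, m, v

# Toy loss L(theta) = 0.5 * ||theta||^2, whose gradient is simply theta.
theta_sgd = np.array([1.0, -2.0])
theta_adam = theta_sgd.copy()
m = np.zeros_like(theta_adam)
v = np.zeros_like(theta_adam)
for t in range(1, 101):
    theta_sgd = sgd_step(theta_sgd, theta_sgd)
    theta_adam, m, v = adam_step(theta_adam, theta_adam, m, v, t)
print(theta_sgd, theta_adam)  # both should approach the minimizer at zero
```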
The last module of the course presents the theory behind some advanced topics of deep learning research, namely recently developed methods for 1) equipping discriminative deep networks with estimates of their uncertainty and 2) building different types of deep generative models. The students’ knowledge will be examined with corresponding hand-in assignments.
Course Type:
- AS track: elective
- AI track: mandatory
- Joint curriculum: advanced
Time: Given odd years, Spring
Teachers: Lennart Svensson (CTH), Pontus Giselsson (LU), Hossein Azizpour (KTH)
Examiner: Pontus Giselsson (LU)
Prerequisites: Basic machine learning, linear algebra, probability theory, basic optimization, programming (Python). The participants are assumed to have a background in mathematics corresponding to the contents of the WASP course “Mathematics and Machine Learning”.
After the course, students should be able to:
- explain the fundamental principles of supervised and unsupervised learning, including basic techniques like cross-validation to avoid overfitting.
- describe the standard cost functions optimized during supervised training and the standard solution techniques.
- explain how traditional feed-forward networks are constructed and why they can approximate “almost any” function (the universality theorem).
- summarize the key components in convolutional neural networks (CNNs) and their key advantages.
- argue for the benefits of transfer learning and data augmentation in situations when we have a limited amount of annotated/labelled data.
- train and apply CNNs to image applications.
- understand deep neural networks at a high level and know how to train them.
- understand generalization and why it is needed.
- understand why stochastic gradient descent has implicit regularization properties that help improve generalization in deep overparameterized neural networks.
- account for the theoretical background of probabilistic and generative deep learning techniques.
- implement methods based on recently published results for probabilistic or generative deep networks.
- understand the dynamics that govern common algorithms for training deep neural networks.
- reflect on the implicit regularization effects of different training algorithms.
Module 1: Fundamentals of deep learning. Feedforward networks. Training procedures. Convolutional neural networks.
Module 2: Implicit regularization of stochastic gradient descent in overparameterized deep neural networks.
Module 3: Uncertainty Estimation. Out-of-Distribution Detection/Robustness. Deep Generative Modeling.
Lecture slides.
One hand-in assignment per module.
Syllabus (Kursplan)
Course page
If you are not a student at KTH, you must log in via https://canvas.kth.se/login/canvas