Topological Data Analysis (6hp)

How can we give a machine a sense of geometry? A sense has two aspects: a technical tool and the ability to learn to use it. This learning ability is essential. For example, we are born with the technical ability to detect smells, and throughout our lives we use and develop this sense, depending on our needs and the environment around us. In this course the technical tool we introduce to describe geometry is based on homology. The main aim of the course is to explain how versatile this tool is and how to use this versatility to give a machine the ability to learn to sense geometry.

Technical tool. Homology, the central theme of 20th-century geometry, has been particularly useful for studying spaces with controllable cell decompositions, such as Grassmann varieties. During the last decade there has been an explosion of applications, ranging from neuroscience to vehicle tracking, protein structure analysis and the nano-characterization of materials, which shows that homology is also useful for describing spaces related to data sets. One might ask: why homology? Often, due to heterogeneity or the presence of noise, data is very hard to understand. In these cases, rather than trying to fit the data with complicated models, a good strategy is to first investigate the shape properties of the data. Here homology comes into play.

Learning. We explain how to use homology to convert the geometry of data sets into features suitable for statistical analysis and machine learning. This process translates spatial and geometric information into information that can be analysed through more basic operations such as counting and integration. Furthermore, we provide an entire space of such translations.

Course information

Course type:

AS track: elective
AI track: elective
Joint curriculum: advanced

Time: Given in odd years, Autumn

Teachers: Florian Pokorny (KTH), Martina Scolamiero (KTH), Wojciech Chachólski (KTH)

Examiner: Florian Pokorny (KTH)

Entry requirements

Participants are assumed to have a background in mathematics corresponding to the contents of the WASP course “Mathematics and Machine Learning”. In particular, they should have completed a course in linear algebra.

Course content

In this course the technical tool we introduce to describe geometry is based on homology. The main aim of the course is to explain how versatile this tool is and how to use this versatility to give a machine the ability to learn to sense geometry.

Technical tool

Homology, the central theme of 20th-century geometry, has been particularly useful for studying spaces with controllable cell decompositions, such as Grassmann varieties. During the last decade there has been an explosion of applications, ranging from neuroscience to vehicle tracking, protein structure analysis and the nano-characterization of materials, which shows that homology is also useful for describing spaces related to data sets. One might ask: why homology? Often, due to heterogeneity or the presence of noise, data is very hard to understand. In these cases, rather than trying to fit the data with complicated models, a good strategy is to first investigate the shape properties of the data. Here homology comes into play.

Learning

We explain how to use homology to convert the geometry of data sets into features suitable for statistical analysis and machine learning. This process translates spatial and geometric information into information that can be analysed through more basic operations such as counting and integration. Furthermore, we provide an entire space of such translations.
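
To make "counting and integration" concrete, here is a minimal sketch for the simplest case, connected components (degree-0 homology) of a point cloud. It is an illustration only, not the course's own method: the choice of a distance-threshold (Vietoris-Rips style) filtration, the function names h0_persistence, betti0 and total_persistence, and the sample points are all assumptions made for this example.

```python
import math
from itertools import combinations


def h0_persistence(points):
    """Death scales of connected components as the distance threshold grows.

    Assumption: two points are joined once the threshold reaches their
    Euclidean distance. Every component is born at scale 0, and one
    component never dies, so n points give n - 1 finite death scales.
    """
    n = len(points)
    parent = list(range(n))  # union-find forest over the points

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]  # path halving
            i = parent[i]
        return i

    # Process all pairwise distances in increasing order (Kruskal style).
    edges = sorted(
        (math.dist(points[i], points[j]), i, j)
        for i, j in combinations(range(n), 2)
    )
    deaths = []
    for dist, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:  # this edge merges two components, so one of them dies
            parent[ri] = rj
            deaths.append(dist)
    return deaths


def betti0(deaths, scale):
    """Counting: the number of components still alive at the given scale."""
    return 1 + sum(1 for d in deaths if d > scale)


def total_persistence(deaths):
    """Integration: the integral of (betti0 - 1) over all scales.

    Each finite bar [0, d) contributes its length d to the integral.
    """
    return sum(deaths)


# Hypothetical sample: a cluster of three points and a cluster of two.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (10.0, 10.0), (11.0, 10.0)]
deaths = h0_persistence(points)
```

In this sample, betti0(deaths, 2.0) counts the two clusters, while total_persistence(deaths) is dominated by the one long-lived component, which is the kind of numerical feature a statistical model can consume.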

Intended learning outcomes

After completing the course the student should be able to

use topological data analysis techniques and tools in real-world machine learning applications
discuss and convert data into a format suitable for TDA (analysis part and math part)
extract topological invariants
draw conclusions based on TDA analysis
understand the overall machinery of TDA
reason about the algorithmic complexity of TDA and its constructions (simplicial complexes, reduction algorithms, etc.).
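
As an illustration of the reduction algorithms mentioned in the last outcome, the sketch below shows the standard left-to-right column reduction of a boundary matrix over Z/2. It is not taken from the course material: the function name reduce_boundary_matrix, the set-based column encoding and the example filtration of a filled triangle are assumptions chosen for brevity.

```python
def reduce_boundary_matrix(columns):
    """Left-to-right column reduction of a boundary matrix over Z/2.

    columns[j] holds the row indices of the nonzero entries in the boundary
    of simplex j, with simplices indexed in filtration order.
    Returns (birth, death) index pairs in order of the death index.
    """
    reduced = [set(col) for col in columns]
    pivot_owner = {}  # largest row index of a reduced column -> that column
    pairs = []
    for j, col in enumerate(reduced):
        # While an earlier column has the same pivot, add it modulo 2.
        while col and max(col) in pivot_owner:
            col ^= reduced[pivot_owner[max(col)]]
        if col:  # a nonzero column: simplex j kills the class born at its pivot
            pivot_owner[max(col)] = j
            pairs.append((max(col), j))
    return pairs


# Hypothetical filtration of a filled triangle:
# vertices 0, 1, 2; edges 3 = {0,1}, 4 = {1,2}, 5 = {0,2}; triangle 6.
boundary = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
pairs = reduce_boundary_matrix(boundary)
```

Each column may be added to by many earlier columns, and each addition touches up to a full column of entries, which is why this reduction is at most cubic in the number of simplices. Vertex 0 stays unpaired in the example: it represents the single connected component that never dies.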

Literature

Comprehensive course notes, slides and exercises are provided.

Examination

The course is examined through a pass/fail group project at the end of the course, in which students explore TDA applications from an algorithmic, mathematical or real-world use-case point of view.

Syllabus (Kursplan)

Topological-Data-Analysis.pdf

Course page

Course Page 2021 (CANVAS KTH)

Course Page 2019 (CANVAS KTH)