Sequential Decision-Making and Reinforcement Learning

The SDM & RL core technology cluster provides a network for researchers working on sequential decision-making problems.

We, the SDM & RL cluster, focus on sequential decision-making. We are interested in developing algorithms and models that enable intelligent agents to make optimal decisions in dynamic and uncertain environments. Most of us work on applied or foundational (deep) reinforcement learning, while another large subset of us works on robotics and control theory.

Goal: The main purpose of the cluster is to provide a network for practitioners and researchers in the SDM & RL field, to advance our understanding of the fundamental principles underlying intelligent, sequential decision-making, to foster scientific discussion and collaboration, and to exchange practical knowledge.

Join the Cluster

The cluster uses the public Slack channel:

#ctc_sequential_decision_making_and_reinforcement_learning

To get started, simply fill out the cluster registration form and join us on Slack!

Cluster Activities

In line with the goal of the cluster, we meet bi-weekly via Zoom and typically engage in one of the following activities:

- Research discussions: Cluster members are encouraged to present their research in SDM & RL, engaging in valuable scientific discourse.
- Paper discussions: The cluster discusses state-of-the-art papers from the field. Typically, one person presents a summary of the paper, followed by a discussion.
- Practical knowledge exchange: The cluster discusses implementation details, hyperparameters, and heuristics associated with the complex code underlying sequential decision-making problems and algorithms.
- External speaker invitation: The cluster occasionally invites external (potentially industrial) speakers to talk about their SDM & RL research.

Recent Publications from Cluster Members

A selection of the latest research published by members of our cluster:

Enhancing pre-trained decision transformers with prompt-tuning bandits
Finn Rietz, Sara Karimi et al. (2025)
A graph-based reinforcement learning approach with frontier potential based reward for safe cluttered environment exploration
Gabriele Calzolari et al. (2025)
Efficient prior selection in Gaussian process bandits with Thompson sampling
Jack Sandberg et al. (2025)
Automatic planning and optimization of a laser radar inspection system
Jack Sandberg et al. (2025)
Identifying 14-3-3 interactome binding sites with deep learning
Laura van Weesep et al. (2025)
Platoon coordination and leader selection in mixed transportation systems via dynamic programming
Ying Wang et al. (2025)
Fast online learning of CLiFF-maps in changing environments
Yufei Zhu et al. (2025)
Learning to ground existentially quantified goals
Martin Funkquist et al. (2024)
SCORE: skill-conditioned online reinforcement learning
Sara Karimi et al. (2024)
Model-free low-rank reinforcement learning via leveraged entry-wise matrix estimation
Stefan Stojanovic et al. (2024)
Identifiable latent bandits: Combining observational data and exploration for personalized healthcare
Ahmet Balcioglu et al. (2024)
Multi-agent obstacle avoidance using velocity obstacles and control barrier functions
Alejandro Sánchez Roncero et al. (2024)
Diversity-aware reinforcement learning for de novo drug design
Hampus Gummesson Svensson et al. (2024)
Towards interpretable reinforcement learning with constrained normalizing flow policies
Finn Rietz et al. (2024)
Decentralized multi-agent reinforcement learning exploration with inter-agent communication-based action space
Gabriele Calzolari et al. (2024)

Active Cluster Members

Ahmet Balcioglu

Keywords: Non-linear independent component analysis, Causal representation learning

Affiliation: Chalmers University of Technology

Website: https://selozhd.github.io/

Alejandro Sánchez Roncero

Keywords: Autonomous anti-drone systems

Affiliation: KTH Royal Institute of Technology

Website: https://www.kth.se/profile/alesr

Caroline Skoglund

Keywords:

Affiliation: KTH Royal Institute of Technology & Traton Group

Website:

Deepthi Pathare

Keywords: Autonomous Vehicles

Affiliation: Chalmers University of Technology

Website: https://www.chalmers.se/en/persons/pathare/

Dominik Frey

Keywords: M

Affiliation: Linköping University

Website: https://liu.se/medarbetare/domfr93

Filip Rydin

Keywords: Robust learning methods for electric vehicle route selection

Affiliation: Chalmers University of Technology

Website: https://www.chalmers.se/personer/filipry/

Finn Rietz

Keywords: RL (offline, online, o2o), Multi-task Transfer Learning, Constrained RL, in-context RL

Affiliation: Örebro University

Website: https://www.finnrietz.dev/

Gabriele Calzolari

Keywords: Heterogeneous Multi-Agent Reinforcement Learning, Safe RL

Affiliation: Luleå University of Technology

Website: https://www.ltu.se/en/staff/g/gabriele-calzolari

Hampus Gummesson Svensson

Keywords: Autonomous drug design, bandits, RL, active learning/online learning

Affiliation: Chalmers University of Technology

Website: https://www.chalmers.se/personer/hamsven/

Hampus Åström

Keywords: Goal-conditioned RL, Vision, Pose Estimation, Robotics, Sim2Real

Affiliation: Lund University

Website: https://portal.research.lu.se/sv/persons/hampus-åström/

Jack Sandberg

Keywords: Multi-armed bandits, Gaussian process bandits, Bayesian optimization

Affiliation: Chalmers University of Technology

Website: https://www.chalmers.se/en/persons/jacksa/

Laura van Weesep

Keywords: Agentic systems for drug discovery

Affiliation: Uppsala University and AstraZeneca

Website: https://www.uu.se/en/contact-and-organisation/staff?query=N25-892

Mahyar Mohammadi

Keywords:

Affiliation: Linköping University

Website: https://liu.se/en/employee/mahmo80

Martí Ejarque Galindo

Keywords:

Affiliation: Umeå University

Website: https://www.umu.se/en/staff/marti-ejarque/

Martin Funkquist

Keywords: Sequential decision making, classical planning

Affiliation: Linköping University

Website: https://martin36.github.io/

Mengyuan Wang

Keywords:

Affiliation: Chalmers University of Technology

Website: https://www.chalmers.se/en/persons/mengyuan/

Meraj Mammadov

Keywords: Transfer and imitation learning, vision-based navigation for UAVs

Affiliation: Örebro University

Website: https://meraccos.com/

Mika Persson

Keywords: Game theory and MARL for drones

Affiliation: Chalmers University of Technology and SAAB

Website: https://www.chalmers.se/en/persons/mikape/

Raghav Bongole

Keywords: Sequential decision making, Theoretical guarantees

Affiliation: KTH Royal Institute of Technology

Website: https://www.kth.se/profile/bongole

Samuel Blad

Keywords: Curiosity driven exploration

Affiliation: Örebro University and Nexer

Website: https://www.oru.se/personal/samuel_blad

Sara Karimi

Keywords: DRL, RL in games, generalization, scalability

Affiliation: KTH Royal Institute of Technology

Website: https://www.kth.se/profile/sarakari

Stefan Stojanovic

Keywords: Zero-shot RL, theoretical guarantees for RL

Affiliation: KTH Royal Institute of Technology

Website: https://www.kth.se/profile/stesto

Supratim Manna

Keywords:

Affiliation: Linköping University

Website: https://liu.se/en/employee/supma61

Tinh Cao

Keywords:

Affiliation: Uppsala University

Website: https://www.uu.se/en/contact-and-organisation/staff?query=N24-2682

Ying Wang

Keywords: Learning for control, System identification

Affiliation: KTH Royal Institute of Technology

Website: https://www.kth.se/profile/yinwang

Yufei Zhu

Keywords: Human motion prediction, dynamics mapping

Affiliation: Örebro University

Website: https://www.oru.se/personal/yufei_zhu

Current Cluster Leader

Stefan Stojanovic

PhD student, Division of Decision and Control Systems, KTH
