The SDM & RL core technology cluster provides a network for researchers working on sequential decision-making problems.
We, the SDM & RL cluster, focus on the field of sequential decision-making. We are interested in developing algorithms and models that enable intelligent agents to make good decisions in dynamic and uncertain environments. Most of us work on applied or foundational (deep) reinforcement learning, while another large subset focuses on robotics and control theory.
Goal: The main purpose of the cluster is to provide a network for practitioners and researchers in the SDM & RL field, to advance our understanding of the fundamental principles underlying intelligent, sequential decision-making, to foster scientific discussion and collaboration, and to exchange practical knowledge.
Join the Cluster
The cluster uses the public Slack channel:
#ctc_sequential_decision_making_and_reinforcement_learning
To get started, simply fill out the cluster registration form and join us on Slack!
Cluster Activities
In line with the cluster's goal, we meet bi-weekly via Zoom and typically engage in one of the following activities:
- Research discussions: Cluster members are encouraged to present their research in SDM & RL, engaging in valuable scientific discourse.
- Paper discussions: The cluster discusses state-of-the-art papers from the field. Typically, one person presents a summary of the paper, followed by an open discussion.
- Practical knowledge exchange: The cluster discusses implementation details, hyperparameters, and heuristics associated with the complex code underlying sequential decision-making problems and algorithms.
- External speaker invitations: The cluster occasionally invites external speakers, including from industry, to talk about their SDM & RL research.
Recent Publications from Cluster Members
A selection of the latest research published by members of our cluster:
- Enhancing pre-trained decision transformers with prompt-tuning bandits - Finn Rietz, Sara Karimi et al. (2025)
- A graph-based reinforcement learning approach with frontier potential based reward for safe cluttered environment exploration - Gabriele Calzolari et al. (2025)
- Efficient prior selection in Gaussian process bandits with Thompson sampling - Jack Sandberg et al. (2025)
- Automatic planning and optimization of a laser radar inspection system - Jack Sandberg et al. (2025)
- Identifying 14-3-3 interactome binding sites with deep learning - Laura van Weesep et al. (2025)
- Platoon coordination and leader selection in mixed transportation systems via dynamic programming - Ying Wang et al. (2025)
- Fast online learning of CLiFF-maps in changing environments - Yufei Zhu et al. (2025)
- Learning to ground existentially quantified goals - Martin Funkquist et al. (2024)
- SCORE: skill-conditioned online reinforcement learning - Sara Karimi et al. (2024)
- Model-free low-rank reinforcement learning via leveraged entry-wise matrix estimation - Stefan Stojanovic et al. (2024)
- Identifiable latent bandits: Combining observational data and exploration for personalized healthcare - Ahmet Balcioglu et al. (2024)
- Multi-agent obstacle avoidance using velocity obstacles and control barrier functions - Alejandro Sánchez Roncero et al. (2024)
- Diversity-aware reinforcement learning for de novo drug design - Hampus Gummesson Svensson et al. (2024)
- Towards interpretable reinforcement learning with constrained normalizing flow policies - Finn Rietz et al. (2024)
- Decentralized multi-agent reinforcement learning exploration with inter-agent communication-based action space - Gabriele Calzolari et al. (2024)
Active Cluster Members
Keywords: Non-linear independent component analysis, Causal representation learning
Affiliation: Chalmers University of Technology
Website: https://selozhd.github.io/
Keywords: Autonomous anti-drone systems
Affiliation: KTH Royal Institute of Technology
Website: https://www.kth.se/profile/alesr
Keywords: Autonomous Vehicles
Affiliation: Chalmers University of Technology
Keywords: RL (offline, online, offline-to-online), Multi-task Transfer Learning, Constrained RL, In-context RL
Affiliation: Örebro University
Website: https://www.finnrietz.dev/
Keywords: Heterogeneous Multi-Agent Reinforcement Learning, Safe RL
Affiliation: Luleå University of Technology
Keywords: Autonomous drug design, bandits, RL, active learning/online learning
Affiliation: Chalmers University of Technology
Keywords: Goal-conditioned RL, Vision, Pose Estimation, Robotics, Sim2Real
Affiliation: Lund University
Website: https://portal.research.lu.se/sv/persons/hampus-åström/
Keywords: Multi-armed bandits, Gaussian process bandits, Bayesian optimization
Affiliation: Chalmers University of Technology
Keywords: Agentic systems for drug discovery
Affiliation: Uppsala University and AstraZeneca
Website: https://www.uu.se/en/contact-and-organisation/staff?query=N25-892
Keywords: Sequential decision making, classical planning
Affiliation: Linköping University
Website: https://martin36.github.io/
Keywords: Transfer and imitation learning, vision-based navigation for UAVs
Affiliation: Örebro University
Website: https://meraccos.com/
Keywords: Game theory and MARL for drones
Affiliation: Chalmers University of Technology and SAAB
Keywords: Sequential decision making, Theoretical guarantees
Affiliation: KTH Royal Institute of Technology
Website: https://www.kth.se/profile/bongole
Keywords: Curiosity driven exploration
Affiliation: Örebro University and Nexer
Keywords: DRL, RL in games, generalization, scalability
Affiliation: KTH Royal Institute of Technology
Website: https://www.kth.se/profile/sarakari
Keywords: Zero-shot RL, theoretical guarantees for RL
Affiliation: KTH Royal Institute of Technology
Website: https://www.kth.se/profile/stesto
Keywords: Learning for control, System identification
Affiliation: KTH Royal Institute of Technology
Website: https://www.kth.se/profile/yinwang
Keywords: Human motion prediction, dynamics mapping
Affiliation: Örebro University