Student Projects

Counterfactual Explanations for Unsupervised Learning

Status: Assigned November 2022
Student: Aurora Spagnol

Artificial Intelligence (AI) is slowly but surely becoming an integral part of our daily lives. However, as decisions derived from AI systems increasingly affect human lives (e.g., in medicine and law), there is an emerging need to understand how AI methods arrive at such decisions. Furthermore, the “right to explanation” foreshadowed by the General Data Protection Regulation (GDPR) [1] has challenged the Machine Learning (ML) community to build explainability into predictive models and their outputs. This paradigm shift, in which predictive performance is no longer the only (or main) objective, gives rise to two distinct viewpoints. The first argues that algorithmic black boxes should continue to be optimised for predictive power, with explainability needs fulfilled, where possible, through post-hoc methods; it assumes that the two goals are incompatible, so that one must be sacrificed for the other [2]. The second disputes this trade-off as purely anecdotal and argues for building inherently transparent models, especially for high-stakes decisions [3].

Counterfactuals are an explainability approach uniquely positioned in this space as they can be generated post-hoc but remain truthful with respect to the underlying black box (i.e., exhibit full fidelity). They enable ML users to understand what the output of a predictive model would have been had the instance in question changed in a particular way. This type of counterfactual analysis helps the explainees to simulate certain aspects of the ML model, thus improving its interpretability [4]. Notably, evidence from psychology and cognitive sciences suggests that people use counterfactual reasoning daily to analyse what could have happened had they acted differently [5].
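As an illustration of the idea, the following minimal Python sketch searches for the smallest change to an instance that flips the output of a black-box model. Everything in it is an assumption made for the example: the toy `predict` function, the two features (income and debt), the threshold and the brute-force search are all hypothetical and are not taken from any of the cited methods. Because every candidate is checked against the model itself, the resulting explanation is faithful to the black box by construction.

```python
from itertools import product


def predict(instance):
    """Toy black box (hypothetical): approve when income - 2 * debt >= 10."""
    income, debt = instance
    return "approve" if income - 2 * debt >= 10 else "reject"


def find_counterfactual(instance, target, steps=range(-10, 11)):
    """Brute-force search for the smallest change (L1 distance) that yields `target`."""
    best, best_cost = None, float("inf")
    for delta in product(steps, repeat=len(instance)):
        candidate = tuple(x + d for x, d in zip(instance, delta))
        cost = sum(abs(d) for d in delta)
        # Query the model directly, so the counterfactual is true to the model.
        if predict(candidate) == target and cost < best_cost:
            best, best_cost = candidate, cost
    return best, best_cost


original = (20, 8)  # income 20, debt 8: 20 - 16 = 4, so the model rejects
counterfactual, cost = find_counterfactual(original, "approve")
print(f"Had debt been {counterfactual[1]} instead of {original[1]}, "
      f"the outcome would have been '{predict(counterfactual)}'.")
```

The printed sentence is the counterfactual explanation: it tells the explainee which change to the instance would have produced a different outcome.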

The goal of this project is to explore existing explainable AI (XAI) methods, with a focus on counterfactual-based XAI, and possibly to develop a new method that improves on existing solutions.

Specifically, this Master's thesis has four main tasks:

  1. Review the related scientific literature on counterfactual explanations for unsupervised models and on counterfactual explanations for wearable sensor data.
  2. Preprocess the LAUREATE dataset [7] and perform exploratory data analysis on it.
  3. Develop Unsupervised BayCon, an unsupervised version of the supervised counterfactual generator BayCon [6].
  4. Perform a small user study with 5-10 participants to evaluate the quality of the explanations generated by Unsupervised BayCon.
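
To make the unsupervised setting concrete, the sketch below shows one possible reading of a counterfactual for a clustering model: the change that moves an instance from its current cluster into a target cluster. It is not BayCon or Unsupervised BayCon. The nearest-centroid model, the line search towards the target centroid, the feature names and all numbers are assumptions chosen purely for illustration, and the search returns a valid counterfactual rather than a provably minimal one.

```python
import math


def assign(point, centroids):
    """Nearest-centroid assignment, as used by k-means at prediction time."""
    return min(range(len(centroids)), key=lambda k: math.dist(point, centroids[k]))


def cluster_counterfactual(point, centroids, target, tol=1e-6):
    """Move `point` towards the target centroid just far enough to switch cluster."""
    goal = centroids[target]
    lo, hi = 0.0, 1.0  # fraction of the way from `point` to the target centroid
    while hi - lo > tol:
        mid = (lo + hi) / 2
        candidate = [p + mid * (g - p) for p, g in zip(point, goal)]
        if assign(candidate, centroids) == target:
            hi = mid  # already inside the target cluster: try a smaller move
        else:
            lo = mid  # still outside: a larger move is needed
    return [p + hi * (g - p) for p, g in zip(point, goal)]


# Made-up centroids over (heart rate, skin temperature); not real data.
centroids = [(60.0, 32.0), (110.0, 35.0)]
point = (70.0, 33.0)
cf = cluster_counterfactual(point, centroids, target=1)
print(assign(point, centroids), "->", assign(cf, centroids))
```

The bisection relies on the fact that a nearest-centroid cluster is a convex region containing its own centroid, so once the moving point enters the target cluster it stays inside for the rest of the path.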


  1. B. Goodman, S. Flaxman. European Union regulations on algorithmic decision-making and a “right to explanation”. AI Magazine 38, no. 3:50-57, 2017.
  3. Rudin, C. Stop explaining black-box machine learning models for high-stakes decisions and use interpretable models instead. Nature Machine Intelligence 1, no. 5:206-215, 2019.
  4. Robert Hoffman, Tim Miller, Shane T. Mueller, Gary Klein, and William J. Clancey. Explaining explanation, part 4: A deep dive on deep nets. IEEE Intelligent Systems 33, no. 3:87-95, 2018.
  5. Ruth M. J. Byrne. The rational imagination: How people create alternatives to reality. MIT Press, Cambridge, Massachusetts, 2005.
  6. Romashov, P., Gjoreski, M., Sokol, K., Martinez, M. V., & Langheinrich, M. BayCon: Model-agnostic Bayesian Counterfactual Generator.
  7. In-house dataset (30+ participants) that contains longitudinal data from physiological sensors (heart rate, sweating rate, skin temperature and acceleration). To be provided by the supervisors.


For more information contact: Martin Gjoreski