Every month, we want to acknowledge some valuable TAILOR papers, selected by TAILOR principal investigator Fredrik Heintz from among the papers published by scientists in our network.
The list of the most valuable papers gathers contributions from different TAILOR partners, each providing insights on different topics related to Trustworthy AI.
Stay tuned for other valuable insights and groundbreaking research from our diverse community!
Generative Model for Decision Trees
Riccardo Guidotti, Anna Monreale, Mattia Setzu, Giulia Volpi
Proceedings of the AAAI Conference on Artificial Intelligence, Vol. 38 No. 19: AAAI-24 Special Track on Safe, Robust and Responsible AI
Abstract: Decision trees are among the most popular supervised models due to their interpretability and knowledge representation resembling human reasoning. Commonly-used decision tree induction algorithms are based on greedy top-down strategies. Although these approaches are known to be an efficient heuristic, the resulting trees are only locally optimal and tend to have overly complex structures. On the other hand, optimal decision tree algorithms attempt to create an entire decision tree at once to achieve global optimality. We place our proposal between these approaches by designing a generative model for decision trees. Our method first learns a latent decision tree space through a variational architecture using pre-trained decision tree models. Then, it adopts a genetic procedure to explore such latent space to find a compact decision tree with good predictive performance. We compare our proposal against classical tree induction methods, optimal approaches, and ensemble models. The results show that our proposal can generate accurate and shallow, i.e., interpretable, decision trees.
Download the paper here: https://ojs.aaai.org/index.php/AAAI/article/view/30104
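To make the two-stage approach in the abstract above more concrete, here is a minimal sketch of the second stage: a genetic search over an already-learned latent space of decision trees. The decoder, the fitness function, and every hyperparameter here are hypothetical placeholders; the paper's actual variational architecture and genetic operators differ.

```python
import numpy as np

def genetic_latent_search(decode_tree, fitness, latent_dim, pop_size=50,
                          generations=100, mutation_scale=0.1, seed=0):
    """Search a learned latent space of decision trees for a compact, accurate tree.

    decode_tree: placeholder for a pre-trained decoder mapping a latent vector to a tree.
    fitness:     placeholder score, e.g. validation accuracy minus a depth penalty.
    """
    rng = np.random.default_rng(seed)
    population = rng.normal(size=(pop_size, latent_dim))        # sample from the latent prior
    for _ in range(generations):
        scores = np.array([fitness(decode_tree(z)) for z in population])
        parents = population[np.argsort(scores)[-pop_size // 2:]]   # keep the fittest half
        pairs = rng.integers(len(parents), size=(pop_size, 2))
        # crossover by averaging parent pairs, then Gaussian mutation
        population = parents[pairs].mean(axis=1) + rng.normal(scale=mutation_scale,
                                                              size=(pop_size, latent_dim))
    best = max(population, key=lambda z: fitness(decode_tree(z)))
    return decode_tree(best)
```

A fitness that rewards accuracy while penalising depth is what steers such a search toward the shallow, interpretable trees the abstract reports.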
Advancing Dermatological Diagnostics: Interpretable AI for Enhanced Skin Lesion Classification
Carlo Metta, Andrea Beretta, Riccardo Guidotti, Yuan Yin, Patrick Gallinari, Salvatore Rinzivillo, Fosca Giannotti
Diagnostics 2024, 14(7), 753; https://doi.org/10.3390/diagnostics14070753
Abstract: A crucial challenge in critical settings like medical diagnosis is making deep learning models used in decision-making systems interpretable. Efforts in Explainable Artificial Intelligence (XAI) are underway to address this challenge. Yet, many XAI methods are evaluated on broad classifiers and fail to address complex, real-world issues, such as medical diagnosis. In our study, we focus on enhancing user trust and confidence in automated AI decision-making systems, particularly for diagnosing skin lesions, by tailoring an XAI method to explain an AI model’s ability to identify various skin lesion types. We generate explanations using synthetic images of skin lesions as examples and counterexamples, offering a method for practitioners to pinpoint the critical features influencing the classification outcome. A validation survey involving domain experts, novices, and laypersons has demonstrated that explanations increase trust and confidence in the automated decision system. Furthermore, our exploration of the model’s latent space reveals clear separations among the most common skin lesion classes, a distinction that likely arises from the unique characteristics of each class and could assist in correcting frequent misdiagnoses by human professionals.
Download the paper here: https://www.mdpi.com/2075-4418/14/7/753
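The explanations described above rely on synthetic exemplars and counter-exemplars drawn from the model's latent space. The sketch below illustrates that general idea with a naive random walk in latent space until the prediction flips; the encoder, decoder, and classifier are placeholder callables, and the paper's actual generation procedure is considerably more refined.

```python
import numpy as np

def synthetic_counterexample(image, encoder, decoder, classifier,
                             step=0.05, max_iters=200, seed=0):
    """Perturb an image's latent code until the classifier changes its prediction.

    The returned synthetic image can be shown to a practitioner as a
    counter-exemplar highlighting which changes flip the classification.
    All three model arguments are hypothetical placeholders.
    """
    rng = np.random.default_rng(seed)
    z = encoder(image)
    original_label = classifier(decoder(z))
    for _ in range(max_iters):
        z = z + rng.normal(scale=step, size=z.shape)   # small random step in latent space
        candidate = decoder(z)
        if classifier(candidate) != original_label:    # prediction flipped
            return candidate
    return None                                        # no counter-exemplar found within budget
```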
Semantic enrichment of explanations of AI models for healthcare
Corbucci L., Monreale A., Panigutti C., Natilli M., Smiraglio S., Pedreschi D.
DS 2023: 26th International Conference on Discovery Science, pp. 216–229, Porto, Portugal, 9–11 October 2023
Abstract: Explaining AI-based clinical decision support systems is crucial to enhancing clinician trust in those powerful systems. Unfortunately, current explanations provided by eXplainable Artificial Intelligence techniques are not easily understandable by experts outside of AI. As a consequence, the enrichment of explanations with relevant clinical information concerning the health status of a patient is fundamental to increasing human experts’ ability to assess the reliability of AI decisions. Therefore, in this paper, we propose a methodology to enable clinical reasoning by semantically enriching AI explanations. Starting with a medical AI explanation based only on the input features provided to the algorithm, our methodology leverages medical ontologies and NLP embedding techniques to link relevant information present in the patient’s clinical notes to the original explanation. Our experiments, involving a human expert, highlight promising performance in correctly identifying relevant information about the diseases of the patients.
Download the paper here: https://www.researchgate.net/publication/374541700_Semantic_Enrichment_of_Explanations_of_AI_Models_for_Healthcare
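As a rough illustration of the enrichment step described above, the sketch below links explanation features to the most similar sentences in a patient's clinical notes using plain TF-IDF similarity; the paper instead combines medical ontologies with NLP embeddings, so this is a simplified stand-in with invented example data.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.metrics.pairwise import cosine_similarity

def link_notes_to_explanation(feature_names, note_sentences, top_k=2):
    """For each explanation feature, retrieve the most related clinical-note sentences."""
    vectorizer = TfidfVectorizer()
    vectors = vectorizer.fit_transform(feature_names + note_sentences)
    feature_vecs = vectors[:len(feature_names)]
    note_vecs = vectors[len(feature_names):]
    similarity = cosine_similarity(feature_vecs, note_vecs)
    return {feature: [note_sentences[i] for i in similarity[row].argsort()[::-1][:top_k]]
            for row, feature in enumerate(feature_names)}

# Invented example: enrich a two-feature explanation with sentences from the notes.
enriched = link_notes_to_explanation(
    ["creatinine", "blood pressure"],
    ["Patient reports elevated blood pressure in the morning.",
     "Creatinine levels have remained stable since the last visit.",
     "No known drug allergies."])
```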
Mitigating Biases in Collective Decision-Making: Enhancing Performance in the Face of Fake News
Axel Abels, Elias Fernandez Domingos, Ann Nowé, Tom Lenaerts
arXiv abs/2403.08829 (2024)
Abstract: Individual and social biases undermine the effectiveness of human advisers by inducing judgment errors which can disadvantage protected groups. In this paper, we study the influence these biases can have in the pervasive problem of fake news by evaluating human participants’ capacity to identify false headlines. By focusing on headlines involving sensitive characteristics, we gather a comprehensive dataset to explore how human responses are shaped by their biases. Our analysis reveals recurring individual biases and their permeation into collective decisions. We show that demographic factors, headline categories, and the manner in which information is presented significantly influence errors in human judgment. We then use our collected data as a benchmark problem on which we evaluate the efficacy of adaptive aggregation algorithms. In addition to their improved accuracy, our results highlight the interactions between the emergence of collective intelligence and the mitigation of participant biases.
Download the paper here: https://arxiv.org/pdf/2403.08829.pdf
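The adaptive aggregation algorithms mentioned in the abstract are benchmarked on the collected fake-news judgments. As a point of reference, here is a sketch of one classical adaptive aggregation rule, weighted majority voting with multiplicative weight updates; it is not the paper's own algorithm, and the data shapes are assumptions for illustration.

```python
import numpy as np

def weighted_majority(votes, truths, eta=0.5):
    """Aggregate binary true/fake judgments from several participants over many rounds.

    votes:  array of shape (n_rounds, n_participants) with 0/1 judgments per headline.
    truths: array of shape (n_rounds,) with the ground-truth 0/1 labels.
    Participants who err repeatedly (for instance due to bias on certain headline
    categories) gradually lose influence on the collective decision.
    """
    weights = np.ones(votes.shape[1])
    decisions = []
    for round_votes, truth in zip(votes, truths):
        mass_for_one = weights[round_votes == 1].sum()
        decision = int(mass_for_one >= weights.sum() / 2)   # weighted majority vote
        decisions.append(decision)
        weights[round_votes != truth] *= (1 - eta)          # down-weight wrong participants
    return np.array(decisions), weights
```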
Towards Quantifying the Effect of Datasets for Benchmarking: A Look at Tabular Machine Learning
Ravin Kohli, Matthias Feurer, Katharina Eggensperger, Bernd Bischl, Frank Hutter
Data-centric Machine Learning Research (DMLR) Workshop at ICLR (2024)
Abstract: Data in tabular form makes up a large part of real-world ML applications, and thus, there has been a strong interest in developing novel deep learning (DL) architectures for supervised learning on tabular data in recent years. As a result, there is a debate as to whether DL methods are superior to the ubiquitous ensembles of boosted decision trees. Typically, the advantage of one model class over the other is claimed based on an empirical evaluation, where different variations of both model classes are compared on a set of benchmark datasets that supposedly resemble relevant real-world tabular data. While the landscape of state-of-the-art models for tabular data changed, one factor has remained largely constant over the years: The datasets. Here, we examine 30 recent publications and 187 different datasets they use, in terms of age, study size and relevance. We found that the average study used less than 10 datasets and that half of the datasets are older than 20 years. Our insights raise questions about the conclusions drawn from previous studies and urge the research community to develop and publish additional recent, challenging and relevant datasets and ML tasks for supervised learning on tabular data.
Download the paper here: https://ml.informatik.uni-freiburg.de/wp-content/uploads/2024/04/61_towards_quantifying_the_effect.pdf
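For readers curious what the paper's meta-analysis boils down to in practice, the toy snippet below computes the two headline statistics (dataset age and datasets per study) on invented metadata; the actual numbers in the abstract come from the 187 surveyed datasets, not from this example.

```python
import pandas as pd

# Invented metadata standing in for the surveyed benchmark datasets and studies.
datasets = pd.DataFrame({
    "name": ["adult", "covertype", "higgs", "credit-g"],
    "year_published": [1996, 1998, 2014, 1994],
})
studies = pd.DataFrame({
    "study": ["paper_a", "paper_a", "paper_b"],
    "dataset": ["adult", "higgs", "adult"],
})

reference_year = 2024
age = reference_year - datasets["year_published"]
print("share of datasets older than 20 years:", (age > 20).mean())
print("average number of datasets per study:",
      studies.groupby("study")["dataset"].nunique().mean())
```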