Every month, we acknowledge a selection of valuable TAILOR papers, chosen by TAILOR principal investigator Fredrik Heintz from among the papers published by scientists in our network.
The list gathers contributions from different TAILOR partners, each offering valuable insights on topics related to Trustworthy AI.
Stay tuned for other valuable insights and groundbreaking research from our diverse community!
Abstraction Heuristics for Factored Tasks
C. Büchner, P. Ferber, J. Seipp, and M. Helmert
Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), 2024, pp. 40–49. doi: 10.1609/icaps.v34i1.31459.
Abstract: One of the strongest approaches for optimal classical planning is A* search with heuristics based on abstractions of the planning task. Abstraction heuristics are well studied in planning formalisms without conditional effects such as SAS+. However, conditional effects are crucial to model many planning tasks compactly. In this paper, we focus on factored tasks which allow a specific form of conditional effect, where effects on variable x can only depend on the value of x. We generalize projections, domain abstractions, Cartesian abstractions and the counterexample-guided abstraction refinement method to this formalism. While merge-and-shrink already covers factored tasks in theory, we provide an implementation that does so. In our experiments, we compare these abstraction-based heuristics to other heuristics supporting conditional effects, as well as symbolic search. On our new benchmark set of factored tasks, pattern database heuristics solve the most problems, followed by symbolic approaches on par with domain abstractions. The more general Cartesian abstractions fall behind in terms of coverage but usually solve problems the fastest among all tested approaches. The generality of merge-and-shrink abstractions does not seem to be beneficial for these factored tasks.
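The idea behind projection-based abstraction heuristics (pattern databases) can be sketched in a few lines. The toy task below, with its variables, operators, and goal, is invented for illustration and is not one of the paper's benchmarks: the task is projected onto a subset of variables, exact goal distances are computed in the small abstract state space, and those distances serve as an admissible heuristic for the concrete task.

```python
from collections import deque
from itertools import product

# Toy SAS+-style task (invented for illustration): operators are
# (precondition, effect) pairs, each a dict mapping variable -> value.
variables = {"x": (0, 1, 2), "y": (0, 1)}
operators = [
    ({"x": 0}, {"x": 1}),
    ({"x": 1}, {"x": 2}),
    ({"y": 0}, {"y": 1}),
]
goal = {"x": 2, "y": 1}

def pdb(pattern):
    """Pattern database: exact goal distances in the projection onto `pattern`."""
    states = list(product(*(variables[v] for v in pattern)))
    idx = {v: i for i, v in enumerate(pattern)}

    def successors(s):
        for pre, eff in operators:
            # An operator applies in the abstraction if its precondition,
            # restricted to the pattern, is satisfied.
            if all(s[idx[v]] == val for v, val in pre.items() if v in idx):
                t = list(s)
                for v, val in eff.items():
                    if v in idx:
                        t[idx[v]] = val
                yield tuple(t)

    # Backward breadth-first search from the abstract goal states.
    preds = {s: set() for s in states}
    for s in states:
        for t in successors(s):
            preds[t].add(s)
    goals = [s for s in states
             if all(s[idx[v]] == val for v, val in goal.items() if v in idx)]
    dist = {s: 0 for s in goals}
    queue = deque(goals)
    while queue:
        t = queue.popleft()
        for s in preds[t]:
            if s not in dist:
                dist[s] = dist[t] + 1
                queue.append(s)
    # The heuristic value of a concrete state is the distance of its projection.
    return lambda state: dist.get(tuple(state[v] for v in pattern), float("inf"))

h = pdb(("x",))
print(h({"x": 0, "y": 0}))  # -> 2 (admissible: the shortest real plan has 3 steps)
```

Projecting onto the full variable set recovers the exact goal distance; smaller patterns trade accuracy for a smaller abstract state space.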
Bayesian Ensembles for Exploration in Deep Q-Learning
P. R. van der Vaart, N. Yorke-Smith, and M. T. J. Spaan
Proceedings of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2024, pp. 2528–2530. [Online]. Available: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85196366068&partnerID=40&md5=483b1464887e8fbc67ca29c955e0c2a5
Abstract: Exploration in reinforcement learning remains a difficult challenge. In order to drive exploration, ensembles with randomized prior functions have recently been popularized to quantify uncertainty in the value model. There is no theoretical reason for these ensembles to resemble the actual posterior, however. In this work, we view training ensembles from the perspective of Sequential Monte Carlo, a Monte Carlo method that approximates a sequence of distributions with a set of particles. In particular, we propose an algorithm that exploits both the practical flexibility of ensembles and theory of the Bayesian paradigm. We incorporate this method into a standard Deep Q-learning agent (DQN) and experimentally show qualitatively good uncertainty quantification and improved exploration capabilities over a regular ensemble.
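The randomized-prior ensembles that the paper takes as its baseline can be sketched roughly as follows. This is a toy 1-D regression stand-in for value estimation (NumPy, with invented data and names), not the paper's Sequential Monte Carlo method: each member adds a fixed, untrained random prior to a trainable part fit on a bootstrap resample, and the spread across members serves as an uncertainty estimate.

```python
import numpy as np

rng = np.random.default_rng(0)

# Invented toy data standing in for value targets.
X = rng.uniform(-1, 1, size=(40, 1))
y = np.sin(3 * X[:, 0]) + 0.1 * rng.standard_normal(40)

# Shared random Fourier features so each ensemble member is a linear model.
W = rng.standard_normal((1, 50))
b = rng.uniform(0, 2 * np.pi, 50)
phi = lambda x: np.cos(np.atleast_2d(x) @ W + b)

beta, members = 1.0, []           # beta scales the prior's contribution
for _ in range(10):
    prior = rng.standard_normal(50) / np.sqrt(50)  # fixed, never trained
    boot = rng.integers(0, len(X), len(X))         # bootstrap resample
    Phi = phi(X[boot])
    # Train only the residual left after the prior's contribution.
    w, *_ = np.linalg.lstsq(Phi, y[boot] - beta * Phi @ prior, rcond=None)
    members.append((w, prior))

def predict(x):
    preds = np.array([phi(x) @ w + beta * phi(x) @ p for w, p in members])
    return preds.mean(), preds.std()   # ensemble mean and epistemic spread

mean_in, std_in = predict([[0.0]])     # inside the training range
mean_out, std_out = predict([[3.0]])   # far outside the training range
```

Because every member fits the data but carries a different frozen prior, members tend to agree where data constrains them and disagree elsewhere, which is the signal used to drive exploration.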
Drifting explanations in continual learning
A. Cossu, F. Spinnato, R. Guidotti, and D. Bacciu
Neurocomputing, vol. 597, 2024, doi: 10.1016/j.neucom.2024.127960.
Abstract: Continual Learning (CL) trains models on streams of data, with the aim of learning new information without forgetting previous knowledge. However, many of these models lack interpretability, making it difficult to understand or explain how they make decisions. This lack of interpretability becomes even more challenging given the non-stationary nature of the data streams in CL. Furthermore, CL strategies aimed at mitigating forgetting directly impact the learned representations. We study the behavior of different explanation methods in CL and propose CLEX (ContinuaL EXplanations), an evaluation protocol to robustly assess the change of explanations in Class-Incremental scenarios, where forgetting is pronounced. We observed that models with similar predictive accuracy do not generate similar explanations. Replay-based strategies, well-known to be some of the most effective ones in class-incremental scenarios, are able to generate explanations that are aligned to the ones of a model trained offline. On the contrary, naive fine-tuning often results in degenerate explanations that drift from the ones of an offline model. Finally, we discovered that even replay strategies do not always operate at best when applied to fully-trained recurrent models. Instead, randomized recurrent models (leveraging on an untrained recurrent component) clearly reduce the drift of the explanations. This discrepancy between fully-trained and randomized recurrent models, previously known only in the context of their predictive continual performance, is more general, including also continual explanations.
Evaluating language models for mathematics through interactions
K. M. Collins et al.
Proceedings of the National Academy of Sciences of the United States of America, vol. 121, no. 24, 2024, doi: 10.1073/pnas.2318124121.
General Policies, Subgoal Structure, and Planning Width
B. Bonet and H. Geffner
Journal of Artificial Intelligence Research, vol. 80, pp. 475–516, 2024, doi: 10.1613/jair.1.15581.
Abstract: It has been observed that many classical planning domains with atomic goals can be solved by means of a simple polynomial exploration procedure, called IW, that runs in time exponential in the problem width, which in these cases is bounded and small. Yet, while the notion of width has become part of state-of-the-art planning algorithms such as BFWS, there is no good explanation for why so many benchmark domains have bounded width when atomic goals are considered. In this work, we address this question by relating bounded width with the existence of general optimal policies that in each planning instance are represented by tuples of atoms of bounded size. We also define the notions of (explicit) serializations and serialized width that have a broader scope, as many domains have a bounded serialized width but no bounded width. Such problems are solved nonoptimally in polynomial time by a variant of the Serialized IW algorithm. Finally, the language of general policies and the semantics of serializations are combined to yield a simple, meaningful, and expressive language for specifying serializations in compact form as sketches, which can be used for encoding domain control knowledge by hand or for learning it from examples. Sketches express general problem decompositions in terms of subgoals, and terminating sketches of bounded width express problem decompositions that can be solved in polynomial time.
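The IW procedure the abstract refers to can be sketched for width 1: a breadth-first search that prunes every state making no atom (variable/value pair) true for the first time. The counter task below is invented for illustration and happens to have width 1, so IW(1) solves it:

```python
from collections import deque

def iw1(initial, successors, is_goal):
    """IW(1): breadth-first search pruning every state that makes no
    atom (variable/value pair) true for the first time."""
    atoms = lambda state: set(enumerate(state))
    seen = atoms(initial)
    queue = deque([(initial, [])])
    while queue:
        state, plan = queue.popleft()
        if is_goal(state):
            return plan
        for action, nxt in successors(state):
            novel = atoms(nxt) - seen
            if novel:                 # keep only states with a novel atom
                seen |= novel
                queue.append((nxt, plan + [action]))
    return None                       # the task's width exceeds 1

# Invented width-1 task: drive a single counter from 0 to 5.
succ = lambda s: [("inc", (s[0] + 1,))] if s[0] < 5 else []
plan = iw1((0,), succ, lambda s: s[0] == 5)
print(plan)  # -> ['inc', 'inc', 'inc', 'inc', 'inc']
```

The pruning keeps the number of expanded states linear in the number of atoms, which is why IW(k) runs in time exponential only in k rather than in the full state-space size.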
Machine learning augmented branch and bound for mixed integer linear programming
L. Scavuzzo, K. Aardal, A. Lodi, and N. Yorke-Smith
Mathematical Programming, 2024, doi: 10.1007/s10107-024-02130-y.
Abstract: Mixed Integer Linear Programming (MILP) is a pillar of mathematical optimization that offers a powerful modeling language for a wide range of applications. The main engine for solving MILPs is the branch-and-bound algorithm. Adding to the enormous algorithmic progress in MILP solving of the past decades, in more recent years there has been an explosive development in the use of machine learning for enhancing all main tasks involved in the branch-and-bound algorithm. These include primal heuristics, branching, cutting planes, node selection and solver configuration decisions. This article presents a survey of such approaches, addressing the vision of integration of machine learning and mathematical optimization as complementary technologies, and how this integration can benefit MILP solving. In particular, we give detailed attention to machine learning algorithms that automatically optimize some metric of branch-and-bound efficiency. We also address appropriate MILP representations, benchmarks and software tools used in the context of applying learning algorithms.
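One place learning enters branch and bound can be sketched with a tiny 0/1 knapsack solver whose branching rule is a pluggable score function. The instance and all names are invented; here the score is a hand-coded "most fractional" rule, and an ML model, e.g. one trained to imitate strong branching as surveyed in the paper, would slot in as a drop-in replacement for `score`.

```python
# Toy 0/1 knapsack solved by branch and bound with a pluggable branching
# rule (instance invented for illustration; its optimum is 220).
values  = (60, 100, 120)
weights = (10, 20, 30)
cap = 50
order = sorted(range(3), key=lambda i: values[i] / weights[i], reverse=True)

def relax(fixed):
    """Dantzig LP relaxation under the partial assignment `fixed`."""
    room = cap - sum(weights[i] for i, v in fixed.items() if v)
    if room < 0:
        return float("-inf"), {}
    bound = sum(values[i] for i, v in fixed.items() if v)
    sol = {i: float(v) for i, v in fixed.items()}
    for i in order:
        if i in fixed:
            continue
        if weights[i] <= room:
            sol[i], room, bound = 1.0, room - weights[i], bound + values[i]
        else:
            sol[i] = room / weights[i]           # at most one fractional item
            bound += sol[i] * values[i]
            room = 0
    return bound, sol

def score(i, sol):
    # Hand-coded "most fractional" branching rule. A learned model would
    # replace this function while the rest of the solver stays unchanged.
    return -abs(sol[i] - 0.5)

best = 0
def solve(fixed):
    global best
    bound, sol = relax(fixed)
    if bound <= best:
        return                                   # prune by LP bound
    frac = [i for i in sol if 0.0 < sol[i] < 1.0]
    if not frac:
        best = max(best, int(round(bound)))      # integral: new incumbent
        return
    i = max(frac, key=lambda i: score(i, sol))   # branching decision
    solve({**fixed, i: 1})
    solve({**fixed, i: 0})

solve({})
print(best)  # -> 220
```

Because the branching decision is isolated behind `score`, the same skeleton supports comparing hand-coded rules against learned ones, which is the experimental setup many of the surveyed works share.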
Planning with Object Creation
A. B. Corrêa, G. De Giacomo, M. Helmert, and S. Rubin
Proceedings of the International Conference on Automated Planning and Scheduling (ICAPS), 2024, pp. 104–113. doi: 10.1609/icaps.v34i1.31466.
Abstract: Classical planning problems are defined using some specification language, such as PDDL. The domain expert defines action schemas, objects, the initial state, and the goal. One key aspect of PDDL is that the set of objects cannot be modified during plan execution. While this is fine in many domains, sometimes it makes modeling more complicated. This may impact the performance of planners, and it requires the domain expert to bound the number of required objects beforehand, which can be a challenge. We introduce an extension to the classical planning formalism, where action effects can create and remove objects. This problem is semi-decidable, but it becomes decidable if we can bound the number of objects in any given state, even though the state space is still infinite. On the practical side, we extend the Powerlifted planning system to support this PDDL extension. Our results show that this extension improves the performance of Powerlifted while supporting more natural PDDL models.
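The core idea, actions that create fresh objects together with a bound on objects per state to keep search finite, can be sketched with a toy breadth-first search. The bound and action names are invented; since the objects in this example are interchangeable, a state is represented simply by an object count (a real planner, like the extended Powerlifted, must of course handle named objects and relations).

```python
from collections import deque

# Toy forward search where an action creates a fresh object, echoing the
# paper's PDDL extension. Bounding the objects per state keeps the search
# space finite even though unrestricted creation would make it infinite.
MAX_OBJECTS = 5

def successors(objs):
    if objs < MAX_OBJECTS:
        yield ("create", objs + 1)    # create a fresh object
    if objs > 0:
        yield ("discard", objs - 1)   # remove an object

def bfs(initial, goal):
    queue, seen = deque([(initial, [])]), {initial}
    while queue:
        state, plan = queue.popleft()
        if state == goal:
            return plan
        for action, nxt in successors(state):
            if nxt not in seen:
                seen.add(nxt)
                queue.append((nxt, plan + [action]))
    return None

print(bfs(0, 3))  # -> ['create', 'create', 'create']
```

Without `MAX_OBJECTS` the create action alone would generate an unbounded chain of states, which mirrors why the unbounded problem is only semi-decidable while the bounded variant is decidable.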