TAILOR Selected papers: August 2024

Every month, we acknowledge some of the most valuable TAILOR papers, selected by TAILOR principal investigator Fredrik Heintz from among the papers published by scientists in our network.
The list gathers contributions from different TAILOR partners, each offering insights on a different topic related to Trustworthy AI.
Stay tuned for more insights and groundbreaking research from our diverse community!

Constraint-free structure learning with smooth acyclic orientations

R. Massidda, F. Landolfi, M. Cinquini, and D. Bacciu

12th International Conference on Learning Representations, ICLR 2024, 2024. [Online]. Available: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85200595898&partnerID=40&md5=fe4b9f82a7944cf5607c8aa3c336ae5c

Abstract: The structure learning problem consists of fitting data generated by a Directed Acyclic Graph (DAG) to correctly reconstruct its arcs. In this context, differentiable approaches constrain or regularize an optimization problem with a continuous relaxation of the acyclicity property. The computational cost of evaluating graph acyclicity is cubic in the number of nodes and significantly affects scalability. In this paper, we introduce COSMO, a constraint-free continuous optimization scheme for acyclic structure learning. At the core of our method lies a novel differentiable approximation of an orientation matrix parameterized by a single priority vector. Unlike previous work, our parameterization fits a smooth orientation matrix and the resulting acyclic adjacency matrix without evaluating acyclicity at any step. Despite this, we prove that COSMO always converges to an acyclic solution. In addition to being asymptotically faster, our empirical analysis shows that COSMO’s graph-reconstruction performance compares favorably with that of competing structure learning methods.
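To make the core idea concrete, here is a minimal numpy sketch of a priority-based smooth orientation: every node gets a scalar priority, a sigmoid of the priority gaps yields a smooth orientation matrix, and masking an unconstrained weight matrix with it produces the adjacency. The function name, the sigmoid parameterization, and the temperature value are our illustrative assumptions; COSMO’s exact orientation and its convergence guarantees are developed in the paper.

    import numpy as np

    def smooth_orientation(priority, temp=0.1):
        """Smooth orientation matrix from one scalar priority per node.

        O[i, j] is close to 1 when priority[j] > priority[i] (edge i -> j
        allowed) and close to 0 otherwise. As temp -> 0 the matrix tends
        to a strict orientation, which is acyclic because the priorities
        totally order the nodes, so no directed cycle can close.
        """
        gaps = priority[None, :] - priority[:, None]  # gaps[i, j] = p_j - p_i
        orient = 1.0 / (1.0 + np.exp(-gaps / temp))   # sigmoid of the gaps
        np.fill_diagonal(orient, 0.0)                 # forbid self-loops
        return orient

    rng = np.random.default_rng(0)
    d = 5
    priority = rng.normal(size=d)        # learnable in the real method
    weights = rng.normal(size=(d, d))    # unconstrained edge weights
    adjacency = weights * smooth_orientation(priority)

Note that no acyclicity check appears anywhere in this loop-free construction: the acyclicity of the hard-threshold limit is a structural consequence of ordering nodes by priority.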

Divergent Token Metrics: Measuring degradation to prune away LLM components – and optimize quantization

B. Deiseroth et al.

Proceedings of the 2024 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL 2024, 2024, pp. 6764–6783. doi: 10.18653/v1/2024.naacl-long.377.

Abstract: Large Language Models (LLMs) have reshaped natural language processing with their impressive capabilities. However, their ever-increasing size has raised concerns about their effective deployment and the need for LLM compression. This study introduces the Divergent Token Metrics (DTMs), a novel approach to assessing compressed LLMs that addresses the limitations of traditional perplexity or accuracy measures, which fail to accurately reflect text generation quality. DTMs measure token divergences that allow deeper insights into the subtleties of model compression, in particular when evaluating components’ impacts individually. Utilizing the First Divergent Token Metric (FDTM) in model sparsification reveals that 25% of all attention components can be pruned beyond 90% on the Llama-2 model family while still keeping SOTA performance. For quantization, FDTM suggests that more than 80% of parameters can be naively transformed to int8 without special outlier management. These evaluations indicate the necessity of choosing appropriate compressions for each parameter individually (and that FDTM can identify those), whereas standard metrics lead to deteriorated outcomes.
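As a rough illustration of the metric’s flavor, the sketch below computes the index of the first token at which a compressed model’s generation departs from the base model’s, given two token-id sequences. This is our simplified reading of a “first divergent token”; the paper’s precise definition, and how it is used to rank components for pruning or quantization, should be taken from the source.

    def first_divergent_token(base_ids, compressed_ids):
        """Position of the first token where the two generations disagree.

        A larger index means the compressed model reproduces the base
        model's output for longer before drifting; if one sequence is a
        prefix of the other, the shared length is returned.
        """
        for i, (a, b) in enumerate(zip(base_ids, compressed_ids)):
            if a != b:
                return i
        return min(len(base_ids), len(compressed_ids))

    # Hypothetical token-id sequences from a base and a compressed model.
    print(first_divergent_token([5, 17, 42, 3, 9], [5, 17, 42, 8, 9]))  # -> 3

Averaged over many prompts, such per-component scores can rank which parts of a model tolerate aggressive pruning or int8 quantization, which is the kind of individual assessment the abstract describes.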

Evaluating language model agency through negotiations

T. R. Davidson, V. Veselovsky, M. Kosinski, and R. West

12th International Conference on Learning Representations, ICLR 2024, 2024. [Online]. Available: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85199581362&partnerID=40&md5=dad736f96b9215c135d031ea90dd2fc5

Abstract: We introduce an approach to evaluate language model (LM) agency using negotiation games. This approach better reflects real-world use cases and addresses some of the shortcomings of alternative LM benchmarks. Negotiation games enable us to study multi-turn and cross-model interactions, modulate complexity, and side-step accidental evaluation data leakage. We use our approach to test six widely used and publicly accessible LMs, evaluating performance and alignment in both self-play and cross-play settings. Noteworthy findings include: (i) only the closed-source models tested here were able to complete these tasks; (ii) cooperative bargaining games proved the most challenging for the models; and (iii) even the most powerful models sometimes “lose” to weaker opponents.
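A bare-bones self-play loop makes the setup concrete. In the sketch below, each agent is an arbitrary callable mapping the running transcript to its next message, and the two alternate turns until one accepts or a turn limit is hit; the negotiate function, the “ACCEPT” convention, and the scripted agents are our illustrative assumptions, not the benchmark’s actual protocol.

    def negotiate(agent_a, agent_b, issue, max_turns=10):
        """Alternate messages between two agents until one accepts."""
        transcript = [f"Negotiation over: {issue}"]
        agents = (agent_a, agent_b)
        for turn in range(max_turns):
            message = agents[turn % 2](transcript)
            transcript.append(message)
            if "ACCEPT" in message.upper():   # toy agreement convention
                return transcript, turn % 2   # index of the accepting agent
        return transcript, None               # turn limit hit: no deal

    # Scripted stand-ins for LM calls; any chat-model API would fit here.
    seller = lambda transcript: "I will sell for 100."
    buyer = lambda transcript: "Accept, 100 works for me."
    transcript, acceptor = negotiate(seller, buyer, "a used laptop")

Swapping in different models for the two callables gives the cross-play setting, and using the same model for both gives self-play.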

UNIFY: A unified policy designing framework for solving integrated Constrained Optimization and Machine Learning problems

M. Silvestri, A. De Filippo, M. Lombardi, and M. Milano

Knowledge-Based Systems, vol. 303, 2024, doi: 10.1016/j.knosys.2024.112383.

Abstract: The integration of Machine Learning (ML) and Constrained Optimization (CO) techniques has recently gained significant interest. While pure CO methods struggle with scalability and robustness, and ML methods like constrained Reinforcement Learning (RL) face difficulties with combinatorial decision spaces and hard constraints, a hybrid approach shows promise. However, multi-stage decision-making under uncertainty remains challenging for current methods, which often rely on restrictive assumptions or specialized algorithms. This paper introduces UNIFY, a versatile framework for tackling a wide range of problems, including multi-stage decision-making under uncertainty, using standard ML and CO components. UNIFY integrates a CO problem with an unconstrained ML model through a set of parameters controlled by the ML model, which guide the decision process. This ensures feasible decisions, minimal costs over time, and robustness to uncertainty. In the empirical evaluation, UNIFY demonstrates its capability to address problems typically handled by Decision Focused Learning, Constrained RL, and Stochastic Optimization. While it does not always outperform specialized methods, UNIFY’s flexibility offers broader applicability and maintainability. The paper includes the method’s formalization and an empirical evaluation through case studies in energy management and production scheduling, and concludes with future research directions.
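The parameter-passing idea lends itself to a compact sketch. Below, a toy linear “policy” emits adjusted costs that parameterize a small linear program solved with scipy, so the hard constraints are enforced by the CO stage regardless of what the ML stage outputs. The problem, the function names, and the untrained weights are all illustrative assumptions; in UNIFY the ML stage would be trained (e.g., on the realized episode cost) rather than fixed.

    import numpy as np
    from scipy.optimize import linprog

    def policy(observation, weights):
        """Toy ML stage: a linear map emitting adjusted 'virtual' costs."""
        return weights @ observation

    def co_stage(virtual_costs, demand):
        """Toy CO stage: cover a demand at minimum virtual cost.

        min  virtual_costs @ x   s.t.  sum(x) >= demand,  0 <= x <= 1.
        Hard constraints live here, so the decision is always feasible.
        """
        d = len(virtual_costs)
        result = linprog(
            c=virtual_costs,
            A_ub=-np.ones((1, d)),  # -sum(x) <= -demand, i.e. sum(x) >= demand
            b_ub=[-demand],
            bounds=[(0.0, 1.0)] * d,
            method="highs",
        )
        return result.x

    observation = np.array([0.3, 0.7])
    weights = np.ones((4, 2))               # untrained, for illustration only
    decision = co_stage(policy(observation, weights), demand=2.0)

Because feasibility is delegated entirely to the solver, the learner can remain an unconstrained ML model: training only has to shape the parameters it passes to the CO stage, which is the division of labor the abstract describes.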