Improving Cross-Lingual Retrieval of Previously Fact-Checked Claims

Róbert Móro

Researcher at Kempelen Institute of Intelligent Technologies

To mitigate disinformation in a trustworthy way, AI-based approaches should prioritize human agency and control, transparency, and accountability, including the means for redress. This can be achieved by using AI to support, rather than replace, media professionals such as fact-checkers in their efforts to debunk disinformation. One task that can benefit from AI support is the retrieval of previously fact-checked claims. Prior work has mostly focused on retrieval in English; where other languages were addressed, the setting was predominantly monolingual, i.e., the input and the retrieved claims were in the same language. The goal of the project is to research methods that improve cross-lingual retrieval performance (i.e., when the input and the retrieved claims are in different languages), focusing on better selection of positive and negative samples for fine-tuning the selected multilingual text embedding model(s). The expected outcomes include fine-tuned text embedding model(s) with improved cross-lingual retrieval performance and an extended dataset containing identified cross-lingual pairs of fact-checked claims and posts.

Keywords: claim retrieval, fact-checking, multilinguality, data augmentation

Scientific area: Artificial Intelligence

Bio: Róbert Móro is a researcher at the Kempelen Institute of Intelligent Technologies (KInIT) in Bratislava, Slovakia, focusing on artificial intelligence, machine learning, natural language processing, and personalization. He was awarded a doctoral degree in Intelligent Software Systems by the Slovak University of Technology in Bratislava in 2017, where he also worked as an Assistant Professor from 2017 to 2020. He has co-authored more than 35 peer-reviewed research publications and is involved in several EU-funded/Horizon Europe projects that focus on disinformation and on supporting media professionals in mitigating its negative effects.

Visiting period: 2nd June, 2024 – 9th July, 2024 (5 weeks) at the Digital Humanities group at Fondazione Bruno Kessler (FBK)