We are pleased to announce our participation in the new interdisciplinary project DIDI ("Different groups, different lenses? How Media Framing Shapes Perceptions of Majority and Minority Communities"), funded by Research Südtirol Alto Adige 2024. DIDI is a collaboration between the University of Innsbruck (TCS) and Eurac Research that investigates how media framing shapes perceptions across the German, Italian, and Ladin communities in South Tyrol. We bring our expertise in formal methods and natural language processing to tackle fundamental challenges in generative AI.
Tackling AI Bias and Hallucinations in Multilingual Contexts
As news media increasingly serve as training data for generative AI systems, the risks of biased outputs and hallucinations (information inconsistent with facts) become critical concerns. For smaller linguistic communities with limited training data, these risks are amplified, as their cultural and linguistic nuances are often underrepresented. Our research addresses these challenges at the intersection of formal methods, natural language processing, and multilingual communication.
We are developing computational approaches to address two core problems. The first is identifying and mitigating intrinsic biases in LLMs when they are applied to minority contexts, where cultural and linguistic underrepresentation leads to skewed narratives. The second is reducing hallucinations through constraint mechanisms that ground model outputs in verified local knowledge bases. By combining logic-based constraints with curated regional multilingual databases, we aim to ensure that AI systems can reliably analyse media content, highlight differences in coverage between linguistic communities, and detect biased or fabricated information.
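The grounding idea can be illustrated with a minimal sketch. This is not the project's implementation: the `Claim` type, the `KNOWLEDGE_BASE` dictionary, and the functions `check_claim` and `keep_supported` are hypothetical names introduced here, and the sketch assumes that claims have already been extracted from the model output as simple (subject, predicate, value) triples. Each claim is checked against a small set of verified facts and labelled as supported, contradicted, or unverified.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Claim:
    """A single statement extracted from model output (hypothetical format)."""
    subject: str
    predicate: str
    value: str


# Illustrative stand-in for a curated regional knowledge base.
# A real one would be much larger, multilingual, and maintained by experts.
KNOWLEDGE_BASE = {
    ("South Tyrol", "language_community"): {"German", "Italian", "Ladin"},
}


def check_claim(claim, kb=KNOWLEDGE_BASE):
    """Label a claim as 'supported', 'contradicted', or 'unverified'."""
    allowed = kb.get((claim.subject, claim.predicate))
    if allowed is None:
        # The knowledge base says nothing about this subject/predicate pair.
        return "unverified"
    return "supported" if claim.value in allowed else "contradicted"


def keep_supported(claims, kb=KNOWLEDGE_BASE):
    """Constraint step: keep only claims the knowledge base confirms."""
    return [c for c in claims if check_claim(c, kb) == "supported"]


if __name__ == "__main__":
    claims = [
        Claim("South Tyrol", "language_community", "Ladin"),
        Claim("South Tyrol", "language_community", "French"),
        Claim("South Tyrol", "population", "unknown figure"),
    ]
    for c in claims:
        print(c, "->", check_claim(c))
```

In this sketch, contradicted claims are candidates for fabricated information, while unverified claims mark gaps in the knowledge base rather than errors. A real system would need to handle both cases more carefully than a simple filter does.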
A critical challenge lies in adapting NLP techniques designed for high-resource languages to Ladin. We are developing specialized tools for topic modeling, sentiment analysis, and emotion mining for this language. This work not only supports the project's objectives but also creates generalizable methodologies for extending modern language technologies to underrepresented linguistic communities.
