Yuxia Wang

Floor 13, Room 2

INSAIT

Sofia, Bulgaria

I am currently a tenure-track Assistant Professor at INSAIT in Sofia. Prior to this, I was a postdoctoral researcher in the NLP department at MBZUAI, working with Prof. Preslav Nakov. I completed my PhD at The University of Melbourne in January 2023, under the guidance of Prof. Tim Baldwin and Prof. Karin Verspoor. I earned both my Bachelor’s (2016) and Master’s (2018) degrees from the Beijing Institute of Technology.

My research interests lie in natural language processing, with the particular goal of enabling safe, factual, and empathetic human-AI interactions. My current work mainly focuses on LLM/LRM optimization for reasoning, safety, factuality, and empathy; machine-generated content detection; and LLM applications in the financial and medical domains. I have published papers in top-tier NLP conferences and journals such as ACL, TACL, EMNLP, and NAACL.

I am looking for motivated PhD students. We offer a competitive scholarship (€40,000 per year), ample GPU resources (GB200), and strong academic ties with ETH Zurich, MIT, and DeepMind. We have co-supervision programs with ETH Zurich and DeepMind. Under our DeepMind Co-Supervision Program, PhD students can work jointly with world-leading mentors from DeepMind, such as Kristina Toutanova and Fei Liu. If you’re passionate about these topics, feel free to contact me with your CV and a brief introduction to your research interests.

news

May 19, 2025	One paper (SpeechDialogueFactory) accepted to Interspeech 2025!
May 15, 2025	Seven papers (5 Main and 2 Findings) accepted to ACL 2025!
Apr 28, 2025	Three papers (Arabic Safeguard Evaluation, Libra-leaderboard, and FIRE) accepted to NAACL 2025! See you in Albuquerque, New Mexico!

selected publications

M4

M4: Multi-generator, Multi-domain, and Multi-lingual Black-Box Machine-Generated Text Detection

Yuxia Wang, Jonibek Mansurov, Petar Ivanov, and 12 more authors

In Proceedings of the 18th Conference of the European Chapter of the Association for Computational Linguistics (Volume 1: Long Papers), Mar 2024
M4GT

M4GT-Bench: Evaluation Benchmark for Black-Box Machine-Generated Text Detection

Yuxia Wang, Jonibek Mansurov, Petar Ivanov, and 11 more authors

In Proceedings of the 62nd Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), Aug 2024

OpenFactCheck

OpenFactCheck: A Unified Framework for Factuality Evaluation of LLMs

Hasan Iqbal^*, Yuxia Wang^*, Minghan Wang, and 4 more authors

In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing: System Demonstrations, Nov 2024

Factcheck-Bench

Factcheck-Bench: Fine-Grained Evaluation Benchmark for Automatic Fact-checkers

Yuxia Wang, Revanth Gangi Reddy, Zain Muhammad Mujahid, and 10 more authors

In Findings of the Association for Computational Linguistics: EMNLP 2024, Nov 2024

CDNA

A Chinese Dataset for Evaluating the Safeguards in Large Language Models

Yuxia Wang, Zenan Zhai, Haonan Li, and 6 more authors

In Findings of the Association for Computational Linguistics: ACL 2024, Aug 2024

Many studies have demonstrated that large language models (LLMs) can produce harmful responses, exposing users to unexpected risks. Previous studies have proposed comprehensive taxonomies of LLM risks, as well as corresponding prompts that can be used to examine LLM safety. However, the focus has been almost exclusively on English. We aim to broaden LLM safety research by introducing a dataset for the safety evaluation of Chinese LLMs, and extending it to better identify false negative and false positive examples in terms of risky prompt rejections. We further present a set of fine-grained safety assessment criteria for each risk type, facilitating both manual annotation and automatic evaluation in terms of LLM response harmfulness. Our experiments over five LLMs show that region-specific risks are the prevalent risk type. Warning: this paper contains example data that may be offensive, harmful, or biased. Our data is available at https://github.com/Libr-AI/do-not-answer.
DNA

Do-Not-Answer: Evaluating Safeguards in LLMs

Yuxia Wang, Haonan Li, Xudong Han, and 2 more authors

In Findings of the Association for Computational Linguistics: EACL 2024, Mar 2024

With the rapid evolution of large language models (LLMs), new and hard-to-predict harmful capabilities are emerging. This requires developers to identify potential risks through the evaluation of “dangerous capabilities” in order to responsibly deploy LLMs. Here we aim to facilitate this process. In particular, we collect an open-source dataset to evaluate the safeguards in LLMs, to facilitate the deployment of safer open-source LLMs at a low cost. Our dataset is curated and filtered to consist only of instructions that responsible language models should not follow. We assess the responses of six popular LLMs to these instructions, and we find that simple BERT-style classifiers can achieve results that are comparable to GPT-4 on automatic safety evaluation. Our data and code are available at https://github.com/Libr-AI/do-not-answer
SemEval2024MGT

SemEval-2024 Task 8: Multidomain, Multimodel and Multilingual Machine-Generated Text Detection

Yuxia Wang, Jonibek Mansurov, Petar Ivanov, and 7 more authors

In Proceedings of the 18th International Workshop on Semantic Evaluation (SemEval-2024), Jun 2024

We present the results and the main findings of SemEval-2024 Task 8: Multigenerator, Multidomain, and Multilingual Machine-Generated Text Detection. The task featured three subtasks. Subtask A is a binary classification task determining whether a text is written by a human or generated by a machine. This subtask has two tracks: a monolingual track focused solely on English texts and a multilingual track. Subtask B is to detect the exact source of a text, discerning whether it is written by a human or generated by a specific LLM. Subtask C aims to identify the changing point within a text, at which the authorship transitions from human to machine. The task attracted a large number of participants: subtask A monolingual (126), subtask A multilingual (59), subtask B (70), and subtask C (30). In this paper, we present the task, analyze the results, and discuss the system submissions and the methods they used. For all subtasks, the best systems used LLMs.
Empathy

Can Machines Resonate with Humans? Evaluating the Emotional and Empathic Comprehension of LMs

Muhammad Arslan Manzoor, Yuxia Wang, Minghan Wang, and 1 more author

In Findings of the Association for Computational Linguistics: EMNLP 2024, Nov 2024

Empathy plays a pivotal role in fostering prosocial behavior, often triggered by the sharing of personal experiences through narratives. However, modeling empathy using NLP approaches remains challenging due to its deep interconnection with human interaction dynamics. Previous approaches, which involve fine-tuning language models (LMs) on human-annotated empathic datasets, have had limited success. In our pursuit of improving empathy understanding in LMs, we propose several strategies, including contrastive learning with masked LMs and supervised fine-tuning with large language models. While these methods show improvements over previous methods, the overall results remain unsatisfactory. To better understand this trend, we performed an analysis which reveals a low agreement among annotators. This lack of consensus hinders training and highlights the subjective nature of the task. We also explore the cultural impact on annotations. To study this, we meticulously collected story pairs in Urdu language and find that subjectivity in interpreting empathy among annotators appears to be independent of cultural background. Our systematic exploration of LMs’ understanding of empathy reveals substantial opportunities for further investigation in both task formulation and modeling.
OpenFactCheck

OpenFactCheck: Building, Benchmarking Customized Fact-Checking Systems and Evaluating the Factuality of Claims and LLMs

Yuxia Wang, Minghan Wang, Hasan Iqbal, and 4 more authors

In Proceedings of the 31st International Conference on Computational Linguistics, Jan 2025

The increased use of large language models (LLMs) across a variety of real-world applications calls for mechanisms to verify the factual accuracy of their outputs. Difficulties lie in assessing the factuality of free-form responses in open domains. Also, different papers use disparate evaluation benchmarks and measurements, which renders them hard to compare and hampers future progress. To mitigate these issues, we propose OpenFactCheck, a unified framework for building customized automatic fact-checking systems, benchmarking their accuracy, evaluating factuality of LLMs, and verifying claims in a document. OpenFactCheck consists of three modules: (i) CUSTCHECKER allows users to easily customize an automatic fact-checker and verify the factual correctness of documents and claims, (ii) LLMEVAL, a unified evaluation framework assesses LLM’s factuality ability from various perspectives fairly, and (iii) CHECKEREVAL is an extensible solution for gauging the reliability of automatic fact-checkers’ verification results using human-annotated datasets. Data and code are publicly available at https://github.com/yuxiaw/openfactcheck.
HumanEval MGT

Is Human-Like Text Liked by Humans? Multilingual Human Detection and Preference Against AI

Yuxia Wang, Rui Xing, Jonibek Mansurov, and 8 more authors

arXiv preprint arXiv:2502.11614, Jan 2025
KazLLM

Llama-3.1-Sherkala-8B-Chat: An Open Large Language Model for Kazakh

Fajri Koto, Rituraj Joshi, Nurdaulet Mukhituly, and 8 more authors

arXiv preprint arXiv:2503.01493, Jan 2025
SpeechDialogue

SpeechDialogueFactory: Generating High-Quality Speech Dialogue Data to Accelerate Your Speech-LLM Development

Minghan Wang, Ye Bai, Yuxia Wang, and 3 more authors

Interspeech 2025, Jan 2025
FinChain

FinChain: A Symbolic Benchmark for Verifiable Chain-of-Thought Financial Reasoning

Zhuohan Xie, Dhruv Sahnan, Debopriyo Banerjee, and 14 more authors

Jan 2025
HD-NDEs

HD-NDEs: Neural Differential Equations for Hallucination Detection in LLMs

Qing Li, Jiahui Geng, Zongxiong Chen, and 5 more authors

ACL 2025, Jan 2025