Can Large Language Models handle task-oriented dialogues, or do they fall short without fine-tuning?
Our EMNLP 2025 Findings study presents a stronger baseline and shows how a simple Self-Checking mechanism redefines the capabilities of LLMs in task-oriented dialogue.
Authors:
Sebastian Steindl, André Kestler, and Ulrich Schäfer (OTH Amberg-Weiden)
Bernd Ludwig (University of Regensburg)
About the paper:
Recent studies have shown mixed results on whether LLMs can act effectively as Task-Oriented Dialogue (TOD) systems. Our work revisits these findings and proposes a stronger, improved baseline for evaluating pre-trained LLMs on TOD tasks. By introducing a simple yet powerful self-checking mechanism, we show that newer LLMs can perform competitively with fine-tuned systems on certain metrics, challenging previous assumptions and offering fresh directions for future research.
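For readers curious what a self-checking loop can look like in practice, here is a minimal, hypothetical sketch. It is not the exact mechanism from the paper: `call_llm`, the prompts, and the retry logic are illustrative placeholder assumptions showing the general generate-verify-regenerate pattern.

```python
# Hypothetical self-checking loop for a task-oriented dialogue response.
# NOT the paper's exact mechanism; `call_llm` is a stub standing in for
# a real LLM API call, and the prompts are illustrative placeholders.

def call_llm(prompt: str) -> str:
    """Stub LLM. A real system would query a language model here."""
    if "Check the following" in prompt:
        return "OK"  # toy verifier: always approves the draft
    return "Sure, I have booked a table for two at 7 pm."


def self_check_respond(dialogue_history: str, max_retries: int = 2) -> str:
    """Generate a reply, ask the model to check it against the dialogue
    history, and regenerate if the check fails."""
    response = call_llm(f"Dialogue so far:\n{dialogue_history}\nAssistant:")
    for _ in range(max_retries):
        verdict = call_llm(
            "Check the following response for consistency with the dialogue.\n"
            f"Dialogue: {dialogue_history}\nResponse: {response}\n"
            "Answer OK or REVISE."
        )
        if verdict.strip().startswith("OK"):
            break  # the model judged its own draft acceptable
        response = call_llm(
            f"Dialogue so far:\n{dialogue_history}\n"
            "The previous draft was rejected. Write a better reply.\nAssistant:"
        )
    return response


print(self_check_respond("User: Book a table for two at 7 pm."))
```

The key design choice in this pattern is that the same model plays both roles, generator and verifier, so no extra fine-tuning or second model is required.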
Paper available in the ACL Anthology: https://aclanthology.org/2025.findings-emnlp.610/
Code available on GitHub: https://github.com/sebastian-steindl/LLM4TOD_baseline
This work was presented at EMNLP 2025, one of the premier conferences in Natural Language Processing and Artificial Intelligence, held last week in Suzhou, China (November 5-9, 2025)!
A big thank-you to the EMNLP 2025 organisers for a wonderful conference and to the attendees for the inspiring discussions and feedback!
#LargeLanguageModels #TaskOrientedDialogues
#NLP #AI #MachineLearning
#Research #AcademicAI #PhDResearch #AIResearch
#EMNLP2025 #ConferenceOnEmpiricalMethodsInNaturalLanguageProcessing
#AssociationForComputationalLinguistics #ACL
#ResearchSuccess #ResearchPaperAccepted
#OTHAmbergWeiden #UniversityOfRegensburg
#InformationScienceRegensburg #StayInformed
#AcademicLinkedIn
Information/Contacts
Find more information on Bernd Ludwig, including his further research work, on his web page.