Can Large Language Models handle task-oriented dialogues, or do they fall short without fine-tuning?
Our EMNLP 2025 Findings study reveals a stronger baseline and shows how a simple Self-Checking mechanism redefines the capabilities of LLMs in task-oriented dialogue.
Authors:
Sebastian Steindl, André Kestler, and Ulrich Schäfer (OTH Amberg-Weiden)
Bernd Ludwig (University of Regensburg)
About the paper:
Recent studies have shown mixed results on whether LLMs can act effectively as Task-Oriented Dialogue (TOD) systems. Our work revisits these findings and proposes a stronger, improved baseline for evaluating pre-trained LLMs in TOD tasks. By introducing a simple yet powerful Self-Checking mechanism, we show that newer LLMs can perform competitively with fine-tuned systems on certain metrics, challenging previous assumptions and offering fresh directions for future research.
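To give a rough intuition for what a self-checking step can look like in general, here is a minimal illustrative sketch of a generate-then-verify loop for one dialogue turn. This is NOT the paper's actual method, prompts, or pipeline; the `call_llm` function, the prompts, and the slot-value format are all hypothetical placeholders (see the paper and the GitHub repository for the real implementation).

```python
# Hypothetical sketch of a generic self-checking loop for a
# task-oriented dialogue turn. `call_llm` is a placeholder stub,
# not a real model call, and the prompts are illustrative only.

def call_llm(prompt: str) -> str:
    """Placeholder for an LLM call; a real system would query a model."""
    return "restaurant { area = centre, food = italian }"

def self_check_turn(dialogue_history: str, max_retries: int = 2) -> str:
    """Draft a structured output, then ask the model to verify or fix it."""
    draft = call_llm(
        f"Dialogue so far:\n{dialogue_history}\n"
        "Extract the user's goal as slot-value pairs."
    )
    for _ in range(max_retries):
        verdict = call_llm(
            f"Dialogue so far:\n{dialogue_history}\n"
            f"Proposed slot-value pairs: {draft}\n"
            "Answer OK if the pairs match the dialogue, "
            "otherwise return a corrected version."
        )
        if verdict.strip() == "OK":
            break  # the model accepts its own draft
        draft = verdict  # adopt the correction and re-check
    return draft
```

The general idea such loops rely on is that asking the model to verify a candidate output is often easier than producing it correctly in one shot.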
Paper available in the ACL Anthology: https://aclanthology.org/2025.findings-emnlp.610/
Code available on GitHub: https://github.com/sebastian-steindl/LLM4TOD_baseline
This work was presented at EMNLP 2025, one of the premier conferences in Natural Language Processing and Artificial Intelligence, held last week in Suzhou, China (November 5–9, 2025)!
A big thank-you to the EMNLP 2025 organizers for a wonderful conference and to the attendees for the inspiring discussions and feedback!
#LargeLanguageModels #TaskOrientedDialogues
#NLP #AI #MachineLearning
#Research #AcademicAI #PhDResearch #AIResearch
#EMNLP2025 #ConferenceOnEmpiricalMethodsInNaturalLanguageProcessing
#AssociationForComputationalLinguistics #ACL
#ResearchSuccess #ResearchPaperAccepted
#OTHAmbergWeiden #UniversityOfRegensburg
#InformationScienceRegensburg #StayInformed
#AcademicLinkedIn
Information/Contact
More information on Bernd Ludwig, including his further research works, is available on his web page.