ECJ-LLM: From Extraction to Judgment; Reinventing Automated Review Annotation

  • Loukmane MAADA IA Laboratory, Science Faculty, Moulay Ismail University, Meknes, Morocco
  • Khalid AL FARARNI LISAC Laboratory, Faculty of Sciences Dhar El Mahraz, Sidi Mohamed Ben Abdellah University, Fez, Morocco
  • Badraddine AGHOUTANE IA Laboratory, Science Faculty, Moulay Ismail University, Meknes, Morocco
  • Yousef FARHAOUI EST, Sidi Mohamed Ben Abdellah University, Fez, Morocco
  • Mohammed FATTAH Image Laboratory, Moulay Ismail University, Meknes, Morocco
Keywords: Large Language Models (LLMs), multi-agent systems, data annotation, computational annotator, prompt engineering, annotation automation, web application

Abstract

This paper introduces a novel multi-agent debate framework for interest-based data annotation, leveraging Large Language Models (LLMs) to facilitate structured, collaborative annotation. The framework orchestrates three specialized LLM agents—an aspect extractor, a critic, and a judge—that engage in a systematic debate: the extractor identifies relevant aspects, the critic evaluates and challenges these aspects, and the judge synthesizes both contributions to assign final, high-quality interest-level labels. This debate-driven process not only enhances annotation fidelity and contextual grounding but also allows flexible customization of both the model used in each role and the interest to be detected. To ensure transparency and quality, the framework incorporates an evaluation suite with metrics such as precision, recall, F1-score, and confusion matrices. Empirical results on a gold-standard hotel review dataset demonstrate that the framework outperforms single-agent methods in annotation quality. A customizable annotation tool, developed as a demonstration of the framework’s practical utility, further showcases its flexibility and extensibility across a range of annotation tasks.
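The extract→critique→judge loop described in the abstract can be sketched as a simple pipeline. This is a minimal illustration, not the authors' implementation: the three agent functions below are hypothetical rule-based stand-ins, where a real system would back each role with a separate LLM call and prompt.

```python
# Sketch of the three-agent debate pipeline: extractor -> critic -> judge.
# Each function is a hypothetical stand-in for an LLM-backed agent.

def extractor(review: str) -> list[str]:
    # Propose candidate aspects mentioned in the review
    # (stand-in: keyword matching against a fixed aspect vocabulary).
    vocabulary = ("room", "staff", "location", "breakfast")
    return [a for a in vocabulary if a in review.lower()]

def critic(review: str, aspects: list[str]) -> list[str]:
    # Challenge weak proposals: keep aspects only if the review
    # contains an opinion cue (stand-in for an LLM critique step).
    opinion_cues = ("great", "bad", "clean", "rude", "noisy")
    if any(cue in review.lower() for cue in opinion_cues):
        return aspects
    return []

def judge(proposed: list[str], vetted: list[str]) -> dict:
    # Synthesize the debate into a final interest-level label.
    if not vetted:
        return {"aspects": [], "interest": "none"}
    level = "high" if len(vetted) > 1 else "low"
    return {"aspects": vetted, "interest": level}

def annotate(review: str) -> dict:
    proposed = extractor(review)
    vetted = critic(review, proposed)
    return judge(proposed, vetted)

label = annotate("The room was clean but the staff were rude.")
print(label)  # {'aspects': ['room', 'staff'], 'interest': 'high'}
```

Because each role is a separate function, the model behind any single role can be swapped independently, which mirrors the customization the framework advertises.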
Published
2025-09-12
How to Cite
MAADA, L., AL FARARNI, K., AGHOUTANE, B., FARHAOUI, Y., & FATTAH, M. (2025). ECJ-LLM: From Extraction to Judgment; Reinventing Automated Review Annotation. Statistics, Optimization & Information Computing, 14(6), 3101-3125. https://doi.org/10.19139/soic-2310-5070-2836
Section
Research Articles