Indonesian News Extractive Summarization using Lexrank and YAKE Algorithm

  • Julyanto Wijaya Computer Science Department, Bina Nusantara University, Jakarta, Indonesia
  • Abba Suganda Girsang Computer Science Department, Bina Nusantara University, Jakarta, Indonesia
Keywords: automatic text summarization, unsupervised learning, extractive text summarization, sentence extraction, term weight

Abstract

The rapid global growth of technology has led to an unprecedented volume of information shared across diverse platforms. This information, easily accessible through browsers, creates an overload that makes it hard for individuals to extract essential content efficiently. In response, this paper proposes a hybrid Automatic Text Summarization (ATS) method that combines the LexRank and YAKE algorithms. LexRank determines sentence scores, while YAKE calculates individual word scores; together they are intended to improve summarization accuracy. The approach is unsupervised. To validate the proposed method, the paper uses 5000 Indonesian news articles from the Indosum dataset together with their ground-truth summaries, with the objective of condensing each article to 30% of its content. The algorithmic approach and experimental results are presented. Compared with its base model, the hybrid model improves the ROUGE-1 and ROUGE-2 scores by two percent and the ROUGE-L score by one percent. These findings suggest that incorporating a keyword score can improve the accuracy of the summaries generated by LexRank. Because the method relies on unsupervised, heuristic scoring rather than a trained machine learning model, the approach could be applied more broadly. A comparative analysis with other state-of-the-art text summarization methods or hybrid approaches is still needed to gauge its overall effectiveness.
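The idea of combining a sentence-level centrality score with a word-level keyword score can be illustrated with a minimal, dependency-free sketch. It is not the authors' implementation. The centrality part is a simplified LexRank-style power iteration over cosine similarities of raw term-count vectors. The keyword part is a plain normalized term frequency standing in for YAKE, which actually uses richer statistical features. The way the two scores are combined (normalized centrality plus `alpha` times the mean keyword weight) and the names `summarize`, `alpha`, and `ratio` are assumptions made for illustration, since the abstract does not specify the combination rule.

```python
import math
import re
from collections import Counter


def tokenize(sentence):
    # Lowercase alphanumeric tokens; a real pipeline would also drop stopwords.
    return re.findall(r"[a-z0-9]+", sentence.lower())


def cosine(a, b):
    # Cosine similarity between two term-count vectors (Counter objects).
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def lexrank_scores(sentences, damping=0.85, iterations=50):
    # Simplified LexRank-style centrality: PageRank power iteration on a
    # sentence graph weighted by cosine similarity (no IDF, no threshold).
    n = len(sentences)
    vectors = [Counter(tokenize(s)) for s in sentences]
    sim = [[cosine(vectors[i], vectors[j]) if i != j else 0.0
            for j in range(n)] for i in range(n)]
    row_sums = [sum(row) for row in sim]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                scores[j] * sim[j][i] / row_sums[j]
                for j in range(n) if row_sums[j] > 0)
            for i in range(n)
        ]
    return scores


def keyword_weights(sentences):
    # Stand-in for YAKE (assumption): term frequency normalized to [0, 1].
    counts = Counter(t for s in sentences for t in tokenize(s))
    top = max(counts.values()) if counts else 1
    return {t: c / top for t, c in counts.items()}


def summarize(sentences, ratio=0.3, alpha=0.5):
    # Hypothetical combination rule: normalized centrality + alpha * mean
    # keyword weight of the sentence. Keeps about `ratio` of the sentences.
    n = len(sentences)
    if n == 0:
        return []
    centrality = lexrank_scores(sentences)
    top_c = max(centrality)
    weights = keyword_weights(sentences)
    hybrid = []
    for i, s in enumerate(sentences):
        tokens = tokenize(s)
        kw = sum(weights[t] for t in tokens) / len(tokens) if tokens else 0.0
        hybrid.append(centrality[i] / top_c + alpha * kw)
    k = max(1, round(ratio * n))
    ranked = sorted(range(n), key=lambda i: hybrid[i], reverse=True)[:k]
    # Return the selected sentences in their original document order.
    return [sentences[i] for i in sorted(ranked)]
```

Calling `summarize(sentences)` on a list of sentences from one article returns roughly 30% of them in document order, mirroring the compression target stated in the abstract. Swapping `keyword_weights` for scores from an actual YAKE implementation would bring the sketch closer to the described method.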
Published
2024-06-07
How to Cite
Wijaya, J., & Suganda Girsang, A. (2024). Indonesian News Extractive Summarization using Lexrank and YAKE Algorithm. Statistics, Optimization & Information Computing, 12(6), 1973-1983. https://doi.org/10.19139/soic-2310-5070-1976
Section
Research Articles