Query Processing and Ranking of News Titles Related to the Governor of West Java Using TF-IDF and Cosine Similarity

Authors

  • Chandra Saputra, Universitas Multi Data Palembang
  • Wilcent Wilcent, Universitas Multi Data Palembang
  • Hafiz Irsyad, Universitas Multi Data Palembang
  • Abdul Rahman, Universitas Multi Data Palembang
https://doi.org/10.58466/aicoms.v4i1.1799

Keywords:

Cosine Similarity, Document Ranking, News Title Search, TF-IDF, Web Scraping

Abstract

Improving the efficiency and relevance of news search is a pressing need in the digital era.
This study develops a system that ranks news titles against keyword queries by combining Term
Frequency-Inverse Document Frequency (TF-IDF) weighting with cosine similarity. The data set
consists of 2,507 news titles collected over the past year from four of the most popular news
sites in Indonesia: Kompas.com, Detik.com, CNNIndonesia.com, and Tempo.com. The pipeline
comprises web scraping, pre-processing (case folding, tokenizing, stopword removal, and
stemming), term weighting with TF-IDF, similarity calculation with cosine similarity, and
performance evaluation using accuracy, precision, recall, and F1-score. Tests on three different
queries yield an average accuracy of 99.75%, precision of 96.67%, recall of 100%, and F1-score
of 98.33%. These results indicate that the combination of TF-IDF and cosine similarity is
effective for retrieving news titles relevant to the entered query.
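
The pipeline described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the sample titles, the short stopword list, the smoothed IDF formula, and the "score above zero" retrieval cutoff are all assumptions made for illustration, and stemming is omitted because the abstract does not name the stemmer used.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list for illustration; the study's actual list is not given.
STOPWORDS = {"di", "ke", "dan", "yang", "untuk", "dari"}


def preprocess(text):
    """Case folding, tokenizing, and stopword removal (stemming omitted)."""
    tokens = re.findall(r"[a-z0-9]+", text.lower())
    return [t for t in tokens if t not in STOPWORDS]


def build_idf(docs_tokens):
    """Smoothed IDF, log((1 + N) / (1 + df)) + 1. The smoothing is an assumption."""
    n = len(docs_tokens)
    df = Counter()
    for tokens in docs_tokens:
        df.update(set(tokens))
    return {term: math.log((1 + n) / (1 + count)) + 1 for term, count in df.items()}


def tfidf_vector(tokens, idf):
    """Sparse TF-IDF vector; terms unseen in the corpus are dropped."""
    tf = Counter(tokens)
    return {t: tf[t] * idf[t] for t in tf if t in idf}


def cosine(u, v):
    """Cosine similarity between two sparse vectors stored as dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def rank_titles(query, titles):
    """Return (score, title) pairs sorted by descending cosine similarity."""
    docs = [preprocess(t) for t in titles]
    idf = build_idf(docs)
    q_vec = tfidf_vector(preprocess(query), idf)
    scored = [(cosine(q_vec, tfidf_vector(d, idf)), t) for d, t in zip(docs, titles)]
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


def precision_recall_f1(retrieved, relevant):
    """Set-based precision, recall, and F1 for one query."""
    tp = len(retrieved & relevant)
    precision = tp / len(retrieved) if retrieved else 0.0
    recall = tp / len(relevant) if relevant else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Made-up sample titles, not taken from the study's data set.
titles = [
    "Gubernur Jawa Barat resmikan jalan tol baru",
    "Harga beras naik di pasar tradisional",
    "Gubernur Jawa Barat tinjau lokasi banjir",
    "Timnas Indonesia menang di laga uji coba",
]
ranked = rank_titles("gubernur jawa barat", titles)
for score, title in ranked:
    print(f"{score:.3f}  {title}")

# Assumed cutoff: a title counts as retrieved if its score is above zero.
retrieved = {title for score, title in ranked if score > 0}
relevant = {titles[0], titles[2]}  # hand-labelled for this toy example
print(precision_recall_f1(retrieved, relevant))
```

In this toy run, the two titles that mention the governor receive non-zero scores and the other two score zero. The shorter matching title ranks first because fewer of its terms fall outside the query.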

Published

2025-06-02

How to Cite

Saputra, C., Wilcent, W., Irsyad, H., & Rahman, A. (2025). Query Processing and Ranking of News Titles Related to the Governor of West Java Using TF-IDF and Cosine Similarity. Applied Information Technology and Computer Science (AICOMS), 4(1), 25-32. https://doi.org/10.58466/aicoms.v4i1.1799

Section

Articles