COMPARISON BETWEEN CNN AND RANDOM FOREST PERFORMANCE IN DETECTING HOAX INDONESIAN NEWS ARTICLES

Franciska Nugrahaeni Siwi Pramudya, Yonathan Purbo Santosa

Abstract


Hoax news is a serious problem today. Many people are easily swayed by opinions from particular sources without verifying them or checking the available facts. To address this, many researchers have built hoax news detectors using various algorithms. Some studies report that Random Forest performs better on this problem, while others report that CNN performs at the same level as Random Forest. A further recurring problem is prediction error caused by unsuitable preprocessing. This research therefore tests several preprocessing scenarios to find a suitable method for each of the Convolutional Neural Network (CNN) and Random Forest algorithms. It also compares the performance of the two algorithms on a dataset of 4,000 factual news articles from Kompas.com and 4,000 hoax news articles from the turnback.hoax site. The results show that Random Forest reaches an average model accuracy of 90%, whereas CNN reaches an average model accuracy of 60%, with both using the same feature extraction method: TF-IDF combined with n-grams of size one (unigrams).
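As an illustration of the feature extraction step named in the abstract, the following is a minimal sketch of unigram TF-IDF weighting in plain Python. It is not the authors' implementation: the function names `unigrams` and `tfidf_unigram`, the whitespace tokenizer, the specific weighting variant (term frequency normalized by document length, idf = ln(N / df) + 1), and the two toy headlines are all assumptions made for the example.

```python
import math
from collections import Counter


def unigrams(text):
    """Lowercase the text and split on whitespace; each token is one unigram."""
    return text.lower().split()


def tfidf_unigram(docs):
    """Return (vocabulary, matrix) of TF-IDF weights over unigram features.

    Assumed variant: tf = count / document length, idf = ln(N / df) + 1.
    The abstract does not state which TF-IDF formula was used.
    """
    tokenized = [unigrams(d) for d in docs]
    vocab = sorted({tok for doc in tokenized for tok in doc})
    n_docs = len(docs)
    # Document frequency: number of documents containing each token.
    df = Counter(tok for doc in tokenized for tok in set(doc))
    idf = {tok: math.log(n_docs / df[tok]) + 1.0 for tok in vocab}
    matrix = []
    for doc in tokenized:
        counts = Counter(doc)
        total = len(doc) or 1  # guard against empty documents
        matrix.append([counts[tok] / total * idf[tok] for tok in vocab])
    return vocab, matrix


# Toy, made-up headlines (hypothetical, not taken from the paper's dataset).
docs = [
    "pemerintah umumkan vaksin baru",
    "vaksin baru sebabkan magnet hoaks",
]
vocab, matrix = tfidf_unigram(docs)
```

Each row of `matrix` is the feature vector for one article, with one column per unigram in `vocab`. Vectors of this kind are what a classifier such as Random Forest or a CNN would take as input after preprocessing.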

Keywords


Hoax news detection; Convolutional Neural Network (CNN); Random Forest






DOI: https://doi.org/10.24167/proxies.v7i1.12463

Copyright (c) 2024 Proxies : Jurnal Informatika


