CHI-SQUARE AND INFORMATION GAIN FEATURE SELECTION FOR HOTEL REVIEW SENTIMENT ANALYSIS USING SUPPORT VECTOR MACHINE

Nathanael Karunia, Yonathan Purbo Santosa

Abstract


Booking tickets online through websites and mobile applications has become common, whether for transportation such as flights, vacations such as tours, or lodging such as hotels. Choosing a good hotel depends on reviews from people who have already booked it. Reviews written by visitors to a booking site or mobile application can be analyzed to produce useful output, and one such analysis is sentiment analysis. This study aims to find the best method for sentiment analysis with respect to the data preprocessing stage, specifically feature selection, and thereby to contribute knowledge about which sentiment classification setup works well. Sentiment is classified with a Support Vector Machine under three feature selection settings: no feature selection, chi-square feature selection, and information gain feature selection. The process consists of the following steps: data collection, preprocessing, feature extraction, feature selection, classification, and accuracy calculation. Accuracy was calculated from a confusion matrix and used to determine the best of the three methods. Chi-square feature selection gave the highest average accuracy, 86.68%, followed by information gain feature selection at 85.78% and no feature selection at 85.24%.
These results indicate that, for sentiment analysis of hotel reviews, chi-square feature selection performs best of the three methods.
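
As a rough illustration of the two feature selection scores named in the abstract, the sketch below computes chi-square and information gain for each term from a 2x2 term/class contingency table, then derives accuracy from confusion-matrix counts. The abstract does not give the formulas or the data used in the study, so the standard textbook definitions are assumed here, and the toy corpus `REVIEWS` and all function names are hypothetical. The SVM training step is not shown.

```python
import math

# Hypothetical toy corpus of (tokens, label) pairs; NOT the study's data.
REVIEWS = [
    (["clean", "room", "friendly", "staff"], "pos"),
    (["great", "location", "clean", "room"], "pos"),
    (["friendly", "staff", "great", "breakfast"], "pos"),
    (["dirty", "room", "rude", "staff"], "neg"),
    (["noisy", "location", "dirty", "bathroom"], "neg"),
    (["rude", "staff", "noisy", "room"], "neg"),
]


def contingency(term, docs, target="pos"):
    """Count documents by term presence and class membership.

    Returns (a, b, c, d): a = term present, target class; b = term present,
    other class; c = term absent, target class; d = term absent, other class.
    """
    a = sum(1 for toks, y in docs if term in toks and y == target)
    b = sum(1 for toks, y in docs if term in toks and y != target)
    c = sum(1 for toks, y in docs if term not in toks and y == target)
    d = sum(1 for toks, y in docs if term not in toks and y != target)
    return a, b, c, d


def chi_square(term, docs):
    """Chi-square statistic of the 2x2 term/class table (assumed formulation)."""
    a, b, c, d = contingency(term, docs)
    n = a + b + c + d
    denom = (a + c) * (b + d) * (a + b) * (c + d)
    return n * (a * d - b * c) ** 2 / denom if denom else 0.0


def entropy(counts):
    """Shannon entropy in bits of a list of counts; zero counts are skipped."""
    total = sum(counts)
    return -sum((k / total) * math.log2(k / total) for k in counts if k)


def information_gain(term, docs):
    """Class entropy minus class entropy conditioned on term presence."""
    a, b, c, d = contingency(term, docs)
    n = a + b + c + d
    conditional = 0.0
    if a + b:
        conditional += (a + b) / n * entropy([a, b])
    if c + d:
        conditional += (c + d) / n * entropy([c, d])
    return entropy([a + c, b + d]) - conditional


def top_k(score_fn, docs, k):
    """Return the k highest-scoring terms; ties are broken alphabetically."""
    vocab = sorted({t for toks, _ in docs for t in toks})
    return sorted(vocab, key=lambda t: (-score_fn(t, docs), t))[:k]


def accuracy(tp, tn, fp, fn):
    """Accuracy from confusion-matrix counts: correct predictions over all."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    print("chi-square top terms:", top_k(chi_square, REVIEWS, 3))
    print("information gain top terms:", top_k(information_gain, REVIEWS, 3))
    print("accuracy:", accuracy(tp=40, tn=45, fp=5, fn=10))
```

In this toy corpus, terms such as "room" appear equally often in both classes and score zero under both measures, so either selector would drop them before the classifier is trained. The terms kept by `top_k` would then define the feature vectors passed to the SVM.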


Keywords


feature selection; sentiment analysis; hotel review


DOI: https://doi.org/10.24167/proxies.v5i2.12451

Copyright (c) 2024 Proxies : Jurnal Informatika


