IMPLEMENTATION OF K-MEANS ALGORITHM ELBOW METHOD AND SILHOUETTE COEFFICIENT FOR RAINFALL CLASSIFICATION
Abstract
Rain is part of the hydrological cycle, the continuous circulation of water from the earth to the atmosphere and back to the earth. High rainfall makes lowland areas and areas with poor water infiltration very susceptible to flooding. A system is therefore needed to classify weather and rainfall data for each city and district, so that cities with high rainfall and extreme weather can be given special attention to prevent natural disasters such as flooding. The collected data are processed with the K-Means algorithm to classify cities or districts as having low, medium, high, or very high rainfall. In K-Means, the number of clusters k is usually chosen arbitrarily; this project uses the Elbow Method to determine the value of k and the Silhouette Coefficient method to test the quality of the resulting clusters. The data used are rainfall data from dataonline.bmkg.go.id for a given period. Both the elbow method and the silhouette method can be used to select the optimal number of clusters, and they mostly give the same result. Accuracy is higher when the optimal number of clusters is used than when it is not. For example, when clustering the Semarang data for February 1-28, 2021, K = 4 gives an accuracy of 92.8571429%, while the optimal number of clusters, K = 3, gives a higher accuracy of 97.6190476%.
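The selection of k described above can be illustrated with a minimal sketch. It is not the authors' implementation: the rainfall values are hypothetical, the data are treated as one-dimensional, and the function names (`kmeans_1d`, `wcss`, `silhouette`) and the rule for spreading default initial centroids over the sorted data are assumptions made for illustration only.

```python
def kmeans_1d(values, k, init=None, max_iter=100):
    """Lloyd's K-Means on 1-D data; returns (centroids, labels)."""
    if init is None:
        # Assumption: spread the initial centroids over the sorted data.
        s = sorted(values)
        init = [s[(2 * i + 1) * len(s) // (2 * k)] for i in range(k)]
    centroids = list(init)
    labels = None
    for _ in range(max_iter):
        # Assign every value to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                      for v in values]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = sum(members) / len(members)
    return centroids, labels


def wcss(values, centroids, labels):
    """Within-cluster sum of squares, the quantity plotted in the elbow method."""
    return sum((v - centroids[l]) ** 2 for v, l in zip(values, labels))


def silhouette(values, labels):
    """Mean silhouette coefficient over all points (needs at least 2 clusters)."""
    clusters = {}
    for v, l in zip(values, labels):
        clusters.setdefault(l, []).append(v)
    scores = []
    for v, l in zip(values, labels):
        own = clusters[l]
        if len(own) == 1:
            scores.append(0.0)  # convention for singleton clusters
            continue
        # a: mean distance to the other members of the same cluster.
        a = sum(abs(v - u) for u in own) / (len(own) - 1)
        # b: mean distance to the nearest other cluster.
        b = min(sum(abs(v - u) for u in other) / len(other)
                for m, other in clusters.items() if m != l)
        scores.append((b - a) / max(a, b) if max(a, b) > 0 else 0.0)
    return sum(scores) / len(scores)


# Hypothetical daily rainfall values in mm (not taken from the BMKG data).
rainfall = [0.0, 0.5, 1.2, 2.0, 18.0, 20.5, 22.0, 25.0, 61.0, 65.5, 70.0]

wcss_by_k = {}
sil_by_k = {}
for k in range(1, 6):
    centroids, labels = kmeans_1d(rainfall, k)
    wcss_by_k[k] = wcss(rainfall, centroids, labels)
    if k >= 2:
        sil_by_k[k] = silhouette(rainfall, labels)

# Elbow method: look for the k after which WCSS stops dropping sharply.
for k, value in wcss_by_k.items():
    print(f"k={k}  WCSS={value:.2f}")

# Silhouette method: pick the k with the highest mean coefficient.
best_k = max(sil_by_k, key=sil_by_k.get)
print("k with highest silhouette:", best_k)
```

In this sketch the elbow is read off the printed WCSS values by eye, while the silhouette choice is taken as the k with the largest mean coefficient; on real data the two can disagree, as the Cilacap case shows.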
In the Cilacap classification for April 1-30, 2021, the elbow method and the silhouette coefficient method produce different optimal numbers of clusters, and the accuracy obtained with the optimal number from the silhouette coefficient (85.7142857%) is higher than with the optimal number from the elbow method (74.6031746%). However, when the data are processed with the centroids in Table 5.10, both methods produce the same optimal number of clusters, 2. This shows that the choice of initial centroids can affect the results of the elbow method and the silhouette coefficient method.
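The sensitivity to initial centroids can be shown with a small sketch. The data and the two starting points are hypothetical, and `kmeans_1d` is an illustrative helper rather than the code used in the study. With the same data and the same k, two initializations converge to different clusterings, so any WCSS or silhouette value computed from them also differs.

```python
def kmeans_1d(values, init, max_iter=100):
    """Lloyd's K-Means on 1-D data from given initial centroids."""
    centroids = list(init)
    k = len(centroids)
    labels = None
    for _ in range(max_iter):
        # Assign every value to its nearest centroid.
        new_labels = [min(range(k), key=lambda c: abs(v - centroids[c]))
                      for v in values]
        if new_labels == labels:
            break  # converged: assignments stopped changing
        labels = new_labels
        for c in range(k):
            members = [v for v, l in zip(values, labels) if l == c]
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = sum(members) / len(members)
    return centroids, labels


def wcss(values, centroids, labels):
    """Within-cluster sum of squares for a finished clustering."""
    return sum((v - centroids[l]) ** 2 for v, l in zip(values, labels))


# Hypothetical daily rainfall values in mm (not taken from the BMKG data).
rainfall = [0.0, 1.0, 2.0, 10.0, 11.5, 12.0, 20.0, 21.0, 22.0]

# Same data, same k = 2, two different sets of initial centroids.
centroids_a, labels_a = kmeans_1d(rainfall, init=[0.0, 22.0])
centroids_b, labels_b = kmeans_1d(rainfall, init=[0.0, 1.0])

wcss_a = wcss(rainfall, centroids_a, labels_a)
wcss_b = wcss(rainfall, centroids_b, labels_b)

print("init [0, 22]:", labels_a, f"WCSS={wcss_a:.2f}")
print("init [0, 1]: ", labels_b, f"WCSS={wcss_b:.2f}")
```

Because each run stops at a local optimum, a common mitigation is to repeat K-Means from several initializations and keep the run with the lowest WCSS before applying the elbow or silhouette method.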
DOI: https://doi.org/10.24167/proxies.v4i1.12433
Copyright (c) 2024 Proxies : Jurnal Informatika