APPLICATION OF LEXICON-BASED LABELING FOR SENTIMENT ANALYSIS OF CULINARY MSME REVIEWS WITH A PERFORMANCE COMPARISON OF MACHINE LEARNING ALGORITHMS
DOI:
https://doi.org/10.54840/jcstech.v6i1.570
Keywords:
Sentiment Analysis, Lexicon-Based, Machine Learning, Culinary MSMEs, NLP
Abstract
This study classifies consumer sentiment toward culinary MSMEs in South Kalimantan using a hybrid approach that combines lexicon-based methods with Machine Learning algorithms. The dataset consists of MSME consumer reviews posted on the social media platforms Instagram and Twitter (X). The lexicon-based method labels reviews automatically using dictionaries of positive and negative words, while the Machine Learning algorithms (Naïve Bayes, Random Forest, Logistic Regression, Support Vector Machine (SVM), Decision Tree, and K-Nearest Neighbors (KNN)) serve as the models that produce the pattern-based sentiment classification. The research stages are dataset collection, text preprocessing, automatic lexicon-based labeling, TF-IDF feature extraction, model implementation, evaluation using classification metrics (accuracy, precision, recall, and F1-score), and a comparative analysis to determine which model performs best at classifying the sentiment of MSME consumer reviews. The results show that the Support Vector Machine model achieved the best accuracy (79%) with an F1-score of 70%, followed by Logistic Regression with 77% accuracy and 77% precision. Both models classified better than Naïve Bayes and Random Forest, while the tree-based model was considered suboptimal. The results are expected to serve as a reference for culinary MSMEs in East Kalimantan in understanding consumer perceptions and improving product and service quality.
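The lexicon-based labeling and metric-based evaluation stages described in the abstract can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the word lists are a hypothetical toy dictionary (the abstract does not name the lexicon used), the rule that a tie yields "neutral" is an assumed convention, and the function names `lexicon_label` and `binary_metrics` are invented. It uses only the Python standard library and is not the study's implementation.

```python
import re

# Hypothetical toy dictionaries; the lexicon actually used in the study
# is not specified in the abstract.
POSITIVE_WORDS = {"enak", "lezat", "murah", "ramah", "mantap"}
NEGATIVE_WORDS = {"mahal", "lama", "hambar", "kotor", "kecewa"}


def tokenize(text):
    """Lowercase the text and extract runs of ASCII letters as tokens."""
    return re.findall(r"[a-z]+", text.lower())


def lexicon_score(text):
    """Return positive-dictionary hits minus negative-dictionary hits."""
    tokens = tokenize(text)
    pos = sum(token in POSITIVE_WORDS for token in tokens)
    neg = sum(token in NEGATIVE_WORDS for token in tokens)
    return pos - neg


def lexicon_label(text):
    """Map the score to a label; treating a zero score as 'neutral' is an assumption."""
    score = lexicon_score(text)
    if score > 0:
        return "positive"
    if score < 0:
        return "negative"
    return "neutral"


def binary_metrics(y_true, y_pred, positive="positive"):
    """Compute accuracy, precision, recall, and F1 for one target class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


if __name__ == "__main__":
    reviews = [
        "Makanannya enak dan murah",    # two positive hits
        "Pelayanan lama, harga mahal",  # two negative hits
        "Enak tapi mahal",              # one of each, so the score is zero
    ]
    for review in reviews:
        print(review, "->", lexicon_label(review))
```

In a pipeline like the one the abstract outlines, labels produced this way would become the training targets for the classifiers, and a function such as `binary_metrics` would be applied to each model's predictions on held-out reviews to compare them.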
License
Copyright (c) 2026 Journal of Computer Science and Technology (JCS-TECH)

This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.
