Comparative Analysis of Support Vector Machine and Random Forest for Phishing Email Detection

Authors

  • Indah Purnama Sari, Universitas Muhammadiyah Sumatera Utara
  • Oris Krianto Sulaiman, Universitas Islam Negeri Ar-Raniry Banda Aceh
  • Dicky Apdilah, Universitas Asahan
  • Pastima Simanjuntak, Universitas Putera Batam
https://doi.org/10.58466/aicoms.v4i2.1806

Keywords:

Email Phishing, Machine Learning, Support Vector Machine, Random Forest

Abstract

Information and communication technology has advanced rapidly, bringing significant changes to daily life. Access to information has become faster and easier, but this convenience also introduces challenges, particularly for personal data security. One common cybercrime is email phishing, in which attackers send deceptive messages containing malicious links; in some attacks, following such a link leads to the user's data or device being encrypted, with a ransom demanded to restore access. Phishing emails often resemble official messages from trusted sources, so recipients may not recognize the threat. To reduce this risk, phishing emails can be classified automatically. This study develops machine learning models for automatic phishing email classification. The dataset consists of 18,650 emails: 11,322 non-phishing and 7,328 phishing. Two algorithms are used, Support Vector Machine (SVM) and Random Forest, with hyperparameters tuned using GridSearchCV. In the experiments, SVM achieved an accuracy of 97.27% and Random Forest achieved 96.51%. These results indicate that the developed models can effectively support efforts to anticipate and mitigate phishing email threats.
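
The tuning procedure named in the abstract can be illustrated with a minimal scikit-learn sketch. It is not the authors' implementation: the abstract does not state how the emails were turned into features or which hyperparameter values were searched, so the TF-IDF features, the parameter grids, the two-fold cross-validation, the helper name `tune`, and the tiny toy corpus below are all assumptions made for illustration.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.svm import SVC

# Hypothetical toy corpus (assumption); the study's 18,650 emails are not reproduced here.
emails = [
    "Verify your account now by clicking this secure link",
    "Your password expires today, confirm your login details here",
    "Urgent: your bank account is locked, click the link to restore access",
    "You won a prize, click here and enter your card details to claim",
    "Meeting moved to Thursday, agenda attached for review",
    "Please find the quarterly report attached for your review",
    "Lunch on Friday? Let me know what time works for you",
    "The project review meeting notes are attached, thanks",
]
labels = [1, 1, 1, 1, 0, 0, 0, 0]  # 1 = phishing, 0 = non-phishing


def tune(classifier, param_grid, texts, targets):
    """Grid-search a TF-IDF + classifier pipeline (hypothetical helper)."""
    pipeline = Pipeline([
        ("tfidf", TfidfVectorizer()),  # assumed feature extraction
        ("clf", classifier),
    ])
    # cv=2 only because the toy corpus is tiny; a real study would use more folds.
    search = GridSearchCV(pipeline, param_grid, cv=2, scoring="accuracy")
    search.fit(texts, targets)
    return search


# Assumed search grids; the values actually searched in the study are not given.
svm_search = tune(
    SVC(),
    {"clf__C": [0.1, 1, 10], "clf__kernel": ["linear", "rbf"]},
    emails,
    labels,
)
rf_search = tune(
    RandomForestClassifier(random_state=0),
    {"clf__n_estimators": [50, 100], "clf__max_depth": [None, 10]},
    emails,
    labels,
)

# Report the best setting and mean cross-validated accuracy for each model.
for name, search in [("SVM", svm_search), ("Random Forest", rf_search)]:
    print(name, search.best_params_, round(search.best_score_, 4))
```

Scores from a corpus this small say nothing about the reported 97.27% and 96.51% accuracies; the sketch only shows how both classifiers can be tuned and compared under the same cross-validation setup.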

Published

2025-11-13

How to Cite

Purnama Sari, I., Krianto Sulaiman, O., Apdilah, D., & Simanjuntak, P. (2025). Analisis Komparatif Support Vector Machine dan Random Forest untuk Deteksi Email Phishing. Applied Information Technology and Computer Science (AICOMS), 4(2), 18-27. https://doi.org/10.58466/aicoms.v4i2.1806

Section

Article