Classification of Stock Listing Boards for Warrants using Machine Learning and Bayesian Optimization
Keywords: Stock Listing, Bayesian Optimization, Machine Learning, Prediction, IPO

Abstract
Automatic classification of warrant stock listing boards is an important challenge in managing capital market information, especially on the Electronic Indonesia Public Offering (E-IPO) platform. This research applies several machine learning algorithms, tuned with Bayesian Optimization, to improve classification accuracy across six listing board categories. Ensemble models such as Random Forest, CatBoost, and XGBoost showed superior performance, with the highest accuracy reaching 74.68%. Bayesian Optimization effectively identified optimal hyperparameters, strengthening overall model performance. Evaluation was conducted through stratified cross-validation and confusion matrix analysis, providing in-depth insight into prediction accuracy. The results contribute to automating listing board classification, supporting the strategic decisions of investors, issuers, and regulators in the Indonesian capital market.
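As a minimal sketch of the evaluation pipeline the abstract describes (stratified cross-validation plus a confusion matrix for a six-class ensemble classifier), the snippet below uses scikit-learn on synthetic data. The dataset, features, and hyperparameter values are assumptions for illustration only; in the paper the hyperparameters are tuned with Bayesian Optimization (typically via a library such as scikit-optimize or Optuna), whereas here they are simply fixed.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.metrics import accuracy_score, confusion_matrix

# Synthetic stand-in for the E-IPO warrant dataset: six listing-board classes.
X, y = make_classification(
    n_samples=600, n_features=12, n_informative=8,
    n_classes=6, n_clusters_per_class=1, random_state=42,
)

# Hyperparameters are fixed here for illustration; the study tunes them
# with Bayesian Optimization instead of using hand-picked values.
model = RandomForestClassifier(n_estimators=200, max_depth=10, random_state=42)

# Stratified 5-fold CV preserves the class proportions within every fold,
# which matters when listing-board categories are imbalanced.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)
y_pred = cross_val_predict(model, X, y, cv=cv)

acc = accuracy_score(y, y_pred)   # overall cross-validated accuracy
cm = confusion_matrix(y, y_pred)  # 6x6 matrix: rows = true, cols = predicted
print(f"CV accuracy: {acc:.4f}")
print(cm)
```

The confusion matrix makes per-board errors visible (e.g. which listing boards are most often confused with each other), which is the "in-depth insight" the abstract refers to.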
License
Copyright (c) 2025 Journal of Artificial Intelligence and Legal Technology

This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.