BERT with Entity Recognition for Classifying Local and Import Products to Support Local MSMEs in the Global Market
Abstract
The digital era and the COVID-19 pandemic have accelerated the shift toward online shopping. E-commerce gives buyers access to a global market, which has increased cross-border transactions. This poses a direct challenge for local micro, small, and medium enterprises (MSMEs), and the government has responded with regulations and campaigns that prioritize local products. This study presents a machine-learning model that classifies products as local or imported. We fine-tuned a Bidirectional Encoder Representations from Transformers (BERT) model pre-trained on Indonesian (Bahasa Indonesia) using product titles, and used Entity Recognition to extract brand names as additional features. The model achieves an accuracy of 97.79%. Brand name information is a strong signal for identifying local products, as shown by the improvement in performance once the brand name is added as a feature. With this model, the government and e-commerce platforms in Indonesia can collaborate on the government campaign to strengthen the competitiveness of MSMEs by prioritizing local products, reducing the share of imported products, and evaluating government programs aimed specifically at accelerating Indonesian MSMEs.
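The abstract does not say how the extracted brand name is combined with the product title, so the following is only a minimal sketch under stated assumptions. It appends the brand to the title after a separator token, which is one common way to give a BERT-style classifier an extra text feature. The brand lexicon, the function names, and the `[SEP]` joining scheme are all hypothetical, and the dictionary lookup merely stands in for a trained entity-recognition model.

```python
from typing import Optional

# Hypothetical brand lexicon used only for illustration; the study itself
# uses an entity-recognition model, not a fixed dictionary.
KNOWN_BRANDS = {"erigo", "wardah", "xiaomi"}


def extract_brand(title: str) -> Optional[str]:
    """Return the first title token found in the lexicon, or None."""
    for token in title.lower().split():
        cleaned = token.strip(".,()[]-/")
        if cleaned in KNOWN_BRANDS:
            return cleaned
    return None


def build_model_input(title: str, sep_token: str = "[SEP]") -> str:
    """Join the title and the extracted brand into one classifier input.

    Assumption: the brand is appended as a second text segment. Titles
    with no recognized brand are passed through unchanged.
    """
    brand = extract_brand(title)
    if brand is None:
        return title
    return f"{title} {sep_token} {brand}"


if __name__ == "__main__":
    print(build_model_input("Kaos Erigo Lengan Pendek"))
    print(build_model_input("Sepatu pria"))
```

In a real pipeline the resulting string (or the title and brand as a sentence pair) would be tokenized and passed to the fine-tuned classifier; that step is omitted here because it depends on the specific pre-trained checkpoint.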
Article Details
This work is licensed under a Creative Commons Attribution-ShareAlike 4.0 International License.