
Industry-sensitive language modeling for business. (English) Zbl 07864432

Summary: We introduce BusinessBERT, a new industry-sensitive language model for business applications. The key novelty of our model lies in incorporating industry information to enhance decision-making in business-related natural language processing (NLP) tasks. BusinessBERT extends the Bidirectional Encoder Representations from Transformers (BERT) architecture by embedding industry information during pretraining through two approaches that enable the model to capture industry-specific terminology: (1) BusinessBERT is trained on business communication corpora totaling 2.23 billion tokens, consisting of company website content, MD&A statements, and scientific papers in the business domain; (2) we employ industry classification as an additional pretraining objective. Our results suggest that BusinessBERT improves data-driven decision-making by providing superior performance on business-related NLP tasks. Our experiments cover seven benchmark datasets that include text classification, named entity recognition, sentiment analysis, and question-answering tasks. Additionally, this paper reduces the complexity of using BusinessBERT for other NLP applications by making it freely available as a pretrained language model to the business community. The model, its pretraining corpora, and corresponding code snippets are accessible via https://github.com/pnborchert/BusinessBERT.
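The summary states that industry classification is used as an additional pretraining objective, but it does not say how the objectives are combined. The following is a minimal sketch, assuming the industry-classification loss is simply added, with a weight, to a masked-token prediction loss. The function names, the `industry_weight` parameter, and the toy logits are hypothetical and are not taken from the BusinessBERT code.

```python
import math


def cross_entropy(logits, target):
    """Negative log-likelihood of `target` under softmax(logits)."""
    m = max(logits)  # subtract the max for numerical stability
    log_norm = m + math.log(sum(math.exp(x - m) for x in logits))
    return log_norm - logits[target]


def joint_pretraining_loss(mlm_logits, mlm_targets, industry_logits,
                           industry_target, industry_weight=1.0):
    """Mean masked-token loss plus a weighted industry-classification loss.

    The weighted sum is an assumption made for illustration only.
    """
    mlm_loss = sum(
        cross_entropy(logits, target)
        for logits, target in zip(mlm_logits, mlm_targets)
    ) / len(mlm_targets)
    industry_loss = cross_entropy(industry_logits, industry_target)
    return mlm_loss + industry_weight * industry_loss


# Toy data: two masked positions over a 4-word vocabulary,
# and one document-level prediction over three industry classes.
mlm_logits = [[2.0, 0.1, 0.1, 0.1], [0.0, 0.0, 3.0, 0.0]]
mlm_targets = [0, 2]
industry_logits = [0.5, 2.5, 0.2]

loss = joint_pretraining_loss(mlm_logits, mlm_targets,
                              industry_logits, industry_target=1)
print(round(loss, 4))
```

In this sketch, setting `industry_weight=0.0` reduces the objective to the masked-token loss alone, which makes the contribution of the industry signal easy to isolate.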

MSC:

90Bxx Operations research and management science
