Application of Large-Scale Fine-Tuned Transformer Models for Business Name Generation
keywords: natural language processing, NLP, natural language generation, NLG, transformers
Natural language processing (NLP) is the computational analysis and processing of human language, using a variety of techniques that enable computer programs to work with natural language. NLP is increasingly applied to a wide range of real-world problems, from extracting meaningful information from unstructured data, analyzing sentiment, and translating between languages, to generating human-level text autonomously. The goal of this study is to employ transformer-based language models to generate high-quality business names. Specifically, we investigate whether larger models, which require more training time, yield better results for generating relatively short texts such as business names. To this end, we fine-tune several transformer architectures, both freely available and proprietary, and compare their performance. Our dataset comprises 250 928 business names. By the perplexity metric, the top-performing model in our study is GPT2-Medium. However, our findings reveal a discrepancy between human evaluation and perplexity-based assessment: according to human evaluation, the best results are obtained with the GPT-Neo-1.3B model. Interestingly, the larger GPT-Neo-2.7B model yields poorer results, and its performance is not statistically different from that of the GPT-Neo-125M model, which is 20 times smaller.
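As a concrete illustration of the perplexity metric used above (a minimal sketch, not the authors' evaluation code): for a causal language model, perplexity is the exponential of the average token-level cross-entropy, so a candidate name can be scored as shown below. The choice of the Hugging Face Transformers library, the "gpt2-medium" checkpoint, and the sample business name are all assumptions for the sake of the example.

```python
# Sketch: scoring a candidate business name by perplexity with a causal LM.
# Model name and sample text are illustrative assumptions, not the paper's setup.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")
model.eval()

def perplexity(text: str) -> float:
    # Passing the input ids as labels makes the model return the mean
    # cross-entropy loss over the (shifted) tokens of the text.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    # Perplexity = exp(average negative log-likelihood per token).
    return torch.exp(loss).item()

print(perplexity("Sunrise Consulting Group"))  # lower = more fluent to the model
```

Under this view, a fine-tuned model with lower average perplexity on held-out names fits the distribution of real business names more closely; the paper's human-evaluation results show this does not always coincide with names people judge to be better.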
mathematics subject classification 2000: 68T50
reference: Vol. 42, 2023, No. 3, pp. 525–545