Application of Large-Scale Fine-Tuned Transformer Models for Business Name Generation
keywords: natural language processing, NLP, natural language generation, NLG, transformers
Natural language processing (NLP) is the computational analysis and processing of human language, using a variety of techniques that enable computer programs to work with natural language. NLP is increasingly applied to a wide range of real-world problems, from extracting meaningful information from unstructured data, analyzing sentiment, and translating between languages, to generating human-level text autonomously. The goal of this study is to employ transformer-based language models to generate high-quality business names. Specifically, we investigate whether larger models, which require more training time, yield better results for generating relatively short texts such as business names. To this end, we fine-tune several transformer architectures, both freely available and proprietary, and compare their performance. Our dataset comprises 250 928 business names. By the perplexity metric, the top-performing model in our study is GPT2-Medium. However, our findings reveal a discrepancy between human evaluation and perplexity-based assessment: according to human evaluation, the best results are obtained with the GPT-Neo-1.3B model. Interestingly, the larger GPT-Neo-2.7B model yields poorer results, and its performance is not statistically different from that of the GPT-Neo-125M model, which is 20 times smaller.
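As a concrete illustration of the perplexity metric used above (a minimal sketch, not the authors' evaluation code): for a causal language model, perplexity is the exponential of the average token-level cross-entropy, so a candidate name can be scored as shown below. The choice of the Hugging Face Transformers library, the "gpt2-medium" checkpoint, and the sample business name are all assumptions for the sake of the example.

```python
# Sketch: scoring a candidate business name by perplexity with a causal LM.
# Model name and sample text are illustrative assumptions, not the paper's setup.
import torch
from transformers import GPT2LMHeadModel, GPT2Tokenizer

tokenizer = GPT2Tokenizer.from_pretrained("gpt2-medium")
model = GPT2LMHeadModel.from_pretrained("gpt2-medium")
model.eval()

def perplexity(text: str) -> float:
    # Passing the input ids as labels makes the model return the mean
    # cross-entropy loss over the (shifted) tokens of the text.
    ids = tokenizer(text, return_tensors="pt").input_ids
    with torch.no_grad():
        loss = model(ids, labels=ids).loss
    # Perplexity = exp(average negative log-likelihood per token).
    return torch.exp(loss).item()

print(perplexity("Sunrise Consulting Group"))  # lower = more fluent to the model
```

Under this view, a fine-tuned model with lower average perplexity on held-out names fits the distribution of real business names more closely; the paper's human-evaluation results show this does not always coincide with names people judge to be better.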
mathematics subject classification 2000: 68T50
reference: Vol. 42, 2023, No. 3, pp. 525–545