Controlled Text Generation: A Self Supervised Approach to Generate Text Satisfying a Given Set of Constraints

Language plays a crucial role in day-to-day communication. The human cognitive ability to generate language effortlessly to express ideas, information, and facts is nothing less than remarkable. To express any form of information effectively, a speaker or writer needs to dynamically alter the attributes of a discourse. This may include gathering relevant facts, addressing a person or a group, or keeping to the ideas that belong to a specific task or situation. In Natural Language Processing (NLP), language generation is used for applications such as text summarization, question answering, text completion, and sports and weather reporting. Natural Language Generation (NLG), or text generation, is an active sub-field of NLP that aims to generate text indistinguishable from human-written text. With the rise of Recurrent Neural Networks and the Transformer architecture, text generation has made leaps and achieved state-of-the-art performance, producing output that is almost indistinguishable from natural language. The generation process, however, is still uncontrollable and randomized, and in several cases results in loss of context. In this study, we explore the area of controlled text generation and offer a simplified approach to this task: we have the model learn the underlying attributes of text and generate results confined to a dynamic, user-specified context. Steering the generation of text and confining it to a context built upon a specified set of attributes is a challenging task. We limited our focus to three fundamental attributes of text (keywords, content, and style) and applied the GPT-2 transformer-based language model in a semi-supervised learning approach, taking each text instance together with its extracted attributes and binding them together using an encoding format.
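As a rough illustration of what binding a text instance to its attributes could look like, the sketch below serializes the three attributes and the text into a single training string, and builds the matching attribute-only prompt for generation. The delimiter tokens, field order, and the function names `encode_example` and `build_prompt` are hypothetical; they are assumptions made for illustration and not the encoding format used in this study.

```python
from typing import Sequence

# Hypothetical delimiter tokens; the actual encoding format is not given here.
STYLE_TAG = "<|style|>"
CONTENT_TAG = "<|content|>"
KEYWORDS_TAG = "<|keywords|>"
TEXT_TAG = "<|text|>"
END_TAG = "<|endoftext|>"


def build_prompt(keywords: Sequence[str], content: str, style: str) -> str:
    """Serialize only the control attributes; used as the generation prompt."""
    keyword_field = ", ".join(k.strip() for k in keywords)
    return (
        f"{STYLE_TAG} {style} "
        f"{CONTENT_TAG} {content} "
        f"{KEYWORDS_TAG} {keyword_field} "
        f"{TEXT_TAG}"
    )


def encode_example(text: str, keywords: Sequence[str], content: str, style: str) -> str:
    """Bind a text instance to its extracted attributes for training."""
    return f"{build_prompt(keywords, content, style)} {text.strip()} {END_TAG}"


if __name__ == "__main__":
    example = encode_example(
        text="The home side scored twice in the final minutes.",
        keywords=["scored", "minutes"],
        content="sports",
        style="formal",
    )
    print(example)
```

Because the training string starts with exactly the prompt that is later supplied at inference time, a language model trained on such strings can, in principle, be asked to continue from a new attribute set.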
We trained the model with this encoding so that it learns the underlying representation and becomes capable of producing output when presented with an attribute set dynamically. Each control attribute was evaluated according to how it is represented in the output text. The keyword attribute was evaluated by how well the keyword set was reflected in the final output, the style attribute was evaluated with pre-existing classifiers, and the content attribute was evaluated with a BiLSTM classifier that we trained on the raw dataset. Finally, the generated text was evaluated for fluency and plausibility. The automatic evaluation shows promising results and demonstrates the model's capability to reproduce the given context. The human evaluation yields results close to the automatic ones, but it is limited by its small scale and by the lack of available resources specific to this study. During stress-testing, we found that the model maintained plausibility in most cases even when it was given constraints in the form of unusual inputs. Overall, the model's ability to reproduce the given context supports our initial hypothesis that a semi-supervised learning approach is viable for controlled text generation, and the evaluation provides a reasonable proof of concept, although there is still plenty of room for exploration and there are challenges yet to overcome.
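One simple way to quantify how well a keyword set is reflected in generated text is the fraction of requested keywords that appear in the output. The sketch below assumes single-word keywords and case-insensitive exact token matching; the function name `keyword_coverage` and the matching rule are illustrative assumptions, not necessarily the metric used in this study.

```python
import re
from typing import Iterable


def keyword_coverage(keywords: Iterable[str], generated_text: str) -> float:
    """Return the fraction of distinct keywords found as tokens in the text.

    Assumes single-word keywords; matching is case-insensitive and exact,
    so inflected forms (e.g. "storms" for "storm") do not count as hits.
    """
    tokens = set(re.findall(r"[a-z0-9']+", generated_text.lower()))
    wanted = {k.strip().lower() for k in keywords if k.strip()}
    if not wanted:
        return 0.0
    hits = sum(1 for k in wanted if k in tokens)
    return hits / len(wanted)


if __name__ == "__main__":
    score = keyword_coverage(
        ["rain", "storm", "sun"],
        "Heavy rain and a storm hit the coast.",
    )
    print(f"keyword coverage: {score:.2f}")
```

A stricter or looser variant (for example, matching on lemmas or allowing multi-word keywords) would change the scores, so the matching rule should be reported alongside any such number.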