Improved training methods for language models using data generation and reinforcement learning

Abstract

The disclosed method generates useful training data for a language model, for example a model that performs punctuation restoration on real-world automatic speech recognition (ASR) text. It applies reinforcement learning to a generative AI model, which produces additional data for training the language model. The generative AI model learns from real-world ASR text and uses gradient feedback from the language model to generate more effective training examples.
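The feedback loop described in the abstract can be illustrated with a toy sketch. This is not the disclosed method: the abstract does not say how gradient feedback is turned into a reward, so the sketch assumes the reward is the cosine similarity between the gradient a generated example would induce in the language model and the gradient computed on real ASR text. The generator is reduced to a categorical policy over a few hypothetical generation styles, each represented by a fixed gradient vector, and it is updated with plain REINFORCE. All names (`train_generator`, `style_grads`, `real_grad`) are illustrative.

```python
import math
import random


def softmax(logits):
    """Convert logits to probabilities in a numerically stable way."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def train_generator(style_grads, real_grad, steps=2000, lr=0.1, seed=0):
    """Toy REINFORCE loop over hypothetical generation styles.

    style_grads[k] stands in for the gradient the language model would
    receive from an example generated in style k (an assumption; a real
    system would compute this by backpropagation). real_grad stands in
    for the gradient computed on real ASR text.
    """
    rng = random.Random(seed)
    logits = [0.0] * len(style_grads)
    baseline = 0.0  # running average of rewards, used to reduce variance
    for _ in range(steps):
        probs = softmax(logits)
        # Sample a generation style from the current policy.
        k = rng.choices(range(len(probs)), weights=probs)[0]
        # Assumed reward: gradient agreement with real ASR data.
        reward = cosine(style_grads[k], real_grad)
        advantage = reward - baseline
        baseline += 0.05 * (reward - baseline)
        # REINFORCE update: d log p(k) / d logit_j = 1[j == k] - p_j.
        for j in range(len(logits)):
            grad_logp = (1.0 if j == k else 0.0) - probs[j]
            logits[j] += lr * advantage * grad_logp
    return softmax(logits)


if __name__ == "__main__":
    # Three hypothetical styles; the first aligns best with the real gradient.
    probs = train_generator([[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]], [1.0, 0.1])
    print(probs)
```

In this toy setting the policy shifts probability toward the style whose gradient agrees most with the real-data gradient, which is the intuition behind letting the generator learn from the language model's feedback.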

Date: January 16, 2025
Authors: VD Lai, T Bui, S Yoon, Q Tran, H Tan, H Deilamsalehy, A Salinas, ...
Inventors: Viet Dac Lai, Trung Bui, Seunghyun Yoon, Quan Tran, Hao Tan, Hanieh Deilamsalehy, Abel Salinas, Franck Dernoncourt
Patent office: US
Application number: 18220910

Information Sciences Institute
