Publications

On the generalization abilities of fine-tuned commonsense language representation models

Abstract

Recently, transformer-based methods such as RoBERTa and GPT-3 have led to significant experimental advances in natural language processing tasks such as question answering and commonsense reasoning. The latter is typically evaluated through multiple benchmarks framed as multiple-choice instances of the former. According to influential leaderboards hosted by the Allen Institute, which track state-of-the-art performance on commonsense reasoning benchmarks, models based on such transformer methods are approaching human-like performance, with average accuracies well over 80% on many benchmarks. Since these benchmarks all test commonsense, a model that truly generalizes on commonsense reasoning should not suffer much performance loss when evaluated across them. In this paper, we study the generalization issue in detail by designing and conducting a …
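For context, the multiple-choice question-answering setup the abstract refers to can be sketched with the Hugging Face transformers library. This is a minimal illustration of the interface, not the authors' evaluation code; the checkpoint name, question, and answer options are assumptions for demonstration, and in practice the model would first be fine-tuned on a benchmark's training split before its head produces meaningful scores.

```python
import torch
from transformers import RobertaTokenizer, RobertaForMultipleChoice

# Hypothetical checkpoint; the paper's actual fine-tuned weights are not public here.
tokenizer = RobertaTokenizer.from_pretrained("roberta-large")
model = RobertaForMultipleChoice.from_pretrained("roberta-large")
model.eval()

# Invented commonsense question and candidate answers (illustrative only).
question = "Where would you most likely put a clean plate?"
choices = ["in a cupboard", "on the moon", "under the doormat"]

# Each (question, choice) pair is encoded separately; the model scores every
# pair, and the highest-scoring choice is taken as the prediction.
encoding = tokenizer([question] * len(choices), choices,
                     return_tensors="pt", padding=True)
# Multiple-choice models expect inputs of shape (batch, num_choices, seq_len).
inputs = {k: v.unsqueeze(0) for k, v in encoding.items()}

with torch.no_grad():
    logits = model(**inputs).logits  # shape: (1, num_choices)

print("Predicted answer:", choices[logits.argmax(dim=-1).item()])
```

Note that the untuned multiple-choice head above is randomly initialized, so its output is only meaningful after fine-tuning; the sketch shows the evaluation interface the benchmarks share, which is what makes cross-benchmark generalization studies like this one possible.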

Date
August 23, 2025
Authors
Ke Shen, Mayank Kejriwal
Conference
Artificial Intelligence XXXVIII: 41st SGAI International Conference on Artificial Intelligence, AI 2021, Cambridge, UK, December 14–16, 2021, Proceedings
Pages
3–16
Publisher
Springer International Publishing