Publications
The plausibility machine commonsense (PMC) dataset: A massively crowdsourced human-annotated dataset for studying plausibility in large language models
Abstract
Commonsense reasoning has emerged as a challenging problem in Artificial Intelligence (AI). However, one area of commonsense reasoning that has not received nearly as much attention in the AI research community is plausibility assessment, which focuses on determining the likelihood of commonsense statements. Human-annotated benchmarks are essential for advancing research in this nascent area, as they enable researchers to develop and evaluate AI models effectively. Because plausibility is a subjective concept, it is important to obtain nuanced annotations, rather than a binary label of ‘plausible’ or ‘implausible’. Furthermore, it is also important to obtain multiple human annotations for a given statement, to ensure the validity of the labels. In this data article, we describe the process of re-annotating an existing commonsense plausibility dataset (SemEval-2020 Task 4) using large-scale crowdsourcing on the …
Metadata
- publication
- Data in Brief 57, 110869, 2024
- year
- 2024
- publication date
- 2024/12/1
- authors
- Navapat Nananukul, Ke Shen, Mayank Kejriwal
- link
- https://www.sciencedirect.com/science/article/pii/S2352340924008333
- resource_link
- https://www.sciencedirect.com/science/article/pii/S2352340924008333
- journal
- Data in Brief
- volume
- 57
- pages
- 110869
- publisher
- Elsevier