Publications
I cast detect thoughts: Learning to converse and guide with intents and theory-of-mind in Dungeons and Dragons
Abstract
We propose a novel task, G4C, to study teacher-student natural language interactions in a goal-driven and grounded environment. Dungeons and Dragons (D&D), a role-playing game, provides an ideal setting to investigate such interactions. Here, the Dungeon Master (DM), i.e., the teacher, guides the actions of several players (the students, each with their own personas and abilities) to achieve shared goals grounded in a fantasy world. Our approach is to decompose and model these interactions into (1) the DM's intent to guide players toward a given goal; (2) the DM's guidance utterance to the players expressing this intent; and (3) a theory-of-mind (ToM) model that anticipates the players' reaction to the guidance one turn into the future. We develop a novel reinforcement learning (RL) method for training a DM that generates guidance for players by rewarding utterances where the intent matches the ToM-anticipated player actions. Human and automated evaluations show that a DM trained to explicitly model intents and incorporate ToM of the players using RL generates better-quality guidance that is 3x more likely to fulfill the DM's intent than a vanilla natural language generation (NLG) approach.
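The reward idea described in the abstract (reward a guidance utterance when the DM's intent matches the player action that a ToM model anticipates) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the names `tom_reward`, `toy_tom_model`, and `exact_match`, the keyword-lookup stand-in for a learned ToM model, the binary 0/1 reward, and the example utterances are all hypothetical. The abstract does not specify the paper's actual models, reward shaping, or RL algorithm, and this sketch covers only the reward signal, not the policy update.

```python
from typing import Callable

# Hypothetical type aliases; none of these names come from the paper.
ToMModel = Callable[[str], str]             # guidance utterance -> anticipated player action
IntentMatcher = Callable[[str, str], bool]  # (intent, anticipated action) -> do they match?


def tom_reward(intent: str, guidance: str,
               tom_model: ToMModel, matches: IntentMatcher) -> float:
    """Return 1.0 if the ToM-anticipated player action matches the DM's intent, else 0.0.

    Assumption: a binary reward. A real system could use a graded score instead.
    """
    anticipated_action = tom_model(guidance)
    return 1.0 if matches(intent, anticipated_action) else 0.0


def toy_tom_model(guidance: str) -> str:
    """Toy stand-in for a learned ToM model: a simple keyword lookup."""
    text = guidance.lower()
    if "tracks" in text or "footprints" in text:
        return "perception check"
    if "locked" in text:
        return "pick the lock"
    return "unknown"


def exact_match(intent: str, anticipated_action: str) -> bool:
    """Toy matcher: the intent and the anticipated action must be identical strings."""
    return intent == anticipated_action


# Score two made-up candidate utterances against one made-up intent.
intent = "perception check"
candidates = [
    "You notice strange footprints leading into the woods.",
    "The tavern is loud tonight.",
]
rewards = [tom_reward(intent, g, toy_tom_model, exact_match) for g in candidates]
print(rewards)  # [1.0, 0.0]
```

In an RL training loop, a score of this kind would be the scalar feedback for each generated utterance. Under these toy assumptions, the first candidate is rewarded because the anticipated reaction matches the intent, and the second is not.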
Metadata
- publication: arXiv preprint arXiv:2212.10060, 2022
- year: 2022
- publication date: 2022/12/20
- authors: Pei Zhou, Andrew Zhu, Jennifer Hu, Jay Pujara, Xiang Ren, Chris Callison-Burch, Yejin Choi, Prithviraj Ammanabrolu
- link: https://arxiv.org/abs/2212.10060
- resource_link: https://arxiv.org/pdf/2212.10060
- journal: arXiv preprint arXiv:2212.10060