A Discriminative Model for Perceptually-Grounded Incremental Reference Resolution
Casey Kennington, Livia Dia, and David Schlangen
A large part of human communication involves referring to entities in the world, and often these entities are objects that are visually present for the interlocutors. A computer system that aims to resolve such references needs to tackle a complex task: objects and their visual features must be determined, the referring expressions must be recognised, extra-linguistic information such as eye gaze or pointing gestures must be incorporated, and the intended connection between words and world must be reconstructed. In this paper, we introduce a discriminative model of reference resolution that processes incrementally (i.e., word for word), is perceptually-grounded, and improves when interpolated with information from gaze and pointing gestures. We evaluated our model in a realistic reference resolution task and found that it performed robustly when compared to a generative model.
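The abstract notes that the word-based model improves when interpolated with information from gaze and pointing. As a rough illustration only (not the paper's implementation), the sketch below shows one way a linear interpolation of distributions over candidate objects could look; the object names, distributions, and interpolation weights are assumptions made for this example.

```python
# Illustrative sketch: linearly interpolating a word-based distribution over
# candidate objects with distributions derived from gaze and pointing.
# All names, probabilities, and weights below are hypothetical.

def interpolate(dists, weights):
    """Linearly combine distributions over the same set of candidate objects."""
    combined = {}
    for dist, weight in zip(dists, weights):
        for obj, prob in dist.items():
            combined[obj] = combined.get(obj, 0.0) + weight * prob
    total = sum(combined.values())
    return {obj: prob / total for obj, prob in combined.items()}

# Hypothetical per-source distributions over three visible objects.
words_dist = {"obj1": 0.6, "obj2": 0.3, "obj3": 0.1}   # from the words heard so far
gaze_dist  = {"obj1": 0.5, "obj2": 0.4, "obj3": 0.1}   # from eye gaze
point_dist = {"obj1": 0.2, "obj2": 0.7, "obj3": 0.1}   # from a pointing gesture

resolved = interpolate([words_dist, gaze_dist, point_dist], [0.6, 0.2, 0.2])
best_guess = max(resolved, key=resolved.get)  # current best referent hypothesis
print(best_guess, resolved[best_guess])
```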
In Proceedings of the 11th International Conference on Computational Semantics (IWCS 2015), 2015. [PDF]
@inproceedings{Kennington-2015,
author = {Kennington, Casey and Dia, Livia and Schlangen, David},
booktitle = {Proceedings of the 11th International Conference on Computational Semantics (IWCS) 2015},
location = {London},
pages = {195--205},
title = {{A Discriminative Model for Perceptually-Grounded Incremental Reference Resolution}},
year = {2015},
}