Pento-DIARef: A Diagnostic Dataset for Learning the Incremental Algorithm for Referring Expression Generation from Examples

Sadler, Philipp and Schlangen, David

NLP tasks are typically defined extensionally through datasets containing example instantiations (e.g., pairs of image _i_ and text _t_), but motivated intensionally through capabilities invoked in verbal descriptions of the task (e.g., “_t_ is a description of _i_, for which the content of _i_ needs to be recognised and understood”). We present Pento-DIARef, a diagnostic dataset in a visual domain of puzzle pieces where referring expressions are generated by a well-known symbolic algorithm (the “Incremental Algorithm”), which itself is motivated by appeal to a hypothesised capability (eliminating distractors through application of Gricean maxims). Our question then is whether the extensional description (the dataset) is sufficient for a neural model to pick up the underlying regularity and exhibit this capability given the simple task definition of producing expressions from visual inputs. We find that a model supported by a vision detection step and a targeted data generation scheme achieves an almost perfect BLEU@1 score and sentence accuracy, whereas simpler baselines do not.
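
As a rough illustration of the distractor-elimination idea behind the Incremental Algorithm mentioned above, the following is a minimal sketch, not the dataset's actual generation code. It assumes that pieces are described by plain attribute dictionaries and that attributes are tried in a fixed preference order; the attribute names (`color`, `shape`, `position`), their order, and the function name `incremental_algorithm` are hypothetical choices made for this example. The sketch also omits refinements of the original algorithm, such as always including the type of the object.

```python
from typing import Dict, List, Tuple

Piece = Dict[str, str]  # hypothetical representation: attribute name -> value


def incremental_algorithm(
    target: Piece,
    distractors: List[Piece],
    preference_order: List[str],
) -> Tuple[List[Tuple[str, str]], bool]:
    """Select attributes of `target` until all distractors are ruled out.

    Returns the selected (attribute, value) pairs and a flag that is True
    when no distractor remains, i.e. the description is distinguishing.
    """
    remaining = list(distractors)
    selected: List[Tuple[str, str]] = []
    for attribute in preference_order:
        if not remaining:
            break  # target already uniquely identified
        value = target[attribute]
        # Distractors that differ from the target on this attribute.
        ruled_out = [d for d in remaining if d.get(attribute) != value]
        if ruled_out:
            # The attribute is useful: keep it and drop what it excludes.
            selected.append((attribute, value))
            remaining = [d for d in remaining if d.get(attribute) == value]
    return selected, not remaining


# Illustrative scene (made-up values, not taken from the dataset).
target = {"color": "red", "shape": "T", "position": "top left"}
distractors = [
    {"color": "blue", "shape": "T", "position": "top left"},
    {"color": "red", "shape": "X", "position": "bottom right"},
]
description, distinguishing = incremental_algorithm(
    target, distractors, ["color", "shape", "position"]
)
```

In this made-up scene, colour removes the blue piece and shape removes the remaining red one, so position is never added: an attribute is included only if it excludes at least one remaining distractor.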

In Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics, 2023

@inproceedings{Sadler-2023,
  title = {Pento-{DIAR}ef: A Diagnostic Dataset for Learning the Incremental Algorithm for Referring Expression Generation from Examples},
  author = {Sadler, Philipp and Schlangen, David},
  editor = {Vlachos, Andreas and Augenstein, Isabelle},
  booktitle = {Proceedings of the 17th Conference of the European Chapter of the Association for Computational Linguistics},
  month = may,
  year = {2023},
  address = {Dubrovnik, Croatia},
  publisher = {Association for Computational Linguistics},
  url = {https://aclanthology.org/2023.eacl-main.154},
  doi = {10.18653/v1/2023.eacl-main.154},
  pages = {2106--2122}
}