A computational psycholinguistic evaluation of the syntactic abilities of Galician BERT models at the interface of dependency resolution and training time

  1. de-Dios-Flores, Iria
  2. García González, Marcos
Journal:
Procesamiento del lenguaje natural

ISSN: 1135-5948

Year of publication: 2022

Issue: 69

Pages: 15-26

Type: Article

Other publications in: Procesamiento del lenguaje natural

Abstract

This paper explores the ability of Transformer models to capture subject-verb and noun-adjective agreement dependencies in Galician. We conduct a series of word prediction experiments in which we manipulate dependency length together with the presence of an attractor noun that acts as a lure. First, we evaluate the overall performance of the existing monolingual and multilingual models for Galician. Second, to observe the effects of the training process, we compare the performance of two monolingual BERT models at different training points. We also release their checkpoints and propose an alternative evaluation metric. Our results confirm previous findings from similar works that use the agreement prediction task and provide interesting insights into the number of training steps required by a Transformer model to solve long-distance dependencies.
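As an illustration of the kind of agreement-prediction probe described in the abstract (a minimal sketch, not the authors' exact pipeline), the following Python snippet scores a masked verb position with a BERT-style model and compares the probabilities of the grammatical and ungrammatical verb forms across an intervening attractor noun. The model name is a placeholder, and the candidate verb forms are assumed to exist as single tokens in the model's vocabulary.

```python
import torch
from transformers import AutoModelForMaskedLM, AutoTokenizer

# Placeholder model: substitute any monolingual or multilingual BERT
# checkpoint for Galician available on the Hugging Face Hub.
MODEL_NAME = "bert-base-multilingual-cased"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForMaskedLM.from_pretrained(MODEL_NAME)
model.eval()

def agreement_scores(masked_sentence, correct_form, wrong_form):
    """Return the model's probabilities for the grammatical and the
    ungrammatical verb form at the [MASK] position."""
    correct_id = tokenizer.convert_tokens_to_ids(correct_form)
    wrong_id = tokenizer.convert_tokens_to_ids(wrong_form)
    # Both candidate forms must be single tokens in the vocabulary.
    if tokenizer.unk_token_id in (correct_id, wrong_id):
        raise ValueError("candidate verb forms must be single vocabulary tokens")

    inputs = tokenizer(masked_sentence, return_tensors="pt")
    mask_pos = (inputs["input_ids"][0] == tokenizer.mask_token_id).nonzero(as_tuple=True)[0]
    with torch.no_grad():
        logits = model(**inputs).logits
    probs = logits[0, mask_pos].softmax(dim=-1).squeeze(0)
    return probs[correct_id].item(), probs[wrong_id].item()

# Galician example: singular subject ("A chave") with a plural attractor
# noun ("dos armarios") intervening before the masked verb position.
sentence = f"A chave dos armarios {tokenizer.mask_token} enriba da mesa."
p_correct, p_wrong = agreement_scores(sentence, "está", "están")
print(f"P(está)={p_correct:.4f}  P(están)={p_wrong:.4f}  correct: {p_correct > p_wrong}")
```

Repeating this comparison over minimal pairs that vary dependency length and the number marking of the attractor yields per-condition accuracies of the kind used in agreement-prediction studies.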
