Classification of Galician surnames with web scraping

  1. Maria José Ginzo Villamayor (1)
  (1) Departamento de Estatística, Análise Matemática e Optimización. Universidade de Santiago de Compostela
Book:
Aplicações em R : encurtando distâncias nas ciências
  1. Luciane Ferreira Alcoforado (coord.)

Publisher: Faculdade de Zootecnia e Engenharia de Alimentos da Universidade de São Paulo

ISBN: 978-65-87023-39-7

Year of publication: 2024

Pages: 223-238

Type: Book chapter

Abstract

Linguistics considers different classifications of surnames according to their motivation, morphology or semantics. In the case of Galician surnames, Boullón-Agrelo (2008) proposes a classification based on three main groups: appellatives, patronymics and toponymics. To classify Galician surnames into these three categories, web scraping techniques were used, i.e. a process of extracting content and data from websites, applied here to official Galician, Spanish and even Portuguese language dictionaries. These techniques were very useful, especially for appellatives.