30 September 2024 to 3 October 2024
Toulouse, France
Europe/Paris timezone

Challenges of heterogeneous data for building Linguistic Theory

1 Oct 2024, 16:05
Le Village, Auditorium (Toulouse, France)

31 Allées Jules Guesde, 31000 TOULOUSE
Oral presentation


Anisia Popescu (LISN)


Linguistics thrives on data, whether it stems from small, highly controlled laboratory studies or from large heterogeneous datasets. Speech technology increasingly provides new and varied tools for testing linguistic theories (from sound change to second language learning) on large-scale data. This, however, does not come without challenges. In this presentation, we address one of the key challenges that heterogeneous data pose for testing linguistic theories (such as those concerning diverging voicing systems within language families): the frequent absence of linguistically formatted metadata.

Contribution length: Middle
