Information-theoretic causal inference of lexical flow

Dellert, Johannes

doi:10.5281/zenodo.3247415

Metadata

dc.contributor.author

Dellert, Johannes

dc.date.accessioned

2019-11-04T10:48:07Z

dc.date.available

2019-11-04T10:48:07Z

dc.date.issued

2019

dc.identifier.isbn

978-3-96110-144-3

dc.identifier.uri

https://refubium.fu-berlin.de/handle/fub188/25866

dc.identifier.uri

http://dx.doi.org/10.17169/refubium-25627

dc.description.abstract

This volume seeks to infer large phylogenetic networks from phonetically encoded lexical data and contribute in this way to the historical study of language varieties. The technical step that enables progress in this case is the use of causal inference algorithms. Sample sets of words from language varieties are preprocessed into automatically inferred cognate sets, and then modeled as information-theoretic variables based on an intuitive measure of cognate overlap. Causal inference is then applied to these variables in order to determine the existence and direction of influence among the varieties. The directed arcs in the resulting graph structures can be interpreted as reflecting the existence and directionality of lexical flow, a unified model which subsumes inheritance and borrowing as the two main ways of transmission that shape the basic lexicon of languages. A flow-based separation criterion and domain-specific directionality detection criteria are developed to make existing causal inference algorithms more robust against imperfect cognacy data, giving rise to two new algorithms. The Phylogenetic Lexical Flow Inference (PLFI) algorithm requires lexical features of proto-languages to be reconstructed in advance, but yields fully general phylogenetic networks, whereas the more complex Contact Lexical Flow Inference (CLFI) algorithm treats proto-languages as hidden common causes, and only returns hypotheses of historical contact situations between attested languages. The algorithms are evaluated both against a large lexical database of Northern Eurasia spanning many language families, and against simulated data generated by a new model of language contact that builds on the opening and closing of directional contact channels as primary evolutionary events. The algorithms are found to infer the existence of contacts very reliably, whereas the inference of directionality remains difficult. 
This currently limits the new algorithms to a role as exploratory tools for quickly detecting salient patterns in large lexical datasets, but it should soon be possible for the framework to be enhanced e.g. by confidence values for each directionality decision.

dc.format.extent

xiii, 363 pages

dc.language

eng

dc.rights.uri

http://www.fu-berlin.de/sites/refubium/rechtliches/Nutzungsbedingungen

dc.subject

Linguistics

dc.subject

Causal inference

dc.subject

lexical flow

dc.subject.ddc

400 Language::410 Linguistics::410 Linguistics

dc.title

Information-theoretic causal inference of lexical flow

dc.type

Book

dc.identifier.urn

urn:nbn:de:kobv:188-refubium-25866-9

dcterms.bibliographicCitation.doi

10.5281/zenodo.3247415

dcterms.bibliographicCitation.originalpublishername

Language Science Press

dcterms.bibliographicCitation.url

http://langsci-press.org/catalog/book/233

refubium.affiliation

Philosophie und Geisteswissenschaften

refubium.affiliation.other

Institut für Deutsche und Niederländische Philologie

This authority record entry has been confirmed as valid by a user.

refubium.resourceType.isindependentpub

yes

refubium.series.issueNumber

refubium.series.name

Language Variation

dcterms.accessRights.dnb

free

dcterms.accessRights.openaire

open access

dc.identifier.eisbn

978-3-96110-143-6

This document appears in:

Language Variation

Files for this resource

LV_4_Dellert.pdf

Size: 2.994 MB

Format: PDF

Checksum (MD5): 4963908346a8c388a112222a35945c91

Refubium - Repository of Freie Universität Berlin
