id |
oapen-20.500.12657-23715
|
record_format |
dspace
|
spelling |
oapen-20.500.12657-23715 2024-03-22T19:23:05Z Chapter Black box approaches to genealogical classification and their shortcomings Prokić, Jelena Moran, Steven Saxena, Anju Borin, Lars linguistic differences thema EDItEUR::C Language and Linguistics::CF Linguistics In the past 20 years, the application of quantitative methods in historical linguistics has received considerable attention. Traditional historical linguistics relies on the comparative method to determine the genealogical relatedness of languages. More recent quantitative approaches attempt to automate this process, either by developing computational tools that complement the comparative method (Steiner et al. 2010) or by applying fully automated methods that use little or no linguistic knowledge, e.g. the Levenshtein approach. The Levenshtein method has been used extensively in dialectometry to measure the distances between dialects (Kessler 1995; Heeringa 2004; Nerbonne 1996). It has also frequently been used to analyze the relatedness of languages, such as Indo-European (Serva and Petroni 2008; Blanchard et al. 2010), Austronesian (Petroni and Serva 2008), and a very large sample of 3002 languages (Holman 2010). In this paper we examine the performance of the Levenshtein distance against n-gram models and a zipping approach by applying these methods to the same set of language data.
2019-11-19 23:55 2020-01-07 16:47:06 2020-04-01T09:26:42Z 2020-04-01T09:26:42Z 2013 chapter 1006429 OCN: 1135845144 9783110488081 http://library.oapen.org/handle/20.500.12657/23715 eng application/pdf n/a [9783110305258 - Approaches to] Black box.pdf De Gruyter Approaches to Measuring Linguistic Differences 10.1515/9783110305258.429 10.1515/9783110305258.429 2b386f62-fc18-4108-bcf1-ade3ed4cf2f3 d344d431-123c-48b3-94be-c8d10c495b20 7292b17b-f01a-4016-94d3-d7fb5ef9fb79 9783110488081 European Research Council (ERC) Berlin/Boston 240816 FP7 Ideas: European Research Council FP7-IDEAS-ERC - Specific Programme: "Ideas" Implementing the Seventh Framework Programme of the European Community for Research, Technological Development and Demonstration Activities (2007 to 2013) open access
|
institution |
OAPEN
|
collection |
DSpace
|
language |
English
|
description |
In the past 20 years, the application of quantitative methods in historical linguistics has received considerable attention. Traditional historical linguistics relies on the comparative method to determine the genealogical relatedness of languages. More recent quantitative approaches attempt to automate this process, either by developing computational tools that complement the comparative method (Steiner et al. 2010) or by applying fully automated methods that use little or no linguistic knowledge, e.g. the Levenshtein approach. The Levenshtein method has been used extensively in dialectometry to measure the distances between dialects (Kessler 1995; Heeringa 2004; Nerbonne 1996). It has also frequently been used to analyze the relatedness of languages, such as Indo-European (Serva and Petroni 2008; Blanchard et al. 2010), Austronesian (Petroni and Serva 2008), and a very large sample of 3002 languages (Holman 2010). In this paper we examine the performance of the Levenshtein distance against n-gram models and a zipping approach by applying these methods to the same set of language data.
|
title |
Black box approaches to genealogical classification and their shortcomings
|
publisher |
De Gruyter
|
publishDate |
2019
|
_version_ |
1799945265072832512
|