
New Perspectives on Foreign Language Learning 3: Language Learning with Parsed Corpora [Prof. Alastair Butler]
Title: Learning English with a treebank/parsed corpus
Speaker: Alastair Butler (Faculty of Humanities and Social Sciences)
A corpus is a large database made up of samples of a language. A parsed corpus is a corpus with linguistic analysis added to the data. The kinds of analysis added can be quite varied, but the analysis often takes the form of syntactic trees. A parsed corpus with syntactic trees is called a treebank. Treebanks of good quality and size are extremely costly to create, and access is normally restricted. But when you can get access, there is a lot of value to be found. For starters, you can explore the sample data, but the real power comes from being able to search the linguistic analysis that accompanies the language data, which brings together related instances of language use. This talk will tell you about a treebank/parsed corpus for English that you can access, not least because it is being created at Hirosaki University. It is also available on the internet at https://entrees.github.io/. Use of this corpus will be connected to suggestions for improving your English skills.