Corpus Linguistics: Readings in a Widening Discipline

Corpus linguistics seeks to provide a comprehensive sampling of real-life usage in a given language, and to use these empirical data to test language hypotheses. Modern corpus linguistics began fifty years ago, but the subject has seen explosive growth since the early 1990s. These days corpora are being used to advance virtually every aspect of language study, from computer processing techniques such as machine translation, to literary stylistics, social aspects of language use, and improved language-teaching methods. Because corpus linguistics has grown fast from small beginnings, newcomers to the field often find it hard to get their bearings. Important papers can be difficult to track down.

This volume reprints forty-two articles on corpus linguistics by an international selection of authors, which comprehensively illustrate the directions in which the subject is developing. It includes articles that are already recognized as classics, and others which deserve to become so, supplemented with editorial introductions relating the individual contributions to the field as a whole. This collection of readings will be useful to students of corpus linguistics at both undergraduate and postgraduate level, as well as academics researching this fascinating area of linguistics.
Contents
5 Predicting text segmentation into tone units 1986
6 Typicality and meaning potentials 1986
7 Historical drift in three English genres 1987
8 Corpus creation 1987
9 Cleft and pseudo-cleft constructions in English spoken and written discourse 1987
10 What is wrong with adding one? 1989
11 A statistical approach to machine translation 1990
an analysis of a dialect continuum 1991
13 Using corpus data in the Swedish Academy grammar 1991
14 On the history of that/zero as object clause links in English 1991
15 Encoding the British National Corpus 1992
16 Computer corpora: what do they tell us about culture? 1992
17 Representativeness in corpus design 1992
Principles, Methods and Examples 1993
19 Structural ambiguity and lexical relations 1993
20 Irony in the text or insincerity in the writer? The diagnostic potential of semantic prosodies 1993
the Penn Treebank 1993
22 Automatically extracting collocations from corpora for language learning 1994
23 Developing and evaluating a probabilistic LR parser of part-of-speech and punctuation labels 1995
24 Why a Fiji corpus? 1996
25 Treebank grammars 1996
26 English corpus linguistics and the foreign-language teaching syllabus 1996
an overview 1996
A comparison of the verbal disputes between adolescent females in two corpora 1996
the kappa statistic 1996
30 Linguistic and interactional features of Internet Relay Chat 1996
New evaluation methods for word-sense disambiguation 1997
32 Qualification and certainty in L1 and L2 students' writing 1997
33 Analysing and predicting patterns of DAMSL utterance tags 1998
34 Assessing claims about language use with corpus data: swearing and abuse 1998
35 The syntax of disfluency in spontaneous spoken language 1998
36 The use of large text corpora for evaluating text-to-speech systems 1998
how much of the underlying syntactic structure can be tagged automatically? 1999
38 Reflections of a dendrographer 1999
39 A generic approach to software support for linguistic annotation using XML 2000
40 Europe's ignored languages 2001
41 Semi-automatic tagging of intonation in French spoken corpora 2001
42 Web as corpus 2001
43 Intonational variation in the British Isles 2002
Bibliography
Other editions
Corpus Linguistics: Readings in a Widening Discipline, Geoffrey Sampson, Diana McCarthy, 2005
Common terms and phrases
adjective adverb algorithm analysis annotation attachment automatic bigram British National Corpus Brown Corpus cent chapter COBUILD coders coding collocations communicative constituents constructions context conversations corpora corpus linguistics correct derivation developed dialect dictionary Dimension disambiguation discourse disfluency distribution encoding English epistemic estimate example fiction figure Fiji Fijian formal frequency function genres grammar hackles Indo-Fijians instance intonation labelled language lexical linguistic features mean modal n-gram node noun phrase occur parse tree parser patterns Penn Treebank possible prepositional phrase present probable parse problem processing pronouns prosodic pseudo-clefts relative clauses represent rules sample score segmentation selection semantic prosodies sense sentence sequence SGML shows Somerset speakers speech spoken standard structure style subderivations subtrees syntactic tags tagset transcription translation tree Treebank utterance variation verb words writing written XSLT zero