Corpus Linguistics: Readings in a Widening Discipline

A&C Black, Oct 1, 2005 - Language Arts & Disciplines - 542 pages

Corpus linguistics seeks to provide a comprehensive sampling of real-life usage in a given language, and to use these empirical data to test language hypotheses. Modern corpus linguistics began fifty years ago, but the subject has seen explosive growth since the early 1990s. These days corpora are being used to advance virtually every aspect of language study, from computer processing techniques such as machine translation, to literary stylistics, social aspects of language use, and improved language-teaching methods.

Because corpus linguistics has grown fast from small beginnings, newcomers to the field often find it hard to get their bearings. Important papers can be difficult to track down. This volume reprints forty-two articles on corpus linguistics by an international selection of authors, which comprehensively illustrate the directions in which the subject is developing. It includes articles that are already recognized as classics, and others which deserve to become so, supplemented with editorial introductions relating the individual contributions to the field as a whole.

This collection of readings will be useful to students of corpus linguistics at both undergraduate and postgraduate level, as well as academics researching this fascinating area of linguistics.

Contents

1 Introduction
2 From The Structure of English (1952)
3 A standard corpus of edited present-day American English (1965)
4 On the distribution of noun-phrase types in English clause-structure (1971)
5 Predicting text segmentation into tone units (1986)
6 Typicality and meaning potentials (1986)
7 Historical drift in three English genres (1987)
8 Corpus creation (1987)
9 Cleft and pseudo-cleft constructions in English spoken and written discourse (1987)
10 What is wrong with adding one? (1989)
11 A statistical approach to machine translation (1990)
an analysis of a dialect continuum (1991)
13 Using corpus data in the Swedish Academy grammar (1991)
14 On the history of that/zero as object clause links in English (1991)
15 Encoding the British National Corpus (1992)
16 Computer corpora: what do they tell us about culture? (1992)
17 Representativeness in corpus design (1992)
Principles, Methods and Examples (1993)
19 Structural ambiguity and lexical relations (1993)
20 Irony in the text or insincerity in the writer? The diagnostic potential of semantic prosodies (1993)
the Penn Treebank (1993)
22 Automatically extracting collocations from corpora for language learning (1994)
23 Developing and evaluating a probabilistic LR parser of part-of-speech and punctuation labels (1995)
24 Why a Fiji corpus? (1996)
25 Treebank grammars (1996)
26 English corpus linguistics and the foreign-language teaching syllabus (1996)
an overview (1996)
A comparison of the verbal disputes between adolescent females in two corpora (1996)
the kappa statistic (1996)
30 Linguistic and interactional features of Internet Relay Chat (1996)
New evaluation methods for word-sense disambiguation (1997)
32 Qualification and certainty in L1 and L2 students' writing (1997)
33 Analysing and predicting patterns of DAMSL utterance tags (1998)
34 Assessing claims about language use with corpus data: swearing and abuse (1998)
35 The syntax of disfluency in spontaneous spoken language (1998)
36 The use of large text corpora for evaluating text-to-speech systems (1998)
how much of the underlying syntactic structure can be tagged automatically? (1999)
38 Reflections of a dendrographer (1999)
39 A generic approach to software support for linguistic annotation using XML (2000)
40 Europe's ignored languages (2001)
41 Semi-automatic tagging of intonation in French spoken corpora (2001)
42 Web as corpus (2001)
43 Intonational variation in the British Isles (2002)
Bibliography
URL List
Index

Common terms and phrases

adjective adverb algorithm analysis annotation attachment automatic bigram British National Corpus Brown Corpus cent chapter COBUILD coders coding collocations communicative constituents constructions context conversations corpora corpus linguistics correct derivation developed dialect dictionary Dimension disambiguation discourse disfluency distribution encoding English epistemic estimate example fiction figure Fiji Fijian formal frequency function genres grammar hackles Indo-Fijians instance intonation labelled language lexical linguistic features mean modal n-gram node noun phrase occur parse tree parser patterns Penn Treebank possible prepositional phrase present probable parse problem processing pronouns prosodic pseudo-clefts relative clauses represent rules sample score segmentation selection semantic prosodies sense sentence sequence SGML shows Somerset speakers speech spoken standard structure style subderivations subtrees syntactic tags tagset transcription translation tree Treebank utterance variation verb words writing written XSLT zero

About the author (2005)

Geoffrey Sampson is a former Professor of Natural Language Computing at the School of Informatics, University of Sussex. He is now a Research Fellow at the University of South Africa.

Diana McCarthy is a Royal Society Dorothy Hodgkin Fellow, in the Department of Informatics at Sussex University.

Bibliographic information

Title	Corpus Linguistics: Readings in a Widening Discipline
Series	Open linguistics series; Corpus linguistics
Authors	Geoffrey Sampson, Diana McCarthy
Edition	reprint
Publisher	A&C Black, 2005
ISBN	1441139370, 9781441139375
Length	542 pages
Subjects	Language Arts & Disciplines / Linguistics / General
