The British National Corpus (BNC) contains a wealth of data about the frequency and distribution of words and phrases in many different registers of English, yet the standard interface offers no explicit way of investigating the semantic relationships between words. WordNet, on the other hand, contains detailed hierarchies of the semantic relations among hundreds of thousands of lexical items, but it has very limited information about the frequency and distribution of these words. My project uses relational databases to join these two resources and to allow advanced semantically-based queries of the BNC. These include queries that show the relative frequency of all of the synonyms for a given word, which hyponyms (more specific words) or meronyms (parts of a whole) of a particular word are more common in the BNC, and all of the specific phrases that express more general semantic concepts.
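As a rough illustration of the kind of join described above, the following sketch ranks the synonyms of a word by corpus frequency. It is not the project's actual schema or code: the table names (`bnc_freq`, `wn_sense`), the column names, and the helper functions are all hypothetical, and the frequency counts are made-up placeholders rather than real BNC figures. It assumes only that word frequencies and WordNet synset memberships have each been loaded into a relational table.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with two toy tables (hypothetical schema)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE bnc_freq (word TEXT PRIMARY KEY, freq INTEGER);
        CREATE TABLE wn_sense (word TEXT, synset_id INTEGER);
        """
    )
    # Placeholder counts for illustration only, NOT real BNC frequencies.
    conn.executemany(
        "INSERT INTO bnc_freq VALUES (?, ?)",
        [("big", 300), ("large", 200), ("sizable", 5), ("small", 250)],
    )
    # Words sharing a synset_id are treated as synonyms.
    conn.executemany(
        "INSERT INTO wn_sense VALUES (?, ?)",
        [("big", 1), ("large", 1), ("sizable", 1), ("small", 2)],
    )
    return conn


def synonym_frequencies(conn, word):
    """Return (synonym, frequency) pairs for `word`, most frequent first."""
    return conn.execute(
        """
        SELECT DISTINCT s2.word, f.freq
        FROM wn_sense AS s1
        JOIN wn_sense AS s2 ON s2.synset_id = s1.synset_id  -- same synset
        JOIN bnc_freq AS f ON f.word = s2.word              -- attach corpus counts
        WHERE s1.word = ? AND s2.word <> ?
        ORDER BY f.freq DESC, s2.word
        """,
        (word, word),
    ).fetchall()


if __name__ == "__main__":
    conn = build_demo_db()
    for synonym, freq in synonym_frequencies(conn, "big"):
        print(synonym, freq)
```

The self-join on `wn_sense` finds every word that shares a synset with the query word, and the join to `bnc_freq` attaches a corpus count to each. Queries over hyponyms or meronyms would presumably follow the same pattern, with an additional table recording relations between synsets.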