Hidden structures in English corpora

In: Corpus Linguistics and Variation in English


Why do young children prefer child-directed speech (CDS) to adult language? One possible reason lies in the specific distribution of phoneme chains in CDS. If so, a quantitative linguistic analysis would contribute substantially to understanding one of the mysteries of first language acquisition, namely word segmentation. The present study undertakes such an attempt. A computer program transcribed 190 texts containing CDS and adult language into IPA, removed all whitespace, and randomly extracted a defined number of n-grams from each text. A logistic regression model was then applied to these phoneme chains. The model makes meaningful predictions about the distribution of phoneme chains. English CDS shows a remarkably different distribution from adult language, which offers an additional cue as to why babies prefer CDS.
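The extraction step described in the abstract can be illustrated with a minimal sketch. It is not the study's program: the function name `extract_ngrams`, the sample string, the values of `n` and `k`, and the choice to sample start positions with replacement are all assumptions made for illustration. The input is assumed to be already transcribed into IPA.

```python
import random

def extract_ngrams(text, n, k, seed=0):
    """Strip all whitespace from `text` and draw k random n-grams.

    Hypothetical helper: start positions are sampled with replacement,
    which is an assumption, not a detail given in the abstract.
    """
    # str.split() with no argument splits on any whitespace run,
    # so joining the pieces removes spaces, tabs, and newlines.
    chain = "".join(text.split())
    if len(chain) < n:
        return []
    rng = random.Random(seed)  # fixed seed for reproducible draws
    starts = [rng.randrange(len(chain) - n + 1) for _ in range(k)]
    return [chain[s:s + n] for s in starts]

# Hypothetical IPA-transcribed sample, not taken from the study's corpus.
sample = "ðə kæt sæt ɒn ðə mæt"
ngrams = extract_ngrams(sample, n=3, k=5)
print(ngrams)
```

Because whitespace is removed first, an extracted n-gram may span what was originally a word boundary, which is what makes such chains relevant to the word segmentation question. Python indexes strings by code point, so transcriptions that use combining diacritics would need extra handling to keep each phoneme symbol intact.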

