Lexical
There are 282 verbs with a lemma frequency of between 300 and 600 in CdP:New,
which are also found in at least two of the three online
dictionaries that we are using to correct the lemma lists. The
following shows how many times these same verbs appear in CdP:Old.
Of the 282 verbs in CdP:New, about 42% have ten tokens or fewer in
CdP:Old, which is too few to say anything reliable about those verbs.
Only 33 of the 282 (about 12%) have 50 tokens or more.
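The percentages above are simple shares of a lemma list. As a minimal sketch, assuming the CdP:Old token counts are available as a dictionary keyed by lemma, they could be computed as follows. The function name and the toy counts are hypothetical and are not taken from the corpus.

```python
def share_by_threshold(old_counts, low=10, high=50):
    """Return (share of lemmas with <= low tokens, share with >= high tokens)."""
    total = len(old_counts)
    if total == 0:
        raise ValueError("no lemmas supplied")
    sparse = sum(1 for n in old_counts.values() if n <= low)
    well_attested = sum(1 for n in old_counts.values() if n >= high)
    return sparse / total, well_attested / total


# Toy counts, invented for illustration only (not real corpus figures).
toy_old_counts = {"verb_a": 0, "verb_b": 7, "verb_c": 10, "verb_d": 23, "verb_e": 64}
sparse_share, attested_share = share_by_threshold(toy_old_counts)
print(f"{sparse_share:.0%} sparse, {attested_share:.0%} well attested")

# The one figure that can be checked from the text itself: 33 of 282 verbs.
print(f"{33 / 282:.1%}")
```

With the real lemma list, the first value would correspond to the "about 42%" figure and the second to the "about 12%" figure.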
Semantic
Without enough tokens of a given word, it is impossible to use collocates
("nearby words") to say much about the word's meaning and usage. As an example,
we have chosen (almost at random) a verb, a noun, an adjective, and an adverb from
CdP:New to show how many different collocates occur with each word in CdP:New
and CdP:Old. A collocate is counted if it occurs at least three times as a lemma
within four words to the left or right of the node word. (You might need to
manually reset the SEC 1 value to just the 1900s for CdP:Old to get the correct
type count.) As we see, CdP:New provides much better data for examining the
meaning and usage of words.
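The collocate criterion described above (at least three occurrences as a lemma, within four words on either side of the node word) can be sketched in a few lines. This is only an illustration that assumes the text is already lemmatized into a flat list. The function name and the toy input are hypothetical, and the corpus interface itself may count collocates differently.

```python
from collections import Counter


def collocate_types(lemmas, node, span=4, min_freq=3):
    """Return {collocate: count} for lemmas occurring at least `min_freq` times
    within `span` positions to the left or right of any occurrence of `node`."""
    counts = Counter()
    for i, lemma in enumerate(lemmas):
        if lemma != node:
            continue
        counts.update(lemmas[max(0, i - span):i])    # left context
        counts.update(lemmas[i + 1:i + 1 + span])    # right context
    # Note: another occurrence of the node inside the window is counted too.
    return {w: n for w, n in counts.items() if n >= min_freq}


# Toy lemma sequence, invented for illustration only.
toy = ["a", "x", "b", "y", "a", "x", "b", "y", "a", "x", "b", "z"]
result = collocate_types(toy, "b", span=1)
print(result, "type count:", len(result))
```

The "type count" mentioned in the text corresponds to the number of keys in the returned dictionary. A word with few tokens rarely has any collocate that reaches the minimum frequency, which is why the smaller corpus yields so few types.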
Syntactic
Because CdP:New is about 50 times as large as the 1900s portion of CdP:Old,
it provides many more tokens for lower-frequency syntactic constructions. The
following shows the number of tokens in the two corpora for a number of
different constructions. (You might need to manually reset the SEC 1 value to
just the 1900s for CdP:Old to get the correct type count.)