Serpent Octoglot Senior Member Russian Federation serpent-849.livejour Joined 6595 days ago 9753 posts - 15779 votes 4 sounds Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish
| Message 113 of 319 13 April 2014 at 7:59pm | IP Logged |
Nobody is arguing against that. But reaching C2 requires knowing these sorts of words, whether or not you get them for free, and whether or not you can cheat your way through an exam without them.
Edited by Serpent on 13 April 2014 at 8:00pm
2 persons have voted this message useful
| patrickwilken Senior Member Germany radiant-flux.net Joined 4531 days ago 1546 posts - 3200 votes Studies: German
| Message 114 of 319 13 April 2014 at 8:04pm | IP Logged |
Serpent wrote:
Nobody is arguing against that. But reaching C2 requires knowing these sorts of words, whether or not you get them for free, and whether or not you can cheat your way through an exam without them.
I thought the paper by Milton suggested a range of about 4000-5000 words for C2?
EDIT: I meant 4k-5k word GROUPS.
Edited by patrickwilken on 13 April 2014 at 10:38pm
1 person has voted this message useful
| daegga Tetraglot Senior Member Austria lang-8.com/553301 Joined 4519 days ago 1076 posts - 1792 votes Speaks: German*, EnglishC2, Swedish, Norwegian Studies: Danish, French, Finnish, Icelandic
| Message 115 of 319 13 April 2014 at 9:44pm | IP Logged |
patrickwilken wrote:
Serpent wrote:
Nobody is arguing against that. But reaching C2 requires knowing these sorts of words, whether or not you get them for free, and whether or not you can cheat your way through an exam without them.
I thought the paper by Milton suggested a range of about 4000-5000 words for C2?
Milton suggests an XLex score of 4000-5000. As vocabulary development is not linear, this points to a much larger overall vocabulary.
http://www.readingmatrix.com/articles/sept_2009/eldridge_neufeld.pdf (page 226)
2 persons have voted this message useful
| luke Diglot Senior Member United States Joined 7203 days ago 3133 posts - 4351 votes Speaks: English*, Spanish Studies: Esperanto, French
| Message 116 of 319 13 April 2014 at 10:02pm | IP Logged |
emk wrote:
Or to put it another way, if we assume a typical sentence is about 10 words, a student with a 5,000 word vocabulary is going to have to guess an average of one word per sentence, or 25 words on a typical 250-word page.
Don't get me wrong. I liked your post and everything. You are definitely one of the most valuable contributors
as far as I'm concerned. It was the bit above that I was referring to.
If knowing, let's say, all the words in the 5000 word French Frequency Dictionary means you'll still miss 25
words on a "typical" 250-word page, I'm missing something.
What I got out of the article was that those vocabulary counts had to do with the top N words in the frequency
dictionary. I also inferred that the students in the studies were not getting those top words from an SRS;
that is, they were learning them in the course of their studies. Some could have been supplementing with an
SRS or flashcards, but that would have been done on an individual basis and was not part of the studies per se.
So, that C2 French student with a vocabulary that included the top 4000-5000 words picked them up in the
course of study and not through the seemingly most direct route (using SRS with a frequency dictionary).
Those C2 students actually have a French vocabulary in the 9000-12000 word range. Many of those are less
frequent words, but relevant and important for filling in the gaps for comprehension as well as production.
I used the word "lemma", and don't recall that in the article. Lemma, as I understand it, would put banker and
bank in the same "word" for counting. The counts they were doing in the studies would not. They would
lump together bank, banks, banked, and will bank, though.
Bottom line for me was: if I'm interested in knowing roughly what CEFR level I'm at, and assuming I can use the
vocabulary with relative ease, vocabulary size is a good indicator. E.g., if I didn't set out to learn
the top 2500 words in the frequency dictionary, but I've learned them and can use them effectively, I may be
at the B level on the CEFR scale. If I can do that with the top 4000-5000 words, I may be at the C2 level.
I know CEFR requires reading, writing, listening, forming and defending arguments, etc. Within that context,
if I can use the vocabulary in the ranges specified above well, I may be approximately ready for a test that
demonstrates my accomplishment.
1 person has voted this message useful
| Medulin Tetraglot Senior Member Croatia Joined 4666 days ago 1199 posts - 2192 votes Speaks: Croatian*, English, Spanish, Portuguese Studies: Norwegian, Hindi, Nepali
| Message 117 of 319 13 April 2014 at 10:15pm | IP Logged |
I'd say C2 means 20 K words* (at least in English, which is comparable to the level of junior high students in the US).
(*word families).
Edited by Medulin on 13 April 2014 at 10:20pm
1 person has voted this message useful
| Jeffers Senior Member United Kingdom Joined 4907 days ago 2151 posts - 3960 votes Speaks: English* Studies: Hindi, Ancient Greek, French, Sanskrit, German
| Message 118 of 319 13 April 2014 at 10:40pm | IP Logged |
luke wrote:
If knowing, let's say, all the words in the 5000 word French Frequency Dictionary means you'll still miss 25 words on a "typical" 250-word page, I'm missing something.
I'm not sure, but I think I see what your difficulty is here. The idea quoted by emk is that a vocabulary of 5000 words gives you 90% coverage of an average text in the Brown corpus, so you would not know 25 out of every 250 words. I guess your question is: what does the 90% refer to? There are two options, vocabulary coverage or text coverage, and they mean quite different things:
1. It could mean 90% of the vocabulary items in the text. In this case, you are right. Those 90% of vocabulary items appear many many times, while the 10% of unknown words probably only appear once or twice each. So an average sentence will still contain 100% known vocabulary.
2. It could mean 90% of the text. In this case you wouldn't know 1 word out of 10 words appearing.
I did an analysis of Hindi texts (totalling a mere 100k words). My result, using the second criterion, was that a vocabulary of the 2500 most common words gives you 90% text coverage. In other words, you would not recognize 1 word out of every 10 you read. 95% text coverage comes from knowing the 3500 most common words. I remember seeing very similar numbers in another study, but I can't cite it. Anyway, for Hindi at least, I think the situation is better than in the article emk quotes.
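To make the difference between the two readings concrete, here is a rough Python sketch. It is only an illustration: the toy sentence is made up, the function names (coverage, top_n, words_needed) are hypothetical, and it counts raw word forms rather than lemmas or word families, so it is not how any of the cited studies did their counting.

```python
from collections import Counter


def coverage(tokens, known):
    """Return (vocabulary coverage, text coverage) for a set of known words.

    Vocabulary coverage = share of DISTINCT words in the text that are known.
    Text coverage       = share of RUNNING words in the text that are known.
    """
    counts = Counter(tokens)
    known_types = sum(1 for word in counts if word in known)
    known_tokens = sum(n for word, n in counts.items() if word in known)
    return known_types / len(counts), known_tokens / len(tokens)


def top_n(tokens, n):
    """The n most frequent distinct words in the text."""
    return {word for word, _ in Counter(tokens).most_common(n)}


def words_needed(tokens, target):
    """Smallest number of top-frequency words whose text coverage reaches target."""
    total = len(tokens)
    covered = 0
    for rank, (_, n) in enumerate(Counter(tokens).most_common(), start=1):
        covered += n
        if covered / total >= target:
            return rank
    return len(set(tokens))


# Toy text (an assumption, purely for illustration): 13 running words, 8 distinct.
tokens = "the cat sat on the mat and the dog sat on the rug".split()

vocab_cov, text_cov = coverage(tokens, top_n(tokens, 3))
print(f"vocabulary coverage: {vocab_cov:.1%}, text coverage: {text_cov:.1%}")
print("words needed for 60% text coverage:", words_needed(tokens, 0.60))
```

Knowing only the 3 most frequent of the 8 distinct words covers 8 of the 13 running words here, so the two percentages diverge even on a tiny text. On a real corpus you would swap in a proper tokenizer and decide whether to count lemmas or word families, which changes the numbers a lot.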
1 person has voted this message useful
| patrickwilken Senior Member Germany radiant-flux.net Joined 4531 days ago 1546 posts - 3200 votes Studies: German
| Message 119 of 319 13 April 2014 at 10:46pm | IP Logged |
Medulin wrote:
I'd say C2 means 20 K words* (at least in English, which is comparable to the level of junior high students in the US).
(*word families).
One last quote from the Schmitt & Schmitt (2012) paper, as it covers this quite well:
A more recent and relevant empirical study is Nation’s (2006) corpus study. He analyzed a range of English authentic texts (novels, newspapers), and calculated that it requires knowledge of the most frequent 8–9,000 word families (+proper nouns) to reach the 98% coverage which is thought to enable efficient reading. It took less vocabulary to cover the spoken corpora at 98% (5–6,000 word families). If 8–9,000 word families is enough to enable both listening to and reading a wide range of texts without being unduly constrained by a lack of vocabulary knowledge, then low-frequency/utility vocabulary can plausibly be defined as anything beyond this frequency level, that is, vocabulary beyond the 9,000 frequency band (9,000+).
Support for Nation’s 8–9,000 word families figure is given by an analysis of the Corpus of Contemporary American English (COCA) (Davies 2008). The 425+ million token COCA is a very large corpus of current American English, with a substantial spoken component (for the following analysis, numerals, words with apostrophes and proper nouns were excluded, leaving 402,646,672 tokens). In terms of size, balance and currency it is now the best corpus of general English in existence. Using Nation’s BNC frequency lists, we find that the most frequent 9,000 word families cover 95.5% of the COCA (Table 3). This means that the most frequent 9,000 word families cover over 95% of a huge amount (400+ million words) of very diverse written and spoken English. The average person would come across much less English than this, and importantly, many fewer different words. Thus the lexical coverage figures would be higher for the amount of language any individual person might be exposed to (Nation 2001b), so Nation’s (2006) 8–9,000 figures are likely to get close to 98% coverage for individual users, especially if numerals and proper nouns are assumed to be known.
The Nation paper is:
Nation, I. S. P. (2006). How large a vocabulary is needed for reading and listening? Canadian Modern Language Review 63.1, 59–82.
https://www.victoria.ac.nz/lals/about/staff/publications/paul-nation/2006-How-large-a-vocab.pdf
Edited by patrickwilken on 13 April 2014 at 10:47pm
1 person has voted this message useful
| Elexi Senior Member United Kingdom Joined 5563 days ago 938 posts - 1840 votes Speaks: English* Studies: French, German, Latin
| Message 120 of 319 13 April 2014 at 11:00pm | IP Logged |
I should add that Prof. Nation's 'big' book is this one (which has a big preview on Amazon):
http://www.amazon.co.uk/Learning-Vocabulary-Language-Cambridge-Linguistics/dp/0521804981/ref=sr_1_1?ie=UTF8&qid=1397422744&sr=8-1&keywords=Paul+Nation
1 person has voted this message useful