Serpent Octoglot Senior Member Russian Federation serpent-849.livejour Joined 6595 days ago 9753 posts - 15779 votes 4 sounds Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish
Message 105 of 319 | 13 April 2014 at 12:45pm
s_allard wrote:
When the going gets tough, the tough get going.
To me this is just more proof that it's pointless to learn advanced usage of basic words when you can barely speak. Such a sentence will make so much more sense if you know enough words to read native materials.
3 persons have voted this message useful
Solfrid Cristin Heptaglot Winner TAC 2011 & 2012 Senior Member Norway Joined 5332 days ago 4143 posts - 8864 votes Speaks: Norwegian*, Spanish, Swedish, French, English, German, Italian Studies: Russian
Message 106 of 319 | 13 April 2014 at 12:48pm
Living in a foreign bubble can indeed be a problem. I took a course at the Sorbonne once, and made lots of friends, but none of them were native speakers. French girls wouldn't talk to us, and in Paris in July it was actually hard to find native French-speaking guys our age.
The successful immersion situations I have had, which fortunately are in the majority, have been the ones where I thought outside the box, avoided any contact with non-native speakers like the plague, and only dealt with the locals.
The "immersion" you get from living in an all-foreigner student dorm, attending a course for foreigners, and having only friends who are foreigners (to the local country) is negligible.
In my opinion, instead of spending money on a course, one would learn better by living with a local family and trying to get as many native friends as possible.
4 persons have voted this message useful
s_allard Triglot Senior Member Canada Joined 5428 days ago 2704 posts - 5425 votes Speaks: French*, English, Spanish Studies: Polish
Message 107 of 319 | 13 April 2014 at 3:52pm
Elexi wrote:
As we are discussing only a summary paper by Professor James Milton, and we are generally not university-level linguists, it seems fair that we should look at a fuller paper where he explains his working suppositions, including the importance of vocabulary size to language comprehension:
http://eurosla.org/monographs/EM01/211-232Milton.pdf
This is a very good paper. Of particular interest are the figures for vocabulary size relative to CEFR levels. Table 5, for English, shows that for C1 the associated vocabulary size is 3750-4500 words. For C2, it's 4500-5000, although it should be pointed out that the maximum allowed by the software used was 5000.
Table 6 is also very interesting because it looks at vocabulary sizes across different languages for students from different countries. For example, students from Greece taking the C2 French exam have a mean vocabulary size of 3525.
What is interesting here is that these figures are a far cry from all those figures of 10,000 and 20,000 words that are bandied about here for C-level candidates. It would seem that, according to Milton, a vocabulary in the 4000-word range would do very well for French.
To me this is not surprising, of course. That said, I still believe that the importance of vocabulary size is way overblown. Vocabulary matters, but, in my opinion, measuring it for foreign language learning purposes is rather useless.
2 persons have voted this message useful
emk Diglot Moderator United States Joined 5530 days ago 2615 posts - 8806 votes Speaks: English*, FrenchB2 Studies: Spanish, Ancient Egyptian Personal Language Map
Message 108 of 319 | 13 April 2014 at 4:51pm
s_allard wrote:
This is a very good paper. Of particular interest are the figures for vocabulary size relative to CEFR levels. Table 5, for English, shows that for C1 the associated vocabulary size is 3750-4500 words. For C2, it's 4500-5000, although it should be pointed out that the maximum allowed by the software used was 5000.
I've written about Milton's paper several times here on HTLAL (IIRC), and I've previously noted that I have several problems with his methodology. My biggest problem, of course, is that he used a 5,000-word vocabulary list, which means that his paper can't tell you anything at all about vocabulary beyond 5,000 words.
One of my goals is to be able to read French comfortably, with excellent understanding. Among other things, I'd like to be able to go at least 2 pages before running into a genuinely puzzling word. Assuming 250 words per page, that means my goal is 99.8% comprehension (one unknown word in 500 running words) of anything you stick in front of me.
Now, it turns out that I need a surprisingly large passive vocabulary to reach 99.8% coverage of a text. See this chart by Nation and Waring, which goes up to 97.8%:
Quote:
Table 1: Vocabulary size and text coverage in the Brown corpus
Vocabulary size | Text coverage
1000 | 72.0%
2000 | 79.7%
3000 | 84.0%
4000 | 86.8%
5000 | 88.7%
6000 | 89.9%
15,851 | 97.8%
Let's look at the 6,000-word vocabulary. This gets you 89.9% coverage of the Brown corpus, or about 9 words in 10. Or to put it another way, if we assume a typical sentence is about 10 words, a student with a 6,000-word vocabulary is going to have to guess an average of one word per sentence, or 25 words on a typical 250-word page. Personally, that feels like somewhere between B1 and a low B2 at best; it's hard to answer complicated questions about a page with 25 unknown words.
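Running the same arithmetic over the whole chart takes only a few lines; here is a quick back-of-the-envelope sketch in Python, assuming the same 250-word page and the coverage figures quoted above:

    # Back-of-the-envelope: expected unknown words per 250-word page,
    # using the Nation and Waring coverage figures quoted above.
    coverage_by_vocab = {1000: 0.720, 2000: 0.797, 3000: 0.840,
                         4000: 0.868, 5000: 0.887, 6000: 0.899, 15851: 0.978}
    for vocab, cov in sorted(coverage_by_vocab.items()):
        unknown_per_page = (1 - cov) * 250
        print(f"{vocab:>6} known words -> roughly {unknown_per_page:.0f} unknown words per page")

Even at the top of the chart, 97.8% coverage still works out to five or six unknown words on every 250-word page.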
Now, I'm not saying I trust Nation and Waring's numbers. There are a lot of decisions you have to make when counting words (How do you handle proper names? How do you group related words? etc.). But nonetheless, their numbers suggest that you need a pretty big vocabulary to reach 98% coverage.
Anyway, if I had some spare time, I'd love to build a little website where people could paste in French text, choose a vocabulary size, and see how many words they would be able to understand. But not this week, sadly.
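For what it's worth, the core calculation behind such a site is tiny. Here is a rough Python sketch, assuming you already have a frequency-ranked word list with one word per line, most frequent first; the file names and the tokenizing regex are just placeholders, and it ignores lemmatization and proper names entirely:

    # Rough sketch of the coverage calculation such a tool would need.
    # Assumes a frequency list with one lowercase word per line, most frequent first.
    import re
    import sys

    def load_frequency_list(path, vocab_size):
        """Return the top vocab_size entries of a frequency-ranked word list."""
        with open(path, encoding="utf-8") as f:
            words = [line.strip().lower() for line in f if line.strip()]
        return set(words[:vocab_size])

    def token_coverage(text, known_words):
        """Fraction of running words (tokens) in the text that are known."""
        tokens = re.findall(r"[a-zàâäçéèêëîïôöûùüÿœæ'-]+", text.lower())
        if not tokens:
            return 0.0
        return sum(1 for t in tokens if t in known_words) / len(tokens)

    if __name__ == "__main__":
        # Usage: python coverage.py french_frequency_list.txt 5000 < sample.txt
        known = load_frequency_list(sys.argv[1], int(sys.argv[2]))
        print(f"Token coverage: {token_coverage(sys.stdin.read(), known):.1%}")

Run against a plain-text chapter, it would print a single token-coverage percentage for the chosen vocabulary size; the hard part, as noted above, is deciding how to handle derived forms and proper names.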
I'd also love to know whether people are really passing C2 exams with only 88.7% comprehension of typical L2 text, as suggested by the chart above, and by Milton's 4500–5000 word estimate for typical C2 students. Obviously, the two counting methodologies were different, so I can't really compare the numbers. But even so, maybe it's time I go take another exam. ;-)
And of course, I don't think that students ought to sit around with 10,000 flash cards. In fact, they shouldn't even worry much about vocabulary past B2; extensive reading will take care of most of it. And I agree that vocabulary isn't the most important part of language learning. But no matter how good my grammar, or how well I master every nuance of the verb faire, at some point I actually need to learn a big pile of words, somehow, even if only by osmosis. It's hard to progress beyond a certain point if I'm still guessing the meaning of 1 out of every 10 words in the typical text.
Edited by emk on 13 April 2014 at 5:03pm
4 persons have voted this message useful
patrickwilken Senior Member Germany radiant-flux.net Joined 4531 days ago 1546 posts - 3200 votes Studies: German
Message 109 of 319 | 13 April 2014 at 6:38pm
This 2012 paper has some interesting ideas regarding sufficient breadth for an L2 vocabulary:
A reassessment of frequency and vocabulary size in L2 vocabulary teaching
Norbert Schmitt & Diane Schmitt
The high-frequency vocabulary of English has traditionally been thought to consist of the 2,000 most frequent word families, and low-frequency vocabulary as that beyond the 10,000 frequency level. This paper argues that these boundaries should be reassessed on pedagogic grounds. Based on a number of perspectives (including frequency and acquisition studies, the amount of vocabulary necessary for English usage, the range of graded readers, and dictionary defining vocabulary), we argue that high-frequency English vocabulary should include the most frequent 3,000 word families. We also propose that the low-frequency vocabulary boundary should be lowered to the 9,000 level, on the basis that 8–9,000 word families are sufficient to provide the lexical resources necessary to be able to read a wide range of authentic texts (Nation 2006). We label the vocabulary between high-frequency (3,000) and low-frequency (9,000+) as MID-FREQUENCY vocabulary. We illustrate the necessity of mid-frequency vocabulary for proficient language use, and make some initial suggestions for research addressing the pedagogical challenge raised by mid-frequency vocabulary.
http://www.norbertschmitt.co.uk/#untitled41
A relevant paragraph from this paper:
While we agree with the cost/benefit approach, we feel that recent research has made the four-part categorization [where language is divided up into 2,000 high-frequency words, lots of low-frequency words, plus specialized vocabulary for academic and technical disciplines - the implication being that only the high-frequency and possibly the academic/technical vocabulary should be explicitly taught in language schools] untenable as a pedagogic description. The key evidence is a more recent study by Nation (2006), in which he uses a solely frequency-based approach instead of the four-part categorization. In it, he calculates that a reader needs knowledge of 8–9,000 word families to read a diverse range of authentic texts in English without unknown vocabulary being a substantial handicap. This vocabulary size takes us far beyond high-frequency vocabulary; in fact it takes us beyond current definitions of high-frequency, academic and technical vocabulary combined. If it takes this much vocabulary for proficient English use, there clearly needs to be a focus on vocabulary beyond that covered by the high-frequency, academic and technical categories.
Edited by patrickwilken on 13 April 2014 at 6:55pm
3 persons have voted this message useful
luke Diglot Senior Member United States Joined 7203 days ago 3133 posts - 4351 votes Speaks: English*, Spanish Studies: Esperanto, French
Message 110 of 319 | 13 April 2014 at 6:58pm
I don't think it's accurate to say that knowing 90% of the words means "one word in ten" will be unknown. The 10% of unknown words will be the less frequently used words in the text. That is, it could be that only 4 or 5 words per page are unknown. E.g., when you count up all the unknown words in a text, they will be 10% of the lemmas, but could be less than 1% of the actual text.
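To make the distinction concrete, here is a toy example in Python; the sentence and the "known" set are invented purely for illustration:

    # Toy illustration: unknown share of distinct words (types/lemmas)
    # versus unknown share of running words (tokens) in the same text.
    import re

    text = ("the cat sat on the mat and the cat looked at the mat "
            "until the perspicacious cat yawned")
    known = {"the", "cat", "sat", "on", "mat", "and", "looked",
             "at", "until", "yawned"}

    tokens = re.findall(r"[a-z]+", text.lower())
    types = set(tokens)

    unknown_types = 1 - len(types & known) / len(types)
    unknown_tokens = 1 - sum(t in known for t in tokens) / len(tokens)

    print(f"Unknown types:  {unknown_types:.0%}")   # about 9% of distinct words
    print(f"Unknown tokens: {unknown_tokens:.0%}")  # about 6% of the running text

In a real book the gap is much wider, because a handful of very frequent words accounts for most of the running text.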
Edited by luke on 13 April 2014 at 7:01pm
2 persons have voted this message useful
emk Diglot Moderator United States Joined 5530 days ago 2615 posts - 8806 votes Speaks: English*, FrenchB2 Studies: Spanish, Ancient Egyptian Personal Language Map
Message 111 of 319 | 13 April 2014 at 7:16pm
luke wrote:
I don't think it's accurate to say that knowing 90% of the words means "one word in ten" will be unknown. The 10% of unknown words will be the less frequently used words in the text. That is, it could be that only 4 or 5 words per page are unknown. E.g., when you count up all the unknown words in a text, they will be 10% of the lemmas, but could be less than 1% of the actual text.
According to the paper, the table above was for "coverage" of the text, and they specifically explain what that means:
Quote:
With a vocabulary size of 2,000 words, a learner knows 80% of the words in a text which means that 1 word in every 5 (approximately 2 words in every line) are unknown.
So 80% means 1 word in 5 is unknown, and thus 90% means 1 word in 10 is unknown. (They confirm this by saying 2 words in every line are unknown, and many book formats have approximately 8–14 words per line, so that checks out.) As I said, I don't especially trust their numbers—I'm not sure they cared about headwords versus derived forms, or about proper names when they did the count, which makes a difference. But I've seen other papers with similar tables, and they all say much the same thing: if you want to get below a few unknown words per page, you're going to need a pretty big vocabulary.
Edited by emk on 13 April 2014 at 7:20pm
4 persons have voted this message useful
patrickwilken Senior Member Germany radiant-flux.net Joined 4531 days ago 1546 posts - 3200 votes Studies: German
Message 112 of 319 | 13 April 2014 at 7:24pm
Going a bit further into this paper:
Perhaps the best way of discussing mid-frequency vocabulary is by giving examples and explaining how mid-frequency vocabulary relates to language use. The list below exemplifies the type of words at each 1,000 level in the mid-frequency band:
3,001–4,000: academic, consist, exploit, rapid, vocabulary
4,001–5,000: agricultural, contemporary, dense, insight, particle
5,001–6,000: cumulative, default, penguin, rigorous, schoolchildren
6,001–7,000: axis, comprehension, peripheral, sinister, taper
7,001–8,000: authentic, conversely, latitude, mediation, undergraduate
8,001–9,000: anthropology, fruitful, hypothesis, semester, virulent
It seems hard to argue that these sorts of mid-frequency words are not essential for accessing authentic texts.
I am willing to accept that at C1 your vocabulary could be under 5,000 word groups, but I really find it hard to believe that someone at a solid C2 level wouldn't know a substantial part of the mid-frequency vocabulary as defined in this paper.
Edited by patrickwilken on 13 April 2014 at 7:59pm
3 persons have voted this message useful