
How many words to speak?

 Language Learning Forum : General discussion
Serpent
Octoglot
Senior Member
Russian Federation
serpent-849.livejour
Joined 5080 days ago

9753 posts - 15777 votes 
4 sounds
Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese
Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish

 
 Message 49 of 309
01 September 2014 at 8:40pm | IP Logged 
s_allard wrote:
Two words a bit technical are myope (myopic) and astigmate (astigmatic) that are easily understood.

Likinäköisyys and hajataittoisuus (okay, also astigmatismi) in Finnish, btw. I'm interested in medicine but even I wouldn't recall them from memory. I don't think I would've understood them before I was a strong B1/weak B2.
2 persons have voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 3913 days ago

2704 posts - 5424 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 50 of 309
01 September 2014 at 9:12pm | IP Logged 
Just a short word about that site France bienvenue. This is a terrific example of that aggregate vocabulary sampling
effect when measuring vocabulary size. Each of the short conversations is a mixture of very common words and
some specific vocabulary. As a matter of fact, each conversation has annotations aimed at explaining details to
foreign readers. Any given conversation uses a very small set of different words but when you add them all up, the
set of different words increases considerably.
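
The aggregate effect described above can be sketched in a few lines of Python. This is only an illustration: the three "conversations" are invented sentences, not taken from France bienvenue, and the helper name `distinct_words` is hypothetical. It counts surface forms with no lemmatization, so inflected forms count separately.

```python
import re

def distinct_words(text):
    """Return the set of lowercased word forms in a text (no lemmatization)."""
    # [^\W\d_]+ matches runs of Unicode letters, so accented French letters are kept.
    return set(re.findall(r"[^\W\d_]+", text.lower()))

# Invented sample sentences standing in for short conversations.
conversations = [
    "je vais au marché avec ma mère",
    "je vais au travail avec mon vélo",
    "ma mère aime le marché et le vélo",
]

per_text = [distinct_words(c) for c in conversations]
sizes = [len(words) for words in per_text]   # distinct words in each conversation
combined = set().union(*per_text)            # distinct words across all of them

print(sizes)          # [7, 7, 7]
print(len(combined))  # 13
```

Each text stays at 7 distinct words, yet the combined set already holds 13, and it keeps growing as more texts are added.
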
2 persons have voted this message useful



Serpent
Octoglot
Senior Member
Russian Federation
serpent-849.livejour
Joined 5080 days ago

9753 posts - 15777 votes 
4 sounds
Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese
Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish

 
 Message 51 of 309
01 September 2014 at 9:44pm | IP Logged 
s_allard wrote:
Any given conversation uses a very small set of different words but when you add them all up, the set of different words increases considerably.
Amen.
4 persons have voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 3913 days ago

2704 posts - 5424 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 52 of 309
02 September 2014 at 1:43am | IP Logged 
The CEFR does not give specific vocabulary sizes for each of its levels. It does however publish Oral Assessment
Criteria under the headings: Range, Accuracy, Fluency, Interaction and Coherence. This is basically what I had
mentioned earlier. I won't publish all the detailed descriptors for C2 in French, which can be found on the Internet,
but here are two criteria:

RANGE
Shows great flexibility in reformulating ideas in differing linguistic forms to convey finer shades of meaning
precisely, to give emphasis, to differentiate and eliminate ambiguity. Also has good command of idiomatic
expressions and colloquialisms.

FLUENCY
Can express him/herself spontaneously at length with a natural colloquial flow, avoiding or backtracking around
any difficulty so smoothly that the interlocutor is hardly aware of it.

The question for us is how does all this relate to active vocabulary size. In passing, for me active vocabulary is
the set of the words the speaker has actually used previously. How many words do you need to meet these
criteria? 300 or 600 or 1,000 or 2,000 or 5,000 or 10,000 or 20,000?

Let's eliminate 300 because people get very angry when they see that figure. I will go for 600 to 1,000 as a
minimum. 2,000 and 5,000 are certainly good. The others are overkill.

Once the shrieking over the numbers 600 or 1,000 has settled down, let me explain my reasoning. It's pretty
simple. The first 500 words of French are highly polysemic and therefore can be used in very many different
ways. This core group of words is very versatile and rich in possibilities. After that the words become
progressively monosemic and less frequent. Beyond 1,000 the frequency is pretty much the same for all the
words.

My take on this is great performance can be achieved with excellent mastery of the core vocabulary combined
with grammatical accuracy, speaking fluency, interaction and coherence. This is what native speakers do. When I
see real conversations or transcriptions of speech, I don't see a ton of complex or rare vocabulary. Rather, I see
lots of high frequency words used in many different ways and tiny numbers of technical terms.

This is important in terms of your preparation strategy. How do you best use your time? Do you spend four hours
a day flipping through your 10,000-card Anki deck? Do you read newspapers and books six hours a day? Or
do you spend two hours a day with a tutor going over sample questions? I have nothing against Anki, though maybe not
four hours a day. Maybe more like one hour a day. Lots of reading, of course.

Above all, you want to practice talking with and getting correction from a professional tutor. You know what the
assessment criteria are. So you work along those lines. You know that idiomatic expressions are important, so
you work on that. Most of the time these expressions use very ordinary words, but you have to know how to use
them. You identify your weak points of grammar and drill them to death, maybe with Anki. And you practice
interaction, how to take turns, interrupt, engage with your interlocutors.

As you can see there is a lot more to worry about besides vocabulary size. Let me add that if you feel more
comfortable with 2,000 or 5,000 words, then go for it.

Edited by s_allard on 02 September 2014 at 3:29pm

1 person has voted this message useful



robarb
Nonaglot
Senior Member
United States
languagenpluson
Joined 3542 days ago

361 posts - 921 votes 
Speaks: Portuguese, English*, German, Italian, Spanish, Dutch, Swedish, Esperanto, French
Studies: Mandarin, Danish, Russian, Norwegian, Cantonese, Japanese, Korean, Polish, Greek, Latin, Nepali, Modern Hebrew

 
 Message 53 of 309
02 September 2014 at 5:58am | IP Logged 
s_allard wrote:

Let's eliminate 300 because people get very angry when they see that figure. I will go for 600 to 1,000 as a
minimum. 2,000 and 5,000 are certainly good. The others are overkill.

Once the shrieking over the numbers 600 or 1,000 has settled down, let me explain my reasoning. It's pretty
simple. The first 500 words of French are highly polysemic and therefore can be used in very many different
ways. This core group of words is very versatile and rich in possibilities. After that the words become
progressively monosemic and less frequent. Beyond 1,000 the frequency is pretty much the same for all the
words.

My take on this is great performance can be achieved with excellent mastery of the core vocabulary combined
with grammatical accuracy, speaking fluency, interaction and coherence. This is what native speakers do. When I
see real conversations or transcriptions of speech, I don't see a ton of complex or rare vocabulary. Rather, I see
lots of high frequency words used in many different ways and tiny numbers of technical terms.


Yes, 2000 active words may be enough for conversation about everyday topics, and 5000 may be enough for
conversation in general, if you have grammatical accuracy, fluency, interaction, and coherence, and you can
understand what your conversation partner says to you. Something like this situation actually occurs quite often
with heritage language speakers, who have no problem with the phonology and grammar and converse quite
fluently about a limited range of everyday topics, but have a small vocabulary because they have no reading,
school or work experience with the language. However, from the adult language learner's perspective it's almost
a moot point, because it's hard to imagine how you could reach that level of accuracy, fluency, interaction, and
coherence without also doing enough reading and listening to get your vocabulary well above 5000 along the
way. Unless, that is, you actively try to learn only basic vocabulary, to prove the point. In which case I think it
could be done.

Edited by robarb on 02 September 2014 at 5:58am

4 persons have voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 3913 days ago

2704 posts - 5424 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 54 of 309
02 September 2014 at 7:05am | IP Logged 
Much of this confusion about vocabulary size comes from the fact that there is some debate, even within the scientific
community, as to what is exactly passive and active or receptive and productive vocabulary. And then there is
the whole issue of how to assess them.

Passive or receptive vocabulary is not a big problem. The idea is that such vocabulary is the set of words that you
recognize in context and understand. Defining active or productive vocabulary is more complicated because we
have to decide whether we are looking at the words you could use, because they are readily available to you, or
those words that you have actually used recently or use regularly.

Most of the studies of student vocabulary size, like those of Paul Nation, look at receptive vocabulary. All those
figures about people's vocabulary size refer to this kind of vocabulary.

Counting what people actually use is very difficult. There is a very interesting study of actual student productive
vocabulary size on the IELTS tests by Hu and Nation.

What we all do know is that none of us speak like the books and newspapers that we read. The number of
different words that we actually use in our daily lives is tiny compared to the many words we can read or hear.
Unless you are a journalist, a legal professional or a university teacher, your active vocabulary is actually quite
small.

I personally define productive vocabulary as those different words that I have used in the past five years. There
are many words that I may read daily but never use. For example, I have never in my life used the verb "to woo"
although I know what it means.

The interesting thing is that people will use all kinds of figures about their vocabulary size without having ever
attempted to measure their own productive vocabularies. They take some dodgy internet vocabulary size test and
then proclaim that they have a vocabulary of 40,000 words in English.

Measuring productive vocabulary size is very difficult. I invite people to try. It's a project I have for my retirement
years. What we do know is that, depending on profession and social activities, active vocabularies range around
500 to 3,000 words in English. For example, I estimate that my own productive vocabulary in English here at
HTLAL must be around 2,000 to 2,500 distinct words. I didn't count them; heavens no. I simply said that I must
come in somewhere under Iversen who is the only person I know around here who has done such a count.

When I talk about 300 words in French as a starting point, I explicitly mean 300 words of productive vocabulary.
People see that figure and go crazy because they have read that you need 10,000 words to read a simple book.
When I say that with 500 to 1,000 words you have a very solid base in French, I'm really not that far off from what
we know about real productive vocabularies. Although I am mixing up figures here for French and English, an
active vocabulary size of 2,000 is very respectable.

In passing, I should point out that there seems to be no academic interest in French or Spanish for vocabulary
size studies along the lines of what Paul Nation has done and continues doing. Here in Quebec no linguists are
working in this area. I don't know of anyone in France looking at this. I suspect it has something to do with the
structure of the French language.



Edited by s_allard on 02 September 2014 at 7:15am

2 persons have voted this message useful



robarb
Nonaglot
Senior Member
United States
languagenpluson
Joined 3542 days ago

361 posts - 921 votes 
Speaks: Portuguese, English*, German, Italian, Spanish, Dutch, Swedish, Esperanto, French
Studies: Mandarin, Danish, Russian, Norwegian, Cantonese, Japanese, Korean, Polish, Greek, Latin, Nepali, Modern Hebrew

 
 Message 55 of 309
02 September 2014 at 7:56am | IP Logged 
s_allard wrote:

Measuring productive vocabulary size is very difficult. I invite people to try. It's a project I have for my retirement
years. What we do know is that, depending on profession and social activities, active vocabularies range around
500 to 3,000 words in English. For example, I estimate that my own productive vocabulary in English here at
HTLAL must be around 2,000 to 2,500 distinct words. I didn't count them; heavens no. I simply said that I must
come in somewhere under Iversen who is the only person I know around here who has done such a count.


While I agree with the sentiment of your post, your figures are simply off. Even if you count only the words a
person actually has used in the past five years, ignoring the ones they could use but haven't recently or haven't
yet, and not double-counting inflections, native speakers simply use far, far more than 500-3000 unique words.

Here is a link to a list of the 5000 most frequent word stems in an English corpus.
The corpus contains a lot of texts that aren't representative of a typical speaker
('federal' appears before 'thank'), but most of the 5000 are words most people use. Above the 3000 mark are
such commonplace words as 'store', 'meter', 'fifty', 'organic', 'cow', 'loud', 'helicopter', 'crash', 'awful', 'boyfriend',
'rip', 'mud', 'guitar', 'pork', and 'stereotype.'

Yes, people don't actively use all of the tens of thousands of words in their passive vocabularies. No, the range of
active vocabularies is not anywhere near as low as 500-3000. I do agree it's hard to measure, so I won't give a
speculative estimate of what I think it is. What's not as hard to do is provide a lower bound for a person's active
vocabulary: just take a sample of their speech or writing and count the unique words. I guarantee you, it does
not take a large sample to get to 1000. And for every word used frequently enough to get in that sample, there
are many rare words.
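
The lower-bound count described above could look roughly like this in Python. It is a minimal sketch under stated assumptions: the sample sentence is invented, the name `word_counts` is hypothetical, and the code counts surface forms rather than stems, so inflections are counted separately unless you add a stemmer or lemmatizer.

```python
import re
from collections import Counter

def word_counts(sample):
    """Count each lowercased word form in a sample of speech or writing."""
    # Surface forms only: 'walk' and 'walked' would be counted as two words.
    return Counter(re.findall(r"[^\W\d_]+", sample.lower()))

# Invented toy sample; a real test would use a transcript or a batch of posts.
sample = "the cat sat on the mat and the dog sat on the cat"

counts = word_counts(sample)
total_tokens = sum(counts.values())   # running words in the sample
distinct = len(counts)                # lower bound on words this person uses
used_once = sorted(w for w, n in counts.items() if n == 1)

print(total_tokens, distinct)  # 13 7
print(used_once)               # ['and', 'dog', 'mat']
```

The words that appear only once hint at how many rarer words a bigger sample would still turn up.
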

As an illustration of this concept, xkcd created a text editor that only
allows you to use the most common 'ten hundred' words. (Oops, can't say 'thousand'). (Oops, can't say 'oops'.)

From a language learner's perspective, 1000 words is simply not enough to say anything the normal way,
although you can get almost any idea across with much circumlocution. 3000 words is probably enough to sound
pretty normal. For most speakers, a little idiomatic variety and the occasional need to talk about something
unusual pushes it to somewhat more than that. For writers, professors, journalists, people who like to debate,
people who use technical language, or people like me who pepper their writing with the occasional word that's a
tad uppity, it's much, much more.

Edited by robarb on 02 September 2014 at 9:05am

6 persons have voted this message useful





Iversen
Super Polyglot
Moderator
Denmark
berejst.dk
Joined 5186 days ago

9078 posts - 16471 votes 
Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan
Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian
Personal Language Map

 
 Message 56 of 309
02 September 2014 at 10:09am | IP Logged 
There are a few studies that give figures for USED vocabulary (for famous authors, rap artists and - in all modesty - for me), but there aren't any - and probably can't be any - that give figures for active vocabulary. And for a good reason: active vocabulary is something potential, and it is more than doubtful whether there really is such a thing floating around inside the passive vocabulary.

My own experiments suggest that the number of words used in a 'strong' language follows a linear curve when measured against sample size, with no signs of abating even up to 70,000 words. However, you would expect the curve to start bending with really big samples, and if it reaches some kind of plateau below the size of the active vocabulary then you might claim that this defines the size of that person's active vocabulary. But I have not seen a single study that is big enough to prove this even for one person, let alone a statistically significant sample of a native population.
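
For anyone who wants to plot such a curve for their own writing, here is a rough Python sketch. It is not the procedure used for the counts mentioned above, just one simple way to do it: the function name `growth_curve`, the `step` parameter and the toy text are all assumptions made for illustration, and words are counted as surface forms.

```python
import re

def growth_curve(text, step):
    """Return (tokens_seen, distinct_words_so_far) pairs every `step` tokens."""
    tokens = re.findall(r"[^\W\d_]+", text.lower())
    seen = set()
    curve = []
    for i, token in enumerate(tokens, start=1):
        seen.add(token)
        # Record a point at each checkpoint and at the very end of the sample.
        if i % step == 0 or i == len(tokens):
            curve.append((i, len(seen)))
    return curve

# Toy text with single letters standing in for words.
text = "a b c a b d a e a b"
curve = growth_curve(text, step=5)
print(curve)  # [(5, 3), (10, 5)]
```

With a real corpus you would use a much larger `step`. A plateau would show up as the second number no longer rising from one checkpoint to the next.
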

I haven't seen any studies based on 'weak' languages either, and I have written too little in digital media in other languages to do the kind of study I have done for English, except maybe in Danish (thanks to my travelogues). But since Danish is my native language, a study of my Danish outpourings wouldn't say anything about the shape of the curve in a weak language.

My own hunch is that it is wrong to define 'active' status as a binary characteristic (active or not active). A word can in practice be more or less easy to recall depending on the context, the words you have been exposed to recently and the bundle of associations you have formed for that word. And there is absolutely no way you can take those things into account in scientific experiments. Add to that the 'guessability', which already serves to blur discussions about passive vocabulary, and you have got something no sane scientist would choose to wager his/her career on.

With passive vocabulary you can at least present a number of words or expressions to a test person and check whether that person has even the foggiest notion of their meaning. But that method would not work for active vocabulary. OK, if I read a couple of pages in a dictionary I may ask myself whether I would be likely to remember each of those words in a relevant situation, but that would be a purely subjective and very loose estimate - and probably also wrong. You can't even present sentences with one missing word, because even a slight change in the wording would change the probability that the test person would come up with the word you expect. And a tired test person would almost certainly perform less well than the same person in tip-top form.

In that situation we can just as well start to assign percentages. And there is one rule I do believe in: the words you know in a strong language are more likely to be active than the words you 'know' in a weak language. But it is absurd to attach specific percentages to these estimates, so therefore we can just as well start doing it.

Edited by Iversen on 02 September 2014 at 10:24am



4 persons have voted this message useful



Copyright 2020 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.