s_allard Triglot Senior Member Canada Joined 5430 days ago 2704 posts - 5425 votes Speaks: French*, English, Spanish Studies: Polish
| Message 137 of 210 21 August 2012 at 6:54pm | IP Logged |
emk wrote:
...
Or in tabular form:
Quote:
Range Coverage
1..300 73.45%
1..1000 86.11%
1..2500 92.77%
1..5000 96.17%
1..10000 98.27%
1..20000 99.47%
1..33355 100.00%
So if you want to know 98% of the words before doing extensive TV watching, you'll need
a vocabulary between 5,000 and 10,000 words. Honestly, I think that's a bit high: You
could probably get a lot out of TV while knowing less than 98% of the words. I also
suspect that the numbers are much better for a single TV series, because you will
quickly pick up the vocabulary used by a show.
Interestingly, one of the criteria for C2 is "can understand with ease virtually
everything heard or read". I don't think you could claim this if you knew 5000 words.
Judging from the table above, that would only give you 96.17% of the words in an
average TV show or movie, which is pretty bad. Either 10,000 or 20,000 words looks much
more promising: That would be 98.27% or 99.47%, respectively.
I guess this just goes to show that I'm not convinced by Miller's claim that you can
pass a real C2 exam while only recognizing 4,500 words. I've calculated things about 5
different ways, and I keep seeing indications that C2 should fall somewhere between
10,000 and 20,000 words of passive vocabulary. And this corresponds nicely with typical
vocabulary sizes after several years of full-time immersion, if I remember the research
correctly.
I find these numbers pretty fascinating, because they explain so much about my
listening comprehension and what I need to do to improve it.
The statistics given are typical of all these lexical studies, as Zipf and others pointed out many years ago.
I won't dispute these figures, but what I've always found intriguing is the significance of the differences in the band above 90%. What is the difference, in terms of understanding, between 92% and 98% coverage? It's important to note that the statistics describe coverage of the words used, not how well they are understood.
At 92% you will not recognize 8 words out of 100. At 98% you will not recognize 2 out of 100. How important is this for subtitles that flash across the screen quickly and are accompanied by images? (Or shall I say that the subtitles accompany the images?) Can't we just fill in the blanks a lot of the time?
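To put numbers on the bands above 90%, here is a minimal sketch in Python that turns the coverage figures from the quoted table into unknown words per 100 running words. The function names are my own, and it assumes the unknown words are spread evenly through the dialogue, which real speech does not guarantee.

```python
# Coverage figures copied from the subtitle table quoted above
# (vocabulary size -> percent of running words covered).
COVERAGE = {
    300: 73.45,
    1000: 86.11,
    2500: 92.77,
    5000: 96.17,
    10000: 98.27,
    20000: 99.47,
}

def unknown_per_100(coverage_percent):
    """Unknown running words per 100, assuming they are evenly spread."""
    return round(100.0 - coverage_percent, 2)

def words_between_unknowns(coverage_percent):
    """Average number of running words per unknown word."""
    return 100.0 / (100.0 - coverage_percent)

for size, pct in COVERAGE.items():
    print(f"{size:>6} words: {unknown_per_100(pct):5.2f} unknown per 100, "
          f"about one every {words_between_unknowns(pct):.0f} words")
```

Read this way, the question becomes whether meeting an unknown word roughly every dozen words is much harder to live with than meeting one every fifty or so.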
The reason I bring this up is that we all have certainly noted that we don't always equally understand the same movie dialogs in our native language. A regional accent, a moment of distraction, the sound level and cultural references can all impact on our ability to perceive and understand what was said. Sometimes we don't get a joke. And sometimes we even have to ask, "What did they say?"
So, how important is it to have 98% coverage compared to 92%? Why bother trying to learn 10000 words when 2500 will do the job? And although that 73.45% coverage with 300 words looks enticing, I won't go there.
Can this idea be applied to other areas, such as examinations? Do I have to understand every single word that the examiner says, that is, 100%? Couldn't I get by with 96%? Of course, it depends on the actual words used, but considering that the CEFR tests are essentially pass or fail, couldn't I say that 92% coverage is good enough for me to get at least the minimum grade to pass?
Edited by s_allard on 21 August 2012 at 7:34pm
2 persons have voted this message useful
| Hertz Pro Member United States Joined 4513 days ago 47 posts - 63 votes Speaks: English* Studies: German, Spanish, Mandarin Personal Language Map
| Message 138 of 210 21 August 2012 at 7:10pm | IP Logged |
emk wrote:
Also note that you can't accurately judge the size of my active vocabulary by counting all the words I use in (say) a year. ... Now, you could have recorded everything I said or wrote for the past 5 years and never detected that I know "chiaroscuro". But if I see that particular artistic effect, the word is on the tip of my tongue.
This is true. It is also true that you may use words during the time period under study that you would not normally use.
1 person has voted this message useful
| s_allard Triglot Senior Member Canada Joined 5430 days ago 2704 posts - 5425 votes Speaks: French*, English, Spanish Studies: Polish
| Message 139 of 210 21 August 2012 at 8:32pm | IP Logged |
emk wrote:
...
I guess this just goes to show that I'm not convinced by Miller's claim that you can
pass a real C2 exam while only recognizing 4,500 words. I've calculated things about 5
different ways, and I keep seeing indications that C2 should fall somewhere between
10,000 and 20,000 words of passive vocabulary. And this corresponds nicely with typical
vocabulary sizes after several years of full-time immersion, if I remember the research
correctly.
I find these numbers pretty fascinating, because they explain so much about my
listening comprehension and what I need to do to improve it.
I read another paper by Milton on this same question, "The development of vocabulary breadth across the CEFR levels." The XLex test he uses asks for both passive and active vocabulary but is above all an indicator of receptive vocabulary. The maximum score is 5000. Interestingly, the author refers extensively to the work of Paul Nation and mentions that this 5000 was determined to be the crucial band.
Although it is not entirely clear after my cursory reading, I think the author judged that the marginal value over the 5000 word limit was insignificant. Indeed, if we refer to the data for movie subtitles, we see that 5000 words take us to 96% coverage. That was probably considered sufficient coverage for the cutoff point.
On @emk's last point about improving his listening comprehension, although he doesn't say so as such, I think we can assume that he means it is necessary to increase the vocabulary size. Is this the best strategy?
There is nothing wrong with it, generally speaking, but here are some thoughts. To go to 98% coverage you need to at least double the number of words you know. Fine, but what about working on giving more depth to the words you already know? A fair number of those words have all kinds of shades of meaning, idiomatic constructions, derivational forms and collocations. In other words, enhance the understanding of the existing stock of 5000 words. Could this work? Or some combination of both strategies?
The thinking behind this is that those 4% of unknown words can be guessed or surmised to some extent with a better knowledge of the 96%. Or that in any case, a profound knowledge of the 96% is enough.
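To see how quickly the return on extra vocabulary falls off, here is a rough sketch in Python that computes the coverage gained between each pair of rows in the subtitle table. The function name is my own, and the figures are simply the ones from the table quoted earlier in the thread; nothing here says how those figures were originally computed.

```python
# Cumulative coverage from the subtitle table quoted earlier in the thread.
TABLE = [
    (300, 73.45),
    (1000, 86.11),
    (2500, 92.77),
    (5000, 96.17),
    (10000, 98.27),
    (20000, 99.47),
]

def marginal_gains(table):
    """Yield (start, end, points gained, points gained per 1000 extra words)."""
    for (lo_size, lo_pct), (hi_size, hi_pct) in zip(table, table[1:]):
        gained = hi_pct - lo_pct
        per_thousand = gained / ((hi_size - lo_size) / 1000)
        yield lo_size, hi_size, round(gained, 2), round(per_thousand, 3)

for lo, hi, gained, per_k in marginal_gains(TABLE):
    print(f"{lo:>5} -> {hi:>5}: +{gained:.2f} points, {per_k:.3f} per 1000 words")
```

Going from 5000 to 10000 words buys about 2.1 percentage points, which is the kind of figure that makes me wonder whether depth might pay better than breadth at that stage.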
Obviously, I'm not telling @emk or anybody what to do. I'm just speculating.
1 person has voted this message useful
| Serpent Octoglot Senior Member Russian Federation serpent-849.livejour Joined 6597 days ago 9753 posts - 15779 votes 4 sounds Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish
| Message 140 of 210 21 August 2012 at 9:29pm | IP Logged |
s_allard wrote:
Can this idea be applied to other areas, such as examinations? Do I have to understand every single word that the examiner says, that is, 100%? Couldn't I get by with 96%? Of course, it depends on the actual words used, but considering that the CEFR tests are essentially pass or fail, couldn't I say that 92% coverage is good enough for me to get at least the minimum grade to pass?
Aren't you supposed to fulfill the criteria just 80-90% of the time? For C2 you should understand nearly perfectly what you do understand, though.
1 person has voted this message useful
| montmorency Diglot Senior Member United Kingdom Joined 4828 days ago 2371 posts - 3676 votes Speaks: English*, German Studies: Danish, Welsh
| Message 141 of 210 21 August 2012 at 10:48pm | IP Logged |
s_allard wrote:
The thinking behind this is that those 4% of unknown words can be guessed or surmised
to some extent with a better knowledge of the 96%. Or that in any case, a profound
knowledge of the 96% is enough.
A thought experiment (probably invalid, but let's see if it goes anywhere):
Let's say 100% = 100 words, so you are missing 4 words:
Chosen more or less at random, these are:
headset
cardboard
plate
eyebrow
Now, we don't know what the other 96 words are, but would we be able to define all four
of our unknown words with the 96 that are present?
Or if those numbers are too small, then let's try 300, in which case 12 are missing.
e.g.
headset (audio variety, not of a bike!)
cardboard
plate
eyebrow
scissors
screwdriver
radiator
breathe
eat
bathroom
carpet
street
I will try to define a headset in simple terms:
[A piece of equipment/kit] that carries sound from a stereo or computer to your ears.
or something that takes sound from a stereo or computer to your ears.
OK, that doesn't work because that could also be a speaker, so we need something else,
but we don't need stereo and computer. Let's try:
something that takes sound through a wire from a stereo (or computer or iPod) to your
ears. We might be able to replace "through a wire" with "directly":
something that takes sound from a stereo directly to your ears.
11 words, of which say 5 are very common re-usable pronouns, prepositions, etc.
Let's say it needed (11-5=) 6 particular words to define it.
Do we have enough words to define all 12 in our (300-12=) 288 original words which
represent 96%?
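The bookkeeping in this thought experiment can be sketched in a few lines of Python. Everything named here is my own assumption for illustration: which five words count as "common" function words, the crude tokenizer, and the idea of checking a definition against the list of missing words.

```python
import re

# The twelve "missing" words from the 300-word version of the experiment.
MISSING = {
    "headset", "cardboard", "plate", "eyebrow", "scissors", "screwdriver",
    "radiator", "breathe", "eat", "bathroom", "carpet", "street",
}

# Assumption: these are the five "very common re-usable" words in the definition.
FUNCTION_WORDS = {"that", "from", "a", "to", "your"}

def missing_count(vocab_size, coverage_percent):
    """How many words are unknown if coverage_percent of vocab_size is known."""
    return round(vocab_size * (100 - coverage_percent) / 100)

def tokens(definition):
    """Lowercase alphabetic tokens; a crude stand-in for real tokenization."""
    return re.findall(r"[a-z]+", definition.lower())

def particular_words(definition):
    """Words in the definition that are not counted as function words."""
    return [w for w in tokens(definition) if w not in FUNCTION_WORDS]

def circular_words(definition):
    """Words in the definition that are themselves on the missing list."""
    return set(tokens(definition)) & MISSING

definition = "something that takes sound from a stereo directly to your ears"
print(missing_count(300, 96), "missing words out of 300")
print(len(tokens(definition)), "words in the definition")
print(len(particular_words(definition)), "particular words:", particular_words(definition))
print("uses missing words:", circular_words(definition) or "none")
```

The real test would be to run each of the twelve definitions against an actual list of the 288 known words, which we don't have here.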
I'm just trying to get a flavour of the problems of a small vocabulary here; not trying
to prove anything.
Would anyone care to analyse it in a different way?
EDIT: Ah yes, I forgot an important point in the back of my mind when I started this
post: If I knew the usage, etc, of those 96% (or 288) words better than I do now, how
might it have helped me in defining the missing 12? Can anyone give an example?
Edited by montmorency on 21 August 2012 at 10:51pm
1 person has voted this message useful
| Serpent Octoglot Senior Member Russian Federation serpent-849.livejour Joined 6597 days ago 9753 posts - 15779 votes 4 sounds Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish
| Message 142 of 210 21 August 2012 at 11:19pm | IP Logged |
if my earphones died and I needed new ones asap, I'd just say "thing you put into your ear to listen to music" if I didn't know the word. the simplest monolingual dictionaries do use more than 300 words, but your goal isn't to provide an accurate definition, your goal is to get your point across.
unfortunately, "you have beautiful hair above your eyes" isn't an elegant compliment, though...
Edited by Serpent on 21 August 2012 at 11:23pm
2 persons have voted this message useful
| Hertz Pro Member United States Joined 4513 days ago 47 posts - 63 votes Speaks: English* Studies: German, Spanish, Mandarin Personal Language Map
| Message 143 of 210 22 August 2012 at 12:09am | IP Logged |
montmorency wrote:
EDIT: Ah yes, I forgot ... If I knew the usage, etc, of those 96% (or 288) words better than I do now, how might it have helped me in defining the missing 12? Can anyone give an example?
Absolutely:
A. "Please, what is the word for a thing for listening to music?"
B. "A radio? Like this?"
A. "No, it goes on my ears."
B. "Ah, you mean a headset. Or headphones."
A good command of a primary vocabulary and basic grammar structures can help you not merely to describe your desired item in a roundabout way, but to actively solicit the word itself for your own future use.
1 person has voted this message useful
| daegga Tetraglot Senior Member Austria lang-8.com/553301 Joined 4521 days ago 1076 posts - 1792 votes Speaks: German*, EnglishC2, Swedish, Norwegian Studies: Danish, French, Finnish, Icelandic
| Message 144 of 210 22 August 2012 at 12:21am | IP Logged |
emk wrote:
Or in tabular form:
Quote:
Range Coverage
1..300 73.45%
1..1000 86.11%
1..2500 92.77%
1..5000 96.17%
1..10000 98.27%
1..20000 99.47%
1..33355 100.00%
So if you want to know 98% of the words before doing extensive TV watching, you'll need
a vocabulary between 5,000 and 10,000 words. Honestly, I think that's a bit high: You
could probably get a lot out of TV while knowing less than 98% of the words. I also
suspect that the numbers are much better for a single TV series, because you will
quickly pick up the vocabulary used by a show.
Of course the numbers are better for a single TV series: the same few writers write each episode, so the vocabulary is much narrower than that of all movies and TV shows together. It's the same with reading: to read some authors while understanding 98% of the words, you might only need the 3000 most frequent words; for others you need more than 10000. If you are selective, you should find TV shows you can use pretty early while still knowing roughly 98% of the vocabulary used (proper names not counted).
For English, one such perfect example is "Stargate SG-1". The language is generally very easy (a subjective impression; I haven't counted any vocabulary in subtitles). You know the special military vocabulary (very few words, actually) after a few episodes. Almost everything you don't understand is either something you aren't supposed to understand anyway (scientist gibberish, alien languages, etc.), something that gets explained again in easy words (almost everything Carter explains ... very pedagogical), or the name of some alien device. Additionally, there are 15 seasons (SG-1 + Atlantis) of 20 or more episodes each. That gives you enough to do until your vocabulary size has increased.
You won't be that lucky in most languages, but you should be able to find at least some pretty easy material. Well, I think you get something similar in German with "Lindenstraße" and "GZSZ", which are more about everyday life.
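If you want to check whether a particular series really is narrower, a coverage table like the one quoted above can be computed from its subtitles. This is only a minimal sketch in Python under my own assumptions: it counts raw word forms rather than dictionary headwords, the tokenizer is crude, and it is not necessarily how the quoted table was produced.

```python
import re
from collections import Counter

def tokenize(text):
    """Lowercase word tokens; a crude stand-in for real subtitle tokenization."""
    return re.findall(r"[a-z']+", text.lower())

def coverage_table(text, cutoffs):
    """Percent of running words covered by the N most frequent word forms."""
    counts = Counter(tokenize(text))
    total = sum(counts.values())
    if total == 0:
        raise ValueError("text contains no words")
    ranked = [n for _, n in counts.most_common()]
    return {cutoff: 100.0 * sum(ranked[:cutoff]) / total for cutoff in cutoffs}

# Tiny made-up sample; in practice you would pass the text of a subtitle file.
sample = "the gate is open the gate is closed open the gate"
for cutoff, pct in coverage_table(sample, [1, 2, 4, 5]).items():
    print(f"1..{cutoff}: {pct:.2f}%")
```

Running this on one series and then on a mixed pile of films would show how much sooner the single series reaches 98%.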
Edited by daegga on 22 August 2012 at 12:43am
2 persons have voted this message useful