s_allard Triglot Senior Member Canada Joined 5218 days ago 2704 posts - 5425 votes Speaks: French*, English, Spanish Studies: Polish
| Message 1 of 100 05 August 2011 at 5:20pm | IP Logged |
Here at HTLAL a common topic is the necessary vocabulary size in the target language. Usually estimates run into the thousands or tens of thousands of words.
While watching a Spanish soap opera (Amar en tiempos revueltos) over the last couple of months, I couldn't help noticing how repetitive the vocabulary seemed to be.
I recorded a complete 50-minute episode and, out of curiosity, counted the active verbs, arriving at a figure of around 120 different verbs. The really interesting, though not surprising, observation is that around 40% of all verb instances are accounted for by just five verbs (ser, estar, haber, tener, hacer). In other words, a few verbs are used very often and most are used very little.
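For anyone who wants to repeat this kind of count, here is a minimal Python sketch of the calculation. It assumes the verbs have already been picked out of the transcript and reduced to their infinitives by hand; the function name `top_share` and the sample list are made up for illustration and are not the actual episode data.

```python
from collections import Counter


def top_share(tokens, k):
    """Return the fraction of all tokens accounted for by the k most frequent items."""
    counts = Counter(tokens)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    top = sum(n for _, n in counts.most_common(k))
    return top / total


# Made-up tally of verb infinitives (hypothetical, not the real episode counts).
verbs = (
    ["ser"] * 4
    + ["estar"] * 3
    + ["tener"] * 2
    + ["hacer", "querer", "ir", "venir", "decir", "saber"]
)

share = top_share(verbs, 3)
print(f"Top 3 verbs cover {share:.0%} of all verb instances")
```

With a real tally, calling `top_share(verbs, 5)` would give the share covered by the five most frequent verbs.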
I didn't do the math, but looking at the usage of tenses and moods, one sees immediately that the present and the past dominate completely and that many other forms are hardly used.
My feeling is that within the confines of the limited world of this specific program, only a limited range of words is used. I'll do some more counting, but I suspect that the entire series uses a vocabulary in the area of 500 to 800 words, if not fewer.
My conclusion is that one can achieve a high level of spoken proficiency by really mastering quite a small number of items in the target language. I wonder what others think about this.
1 person has voted this message useful
| fiziwig Senior Member United States Joined 4653 days ago 297 posts - 618 votes Speaks: English* Studies: Spanish
| Message 2 of 100 05 August 2011 at 5:56pm | IP Logged |
This is a common phenomenon in all languages. I have spent many years, off and on, playing with writing computer programs that "understand" English. I wondered early on how large a vocabulary would be necessary for a "chatbot" program to be able to converse in English. So I ran all kinds of statistical studies of newspaper articles, movie scripts, and various other corpora.
Briefly, English looks like this:
The first 20% of a corpus consists of 4 words: {the, be, of, and}
The next 20% (40% cumulatively) requires another 10 words: {a, to, in, he, have, it, that, for, they, I}
To understand 60% of a corpus takes another 54 words (see list below).
To understand 80% takes another 284 words.
To understand 100% requires a very long tail, depending on the size and nature of the corpus. But 99% of my total sample English corpus is covered by a vocabulary of 2280 words.
It should be noted that {is, am, be, are, was, were,...} are all counted as "be".
Of course, it's not just a matter of learning some 2300 words, but of learning the right 2300 words. That will most certainly vary for every language.
BTW: The top 100 for English newspaper articles (excluding headlines) are, in order:
the be of and a to in he have it that for they I with as not on she at by this we you do but from or which one would all will there say who make when can more if no man out other so what time up go about than into could state only new year some take come these know see use get like then first any work now may such give over think most even find day also after way many must look before great back through long where much should well people down own just
These 100 words cover 64.3% of all English newspaper text.
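Coverage figures like the ones above can be computed along the following lines. This is only a rough Python sketch, not the program actually used for these numbers: the `LEMMAS` table is a tiny made-up stand-in for a real lemma list, and `words_needed` is a hypothetical helper name.

```python
from collections import Counter

# Tiny illustrative lemma map (an assumption; a real one would be far larger).
LEMMAS = {"is": "be", "am": "be", "are": "be", "was": "be", "were": "be"}


def words_needed(tokens, target):
    """Smallest number of top-ranked lemmas whose counts cover `target` of the tokens."""
    lemmas = [LEMMAS.get(t.lower(), t.lower()) for t in tokens]
    counts = Counter(lemmas)
    total = len(lemmas)
    covered = 0
    # Walk down the frequency ranking, accumulating covered tokens.
    for rank, (_, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered >= target * total:
            return rank
    return len(counts)


sample = "the cat is on the mat and the dog was on the rug".split()
for target in (0.3, 0.6, 1.0):
    print(target, words_needed(sample, target))
```

On a real corpus, the long tail shows up as the jump in the returned rank between a target of 0.8 and a target close to 1.0.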
--gary
Edited by fiziwig on 05 August 2011 at 6:04pm
10 persons have voted this message useful
| Bao Diglot Senior Member Germany tinyurl.com/pe4kqe5 Joined 5554 days ago 2256 posts - 4046 votes Speaks: German*, English Studies: French, Spanish, Japanese, Mandarin
| Message 3 of 100 05 August 2011 at 6:17pm | IP Logged |
Me Tarzan, you Jane.
Four.
Wasn't there this guy who radically simplified English so that any meaning could be conveyed by a combination of the base vocabulary? I don't really remember the name, nor the numbers, but it struck me as odd that creating new phrasal verbs and compound nouns for complex meanings should be any easier than just learning the proper words for them.
Of course you can express almost everything by a combination of words you already know (and maybe some gestures), but is that real proficiency? Are you proficient when you visit a doctor and don't know the words pain, hurt or ache (referring to a random 1000 most common words source)?
Edited by Bao on 05 August 2011 at 6:18pm
2 persons have voted this message useful
| s_allard Triglot Senior Member Canada Joined 5218 days ago 2704 posts - 5425 votes Speaks: French*, English, Spanish Studies: Polish
| Message 4 of 100 05 August 2011 at 6:50pm | IP Logged |
I don't really think that the main idea here is that you will be totally proficient with a vocabulary of 500 words, and that this is what you should aim for. Instead, I believe that the real lesson is that there is a core set of elements of the target language that you truly have to master if you want to develop conversational proficiency. For example, if five verbs make up 40% of all verb usages, the conclusion is that as a learner you have to really make an effort to get those verbs down pat in addition to learning all the other useful verbs. And speaking of verbs, why not concentrate on those 120 verbs that keep coming back?
And let's put things into perspective here. Basically, what my very simple and crude calculation says is that with 120 verbs you have nearly every verb you need to understand all 242 episodes (to date) of this soap opera. I wouldn't extrapolate this to the language at large and say that 120 verbs are all you need to be proficient in Spanish. Even so, I suspect that those 120 verbs will take you very far.
1 person has voted this message useful
| fiziwig Senior Member United States Joined 4653 days ago 297 posts - 618 votes Speaks: English* Studies: Spanish
| Message 5 of 100 05 August 2011 at 8:36pm | IP Logged |
Bao wrote:
Me Tarzan, you Jane.
Four.
Wasn't there this guy who radically simplified English so that any meaning could be conveyed by a combination of the base vocabulary? I don't really remember the name, nor the numbers, but it struck me as odd that creating new phrasal verbs and compound nouns for complex meanings should be any easier than just learning the proper words for them.
Charles Kay Ogden; "Basic English" 850 words, 18 of which are verbs.
http://en.wikipedia.org/wiki/Basic_English
Many linguists claim that he cheated in counting his 850 words because he reuses a lot of those words in idiomatic ways that are peculiar to English, so a learner also needs to learn those idiomatic phrases as separate vocabulary items since the meaning of the idiom cannot be inferred from the meaning of the words that comprise it.
--gary
1 person has voted this message useful
| Arekkusu Hexaglot Senior Member Canada bit.ly/qc_10_lec Joined 5169 days ago 3971 posts - 7747 votes Speaks: English, French*, GermanC1, Spanish, Japanese, Esperanto Studies: Italian, Norwegian, Mandarin, Romanian, Estonian
| Message 6 of 100 05 August 2011 at 9:14pm | IP Logged |
It's possible that only 800 words were needed to understand that specific program, but those 800 words are probably NOT the 800 most common words. In other words, how many words must a person know in order to happen to know all 800 in the show... The bulk of the most common words are mostly short words that play a grammatical role rather than a strong semantic one.
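To put a number on that question, one could compare the show's word list against a general frequency list. The Python sketch below assumes both lists already exist; `rank_needed` and the sample words are made up for illustration.

```python
def rank_needed(frequency_list, show_vocab):
    """Return (rank, missing).

    rank: how far down the frequency-ranked list one must learn to know
          every show word that appears in the list.
    missing: show words that are not in the frequency list at all.
    """
    rank_of = {word: i for i, word in enumerate(frequency_list, start=1)}
    missing = sorted(w for w in show_vocab if w not in rank_of)
    ranks = [rank_of[w] for w in show_vocab if w in rank_of]
    return (max(ranks) if ranks else 0), missing


# Hypothetical general frequency list, most common word first.
general = ["de", "la", "que", "el", "en", "y", "ser", "casa", "amor", "guerra"]
# Hypothetical vocabulary taken from the show.
show = {"ser", "amor", "la", "telenovela"}

rank, missing = rank_needed(general, show)
print(rank, missing)
```

The gap between the returned rank and the size of the show vocabulary is the number of "extra" words a learner working down the list would pick up along the way.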
Regardless, a show will inevitably use a lot more words than a person would use in a conversation, let alone a conversation with a second language speaker.
You can certainly have a conversation with a vocabulary of 1000 words. You can have a better one with 2000 or 3000, obviously. In any case, my personal experience tells me there is no way that number would have to be in the tens of thousands, because I've had many conversations in languages in which I knew at most a few thousand words. "Conversation" as in sitting down with someone and actually exchanging ideas and opinions.
I suspect, however, that when you are nearing 1000 or 2000 words, the quality of the conversation would heavily depend on the speaker's mastery of grammar. A person who put heavy emphasis on vocabulary probably needs more words for an equally satisfying conversation than a person who concentrated on getting the grammar down pat right off the bat. Better grammar means better control of the words you know, making less count for more.
Edited by Arekkusu on 05 August 2011 at 9:18pm
1 person has voted this message useful
| s_allard Triglot Senior Member Canada Joined 5218 days ago 2704 posts - 5425 votes Speaks: French*, English, Spanish Studies: Polish
| Message 7 of 100 05 August 2011 at 10:15pm | IP Logged |
I focused specifically on verbs to avoid this issue of very common function words. In the case of verbs, I think it's quite evident that a very small number can cover most needs. If we take the vocabulary of all 240 episodes, it's certainly more than the 800 words of a particular episode. But how much more? This is where the limited world of this kind of soap opera comes into play. After listening to over 17 hours of programming, I think the vocabulary does not differ much from one episode to the next, because these programs are constructed so that any viewer can pick up the story anywhere. This is not real life, but in its own way the language is quite realistic.
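The "how much more?" question could be checked by tracking how the total vocabulary grows from one episode to the next. Here is a minimal Python sketch, assuming each episode is available as a list of words; `vocabulary_growth` and the sample episodes are made up for illustration.

```python
def vocabulary_growth(episodes):
    """For each episode (a list of words), return the cumulative count of distinct words seen so far."""
    seen = set()
    growth = []
    for words in episodes:
        seen.update(w.lower() for w in words)
        growth.append(len(seen))
    return growth


# Made-up miniature "episodes" (hypothetical, not real transcripts).
episodes = [
    ["amar", "en", "tiempos"],
    ["en", "tiempos", "revueltos"],
    ["amar", "en"],
]
print(vocabulary_growth(episodes))
```

If the curve flattens after a handful of episodes, that would support the idea that the vocabulary barely changes from one episode to the next.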
What is very true, however, is that being proficient in a language is not as simple as learning the 800 most common words. The real challenge is how to use these words and particularly the very many idiomatic expressions that make all the difference.
1 person has voted this message useful
| s_allard Triglot Senior Member Canada Joined 5218 days ago 2704 posts - 5425 votes Speaks: French*, English, Spanish Studies: Polish
| Message 8 of 100 05 August 2011 at 10:47pm | IP Logged |
I don't think we have to get into arguments over the exact number of words one needs. Defining and counting words quickly becomes complicated. What I think we can do is use this kind of knowledge to think strategically when learning a language. If you know that in Spanish 5 verbs account for 30 to 40% of all verb usages, it only makes sense to concentrate on them. It's the same thing in English, albeit with very simple morphology. The key verbs in English are: be, have, do, get, know. In French, they would be: être, avoir, faire, aller, dire. If you want to speak French, those are five key verbs, out of the thousands that exist, that you simply have to know backwards and forwards. Of course, in all these languages you have to know more than these 5 verbs, but rather than work in a willy-nilly fashion you can at least progress in a logical manner.
1 person has voted this message useful