s_allard Triglot Senior Member Canada Joined 5427 days ago 2704 posts - 5425 votes Speaks: French*, English, Spanish Studies: Polish
| Message 241 of 319 25 April 2014 at 4:47am | IP Logged |
After a bit of a cultural interlude, back to the serious stuff. There is a dearth of studies of vocabulary size in
French along the lines of what Milton, Nation and others have done for English. But there is quite a bit of work that
was done quite a few years back on "le français fondamental". Here is an article that is not scientific as such, but it
does summarize a lot of the current information.
Les 600 mots les plus utilisés
If you can read French, here is the really interesting section:
"D’après l’interprétation de l’échelle Dubois-Buyse [1], le vocabulaire fondamental du français écrit est, en fin de
3e, de 3 725 mots.
Certaines distinctions sont particulièrement intéressantes, et permettent de tenter un dernier bilan :
Vocabulaire quotidien et pratique : de 300 à 3 000 mots environ, selon l’individu.
Vocabulaire « de base » ou fondamental (vocabulaire actif) : 800 à 1 600 mots pour un élève de collège ou de
lycée et quelques 3 000 mots pour l’individu moyen.
Vocabulaire « passif » ou dit « de culture générale » : entre 2 500 et 6 000 mots pour un élève de lycée et
quelques 30 000 mots pour un public cultivé.
Ainsi, un collégien de 6e disposerait d’environ 6 000 mots (y compris les listes fermées et les mots outils) tandis
que le vocabulaire du public cultivé irait jusqu’à 30 000 mots (en suivant cette échelle, on va des mots très
polysémiques - les 1 500 / 3 000 - comme cœur, feu, passion... aux plus monosémiques - les 30 000 - comme
agnosticisme, cacochyme, galéjade, panégyrique, rhomboédrique, vernaculaire...).
La plupart des Français utilisent donc moins de 5 000 mots pour s’exprimer et se faire comprendre !"
I won't translate the whole thing, but here is a rendition of the line starting with "Vocabulaire « de base »..."
"Basic or fundamental vocabulary (active vocabulary): 800 to 1600 words for a student in the collège or lycée and
around 3000 for the average person."
The passive vocabulary is much higher of course.
What I find striking about these figures and the others quoted in the article is how low they are. Maybe Milton's
figures quoted by the OP are right after all, but that is the reality of French. Intriguing.
Edited by s_allard on 25 April 2014 at 1:54pm
3 persons have voted this message useful
| Serpent Octoglot Senior Member Russian Federation serpent-849.livejour Joined 6594 days ago 9753 posts - 15779 votes 4 sounds Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish
| Message 242 of 319 25 April 2014 at 9:47am | IP Logged |
s_allard wrote:
I can always tell the quality of a person by the music they like. |
|
|
I can always tell the quality of a person by how they tell the quality of a person.
BTW I like O Fortuna too :P
2 persons have voted this message useful
|
Iversen Super Polyglot Moderator Denmark berejst.dk Joined 6700 days ago 9078 posts - 16473 votes Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian Personal Language Map
| Message 243 of 319 25 April 2014 at 10:00am | IP Logged |
Like s_allard, I would like to see some more work done on vocabulary size in other languages. There are of course lists of the most common words in a number of languages, but not the variety of research reports which you have for the English language (or if they exist, they don't get distributed as efficiently as those regarding English). There are some methodological issues with languages that freely combine words into long compounds (which may explain the sky-high published word count for the Swedish author Strindberg), so ideally such an analysis should include not only a number for complete unique words irrespective of length, but also a count for root words. English and French, among others, have the opposite problem: they have many word combinations which effectively function as single words, and it is almost impossible to draw a line between such word combinations and loose combinations. For instance "train ticket" has exactly the meaning you could expect from adding its components, and therefore there is no reason to count it as a 'combi word'. But maybe "speeding ticket" would qualify, though it would be hard to specify why. Anyway, any statistical analysis would single both out as fixed combinations with the same distribution of usage as real single-component words.
The French numbers quoted by s_allard are not really surprising, except the low number of passive words for the lycée/high school/Gymnasium students, which places the poor souls way below the cultured adult. Can you expect the youngsters to catch up, or is this vast gap a permanent feature caused by some kind of general dumbing down in the media? Or is there some kind of methodological flaw involved? Did the old folks do more crosswords in their time because they didn't have Twitter and Facebook and an avalanche of American pop songs? In that case the damage done may be irreparable.
On a more personal note: due to Easter, some travelling and other disturbances I haven't had time to extend the conversion of my language log to lists of unique headwords as I had planned, but 3 months in 2006 gave around 15,000 words all in all and 2400 unique headwords. After some 6 years of multiconfused babbling in my log thread I may have to sort out and cleanse some 200,000-300,000 words or more in at least a dozen languages (restricting it to things written by myself), so it will definitely take time to get through those 6 years. But some day I'll have that corpus ready, and even the subcorpora for German, French and a few other languages will then have a size where it is possible to use them statistically. The project may not live up to the most rigorous academic standards, but it should give some relevant numbers, and I get a chance to revisit some almost forgotten parts of the 400 or so pages of my log.
PS I just did the vocabulary test in the source which s_allard referred to above. My result as an adult (an estimate of 38,027 words) was higher than my own estimates from dictionary-based word counts. That is comforting for me, but I believe my own results are more realistic. However, that's not the interesting thing here. The real bummer was that I only got an estimate of 2238 words when I did the test for the <5 year olds, even though I marked each and every word on the list. In other words: there is an artificial upper limit (and a quite low one at that) for the number of words a 5 year old child is permitted to know, and there is no way to break that barrier except by taking the test for adults and hoping for the best. Even though the estimate may be realistic (it probably is for almost all children), the method is flawed. And it is flawed in exactly the same way for the adults, with just one difference: there isn't any level above adult, so the built-in upper limit at around 40,000-41,000 words appears to be invisible.
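The ceiling described above is what one would expect if the test simply scales the share of ticked words up to a fixed pool size. Here is a minimal sketch of that assumption. The function name, the 40-word sample and the two pool sizes are hypothetical, picked only to mirror the figures mentioned in this post, and the real test may well compute its estimate differently.

```python
def estimate_vocabulary(ticked: int, sample_size: int, pool_size: int) -> int:
    """Scale the share of ticked sample words up to an assumed pool size."""
    if not 0 <= ticked <= sample_size:
        raise ValueError("ticked must be between 0 and sample_size")
    return round(pool_size * ticked / sample_size)


# Hypothetical pool sizes, loosely based on the figures quoted in this post.
POOL_UNDER_5 = 2238
POOL_ADULT = 41000

# Ticking every word can never yield more than the pool size itself.
print(estimate_vocabulary(40, 40, POOL_UNDER_5))
print(estimate_vocabulary(40, 40, POOL_ADULT))
```

Under this assumption, a test taker who knows far more words than the pool contains gets exactly the same estimate as one who knows just the pool.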
PS PS: This reminds me of some of the anecdotes about the British historian and child prodigy Macaulay:
When he was 4 years old a servant spilled hot coffee on his legs; when the hostess inquired how he was feeling, he said, “Thank you, madam, the agony is abated.” When a housemaid threw away some oyster shells he’d been using to fence a garden plot, he marched into the drawing room and said, “Cursed be Sally, for it is written, ‘Cursed be he that removeth his neighbor’s landmark.’”
Edited by Iversen on 25 April 2014 at 11:39am
3 persons have voted this message useful
| luke Diglot Senior Member United States Joined 7202 days ago 3133 posts - 4351 votes Speaks: English*, Spanish Studies: Esperanto, French
| Message 244 of 319 25 April 2014 at 3:40pm | IP Logged |
s_allard wrote:
luke wrote:
s_allard wrote:
Vocabulary size is invariably linked to language proficiency. The more vocabulary you have, the better you speak the language. |
|
|
I hear angels singing Let's all sing together! |
|
|
I don't know how this contributes to the debate here but that rendition of O Fortuna from Carl Orff's Carmina Burana is great. I can always tell the quality of a person by the music they like. If nothing else @luke and I share the love of great music. In fact, I had the good fortune (no pun intended) of singing that very chorus in a choir some years back. Great memories. It also did wonders for my Latin. |
|
|
Your response made me laugh out loud (the first part, "I don't know how this contributes to the debate here"). The second part, about your experience and memories around O Fortuna, also helps me understand you as a person.
2 persons have voted this message useful
| Gemuse Senior Member Germany Joined 4079 days ago 818 posts - 1189 votes Speaks: English Studies: German
| Message 245 of 319 25 April 2014 at 5:55pm | IP Logged |
s_allard wrote:
"Basic or fundamental vocabulary (active vocabulary): 800 to 1600 words for a student in
the collège or lycée and
around 3000 for the average person."
|
|
|
Are college students less educated than the average person in France?
2 persons have voted this message useful
| Josquin Heptaglot Senior Member Germany Joined 4841 days ago 2266 posts - 3992 votes Speaks: German*, English, French, Latin, Italian, Russian, Swedish Studies: Japanese, Irish, Portuguese, Persian
| Message 246 of 319 25 April 2014 at 6:22pm | IP Logged |
Gemuse wrote:
s_allard wrote:
"Basic or fundamental vocabulary (active vocabulary): 800 to 1600 words for a student in
the collège or lycée and
around 3000 for the average person."
|
|
|
Are college students less educated than the average person in France? |
|
|
The French collège is the equivalent of the American high school or the British secondary school, not college.
1 person has voted this message useful
| luke Diglot Senior Member United States Joined 7202 days ago 3133 posts - 4351 votes Speaks: English*, Spanish Studies: Esperanto, French
| Message 247 of 319 25 April 2014 at 7:42pm | IP Logged |
Gemuse wrote:
s_allard wrote:
"Basic or fundamental vocabulary (active vocabulary): 800 to 1600 words for a student in the collège or lycée and around 3000 for the average person."
|
|
|
Are college students less educated than the average person in France? |
|
|
Do you mean college students in the United States compared to the average person in France?
Now for a slightly more serious post ...
luke wrote:
emk wrote:
Iversen wrote:
At any rate: 3300 passive words for a university graduate seems very low compared to the number of words I have found for myself even in my weakest languages. Until somebody comes up with a better explanation I'm inclined to see it as an artefact of the research methods used. |
|
|
Yeah, this number is based on a 5,000 word dictionary. But even so, it seems low—out of the top 6,000 French words. |
|
|
The Routledge Frequency Dictionary of French was built from a 23 million word corpus. The top word, "le", and its various forms constitute over 1% of those 23 million. In the 3300 word frequency range, those words make up only about 0.17% of the words in the corpus. Around the 5000 word range, they make up only about 0.087%.
An underlying question in this thread is: if one knows 3300 of the 5000 most frequent words, how does that translate into, say, the 25,000 most frequent words? I would suspect one might know several thousand more of the 5001-25000 most frequent words. That is, if you know about 1/2 of the 4001-5000 most frequent words, you probably know only slightly less than 1/2 of the 5001-6000 most frequent words, etc. Are there any math geniuses here? |
|
|
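One rough way to put numbers on the question quoted above is to assume that the known share of each 1000-word frequency band shrinks by a constant factor from one band to the next. This is only a sketch: the function name and the 10% drop per band are made-up assumptions, not something taken from Milton or from the Routledge dictionary.

```python
def extrapolate_known_words(start_fraction: float, decay: float,
                            first_band: int, last_band: int,
                            band_size: int = 1000) -> float:
    """Sum the expected known words over bands beyond the tested range.

    Band k covers ranks (k - 1) * band_size + 1 through k * band_size.
    start_fraction is the share known in the last tested band; each later
    band is assumed to be known at `decay` times the share of the band
    before it.
    """
    total = 0.0
    fraction = start_fraction
    for _ in range(first_band, last_band + 1):
        fraction *= decay
        total += fraction * band_size
    return total


# Half of ranks 4001-5000 known, hypothetical 10% drop per band,
# extrapolated over ranks 5001-25000 (bands 6 through 25).
extra = extrapolate_known_words(0.5, 0.9, first_band=6, last_band=25)
print(round(extra))
```

With these invented parameters the sum comes to a little under 4000 extra words, which is in line with the "several thousand more" guessed at in the quoted post. A different decay factor would of course move that figure a lot.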
So, having actually gotten some sleep and re-read a bit of the Milton paper, here is a partial answer to my own question... This blurb on the topic may be helpful in formulating my hypotenuse. Or FX's chart, or How Many words do I need to learn?
Note: XLex of 5000 is basically knowing all the 5000 most frequent words. Beyond that, vocabulary size is more speculative.
LV  XLex@5000   Bonus   BonusWords  total_words
A1  under 1500  00-05%  0000-0075   1000-1575
A2  1500-2500   05-10%  0075-0250   1575-2750
B1  2750-3250   10-20%  0275-0650   3025-3900
B2  3250-3750   20-25%  0812-0937   4062-4867
C1  3750-4500   30-40%  1125-1800   4875-7500
C2  4500-5000   35-45%  2500-4000   7000-9000
That's my theory. As one goes up in CEFR levels, the percentage of words that one knows beyond the 5000 most frequent words goes up. Thus total vocabulary size is also higher than one might assume from the number of the most frequent words a learner knows.
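The arithmetic behind the table can be sketched as follows, assuming the bonus percentage is applied to the XLex score itself. The function name is hypothetical. The A2 and B1 rows are used as examples; the other rows were filled in by hand and need not follow this formula exactly.

```python
def total_vocabulary(xlex_score: int, bonus_percent: float) -> int:
    """Add a bonus share of words beyond the top 5000 to an XLex score."""
    bonus_words = round(xlex_score * bonus_percent / 100)
    return xlex_score + bonus_words


# (XLex score, bonus %) at the low and high end of each level, from the table.
rows = {
    "A2": ((1500, 5), (2500, 10)),
    "B1": ((2750, 10), (3250, 20)),
}

for level, (low, high) in rows.items():
    print(level, total_vocabulary(*low), total_vocabulary(*high))
```

For these two rows the result matches the total_words column: 1575-2750 for A2 and 3025-3900 for B1.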
Anyone have some grant money they don't need?
Edited by luke on 25 April 2014 at 7:44pm
3 persons have voted this message useful
| Serpent Octoglot Senior Member Russian Federation serpent-849.livejour Joined 6594 days ago 9753 posts - 15779 votes 4 sounds Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish
| Message 248 of 319 25 April 2014 at 8:11pm | IP Logged |
Makes perfect sense to me.
1 person has voted this message useful
|