
How many words to speak?

Language Learning Forum : General discussion
Serpent
Octoglot
Senior Member
Russian Federation
serpent-849.livejour
Joined 5080 days ago

9753 posts - 15777 votes 
4 sounds
Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese
Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish

 
 Message 273 of 309
23 September 2014 at 3:55pm | IP Logged 
emk wrote:

Now, I'd love to know what that list looks like to a monolingual German speaker, or to a monolingual Mandarin speaker. Do words like association and handicaper filter into non-Romance languages via global English?

monolingual Russian:

Quote:
adaptation, association, centrer, chômage, diminuer, gestion, handicaper, insertion, intégration, liguer, mobiliser, polémiquer, préjuger, slogan

handicap is pretty uncommon in Russian, it's mostly used in the context of some sports and betting. also, it's spelt as "gandikap", so not everyone will even think of the word at once. If the Russian word happens to be relevant to your interests, you may be able to figure it out. Also, "integratsija" and "polemika" probably sound like modern buzzwords to many, although there are definitely plenty of educated people who know them. And the latter is a noun, whereas the verb is "polemizirovat'", I think. It's probably heard less commonly by those to whom it's a buzzword anyway.

In Finnish, without looking up I can only be sure about poleeminen and mobilisoida. I was also fairly sure about slogani, but tbh I assumed it would be something like sloogan. And again integraatio/integroida, probably with the same caveat as in Russian.

Okay, let's put together a proper list:

Quote:
adaptation, association, centrer, chômage, diminuer, gestion, handicaper, insertion, intégration, liguer, mobiliser, polémiquer, préjuger, slogan

"assosiaatio" isn't even in the Finnish wiktionary. again, possibly a buzzword. and a popularly known term from psychology.
"sentraali" or "sentralisoida" honestly look awkward to me. I can't imagine anyone in Finland not knowing the English centre or Swedish centrum though
Linguists know "diminutiivinen". (in Russia a linguist will only learn the word as an English one, I'd say)
"insertti" is a noun and doesn't overlap fully with English. honestly those who know it are likely to know it from English
Nearly all the words have a derived equivalent that's more clear and can be more common. One more thing is that a Russian will have a much easier time becoming an educated person without learning any English.

Quote:
If you truly have no cognate discounts, I'd guess that the DELF B1 reading comprehension exam actually requires B2 comprehension.

Sounds like that, yes. On a related note, I tend to believe I've reached B2 only when Dialang says I'm at C1.

Edited by Serpent on 23 September 2014 at 4:09pm

2 persons have voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 3913 days ago

2704 posts - 5424 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 274 of 309
23 September 2014 at 4:10pm | IP Logged 
Iversen wrote:
...

But in real life you can't be sure what the next topic is and which words your interlocutor will use (apart from a
small group of very common words, including the typical grammar words), so either you choose situations where
the topic is given or you are in trouble up to your ears and end up speaking English or whatever the local tourist
koiné is. Is that fun? ...


Since this objection keeps coming up, and with deference to Iversen, I want to revisit it. Although the debate has
meandered in a number of directions, one position that I have initiated and doggedly defended is that around
300 words in French will get you started speaking. This idea has somehow morphed into a behemoth here where
people believe that I've said that you need not learn more than 300 words in any language. I do not believe that
300 words in Swahili will have you chatting like a native. But I do believe that it will get you going, getting off the
ground so to speak.

A completely different theme in the thread is what can be done with 300 words and how many words are
necessary for certain tasks. There is a bit of confusion over whether we are talking about passive or active
vocabulary or reading and listening comprehension. I'm focused on active speaking, the words that come out of
your mouth.

Where there is considerable debate is what can be done with numbers of unique words. I say that there are lots
of examples of conversations with way less than 300 words by native speakers. I'm not trying to compare
learners of French with native speakers. I'm saying that many actual conversations use few words.
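The unique-word figures discussed here can be checked on any transcript. Below is a minimal sketch, assuming a plain-text transcript and counting surface forms rather than dictionary headwords (so "viens" and "venir" would count separately). The function name `word_counts` and the sample sentence are made up for illustration.

```python
import re
from collections import Counter

def word_counts(transcript: str) -> Counter:
    """Count lowercase word forms; punctuation, apostrophes and hyphens split words."""
    tokens = re.findall(r"[^\W\d_]+", transcript.lower())
    return Counter(tokens)

# Hypothetical snippet of conversation, not taken from any real recording.
sample = "Tu viens ce soir ? Oui, je viens. Et toi, tu viens avec qui ?"
counts = word_counts(sample)

total_words = sum(counts.values())   # running words (tokens)
unique_words = len(counts)           # distinct word forms (types)
print(total_words, unique_words)
```

Counting lemmas instead of surface forms would give a lower unique-word figure, so any number quoted should say which unit it uses.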

For example, I ask, How many words are necessary to have a very abstract debate about the work of the great
French philosopher Paul Ricoeur? I say way less than 1000 because that is what I estimate from listening to a
recording of such a debate.

For the CEFR tests, we have lots of estimates of the passive vocabulary necessary but I have never seen any
figures for the words actually spoken by the examinees. In the CEFR B2 speaking test, I estimate that candidates
will use less than 300 unique words in the interaction with the examiner. How those 300 words are used is the
big question.

At the C2 level, the number of distinct words coming out of your mouth on the speaking test will probably not be
very different. The difference is "quality". What I mean by that is the level of skills that are required at that level.
For example, the debate will become more abstract and more nuanced. Maybe more actual words will be
required. But there are other ways of rendering nuances and subtleties other than just adding words.

I maintain that you will actually use a small number of words and that the total productive vocabulary - the words
that you have used or feel comfortable using - is also very small.

1 person has voted this message useful



iguanamon
Pentaglot
Senior Member
Virgin Islands
Speaks: Ladino
Joined 3745 days ago

2224 posts - 6708 votes 
Speaks: English*, Spanish, Portuguese, Haitian Creole, Creole (French)

 
 Message 275 of 309
23 September 2014 at 4:38pm | IP Logged 
Thanks, patrickwilken! I just took the Portuguese test and scored:

Passive (Receptive):

As palavras mais frequentes do português:
1000 29/30
2000 28/30
3000 29/30
4000 25/30
5000 27/30

Total: 138/150 = 92%

Active (Productive):

1000 15/18
2000 15/18
3000 15/18
4000 15/18
5000 15/18

Total = 75/90 or 83%

I still have more work to do.


Edited by iguanamon on 23 September 2014 at 5:27pm

3 persons have voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 3913 days ago

2704 posts - 5424 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 276 of 309
23 September 2014 at 4:39pm | IP Logged 
patrickwilken wrote:
Well this is completely depressing. I found a set of vocabulary tests at the University of
Leipzig for different languages (German, English, Japanese, Arabic, Portuguese, Russian, French, Spanish, Italian),
which test your knowledge (passive/active) of the 5000 most common words in each language.

As expected in my receptive knowledge of English I score essentially 100% in all levels (2/30 errors at the 2000
word level for some reason, perhaps I misread the question or rushed things).

For German however, I do really quite badly. For the 1000, 2000, 3000, 4000, 5000 word levels I get: 28/30;
17/30; 20/30; 18/30; 13/30 (didn't finish all the questions for the last so perhaps that would have been a bit
higher).

Using the raw scores that suggests a passive vocabulary of 3200 for the first 5000 most frequent words in
German.

The tests can be found here: http://www.itt-leipzig.de/static/startseiteeng.html

I think everybody should have a look at this test. I won't call it a joke but it shows the methodology of testing and
I think it has some relevance for the debate here.

For example, to estimate the productive knowledge of the 1000 most frequent words in French, it asks you to
complete the spelling of a word in each of five sentences. That's it. If you get those five answers right, you are
estimated to know how to use the first 1000 words in French.

The way this works of course is that the first 1000 words in the frequency list are divided into five bands with the
five words here corresponding each to a band. If you know a word, you know all the words in the band.

This is how nearly all vocabulary tests work. They don't actually count the words you know, they take little
samples. Fair enough.

How well you use these words is a different story.
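The sampling logic can be sketched in a few lines, using the German receptive scores quoted above. This is only an illustration: the function name `estimate_known_words` is made up, and it assumes each level stands for an equal band of 1000 words whose sample score is extrapolated to the whole band. The actual scoring used by the Leipzig tests may differ.

```python
def estimate_known_words(band_scores, band_size=1000):
    """Extrapolate each band's sample score to the whole band and sum the results.

    band_scores: list of (correct, asked) pairs, one per frequency band.
    band_size:   assumed number of words each band represents.
    """
    return sum(band_size * correct / asked for correct, asked in band_scores)

# Scores reported above for the 1000 to 5000 levels of the German receptive test.
german_receptive = [(28, 30), (17, 30), (20, 30), (18, 30), (13, 30)]
estimate = estimate_known_words(german_receptive)
print(round(estimate))  # 3200, the figure quoted above
```

With only 30 items per band, a couple of lucky or unlucky answers move the estimate by dozens of words, which is worth keeping in mind when comparing scores.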
1 person has voted this message useful





emk
Diglot
Moderator
United States
Joined 4015 days ago

2615 posts - 8805 votes 
Speaks: English*, FrenchB2
Studies: Spanish, Ancient Egyptian
Personal Language Map

 
 Message 277 of 309
23 September 2014 at 4:48pm | IP Logged 
s_allard wrote:
Now that robarb has explained overfitting I think I have a better understanding. If I know only the vocabulary of
one Harry Potter book, I would have a problem understanding a book by a different author. For example, I would
have some difficulty reading a work by Charles Dickens. But suppose I'm not interested in Charles Dickens and
all I want to read is one Harry Potter book, that vocabulary will suit me fine.

Very close, but not quite what I meant. "Overfitting" is a technical term from machine learning and statistics. Wikipedia has a decent overview:

Quote:
In statistics and machine learning, overfitting occurs when a statistical model describes random error or noise instead of the underlying relationship. Overfitting generally occurs when a model is excessively complex, such as having too many parameters relative to the number of observations. A model that has been overfit will generally have poor predictive performance, as it can exaggerate minor fluctuations in the data.

Let me unpack that a bit. Let's assume that you really only care about Harry Potter, and you have no interest in Charles Dickens or newspaper articles or YouTube comedy videos. So the model we're trying to construct here is "Harry Potter French", or just enough French to enjoy this one series of books. We don't care about "Charles Dickens French" or "quantum mechanics French" or even just plain old "French." (This is actually a totally viable learning strategy by the way; Krashen calls it "narrow reading," and it's why I can read even aggressively literary science fiction in French.)

So how can we construct a model of "Harry Potter French"? Let's start by taking books 1, 3 and 5, and learning every word that appears in them. Then we try to apply this vocabulary to books 2, 4 and 6. I think we'll find that there are two things "wrong" with our vocabulary:

1. We'll have wasted too much time learning words like sornettes that might only appear once or twice in the entire series.
2. We'll have omitted a bunch of semi-common words that appear in books 2, 4 and 6 but not in books 1, 3 and 5.

So "overfitting" means that we've wasted vocabulary "slots" on genuinely weird and peculiar words from the odd numbered books, slots that we could have used better by simply knowing common French words instead. A better strategy would be to combine a generic frequency list with, say, 1,000 words we learned from books 1, 3 and 5.
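The two problems listed above map onto two set differences. Here is a rough sketch, with a made-up function name (`overfitting_report`) and toy word lists standing in for the real books. "Wasted" is simplified to mean "never seen again in the held-out books", which is a cruder criterion than "appears once or twice in the entire series".

```python
def overfitting_report(train_words, test_words):
    """Compare a vocabulary learned from some texts against held-out texts."""
    train, test = set(train_words), set(test_words)
    return {
        # Learned from the training books but never needed in the held-out books.
        "wasted": sorted(train - test),
        # Needed in the held-out books but absent from the training books.
        "missing": sorted(test - train),
    }

# Toy stand-ins for the odd-numbered and even-numbered books.
odd_books = ["le", "sorcier", "baguette", "sornettes"]
even_books = ["le", "sorcier", "baguette", "chaudron"]

report = overfitting_report(odd_books, even_books)
print(report)
```

A real run would weight the "missing" words by how often they occur, since a missing word that shows up fifty times hurts far more than one that shows up once.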

Now, to be fair, Harry Potter is almost half of an original Super Challenge by itself. It might be just big enough to be a good model for "French popular fiction." And so maybe we wouldn't see too much overfitting.

s_allard wrote:
This is not to say take the first 300 words from French film subtitle frequency list. Heavens no. That would
definitely not work. Instead, let's take four conversations that give us a unique word count of 300 and see what
we can do with that. We know we can at least have four conversations.

I'm happy to try this experiment. Here's what we would need to get started:

A. Eight conversations selected from the France Bienvenue website, in text format, covering a variety of easy topics.
B. Before doing anything else, I will look at the eight conversations and tell you whether or not I consider them to "cover a variety of topics."

An experiment with a fun bet

Here's the methodology we'll use:

1. I'll randomly pick four conversations to use as my training set and four to use as my test set.
2. I'll generate a vocabulary list containing all the unique words in the training set.
3. I'll generate a second vocabulary list containing the same number of words pulled from the top of Lexique 3's 'freqfilms2' data.
4. Both vocabulary lists will be applied to the test set using my "vocabulaire" tool (as demonstrated in this thread), and I'll post the results online.

Without having seen the data, I'll make a prediction: If the conversations chosen from France Bienvenue cover a range of easy subjects, then the vocabulary list generated in step (3) will give better coverage than the vocabulary list generated in step (2).
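The four steps above could be sketched roughly as follows. This is not the "vocabulaire" tool, just a minimal stand-in: `tokenize`, `coverage` and `run_experiment` are hypothetical names, `frequency_list` is assumed to be a plain list of words ordered from most to least frequent (for example, extracted from Lexique 3's 'freqfilms2' column), and words are matched as surface forms without lemmatization.

```python
import random
import re

def tokenize(text):
    """Split text into lowercase word forms."""
    return re.findall(r"[^\W\d_]+", text.lower())

def coverage(vocab, texts):
    """Fraction of running words in `texts` that appear in `vocab`."""
    tokens = [t for text in texts for t in tokenize(text)]
    return sum(t in vocab for t in tokens) / len(tokens)

def run_experiment(conversations, frequency_list, seed=0):
    """Return (training-list coverage, frequency-list coverage) on the test set."""
    rng = random.Random(seed)
    shuffled = conversations[:]
    rng.shuffle(shuffled)                      # step 1: random split
    half = len(shuffled) // 2
    train, test = shuffled[:half], shuffled[half:]

    # Step 2: every unique word in the training conversations.
    train_vocab = {t for text in train for t in tokenize(text)}
    # Step 3: the same number of words from the top of the frequency list.
    freq_vocab = set(frequency_list[:len(train_vocab)])

    # Step 4: apply both lists to the held-out conversations.
    return coverage(train_vocab, test), coverage(freq_vocab, test)
```

Calling `run_experiment(list_of_eight_texts, frequency_words)` returns the two coverage figures side by side; the prediction above is that the second number comes out higher.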

Just to make this a bit more interesting, I'll make an amusing wager: If my prediction is wrong, I'll post a sound file of me reading Raymond Devos's classic Parler pour ne rien dire (video, text). This is a brilliant stand-up speaking routine using 461 words, of which 135 are unique, and all but 2 or 3 of which would be understandable to an English speaker who only knew the top 1000 words from a frequency list. But this speech shows an incontestable mastery of idiomatic French that would stomp any C2 speaking exam into oblivion if delivered spontaneously.

Are you game? Pick 8 conversations, save them as text, and PM me to get an email address.
6 persons have voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 3913 days ago

2704 posts - 5424 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 278 of 309
23 September 2014 at 4:51pm | IP Logged 
Before I head off to work, and before I get on to my favourite topic, I want to commend all the great posters who
have been actively contributing to this thread that I never thought would last this long. The usual heavy hitters
are here. And the usual jokers of course. But people are people.

To try to resolve this issue of the vocabulary sizes for the CEFR speaking tests, a possible solution would be a
grid that looks something like this:

For C2 speaking:

Receptive vocabulary: 8000
Reserve productive vocabulary: 600
Vocabulary likely used: 300

Definitions:
Receptive vocabulary: unique words that the user can recognize with primary meaning
Reserve productive vocabulary: words that the user has used or feels comfortable knowing how to use
Vocabulary likely used: the unique words the examinee will use in the speaking test

Before people start screaming at me, let me say that those figures are pure guesstimates. Feel free to play with
them.
1 person has voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 3913 days ago

2704 posts - 5424 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 279 of 309
23 September 2014 at 5:02pm | IP Logged 
emk wrote:
...
So how can we construct a model of "Harry Potter French"? Let's start by taking books 1, 3 and 5, and learning
every word that appears in them. Then we try to apply this vocabulary to books 2, 4 and 6. I think we'll find that
there are two things "wrong" with our vocabulary:

1. We'll have wasted too much time learning words like sornettes that might only appear once or twice in the
entire series.
2. We'll have omitted a bunch of semi-common words that appear in books 2, 4 and 6 but not in books 1, 3 and
5.

So "overfitting" means that we've wasted vocabulary "slots" on genuinely weird and peculiar words from the odd
numbered books, slots that we could have used better by simply knowing common French words instead. A
better strategy would be to combine a generic frequency list with, say, 1,000 words we learned from books
1, 3 and 5.

Now, to be fair, Harry Potter is almost half of an original Super Challenge by itself. It might be just big
enough to be a good model for "French popular fiction." And so maybe we wouldn't see too much overfitting.

...

I think this is what I understood by overfitting, but I'm not trying to create a model of Harry Potter French. That's
not the goal. I'm reading one book after the other. I'm learning the French of Harry Potter as I go along. Why
would I want to incorporate a list of generic common words, many of which are not used in the books, to my
French from Harry Potter?
1 person has voted this message useful



Serpent
Octoglot
Senior Member
Russian Federation
serpent-849.livejour
Joined 5080 days ago

9753 posts - 15777 votes 
4 sounds
Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese
Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish

 
 Message 280 of 309
23 September 2014 at 5:08pm | IP Logged 
s_allard, you seem to think that everyone is scared that the topic will change, and doesn't dare to speak until they've learned 5000 words or 15000 words or whatever.

Now, emk'll correct me if I'm wrong, but I'm sure he was glad to be asked about that sci-fi book in the bakery. Maybe even delighted :-) Most learners want to participate in this kind of conversation, are happy to change the subject, want to be asked about the book they happen to be reading etc.

Really, most HTLAL'ers aren't afraid of that. Many also don't have to be afraid of the awkward situations that could be improved significantly with 300 well-known words and more grammatical accuracy - in other words, most of the members don't need anything but their native language and English in their daily life, we don't routinely get embarrassed about not speaking an official language of our country and walk away thinking "okay, I already know 294 words, 4706 to go and then I can participate in conversations". And yes, I'm aware that being able to put it off is a luxury, but other Canadian luxuries more than make up for it.

I don't think anyone questions that your strategy is useful in Canada, especially in truly bilingual cities. The trouble is your generalization from your experience as a teacher (and learner?) to language learning in general. In fact, in these favourable conditions that you take for granted, most members have already learned the language in question (English/French in Canada, Swedish in Finland, English in much of Europe, to some extent even Spanish in the USA).


Edited by Serpent on 23 September 2014 at 5:17pm



1 person has voted this message useful



Copyright 2020 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.