
300-word High Proficiency Kernel Concept



emk
Diglot
Moderator
United States
Joined 5327 days ago

2615 posts - 8806 votes 
Speaks: English*, French B2
Studies: Spanish, Ancient Egyptian

 
 Message 25 of 80
29 September 2014 at 4:40pm | IP Logged 
s_allard wrote:
What I find interesting here is that after I have gone to the trouble of actually giving an example of such a kernel
that I am presently using and refining, all I see is some of the usual petty nitpicking about whether 300 is the
right figure. I'm interested in hearing from the people who think that this sample kernel is completely useless.

Unfortunately, I can't help you here the way that I can in French. I know barely any Spanish vocabulary, and I can only occasionally understand it if I pretend it's badly pronounced French. But I congratulate you for actually making a list and trying to see what you can do with it. The next step would be to find some typical C2-level conversations and compare them against your list, and see how much vocabulary is missing—and whether it's subject-specific vocabulary, or general-purpose vocabulary that just happens to be a bit rarer.
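If you want to automate that check, a few lines of Python are enough. A minimal sketch, assuming a plain-text transcript and a one-word-per-line kernel file (both file names invented); it matches raw word forms rather than word families, so it will also flag inflected forms of words you already know:

```python
import re
from collections import Counter

def tokenize(text):
    # Crude tokenizer: lowercase runs of letters, including Spanish accents.
    return re.findall(r"[a-záéíóúüñ]+", text.lower())

# Load the (hypothetical) kernel list, one word per line.
with open("kernel_300.txt", encoding="utf-8") as f:
    kernel = {line.strip().lower() for line in f if line.strip()}

# Load a (hypothetical) transcript of a C2-level conversation.
with open("c2_conversation.txt", encoding="utf-8") as f:
    tokens = tokenize(f.read())

counts = Counter(tokens)
covered = sum(n for w, n in counts.items() if w in kernel)
print(f"Token coverage by the kernel: {covered / len(tokens):.1%}")

# The interesting part: which words the kernel misses, most frequent first.
missing = {w: n for w, n in counts.items() if w not in kernel}
for word, n in sorted(missing.items(), key=lambda kv: -kv[1])[:25]:
    print(f"  {word:20} {n}")
```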

Serpent wrote:
Why do you keep on making assumptions instead of trying to find out what polyglots actually do? People like Prof Argüelles definitely maintain a 20k vocabulary at least passively. They (we?) do get help from cognates, borrowings and etymological connections, but most don't care about "apparently" sounding better than they really are. At least in their best languages polyglots definitely have pretty large vocabularies.

Alexander Arguelles is a fascinating example, because he has very high text coverage in something like 20 languages. Here's a sample from his "About" page:

[table of per-language text-coverage percentages, not reproduced here]
Now, compare these coverage numbers to the tables in Nation's paper (which I believe to be pretty reliable). You can see that 98% coverage requires vocabulary sizes in the 5,000 to 15,000 word range, depending on the type of materials measured and the measuring procedure.
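For anyone curious where such numbers come from, the basic procedure is easy to sketch: rank the words of a corpus by frequency and watch token coverage climb as the allowed vocabulary grows. The corpus file here is hypothetical, and counting raw word forms instead of word families understates coverage at any given size:

```python
import re
from collections import Counter

# Read a (hypothetical) corpus and rank its word forms by frequency.
with open("corpus.txt", encoding="utf-8") as f:
    tokens = re.findall(r"[a-z']+", f.read().lower())

freq = Counter(tokens)
total = len(tokens)

# Walk down the frequency ranking, accumulating token coverage.
running = 0
for rank, (word, n) in enumerate(freq.most_common(), start=1):
    running += n
    if rank in (300, 1000, 2000, 5000, 10000, 15000):
        print(f"top {rank:>6} word forms -> {running / total:.2%} coverage")
```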

Now, I have learned a second language to a level which would appear on his chart, and it's a major effort. Of course, Alexander Arguelles is getting a huge cognate discount across the Germanic and Romance families, and he only pays "full price" when he learns a totally unrelated language like Korean. To achieve these remarkable results, he apparently spent much of his youth waking up before dawn so that he could get in at least 5 hours of studying per day, and for many years he lived a nearly monastic lifestyle.

So yes, at least some of the most impressive polyglots out there really do know something like 10,000 word families in each of 10 or more languages, minus a cognate discount. This is a remarkable achievement, and it deserves to be honored, and not merely "explained away" as a party trick involving a few hundred words per language.

Edited by emk on 29 September 2014 at 4:44pm

7 persons have voted this message useful





Iversen
Super Polyglot
Moderator
Denmark
berejst.dk
Joined 6498 days ago

9078 posts - 16473 votes 
Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan
Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian

 
 Message 26 of 80
29 September 2014 at 5:12pm | IP Logged 
s_allard wrote:
If we ignore the silly chatter and get back to uses of the high-proficiency kernel, I think it goes a long way to explaining how well-known polyglots and hyperpolyglots are able to keep a large number of languages active. By concentrating on the kernel of each language, way under 1000 words, these polyglots can easily maintain an apparently high level of proficiency in all their languages. That's not meant to diminish their achievements in any way. Keeping that kernel up to scratch is not very easy. But it's easier than trying to maintain 10,000 and 20,000 word vocabularies.


I'm not sure that it is easier to learn 1000 words from random conversations than it is to learn 10,000 words using more or less formalized techniques. But if people can pick up words in that way, then they probably also learn to use them along the way.

It is fairly obvious that anybody who attempts to speak or write in a language needs to know its core vocabulary, which includes the typical 'grammar words'. Whatever you add after that should depend on your interests rather than on some kind of prefabricated frequency list. For instance, the words for kitchen utensils should be learned if you want to speak about cooking, and the way to learn that vocabulary is to read stuff about kitchens and cooking - and maybe use some kind of memorizing technique to make the words stick. At least that's what I do to boost my vocabularies - but that in itself isn't enough to make those words active. There the methods used to acquire the first 1000 words through informal chatting (and limited use of textbooks and dictionaries to support the effort) may be more relevant.

Btw: the word "drivel" about messages from other members (as in the last message on the preceding page) should be avoided. It may be relevant in discussions about child care, but not in discussions about language learning.

Edited by Iversen on 29 September 2014 at 5:34pm

5 persons have voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 5225 days ago

2704 posts - 5425 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 27 of 80
29 September 2014 at 5:13pm | IP Logged 
Jeffers wrote:
...

If I correctly understand what you're saying in this thread, you are suggesting that there is a core of high
frequency words that learners should learn first, after which they should learn other useful words? Maybe they
should just save time and use a frequency list. Okay, you say the kernel is customized based on the individual's
needs, but it's not going to differ significantly in the first 300 words or so.

...

You do not understand correctly. Maybe my explanations have not been clear although I explicitly said that this
kernel is not based solely on frequency. This is not a frequency list constructed by pooling all the words of all the
speakers. This list first of all starts with a specific speaking genre and then attempts to identify the words that
are most widely used by speakers. A tiny number of words are used by all speakers: grammar words and auxiliary
verbs. Another set of words are shared by a large number but not all speakers. And finally, we drill down to those
words that are specific to the topic.

I'm interested in the first two categories. I say that we have a set of around 300-350 units. What some people don't understand here is that I am not saying that everybody uses only these words. I say they choose from this set a very significant number of words but add others that they prefer or that are related to the topic. I should point out that there is also significant regional variation. But nobody uses over 300 words in their 16-18 minutes of speaking.
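Anyone who doubts this can check it against a transcript directly: count the distinct word forms a speaker actually produces. A minimal sketch (the transcript file name is made up; counting raw inflected forms will, if anything, overstate the total compared with counting dictionary words):

```python
import re

# Read a (hypothetical) transcript of one 16-18 minute talk.
with open("ted_talk.txt", encoding="utf-8") as f:
    text = f.read().lower()

# Crude Spanish tokenizer: runs of letters, including accented ones.
tokens = re.findall(r"[a-záéíóúüñ]+", text)

print(f"{len(tokens)} running words, {len(set(tokens))} distinct word forms")
```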

Let me give a specific example. Here is Oscar Villalobos at TEDxDF in Mexico, in a talk titled "Es posible lo imposible" ("The impossible is possible").

The title of the talk is right out of my list. The first thing he says is:

Siempre que tengo la oportunidad de convivir con un público tan diverso como ustedes es un reto.
Whenever I have the chance to spend time with an audience as diverse as you are, it is a challenge.

Most of the words, but not all, are from my list. What I'm saying is that the speaker has basically selected from
my list the words that he wants to use and added others. This is a far cry from everybody trying to make do with
the 300 most frequent words in Spanish.

What goes without saying, and this is where most people get confused, is that the real challenge is to take those
words and to put them into coherent and fluent sentences.
1 person has voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 5225 days ago

2704 posts - 5425 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 28 of 80
29 September 2014 at 5:27pm | IP Logged 
emk wrote:
...

So yes, at least some of the most impressive polyglots out there really do know something like 10,000 word
families in each of 10 or more languages, minus a cognate discount. This is a remarkable achievement, and it
deserves to be honored, and not merely "explained away" as a party trick involving a few hundred words per
language.


I have no doubt that the great polyglots do "know" large numbers of word families in each of their languages.
But when it comes to actually speaking the language - that is actually opening their mouths for us to hear - I
believe that they all use this kernel idea.

I find it amusing that we talk about these huge vocabularies when in fact we have little actual evidence of huge
speaking vocabularies. As I have argued many times, we take tiny examples of people speaking and conclude
that they are great speakers of the language. Something I actually agree with. But we have no examples of any of
these people giving TED-like talks in many languages. This would really impress me.
1 person has voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 5225 days ago

2704 posts - 5425 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 29 of 80
29 September 2014 at 5:41pm | IP Logged 
Doitsujin wrote:
@s_allard: I find your idea interesting; it reminds me a lot of Elisabeth Smith's book Instant Spanish, which she also adapted for a couple of other languages.
BTW, Smith manages to create meaningful (tourist level) sentences using only 31 verbs. (The whole book teaches about 390 words, including inflected verb forms.)

How did you go about selecting the words for your list?

A good question that I've already answered, but I think it's worth looking at again because it's so fundamental.
Basically I looked at around 15 TED talks in Spanish. They represent the genre. Without actually making
transcripts, I tried to identify the words that are most commonly used across all the talks. The verbs ser, estar,
hacer, haber, for example, appear in every single talk. That's simple. Then I looked at other words that are
common across the various samples. Given the nature of the genre, certain kinds of content words keep coming
back. Things like el problema, la comunicación, la crisis, la situación. Then it was a question of whittling the
numbers down to around 300. As I said earlier, I really think that 350 would be more useful.
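I did this selection by eye and ear rather than by script, but with transcripts in hand the same logic could be automated: rank words by how many talks they occur in, break ties by overall frequency, and cut the list at around 300. A rough sketch (the directory of transcripts is hypothetical):

```python
import re
from collections import Counter
from pathlib import Path

doc_freq = Counter()  # how many talks each word appears in
tot_freq = Counter()  # how often each word appears overall

# Read (hypothetical) transcripts of ~15 Spanish TED talks.
transcripts = sorted(Path("ted_es").glob("*.txt"))
for path in transcripts:
    tokens = re.findall(r"[a-záéíóúüñ]+",
                        path.read_text(encoding="utf-8").lower())
    tot_freq.update(tokens)
    doc_freq.update(set(tokens))  # count each word once per talk

# Words shared by the most talks come first; frequency breaks ties.
ranked = sorted(doc_freq, key=lambda w: (-doc_freq[w], -tot_freq[w]))
kernel = ranked[:300]

universal = [w for w in kernel if doc_freq[w] == len(transcripts)]
print(f"{len(universal)} words occur in every talk, e.g. {universal[:10]}")
```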
1 person has voted this message useful



patrickwilken
Senior Member
Germany
radiant-flux.net
Joined 4328 days ago

1546 posts - 3200 votes 
Studies: German

 
 Message 30 of 80
29 September 2014 at 5:55pm | IP Logged 
s_allard wrote:

Let me give a specific example. Here is Oscar Villalobos at TEDxDF in Mexico, in a talk titled "Es posible lo imposible" ("The impossible is possible").

The title of the talk is right out of my list. The first thing he says is:

Siempre que tengo la oportunidad de convivir con un público tan diverso como ustedes es un reto.
Whenever I have the chance to spend time with an audience as diverse as you are, it is a challenge.

Most of the words, but not all, are from my list. What I'm saying is that the speaker has basically selected from
my list the words that he wants to use and added others. This is a far cry from everybody trying to make do with
the 300 most frequent words in Spanish.


I don't think I fall in the non-drivel category - presumably that kernel of worthy speakers on HTLAL is much smaller than 300 and I would not expect to be blessed by such an exalted state.

However, for the sake of talking to the wind:

If we were to use a simple word frequency list, the first 300-350 most common words would cover a lot of the words spoken in a corpus. No doubt about it. If we were to go to a particular corpus and hand-pick the 300-350 most common/relevant words, by definition we could do even better in our coverage for that particular corpus.

But as you say yourself: the speeches are composed of the 300-word kernel PLUS A WHOLE LOT OF OTHER WORDS THAT THE SPEAKERS CHOOSE TO INCLUDE. What most of us are interested in is how important these other words are. How big is this second kernel? How do the speakers pick these extra words? Do they pick up a bilingual dictionary every time they want to speak so they can include words outside the kernel? Or have they already memorised a much bigger kernel of words that they can draw upon as needed? And if so, how big is this second invisible kernel?
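That second kernel is at least measurable in principle. A sketch, assuming we had the 300-word list and a directory of talk transcripts as plain-text files (both names invented): for each talk, count the distinct words that fall outside the list, then take the union across talks.

```python
import re
from pathlib import Path

# Load the (hypothetical) 300-word kernel, one word per line.
with open("kernel_300.txt", encoding="utf-8") as f:
    kernel = {line.strip().lower() for line in f if line.strip()}

second_kernel = set()
for path in sorted(Path("ted_es").glob("*.txt")):
    types = set(re.findall(r"[a-záéíóúüñ]+",
                           path.read_text(encoding="utf-8").lower()))
    extras = types - kernel  # distinct words this talk uses beyond the kernel
    second_kernel |= extras
    print(f"{path.name}: {len(extras)} distinct words outside the kernel")

print(f"Union across all talks: {len(second_kernel)} extra words")
```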

Putting that aside, I am perfectly willing to accept that you can have a conversation (or perhaps better, a monologue) with a kernel of 300 words. What does that prove, though?

You can also write a book about quasars and relativity theory using the 1000 most common words in English. That doesn't mean you can have a conversation with a physicist about relativity using the same 1000 words. And forget about reading any books, even popular ones, about black holes or dark energy (even though "black", "hole", "dark" and "energy" are probably on your 1000-word list).

Edited by patrickwilken on 30 September 2014 at 12:36am

5 persons have voted this message useful



Ari
Heptaglot
Senior Member
Norway
Joined 6377 days ago

2314 posts - 5695 votes 
Speaks: Swedish*, English, French, Spanish, Portuguese, Mandarin, Cantonese
Studies: Czech, Latin, German

 
 Message 31 of 80
29 September 2014 at 6:31pm | IP Logged 
FWIW, I think you're talking about some interesting stuff here, s_allard. I'm interested in seeing where you plan on taking this. Having this kernel, what do you suggest we do with it? You're obviously not saying that these are the only words we need to learn, so what should we do with these words that we don't do with the other words? Should we try to learn all the expressions and idiomatic uses of these words to master them completely, since these are the extra important words? Or should we simply practice using these words to construct sentences, working on formulating our ideas in ways that use these words, allowing us to skip over many holes in our vocabulary? I see many ways this kind of focused approach can be helpful. Instead of knowing 10,000 words superficially, maybe we can learn 5,000 words superficially and 350 words deeply, and end up with a superior communicative ability? Is that the sort of direction you're taking this?
1 person has voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 5225 days ago

2704 posts - 5425 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 32 of 80
29 September 2014 at 6:37pm | IP Logged 
The problem that people are having is that they are trying to do everything with the same 300 high-proficiency
words. The trap here is thinking in traditional terms of word coverage. If I take all the TED talks in Spanish and
apply my 310 word list, what coverage will I get? I don't know and I don't care. The point here is that we are not
trying to prove that you can understand all the TED talks with 310 words.

What I'm claiming is that to give a TED talk you can start with a kernel like this, discard the words you don't need,
add the ones you need and you're good to go. The total number of words you use in your entire speech will
probably be less than 300.

I'm not suggesting that this is what users do consciously, but for learners like ourselves, the idea behind developing a tool like this is that it can be something of a study guide to the high-value words and micro-structures that you will likely want to use if you are ever giving a TED talk or planning on passing a C-level exam.

For Pete's sake, nobody is suggesting that you only learn these 300 words. Now a good question that is implicit
here is: Don't I need thousands of other words on the tip of my tongue to cover all the possible subjects that may
come up in a C level speaking exam?

You'll need to be able to comprehend more than 300. You'll need to be able to use more than the 300 here. But,
and again for speaking purposes, how many more? I certainly don't have a figure but I believe it is a relatively
small number because of the nature of the exams where the emphasis is on mastery of the language and not on
the size of speaking vocabulary.


1 person has voted this message useful


