How many English words do you know? (Specific Languages) Language Learning Forum

How many English words do you know?
Tags: Number of words \| Placement Test \| English


Iversen
Super Polyglot
Moderator
Denmark
berejst.dk
Joined 6703 days ago
9078 posts - 16473 votes

Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan
Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian

Message 9 of 21

30 January 2014 at 10:25am | IP Logged

It may for all sorts of mathematical reasons be problematic to take your positive score on real words and subtract the number (or percentage) of false positives on non-existing words. But it is also a problem because it assumes that you know precisely X words, and that you always can separate known from unknown words. But that's not how things are in the real world.

Ideally you should add the number (or percentage) of words you understood perfectly, but mistrusted because you couldn't say with certainty that you had seen them. I skipped several correct answers fearing the punishment for false positives, but didn't get any bonus for doing so. In my own latest statistics I have a category for 'guessable words', and because I take the words from a dictionary it is fairly certain that they all exist. But sometimes it is difficult to see the difference between 'known' and 'guessable'.

One afterthought: at the polyglot conference in Budapest in 2013 Anthony Lauder (aka Splog) claimed that one of the main differences between polyglots and polynots is that the former are very adept at guessing. Isn't it then a bit problematic to punish guessing harder than not answering a question? I'll have to look at the d' index to see whether it really "is independent of your response bias - a number that doesn't get affected by whether you guess a lot or not at all", as stated by Patrickwilken.

Edited by Iversen on 30 January 2014 at 5:19pm
1 person has voted this message useful

daegga
Tetraglot
Senior Member
Austria
lang-8.com/553301
Joined 4521 days ago
1076 posts - 1792 votes

Speaks: German*, English^C2, Swedish, Norwegian
Studies: Danish, French, Finnish, Icelandic

Message 10 of 21

30 January 2014 at 10:55am | IP Logged

Quote:

You said yes to 60% of the existing words.

You said yes to 0% of the nonwords.

Now what does that mean exactly? :)

I also responded yes to a few words I didn't know but which seemed very likely to be an
English word. There were a few of those I wasn't sure enough, so I pressed "no".
If the question was purely about discriminating words and non-words, I'd have pressed
"yes" more often.

edit: 2 more runs
as a discrimination test: 70/10
as a vocabulary test (have I seen this word before?): 39/0

The 39/0 seems a lot more realistic to me than the 60/0.

Edited by daegga on 30 January 2014 at 11:28am
1 person has voted this message useful

patrickwilken
Senior Member
Germany
radiant-flux.net
Joined 4533 days ago
1546 posts - 3200 votes

Studies: German

Message 11 of 21

30 January 2014 at 11:42am | IP Logged

Iversen wrote:

That's exactly why SDT was developed. The simple intuitive idea that you know something with either 100% or 0% certainty is just not plausible. SDT assumes an informationally noisy internal representation.

Iversen wrote:

One afterthought: at the polyglot conference in Budapest in 2013 Anthony Lauder (aka Splog) claimed that one of the main differences between polyglots and polynots is that the former are very adept at guessing. Isn't it then a bit problematic to punish guessing harder than not answering a question? I'll have to look at the d' index to see whether it really "is independent of your response bias - a number that doesn't get affected by whether you guess a lot or not at all", as stated by Patrickwilken.

Check out any article on "Signal Detection Theory" and you should get a reasonable explanation of d' and other measures. The whole rationale of SDT is to create sensitivity measures independent of response bias.

What should a sensitivity measure do? It should indicate how well you can say a signal is present when it is (Hit Rate), and additionally how well you can say a signal isn't there when it isn't (1 - False Alarm rate).

Hit rates on their own are useless, because you don't know how much of the Hit Rate is composed from lucky guesses. So you measure the guessing rate independently (the False Alarm rate) to allow you to adjust and create a 'pure' sensitivity measure.
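For anyone who wants to see the arithmetic, here is a minimal sketch of that adjustment, assuming the textbook equal-variance Gaussian form of SDT, where d' = z(Hit Rate) - z(False Alarm rate). The function name `d_prime`, the `n_trials` parameter and its default of 100 are illustrative assumptions; nothing in this thread says which formula or trial count the test site actually uses. Rates of exactly 0% or 100% (like the 60/0 result above) would give infinite z-scores, so the sketch clips them, which is one common convention among several.

```python
from statistics import NormalDist

def d_prime(hit_rate, fa_rate, n_trials=100):
    """Sensitivity index d' = z(hit rate) - z(false-alarm rate).

    Assumes the equal-variance Gaussian SDT model. n_trials is a
    hypothetical trial count used only to clip rates of exactly 0 or 1.
    """
    # Clip to [1/(2N), 1 - 1/(2N)] so the inverse normal CDF stays finite.
    lo = 1.0 / (2 * n_trials)
    hi = 1.0 - lo
    h = min(max(hit_rate, lo), hi)
    f = min(max(fa_rate, lo), hi)
    z = NormalDist().inv_cdf  # inverse CDF of the standard normal
    return z(h) - z(f)

# Equal hit and false-alarm rates mean no sensitivity at all.
print(d_prime(0.50, 0.50))
# Same hit rate, fewer false alarms: higher sensitivity.
print(d_prime(0.60, 0.10), d_prime(0.60, 0.00))
```

The point of the second line of output is that the same 60% hit rate corresponds to different sensitivities depending on how often you also say yes to non-words.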

Guessing is not punished. The whole idea is that decisions are essentially noisy and you can't avoid making some false positives and some false negatives when making a judgment.

It's assumed that your sensitivity is fixed (here your ability to distinguish words from non-words). The only thing you can change is your bias/criteria/threshold when you say one or the other.

Imagine: (1) I'll give you 100 Euros for every word you correctly identify in the test; or (2) You'll give me 100 Euros for every non-word you incorrectly classify as a word.

In 1 you'll make a lot of false alarms (i.e., saying non-words are words) because you are trying to maximize your winnings (ideally you'd say everything is a word). In 2 you'll be very careful not to make mistakes and so your False Alarm rate (saying a word is there when it isn't) will be very low.

In both cases your underlying knowledge of words (your 'sensitivity') stays the same, just your criterion bias changes.
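The two-payoff thought experiment can be sketched numerically. This again assumes the equal-variance Gaussian model, in which a fixed sensitivity d' and a criterion c give Hit Rate = Φ(d'/2 - c) and False Alarm rate = Φ(-d'/2 - c). The d' of 1.5 and the criterion values of -1.0 and +1.0 are made-up numbers for illustration, not data from the test.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()

def rates(d_prime, criterion):
    """Hit and false-alarm rates for a fixed d' and a given criterion."""
    hit = STD_NORMAL.cdf(d_prime / 2 - criterion)
    fa = STD_NORMAL.cdf(-d_prime / 2 - criterion)
    return hit, fa

def recovered_d_prime(hit, fa):
    """Recover d' from the two observed rates."""
    return STD_NORMAL.inv_cdf(hit) - STD_NORMAL.inv_cdf(fa)

# Hypothetical criteria: negative = eager to say "word" (scenario 1),
# positive = reluctant to say "word" (scenario 2).
for label, c in [("scenario 1, liberal", -1.0), ("scenario 2, conservative", 1.0)]:
    hit, fa = rates(1.5, c)
    print(f"{label}: hits={hit:.2f} false alarms={fa:.2f} "
          f"d'={recovered_d_prime(hit, fa):.2f}")
```

Both the hit rate and the false-alarm rate move a lot between the two scenarios, while the d' recovered from them stays at the value that was put in.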

SDT was developed to provide a pure measure of sensitivity. It's not perfect, but it's a lot better than a simple percentage correct.

---------

The main point for me is simply that I know from experience running visual experiments that d-primes larger than 2 are close to ceiling. Basically, if your d' is that high, tests are not useful for distinguishing performance (the same is true if d' is very close to zero). A d' of +2 implies you have no difficulty distinguishing the two stimuli (here 'words' and 'non-words').

Another thought is that this test is really about estimating your ability to distinguish between two stimulus classes ('words' and 'non-words'), not directly about how many words you know in English. Your sensitivity will obviously depend on the nature of the non-words. If all the non-words appeared in red, and all the words in blue, then presumably everyone would be able to get close to perfect performance. I don't know how the 'non-words' were generated, but presumably there are certain assumptions being made here about what a realistic non-English word would look like (random letters would presumably be way too easy). It could be that the experiment is assessing the sorts of internal models of English words that native and non-native speakers have, by systematically manipulating certain structural properties of the non-words (e.g., perhaps non-native speakers are more likely than natives to think that words that end in "logy" are real words).

It might be that people score highly not because they know all the 'words', but because their internal model of what an English word is is really good (or at least really good at distinguishing between the words and the non-words on the test).

Edited by patrickwilken on 30 January 2014 at 2:17pm
2 persons have voted this message useful

dampingwire
Bilingual Triglot
Senior Member
United Kingdom
Joined 4665 days ago
1185 posts - 1513 votes

Speaks: English*, Italian*, French
Studies: Japanese

Message 12 of 21

30 January 2014 at 2:13pm | IP Logged

tastyonions wrote:

I said "no" to a number of words that were perfectly understandable
but I have never in my life seen or heard. Stuff like "instable", for example:
"unstable", yes, "instability", yes, but never "instable." Honestly if I saw "instable"
on a forum I would probably assume that the writer was a non-native speaker, even though
the word is apparently in at least some dictionaries... :-)

I've just checked in the OED. "instable" the adjective, meaning unstable, is marked as
rare. "instable", the verb meaning to stable a horse, is marked as both rare and
obsolete.

Interesting that they have such words in their list.

1 person has voted this message useful

beano
Diglot
Senior Member
United Kingdom
Joined 4622 days ago
1049 posts - 2152 votes

Speaks: English*, German
Studies: Russian, Serbian, Hungarian

Message 13 of 21

30 January 2014 at 2:40pm | IP Logged

Practicable is another word that exists in dictionaries but is rarely heard in real life. (I've heard it once, and assumed the guy had made a mistake).
1 person has voted this message useful

Ogrim
Heptaglot
Senior Member
France
Joined 4639 days ago
991 posts - 1896 votes

Speaks: Norwegian*, English, Spanish, French, Romansh, German, Italian
Studies: Russian, Catalan, Latin, Greek, Romanian

Message 14 of 21

30 January 2014 at 3:15pm | IP Logged

daegga wrote:

Quote:

You said yes to 60% of the existing words.

You said yes to 0% of the nonwords.

I think you point to one of the weaknesses of the test. I got a score of 76/0, but obviously I said yes to several words for which I would not be able to give a clear definition or a translation into another language; I just "knew" by intuition that they had to be real words.

On the other hand, this does mean that if you see those words in context, you are likely to get the meaning from how they are used. So as a word-recognition test it has its merits, but obviously it doesn't mean that you have an active vocabulary of 50,000 words in English, the way I read it.

1 person has voted this message useful

daegga
Tetraglot
Senior Member
Austria
lang-8.com/553301
Joined 4521 days ago
1076 posts - 1792 votes

Speaks: German*, English^C2, Swedish, Norwegian
Studies: Danish, French, Finnish, Icelandic

Message 15 of 21

30 January 2014 at 5:36pm | IP Logged

beano wrote:

Practicable is another word that exists in dictionaries but is rarely
heard in real life. (I've heard it once, and assumed the guy had made a mistake).

Was this guy by any chance a German (or French) native speaker? "praktikabel" is not
uncommon here, and I would use it in English without a second thought to be honest. Good
to know that I shouldn't.

Edited by daegga on 30 January 2014 at 5:43pm
1 person has voted this message useful

Medulin
Tetraglot
Senior Member
Croatia
Joined 4668 days ago
1199 posts - 2192 votes

Speaks: Croatian*, English, Spanish, Portuguese
Studies: Norwegian, Hindi, Nepali

Message 16 of 21

30 January 2014 at 6:52pm | IP Logged

''You said yes to 71% of the existing words.
You said yes to 7% of the nonwords.

This gives you a corrected score of 71% - 7% = 65%.

This is fairly high level for a native speaker.

Also help our colleagues who are investigating word associations.

Nonwords you responded YES to
            Seconds
slitly      11.999
tertiell    8.539''
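As a side note on the result quoted above, the corrected score is just the hit percentage minus the false-alarm percentage. A minimal sketch follows; the function name `corrected_score` is made up for illustration. With the whole-number percentages shown, 71 - 7 gives 64, not 65. One possible explanation, and this is purely an assumption, is that the site subtracts unrounded percentages and rounds only for display. The values 71.4 and 6.8 below are hypothetical and are not taken from the test.

```python
def corrected_score(hit_pct, false_alarm_pct):
    """Yes-rate on real words minus yes-rate on non-words, in percent."""
    return hit_pct - false_alarm_pct

# Using the rounded percentages exactly as displayed.
print(corrected_score(71, 7))  # 64

# Hypothetical unrounded values that would display as 71% and 7%.
print(round(corrected_score(71.4, 6.8)))  # 65
```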

Edited by Medulin on 30 January 2014 at 6:57pm

1 person has voted this message useful
