translator2 Senior Member United States Joined 6854 days ago 848 posts - 1862 votes Speaks: English*
| Message 9 of 23 03 July 2011 at 5:28pm | IP Logged |
Iversen wrote:
Knowing the first 38% or so is like having the empty shelves ready in a supermarket, but all the goodies which make you come there are in the upper 62% |
|
|
Nice one!
1 person has voted this message useful
|
zuneybunny Diglot Newbie United States turkishtrip.wordpres Joined 4872 days ago 32 posts - 52 votes Speaks: English, Mandarin* Studies: Spanish, Turkish
| Message 10 of 23 04 July 2011 at 5:25am | IP Logged |
Quote:
For an English-speaker, Dutch vocabulary is fairly transparent. Not so Turkish
for a Mandarin speaker! |
|
|
I'm actually an English speaker. Yes, I did put down Mandarin as my first native language
(born in China), but I live in the US and mainly speak English :)
Quote:
Out of curiosity, approximately how many words would you need to know to be able
to read, say, 85% of the text? |
|
|
You would need 6105!
Obviously, the higher the % you want, the faster the number of words you need grows. For example, you
only need 2227 words for 70% comprehension, which is only about 1/3 of what you need for 85%!
But like some replied, this isn't really a percentage of comprehension. It's just nice
to know :D
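For anyone who wants to check figures like these on their own text, here is a minimal sketch of the coverage calculation: walk down the frequency list from the most common word and stop once the running total reaches the target share of all word occurrences. The function name `words_needed` and the toy counts are made up for illustration; they are not taken from the script or the book discussed in this thread.

```python
from collections import Counter


def words_needed(counts, target):
    """Return how many of the most frequent words cover `target` (0-1) of all tokens."""
    total = sum(counts.values())
    covered = 0
    # most_common() yields (word, frequency) pairs, highest frequency first.
    for rank, (_, freq) in enumerate(counts.most_common(), start=1):
        covered += freq
        if covered >= target * total:
            return rank
    return len(counts)


# Toy data only, not real counts from any book.
toy = Counter({"the": 50, "and": 25, "wand": 15, "owl": 6, "broom": 4})
print(words_needed(toy, 0.70))  # the two most frequent words cover 75 of 100 tokens
print(words_needed(toy, 0.85))  # a third word is needed to pass 85 of 100
```

As others said above, this measures the share of running words you would recognise, not how much of the text you would actually understand.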
1 person has voted this message useful
|
leosmith Senior Member United States Joined 6485 days ago 2365 posts - 3804 votes Speaks: English* Studies: Tagalog
| Message 11 of 23 04 July 2011 at 7:47am | IP Logged |
Iversen wrote:
Knowing the first 38% or so is like having the empty shelves ready in a supermarket, but all the
goodies which make you come there are in the upper 62% |
|
|
I think it's better to make all of these swimming-related:
Knowing the first 38% or so is like going to do the 100-meter crawl, but without your torso, head, or left arm.
2 persons have voted this message useful
|
jean-luc Senior Member France Joined 4895 days ago 100 posts - 150 votes Speaks: French* Studies: German
| Message 12 of 23 04 July 2011 at 10:03am | IP Logged |
As a side note, would you be willing to share your Python script? I would be interested in trying it on some texts I have.
2 persons have voted this message useful
|
zuneybunny Diglot Newbie United States turkishtrip.wordpres Joined 4872 days ago 32 posts - 52 votes Speaks: English, Mandarin* Studies: Spanish, Turkish
| Message 13 of 23 04 July 2011 at 4:42pm | IP Logged |
Quote:
As a side note, would you be willing to share your Python script? I would be
interested in trying it on some texts I have. |
|
|
Of course :)
http://codepad.org/9T0Ew6Cd
Replace "harrypotter.txt" with whatever your input text file is. It'll create a file
named "freqlist.txt" after you run it.
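In case the link stops working, here is a rough sketch of what a script like this might look like. It is not the code behind the link above; the function name `build_freqlist`, the tab-separated output layout, and the simple `\w+` tokenizer are all assumptions made for illustration, written for Python 3.

```python
import re
from collections import Counter


def build_freqlist(in_path, out_path="freqlist.txt"):
    """Count the words in in_path and write them to out_path, most frequent first."""
    with open(in_path, encoding="utf-8") as f:
        text = f.read().lower()
    # \w+ keeps runs of letters, digits and underscores and drops punctuation.
    # Note that this splits contractions such as "don't" into "don" and "t".
    counts = Counter(re.findall(r"\w+", text))
    with open(out_path, "w", encoding="utf-8") as out:
        for word, freq in counts.most_common():
            out.write(f"{word}\t{freq}\n")
    return counts
```

With this sketch you would call build_freqlist("harrypotter.txt") and get one word per line in "freqlist.txt", followed by its count.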
4 persons have voted this message useful
|
zerothinking Senior Member Australia Joined 6307 days ago 528 posts - 772 votes Speaks: English*
| Message 14 of 23 04 July 2011 at 5:44pm | IP Logged |
The content words which make up most of the meaning are the rarer words. Knowing 80% of
the words on the page does not mean understanding 80% of the text. This is something all
language learners will learn. I know I got a rude awakening at how much more I had to
learn when I first opened a French novel.
1 person has voted this message useful
|
jean-luc Senior Member France Joined 4895 days ago 100 posts - 150 votes Speaks: French* Studies: German
| Message 15 of 23 04 July 2011 at 10:25pm | IP Logged |
Thanks a lot, it works really well!
I just had to add «» to the regexp (and # -*- coding: utf-8 -*-
to the header) to use it on my German text.
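For others with non-English texts, a small sketch of the two usual ways to deal with guillemets, assuming Python 3 (where source files are UTF-8 by default, so the coding line is not needed, and `\w` already matches letters like ü and ö). The sample sentence and both patterns are my own illustration; I am not claiming either one is what the shared script does.

```python
import re

sample = "«Über den Fluss», sagte er. «Schön!»"

# Option 1: keep only runs of word characters. « and » are punctuation,
# so they are dropped without being listed explicitly.
words = re.findall(r"\w+", sample.lower())

# Option 2: replace an explicit set of punctuation marks with spaces,
# with the guillemets added to that set, then split on whitespace.
stripped = re.sub(r"[.,!?;:\"«»]", " ", sample.lower()).split()

print(words)
print(stripped)
```

Both options give the same word list for this sentence; the first is less work when a text uses punctuation you did not anticipate.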
1 person has voted this message useful
|
Cainntear Pentaglot Senior Member Scotland linguafrankly.blogsp Joined 5946 days ago 4399 posts - 7687 votes Speaks: Lowland Scots, English*, French, Spanish, Scottish Gaelic Studies: Catalan, Italian, German, Irish, Welsh
| Message 16 of 23 05 July 2011 at 8:38am | IP Logged |
zuneybunny wrote:
Quote:
For an English-speaker, Dutch vocabulary is fairly transparent. Not so Turkish
for a Mandarin speaker! |
|
|
I'm actually an English speaker. Yes, I did put down Mandarin as my first native language
(born in China), but I live in the US and mainly speak English :) |
|
|
Fair enough, sorry.
(But the same still holds true for Turkish for English speakers anyway....)
zuneybunny wrote:
Quote:
As a side note, would you be willing to share your Python script? I would be
interested in trying it on some texts I have. |
|
|
Of course :)
http://codepad.org/9T0Ew6Cd
Replace "harrypotter.txt" with whatever your input text file is. It'll create a file
named "freqlist.txt" after you run it. |
|
|
That is extremely useful.
Many, many thanks.
1 person has voted this message useful
|