
Extensive Reading Vocabulary Range Video

Language Learning Forum : Lessons in Polyglottery
38 messages over 5 pages
Teango
Triglot
Winner TAC 2010 & 2012
Senior Member
United States
teango.wordpress.com
Joined 5336 days ago

2210 posts - 3734 votes 
Speaks: English*, German, Russian
Studies: Hawaiian, French, Toki Pona

 
 Message 26 of 38
09 September 2011 at 3:13pm
I concur with Iversen that extensive reading, whilst very useful for learning and maintaining vocabulary, only forms part of the overall picture. For example, when I've trained language models on very large corpora in the past, they've been far from optimal for speech recognition and dialogue translation. This is because spoken language is very different to the type of language used in literature and news articles. When I later add even 10,000 words from a spoken corpus to my models, the results improve significantly.

I think it's also important to point out that television, films, music, Internet and radio, in addition to increasing our spoken vocabulary, all add to our expanding acoustic models and ability to deal with variation, not to mention our pragmatic and cultural understanding. This is more important than ever in these times of increased globalisation and travel, as languages and their various accents or dialects are in a continuous state of flux all around us. Including plenty of multimedia in my daily life over the years has enabled me, for example, to jump effortlessly from Rab's frustrations over the phone with automated helplines to Horsemouth's classic speech in the Jamaican comedy Rockers. Having access to a wide diversity of English is something I'm now really grateful for.

So whilst Professor Arguelles makes a valid point that there are generally more lower-frequency words in written material than speech, and acknowledging that novels offer a faster route to more input, I still think the best way forward is to aim for a synergy of multiple resources, and focus predominantly on having fun and developing a more rounded appreciation of the cultural variety and subtle creativity within language.

Edited by Teango on 09 September 2011 at 3:17pm

7 persons have voted this message useful



sipes23
Diglot
Senior Member
United States
pluteopleno.com/wprs
Joined 4650 days ago

134 posts - 235 votes 
Speaks: English*, Latin
Studies: Spanish, Ancient Greek, Persian

 
 Message 27 of 38
10 September 2011 at 12:32am
kagemusha wrote:
I think one of the main points was that an active reading vocabulary is larger than an
active speech vocabulary. Therefore by reading, you are forced to extend your vocabulary
quicker than by speech.


I wonder. Obviously the most frequent words, say the top 5-10 thousand, are the same for everyone. Everyone says
"hear" or "see" or "foot". But as you move into less frequently used words, different people have different sets in
active use. In other words, you say "wayward" and I prefer "errant". My neighbor, in turn, uses "wandering." Or
whatever. Since you teach English, your students hear "wayward" but not "errant". I work with immigrants; they
hear "errant", but never "wandering." To further complicate matters, you might only use the word once in a given
month. So even if your English students knew me, I may never say "wayward" in front of them. As a native
speaker, you know lots of native speakers and thus are statistically likely to hear all three of those words at some
point or another. Your English learners eventually go back to their home country. The immigrant goes back to his
ethnic neighborhood. Neither one spends enough time with the language to make a statistical difference,
and really, they've got a good enough command of the language for their purposes.

I'm guessing the people who hang out here are different. They *do* want to get enough exposure to make a
statistical difference, but don't want to spend years in conversational immersion to get those low-frequency
words. Extensive reading is the tool. You read lots of stuff by different people. And writers tend to like words
(why else become a writer?), so they use lots. (For example, Steadman's introduction to Herodotus says that
Herodotus uses 4207 unique words in Histories Book 1, and 2137 of those he uses only one time. The statistic was
handy.) Five books by five authors are likely to present more unusual low-frequency words than just daily
conversation with those same authors.
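
The Herodotus figures are two simple counts: distinct word types, and types that occur exactly once. A minimal Python sketch of how such counts could be taken from any plain text is below. The function name `type_and_hapax_counts` and the crude letters-only tokenizer are assumptions for illustration, not the method Steadman used, and inflected forms of the same word are counted as separate types.

```python
import re
from collections import Counter

def type_and_hapax_counts(text):
    """Return (number of unique word types, number of types used only once)."""
    # Crude tokenizer (an assumption): lowercase, then take runs of letters.
    words = re.findall(r"[^\W\d_]+", text.lower())
    counts = Counter(words)
    # A type that appears exactly once in the text is a "hapax legomenon".
    hapaxes = sum(1 for n in counts.values() if n == 1)
    return len(counts), hapaxes

sample = "the cat saw the dog and the dog saw a bird"
print(type_and_hapax_counts(sample))  # (7, 4)
```

Running the same function over conversation transcripts and over novels of similar length would be one rough way to check the suspicion that books surface more once-only words.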

Again, just my suspicion, and I'm too lazy to do the footwork to back it up with data.


5 persons have voted this message useful



Zwlth
Super Polyglot
Senior Member
United States
Joined 5006 days ago

154 posts - 320 votes 
Speaks: English*, German, Italian, Spanish, Russian, Arabic (Written), Dutch, Swedish, Portuguese, Latin, French, Persian, Greek

 
 Message 28 of 38
14 September 2011 at 6:03am
He's put up the 2nd part of the talk:

Selecting Appropriate Texts for Expanding Vocabulary Range Through Extensive Reading.

It's just as well worth watching as the 1st part, if not even more so.

5 persons have voted this message useful



sipes23
Diglot
Senior Member
United States
pluteopleno.com/wprs
Joined 4650 days ago

134 posts - 235 votes 
Speaks: English*, Latin
Studies: Spanish, Ancient Greek, Persian

 
 Message 29 of 38
16 September 2011 at 2:28am
Zwlth wrote:
It's just as well worth watching as the 1st part, if not even more so.


I'd say more so. Here he shows us the vocabulary analysis tool. The problem, and it's big, is that the tool only has
databases for English. If you had the databases for another language, I think the program would chew through the
text just as well. If.
2 persons have voted this message useful



sundog66
Tetraglot
Newbie
United States
Joined 4790 days ago

6 posts - 8 votes
Speaks: English*, Spanish, Mandarin, Esperanto
Studies: Russian

 
 Message 30 of 38
16 September 2011 at 6:00am
sipes23 wrote:
The problem, and it's big, is that the tool only has databases for English.



What makes this problem especially insidious, I think, is that every single inflectional variant of every word in
every word family has to be explicitly listed for this software to work. For languages with even modestly
sophisticated morphology, such as Spanish with its verbs, I would think that this would make the
necessary word database intractably large. More sophisticated software, with automatic recognition of novel but
regular inflectional forms, is probably needed.
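
As a rough illustration of what "automatic recognition of regular inflectional forms" could look like, here is a minimal Python sketch that maps present-tense forms of regular Spanish verbs back to a listed infinitive instead of requiring every form in the database. The headword list, the names `KNOWN_INFINITIVES` and `headword_for`, and the restriction to the present tense are all assumptions for illustration; this is not how the software in the video works.

```python
# Hypothetical headword list; a real database would hold thousands of entries.
KNOWN_INFINITIVES = {"hablar", "cantar", "comer", "vivir"}

# Present-tense endings of regular Spanish verbs, keyed by infinitive ending.
PRESENT_ENDINGS = {
    "ar": ["o", "as", "a", "amos", "áis", "an"],
    "er": ["o", "es", "e", "emos", "éis", "en"],
    "ir": ["o", "es", "e", "imos", "ís", "en"],
}

def headword_for(form):
    """Map an inflected form to a known infinitive, or None if nothing fits."""
    form = form.lower()
    if form in KNOWN_INFINITIVES:
        return form
    for infinitive_ending, endings in PRESENT_ENDINGS.items():
        for ending in endings:
            if form.endswith(ending):
                # Swap the personal ending for the infinitive ending
                # and accept the result only if it is a listed headword.
                candidate = form[: -len(ending)] + infinitive_ending
                if candidate in KNOWN_INFINITIVES:
                    return candidate
    return None

print(headword_for("hablamos"))  # hablar
print(headword_for("libro"))     # None
```

A sketch like this ignores irregular verbs, other tenses, and ambiguous forms (for example "como" is both a form of "comer" and a separate word), so it only shows the idea, not a usable analyser.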

But for anyone who wants to use this software for Mandarin, you're in luck. (Well, as long as you're content to be
running the statistics on characters rather than "words".)

Here is a list of 9,933 Chinese characters in decreasing order of frequency. The characters here can be pasted into files in the
right format for use with the software Prof. Arguelles demonstrates. The only thing, though, is that before you
analyze a text, you need to insert a space between each character so that the software recognizes each character
as a "word" for its calculations. This can be done in sed with the following command:

sed 's/./ &/g;s/^ //'

Another nice thing about using the software with Mandarin is that because Mandarin has no inflectional
morphology to worry about, you don't even need a character frequency database to use the software to calculate
the total number of unique character types in a text. I think this is a useful statistic in and of itself in gauging the
difficulty of a text.
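
For anyone without sed, the same two steps can be sketched in Python: space out the characters, and count the distinct character types in a text. The function names `space_out` and `unique_characters` are made up for this example, and punctuation marks are counted as characters unless you strip them first.

```python
def space_out(line):
    """Insert a space between every character of one line, like the sed command."""
    return " ".join(line)

def unique_characters(text):
    """Count the distinct non-whitespace characters in a text."""
    return len({ch for ch in text if not ch.isspace()})

sample = "我是学生我是老师"
print(space_out(sample))          # 我 是 学 生 我 是 老 师
print(unique_characters(sample))  # 6
```

The count needs no frequency database at all, which matches the point about Mandarin having no inflectional forms to group together.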
3 persons have voted this message useful



learnvietnamese
Diglot
Groupie
Singapore
yourvietnamese.com
Joined 4729 days ago

98 posts - 132 votes 
Speaks: Vietnamese*, EnglishC2
Studies: French, Mandarin

 
 Message 31 of 38
16 September 2011 at 9:33am
Thanks for the list of Chinese characters, Sundog66.

Quote:

sed 's/./ &/g;s/^ //'


Sometimes, I'm pleasantly surprised to find that the language of programming is quite..."concise" :D
1 person has voted this message useful



montmorency
Diglot
Senior Member
United Kingdom
Joined 4608 days ago

2371 posts - 3676 votes 
Speaks: English*, German
Studies: Danish, Welsh

 
 Message 32 of 38
19 September 2011 at 1:54am
learnvietnamese wrote:
Thanks for the list of Chinese characters, Sundog66.

Quote:

sed 's/./ &/g;s/^ //'


Sometimes, I'm pleasantly surprised to find that the language of programming is quite..."concise" :D



Technically, that is the language of "regular expressions".
A gentleman called Jeffrey Friedl wrote an excellent book about them:
"Mastering Regular Expressions".




3 persons have voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.