
How much time studying vocabulary?

 Language Learning Forum : Learning Techniques, Methods & Strategies
patrickwilken
Senior Member
Germany
radiant-flux.net
Joined 4326 days ago

1546 posts - 3200 votes 
Studies: German

 
 Message 137 of 350
17 May 2015 at 4:27pm | IP Logged 
rdearman wrote:

1) Find an electronic copy of the book I want to read.
2) First pass, break the book into individual words, then remove all duplicates. (remove previous words if this isn't the first book by checking previous files)
3) Second pass, go back through the book file and generate a csv with each individual word and the first one (possibly two) sentences in which the word appears.
4) From this csv turn the sentence examples into cloze deletion by removing the word from the sentences.
5) Third Pass, google for a translation of the words, then construct a spreadsheet where column 1 is the word, column 2 is the translation, 3 & 4 are cloze deletion sentence examples and the final column.


Readlang achieves most of this, though it doesn't give you a file before you read. Basically you read an EPUB version of a book and click on words you don't know in the text; these unknown words are replaced with an English Google Translate version; the sentence, L2 word and L1 translation are added to a specific file for each book you read; then you can download this file and load it into Anki. If you don't like the Google Translate translation, Readlang lets you easily access other dictionaries before downloading. You can download in lots of different formats, including those for cloze deletion if you so wish. I preferred my cards to be a bit easier, as shown above.

Ezy Ryder wrote:

What if you end up remembering the sentence, as opposed to the word itself? As in, you
won't even need to see the word, because the beginning of the sentence will be enough?
Also, reviewing sentences is slower.


I don't really trust introspection, but for what it's worth I sometimes end up remembering the sentence, but most often not. You do learn collocations of words though, which is a big advantage of this approach.

I often had more than one sentence for each word though, which helps you learn the word independently from the sentence. And if you haven't learnt it, then the next time you come across it you'll create a new card.

WRT speed it's hard to really say. Yes, of course it takes longer to read a sentence, but so long as you pick shorter sentences it's not a problem (the example sentence is a bit longer than I would normally use). As you learn these examples quickly you actually spend much less time per card than for simple word lists, so in terms of overall time it's much quicker. Also, sentences help reinforce many words at the same time, not simply the word listed as unknown.

This technique solved a certain problem for me. I quickly found that when reading native materials I saw many more unknown words than I could possibly learn via L1->L2 wordlists in SRS. If you know 98% of words in a novel that means you don't know 5 words per page or say 2000-3000 words per book (of course some of these unknown words are repeats - but a really surprising number are very low frequency words that appear once or perhaps twice in the text).
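The estimate above can be checked with a quick back-of-the-envelope calculation. This is only a sketch: the figure of about 250 words per page and the 400 to 600 page range are assumptions chosen for illustration, not numbers from the post.

```python
def unknown_word_estimate(known_percent, words_per_page, pages):
    """Return (unknown words per page, unknown words per book).

    known_percent  -- share of running words already known, e.g. 98
    words_per_page -- assumed page density (hypothetical, varies by edition)
    pages          -- assumed length of the book
    """
    per_page = (100 - known_percent) * words_per_page / 100
    return per_page, per_page * pages


# Assumed values: 98% coverage, 250 words per page, a 400 or 600 page novel.
for pages in (400, 600):
    per_page, per_book = unknown_word_estimate(98, 250, pages)
    print(f"{pages} pages: {per_page:.0f} unknown per page, {per_book:.0f} per book")
```

Under those assumptions the 2000-3000 figure corresponds to a fairly long novel; a shorter book or a denser page would shift the totals accordingly.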

To give you some examples from American Gods:

Einige in den Catskills, ein paar in den Freizeitorten in Florida.
- Recreational places

...fragte Gugwei, dessen Haar lang und weiß und dessen Gesicht so runzlig war wie die graue Haut eines Dornenbaums.
-Thorn tree - reminder of Dorn=thorn

Mad Sweeney ließ ein Messingfeuerzeug aufflammen, ein Fingerbreit der Zigarette glühte auf und wurde zu Asche. »
-Brass lighter - Messing=brass is useful.

Er ging in ein kleines Restaurant, wo er geräucherten Papageientaucher mit Moltebeeren und Seesaibling mit Dampfkartoffeln aß...
-Puffins (literally 'diving parrots' - cool word!).

Learning words via sentences allowed me to process about one book's worth of words (300 pages - +2000 cards) a month.

For me, learning the word wasn't about being able to give the German word when I heard the English equivalent in some sort of test. What mattered to me was that I could recognise the meaning of the word next time I saw it in context.

I think one reason this method is so much quicker than simply learning from word lists is that you generate much richer associations for words: recreational areas (Freizeitorten) are of course in Florida; a face as creased as the grey bark of a thorn tree; the character Mad Sweeney lighting a brass (Messing) lighter in a bar; smoked puffin (Papageientaucher) being eaten in a restaurant in Iceland. It helps of course that the sentences come from a book I've read and I can easily put them back into the context of the story.

Personally at this point I am happy to relax and just read without SRS, but this method is really effective at building up your passive vocabulary very quickly. Using word lists I needed to be very selective about what words I needed to learn; with this technique I could learn pretty much all the unknown words I came across.

Edited by patrickwilken on 17 May 2015 at 4:55pm

3 persons have voted this message useful



s_allard
Triglot
Senior Member
Canada
Joined 5223 days ago

2704 posts - 5425 votes 
Speaks: French*, English, Spanish
Studies: Polish

 
 Message 138 of 350
17 May 2015 at 4:28pm | IP Logged 
smallwhite wrote:
....
And secondly, it really doesn't have to be 8000 words, especially since our definition of "word" varies so greatly.
It's actually X words. I can't sort my 8k words by frequency, but here's 20 of them, randomly picked by Excel:
to listen to music ....... die Musik hören
to be right ....... recht haben
to close the door ....... die Tür zumachen
bartender m ....... der Barmann
garden hose ....... der Gartenschlauch
yellow card ....... die gelbe Karte
coloured pencil ....... der Buntstift
radiology ....... die Radiologie
ticket counter ....... der Fahrkartenschalter
appearance ....... der Anschein
chest (body) ....... die Brust
ant ....... die Ameise
purchase ....... der Kauf
to pull a tooth ....... einen Zahn ziehen
American m ....... der Amerikaner
comb ....... der Kamm
site, place ....... die Stätte
fairness ....... die Fairness
in written form ....... schriftlich
disk ....... die Diskette

The word "radiology" is relatively rare, but my cousin works in radiology and we last talked in February, so the
word feels relevant to me and I'm happy with it. The other words I don't think are rare or useless.

I have goals when I learn a language, and that includes a certain number of words. Some people don't set goals
or don't quantify them, of course.

I've stayed out of much of the debate recently because I thought that my intervention would probably have made
things worse. But now that we have an actual sample of what @smallwhite's list contains, we can have a better
understanding of how the whole thing works.

Although I never thought I'd be writing this, I think @smallwhite is being unfairly criticized here and given a bad
rap. I don't think the idea is to start learning a language by first memorizing 8000 words as quickly as possible.
That idea comes from that fellow who tried to learn Swedish that way. As soon as he reached 8100 words in 56
days, we stopped hearing from him. I wonder what happened.

@smallwhite writes "I have goals when I learn a language, and that includes a certain number of words." We are
not told what these goals are except for a number of words. If I remember correctly from this or the other similar
thread @smallwhite is more interested in passive knowledge of languages and much less in the actual ability to
write or speak with native speakers. The major priority seems to be reaching a passive knowledge of X number of
words.

It's only normal then that @smallwhite adopt an approach that combines grammar and lots of vocabulary. From
what I gather, the vocabulary component combines existing lists and material added by hand.

When I look at the snippet of the list, what I find interesting is that much of the material consists actually of little
phrases and not just single words as many of us might have believed. This is important because it gives some
contextual substance to the target words.

I believe this list is more of a memorization tool than a learning tool. It's probably great for reviewing and
keeping words alive.

In reality, I don't think this approach is so different from what most of us do. I have 6 notebooks of observations
on Spanish that I made when reading or listening. And nearly every day I add a couple of phrases. Who knows
how many entries are there. From time to time, I review this material to see if I've forgotten anything or if I'm
looking for inspiration.

I also have an Excel spreadsheet where I have my Spanish speaking kernel broken down into functional
categories. This I review from time to time just to make sure that I'm on track.

What I see in this debate is that basically we end up in the same place in terms of numbers of words known
passively. Most of us, myself included, get there by accumulating vocabulary as we go along. @smallwhite seems
to take the approach of bulking up the vocabulary in the beginning.

I think we have to look at the end results in terms of our initial goals. I probably would never use @smallwhite's
approach because speaking the language is my number one priority and I believe I can best start with a small
number of words and expand. But @smallwhite has different goals.
5 persons have voted this message useful



rdearman
Senior Member
United Kingdom
rdearman.org
Joined 5029 days ago

881 posts - 1812 votes 
Speaks: English*
Studies: Italian, French, Mandarin

 
 Message 139 of 350
17 May 2015 at 5:33pm | IP Logged 
Wow! So much interesting stuff. I'm not going to bother quoting everyone, just reply as much as possible.

First, I believe there is some misunderstanding in my use of the word efficiency. I'm not saying my 8000 common words from the Italian lexicon are the most efficient words to select; I'm saying smallwhite's method of using a spreadsheet to learn words is efficient. As she already said, the selection of words is up to the individual. So you could select your words from books you are going to read or have already read, from transcripts, from lexicons, whatever. Smallwhite pointed out that she selects words based on relevance to her: "the word feels relevant to me and I'm happy with it."

Obviously the system I described where you strip unique words from books prior to reading would actually make for a very efficient system of vocabulary learning. In that example I wouldn't bother to strip out variations of verbs; e.g. être, suis, est, sont, etc, etc. since it is still useful to learn them, and if you are seeing them in a cloze deletion system anyway, it's all good.
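For anyone who wants to try the book-stripping idea, here is a minimal sketch of the first few steps (unique words, first one or two example sentences, cloze deletion). It is an illustration only, not rdearman's actual script: the function names, the naive sentence splitter, and the column layout are all assumptions, and the translation column is left empty because the lookup step is not specified in the thread. As described above, inflected forms are kept as separate entries.

```python
import csv
import io
import re

# Runs of Unicode letters; digits, underscores and punctuation are skipped.
WORD_RE = re.compile(r"[^\W\d_]+")


def build_cloze_rows(text, known_words=frozenset(), max_examples=2):
    """Return [word, cloze_1, cloze_2, ...] rows for every new word in text.

    known_words holds lowercase words already covered by earlier books.
    Sentence splitting is naive: it breaks after '.', '!' or '?'.
    """
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]
    examples = {}  # word -> cloze sentences, in order of first appearance
    for sentence in sentences:
        words = dict.fromkeys(w.lower() for w in WORD_RE.findall(sentence))
        for word in words:
            if word in known_words:
                continue
            clozes = examples.setdefault(word, [])
            if len(clozes) < max_examples:
                # Blank out the target word to make the cloze deletion.
                pattern = re.compile(rf"\b{re.escape(word)}\b", re.IGNORECASE)
                clozes.append(pattern.sub("_____", sentence))
    return [
        [word] + clozes + [""] * (max_examples - len(clozes))
        for word, clozes in examples.items()
    ]


def rows_to_csv(rows):
    """Serialise rows as CSV text with an empty translation column."""
    buffer = io.StringIO()
    writer = csv.writer(buffer)
    writer.writerow(["word", "translation", "cloze_1", "cloze_2"])
    for word, *clozes in rows:
        writer.writerow([word, "", *clozes])
    return buffer.getvalue()


sample = "Le chat dort. Le chien dort aussi."
print(rows_to_csv(build_cloze_rows(sample, known_words={"le"})))
```

The header row assumes the default of two example sentences per word. The resulting file could then be filled in with translations and imported into a spreadsheet or flashcard program.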

My analysis of 1984 is problematic in this instance since the book is in Mandarin. So the question "How many unique words does it have?" is a problem. The reason is that Mandarin doesn't (normally) use spaces for word boundaries, and you just have to figure out whether a character combination is being used as a word or each character is being used individually. For example 浴室 is bathroom, the left character is bath, the right is room. 我拿着浴浴室。is I took the bath to the bathroom. Easy for you to work out, difficult for a computer. But it shouldn't be that hard for the computer to work this out for a Romance language.
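To illustrate why word boundaries are harder without spaces, here is a small sketch of forward maximum matching, one of the simplest dictionary-based segmentation approaches. The four-entry lexicon is made up for this one sentence; a real segmenter would need a large dictionary or a statistical model, and this greedy method can still pick the wrong split on ambiguous text.

```python
def max_match(text, lexicon, max_len=4):
    """Greedy forward maximum matching.

    At each position, take the longest substring (up to max_len characters)
    found in the lexicon; fall back to a single character if nothing matches.
    """
    tokens = []
    i = 0
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            candidate = text[i:i + size]
            if size == 1 or candidate in lexicon:
                tokens.append(candidate)
                i += size
                break
    return tokens


# Hypothetical toy lexicon covering only the example sentence.
lexicon = {"我", "拿着", "浴", "浴室"}
print(max_match("我拿着浴浴室。", lexicon))
```

With this lexicon the first 浴 is kept as a single character and the following 浴室 is matched as one word, which is the reading described above. Remove 浴室 from the lexicon and every character comes out on its own, which shows how much the result depends on the dictionary.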

I do have a copy of Shogun electronically, so after a couple of quick hacks I've determined there are 437,199 words and only 20,227 unique words. This is a very raw result and I haven't sanity checked for punctuation, Japanese language usage, etc, but you could say 20k unique words in Shogun as a valid result. So if you're going to read Shogun (English version) then you need a vocabulary of 20,000 words. I did the same analysis for "Of Mice and Men", and there are 29,760 words with only 2,977 unique words. So the lesson here is; Read Steinbeck.
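A count like this takes only a few lines to reproduce for a space-delimited language. The sketch below is not the "quick hack" used for the numbers above; the tokenising rule (runs of letters, optionally with one internal straight apostrophe, lowercased) is an assumption, and different rules will give somewhat different totals.

```python
import re
from collections import Counter

# Letters, optionally followed by an apostrophe and more letters ("don't").
TOKEN_RE = re.compile(r"[^\W\d_]+(?:'[^\W\d_]+)?")


def word_stats(text):
    """Return (total words, unique words, Counter of word frequencies)."""
    words = [w.lower() for w in TOKEN_RE.findall(text)]
    counts = Counter(words)
    return len(words), len(counts), counts


total, unique, counts = word_stats("The dog saw the other dog. Don't run!")
print(total, unique, counts.most_common(2))
```

To run it on a whole book, read the text file into a string first and pass that to `word_stats`. Inflected forms count as separate words here, so the unique count is an upper bound on the number of headwords.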

As far as word frequency memorisation goes, I'd like to compare my English to my other languages. I have had 50 years of constant bombardment of words in English. As a native English speaker they estimate I should know somewhere between 12,000 and 20,000 English words. The chances of me living 50 years in France, and another 50 years in Italy and another 50 years in China are slim, although I am hoping medical science will advance that much. So there isn't really any way for me to get to the 12,000 to 20,000 range in a second language without going out of my way to discover new words and to memorise them. So with my languages, if I want to approach C-level vocabulary knowledge in those three languages I need to learn between 36,000 and 60,000, three times the number of words I've managed to learn in the last 50 years!!! So telling people to take their time and they'll eventually come across the words is great advice for a 16 year old, but isn't such good advice for an old-timer like me.

Anyway, the upshot of all this is different strokes for different folks.

1 person has voted this message useful



Serpent
Octoglot
Senior Member
Russian Federation
serpent-849.livejour
Joined 6390 days ago

9753 posts - 15779 votes 
4 sounds
Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese
Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish

 
 Message 140 of 350
17 May 2015 at 6:13pm | IP Logged 
I hate to seem picky but let me point out that a diskette is a floppy drive (in English, too), and Fairness is a pretty specific word. I know that most of these things even out through input, though.

On the other hand it's good to see that many of these are collocations/phrases, presumably of the words that were learned before. What would be your estimate for headwords, smallwhite? Closer to 4-5k?
1 person has voted this message useful



rdearman
Senior Member
United Kingdom
rdearman.org
Joined 5029 days ago

881 posts - 1812 votes 
Speaks: English*
Studies: Italian, French, Mandarin

 
 Message 141 of 350
17 May 2015 at 6:20pm | IP Logged 
Serpent wrote:
I hate to seem picky but let me point out that a diskette is a floppy drive (in English, too)


err... technically no. A "Floppy disk" is in reference to the 5 1/4 inch disks which originally came out on the PC. They weren't enclosed in anything other than a semi-hard plastic covering and they "flopped around", i.e. they were bendable. A diskette is actually the 3 1/2 inch disk which was made from hard plastic and didn't actually bend. And although they became interchangeable for most people, to an IT guy they are different things. BTW it also winds me up when people call the computer case the "CPU".
3 persons have voted this message useful



Serpent
Octoglot
Senior Member
Russian Federation
serpent-849.livejour
Joined 6390 days ago

9753 posts - 15779 votes 
4 sounds
Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese
Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish

 
 Message 142 of 350
17 May 2015 at 6:36pm | IP Logged 
Okay, thanks for clarifying it further. Still, it isn't any disk but a very specific kind.
1 person has voted this message useful



daegga
Tetraglot
Senior Member
Austria
lang-8.com/553301
Joined 4314 days ago

1076 posts - 1792 votes 
Speaks: German*, EnglishC2, Swedish, Norwegian
Studies: Danish, French, Finnish, Icelandic

 
 Message 143 of 350
17 May 2015 at 7:14pm | IP Logged 
rdearman wrote:
Serpent wrote:
I hate to seem picky but let me point out that a diskette is a floppy drive (in English, too)


err... technically no. A "Floppy disk" is in reference to the 5 1/4 inch disks which originally came out on the PC. They weren't enclosed in anything other than a semi-hard plastic covering and they "flopped around", i.e. they were bendable. A diskette is actually the 3 1/2 inch disk which was made from hard plastic and didn't actually bend. And although they became interchangeable for most people, to an IT guy they are different things. BTW it also winds me up when people call the computer case the "CPU".


Both are called "Diskette" in German though, using the inch value to disambiguate if necessary.

Anyway, this raises another question that might be relevant. If we encounter words by reading and make L2 --> L1 cards, can we just reverse the direction? "eine Diskette" is some kind of disk, it might be enough to know that and the context will later tell you what kind of disk it is. But a disk is usually not translated to "Diskette" in German if you don't have any context, but "Scheibe" or "Platte" (like in "Festplatte"). Going from "Scheibe" to "Bandscheibe" (spinal disk) for example seems easier than to go from "Diskette" to "Bandscheibe" once you want to expand your knowledge.

edit:
As this example is from smallwhite's inventory and she lives in Australia, disk=Diskette actually makes perfect sense (as opposed to "disc" = "Diskette") ;) So the negative example only makes sense if you use American spelling.

Edited by daegga on 17 May 2015 at 7:36pm

1 person has voted this message useful



Jeffers
Senior Member
United Kingdom
Joined 4702 days ago

2151 posts - 3960 votes 
Speaks: English*
Studies: Hindi, Ancient Greek, French, Sanskrit, German

 
 Message 144 of 350
17 May 2015 at 7:22pm | IP Logged 
rdearman wrote:
Serpent wrote:
I hate to seem picky but let me point out that a diskette is a floppy drive (in English, too)


err... technically no. A "Floppy disk" is in reference to the 5 1/4 inch disks which originally came out on the PC. They weren't enclosed in anything other than a semi-hard plastic covering and they "flopped around", i.e. they were bendable. A diskette is actually the 3 1/2 inch disk which was made from hard plastic and didn't actually bend. And although they became interchangeable for most people, to an IT guy they are different things. BTW it also winds me up when people call the computer case the "CPU".


I don't want to drag the thread down a rabbit hole, but I am pretty sure a 3 1/2 inch disk is still a floppy disk because the disk inside is floppy, so it was called a floppy in contrast to a hard disk. But then people in different areas might have used the word differently. Even worse than people who call the box a CPU *shudder* were people who called the 3 1/2 inch version a hard disk because it had a hard case.


(Who am I kidding? I love it when a conversation goes down a rabbit hole....)

Edited by Jeffers on 17 May 2015 at 7:25pm



1 person has voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.