rdearman Senior Member United Kingdom rdearman.org Joined 5234 days ago 881 posts - 1812 votes Speaks: English* Studies: Italian, French, Mandarin
| Message 1145 of 1317 30 September 2014 at 1:51pm | IP Logged |
emk wrote:
Next thing you know, the computers will be signing up for HTLAL and asking us language learning questions. :-) |
|
|
How do you know they haven't already? ;-)
Another request for you, if you have the time. Back before I got bored of flogging the 300-word dead horse, I reviewed your figures and picked out the fact that 2082 non-noun words were needed to reach 95% coverage of the language.
Would it be possible for you to dump a file for me of these non-noun words for 95% coverage? Or could I really push the boat out and ask for 75%, 90%, 95%, 99% for non-noun coverage?
I still think it would be useful to know these relatively small sets of words, and I want to test this theory by memorising them and trying to determine if it has an effect on my reading comprehension. I'm guessing it should have a dramatic effect, if the 95% coverage is true. I've also got a couple of potential guinea pigs who go to my French classes and I'll try to enlist them to help me.
1 person has voted this message useful
| Jeffers Senior Member United Kingdom Joined 4907 days ago 2151 posts - 3960 votes Speaks: English* Studies: Hindi, Ancient Greek, French, Sanskrit, German
| Message 1146 of 1317 30 September 2014 at 8:41pm | IP Logged |
emk, all this talk of computer tools for language, coupled with your thread about experiments with word frequency, got me thinking about a computer SRS generating tool I've been toying with in my mind for a long time. I don't think I'd ever get to the point of being able to implement it, so I figured I'd share the idea with you.
The main idea would be to give the tool an ebook (or subtitle file) and it generates a frequency list from the book. So far nothing new. The software feeds a requested number of words to SRS. (I would do it in lots of 20, study them, then ask for the next 20- but I think the number should be user-selected). The software could generate single word, sentence or cloze cards, drawing the sentences from actual occurrences in the text.
What's different (I think) about my ideas is what would happen when you finish the book (or subs file) and move on to the next one. It would again make a frequency list, but when you request the first 20 words, it would skip any words that it has already given you. The idea is that you would have a growing body as you move from book to book. While you are studying words based on frequency, you are only learning words which you are actually coming across regularly.
I think it would work well also with your subs2srs software. Instead of getting a set with all the lines in the film, you could ask for the first occurrences of the 100 most common words you haven't learnt yet.
The idea is obviously in need of a dose of technical reality (for example, could the software check your ebook files directly on a Kindle or other ereader, or would you have to feed it text files?) But it's my dream SRS generator. (The higher dream would be that it could also use a linked audiobook, find the appropriate sentence using voice recognition, and automatically add the audio to the card).
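For anyone who wants to play with the idea, here is a minimal Python sketch of the core loop, assuming the book has already been extracted to plain text. The function names (`frequency_list`, `next_batch`, `example_sentence`) and the sample sentences are made up for illustration, and the tokenizer is deliberately crude: it just takes runs of letters, so "l'école" becomes two words.

```python
import re
from collections import Counter

# Runs of letters only (no digits, underscores or punctuation).
WORD_RE = re.compile(r"[^\W\d_]+")


def frequency_list(text):
    """Return the distinct words of `text`, most frequent first."""
    counts = Counter(word.lower() for word in WORD_RE.findall(text))
    return [word for word, _ in counts.most_common()]


def next_batch(text, seen, batch_size=20):
    """Pick up to `batch_size` frequent words not yet in `seen`, then record them."""
    batch = [w for w in frequency_list(text) if w not in seen][:batch_size]
    seen.update(batch)
    return batch


def example_sentence(text, word):
    """Return the first sentence of `text` containing `word`, or None."""
    for sentence in re.split(r"(?<=[.!?])\s+", text):
        if word in (w.lower() for w in WORD_RE.findall(sentence)):
            return sentence
    return None


if __name__ == "__main__":
    seen = set()  # in a real tool this would be saved between books
    book1 = "le chat dort. le chat mange. le chien dort."
    book2 = "le chien court. le chien mange. un chien court."
    print(next_batch(book1, seen, batch_size=2))
    print(next_batch(book2, seen, batch_size=2))
    print(example_sentence(book2, "chien"))
```

The `seen` set is the only state carried from book to book, which is what makes the vocabulary grow instead of restarting with each new frequency list.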
1 person has voted this message useful
| rdearman Senior Member United Kingdom rdearman.org Joined 5234 days ago 881 posts - 1812 votes Speaks: English* Studies: Italian, French, Mandarin
| Message 1147 of 1317 30 September 2014 at 11:46pm | IP Logged |
Jeffers wrote:
emk, all this talk of computer tools for language, coupled with your thread about experiments with word frequency, got me thinking about a computer SRS generating tool I've been toying with in my mind for a long time. I don't think I'd ever get to the point of being able to implement it, so I figured I'd share the idea with you.
The main idea would be to give the tool an ebook (or subtitle file) and it generates a frequency list from the book. So far nothing new. The software feeds a requested number of words to SRS. (I would do it in lots of 20, study them, then ask for the next 20- but I think the number should be user-selected). The software could generate single word, sentence or cloze cards, drawing the sentences from actual occurrences in the text.
What's different (I think) about my ideas is what would happen when you finish the book (or subs file) and move on to the next one. It would again make a frequency list, but when you request the first 20 words, it would skip any words that it has already given you. The idea is that you would have a growing body as you move from book to book. While you are studying words based on frequency, you are only learning words which you are actually coming across regularly.
I think it would work well also with your subs2srs software. Instead of getting a set with all the lines in the film, you could ask for the first occurrences of the 100 most common words you haven't learnt yet.
The idea is obviously in need of a dose of technical reality (for example, could the software check your ebook files directly on a Kindle or other ereader, or would you have to feed it text files?) But it's my dream SRS generator. (The higher dream would be that it could also use a linked audiobook, find the appropriate sentence using voice recognition, and automatically add the audio to the card). |
|
|
I don't want to shanghai emk's log, but I logged something very similar in my Mandarin log, using basic Linux command lines. What you are asking for would be a relatively simple script to write; then you just need to use Anki with the setting to "introduce new words in the order they were input". You could simply compare the first CSV file against the second and remove the duplicates from the second file before importing the second deck. So I think you could have what you want right now.
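As a rough illustration of the "compare the first CSV against the second" step, here is a small Python sketch rather than the shell commands from the Mandarin log. It assumes each deck is a CSV file with the word in the first column; the function name `remove_known` and the sample rows are invented for the example.

```python
import csv
import io


def remove_known(first_csv, second_csv, key_column=0):
    """Return the rows of `second_csv` whose key is absent from `first_csv`.

    Both arguments are open file-like objects. Row order in the second
    file is preserved, so a frequency-sorted deck stays frequency-sorted.
    """
    known = {row[key_column] for row in csv.reader(first_csv) if row}
    return [
        row
        for row in csv.reader(second_csv)
        if row and row[key_column] not in known
    ]


if __name__ == "__main__":
    # In-memory stand-ins for the two deck files.
    first = io.StringIO("le,the\nchat,cat\n")
    second = io.StringIO("chat,cat\nchien,dog\nle,the\n")
    for row in remove_known(first, second):
        print(",".join(row))
```

With real files you would pass `open("deck1.csv", newline="", encoding="utf-8")` and the like (file names hypothetical), then write the surviving rows out with `csv.writer` before importing.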
1 person has voted this message useful
|
emk Diglot Moderator United States Joined 5530 days ago 2615 posts - 8806 votes Speaks: English*, French (B2) Studies: Spanish, Ancient Egyptian
| Message 1148 of 1317 01 October 2014 at 2:48am | IP Logged |
rdearman wrote:
emk wrote:
Next thing you know, the computers will be signing up for HTLAL and asking us language learning questions. :-) |
|
|
How do you know they haven't already? ;-) |
|
|
We do have one spambot which creates hundreds of accounts, each of them claiming, "I AM A HUMAN AND I AM HERE TO DISCUSS ON THE FORUM." As far as I'm concerned, bots will only be welcome when they can actually contribute to language-learning discussions. :-)
rdearman wrote:
Would it be possible for you to dump a file for me of these non-noun words for 95% coverage? Or could I really push the boat out and ask for 75%, 90%, 95%, 99% for non-noun coverage? |
|
|
Since you're comfortable with the command line, you can actually find all my tools on GitHub, and dump the word lists directly from the SQLite3 database. If it doesn't work, please feel free to submit an issue against the GitHub project. It's also worth taking a look at the ipython notebook, which you can run interactively.
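To give a feel for what such a dump might involve, here is a hedged Python sketch using the standard `sqlite3` module. The table and column names (`words`, `lemma`, `pos`, `freq`), the `NOUN` tag and the toy counts are all assumptions made up for the example, not the actual schema of the GitHub project, and coverage is measured here against non-noun tokens only, which is also an assumption.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory database with a hypothetical `words` table."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE words (lemma TEXT, pos TEXT, freq INTEGER)")
    conn.executemany(
        "INSERT INTO words VALUES (?, ?, ?)",
        [
            ("être", "VERB", 50),
            ("maison", "NOUN", 40),
            ("de", "ADP", 30),
            ("vite", "ADV", 15),
            ("hélas", "INTJ", 5),
        ],
    )
    return conn


def words_for_coverage(conn, percent, exclude_pos="NOUN"):
    """Return the most frequent non-excluded words reaching `percent` coverage."""
    rows = conn.execute(
        "SELECT lemma, freq FROM words WHERE pos <> ? "
        "ORDER BY freq DESC, lemma",
        (exclude_pos,),
    ).fetchall()
    total = sum(freq for _, freq in rows)
    selected, running = [], 0
    for lemma, freq in rows:
        # Integer arithmetic avoids floating-point trouble at the boundary.
        if running * 100 >= percent * total:
            break
        selected.append(lemma)
        running += freq
    return selected


if __name__ == "__main__":
    conn = build_demo_db()
    for percent in (75, 90, 95, 99):
        print(percent, words_for_coverage(conn, percent))
```

Against the real database you would replace `build_demo_db()` with `sqlite3.connect(...)` on the actual file and adjust the query to whatever tables it really contains.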
Jeffers wrote:
What's different (I think) about my ideas is what would happen when you finish the book (or subs file) and move on to the next one. It would again make a frequency list, but when you request the first 20 words, it would skip any words that it has already given you. The idea is that you would have a growing body as you move from book to book. While you are studying words based on frequency, you are only learning words which you are actually coming across regularly. |
|
|
I actually have a slightly different system, which involves:
1. Highlighting sentences with interesting vocabulary or grammar when reading.
2. After finishing the book, using a tool to turn the sentences into Anki cards and add definitions.
3. Importing everything into Anki.
Here's what it looks like in practice:
This works really insanely well, and I'd love to extend it to handle video someday.
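For readers curious about what steps 2 and 3 might look like in code, here is a minimal Python sketch. It is not the actual tool: the function names, the tiny `glossary` dictionary standing in for a definition lookup, and the choice of a cloze card are all assumptions. It writes tab-separated text, which Anki can import, with the highlighted word wrapped in `{{c1::...}}` cloze markup.

```python
import csv
import io
import re


def make_cloze(sentence, target):
    """Wrap the first whole-word occurrence of `target` in Anki cloze markup."""
    pattern = re.compile(r"\b" + re.escape(target) + r"\b", re.IGNORECASE)
    return pattern.sub(lambda m: "{{c1::" + m.group(0) + "}}", sentence, count=1)


def cards_to_tsv(highlights, glossary):
    """Turn (sentence, target word) pairs into tab-separated card text.

    Column 1 is the cloze sentence, column 2 the definition (empty if the
    word is missing from `glossary`).
    """
    out = io.StringIO()
    writer = csv.writer(out, delimiter="\t", lineterminator="\n")
    for sentence, target in highlights:
        definition = glossary.get(target.lower(), "")
        writer.writerow([make_cloze(sentence, target), definition])
    return out.getvalue()


if __name__ == "__main__":
    highlights = [
        (
            "Mr Dursley dirigeait la Grunnings, une entreprise qui "
            "fabriquait des perceuses.",
            "perceuses",
        ),
    ]
    glossary = {"perceuses": "drills"}  # hypothetical definition source
    print(cards_to_tsv(highlights, glossary), end="")
```

Keeping the whole sentence on the card is the point of the exercise: the context comes along with the word.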
A little exercise
Just to amuse myself, I spent 30 minutes last night aligning the first page of Harry Potter y la Piedra Filosofal against Harry Potter à l'école des sorciers:
Quote:
El señor y la señora Dursley, que vivían en el número 4 de Privet Drive, estaban orgullosos de decir que eran muy normales, afortunadamente.
Mr et Mrs Dursley, qui habitaient au 4, Privet Drive, avaient toujours affirmé avec la plus grande fierté qu’ils étaient parfaitement normaux, merci pour eux.
Eran las últimas personas que se esperaría encontrar relacionadas con algo extraño o misterioso, porque no estaban para tales tonterías.
Jamais quiconque n’aurait imaginé qu’ils puissent se trouver impliqués dans quoi que ce soit d’étrange ou de mystérieux. Ils n’avaient pas de temps à perdre avec des sornettes.
El señor Dursley era el director de una empresa llamada Grunnings, que fabricaba taladros.
Mr Dursley dirigeait la Grunnings, une entreprise qui fabriquait des perceuses. |
|
|
If I have really strong context, I can actually decipher maybe 2/3rds of the Spanish. With no context, I can sometimes get something. This reinforces my theory that somehow creating really strong context is actually a key language-learning skill—and it's even better if you can do it while amusing yourself.
1 person has voted this message useful
| rdearman Senior Member United Kingdom rdearman.org Joined 5234 days ago 881 posts - 1812 votes Speaks: English* Studies: Italian, French, Mandarin
| Message 1149 of 1317 01 October 2014 at 11:16am | IP Logged |
emk wrote:
Since you're comfortable with the command line, you can actually find all my tools on GitHub, and dump the word lists directly from the SQLite3 database. If it doesn't work, please feel free to submit an issue against the GitHub project. It's also worth taking a look at the ipython notebook, which you can run interactively.
|
|
|
I have never successfully managed to install or use anything on GitHub, mostly because there doesn't seem to be any way to download a tar file or anything. I stopped using source code control after CVS.
EDIT: Managed to get enough GitHub knowledge to create the DB. :)
Edited by rdearman on 01 October 2014 at 12:51pm
1 person has voted this message useful
| patrickwilken Senior Member Germany radiant-flux.net Joined 4531 days ago 1546 posts - 3200 votes Studies: German
| Message 1150 of 1317 01 October 2014 at 12:02pm | IP Logged |
emk wrote:
This reinforces my theory that somehow creating really strong context is actually a key language-learning skill—and it's even better if you can do it while amusing yourself. |
|
|
The boost context gives to language understanding is fascinating.
Edited by patrickwilken on 01 October 2014 at 12:10pm
1 person has voted this message useful
| sctroyenne Diglot Senior Member United States Joined 5389 days ago 739 posts - 1312 votes Speaks: English*, French Studies: Spanish, Irish
| Message 1151 of 1317 05 October 2014 at 12:41am | IP Logged |
emk wrote:
This works really insanely well, and I'd love to extend it to handle video someday. |
|
|
It's a great tool - I love the cards I'm getting!
1 person has voted this message useful
|
emk Diglot Moderator United States Joined 5530 days ago 2615 posts - 8806 votes Speaks: English*, French (B2) Studies: Spanish, Ancient Egyptian
| Message 1152 of 1317 09 October 2014 at 5:40pm | IP Logged |
patrickwilken wrote:
emk wrote:
This reinforces my theory that somehow creating really strong context is actually a key language-learning skill—and it's even better if you can do it while amusing yourself. |
|
|
The boost context gives to language understanding is fascinating. |
|
|
Yeah, Krashen hints at this in a lot of places, but it's really one of the major things I learned while improving my French: Extensive methods can consolidate anything that you can puzzle out, but you need some way to puzzle it out, or you're not going to make any progress.
sctroyenne wrote:
emk wrote:
This works really insanely well, and I'd love to extend it to handle video someday. |
|
|
It's a great tool - I love the cards I'm getting! |
|
|
I'm glad to hear that it's still working well for you! I'm taking a break from new French Anki cards for the moment to focus on other projects, but I really love my SRS Collector cards—they worked very well, and they were easy and pleasant to create.
A little Spanish experiment
I've tracked down an ebook and an audiobook for Harry Potter y la piedra filosofal (both legal), and I've been playing around with them a bit: I've manually aligned the Spanish and French text of the first several paragraphs, and I've left the corresponding audio looping on my CD player for half a day.
The good news: My brain is already separating the audio into syllables, and I can mostly match text to audio in real time. This has historically taken me a couple of weeks before the sounds "click", and I'm way ahead of schedule. This probably has something to do with Spanish's nice clean sound system, and possibly the fact that my brain can already segment Italian.
The bad news: This isn't going to work without some substantial tweaking. Harry Potter is just too hard, and Spanish is too unfamiliar, even with the Renaissance cognates.
A few things which might help:
1. More repetition of each audio segment, to burn it into my sound memory.
2. More repetition with comprehension to help match meanings up to sounds and lock them in.
3. More reinforcement of the basics, which could be achieved either through graded texts or sheer volume.
So basically, if I want to learn extensively from the beginning, it feels like I'm getting pushed sharply in the direction of several existing approaches:
a. Assimil (grading, repetition).
b. Listening/Reading (volume and long consecutive hours, providing repetition).
c. subs2srs (small chunk size w/parallel text, with optimized repetition).
Even though I can generally more-or-less decipher Spanish sentences given a parallel French text, and even though I have a very high tolerance for incomprehension, I can't tackle native Spanish with pure extensive methods. This is obvious, of course.
Now, the next question: Can I finesse this? Given one Romance language, can I somehow create an effective method (d) to add to the above list? Or will any possible solution wind up as a minor variant of (a–c)?
1 person has voted this message useful