
10k sentences from parallel text?

Incunabulum
Newbie
United States
Joined 4851 days ago

13 posts - 30 votes
Studies: French, Serbian

 
 Message 1 of 18
10 January 2013 at 4:01pm
Recently I used LF Aligner (http://aligner.sourceforge.net/) to extract text from two PDF
files in L1 and L2 and produce the aligned text, which I then ran through a Python
script to create a parallel language book. Since the output of LF Aligner is a text file
with tab-separated fields, it can also be imported into Anki. I was wondering if anyone
else had ever tried learning sentences from a parallel language book in Anki, and what
your experience was.

In order for this to work the translation has to be fairly literal, and you have to check
for sentences being in a different order in the translation. The idea would be to learn
the vocabulary through the sentences in Anki, which should then allow reading the book in
L2 with enjoyment. If audio for the book was available, you would then be able to do LR
by listening to L2 while reading L2.
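
For readers who want to try the import step, here is a minimal sketch of turning tab-separated aligned text into a two-column file that Anki can import. It is not the script mentioned above. It assumes L1 is in the first column and L2 in the second (check your own aligner output), ignores any extra columns, and the function names and sample sentences are made up for illustration.

```python
def aligned_to_anki_rows(tsv_text, l1_column=0, l2_column=1):
    """Return (front, back) pairs with L2 on the front and L1 on the back.

    Assumption: one aligned sentence pair per line, fields separated by tabs.
    Lines that lack either column, or where either side is empty, are skipped.
    """
    rows = []
    for line in tsv_text.splitlines():
        fields = line.split("\t")
        if len(fields) <= max(l1_column, l2_column):
            continue  # not enough columns on this line
        front = fields[l2_column].strip()
        back = fields[l1_column].strip()
        if front and back:
            rows.append((front, back))
    return rows


def to_anki_tsv(rows):
    """Serialize the pairs as tab-separated text, one card per line."""
    return "".join(f"{front}\t{back}\n" for front, back in rows)


# Invented sample: English (L1) in column 1, Serbian (L2) in column 2.
sample = (
    "The dog sleeps.\tPas spava.\tbook.pdf\n"
    "Broken line without a tab\n"
    "\t\n"
    "I read.\tČitam.\n"
)
rows = aligned_to_anki_rows(sample)
print(to_anki_tsv(rows), end="")
```

The skipped lines are the ones worth checking by hand, since they usually mark places where the alignment went wrong.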
3 persons have voted this message useful



Majka
Triglot
Senior Member
Czech Republic
kofoholici.wordpress
Joined 4469 days ago

307 posts - 755 votes 
Speaks: Czech*, German, English
Studies: French, Russian

 
 Message 2 of 18
10 January 2013 at 4:16pm
It is a good idea in theory.
However, not all sentences from a book are worth learning. And you will have an easier time with French than with Serbian.

I am doing something similar. My additional step is to get a list of all the words in the book in their base form, in the order they appear in the book, compare it to a list of known words, and learn the unknown ones (in smaller groups). I have scripts for this; it takes a few minutes at most. Again, in theory I learn about 20 words and start reading. When I find an unknown word, it is time to learn the next group.

I am adding to Anki only sentences I think I can use. These are mostly sentences from dialogs and, here and there, an interesting phrase.
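
A rough sketch of the word-list step described above, for anyone who wants to experiment. These are not the poster's scripts. The sketch lists words in order of first appearance, drops the ones already known, and splits the rest into groups. It does not reduce words to their base form; that would need a lemmatizer for the language in question, which is left out here. All names and the sample text are invented.

```python
import re


def new_words_in_order(text, known_words):
    """Unknown words in order of first appearance (lowercased, no lemmatization)."""
    seen = {w.lower() for w in known_words}
    ordered = []
    # [^\W\d_]+ matches runs of letters, including non-ASCII ones such as č or ž.
    for token in re.findall(r"[^\W\d_]+", text):
        word = token.lower()
        if word not in seen:
            seen.add(word)
            ordered.append(word)
    return ordered


def batches(words, size=20):
    """Split the word list into consecutive groups of at most `size` words."""
    return [words[i:i + size] for i in range(0, len(words), size)]


# Invented sample text and known-word list.
text = "Pas spava. Pas i mačka spavaju."
todo = new_words_in_order(text, known_words={"i"})
print(batches(todo, size=2))
```

Because inflected forms count as separate words here, the list will be longer than a base-form list, especially for a heavily inflected language like Serbian.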
2 persons have voted this message useful



Incunabulum
Newbie
United States
Joined 4851 days ago

13 posts - 30 votes
Studies: French, Serbian

 
 Message 3 of 18
10 January 2013 at 4:31pm
My thinking is that I'd like to be able to understand the meaning of all the sentences,
so I wouldn't be trying to "learn" or "memorize" the sentences. If I could learn the
meaning of all the sentences then I should be able to read the book. I'm guessing
reading the book at that point may be a rather funny experience, since I would only then
see all the sentences in their proper order and see the whole story for the first time.
I'm considering shuffling the sentences in Anki so I won't be able to follow the plot;
that way I'll still be interested in reading the book in L2.

I'm actually doing this for Serbian since I haven't found parallel texts in that
language. I wouldn't bother doing this for French since there are so many other
resources already available.
1 person has voted this message useful





emk
Diglot
Moderator
United States
Joined 5344 days ago

2615 posts - 8806 votes 
Speaks: English*, French B2
Studies: Spanish, Ancient Egyptian

 
 Message 4 of 18
10 January 2013 at 4:40pm
Incunabulum wrote:
Since the output of LF Aligner is a text file
with tab-separated fields, it can also be imported into Anki. I was wondering if anyone
else had ever tried learning sentences from a parallel language book in Anki, and what
your experience was.


I've used subs2srs, which is pretty much the same idea, except that it uses movie subtitles and automatically includes audio clips and images from the movie. Personally, I love subs2srs, especially with good bilingual subtitles.

But there's a really big danger here. There are dozens of horror stories floating around the net about people who tried to collect "10K sentences" using some sort of bulk import tool. Most of these people crashed and burned very hard after months of Anki hell, and none of them seem to have acquired the language. Khatzumoto is very clear about this these days: You need to spend most of your time soaking in the language, and only keep those few sentences which are awesome and compelling and interesting. Anki is either an amazing bionic memory or an exquisitely sophisticated torture tool, and the deciding factor is how much you love your individual cards.

What I discovered with subs2srs is that you need to be incredibly aggressive about deleting cards. Even if it's the funniest and wittiest movie ever, and every line of dialog is brilliant, you need to delete at least 50% of cards on the first pass. And then you still need to delete very heavily during the first two weeks of reviews.

If I were going to make Anki cards from a parallel book, I'd aim for at least 95% deletion within the first two passes, and I'd keep deleting heavily after that. If it were a brilliant book and I really loved it, I might aim to keep 1 or 2 sentences per page.


8 persons have voted this message useful



stifa
Triglot
Senior Member
Norway
lang-8.com/448715
Joined 4685 days ago

629 posts - 813 votes 
Speaks: Norwegian*, English C2, German
Studies: Japanese, Spanish

 
 Message 5 of 18
10 January 2013 at 4:47pm
Why can't you just keep them in a list, and then just cherry pick? Or just keep it as a
corpus to search through for example sentences?

Remember that Anki sentences must be n+1 (or n+2 in rare cases): everything you already know, plus one new item.

When you senselessly import them, you will end up with sentences that teach you
nothing or sentences that overwhelm you. Both of these waste your time.
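
One way to cherry-pick automatically is to count the unknown words in each sentence and keep only the n+1 ones. The sketch below does this under a simplifying assumption: "known" means the exact lowercased word form is in a known-word set, so inflected forms of a known word count as unknown. The function names and sample sentences are hypothetical.

```python
import re


def unknown_words(sentence, known_words):
    """Set of lowercased word forms in the sentence that are not in known_words."""
    words = {w.lower() for w in re.findall(r"[^\W\d_]+", sentence)}
    return words - known_words


def n_plus_one(sentences, known_words, max_unknown=1):
    """Keep sentences with at least 1 and at most max_unknown unknown words."""
    keep = []
    for sentence in sentences:
        count = len(unknown_words(sentence, known_words))
        if 1 <= count <= max_unknown:
            keep.append(sentence)
    return keep


# Invented sample: the second sentence teaches nothing, the third overwhelms.
known = {"pas", "i"}
sentences = ["Pas spava.", "Pas i pas.", "Mačka lovi miša."]
print(n_plus_one(sentences, known))
```

Passing `max_unknown=2` allows the rare n+2 case. After learning a batch of cards, add their new words to the known set and run the filter again over the rest of the corpus.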

Edited by stifa on 10 January 2013 at 4:48pm

3 persons have voted this message useful



Incunabulum
Newbie
United States
Joined 4851 days ago

13 posts - 30 votes
Studies: French, Serbian

 
 Message 6 of 18
10 January 2013 at 5:04pm
I'm looking at this as a way to acquire the vocabulary so that I can read the book,
rather than a way to get 10k amazing sentences that I can use to internalize the
grammar, etc. I already know Serbian grammar from other methods. The idea would be to
know the meaning of all the sentences so I could read the book. For that reason I
wouldn't want to delete 95% of the sentences; even if the sentence isn't interesting on
its own, I still want to be able to understand what it means. I would probably be
pretty loose when judging my understanding of a sentence.

I was viewing this as an alternative to just jumping into the parallel text. With the
parallel text you can read through, seeing each sentence just once. If I put them in
Anki I could quickly pass through the ones I already know and repeatedly review the
ones I didn't. That seemed like it would be more efficient than, say, rereading the
parallel text over and over again.

1 person has voted this message useful





emk
Diglot
Moderator
United States
Joined 5344 days ago

2615 posts - 8806 votes 
Speaks: English*, FrenchB2
Studies: Spanish, Ancient Egyptian

 
 Message 7 of 18
10 January 2013 at 6:28pm
Incunabulum wrote:
I'm looking at this as a way to acquire the vocabulary so that I can read the book, rather than a way to get 10k amazing sentences that I can use to internalize the grammar, etc.


OK, I tossed together a quick and dirty script to simulate this experience. I assumed 10,000 cards, an interval multiplier of 2.5, a 98% pass rate, and 14.7 seconds per sentence card on average. These numbers are based on my own sentence deck over the last 9 months or so. The time per card seems high because it includes initial learning.

- If you try to get through all 10,000 cards in a year, you'd be looking at 150 to 200 Anki reps per day, which would average 40 to 50 minutes. Total time would be about 262 hours of Anki reviews.

- If you try to blast through the project in 2 months, you'd be looking at 600 to 900 Anki reps per day, or 3+ hours per day, for a total of 181 hours.

Now, speaking as somebody who once averaged 40 minutes of boring Anki reviews per day, this sounds like living hell. Even 20 minutes of fun Anki reviews can be physically draining some days.

Now, what happens if we discard 95% of cards after reading them once, and try to make it through the book in two months? That gives us about 200 Anki reps per day, 160+ of them new and in order, and very roughly an hour of reading per day. This is hard work, and it will probably take 50 hours to finish the book, but you'll get 500 relatively interesting and useful Anki cards out of it.

Like I said, I've probably come as close as anybody at HTLAL to trying this experiment in real life (with subs2srs). Whenever I forgot to delete heavily, it got very ugly. Personally, if you can't afford to delete 95% of the sentences after the first review, I'd recommend you consider an easier book. But ultimately, your choices are up to you, and you might just love this book enough to do 65,000 Anki reviews with no deletions. But if you find yourself hating the book, Anki and the language, then stop. Reckless AJATTing has already claimed enough victims. :-)
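
For anyone who wants to play with the numbers, here is a minimal sketch of this kind of simulation. It is not the script used for the figures above, and it is not expected to reproduce them. It assumes a deliberately crude scheduler: a card is seen on the day it is added, the next review comes 1 day later, each pass multiplies the interval by 2.5, and a fail resets the interval to 1 day. Real Anki scheduling (learning steps, ease changes) is more involved. The function and parameter names are invented.

```python
import random


def simulate(total_cards, days_to_add, horizon_days,
             multiplier=2.5, pass_rate=0.98, seed=0):
    """Return a list with the number of reps on each day of the horizon.

    Assumptions: new cards are spread evenly over the first `days_to_add`
    days; intervals are truncated to whole days; a failed card comes back
    the next day with its interval reset to 1.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    reps = [0] * horizon_days
    for card in range(total_cards):
        day = card * days_to_add // total_cards  # day the card is first seen
        interval = 1.0
        while day < horizon_days:
            reps[day] += 1
            if rng.random() < pass_rate:
                day += max(1, int(interval))
                interval *= multiplier
            else:
                interval = 1.0
                day += 1
    return reps


def total_hours(reps, seconds_per_rep=14.7):
    """Total review time in hours at a fixed average time per rep."""
    return sum(reps) * seconds_per_rep / 3600


# One-year plan: 10,000 cards added evenly over 365 days.
year = simulate(10_000, days_to_add=365, horizon_days=365)
print("busiest day:", max(year), "reps")
print("total hours:", round(total_hours(year), 1))
```

Changing `days_to_add` to 60 gives a rough picture of the two-month plan. The absolute figures depend heavily on the scheduler assumptions, so the useful part is the comparison between plans.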

6 persons have voted this message useful



Incunabulum
Newbie
United States
Joined 4851 days ago

13 posts - 30 votes
Studies: French, Serbian

 
 Message 8 of 18
10 January 2013 at 7:10pm
EMK, I see your point about deleting cards during the first pass that are so easy that
you never need to see them again, which I'm assuming would be the majority of the cards.
While reviews of those cards would be very short (much less than 15 seconds I'd hope),
deleting them would probably be much more efficient. I'd assumed before that you were
suggesting deleting sentences based on how interesting they were rather than based on
ability to understand them.

Regarding the time to do this, I don't have a fixed time frame and will keep at it until
I learn it, so I'm not worried about blasting through this in a particular period of
time. If I still need to learn it, it will still be in Anki. As for the warnings, I know
to stop something if it isn't useful. Ја нисам неки почетник.


2 persons have voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.