
Spanish: A little subs2srs experiment

Language Learning Forum : Language Learning Log
147 messages over 19 pages
rdearman
Senior Member
United Kingdom
rdearman.org
Joined 3672 days ago

881 posts - 1812 votes 
Speaks: English*
Studies: Italian, French, Mandarin

 
 Message 57 of 147
13 November 2014 at 6:14pm | IP Logged 
tommus wrote:

6. Put your cursor in Anki right after your sentence or phrase. Click the red record circle in Anki. Click the green play arrow in Audacity. As soon as the audio stops, click the stop button that pops up in Anki.


I did something similar to turn my Mandarin Pimsleur course into Anki cards. I find it is easier to highlight the audio you want in Audacity, then "export selection" as MP3 and give the file the same name as the phrase you're learning. The other benefit is that you can use the MP3 files outside of Anki as well.

Using the "export selection" method, I also extracted all the numbers from 1-100 from a YouTube video and created my own CD for the car. It counts 1-100, then I randomly scattered the files through the rest of the CD so I can translate the numbers when I hear them. I burn a new CD every couple of weeks so I get different numbers. Same as Anki, but I don't have to press any buttons when I'm driving. :)
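The track layout described here (numbers in order, then the same clips scattered at random) can be sketched in a few lines of Python. This is only an illustration: the file names and the `build_track_order` helper are hypothetical, and the post does not say how the shuffling was actually done.

```python
import random

def build_track_order(number_clips, seed=None):
    """Return the clips in counting order, followed by the same clips shuffled."""
    rng = random.Random(seed)       # a seed makes a given "CD" reproducible
    shuffled = list(number_clips)
    rng.shuffle(shuffled)
    return list(number_clips) + shuffled

# Hypothetical file names: 001.mp3 ... 100.mp3, one per exported number.
clips = [f"{n:03d}.mp3" for n in range(1, 101)]
order = build_track_order(clips, seed=42)
```

Changing the seed every couple of weeks gives a new random section while the counting section stays the same.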
2 persons have voted this message useful



tommus
Senior Member
Canada
Joined 4302 days ago

979 posts - 1686 votes 
Speaks: English*
Studies: Dutch, French, Esperanto, German, Spanish

 
 Message 58 of 147
14 November 2014 at 1:14am | IP Logged 
rdearman wrote:
I find it is easier to highlight the audio you want in Audacity, then "export
selection" as MP3 and give the file the same name as the phrase you're learning.

That makes a lot of sense. And it suggests a whole lot of additional useful things to do.
Thanks for that.

1 person has voted this message useful



Crush
Diglot
Senior Member
China
Joined 4301 days ago

1622 posts - 2297 votes 
Speaks: English*, Spanish
Studies: Mandarin, Esperanto, Basque

 
 Message 59 of 147
14 November 2014 at 3:37am | IP Logged 
Btw, a shortcut for exporting audio from Audacity is using labels. Highlight the portion you want to export then press Ctrl+B. Type the name you want for that chunk of audio then highlight the next chunk and repeat until everything you want is labeled. Afterwards, go to "Export Multiple Files" (Ctrl+Shift+L) and choose "Divide files by label" (or something along those lines). Each file will be exported using the label as the filename. I broke up the Princeton Russian course this way and it's much faster than doing it one by one.
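If you would rather do the cutting outside Audacity, the same labels can be exported as a text file (one tab-separated start, end and label per line) and turned into ffmpeg commands. The sketch below only builds the argument lists and runs nothing; the function names, the sample labels and `lesson01.wav` are made up for illustration, and it assumes ffmpeg is what you would use for the split.

```python
def parse_audacity_labels(text):
    """Parse exported label text: start<TAB>end<TAB>label, times in seconds."""
    labels = []
    for line in text.splitlines():
        if not line.strip() or line.startswith("\\"):
            continue  # skip blank lines and lines that begin with a backslash
        start, end, name = line.split("\t", 2)
        labels.append((float(start), float(end), name.strip()))
    return labels

def ffmpeg_commands(source, labels):
    """Build one ffmpeg argument list per label; the label becomes the file name."""
    commands = []
    for start, end, name in labels:
        duration = end - start
        commands.append([
            "ffmpeg", "-ss", f"{start:.3f}", "-t", f"{duration:.3f}",
            "-i", source, f"{name}.mp3",
        ])
    return commands

# Hypothetical label file contents and source recording.
sample = "1.500000\t3.250000\tni hao\n4.000000\t6.100000\txie xie\n"
commands = ffmpeg_commands("lesson01.wav", parse_audacity_labels(sample))
```

Each entry in `commands` could be passed to `subprocess.run` once you have checked that the file names look right.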
10 persons have voted this message useful





emk
Diglot
Moderator
United States
Joined 3968 days ago

2615 posts - 8805 votes 
Speaks: English*, FrenchB2
Studies: Spanish, Ancient Egyptian
Personal Language Map

 
 Message 60 of 147
14 November 2014 at 1:32pm | IP Logged 
Crush wrote:
Highlight the portion you want to export then press Ctrl+B. Type the name you want for that chunk of audio then highlight the next chunk and repeat until everything you want is labeled. Afterwards, go to "Export Multiple Files" (Ctrl+Shift+L) and choose "Divide files by label" (or something along those lines).

That's a really nice trick. Thank you.

Benchmark: Sprachprofi's progress

I'm going to provide some progress numbers in a moment, but first I want to quote from Sprachprofi's subs2srs results. These cover roughly her first 30 hours of Anki reviews, I think:

Sprachprofi wrote:
I hadn't studied any Japanese before. Some examples of what I was able to understand ON FIRST HEARING, without having seen these exact phrases before:

450 cards in
お前 知ってたの?- You knew?
黒六十八目 - Black has 68 points.

800 cards in
もしかして強い奴? - Could he be someone really strong?
お前ならできるだろう - You can do it, right?
速く打てよ お前の番だぜ - Hurry up and move! It's your turn.

1200 cards in
なぜ囲碁部に入った? - Why did you enter the Go club?
私は最近ぜんぜん打ってないですよ~ - But I haven't played at all lately!
海王の三将ってどんな奴かな - I wonder who Kaio's third board is.

1500 cards in
俺は 海王の岸本と打ちたい だから 負けるなよ あんた - I want to play against Kishimoto of Kaio, so don't you lose.
佐為にも時々打たせてやりたいけど - I want to let Sai play every now and then
英語なんかできなくだっていいんだよ 碁を打つだけだから - I don't have to speak English, all I'm doing is playing Go.
普通の君がインターネットで世界中の人と碁 を打つの? - You're only okay and you're playing people from all over the world through the internet?

My three week report

In the last 21 days, I've done 922 Anki reps in 6 hours, plus another 5 or so hours of subtitle aligning, an hour or so of grammar study, and a bunch of hours of playing TV in the background. I've learned 202 cards, and suspended 160. I'm currently learning 15 new cards per day, and spending roughly 25 minutes/day on reviews.

By media source:
- Y Tu Mamá También: 89 cards learned, 150 suspended
- Avatar: 113 cards learned, 10 suspended

At my current rate, it would take me 12 more weeks to reach Sprachprofi's total of 1,500 cards. Here's what I have available:

- Avatar episodes 1&2: 483 cards.
- Avatar episodes 3&4: Inaccurate subtitles.
- Avatar episodes 5&6: Accurate subtitles, not yet aligned.
- Avatar episodes 7&8: Inaccurate subtitles.
- Avatar episodes 9&10: Accurate subtitles, not yet aligned.

This should be enough to get me near 1,500 cards, and I still have 30 more Avatar episodes to check (plus a season of Korra). So I have plenty of material on a single subject.
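For anyone who wants to check the 12-week figure, it follows directly from the numbers in this report. A quick sketch, assuming the rate stays at 15 new cards per day and ignoring suspended cards:

```python
target_cards = 1500       # Sprachprofi's total
learned_so_far = 202      # cards learned in the first 21 days
new_cards_per_day = 15    # current learning rate

days_needed = (target_cards - learned_so_far) / new_cards_per_day
weeks_needed = days_needed / 7
print(f"{days_needed:.0f} days, about {weeks_needed:.1f} weeks")
```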

Subjectively, how does it feel?

- Syllables and word boundaries are generally quite sharp, except in fast speech.
- I can often hear words and phrases from my cards when they appear elsewhere.
- I can definitely understand some new stuff more complicated than Sprachprofi's 450-card-level examples.
- Spanish verbs are starting to "click". I naturally understand some sentences without pronouns, imperfect verbs, clitic pronouns and so on. It's spotty, but it feels like it's working.
- I need a lot more vocabulary.
- Watching previously unseen episodes is interesting, and I pick up some stuff.
- Watching unseen episodes with bilingual subtitles is already very useful.
- When watching episodes I've studied, I can understand well over half the dialog naturally, directly in Spanish.

Overall, I'm quite happy. For 6 hours of Anki reps (officially, maybe as much as 9 unofficially), I've made pretty amazing progress. If you ultimately want to learn a language by sitting on the couch watching TV, this method appears to be considerably more efficient than an Assimil-only approach. The major advantages seem to come from (1) the intensive listening practice, (2) the SRS review schedule, and (3) the very focused, narrow content.

I'll try to post some examples like Sprachprofi's later on, if I get a minute.

Edited by emk on 14 November 2014 at 1:33pm

1 person has voted this message useful



rdearman
Senior Member
United Kingdom
rdearman.org
Joined 3672 days ago

881 posts - 1812 votes 
Speaks: English*
Studies: Italian, French, Mandarin

 
 Message 61 of 147
15 November 2014 at 1:22pm | IP Logged 
I was trying to do "Dual Language" subtitles like you did.

emk wrote:
garyb wrote:
I did try "Lingual media player", which can display two subtitle files at once, but it was unusably slow on my ageing laptop. Combining the files hadn't crossed my mind.

Since there seems to be interest in this, I've added a "Dueling Subtitles" tutorial to the wiki. You'll probably also want to see the Subtitle Edit tutorial, which talks about finding and preparing subtitles.

The results are pretty cool:



You can change the position and color of the subtitles, if you want to make one subtitle easier to read, and another more challenging.


I stupidly thought you were doing it with Sub Edit, not subs2srs. However, there is a way to do this easily in Sub Edit, and you can have multiple languages (3-4 if you want). It also generates a single subtitle file. To do this in Sub Edit, do the following.

1) Open L1 file.
2) Click Tools->Append Subtitle...
3) Select L2 file
4) If the second file is already in synch with the video just click OK. This will append the second language to the bottom of the file, but keep the timings so it runs concurrently.
4.a) Repeat if you want more than 2 languages
5) Change the format to Sub Station Alpha
6) Right Click on the subtitles and select "Sub Station Alpha Styles"
7) Set up two different styles, colours etc. for L1 & L2. You can position them in 9 different places, with 4 different colours, fonts, etc. You can export the styles to a file so that you've always got them.
8) Return to the main screen and apply the L1 style to the L1 half of the subtitles, and the L2 style to the L2 half.
9) Save as a .ssa file and play in VLC or other media player.

The advantage of this is that you can have multiple languages, position them together or separately on the screen, and change fonts, font sizes, margins, etc.

You can also automatically translate with Sub Edit, append the translation, and follow the steps above to watch both concurrently. The translation keeps the same times and durations, so it is already in synch.
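To show what the resulting file looks like under the hood, here is a rough Python sketch that writes two sets of cues into one styled subtitle file. Note the assumptions: it emits the newer Advanced SubStation Alpha (`.ass`, "v4.00+") layout rather than the `.ssa` file from the steps above, and the style names, Arial font, colours and positions are arbitrary choices, not anything Sub Edit produces by default.

```python
def ass_time(seconds):
    """Format seconds as H:MM:SS.cc (centiseconds), as used in ASS events."""
    cs = round(seconds * 100)
    h, rem = divmod(cs, 360000)
    m, rem = divmod(rem, 6000)
    s, c = divmod(rem, 100)
    return f"{h}:{m:02d}:{s:02d}.{c:02d}"

STYLE_FORMAT = (
    "Format: Name, Fontname, Fontsize, PrimaryColour, SecondaryColour, "
    "OutlineColour, BackColour, Bold, Italic, Underline, StrikeOut, ScaleX, "
    "ScaleY, Spacing, Angle, BorderStyle, Outline, Shadow, Alignment, "
    "MarginL, MarginR, MarginV, Encoding"
)
EVENT_FORMAT = (
    "Format: Layer, Start, End, Style, Name, MarginL, MarginR, MarginV, Effect, Text"
)

def style_line(name, colour, alignment):
    """One style; colour is &HAABBGGRR, alignment follows the numeric keypad (1-9)."""
    return (f"Style: {name},Arial,20,{colour},&H000000FF,&H00000000,&H00000000,"
            f"0,0,0,0,100,100,0,0,1,2,0,{alignment},10,10,10,1")

def build_ass(l1_cues, l2_cues):
    """Combine two lists of (start, end, text) cues, one style per language."""
    lines = [
        "[Script Info]", "ScriptType: v4.00+", "",
        "[V4+ Styles]", STYLE_FORMAT,
        style_line("L1", "&H00FFFFFF", 2),  # white, bottom centre
        style_line("L2", "&H0000FFFF", 8),  # yellow, top centre
        "", "[Events]", EVENT_FORMAT,
    ]
    for style, cues in (("L1", l1_cues), ("L2", l2_cues)):
        for start, end, text in cues:
            text = text.replace("\n", "\\N")  # ASS marks line breaks with \N
            lines.append(
                f"Dialogue: 0,{ass_time(start)},{ass_time(end)},{style},,0,0,0,,{text}"
            )
    return "\n".join(lines) + "\n"

doc = build_ass(
    [(62.328, 63.162, "Yay!\nYay!")],
    [(62.328, 64.664, "¡Si! ¡Aang ha vuelto!")],
)
```

Writing `doc` to a UTF-8 file with an `.ass` extension should give something VLC can play alongside the video, with each language in its own colour and position.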

1 person has voted this message useful



Stefan
Diglot
Senior Member
Sweden
stefannilsson.c
Joined 2763 days ago

22 posts - 29 votes
Speaks: Swedish*, EnglishC1
Studies: German

 
 Message 62 of 147
16 November 2014 at 7:18pm | IP Logged 
I really like the idea and I'm sure it will improve your listening comprehension.

Yesterday I spent the whole day trying to launch subs2srs on my Mac. Hard work but I
finally managed to do it by combining Crossover with .NET 3.5 SP1 and a WinXP-bottle.
Then I discovered the most difficult part - finding material! I'm fluent in English and the
anime I've watched usually has burnt-in subtitles so I've never really bothered with it
before. First I tried Big Bang Theory but none of the subtitles matched regardless of how
much time I spent trying to sync them (probably due to different commercial cuts). Then I
tried Scrubs and found a synced subtitle but it wasn't even close to what they actually
said. I guess that's a common problem when they write the subtitles first and then dub it.

Tomorrow I'll try with Spirited Away because a friend told me he has a copy with audio
and subtitles in both languages. I'm sure they'll sync but the question is how well the
dubbed audio matches the subtitles. Maybe it would be a good idea to aim for native
material with subtitles for people hard of hearing.

Anyway.. Since several of us seem to be spending hours and hours to create decks -
maybe it would be a wise idea to encourage everyone to share them. The Wiki includes
some but there are lots of dead links (wish we had mirrors). For example, I know I'd really
appreciate the Die Welle-deck sabotai mentioned on the other page. Maybe someone
would be interested in Spirited Away once I've completed it.
1 person has voted this message useful





emk
Diglot
Moderator
United States
Joined 3968 days ago

2615 posts - 8805 votes 
Speaks: English*, FrenchB2
Studies: Spanish, Ancient Egyptian
Personal Language Map

 
 Message 63 of 147
16 November 2014 at 8:53pm | IP Logged 
rdearman: Yeah, there are a couple of different ways to get bilingual subtitles working in high-end video players. In my case, however, the challenge is figuring out how to stream video to a Chromecast using Videostream.

Which brings us to this weekend's project.

A little programming interlude

The "secret sauce" of subs2srs is its ability to line up subtitles that don't quite match:

Quote:
English

18
00:01:02,328 --> 00:01:03,162
Yay!
Yay!

19
00:01:03,163 --> 00:01:04,664
Aang's back!

Spanish

16
00:01:02,328 --> 00:01:04,664
¡Si! ¡Aang ha vuelto!
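One simple way to line up cues like these is by time overlap: attach each cue from one language to the cue in the other language that covers most of its duration. The Python sketch below illustrates that idea only. It is not the algorithm used by subs2srs or by the Rust tool, the 50% threshold is an arbitrary assumption, and it only handles the case where several cues in one file map onto a single cue in the other.

```python
import re

TIME = re.compile(r"(\d+):(\d+):(\d+),(\d+)")

def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:01:02,328 to seconds."""
    h, m, s, ms = map(int, TIME.match(stamp).groups())
    return h * 3600 + m * 60 + s + ms / 1000

def parse_srt(text):
    """Return a list of (start, end, text) tuples from SRT-formatted text."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.splitlines()
        start, end = (to_seconds(t.strip()) for t in lines[1].split("-->"))
        cues.append((start, end, " ".join(lines[2:])))
    return cues

def align(base, other):
    """Pair each base cue with the other-language cues that mostly fall inside it."""
    pairs = []
    for b_start, b_end, b_text in base:
        matched = []
        for o_start, o_end, o_text in other:
            overlap = min(b_end, o_end) - max(b_start, o_start)
            if overlap > 0.5 * (o_end - o_start):
                matched.append(o_text)
        pairs.append((b_text, " ".join(matched)))
    return pairs

english = """18
00:01:02,328 --> 00:01:03,162
Yay!
Yay!

19
00:01:03,163 --> 00:01:04,664
Aang's back!
"""
spanish = """16
00:01:02,328 --> 00:01:04,664
¡Si! ¡Aang ha vuelto!
"""
pairs = align(parse_srt(spanish), parse_srt(english))
```

With the example above, the single Spanish cue ends up paired with both English cues joined together.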

I wrote a library and command-line tool in Rust (Rust source code here) that can merge two *.srt files into one:



These subtitles can be used in a wide variety of players, including VLC:



…and Videostream:



The decision to put both languages at the bottom of the screen is deliberate: This allows my eyes to skip quickly back and forth between the two languages.

Videostream has a nice "skip back 30 seconds" button as well, which could be very useful for making multiple passes through a tricky conversation:



Some other things I'd like to try someday

Now that I have a general-purpose subtitle aligner, I can imagine some other fun projects for a future day:

- Automatically generate bilingual MCD cards from subtitles, focusing on inflections, ser/estar, por/para, etc.
- Generate MP3 files that play a line of L2 dialog, the same line in L1, and then L2 again, then move on.
- A command-line clone of subs2srs?
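The L2, L1, L2 idea mostly comes down to ordering clips. As a rough sketch, assuming you already have one audio clip per line in each language (the `l2/` and `l1/` directories and the line IDs here are hypothetical), the playback order can be written out as a list file in the format read by ffmpeg's concat demuxer:

```python
def drill_sequence(line_ids, l2_dir="l2", l1_dir="l1"):
    """For each line, list the L2 clip, then the L1 clip, then the L2 clip again."""
    order = []
    for line_id in line_ids:
        l2 = f"{l2_dir}/{line_id}.mp3"
        l1 = f"{l1_dir}/{line_id}.mp3"
        order.extend([l2, l1, l2])
    return order

def concat_list(paths):
    """Render paths as "file '...'" lines, one per clip."""
    return "".join(f"file '{p}'\n" for p in paths)

playlist = concat_list(drill_sequence(["0018", "0019"]))
```

Saving `playlist` as `list.txt` and running something like `ffmpeg -f concat -i list.txt -c copy drill.mp3` should then join the clips, provided they all share the same encoding settings.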

But for now, I have Anki cards to review, and real work to get done this weekend. :-)
1 person has voted this message useful



kujichagulia
Senior Member
Japan
Joined 3283 days ago

1031 posts - 1571 votes 
Speaks: English*
Studies: Japanese, Portuguese

 
 Message 64 of 147
17 November 2014 at 4:17am | IP Logged 
emk and anybody else doing this, I have a question for you all.

With your subs2srs decks, do you have any cards where the subtitles do not match the audio? Perhaps one word is different, or the entire line is different from the audio. What do you do with such cards? Do you:
(a) Delete them right away, or
(b) Depending on the card, keep them and use the subtitles as hints to the audio?

The reason I ask is that I have some DVDs that I really like, that also come with Japanese and/or Portuguese subtitles. The problem is that, most of the time, the subtitles don't match the audio. A search on some of the subtitle websites mentioned above (OpenSubtitles.org, etc.) didn't find any files with subtitles that match the audio verbatim. (In fact, it seems like they just ripped the subtitles from the DVD and posted them up there.) I like those DVDs and would like to use them, if possible, for subs2srs, hence the question. It would be cool to have 3,000 cards of natural conversation with audio, but if 95% of them are going to be deleted, that leaves 150. Is that enough to understand a movie or a TV show?

Edited by kujichagulia on 17 November 2014 at 4:18am



1 person has voted this message useful



Copyright 2020 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.