
Spanish: A little subs2srs experiment

147 messages over 19 pages: << Previous 1 2 3 4 5 6 7 ... 14 ... 18 19 Next >>
sctroyenne
Diglot
Senior Member
United States
Joined 3934 days ago

739 posts - 1312 votes 
Speaks: English*, French
Studies: Spanish, Irish

 
 Message 105 of 147
04 December 2014 at 5:49pm | IP Logged 
emk wrote:

- Left: I thought meta was subjunctive, so I checked the conjugation tables.
Yup! Even though Spanish is more enthusiastic about the subjunctive than French, I
still have an instinct for where to look.


Yup, though Spanish uses subjunctive tenses that have fallen into disuse in everyday
French, the overriding rules are pretty much the same. A couple exceptions: some verbs
that don't take subjunctive in French (such as I hope that: espérer/esperar) do so in
Spanish and from what I can remember there are conditional statements ("si" clauses)
that can use it. I'm sure someone else can pipe in with the distinctions. But, having
already learned the subjunctive in French means you've already won over half the battle
with the subjunctive in Spanish (one of the toughest challenges in the language).
1 person has voted this message useful



rapp
Senior Member
United States
Joined 4274 days ago

129 posts - 204 votes 
Speaks: English*
Studies: Esperanto, Spanish

 
 Message 106 of 147
04 December 2014 at 7:12pm | IP Logged 
emk wrote:
rapp wrote:
Just fyi, there is an MCD plugin available for Anki that works very well for creating cards in format #1 (it may work for format #2, I just haven't tried using it that way).

Hi, rapp! Great to see you again. How has Neutrino been working out for you? And yes, there's at least one Susuru-style cloze plugin for Anki, but I haven't looked at it yet.


It is going well. I haven't updated my Neutrino log in forever, but I'm still plugging away every day. I haven't missed a day of studying since I started the program on September 1, 2013, so I can definitely say it has been very motivational. Khatz recently revamped the scoring and number of levels that make up the program, so none of the stats I had reported in my log will make sense going forward, but I should make a new entry just to cover a couple of Anki-related techniques I've come up with along the way.

I'm finding this log very inspiring. I've made a couple of aborted attempts to use subs2srs before, but I keep running into problems getting "clean" subtitle files to use. I may have to try the OCR instructions you've provided for generating them from a DVD, but I've got an idea for an alternative method. I haven't tried it yet, so this is pure speculation, but I wonder if you have any insight on whether it would be workable.

The idea is to get a subtitle file that has accurate subtitles, but faulty timing. I would strip out all of the timing info, leaving just the text of the subtitles. Then I would load that text and the audio from the show/movie into Transcriber. Transcriber should allow me to just listen along to the audio, hitting the enter key to mark the start/end of each subtitle. The results get saved to a .trs file, which subs2srs accepts, so the rest of the process would be just the same as you've documented in this log.
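As a rough sketch of the "strip out all of the timing info" step: the snippet below assumes the subtitle file is in the common SRT layout (a numeric index line, a timing line, then the text, with blank lines between subtitles). That format is an assumption, and the function name strip_srt_timing is just made up for illustration.

```python
import re

# Matches an SRT timing line such as "00:01:02,345 --> 00:01:04,000".
TIMING = re.compile(
    r"^\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}\s*-->\s*\d{1,2}:\d{2}:\d{2}[,.]\d{1,3}"
)


def strip_srt_timing(srt_text):
    """Return the subtitle texts from SRT content, one string per subtitle.

    Assumes the usual SRT layout; this is an illustrative helper,
    not part of subs2srs or Transcriber.
    """
    # Normalize line endings and drop a leading byte-order mark if present.
    cleaned = srt_text.replace("\r\n", "\n").lstrip("\ufeff").strip()
    subtitles = []
    # Subtitles are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", cleaned):
        lines = [line.strip() for line in block.split("\n") if line.strip()]
        # Drop the numeric index line, if there is one.
        if lines and lines[0].isdigit():
            lines = lines[1:]
        # Drop the timing line.
        if lines and TIMING.match(lines[0]):
            lines = lines[1:]
        # Join multi-line subtitles into a single line of text.
        if lines:
            subtitles.append(" ".join(lines))
    return subtitles
```

Writing the returned list out one subtitle per line would give a plain text file to work from; how that text is best loaded into Transcriber is not covered by this sketch.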

If this works out like I'm hoping, it should be possible to accurately align subtitles for a 30-minute show in not much more than 30 minutes, instead of the hours it seems to take with Subtitle Edit or similar tools.

There is a short tutorial on how to use Transcriber on dinglabs.com, a project similar to Readlang that was mentioned here on HTLAL a few years ago. It uses Transcriber to synchronize text with an audiobook, but I can't think of any reason it shouldn't work for subtitles also. Hopefully I'll get a chance to try it out this weekend, and I'll report back on how it goes.





3 persons have voted this message useful





emk
Diglot
Moderator
United States
Joined 4075 days ago

2615 posts - 8805 votes 
Speaks: English*, FrenchB2
Studies: Spanish, Ancient Egyptian
Personal Language Map

 
 Message 107 of 147
04 December 2014 at 11:09pm | IP Logged 
sctroyenne wrote:
But, having
already learned the subjunctive in French means you've already won over half the battle
with the subjunctive in Spanish (one of the toughest challenges in the language).

I actually "built" my French subjunctive around my English subjunctive, starting with phrases like:

Quote:
I demand he be on time tomorrow!
All I asked is that she stop annoying the cat.

For those folks who've never paid much attention to English grammar, note the inflections: "be" and "stop" are not the usual third-person singular forms. There's a handful of these constructions in English, and they're actually pretty analogous to the Romance subjunctive. Honestly, I found the French subjunctive much easier than developing a more-or-less natural-sounding imperfect. I got a lot of red ink over on lang-8 working on the imperfect. :-)

rapp wrote:
I haven't updated my Neutrino log in forever, but I'm still plugging away every day. I haven't missed a day of studying since I started the program on September 1, 2013, so I can definitely say it has been very motivational. Khatz recently revamped the scoring and number of levels that make up the program, so none of the stats I had reported in my log will make sense going forward, but I should make a new entry just to cover a couple of Anki-related techniques I've come up with along the way.

I will keep my eyes open for new log entries! I always enjoyed your log greatly.

rapp wrote:
I've made a couple of aborted attempts to use subs2srs before, but I keep running into problems getting "clean" subtitle files to use. I may have to try the OCR instructions you've provided for generating them from a DVD, but I've got an idea for an alternative method. I haven't tried it yet, so this is pure speculation, but I wonder if you have any insight on whether it would be workable.

The idea is to get a subtitle file that has accurate subtitles, but faulty timing. I would strip out all of the timing info, leaving just the text of the subtitles. Then I would load that text and the audio from the show/movie into Transcriber. Transcriber should allow me to just listen along to the audio, hitting the enter key to mark the start/end of each subtitle.

I've never used Transcriber. Several years ago, Transcriber was abandoned in favor of TranscriberAG, which was in turn abandoned due to a lack of development funds. There's a more recent fork on GitHub that will compile and run under Ubuntu 12.04 if you kick it repeatedly and rename AVCodecID to CodecID everywhere it appears in the source code.
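For anyone who wants to script the AVCodecID-to-CodecID rename rather than do it by hand, here is a minimal sketch. It assumes the fork's sources are C/C++ files with the extensions listed below, and that replacing only whole-word matches is what you want; both are assumptions, and rename_identifier is a hypothetical helper, not something that ships with TranscriberAG. Try it on a copy of the source tree first.

```python
import re
from pathlib import Path


def rename_identifier(root, old, new,
                      suffixes=(".c", ".cc", ".cpp", ".h", ".hpp")):
    """Replace whole-word occurrences of `old` with `new` under `root`.

    Files are processed as raw bytes so that encodings and line endings
    are left untouched. Returns the list of files that were modified.
    The suffix list is an assumption about which files hold source code.
    """
    # \b keeps longer identifiers that merely contain `old` from matching.
    pattern = re.compile(rb"\b" + re.escape(old.encode("ascii")) + rb"\b")
    replacement = new.encode("ascii")
    changed = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        data = path.read_bytes()
        new_data, count = pattern.subn(replacement, data)
        if count:
            path.write_bytes(new_data)
            changed.append(path)
    return changed
```

Usage would look like rename_identifier("path/to/source", "AVCodecID", "CodecID"), where the path is a placeholder for wherever the fork was checked out.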

All of my earliest experiments with Anki audio cards were actually created using TranscriberAG and some custom scripts that I hacked together. These cards were a huge success, so I later experimented with Subs2srs, and more recently with Subtitle Edit. The workflow you see in this log and on the wiki has been greatly improved since the earliest days.

That said, TranscriberAG is actually pretty nice software, if you can get it to work:



I find that the audio controls are very accurate, and pressing return is pretty efficient. But I can't do it in real time, because my reflexes just aren't good enough. I need to pause playback, adjust the playback location, select the correct text, and hit return. It's probably still slightly more efficient than Subtitle Edit. But TranscriberAG also involves more conversion passes. It needs specific audio formats for input, and it outputs in *.tag format, which is not on the list of ~200 file formats supported by Subtitle Edit. So you'll waste time extracting audio tracks and converting subtitles. If you use the original Transcriber, at least you'll have an easier time using the output.

Anyway, my Subtitle Edit process is getting more efficient. I can now do a 23-minute episode in about an hour (and I doubt I could do much better with TranscriberAG). The only hard part is dealing with episodes with lots of background noise: these prevent me from seeing nice, clean waveforms for speech, and they can drag the process out to two hours or so.

EDIT: Further experimentation suggests that TranscriberAG isn't that efficient for video. There are two major drawbacks compared to Subtitle Edit:

1. TranscriberAG can't use srt files as a starting point for aligning. Among other things, this makes it hard to skip over sections without any dialog, because you need to play until you find the next lines. With Subtitle Edit, you just click and it gets you in the ballpark.

2. You can't watch the video. This is especially boring when you need to listen to all the audio anyway, because of (1).

Edited by emk on 05 December 2014 at 1:10am

2 persons have voted this message useful



Crush
Diglot
Senior Member
China
Joined 4408 days ago

1622 posts - 2298 votes 
Speaks: English*, Spanish
Studies: Mandarin, Esperanto, Basque

 
 Message 108 of 147
05 December 2014 at 2:49am | IP Logged 
I've been wondering about an efficient way to turn transcripts into subtitles, as well. I was thinking of using the French transcripts of The Simpsons over at http://simpsonspark.com/ as there is a lot of colloquial speech and stuff going on, but I haven't yet found a nice way to go about it. My idea was to try to use hunalign with the original English subtitles to try to get them relatively aligned and then fine-tune them from there, but I'm still trying to figure out the best way to go about it.
2 persons have voted this message useful





emk
Diglot
Moderator
United States
Joined 4075 days ago

2615 posts - 8805 votes 
Speaks: English*, FrenchB2
Studies: Spanish, Ancient Egyptian
Personal Language Map

 
 Message 109 of 147
05 December 2014 at 12:34pm | IP Logged 
Crush wrote:
My idea was to try to use hunalign with the original English subtitles to try to get them relatively aligned and then fine-tune them from there, but I'm still trying to figure out the best way to go about it.

That is a remarkably good idea. But it's probably only worth the effort if you can code. Maybe I need to revisit substudy soon, and see if hunalign can be turned into a library. Hmm.
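For those who do code, the timing-transfer half of that idea might look roughly like the sketch below. It assumes an aligner has already produced groups of matching lines between the timed English subtitles and the untimed transcript. The "pairs" structure used here is a hypothetical, simplified stand-in (it is not hunalign's actual output format), and transfer_timings is a made-up name.

```python
def transfer_timings(cues, transcript_lines, pairs):
    """Copy timings from timed subtitles onto untimed transcript lines.

    cues: list of (start_seconds, end_seconds, text) tuples from the
        timed subtitles (for example, the original English ones).
    transcript_lines: list of untimed transcript lines.
    pairs: list of (cue_indices, line_indices) tuples saying which
        cues correspond to which transcript lines. This structure is a
        hypothetical simplification of what an aligner might provide.

    Returns a list of (start_seconds, end_seconds, text) tuples.
    """
    result = []
    for cue_indices, line_indices in pairs:
        # Skip groups that have nothing on one side; those transcript
        # lines would still need to be timed by hand.
        if not cue_indices or not line_indices:
            continue
        # The merged cue spans from the earliest start to the latest end.
        start = min(cues[i][0] for i in cue_indices)
        end = max(cues[i][1] for i in cue_indices)
        text = " ".join(transcript_lines[j] for j in line_indices)
        result.append((start, end, text))
    return result
```

The output would only be a rough first pass; the resulting cues would still need fine-tuning against the audio, as suggested in the quoted post.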

If you don't want to write code, Subtitle Edit has a "translator's mode" which shows two sets of subtitles side-by-side, and you can just copy-and-paste snippets of the transcript into the correct boxes. This works tolerably well.

A few cards from today

I have no idea whether or not posting cards is still informative at this point; perhaps everybody has gotten the idea by now. :-) But it's fun and only takes a couple of minutes.



- Left: I saw cabeza "head" in Harry Potter when I read two pages on Readlang. Everything is starting to reinforce everything else, which is good.

- Right: Some lovely grammar, well worth internalizing. It's a Spanish Verb Workout™, guaranteed to produce a Deep Understanding of Spanish Verbs™ rapidly, or your money back. :-)



- Left: Yup, sctroyenne, here's esperar meaning "to hope". Looks like it means both "wait" and "hope" in Spanish.

- Right: Some weird stuff going on here.



- Left: This card just "clicked" this morning. I had been mishearing it, and needing multiple listens, but this time, everything just made sense.

- Right: This is just a perfect way to learn cualquier. I mean, I could make a pair of Anki vocabulary cards where I translated back and forth between cualquier and "any", and I could inevitably confuse it with similar-sounding words, and fail it repeatedly, and relearn it, and finally work to break my "translation reflex" and understand it in actual Spanish speech. Or I could just listen to an unstable, menacing king use the word in context and absorb it naturally. :-)

Sooner or later, if I'm patient, most words will appear in a perfect context.

Another listening experiment!

Last night, my wife and I listened to two shows in Spanish without subtitles: the documentary that tastyonions recommended, and episode 3 of Avatar. This is the easiest of the early episodes, and I had watched it a week ago with English subs. Here's how we did:

The documentary: My wife just destroyed me on listening comprehension, even though she's never studied Spanish. She said that her remaining smidgen of Italian might have helped a little bit. But apparently, when it comes to moving between Romance languages, a native French speaker still has a huge listening advantage over someone with B2+ listening comprehension and C1+ reading comprehension. Interesting. I wonder if this holds true for any skill besides listening?

Avatar episode 3: And here's where I got my revenge. I had an unfair advantage—thanks to having watched a week ago with English subs—but I still understood vastly more than my wife. Apparently my efforts are paying off in a very real way here: I know a lot more of the basic, physical vocabulary, and I'm apparently better at the mechanics of Spanish verbs.

Reviews went well this morning. Combining subs2srs with occasionally-comprehensible input seems to work better than just subs2srs by itself. This is a fun experiment. :-)

Edited by emk on 05 December 2014 at 8:20pm

3 persons have voted this message useful



lorinth
Tetraglot
Senior Member
Belgium
Joined 2817 days ago

443 posts - 581 votes 
Speaks: French*, English, Spanish, Latin
Studies: Mandarin, Finnish

 
 Message 110 of 147
05 December 2014 at 5:41pm | IP Logged 
Quote:
when it comes to moving between Romance languages, a native French speaker still has a huge listening advantage over someone with B2+ listening comprehension and C1+ reading comprehension. Interesting. I wonder if this holds true for any skill besides listening?


I believe it's even more obvious as far as reading is concerned. As a native French speaker who has studied Spanish and Latin to a fairly advanced level, not counting my native Romance Walloon tongue, I can understand sizeable amounts of written Portuguese or Italian (especially nonfiction, where cognates are even more numerous) without having ever formally studied those two languages. Understanding the spoken language is more difficult (especially Portuguese), but still there's a huge discount.

Having studied Dutch to a fairly decent intermediate level, I can understand quite a lot of [EDIT: written or simple spoken] German. The first time I went to Finland, having studied only some basic features of Finnish, I discovered that I could learn to puzzle out more from Swedish signs, a language that I've never ever studied but that has a few recognizable cognates with other Germanic languages, than from Finnish signs, although I had studied that language, albeit for a short while.

emk, that's a fascinating thread.

Edited by lorinth on 05 December 2014 at 5:44pm

1 person has voted this message useful



Tupiniquim
Senior Member
Brazil
Joined 4626 days ago

184 posts - 217 votes 
Speaks: Portuguese*
Studies: English, Russian

 
 Message 111 of 147
05 December 2014 at 7:29pm | IP Logged 
emk wrote:
I have no idea whether or not posting cards is still informative at this point; perhaps everybody has gotten the idea by now. :-) But it's fun and only takes a couple of minutes.


They are helpful indeed and always a treat.
2 persons have voted this message useful



victorhart
Bilingual Tetraglot
Groupie
United States
mandarinexperiment.o
Joined 2250 days ago

66 posts - 155 votes 
Speaks: English*, Portuguese*, Spanish, French
Studies: Mandarin

 
 Message 112 of 147
14 December 2014 at 3:58pm | IP Logged 
How have your past 10 days gone, emk?

After a long hiatus, I am back in business with my Mandarin experiment.

My latest blog post is mostly about subs2srs and the
methodology you are using. I hope it will bring new people to this forum and generate
interest in your fantastic approach!

Edited by victorhart on 14 December 2014 at 3:59pm



2 persons have voted this message useful



Copyright 2020 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.