exscribere Diglot Senior Member India Joined 5256 days ago 104 posts - 126 votes Speaks: English*, Danish Studies: Mandarin, French, Korean, Hindi
| Message 9 of 34 30 June 2010 at 6:32pm |
This is great... now I just need to find texts and audio files I can match up and try to contribute!
1 person has voted this message useful
|
slattery Newbie United States dinglabs.com Joined 6453 days ago 13 posts - 38 votes Speaks: English*
| Message 10 of 34 01 July 2010 at 12:39am |
exscribere wrote:
"This is great... now I just need to find texts and audio files I can match up and try to contribute!"
If anyone wants to contribute, I'll go ahead and spill the beans: the secret sauce is
"Transcriber". All the content is prepared using that tool. I've written a tutorial on
using it, here. That should get you up and running in no time. You can start working with your favorite
L2 podcast or audiobook. And you can synchronize it with either L1 or L2 text! :)
@Aineko: naturalarabic.com looks really interesting, thanks for sharing that.
7 persons have voted this message useful
|
johntm93 Senior Member United States Joined 5304 days ago 587 posts - 746 votes 2 sounds Speaks: English* Studies: German, Spanish
| Message 11 of 34 01 July 2010 at 4:26am |
Sir...you have done great things for the language learning community.
2 persons have voted this message useful
|
jerrypettit Groupie United States Joined 6003 days ago 79 posts - 103 votes Speaks: English*
| Message 12 of 34 01 July 2010 at 10:25pm |
This is great, great stuff.
Keep working on it!
1 person has voted this message useful
|
fizzer Newbie United Kingdom Joined 5522 days ago 17 posts - 25 votes Speaks: English* Studies: German, French
| Message 13 of 34 02 July 2010 at 10:07am |
Great idea. If only you could automate the transcription process...
- Assume a language with full stops or some equivalent.
- Acquire original text file, T, and corresponding high quality human-read sound file, H.
- Split T into N sentence-sized pieces, Si (i = 0, ..., N-1), at the full stops.
- Pass the Si through a free text-to-speech converter to generate N poor quality sound samples, Pi, of lengths Li.
- As a first guess split H into N samples, Hi, of lengths proportional to the Li.
- Find a computationally feasible error function e(h, p) representing the distance between two audio samples.
- Use a genetic algorithm to adjust the lengths of the Hi (starting from the proportional first guess) so as to minimize the sum of e(Hi, Pi) over i = 0, ..., N-1.
- The hope is that the Hi you end up with correspond to the Pi, and hence to the Si.
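The first-guess split in the steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `initial_segments` and the choice of seconds as the unit are hypothetical, and the text-to-speech lengths Li are assumed to have been measured already.

```python
from typing import List, Tuple


def initial_segments(tts_lengths: List[float],
                     total_duration: float) -> List[Tuple[float, float]]:
    """Split [0, total_duration] into consecutive (start, end) segments
    whose lengths are proportional to tts_lengths (the Li).

    Hypothetical helper: tts_lengths are the durations of the
    text-to-speech samples Pi; total_duration is the length of H.
    """
    total_tts = sum(tts_lengths)
    if total_tts <= 0:
        raise ValueError("tts_lengths must sum to a positive value")

    scale = total_duration / total_tts  # stretch factor from TTS time to H time
    segments = []
    start = 0.0
    for length in tts_lengths:
        end = start + length * scale
        segments.append((start, end))
        start = end
    return segments


# Two sentences whose TTS renderings last 1 s and 3 s, matched to 8 s of human audio.
print(initial_segments([1.0, 3.0], 8.0))  # → [(0.0, 2.0), (2.0, 8.0)]
```

A search procedure such as the genetic algorithm mentioned above would then nudge these boundaries; the proportional split only provides the starting point.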
The hard part is finding a good error function. The literature on speech analysis is very scary. A first try might be the distance between Fourier transforms over the interval.
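As a rough illustration of that "first try", here is a minimal sketch of an error function e(h, p) based on Fourier magnitudes. Everything in it is an assumption made for the example: the names `magnitude_spectrum` and `spectral_distance`, the choice of 32 frequency bins, and the use of plain lists of samples at a shared sample rate. It uses a naive transform from the standard library, so it is far too slow for real audio and is not a tested alignment method.

```python
import cmath
import math
from typing import List, Sequence


def magnitude_spectrum(samples: Sequence[float], bins: int = 32) -> List[float]:
    """Magnitude of the Fourier transform at `bins` fixed frequencies.

    Frequencies are expressed in cycles per sample (0 up to just below 0.5)
    and magnitudes are divided by the sample count, so two clips of
    different lengths yield comparable vectors.
    """
    n = len(samples)
    if n == 0:
        raise ValueError("samples must not be empty")
    spectrum = []
    for b in range(bins):
        freq = b / (2 * bins)
        total = sum(s * cmath.exp(-2j * math.pi * freq * i)
                    for i, s in enumerate(samples))
        spectrum.append(abs(total) / n)
    return spectrum


def spectral_distance(h: Sequence[float], p: Sequence[float],
                      bins: int = 32) -> float:
    """Hypothetical e(h, p): Euclidean distance between magnitude spectra."""
    mh = magnitude_spectrum(h, bins)
    mp = magnitude_spectrum(p, bins)
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(mh, mp)))


def tone(freq: float, n: int) -> List[float]:
    """Synthetic test signal: a sine wave at `freq` cycles per sample."""
    return [math.sin(2 * math.pi * freq * i) for i in range(n)]


low = tone(4 / 64, 128)
low_longer = tone(4 / 64, 256)   # same pitch, twice as long
high = tone(20 / 64, 128)        # different pitch

print(spectral_distance(low, low_longer))  # very small: same frequency content
print(spectral_distance(low, high))        # clearly larger
```

Note that a whole-interval spectrum discards timing within the clip, so two sentences with similar overall frequency content would look alike; that is one reason this can only be a first try.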
1 person has voted this message useful
|
slattery Newbie United States dinglabs.com Joined 6453 days ago 13 posts - 38 votes Speaks: English*
| Message 14 of 34 02 July 2010 at 4:01pm |
fizzer wrote:
"Great idea. If only you could automate the transcription process..."
Check this out, I think you'll get a kick out of this! :)
"The Linguist" by Steve Kaufmann, in the DingLabs Reader
Pretty much every single WORD is synchronized with the audio, all 4+ hours of it!
That is something I have just recently been able to do, but only for English material.
I hope English learners will feel empowered to take on "The Linguist" book, with it available in this format. I really liked it.
Steve Kaufmann generously allowed LingQ material to be made available in this audio-text synchronized format, as long as attribution is made.
If anybody wants to help align LingQ, or any other content, for ANY language,
please get in touch with me, and we can build out a great library of content
for ourselves, and fellow language learners.
2 persons have voted this message useful
|
Luai_lashire Diglot Senior Member United States luai-lashire.deviant Joined 5805 days ago 384 posts - 560 votes Speaks: English*, Esperanto Studies: Japanese, French
| Message 15 of 34 02 July 2010 at 4:10pm |
This post seems to have many perfect candidates for your great tool, and as a student of Japanese I would love to see even a few of them made available. I don't know if I'd have the time or know-how to do anything about it myself, though.
2 persons have voted this message useful
|
jerrypettit Groupie United States Joined 6003 days ago 79 posts - 103 votes Speaks: English*
| Message 16 of 34 02 July 2010 at 4:13pm |
I've downloaded Transcriber and will fiddle around with the text and MP3 audio (which I'll need to convert to WAV) of the Italian-language PINOCCHIO, which is available on the Web.
Not that I'll do the whole book, but I want to see how lengthy a process this is.
It would be interesting to know how long it took to do the word-by-word synchronization of "The Linguist". Was there some kind of automated procedure involved? That seems like an incredible amount of work...
3 persons have voted this message useful
|