Who is Creating Parallel Texts?

  Tags: Bilingual texts
mrwarper
Diglot
Winner TAC 2012
Senior Member
Spain
Joined 5227 days ago

1493 posts - 2500 votes 
Speaks: Spanish*, EnglishC2
Studies: German, Russian, Japanese

 
 Message 49 of 76
06 August 2011 at 3:54pm | IP Logged 
Actually, so far I've spent more time polishing my tools than creating parallel texts with them... juggling two jobs and all, that was OK, since I figured even if I had no time for myself, my students would benefit from it anyway. Boy was I wrong! While there are some people who appreciate and use my stuff, they're certainly way more serious about their language studies than my students are - isn't that ironic? ;(

Anyway, I finally found some time to create parallel texts for myself, and it turns out I have very little need for them, in the sense that after reading and absorbing a few parallel texts, I am quite comfortable reading target-only materials.

The upside is that I'm now able to produce very high-quality (even if I say so myself) 2-way or 3-way parallel texts, at an approximate rate of one three-hundred-page novel per evening. My output is such that you can switch between (roughly) paragraph-level and sentence-level alignment at will, and display it in columns or interleaved (Franker style?), so pretty much anyone should be happy with it. The texts I produce can also be printed for those who, like myself, prefer reading long things on paper, or kept in electronic form to play around with, link to online dictionaries, etc. Samples are available upon request.

To get my tools ready for prime time (they still need some more polishing, but I'm lacking motivation) I've been thinking of offering some kind of cheap aligning service here at HTLAL (where else would I find people in need of such a weird thing?). However, I'm not sure whether such commercial activities, even very low-profile ones, contravene any rules or have to go through some kind of procedure, so I'd be glad to hear from anyone with ideas about this or previous experience with it.

1 person has voted this message useful



Abdalan
Triglot
Senior Member
Brazil
abdalan.wordpress.co
Joined 5047 days ago

120 posts - 194 votes 
Speaks: Portuguese*, French, English
Studies: German

 
 Message 50 of 76
07 August 2011 at 7:38pm | IP Logged 
I have also created some parallel texts. Two of them can be seen here (Scribd).

I simply love the fact that someone could use those books and spend much less time than I did to profit from them as a learning tool, as I put a lot of effort (time) into making them comfortable to follow (at least I guess so).



Edited by Abdalan on 07 August 2011 at 7:48pm

3 persons have voted this message useful



carlonove
Senior Member
United States
Joined 5987 days ago

145 posts - 253 votes 
Speaks: English*
Studies: Italian

 
 Message 51 of 76
08 August 2011 at 1:26am | IP Logged 
Can you elaborate on some of the programs you use? Or is this custom software you've written yourself?
1 person has voted this message useful



mrwarper
Diglot
Winner TAC 2012
Senior Member
Spain
Joined 5227 days ago

1493 posts - 2500 votes 
Speaks: Spanish*, EnglishC2
Studies: German, Russian, Japanese

 
 Message 52 of 76
16 August 2011 at 5:25pm | IP Logged 
carlonove wrote:
Can you elaborate on some of the programs you use? Or is this custom software you've written yourself?


One thing doesn't impede the other -- I've basically written the tool myself, and I'll elaborate a bit :)

Philosophy:

My thing relies on the immense superiority of human beings over machines at aligning even poorly understood texts, so it tries to put /you/ in charge of that instead of trying to do it algorithmically. With the help of a tool built for the purpose, you can do it much faster than you would by hand alone, and far more accurately (even if more slowly) than any program would on its own.

The intended purpose of my tool, btw, is to have texts split into 'segments' (roughly the same as paragraphs) that usually have the same content, and logically bind matching segments (and mark the non-matching ones) across these texts, which is only an intermediate task in creating a good parallel text, while allowing for other manual adjustments such as spelling corrections, defining sections, etc.

Current, basic mechanics, later to be made a bit more ergonomic for common computer users:

You pick two or three texts in HTML format (convert them first!), usually one in a target language of yours, one in a language you understand well, and maybe a machine translation from A to B, and paste them into a template. You load the template in a web browser like Firefox and start playing with it. The texts are presented in parallel columns that can scroll together or independently.

My tool (just some JavaScript, actually) relies on the texts being reasonably untouched and split into paragraphs just as the author intended. I assume most paragraphs will match across columns, and you'll have maybe a few blocks like 'Foreword to the German Edition', etc. that are unique to each of them. You can mark these accordingly to leave gaps in the other columns, or drop them altogether. After that it is just a matter of going through the whole text, fixing on sight the (hopefully) few places where single paragraphs do not match (hello, translators, editors and other manglers) and you have two-to-one, three-to-two and similar correspondences. There you just fuse segments together where appropriate, and realign them. As [non-]matching segments are defined by a human, they can optionally be split further, down to the sentence level.
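For illustration only, here is a minimal Python sketch of the segment handling described above. It is not the actual tool (which is JavaScript and not shown in this thread); the names `Segment`, `fuse` and `align` are hypothetical, and it assumes two columns where every segment not marked unique matches the next unmarked segment in the other column.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Segment:
    text: str
    unique: bool = False  # True for blocks with no counterpart in the other column


def fuse(column: List[Segment], i: int) -> List[Segment]:
    """Merge segment i with segment i+1 (for two-to-one correspondences)."""
    merged = Segment(column[i].text + " " + column[i + 1].text)
    return column[:i] + [merged] + column[i + 2:]


def align(left: List[Segment], right: List[Segment]) -> List[Tuple[Optional[str], Optional[str]]]:
    """Pair segments in order; a unique segment is paired with a gap (None)."""
    rows = []
    i = j = 0
    while i < len(left) or j < len(right):
        if i < len(left) and left[i].unique:
            rows.append((left[i].text, None))
            i += 1
        elif j < len(right) and right[j].unique:
            rows.append((None, right[j].text))
            j += 1
        else:
            l = r = None
            if i < len(left):
                l = left[i].text
                i += 1
            if j < len(right):
                r = right[j].text
                j += 1
            rows.append((l, r))
    return rows


# Hypothetical example: a foreword that exists only in the German text,
# and one paragraph that the English edition split in two.
left = [
    Segment("Vorwort zur deutschen Ausgabe", unique=True),
    Segment("Erster Absatz."),
    Segment("Zweiter Absatz."),
]
right = [
    Segment("First paragraph, part one."),
    Segment("Part two."),
    Segment("Second paragraph."),
]
right = fuse(right, 0)  # the human decides these two belong together
rows = align(left, right)
```

The point of the sketch is that the program only does the bookkeeping; deciding which segments to mark as unique or to fuse is left to the person reading the texts.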

Once you're done, you 'hit a button' and matching segments are laid out to form a parallel text. This can be viewed or printed normally in any browser, and additional integrated tools (more JavaScript) allow further playing, like switching presentational styles, interleaving horizontally or in columns, etc.
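As a rough idea of what the layout step could look like, here is a small Python sketch that renders aligned rows either as side-by-side columns or interleaved. The function names and the HTML structure are assumptions made for this example, not the tool's real output format.

```python
from html import escape
from typing import List, Optional, Tuple

Row = Tuple[Optional[str], ...]


def render_columns(rows: List[Row]) -> str:
    """One HTML table row per aligned group; gaps (None) become empty cells."""
    lines = ["<table>"]
    for row in rows:
        cells = "".join(f"<td>{escape(text or '')}</td>" for text in row)
        lines.append(f"<tr>{cells}</tr>")
    lines.append("</table>")
    return "\n".join(lines)


def render_interleaved(rows: List[Row]) -> str:
    """Each segment directly followed by its counterpart(s); gaps are skipped."""
    lines = []
    for row in rows:
        lines.extend(f"<p>{escape(text)}</p>" for text in row if text)
    return "\n".join(lines)


# Hypothetical aligned rows: one matching pair and one unmatched block.
rows = [("Erster Absatz.", "First paragraph."), ("Vorwort.", None)]
columns_html = render_columns(rows)
interleaved_html = render_interleaved(rows)
```

Since both renderers work from the same list of aligned rows, switching presentation styles needs no realignment, and the same functions cover 3-way texts because each row may hold any number of entries.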

The thing itself and any further details you might feel curious about are available upon request as well :)

Edited by mrwarper on 17 August 2011 at 1:37pm

1 person has voted this message useful



Sprachprofi
Nonaglot
Senior Member
Germany
learnlangs.com
Joined 6471 days ago

2608 posts - 4866 votes 
Speaks: German*, English, French, Esperanto, Greek, Mandarin, Latin, Dutch, Italian
Studies: Spanish, Arabic (Written), Swahili, Indonesian, Japanese, Modern Hebrew, Portuguese

 
 Message 53 of 76
16 August 2011 at 9:38pm | IP Logged 
I would love to have that for my Chinese aligning...
1 person has voted this message useful



Abdalan
Triglot
Senior Member
Brazil
abdalan.wordpress.co
Joined 5047 days ago

120 posts - 194 votes 
Speaks: Portuguese*, French, English
Studies: German

 
 Message 54 of 76
17 August 2011 at 12:54am | IP Logged 
I think I could help with the initial steps of aligning between these languages: English, Spanish, French, German, Italian, Polish and Portuguese... and try Russian, Turkish and Ukrainian if there are really good translations (.docx file).
1 person has voted this message useful



mrwarper
Diglot
Winner TAC 2012
Senior Member
Spain
Joined 5227 days ago

1493 posts - 2500 votes 
Speaks: Spanish*, EnglishC2
Studies: German, Russian, Japanese

 
 Message 55 of 76
17 August 2011 at 6:24pm | IP Logged 
Abdalan wrote:
I have also created some parallel texts. Two of them can be seen here (Scribd).

I simply love the fact that someone could use those books and spend much less time than I did to profit from them as a learning tool, as I put a lot of effort (time) into making them comfortable to follow (at least I guess so).

Those look very good, and they're very similar to a printed version of mine.

However, any texts I've ever produced will be available upon request only, because copyright infringement is more than likely, and I suggest you do the same to avoid trouble (I haven't checked the copyright status of your texts, but just in case).

From my early experiments I can successfully align any languages as long as I can reasonably read one of them and the other is not ideographic -- I aligned a long Swedish-Russian text with only 4 alignment mistakes according to someone who could read them at the time, and you can expect I've only gotten better at it over time.

The alignment process now takes me only a few hours per book, and the real challenge is to find all of the OCR mistakes, misspellings, etc., and get them corrected in the original. I usually do that as I read on paper, but sometimes it is unbearable and I resort to correcting them while I read on screen to avoid wasting more time.


Edited by mrwarper on 17 August 2011 at 6:25pm

1 person has voted this message useful



mrwarper
Diglot
Winner TAC 2012
Senior Member
Spain
Joined 5227 days ago

1493 posts - 2500 votes 
Speaks: Spanish*, EnglishC2
Studies: German, Russian, Japanese

 
 Message 56 of 76
16 June 2012 at 2:57pm | IP Logged 
Bump. WRT PDF-to-text conversion and page numbers etc. getting in the way, I've also made my own tool to remove all that before conversion, but I suspect normal human beings will find Briss ;) much more to their taste.
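For anyone who only has the already-converted text, a cruder alternative is to filter out page-number lines afterwards. This is a different technique from removing them before conversion as described above, and the Python sketch below rests on an assumption: that a page number sits alone on its own line, optionally preceded by the word "page". The function name and the sample text are made up for the example.

```python
import re

# Assumption: a page number is a line holding only 1-4 digits,
# optionally preceded by the word "page".
PAGE_NUMBER = re.compile(r"^\s*(?:page\s+)?\d{1,4}\s*$", re.IGNORECASE)


def strip_page_numbers(text: str) -> str:
    """Drop every line that looks like a bare page number."""
    kept = [line for line in text.splitlines() if not PAGE_NUMBER.match(line)]
    return "\n".join(kept)


sample = "It was a dark night.\n  42  \nThe rain fell.\nPage 43\nIn 1984 he left."
cleaned = strip_page_numbers(sample)
```

Note that this would also delete legitimate lines that consist of nothing but a number, such as a chapter heading "1", so the result still needs a human look.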


1 person has voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.