mrwarper Diglot Winner TAC 2012 Senior Member Spain Joined 5227 days ago 1493 posts - 2500 votes Speaks: Spanish*, English C2 Studies: German, Russian, Japanese
| Message 49 of 76 06 August 2011 at 3:54pm | IP Logged |
Actually, so far I've spent more time polishing my tools than creating parallel texts with them... juggling two jobs and all, that was OK, since I figured even if I had no time for myself, my students would benefit from it anyway. Boy was I wrong! While there are some people who appreciate and use my stuff, they're certainly way more serious about their language studies than my students are - isn't that ironic? ;(
Anyway, I finally found some time to create parallel texts for myself, and it turns out I have very little need for them, in the sense that after reading and absorbing a few parallel texts, I am quite comfortable reading target-only materials.
The upside is that I'm now able to produce very high-quality (if I say so myself) 2-way or 3-way parallel texts, at an approximate rate of one three-hundred-page novel per evening. My output is such that you can switch between (roughly) paragraph-level and sentence-level alignment at will, and display it in columns or interleaved (Franker style?), so pretty much anyone should be happy with it. The texts I produce can also be printed for those who, like myself, prefer reading long things on paper, or kept in electronic form to play around with, link to on-line dictionaries, etc. Samples are available upon request.
To get my tools ready for prime time (they still need some more polishing but I'm lacking motivation) I've been thinking of offering some kind of cheap aligning service here at HTLAL (where else would I find people in need of such a weird thing?). However, I'm not sure whether such commercial activities contravene any rules or have to go through some kind of procedure, even when they are very low-profile, so I'd be glad to hear from anyone with ideas about, or previous experience with, this.
1 person has voted this message useful
| Abdalan Triglot Senior Member Brazil abdalan.wordpress.co Joined 5047 days ago 120 posts - 194 votes Speaks: Portuguese*, French, English Studies: German
| Message 50 of 76 07 August 2011 at 7:38pm | IP Logged |
I created some parallel texts also. Two of them can be seen here (Scribd).
I simply love the fact that someone could use those books, spending much less time to profit from them as a learning tool than I did, as I put a lot of effort (time) into making them comfortable to follow (at least I guess so).
Edited by Abdalan on 07 August 2011 at 7:48pm
3 persons have voted this message useful
| carlonove Senior Member United States Joined 5987 days ago 145 posts - 253 votes Speaks: English* Studies: Italian
| Message 51 of 76 08 August 2011 at 1:26am | IP Logged |
Can you elaborate on some of the programs you use? Or is this custom software you've written yourself?
1 person has voted this message useful
| mrwarper Diglot Winner TAC 2012 Senior Member Spain Joined 5227 days ago 1493 posts - 2500 votes Speaks: Spanish*, English C2 Studies: German, Russian, Japanese
| Message 52 of 76 16 August 2011 at 5:25pm | IP Logged |
carlonove wrote:
Can you elaborate on some of the programs you use? Or is this custom software you've written yourself?
One thing doesn't preclude the other -- I've basically written the tool myself, and I'll elaborate a bit :)
Philosophy:
My thing relies on the immense superiority of human beings over machines at aligning even poorly understood texts, so it puts /you/ in charge of that instead of trying to do it algorithmically. With the help of a purpose-built tool, you can do it much faster than you normally would, and far more accurately than any program would on its own (even if more slowly).
The intended purpose of my tool, btw, is to split texts into 'segments' (roughly the same as paragraphs) that usually have the same content, and to logically bind matching segments (and mark the non-matching ones) across these texts. That is only an intermediate task in creating a good parallel text, so the tool also allows for other manual adjustments such as spelling corrections, defining sections, etc.
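To make the idea of 'segments' concrete, here is a minimal sketch of one possible data model. It is not the tool's actual code: the function names, the `unmatched` flag and the blank-line paragraph rule are all assumptions made for illustration (the post says the real input is HTML).

```javascript
// Hypothetical data model: a text is an array of segments (roughly paragraphs).
// Assumption: paragraphs in the input are separated by blank lines.
function splitIntoSegments(text) {
  return text
    .split(/\n\s*\n/)                               // blank line = paragraph break
    .map((s) => s.replace(/\s+/g, " ").trim())     // normalise inner whitespace
    .filter((s) => s.length > 0)                    // drop empty chunks
    .map((s, i) => ({ id: i, text: s, unmatched: false }));
}

// Mark a segment as having no counterpart in the other text(s),
// e.g. a foreword that exists in only one edition. Returns a new array.
function markUnmatched(segments, id) {
  return segments.map((seg) =>
    seg.id === id ? { ...seg, unmatched: true } : seg
  );
}

const german = splitIntoSegments(
  "Vorwort zur deutschen Ausgabe\n\nErster Absatz.\n\nZweiter Absatz."
);
const marked = markUnmatched(german, 0);
console.log(marked.map((s) => (s.unmatched ? "x " : "= ") + s.text).join("\n"));
```

Keeping the human decision as a plain flag on each segment means the later layout step only has to read the data, not guess at it.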
Current, basic mechanics, later to be made a bit more ergonomic for common computer users:
You pick two or three texts in HTML format (convert them first!), usually one in a target language of yours, one in a language you understand well, and maybe a machine translation from A to B, and paste them into a template. You load the template in a web browser like Firefox and start playing with it. The texts are presented in parallel columns that can scroll together or independently.
My tool (just some JavaScript, actually) relies on the texts being reasonably untouched and split into paragraphs just as the author intended. I assume most paragraphs will match across columns, and you'll have maybe a few blocks like 'Foreword to the German Edition', etc. that are unique to each of them. You can mark these accordingly to leave gaps in the other columns, or drop them altogether. After that it is just a matter of going through the whole text and fixing on sight the (hopefully) few occasions where single paragraphs do not match (hello, translators, editors and other manglers) and you have two-to-one, three-to-two and similar correspondences. There you just fuse segments together where appropriate and realign them. As [non-]matching segments are defined by a human, they can optionally be split further, down to the sentence level.
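The fuse-and-realign step described above could look roughly like the following. This is only a sketch under assumptions: segments are plain objects with `text` and `unmatched` fields, and `fuse` and `align` are made-up names, not functions from the actual tool.

```javascript
// Merge segment i with the one after it (fixes a two-to-one correspondence).
function fuse(segments, i) {
  if (i < 0 || i >= segments.length - 1) {
    throw new RangeError("nothing to fuse at index " + i);
  }
  const merged = {
    text: segments[i].text + " " + segments[i + 1].text,
    unmatched: segments[i].unmatched && segments[i + 1].unmatched,
  };
  return [...segments.slice(0, i), merged, ...segments.slice(i + 2)];
}

// Pair segments in order. A segment marked unmatched gets a gap (null)
// on the other side; everything else is assumed to match one-to-one.
function align(left, right) {
  const rows = [];
  let i = 0;
  let j = 0;
  while (i < left.length || j < right.length) {
    if (i < left.length && left[i].unmatched) {
      rows.push([left[i++].text, null]);
    } else if (j < right.length && right[j].unmatched) {
      rows.push([null, right[j++].text]);
    } else {
      rows.push([
        i < left.length ? left[i++].text : null,
        j < right.length ? right[j++].text : null,
      ]);
    }
  }
  return rows;
}

const de = [
  { text: "Vorwort zur deutschen Ausgabe", unmatched: true },
  { text: "Es war spät. Niemand schlief.", unmatched: false },
  { text: "Ende.", unmatched: false },
];
const en = [
  { text: "It was late.", unmatched: false },
  { text: "Nobody slept.", unmatched: false },   // translator split the paragraph
  { text: "The end.", unmatched: false },
];
console.log(align(de, fuse(en, 0)));
```

Note that nothing here tries to detect mismatches: the human spots them and calls `fuse`, which is the whole point of the approach.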
Once you're done, you 'hit a button' and matching segments are laid out to form a parallel text. This can be viewed or printed normally in any browser, and additional integrated tools (more JavaScript) allow further playing, like switching presentational styles, laying the texts out interleaved or in columns, etc.
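As an illustration of switching between column and interleaved layouts, here is a small sketch that renders already-aligned rows as HTML strings. The row format (arrays of strings, `null` for a gap) and the function names are assumptions for this example, not the tool's real output format.

```javascript
// Escape the characters that would otherwise be read as HTML markup.
function escapeHtml(s) {
  return s.replace(/&/g, "&amp;").replace(/</g, "&lt;").replace(/>/g, "&gt;");
}

// Column layout: one table row per aligned row, one cell per language.
function renderColumns(rows) {
  const body = rows
    .map((row) =>
      "<tr>" +
      row.map((cell) => "<td>" + (cell === null ? "" : escapeHtml(cell)) + "</td>").join("") +
      "</tr>"
    )
    .join("\n");
  return "<table>\n" + body + "\n</table>";
}

// Interleaved layout: matching segments follow one another; gaps are skipped.
function renderInterleaved(rows) {
  return rows
    .map((row) =>
      row
        .filter((cell) => cell !== null)
        .map((cell) => "<p>" + escapeHtml(cell) + "</p>")
        .join("\n")
    )
    .join("\n");
}

const sampleRows = [
  ["Hola & adiós", "Hello & goodbye"],
  ["Prólogo", null],
];
console.log(renderColumns(sampleRows));
console.log(renderInterleaved(sampleRows));
```

Because both renderers take the same rows, changing the presentation never touches the alignment itself.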
The thing itself and any further details you might feel curious about are available upon request as well :)
Edited by mrwarper on 17 August 2011 at 1:37pm
1 person has voted this message useful
| Sprachprofi Nonaglot Senior Member Germany learnlangs.com Joined 6471 days ago 2608 posts - 4866 votes Speaks: German*, English, French, Esperanto, Greek, Mandarin, Latin, Dutch, Italian Studies: Spanish, Arabic (Written), Swahili, Indonesian, Japanese, Modern Hebrew, Portuguese
| Message 53 of 76 16 August 2011 at 9:38pm | IP Logged |
I would love to have that for my Chinese aligning...
1 person has voted this message useful
| Abdalan Triglot Senior Member Brazil abdalan.wordpress.co Joined 5047 days ago 120 posts - 194 votes Speaks: Portuguese*, French, English Studies: German
| Message 54 of 76 17 August 2011 at 12:54am | IP Logged |
I think I could help with the initial steps of aligning between these languages: English, Spanish, French, German, Italian, Polish and Portuguese... and try Russian, Turkish and Ukrainian if there are really good translations (.docx files).
1 person has voted this message useful
| mrwarper Diglot Winner TAC 2012 Senior Member Spain Joined 5227 days ago 1493 posts - 2500 votes Speaks: Spanish*, English C2 Studies: German, Russian, Japanese
| Message 55 of 76 17 August 2011 at 6:24pm | IP Logged |
Abdalan wrote:
I created some parallel texts also. Two of them can be seen here (Scribd).
I simply love the fact that someone could use those books, spending much less time to profit from them as a learning tool than I did, as I put a lot of effort (time) into making them comfortable to follow (at least I guess so).
Those look very good, and they're very similar to a printed version of mine.
However, any texts I've ever produced will be available upon request only, because copyright infringement is more than likely, and I suggest you do the same to avoid trouble (I haven't checked the copyright status of your texts, but just in case).
Judging from my early experiments, I can successfully align any pair of languages as long as I can reasonably read one of them and the other is not ideographic -- I aligned a long Swedish-Russian text with only 4 alignment mistakes, according to someone who could read both at the time, and you can expect I've only gotten better at it since.
The alignment process now takes me only a few hours per book, and the real challenge is to find all of the OCR mistakes, misspellings, etc., and get them fixed in the original. I usually do that as I read on paper, but sometimes it is unbearable and I resort to correcting them while I read on screen to avoid wasting more time.
Edited by mrwarper on 17 August 2011 at 6:25pm
1 person has voted this message useful
| mrwarper Diglot Winner TAC 2012 Senior Member Spain Joined 5227 days ago 1493 posts - 2500 votes Speaks: Spanish*, English C2 Studies: German, Russian, Japanese
| Message 56 of 76 16 June 2012 at 2:57pm | IP Logged |
Bump. WRT PDF conversion to text, with page numbers etc. getting in the way: I've also made my own tool to remove all that before conversion, but I suspect normal human beings will find Briss ;) much more to their taste.
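For readers who only have already-extracted text to work with, a rough sketch of a different approach is to filter out lines that look like bare page numbers after conversion. This is neither the tool mentioned above nor what Briss does; the function name and the patterns it recognises ("12", "- 13 -", "Page 14") are assumptions for illustration.

```javascript
// Drop lines that consist of nothing but a page number.
// Caveat: a genuine one-line paragraph such as "1984" would be dropped too,
// so the result should be checked by eye.
function stripPageNumbers(text) {
  const pageLine = /^\s*(?:page\s+)?[-–]?\s*\d{1,4}\s*[-–]?\s*$/i;
  return text
    .split("\n")
    .filter((line) => !pageLine.test(line))
    .join("\n");
}

const extracted = [
  "First line of the chapter",
  "12",
  "second line, after the page break",
  "- 13 -",
  "Page 14",
  "In 1984 he left.",
].join("\n");
console.log(stripPageNumbers(extracted));
```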
1 person has voted this message useful