
Who is Creating Parallel Texts?

  Tags: Bilingual texts
Language Learning Forum : General discussion
76 messages over 10 pages
a3
Triglot
Senior Member
Bulgaria
Joined 5257 days ago

273 posts - 370 votes 
Speaks: Bulgarian*, English, Russian
Studies: Portuguese, German, Italian, Spanish, Norwegian, Finnish

 
 Message 41 of 76
25 July 2011 at 9:06pm | IP Logged 
What about using wikipedia for parallel texts? It has many articles translated to many languages.
1 person has voted this message useful



The Stephen
Diglot
Groupie
United States
Joined 5053 days ago

65 posts - 77 votes 
Speaks: English*, German
Studies: Czech, Hungarian

 
 Message 42 of 76
26 July 2011 at 2:52am | IP Logged 
I'm in the process of creating parallel texts of Harry Potter and the Sorcerer's Stone, in both English-German and English-Hungarian. The hardest part is the strange formatting that results from converting .pdf files to .txt files. I'm still trying to figure out what to do with the page numbers, since they tend to turn up in strange places, which makes it hard to automate the process of finding and deleting them. I might just have to go through and remove them manually (ugh...). Otherwise I've had good luck automating everything, and am looking forward to doing more. It's an interesting intersection between my love of languages and my interest in computers (I'm working on a computer science degree).

LATE EDIT: Aha! I got it! And after a little cleaning up, I have to say I'm quite proud of the result. I wish I could share it, but I'm afraid of all the legal/copyright issues it would probably entail, especially with such a popular book.
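
For anyone fighting the same page-number problem: the post does not say how it was finally solved, so the following is only a minimal sketch of one possible heuristic, not the poster's method. It assumes page numbers survive the pdf-to-txt conversion as lines holding nothing but a short number (optionally wrapped in dashes), and that page breaks appear as form-feed characters. The names `PAGE_NUMBER` and `strip_page_numbers` are made up for illustration.

```python
import re

# Assumed heuristic: a page number is a line containing only 1-4 digits,
# optionally wrapped in dashes, e.g. "17" or "- 17 -".
PAGE_NUMBER = re.compile(r"^\s*-?\s*\d{1,4}\s*-?\s*$")


def strip_page_numbers(text: str) -> str:
    """Drop lines that look like bare page numbers.

    Form feeds (which many pdf converters emit at page breaks) are
    turned into newlines first so they do not hide a page number.
    """
    kept = []
    for line in text.replace("\f", "\n").splitlines():
        if PAGE_NUMBER.match(line):
            continue  # looks like a page number: skip it
        kept.append(line)
    return "\n".join(kept)


sample = (
    "The first page ends in the middle\n"
    "- 1 -\n"
    "\fof a sentence that continues here.\n"
    "2\n"
    "Chapter 4 begins on the next page."
)
cleaned = strip_page_numbers(sample)
print(cleaned)
```

Numbers inside a sentence are left alone, but a legitimate line that consists only of a number (a bare chapter number, say) would also be dropped, so a proof-read is still needed afterwards.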

Edited by The Stephen on 26 July 2011 at 9:03am

1 person has voted this message useful



jazzboy.bebop
Senior Member
Norway
norwegianthroughnove
Joined 5419 days ago

439 posts - 800 votes 
Speaks: English*
Studies: Norwegian

 
 Message 43 of 76
26 July 2011 at 1:16pm | IP Logged 
a3 wrote:
What about using wikipedia for parallel texts? It has many articles
translated to many languages.


From what I have seen a lot of articles are not translated but instead are totally new
articles written in a different language with perhaps considerable overlap in terms of
factual content but not necessarily presentation.

I'm not actually sure if there are any articles which are translated on a line by line
basis, or at least not many, as I think you need to provide references which are in the
language of the article itself so as a result each article across different languages is
independently researched and written.
1 person has voted this message useful



Cavesa
Triglot
Senior Member
Czech Republic
Joined 5010 days ago

3277 posts - 6779 votes 
Speaks: Czech*, FrenchC2, EnglishC1
Studies: Spanish, German, Italian

 
 Message 44 of 76
26 July 2011 at 5:25pm | IP Logged 
For what languages can you use the Google translator? I do not trust it. For example, when translating into Czech, it does not make "just a few mistakes". It creates complete nonsense (both by using wrong vocabulary and not using any grammar). I don't agree with Iversen here. Free literary translations are just as horrible as Google translator; some may even be better than it. But perhaps translations to some other languages might work better.
1 person has voted this message useful





Iversen
Super Polyglot
Moderator
Denmark
berejst.dk
Joined 6704 days ago

9078 posts - 16473 votes 
Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan
Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian

 
 Message 45 of 76
27 July 2011 at 1:33am | IP Logged 
I understand the scepticism of Cavesa and others concerning the use of machine translations, but you have to consider their role. I would NEVER trust a machine translation into my target languages, so the 'foreign' version in a bilingual text would always be the original. And the role of the translation is not to tell me what the foreign text means, but just to help me along when I'm in doubt. Of course one of those points where I'm in doubt could coincide with one of the points where the machine had gone berserk and produced sheer nonsense, but even then I might pick up the one piece of information that permits me to understand the offending passage.

I have used bilingual texts based on machine translations for such a long time now that I more or less know where translations from a certain language typically go wrong, and if I find that the translation is suspicious I still have my dictionaries ready, so there isn't really a danger in this technique. And what would the alternative be? It could be cheating, sneaky and insidious literary translations of the kind where the translation can express the same general meaning, but in a way that doesn't even hint at the structure in the original. If you find something more trustworthy then by all means use it - I have never said that machine translations are the only useful ones.


Edited by Iversen on 27 July 2011 at 3:38pm

1 person has voted this message useful



The Stephen
Diglot
Groupie
United States
Joined 5053 days ago

65 posts - 77 votes 
Speaks: English*, German
Studies: Czech, Hungarian

 
 Message 46 of 76
27 July 2011 at 10:25pm | IP Logged 
For those of you interested in creating your own parallel texts, member "doviende" has a tutorial on his blog here on how to do that (the catch is that you need the all-powerful 'emacs' text editor).

If I may update the tutorial a bit, doviende recommends using 'hunalign' to align the text, but the creator of hunalign has a newer, fancier version called 'LF Align' that has functionality for .txt files that came from .pdf files. Although I still recommend, if you go the pdf conversion route, cleaning up the .txt as much as possible for the best results.

Using this tutorial (had to add a little "elbow grease" of my own though) I was able to create a fairly sexy German-English text of the first Harry Potter in about an hour (subsequent ones will take a lot less time now that I know the process better). LF Align did amazingly well, so that I barely have anything to fix during my first proof-read of it. However, my Hungarian-English text of the same book isn't going nearly as well, but I still have some experiments up my sleeve for that.

So in summary I highly recommend these resources for those learning western European languages, but even with something like Hungarian it is a lot better than starting from scratch (sorry, but I have no idea how this all works for languages using non-latin script).

Good luck!

The Stephen
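
Once an aligner has paired up the sentences, the result still has to be laid out as a readable parallel text. Here is a rough sketch of that last step. It assumes the aligner's output is plain text with one segment per line in the form `source<TAB>target<TAB>score`, which is my understanding of what hunalign writes in its text mode; check your own output before relying on it. The function names `parse_aligned` and `to_two_columns` are invented for this example and are not part of hunalign or LF Align.

```python
import textwrap
from itertools import zip_longest


def parse_aligned(tsv_text):
    """Parse lines of 'source<TAB>target[<TAB>score]' into (source, target) pairs."""
    pairs = []
    for line in tsv_text.splitlines():
        if not line.strip():
            continue  # ignore empty lines
        fields = line.split("\t")
        source = fields[0].strip()
        target = fields[1].strip() if len(fields) > 1 else ""
        pairs.append((source, target))
    return pairs


def to_two_columns(pairs, width=30):
    """Lay out each aligned pair side by side, wrapping both columns to `width`."""
    rows = []
    for source, target in pairs:
        left = textwrap.wrap(source, width) or [""]
        right = textwrap.wrap(target, width) or [""]
        for l, r in zip_longest(left, right, fillvalue=""):
            rows.append(f"{l:<{width}} | {r}".rstrip())
        rows.append("")  # blank line between aligned segments
    return "\n".join(rows)


# Made-up sample data in the assumed tab-separated format.
aligned = "Guten Morgen.\tGood morning.\t0.9\nWie geht es dir?\tHow are you?\t0.8\n"
pairs = parse_aligned(aligned)
print(to_two_columns(pairs, width=20))
```

Keeping each aligned segment as its own block makes misalignments easy to spot during the proof-read, since a bad pairing shows up as two columns of very different length.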
1 person has voted this message useful



Deji
Diglot
Senior Member
United States
Joined 5441 days ago

116 posts - 182 votes 
Speaks: English*, French
Studies: Hindi, Bengali

 
 Message 47 of 76
03 August 2011 at 1:22am | IP Logged 
I spent a lot of time trying to find, download and make parallel texts. After a year or so working with them
and countless hours struggling with text blocks from scanned original texts and scanned translations, I
discovered that I preferred the texts to be on separate pages.

Reading Bengali is a real effort, mostly because the vowels are found in three positions: before the consonant
(but pronounced after it), around the consonant on both sides, and after the consonant. If I see the English, it
just pulls my eye over to it. It is then a big effort to readjust to reading the Bengali. So I prefer to stay in the
Bengali and consult the English as needed.

I now have a radical new solution to bilingual reading. I call it BUY-lingual. (Sorry, couldn't resist)

I buy the Bengali book and I buy the translation. Three clicks and five days later comes Christmastime in the
mail. No legal problems and the Bengali books are pretty inexpensive. I order from the same site in New
Jersey: Parabaas. And I spend my time reading instead of wrestling with my computer.

Google translate can correctly translate "I don't speak Bengali". Anything more complex is sentence after
sentence of total gibberish.
2 persons have voted this message useful



maydayayday
Pentaglot
Senior Member
United Kingdom
Joined 5220 days ago

564 posts - 839 votes 
Speaks: English*, German, Italian, SpanishB2, FrenchB2
Studies: Arabic (Egyptian), Russian, Swedish, Turkish, Polish, Persian, Vietnamese, Urdu

 
 Message 48 of 76
03 August 2011 at 9:40pm | IP Logged 
a3 wrote:
What about using wikipedia for parallel texts? It has many articles translated to many languages.


For the last year or so I have used the Spanish Wikipedia articles quite a lot, but I started out by also creating an English Google Translate version to give a 'sense check' of what I believe I am reading! I've set up a macro to look up highlighted words in specialist dictionaries.

The advantages to me of doing this are that I get what is usually native Spanish (though some are rubbish.....) and I am exposed to specialist vocabulary in that subject area, especially as I don't often read fiction.

I did read Harry Potter II in Spanish a couple of weeks ago though!







1 person has voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.