
Digitizing FSI

heartburn
Senior Member
United States
Joined 7207 days ago

355 posts - 350 votes 
Speaks: English*
Studies: Spanish

 
 Message 9 of 237
08 March 2005 at 10:44pm | IP Logged 
Actually the audio is a gimme! Most of the articles are from two magazines. Think Spanish! is for beginners and Puerta del Sol is more advanced. I posted more details here:

http://how-to-learn-any-language.com/forum/forum_posts.asp?TID=92&PN=0&TPN=2

They arrive faster than I can get through them. I have quite a backlog.

I'm sorry to say, the recordings are quite clear.


1 person has voted this message useful



mahyar
Newbie
Canada
Joined 7201 days ago

34 posts - 31 votes
Speaks: English*
Studies: French

 
 Message 10 of 237
09 March 2005 at 2:05am | IP Logged 
You're right about the audio and such.

I know that several photocopying places have high-speed sheet-feed scanners (photocopiers are the same thing, except they print too) at something like 0.03 euro a page (so 400 pages would cost you 12 euros, not bad at all). I can't find the service in my semi-small town, though, but a place like the USA or a big town in Europe must definitely have them.
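
The arithmetic above can be checked with a quick sketch. The function name is made up for illustration, and the flat per-page price is an assumption; real shops may charge setup fees or volume rates.

```python
from decimal import Decimal


def scanning_cost(pages: int, price_per_page: Decimal) -> Decimal:
    """Total cost of sheet-fed scanning at a flat per-page price."""
    return pages * price_per_page


# Figures from the post: 400 pages at 0.03 euro per page.
print(scanning_cost(400, Decimal("0.03")))  # 12.00
```

Decimal is used instead of float so that the euro amounts stay exact.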

I know that OCR software is not perfect. I guess I should have said the initial OCR process, where the computer does the first big chunk of digitizing.

I've had good results with ABBYY FineReader for OCRing and proofreading. (I've actually proofread about 100 pages of a badly scanned book with it.) It has a file format where the OCRed results and the original scans are both easily accessible, and proofreading is really easy as a result. We could then split up the job amongst ourselves using that dual format. It doesn't have to be a one-person job.

If the scanned font is legible, I think just knowing how to type the taught language correctly into a computer would be sufficient to proofread the scanned results, since all you're doing is a simple one-to-one copy, not a translation.

From my experience with BitTorrent, I've found that you should really split the torrent files into big chunks like "All of FSI Spanish I" or "All of FSI Spanish I-IV plus Programmatic". If you don't put them in big chunks and instead divide them up into really small bits, as in the example you gave, a dispersion effect tends to happen where the resources (people with complete files) are too spread out for anyone to get anything near complete. Plus, people who are used to BitTorrent-type things have been doing 3GB downloads for a while now anyway.

You can still divide the actual files in the fine-grained way you specified, but make the .torrent downloadable units big chunks. BitTorrent can make an entire folder with thousands of items in it available as one downloadable unit. Think of it almost as a .zip file.
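
Here is a rough sketch of the dispersion effect, assuming (unrealistically) that the people holding complete copies spread evenly across whatever torrents are offered. The function name and all the numbers are hypothetical, chosen only for illustration.

```python
def seeders_per_torrent(total_seeders: int, torrent_count: int) -> float:
    """Average number of complete copies available per torrent."""
    return total_seeders / torrent_count


# Hypothetical numbers: 40 people hold complete copies of a course.
print(seeders_per_torrent(40, 1))   # one big bundle -> 40.0
print(seeders_per_torrent(40, 80))  # 80 per-lesson torrents -> 0.5
```

With one bundle every downloader finds plenty of sources; with 80 tiny torrents, many of them would have nobody sharing at all.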

We really need to check with the Project Gutenberg people, before we start, whether this would be OK copyright-wise. I would do it myself, but I don't know the specifics of the FSI program as well as you guys do. If the copyright is cleared, then we can enlist the help of all of the Project Gutenberg volunteers (a large number of people) to help with digitizing and proofreading these FSI courses.

Edited by mahyar on 09 March 2005 at 2:14am

1 person has voted this message useful



administrator
Hexaglot
Forum Admin
Switzerland
FXcuisine.com
Joined 7376 days ago

3094 posts - 2987 votes 
12 sounds
Speaks: French*, EnglishC2, German, Italian, Spanish, Russian
Personal Language Map

 
 Message 11 of 237
09 March 2005 at 2:37am | IP Logged 
For large files we can probably find a compromise. I'm not sure I'd like people to start downloading huge files that they are not going to use. Although I'd favor a one-file-per-lesson format, I take your point about efficiency of BitTorrent. Perhaps we could do 5-lessons-per-file or slightly bigger, so that people who just wish to try their hand at the language could do it.
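
As a minimal sketch of the 5-lessons-per-file idea: the course length of 15 lessons and the file-name pattern below are only assumptions for illustration, not an agreed layout.

```python
def group_lessons(lesson_count: int, per_file: int = 5) -> list[range]:
    """Split lessons 1..lesson_count into consecutive groups of per_file."""
    return [
        range(start, min(start + per_file, lesson_count + 1))
        for start in range(1, lesson_count + 1, per_file)
    ]


# Hypothetical course of 15 lessons, named with a made-up pattern.
for group in group_lessons(15):
    print(f"lessons_{group[0]:02d}-{group[-1]:02d}.mp3")
```

For 15 lessons this prints three names, covering lessons 01-05, 06-10 and 11-15. A course whose length is not a multiple of five simply gets a shorter last group.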

I think the big work is on the audio side; the OCR is probably easier to manage. If one of us could do the scanning, another could do the OCR, and we could split the correction. That's a good idea.

Once we have started, it will probably be possible to translate the programs into other languages. For instance, FSI German could have the English parts translated into French so that a Frenchman could use it to learn German. Of course some parts of the program have probably been designed with the specific problems of the English speaker in mind, but I don't think this would detract from the interest of the course. And we would not need perfect target-language speakers to do so, only fellow enthusiasts who have a reasonable voice in the 'From' language.

I think it would be terrific to be able to give free access to these great courses not only to English speakers, but to others as well.

Heartburn, which language would you be interested in digitizing?

Pentatonic, you seem to be the most knowledgeable about sound. Would you be able to post some guidelines as to how other forum members might go about digitizing FSI tapes, such as what software and minimum hardware would be recommended? I think we need to achieve a minimum standard for those files, and with a technical 'whitepaper' we could probably all work on it and produce enough files.

In case somebody has an interest in Mandarin Chinese, I have access to about 100 tapes of Defense Language Institute drills and dialogs. It's great material and could be made much more usable if digitized. I've never seen it anywhere in the trade and it came straight from NTIS.




Edited by administrator on 09 March 2005 at 6:11am

1 person has voted this message useful



Lunatic
Newbie
Joined 7200 days ago

1 posts - 1 votes

 
 Message 12 of 237
09 March 2005 at 6:17am | IP Logged 
Admin, you can specify which files you wish to download using BitTorrent.
So, for example, if I wanted to download only the first lesson, I could select just that file and download only what I want.
As a general rule, BitTorrent is more effective with large files than with smaller ones.
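
To illustrate, here is a toy model of picking files out of a multi-file torrent. It is not a BitTorrent client; the file names and sizes are invented, and real clients may also fetch small parts of neighbouring files because pieces can span file boundaries.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TorrentFile:
    path: str
    size_mb: int


def select_files(files: list[TorrentFile], prefix: str) -> list[TorrentFile]:
    """Keep only the files whose path starts with the given prefix."""
    return [f for f in files if f.path.startswith(prefix)]


# Hypothetical bundle: one torrent holding three lessons.
bundle = [
    TorrentFile("spanish/lesson_01.mp3", 40),
    TorrentFile("spanish/lesson_02.mp3", 45),
    TorrentFile("spanish/lesson_03.mp3", 50),
]

chosen = select_files(bundle, "spanish/lesson_01")
print(sum(f.size_mb for f in chosen), "of", sum(f.size_mb for f in bundle), "MB")
```

With these made-up sizes, selecting only the first lesson means fetching 40 of the 135 MB in the bundle.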
1 person has voted this message useful



administrator
Hexaglot
Forum Admin
Switzerland
FXcuisine.com
Joined 7376 days ago

3094 posts - 2987 votes 
12 sounds
Speaks: French*, EnglishC2, German, Italian, Spanish, Russian
Personal Language Map

 
 Message 13 of 237
09 March 2005 at 6:31am | IP Logged 
Lunatic, welcome to the forum!

I am not too well informed about peer-to-peer networks, but you and other forum members seem to be. So let me list the problems I face so you can tell us which options are best:

-Distributing MP3 files that run for many hours
-Letting users access meaningful parts of the files directly (like a 'Scene Selection' on a DVD)
-Not letting quick-buck operators take advantage of our collaborative effort to take the files and sell them commercially
-Allowing people to download either one or two lessons OR the whole 15 lessons (for instance)
-Keeping the bandwidth needed on the server at an acceptable level. This site is costing me already, and although I am willing to let people benefit from free knowledge, I can't actually pay the costs of their downloading myself. Each gigabyte of traffic costs several dollars, so we need to find a way to keep it down.
-Not compromising the server security with spammy software that opens connections to anybody and lets them do what they want on the box
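
The bandwidth concern can be put into rough numbers with a small sketch. Everything here is an assumption for illustration: the price of 3 dollars per gigabyte stands in for "several dollars", and the download count, file size and the share of traffic carried by peers are invented.

```python
def server_cost(
    downloads: int,
    size_gb: float,
    dollars_per_gb: float,
    share_from_server: float = 1.0,
) -> float:
    """Traffic cost to the server when it supplies a given share of the bytes."""
    return downloads * size_gb * dollars_per_gb * share_from_server


# Hypothetical: 100 downloads of a 0.5 GB course at 3 dollars per GB.
direct = server_cost(100, 0.5, 3.0)
# Same demand, but assuming peers supply 90% of the bytes.
shared = server_cost(100, 0.5, 3.0, share_from_server=0.1)

print(f"direct download: ${direct:.2f}")
print(f"with peers:      ${shared:.2f}")
```

How much of the traffic peers would really carry depends on how many people share each file, so the 90% figure is only a placeholder.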

BitTorrent looks like it's designed to work in this context. However, if we distributed files for relatively rare languages such as Modern Greek, would there ever be two people downloading this particular file at the same time? If not, would BitTorrent bring any benefits?



1 person has voted this message useful



heartburn
Senior Member
United States
Joined 7207 days ago

355 posts - 350 votes 
Speaks: English*
Studies: Spanish

 
 Message 14 of 237
09 March 2005 at 9:50am | IP Logged 
Of course, I'd prefer to do Spanish. But I'd be ok doing any language that uses the Roman alphabet if I don't have to proofread.

Unfortunately, I only own the Barron's and Platiquemos versions of the Spanish program.


administrator wrote:

-Not letting quick-buck operators take advantage of our collaborative effort to take the files and sell them commercially

This one might be tough. I'm not a lawyer, but if the material is already public domain we might have no control over this. It is the only reason why we'd be able to do this in the first place.


Edited by heartburn on 09 March 2005 at 9:57am

1 person has voted this message useful



administrator
Hexaglot
Forum Admin
Switzerland
FXcuisine.com
Joined 7376 days ago

3094 posts - 2987 votes 
12 sounds
Speaks: French*, EnglishC2, German, Italian, Spanish, Russian
Personal Language Map

 
 Message 15 of 237
09 March 2005 at 10:18am | IP Logged 
There is a derived copyright for compilation work. For instance we could not take remastered tapes from commercial releases of FSI and rip them and offer them for free.

If we brand each file in a clear way saying where it came from and that it cannot be sold, I think it should work.
1 person has voted this message useful



heartburn
Senior Member
United States
Joined 7207 days ago

355 posts - 350 votes 
Speaks: English*
Studies: Spanish

 
 Message 16 of 237
09 March 2005 at 8:46pm | IP Logged 
I've been thinking about this copyright thing a little. The more I think about it, the more I think, "Why not let them use the files?" Here's my reasoning...

The goal is to make these programs freely available to everyone, right? When that happens, what will become of the companies who already repackage these programs? Some, like Platiquemos, might be ok because of the value that they add. Others, like AudioForum, will find themselves charging money for an inferior product.

In order to resell something that can otherwise be downloaded for free, the resellers will need to add value. That generally means that they will be making the programs better. Some people are willing to pay for something extra. Ultimately, the quality of commercial language programs will have to increase.

I'm envisioning something like a Free Software license. Maybe it could be based on the Apache license, or the BSD license, or something like that.

If we are doing this out of the goodness of our hearts anyway, why not be really good?

Edited by heartburn on 09 March 2005 at 8:49pm



1 person has voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.