nakrian keegiat Diglot Groupie Thailand Joined 4909 days ago 70 posts - 172 votes Speaks: English*, Thai Studies: Russian
| Message 1 of 28 15 August 2013 at 3:35pm | IP Logged |
Other Thai learners and I are trying to get Thai added to LingQ. We need 1000 votes for it to be added as a beta language. Even if you aren't interested in learning Thai, please vote to help us. Thank you.
https://www.facebook.com/questions/10150249705278786/
5 persons have voted this message useful
|
Bakunin Diglot Senior Member Switzerland outerkhmer.blogspot. Joined 5132 days ago 531 posts - 1126 votes Speaks: German*, Thai Studies: Khmer
| Message 2 of 28 15 August 2013 at 7:39pm | IP Logged |
Good initiative, but I guess Steve's programmers will face difficulties in adding Thai. Have a look at my blog post at womenlearnthai.com from November last year. To make Thai work in a system like LingQ you need a parser, and building a solid parser that doesn't require occasional intervention on the part of the user is tough (or rather impossible without semantic parsing). Last year, when I wrote my own parser (which requires me to intervene surgically every so often), I checked the scientific (computational linguistics) literature and couldn't find anything better. I don't want to spoil the party, and I really hope for your sake that Steve's guys will find a solution that works for a commercial system like LingQ, but I thought I should warn you that there may be difficulties of an unfamiliar nature ahead.
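To illustrate why a dictionary alone is not enough, here is a minimal Python sketch. It is not the parser described in this post; the function name `segmentations` and the four-word toy dictionary are assumptions made up for the example. It lists every way to split an unspaced string into dictionary words, using the well-known textbook case ตากลม, which reads either as ตา + กลม ("round eyes") or as ตาก + ลม ("to expose to the wind").

```python
from functools import lru_cache

# Toy dictionary, chosen for illustration only (not taken from any real parser).
WORDS = {"ตา", "กลม", "ตาก", "ลม"}

def segmentations(text, words=frozenset(WORDS)):
    """Return every way to split `text` into words from `words`."""
    @lru_cache(maxsize=None)
    def split_from(start):
        if start == len(text):
            return [[]]  # exactly one way to split the empty remainder
        results = []
        for end in range(start + 1, len(text) + 1):
            piece = text[start:end]
            if piece in words:
                for rest in split_from(end):
                    results.append([piece] + rest)
        return results
    return split_from(0)

# Print each candidate reading on its own line.
for option in segmentations("ตากลม"):
    print(" | ".join(option))
```

Both splits are valid as far as the dictionary is concerned; choosing between them needs the surrounding context, which is where the user intervention comes in.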
2 persons have voted this message useful
|
nakrian keegiat Diglot Groupie Thailand Joined 4909 days ago 70 posts - 172 votes Speaks: English*, Thai Studies: Russian
| Message 3 of 28 15 August 2013 at 8:40pm | IP Logged |
I know it will be tough and I have no knowledge of how to do it. I'm just trying to get support for it to get it into beta and then see how far it can progress. A friend of mine had a programmer try to build a Thai-only version and this is what he came up with:
http://thai-notes.com/tools/predictionary.shtml
It isn't a finished project and the developer said that he can't spend any more time working on it. My friend has said he will make a large donation to LingQ towards the programming costs if it gets that far.
Brett from Learn Thai From a White Guy is a big fan of LingQ and he agreed to let all of his videos/transcripts be used if we can get it started. Would you donate some (or all) of your material from thairecordings for it? I plan to use a few Skype tutors (probably including the ones you used) to create the necessary beginner lessons. Your material would be great for intermediate-level content.
*edit*
If the parsing issue is too much to overcome then another possibility would be to have the lessons created with spaces between the words. This isn't ideal, but it's an acceptable sacrifice to me if it means getting it up and running. Reading Thai is so difficult in the beginning with a small vocabulary when you don't even know where the word breaks are. Maybe this would be better for beginners anyway. More advanced learners can always easily find native text to practice reading.
Edited by nakrian keegiat on 15 August 2013 at 8:53pm
1 person has voted this message useful
|
Bakunin Diglot Senior Member Switzerland outerkhmer.blogspot. Joined 5132 days ago 531 posts - 1126 votes Speaks: German*, Thai Studies: Khmer
| Message 4 of 28 15 August 2013 at 9:05pm | IP Logged |
On thai-notes.com I couldn't find anything addressing the parsing issue, but it looks like a cool project.
Don't get me wrong, I'm all in favor of having Thai on LingQ; I just can't see how to do it without developing a robust parser. The technical problems with Thai are of a different nature than with Mandarin and Japanese. Maybe there is a crowd-sourced solution; that would be cool. Hopefully you can convince Steve to throw resources at developing a solution!
I'm lucky in that I have my own private LingQ (my parser plus FLTR) and I really learned a lot from intensive reading sessions. I really hope it works out for you guys as well.
The material on thairecordings.com - both audio and transcripts - is published under this license. As long as the conditions of non-commercial use and attribution are fulfilled it can be used freely. I'm not too familiar with Steve's current business model but I'm pretty sure he caters to that kind of copyright. In other words: yes, my material can be used. It's about 10 hours of audio plus transcripts.
Good luck, and please keep us up to date here on how it develops (also on the technical side, if possible)!
1 person has voted this message useful
|
nakrian keegiat Diglot Groupie Thailand Joined 4909 days ago 70 posts - 172 votes Speaks: English*, Thai Studies: Russian
| Message 5 of 28 15 August 2013 at 9:15pm | IP Logged |
What do you think of the idea of creating it with spaces between the words?
I read about your parser but it's only available for Mac, right? Could it be used/adapted for this project?
LingQ is free with limited options. There are various paid membership levels that give benefits like unlimited lingQs, full app support, and discounted rates on tutoring sessions. I don't know if it would qualify as non-commercial or not. I guess that's up to you to decide. :)
1 person has voted this message useful
|
Bakunin Diglot Senior Member Switzerland outerkhmer.blogspot. Joined 5132 days ago 531 posts - 1126 votes Speaks: German*, Thai Studies: Khmer
| Message 6 of 28 15 August 2013 at 9:50pm | IP Logged |
nakrian keegiat wrote:
What do you think of the idea of creating it with spaces between the words? |
You'll run into all sorts of problems here. First of all, it's not really clear what constitutes a word. Different people will see this differently. Secondly, since it requires pre-processing by a knowledgeable party, content on LingQ will be limited. Thirdly, people can't easily upload their own texts... which takes away a big plus of a system like LingQ. Fourthly, Thai with spaces looks ugly.
To illustrate my point, try to separate the following into words:
กำลังวางแผนจะแต่งงาน ครับ อยากทราบว่า วิธีบริหารเงินเมื่อเริ่มต้นชีวิตคู่ของพี่ๆเป็นอย่างไรบ้างครับ (taken at random from page 1 of the Pantip forum).
- is วางแผน one word or two?
- is เริ่มต้น one word or two?
- is ชีวิตคู่ one word or two?
- what to do about อย่างไรบ้าง?
On the other hand, Thai for first-grade students is often spaced out. If you're in Thailand, have a look at a grade 1 primary school reading book to see what I mean. So it clearly works and might even help students learn to read. The main downside, as I see it, is point 3 of my short list above...
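The word-boundary questions above can be made concrete with a small Python sketch. It is a plain greedy longest-match segmenter, not any parser mentioned in this thread, and the function name `longest_match` and both toy dictionaries (`FINE`, `COARSE`) are hypothetical. The only difference between the two dictionaries is whether the compounds are listed as entries of their own.

```python
def longest_match(text, words):
    """Greedy left-to-right segmentation: always take the longest dictionary word."""
    tokens, i = [], 0
    max_len = max(map(len, words))
    while i < len(text):
        for size in range(min(max_len, len(text) - i), 0, -1):
            if text[i:i + size] in words:
                tokens.append(text[i:i + size])
                i += size
                break
        else:
            # Unknown character: emit it alone so the loop always advances.
            tokens.append(text[i])
            i += 1
    return tokens

# Hypothetical dictionaries: FINE lists only the parts, COARSE also the compounds.
FINE = {"วาง", "แผน", "เริ่ม", "ต้น", "ชีวิต", "คู่"}
COARSE = FINE | {"วางแผน", "เริ่มต้น", "ชีวิตคู่"}

for phrase in ["วางแผน", "เริ่มต้น", "ชีวิตคู่"]:
    print(longest_match(phrase, FINE), longest_match(phrase, COARSE))
```

With `FINE` each phrase comes out as two tokens, with `COARSE` as one. The algorithm is identical in both runs, so the answer to "one word or two?" is decided entirely by whoever compiled the dictionary.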
nakrian keegiat wrote:
I read about your parser but it's only available for Mac, right? Could it be used/adapted for this project?
LingQ is free with limited options. There are various paid membership levels that give benefits like unlimited lingQs, full app support, and discounted rates on tutoring sessions. I don't know if it would qualify as non-commercial or not. I guess that's up to you to decide. :)
|
My parser is not available at all. I wrote it just for myself and lack the programming skills to share it with other people (e.g., to ensure that it works outside of my specific set-up). It's a shame, but the only thing I felt confident sharing was the algorithm. From the comments in the discussion following the blog post you can see that other people have successfully built their own parsers, but nobody has offered to share theirs :)
Regarding LingQ, don't worry. I've published the material on thairecordings.com for free so that everybody can use it. I don't want to make using it complicated. As long as Steve provides free access to the material and attributes accordingly, I'm fine.
1 person has voted this message useful
|
nakrian keegiat Diglot Groupie Thailand Joined 4909 days ago 70 posts - 172 votes Speaks: English*, Thai Studies: Russian
| Message 7 of 28 15 August 2013 at 10:06pm | IP Logged |
I am in Thailand and it's 3am so I won't go through your whole sentence now, but I would make each of the 3 examples 1 word. There are bound to be gray areas like this but I just wondered if it would be an alternate suggestion if the programming proves too difficult. After students become more accustomed to the language they'll (hopefully) see that they are sometimes one word and sometimes two depending on the context.
I agree that the spaces are ugly but most other languages have them and nobody complains! Cat from WLT used to type her lessons that way when she was first learning Thai. I don't know if she still does.
1 person has voted this message useful
|
Wulfgar Senior Member United States Joined 4673 days ago 404 posts - 791 votes Speaks: English*
| Message 8 of 28 16 August 2013 at 6:49pm | IP Logged |
Thanks for your initiative! I voted. Imo, the Thai Notes parsing is adequate. I've been using it for a while, and I no longer feel like I have to use a parser and a separate dictionary, like I was doing with thai-to-english. The parsing in Thai Notes is about as good as LingQ's Japanese parsing, and there are no spaces between words. There are only 2 minor changes I'd personally like to see in what Thai Notes does.
1) turn off the underlining
2) add transliteration - this might be hard to do; I call it minor because I'm only occasionally unsure about pronunciation of new words.
Edited by Wulfgar on 16 August 2013 at 6:52pm
1 person has voted this message useful
|