
Tashkeel: A New Arabic Diacritics Tool!

Hashimi
Senior Member
Oman
Joined 6261 days ago

362 posts - 529 votes 
Speaks: Arabic (Written)*
Studies: English, Japanese

 
 Message 1 of 8
23 March 2010 at 1:07pm

A new tool from GoogleLabs to add missing diacritics to Arabic text:

http://tashkeel.googlelabs.com/




5 persons have voted this message useful



Teango
Triglot
Winner TAC 2010 & 2012
Senior Member
United States
teango.wordpress.com
Joined 5558 days ago

2210 posts - 3734 votes 
Speaks: English*, German, Russian
Studies: Hawaiian, French, Toki Pona

 
 Message 2 of 8
23 March 2010 at 2:17pm
I don't suppose you know what kind of accuracy Google achieves overall?

Back in 2006, for Cambridge University, I made a system that looks very much the same: it combines Support Vector Machines with statistical machine learning tools and Buckwalter's Morphological Analyser. I published a thesis on it, making it public domain, and then contacted Google in great excitement, but they just weren't interested at the time.

My aim was to provide a free tool that could assist all future linguists and learners of Arabic and help semi-automate the construction of annotated Arabic corpora and databases. It's since been employed in several leading Arabic research projects, both in the UK and in Egypt. And if I remember correctly, my final system achieved over 93% accuracy for words without case endings across almost a million words, which I was quite delighted with back then.
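For anyone wondering what a figure like "accuracy for words without case endings" could mean in practice, here is a minimal sketch of one way to score it. It is not the evaluation code from the thesis or from Google's tool. The function names, the set of code points treated as diacritics, and the rule "ignore every mark after the last base letter" are all assumptions made for illustration.

```python
# Hypothetical helpers for illustration only; not taken from any real tool.
# Assumption: the marks U+064B..U+0652 plus the superscript alef U+0670
# are the diacritics (tashkeel) we care about.
DIACRITICS = {chr(c) for c in range(0x064B, 0x0653)} | {"\u0670"}


def strip_diacritics(text):
    """Return the text with every diacritic removed (the usual unvoweled form)."""
    return "".join(ch for ch in text if ch not in DIACRITICS)


def drop_final_marks(word):
    """Remove the marks that follow the last base letter of a word.

    Assumption: this is a rough stand-in for "ignoring the case ending".
    """
    end = len(word)
    while end > 0 and word[end - 1] in DIACRITICS:
        end -= 1
    return word[:end]


def word_accuracy(predicted, reference, ignore_endings=False):
    """Fraction of whitespace-separated words whose diacritics match exactly."""
    pred_words = predicted.split()
    ref_words = reference.split()
    if not ref_words or len(pred_words) != len(ref_words):
        raise ValueError("texts must contain the same, non-zero number of words")
    correct = 0
    for pred, ref in zip(pred_words, ref_words):
        if ignore_endings:
            pred, ref = drop_final_marks(pred), drop_final_marks(ref)
        if pred == ref:
            correct += 1
    return correct / len(ref_words)


# Two-word example: the prediction gets everything right except the final
# mark of the last word (a fatha instead of the reference's damma).
reference = "كَتَبَ الوَلَدُ"
predicted = reference[:-1] + "\u064E"

print(word_accuracy(predicted, reference))                       # 0.5
print(word_accuracy(predicted, reference, ignore_endings=True))  # 1.0
```

The gap between the two scores shows why results are often reported both ways: a wrong case ending counts as a full word error under the strict measure even when every other vowel in the word is right.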

Edited by Teango on 23 March 2010 at 2:22pm

1 person has voted this message useful



translator2
Senior Member
United States
Joined 6921 days ago

848 posts - 1862 votes 
Speaks: English*

 
 Message 3 of 8
23 March 2010 at 2:19pm
I tried it with the Arabic CNN site (http://arabic.cnn.com/) and it works great (inserts the vowel marks, etc.), but can a native speaker give an opinion regarding the accuracy of the vowels?


1 person has voted this message useful



Al-Irelandi
Senior Member
United Kingdom
Joined 5537 days ago

111 posts - 177 votes 
Speaks: English*

 
 Message 4 of 8
23 March 2010 at 6:38pm
Quite useful. It obviously doesn't attempt to make i3raab of the endings and give them their tashkeelaat. That would need another algorithm.

Edited by al-Irlandee on 23 March 2010 at 6:39pm

1 person has voted this message useful



ANK47
Triglot
Senior Member
United States
thearabicstudent.blo
Joined 7099 days ago

188 posts - 259 votes 
Speaks: English*, Arabic (Written), Arabic (classical)

 
 Message 5 of 8
24 March 2010 at 8:37am
Wow, that site will be very useful for people just starting out in Arabic, before they know the basic patterns of how things are voweled. After several months of exposure you can be reasonably sure of how most words will be pronounced, but at the beginning, if a word isn't voweled for you, you'll have no idea how to say it. I remember vocabulary lists when I was learning Arabic that were just the word written with no vowels. Lists like that are practically useless to beginners unless there's audio to go along with them. Anyway, I looked at the site and it seems to work quite well. If you need 100% accuracy for something professional I wouldn't use it, but it's a great helper for learning.
1 person has voted this message useful



Woodpecker
Triglot
Senior Member
United States
Joined 5813 days ago

351 posts - 590 votes 
Speaks: English*, Arabic (Written), Arabic (Egyptian)
Studies: Arabic (classical)

 
 Message 6 of 8
24 March 2010 at 10:19am
That's a pretty amazing resource, thank you.
1 person has voted this message useful



ehmoda
Newbie
United States
Joined 5228 days ago

2 posts - 2 votes
Speaks: English

 
 Message 7 of 8
04 August 2010 at 8:27pm
Teango wrote:
I don't suppose you know what kind of accuracy Google achieves overall?

Back in 2006, for Cambridge University, I made a system that looks very much the same: it combines Support Vector Machines with statistical machine learning tools and Buckwalter's Morphological Analyser. I published a thesis on it, making it public domain, and then contacted Google in great excitement, but they just weren't interested at the time.

My aim was to provide a free tool that could assist all future linguists and learners of Arabic and help semi-automate the construction of annotated Arabic corpora and databases. It's since been employed in several leading Arabic research projects, both in the UK and in Egypt. And if I remember correctly, my final system achieved over 93% accuracy for words without case endings across almost a million words, which I was quite delighted with back then.

1 person has voted this message useful



ehmoda
Newbie
United States
Joined 5228 days ago

2 posts - 2 votes
Speaks: English

 
 Message 8 of 8
04 August 2010 at 8:28pm
Hashimi wrote:

A new tool from GoogleLabs to add missing diacritics to Arabic text:

http://tashkeel.googlelabs.com/
Teango, could you give me your email? I am very interested in the software you developed. I am also a researcher and I need that software. Please, whenever you see my post, just send me an email at oehmoda@gmail.com





1 person has voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.