montmorency Diglot Senior Member United Kingdom Joined 4825 days ago 2371 posts - 3676 votes Speaks: English*, German Studies: Danish, Welsh
| Message 65 of 131 22 June 2013 at 2:24am | IP Logged |
Hi Steve,
I've just been trying this for a German book I've imported. My main interest in
Readlang is for books, rather than web articles, and I was attracted to Readlang
because it specifically caters for book-length texts, which I think LQ and LWT do not.
I should say that I haven't previously used any automated SRS system, although I have
been using the "learn" facility somewhat. I'm not sure if I will continue to make great
use of "learn", but it will still be very good to have a list of words that I needed to
translate.
However, as no one else has mentioned it, I'm not sure if I've just missed something
obvious, since perhaps everyone else has used SRS systems before, but the flash cards
for nouns are not coming up with any articles, or other indication of gender, and it's
always being drummed into language learners that they should learn the gender along
with the noun, so I wondered what your approach to this was.
For some of the noun cards, I edited in the definite article (der, die, or das in the
case of German), which I was able to get from Wordreference. I guess it's not a bad
thing having to check it for oneself and edit it in, since that is a conscious effort
and might make the learning more memorable.
The other slight issue I had was with losing any reference to chapter numbers with the
imported book. In my case, I converted from an EPUB book, and in its text form, I could
still see the chapter numbers (even if they weren't all that obvious).
I'm guessing that the problem here is that these are not really standardised in any
meaningful way (that seems to be the case even in the EPUB form, which surprises me a
bit).
If one is reading a longish book, one really needs to be able to keep track of it by
chapter, especially if one is doing something like L-R, where you want to keep the
audio and the text in sync.
I suppose one approach would be for users to edit in the chapters themselves (since you
have provided an edit facility, and users know their way around their own books better
than you or your software do - that is, they still have the original book to compare
with, and get the chapter numbers from ... page numbers as well, I guess).
Anyway, I'll add my thanks to those of others for this great project, and the work you
have put into it. It's looking very good so far.
Edited by montmorency on 22 June 2013 at 11:37am
1 person has voted this message useful
| montmorency Diglot Senior Member United Kingdom Joined 4825 days ago 2371 posts - 3676 votes Speaks: English*, German Studies: Danish, Welsh
| Message 66 of 131 22 June 2013 at 2:43am | IP Logged |
hmm....I now seem to be seeing the chapter numbers, at least in later pages.
Not too obvious, but that's how it came out of the Calibre conversion, evidently.
However, more importantly, there seems to be a bug with the "previous page" control ...
both the one on the left of the page in the middle, and the one at the bottom.
It seems to be advancing the page, not going back.
I've only tried one book so far though.
EDIT: Assuming that that is a bug, as it seems to be, in any case, I think you will
have to add some sort of facility to quickly get to a specific page. You don't want to
always have to go forward or backward only one page at a time.
Also, some form of scrolling up and down might be nice.
Edited by montmorency on 22 June 2013 at 2:46am
1 person has voted this message useful
| SteveRidout Diglot Groupie Spain readlang.com Joined 4279 days ago 65 posts - 121 votes Speaks: English*, Spanish
| Message 67 of 131 22 June 2013 at 11:28am | IP Logged |
Hi montmorency,
I agree the lack of articles for nouns is a problem for German. In future I want to include
this kind of metadata, including tenses of verbs too. (For Spanish the gender is usually
predictable so it hasn't bothered me, but information on verb conjugation is something I would
find useful.)
One workaround for now is to click on both the noun and the article in the text; if they are
adjacent, they will merge into one flashcard. The only problem with this is that the
prioritisation of words by frequency/usefulness only works for individual words, and
all phrases are bumped to the top of the queue. So if you translate lots of phrases, they will
drown out the individual word cards. You can manually delete the cards you don't find useful,
but that isn't ideal and I should think of a better system to deal with this.
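To make the queue behaviour described above concrete, here is a minimal sketch in Python. It is not Readlang's actual code: the `Card` class, the `frequency_rank` field and the rule "anything with a space in it is a phrase" are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Card:
    text: str
    # Hypothetical field: position in a word frequency list (1 = most common).
    frequency_rank: Optional[int] = None


def is_phrase(card: Card) -> bool:
    # Assumption: any card containing whitespace counts as a phrase.
    return len(card.text.split()) > 1


def queue_order(cards: List[Card]) -> List[Card]:
    def key(card: Card):
        if is_phrase(card):
            return (0, 0)  # every phrase shares the top priority
        rank = card.frequency_rank if card.frequency_rank is not None else float("inf")
        return (1, rank)  # single words: most frequent first, unknown words last

    # sorted() is stable, so cards with equal keys keep their original order.
    return sorted(cards, key=key)


cards = [Card("Haus", 300), Card("das Haus"), Card("und", 2), Card("der Hund")]
print([c.text for c in queue_order(cards)])
# ['das Haus', 'der Hund', 'und', 'Haus']
```

With this ordering, every "article + noun" card jumps ahead of all single-word cards, which is why a large number of phrases can crowd the single words out of the front of the queue.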
About the page controls: I've just fixed a bug where if you kept clicking very quickly on the
previous page button it would sometimes go forward. If you notice any more problems with this,
or anything else, please let me know.
Jumping to any page, scrolling, chapters, ePub support: These are all on my ever growing TODO
list. I can't promise when I'll get around to each of them because there are so many things
competing for my attention. What I do promise is to keep working full time on improving
Readlang :)
If you want a feature prioritised the best way would be to create or vote on an item in the
feedback forum: https://readlang.uservoice.com/forums/192149-general
Thanks for your feedback!
Steve
PS: I recently added a feature where texts are assigned an estimate of their difficulty,
represented by the CEFR scale, A1 (easiest) to C2 (hardest). It's a first attempt and I would
like to know whether it gives sensible results, especially for languages other than English and
Spanish, which I can't judge myself. Thanks!
1 person has voted this message useful
| montmorency Diglot Senior Member United Kingdom Joined 4825 days ago 2371 posts - 3676 votes Speaks: English*, German Studies: Danish, Welsh
| Message 68 of 131 22 June 2013 at 12:00pm | IP Logged |
Hi Steve,
Many thanks for your response. I will make use of the site's feedback, but I thought it
would be good for you to know you were getting users from HTLAL, and I'm also
interested in knowing what HTLAL members think of Readlang (mostly positive so far I
think).
Another thing with German nouns & articles is that the ending can change with the case,
so the trick of clicking on the article and noun together would only be useful if it
happened to be in the nominative (or those gender/case combinations that don't change
it). More experienced learners shouldn't have a problem (and it will keep them on their
toes), but beginners would have to watch out.
My problem with the "previous page" keys was not only when I was doing it quickly, I
should mention.
One little thing I noticed about that: I looked at the URL, and it seemed like the last
part of it was a line number (or possibly character number?). Anyway, I tried manually
reducing it, and sure enough, it went back, and from then on the previous page controls
worked fine. When I got back to the beginning, and then started working forward again,
the previous-page controls had stopped working again (going forward, not backward).
This is Chrome (without the extension), on XP.
I will try it again.
Oh, I noticed the CEFR level on the text. It said my book was B2, which is probably
about right - it's a "Krimi".
I was also intrigued as to how you determined which words were most frequently used (I
think this is in the language as a whole, not just the text).
By the way, I hope I didn't sound dismissive of the "Learn" tool. I've avoided SRS in
the past (in favour of paper methods ... word-list and Gold List ...or even just
letting the words sink in after looking them up once and writing them down), because of
the danger of feeling "tyrannised" by one's SRS system. (I've heard some people
complain about that). I won't rule out using it, but actually what will definitely be
useful is the initial review of words looked up in the most recent stint of reading,
especially if they are "new" words/cards.
1 person has voted this message useful
| SteveRidout Diglot Groupie Spain readlang.com Joined 4279 days ago 65 posts - 121 votes Speaks: English*, Spanish
| Message 69 of 131 22 June 2013 at 12:58pm | IP Logged |
I definitely appreciate the feedback in this forum, so please keep it coming! The only
advantage of the Readlang uservoice forum is that it allows voting, so I can see which
features are most in demand. Of course, this will be more useful as more people use it.
I'll investigate the previous page button problem on Monday - it certainly sounds like a
bug.
I agree about feeling "tyrannised" by SRS systems; I have periods where I take a break and
then have a massive backlog to get through. I am definitely going to re-think this to put
the user more in control of how many words they want to do at a time. I think the main
problem is that it makes you feel guilty for having a queue of words to get through, where
it should be more encouraging and enjoyable so you want to come back and learn more.
I got the frequency word lists from this website: http://invokeit.wordpress.com/frequency-word-lists/
They were generated from a very large corpus of subtitles from opensubtitles.org
2 persons have voted this message useful
| Crush Tetraglot Senior Member China Joined 5862 days ago 1622 posts - 2299 votes Speaks: English*, Spanish, Mandarin, Esperanto Studies: Basque
| Message 70 of 131 22 June 2013 at 4:24pm | IP Logged |
I'd like to test the CEFR level but for both Català and Euskal it says there's no support for it. Is there anything i can do to help get it set up for these languages?
Another thing that'd be nice for the cards would be a "select all" checkbox. I've been exporting the words into ANKI and deleting them from the list afterwards. I really love the export procedure, though it'd be nice if it remembered the previous export format.
I was also going to suggest adding monolingual dictionaries, especially since Catalan currently has no extra dictionary. Even if i end up using an English word, i often use the Catalan dictionary to help decide how to translate it. On the uservoice site someone mentioned "dlc.iec.cat", which is what i was going to suggest as a Catalan dictionary. I'm not sure what the criteria is to add a dictionary, but words can be passed through the URL (ie. dlc.iec.cat/results.asp?txtEntrada=cercar).
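As a rough illustration of passing a word through the URL, here is a small Python sketch. The path and the `txtEntrada` parameter come from the example above; the `http://` scheme, the function name and the UTF-8 percent-encoding of accented letters are assumptions, and the site may expect a different encoding.

```python
from urllib.parse import urlencode


def dictionary_url(word: str,
                   base: str = "http://dlc.iec.cat/results.asp",  # scheme is assumed
                   param: str = "txtEntrada") -> str:
    # urlencode percent-encodes the word (UTF-8 by default) so that accented
    # letters and spaces are safe to put in a query string.
    return f"{base}?{urlencode({param: word})}"


print(dictionary_url("cercar"))
# http://dlc.iec.cat/results.asp?txtEntrada=cercar
```

This only builds the lookup link; showing the result in a sidebar, or extracting just the definition, would need site-specific handling that is not sketched here.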
I know i've mentioned it before, but supporting chapters in books would be amazing. Not super important by any means, but it'd make my front page look much cleaner. Along those lines, it'd be nice to have a way to hide texts you've read. (Does deleting them also remove them from the public list?) More sorting options in the Public Library would be nice, too, for example by author, date added, by user (though i think clicking on a user also shows what files they've uploaded), title, etc. Also, a way to edit texts added would be great, especially for mistakes in the title/author.
Oh, and lastly (for now!), i don't know if there'd be a way to just get a quick translation of a word. Sometimes i just want to make sure that something really is a cognate, for example if i see the word "creación" i just want to double-check that it means creation, and if it does don't want to add it to the word list. If i look up the word "constipado", however, i might be surprised to find that it doesn't mean what i thought it did. I've just remembered the word list available in the side bar, so this might not be necessary (i've also noticed that you can edit the texts from there, great!). Actually, i've realized a lot of what i've asked for here has already been implemented!
I'd just like to say once again thank you! I use it every day, i love how easy it is to pick new words i want to learn. I feel like i can finally read a book on a computer screen, something i've never been able to do before! So, thank you for this wonderful program :)
3 persons have voted this message useful
| SteveRidout Diglot Groupie Spain readlang.com Joined 4279 days ago 65 posts - 121 votes Speaks: English*, Spanish
| Message 71 of 131 25 June 2013 at 12:03pm | IP Logged |
Crush wrote:
"I'd like to test the CEFR level but for both Català and Euskal it says
there's no support for it. Is there anything i can do to help get it set up for these
languages?"
It's enabled for every language for which I have word frequency lists like the top
50,000 ones here: http://invokeit.wordpress.com/frequency-word-lists/.
So if you can find or generate some word frequency lists like these from a large enough
corpus of Catalan or Basque texts I'll be very happy to add them.
Crush wrote:
"Another thing that'd be nice for the cards would be a 'select all' checkbox. I've been
exporting the words into ANKI and deleting them from the list afterwards. I really love
the export procedure, though it'd be nice if it remembered the previous export format."
You can check the first item, then SHIFT+click the last item in the list to select all.
I'll think about a dedicated select all checkbox in future as this would be more clear.
Crush wrote:
"I was also going to suggest adding monolingual dictionaries, especially since Catalan
currently has no extra dictionary. Even if i end up using an English word, i often use
the Catalan dictionary to help decide how to translate it. On the uservoice site someone
mentioned 'dlc.iec.cat', which is what i was going to suggest as a Catalan dictionary.
I'm not sure what the criteria is to add a dictionary, but words can be passed through
the URL (ie. dlc.iec.cat/results.asp?txtEntrada=cercar)."
Hmm, the problem with that site is that it wouldn't fit into the Readlang sidebar and
would need horizontal scrolling, which is a bit ugly. I completely agree that
monolingual dictionaries are a good idea though, and would like to add them.
Crush wrote:
"I know i've mentioned it before, but supporting chapters in books would be amazing. Not
super important by any means, but it'd make my front page look much cleaner. Along those
lines, it'd be nice to have a way to hide texts you've read. (Does deleting them also
remove them from the public list?) More sorting options in the Public Library would be
nice, too, for example by author, date added, by user (though i think clicking on a user
also shows what files they've uploaded), title, etc. Also, a way to edit texts added
would be great, especially for mistakes in the title/author."
Chapters are planned :)
Deleting shared books from your bookshelf doesn't delete them from the public library.
I've enabled some sorting options for the public books now.
You can edit texts from within the reader page side bar, in the Edit tab. It might be
nice to have some Edit features in the library page too.
Crush wrote:
"Oh, and lastly (for now!), i don't know if there'd be a way to just get a quick
translation of a word. Sometimes i just want to make sure that something really is a
cognate, for example if i see the word 'creación' i just want to double-check that it
means creation, and if it does don't want to add it to the word list. If i look up the
word 'constipado', however, i might be surprised to find that it doesn't mean what i
thought it did. I've just remembered the word list available in the side bar, so this
might not be necessary (i've also noticed that you can edit the texts from there,
great!). Actually, i've realized a lot of what i've asked for here has already been
implemented!"
Yes, for now the workflow is to delete them from the Words tab in the Sidebar, or to
wait until later and delete them from either the Words tab or the Learn tab. My aim has
been to remove as many decisions and distractions from the reading process as possible,
and it can be hard to think of ways to add more features to the reading interface while
keeping the UI very simple.
Crush wrote:
"I'd just like to say once again thank you! I use it every day, i love how easy it is to
pick new words i want to learn. I feel like i can finally read a book on a computer
screen, something i've never been able to do before! So, thank you for this wonderful
program :)"
Thanks so much, feedback like this makes my decision to carry on with the project very
easy!
1 person has voted this message useful
| Crush Tetraglot Senior Member China Joined 5862 days ago 1622 posts - 2299 votes Speaks: English*, Spanish, Mandarin, Esperanto Studies: Basque
| Message 72 of 131 25 June 2013 at 7:27pm | IP Logged |
SteveRidout wrote:
"It's enabled for every language for which I have word frequency lists like the top
50,000 ones here: http://invokeit.wordpress.com/frequency-word-lists/.
So if you can find or generate some word frequency lists like these from a large enough
corpus of Catalan or Basque texts I'll be very happy to add them."
I'm processing my own, though i'm not sure the corpus is really large enough. I put in around 20-30 ebooks (just over 2.5 million words), a ton of subtitles (250k words) and as many interviews as i could find (only around 50k words). The subtitles take much longer because i have to remove the timings, and a lot of subtitles labelled as Catalan are actually in Spanish (or some other language), or something's up with the character encoding.
I've currently sent the files to http://voyant-tools.org/ and am waiting for them to finish uploading/processing :)
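For anyone wanting to do the subtitle clean-up step locally, here is a minimal Python sketch. It assumes the subtitles are in the common SubRip (.srt) layout (a counter line, a timing line, then text); the function names are made up for illustration, and language detection and encoding repair are not handled.

```python
import re
from collections import Counter

# Timing lines in .srt files look like: 00:00:01,000 --> 00:00:02,500
TIMING = re.compile(r"^\d{2}:\d{2}:\d{2},\d{3}\s*-->\s*\d{2}:\d{2}:\d{2},\d{3}")
# A "word" here is a run of letters (accented letters included, digits excluded).
WORD = re.compile(r"[^\W\d_]+")


def subtitle_text(srt: str) -> str:
    kept = []
    for line in srt.splitlines():
        line = line.strip()
        # Drop blank lines, counter lines and timing lines.
        # Caveat: a subtitle consisting only of digits is dropped as well.
        if not line or line.isdigit() or TIMING.match(line):
            continue
        kept.append(line)
    return "\n".join(kept)


def word_counts(text: str) -> Counter:
    return Counter(w.lower() for w in WORD.findall(text))


sample = "1\n00:00:01,000 --> 00:00:02,500\nBon dia, bon dia!\n\n2\n00:00:03,000 --> 00:00:04,000\nAdéu\n"
print(word_counts(subtitle_text(sample)).most_common())
# [('bon', 2), ('dia', 2), ('adéu', 1)]
```

Summing the counters from every file and sorting by count gives a "word, number of hits" list in roughly the shape of the frequency lists linked earlier in the thread.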
SteveRidout wrote:
"You can check the first item, then SHIFT+click the last item in the list to select all.
I'll think about a dedicated select all checkbox in future as this would be more clear."
Ah, i had no idea you could do that. That's perfect, then :)
SteveRidout wrote:
"Hmm, the problem with that site is that it wouldn't fit into the Readlang sidebar and
would need horizontal scrolling, which is a bit ugly. I completely agree that
monolingual dictionaries are a good idea though, and would like to add them."
I guess it's a bit more work, but would there be any way to pull just the definition from the site? Each site's got a different format, but i imagine within each site there's a general format it uses (eg. something like this, he also helped me to write something similar for the Galician dictionary). I can understand wanting as much as possible to be automated, though.
SteveRidout wrote:
"You can edit texts from within the reader page side bar, in the Edit tab. It might be
nice to have some Edit features in the library page too."
I found that out later, and that's perfect :)
EDIT: Here's a frequency list for Basque:
http://www.mirari.fr/WaHb
The format is "word, number of hits in the corpus, hits/million, number of syllables, and the last one i'm not sure what it is". There's a little over 50,000 words there, but i can pull more if you want. It's from here: http://www.ehu.es/ehg/ehme/datu2hitz.htm. Words that have the same number of hits are organized alphabetically.
Ah, i just found out there's an English version, too:
http://www.ehu.es/ehg/ehme/en/datu2hitz.htm
And a quote from the site:
Quote:
"Frequency data have been taken from the Contemporary Reference Prose (CPR) corpus. However, only common Basque lexicon words have been included, those linked to a lemma. Proper names, words taken from other languages, errors, and so on, have been left out. As a result, of the 25.1 million in texts in the EPG corpus, 22.7 million words have been included."
And here's the Catalan file i made, though i think i need to do a bit of cleaning up of the original texts so that things like "l'amor" are treated as separate words: l' and amor. I don't know if the corpus is really large enough, either. I'll try adding some more stuff and see if the results are more convincing:
http://www.mirari.fr/LtBD
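The "l'amor" clean-up could be done at tokenisation time rather than by editing the source texts. Here is a rough Python sketch, assuming a short list of elided forms (l', d', s', m', t', n') placed before a word; that list and the function name are illustrative assumptions, and forms attached after a word (such as 'l in "porta'l") are not split off as units.

```python
import re

# Assumed set of elided forms that attach to the following word.
ELIDED = ("l", "d", "s", "m", "t", "n")

# First alternative: an elided form plus apostrophe, directly followed by a word.
# Second alternative: any other run of letters.
TOKEN = re.compile(
    r"\b(?:%s)['’](?=\w)|[^\W\d_]+" % "|".join(ELIDED),
    re.IGNORECASE,
)


def tokenize(text: str):
    # Lower-case and normalise curly apostrophes so "L’amor" and "l'amor" agree.
    return [t.lower().replace("’", "'") for t in TOKEN.findall(text)]


print(tokenize("L'amor i l'home d'aquí"))
# ["l'", 'amor', 'i', "l'", 'home', "d'", 'aquí']
```

Feeding these tokens into the counting step means "amor" is counted once whether it appears as "amor" or "l'amor".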
EDIT2: Here's a reprocessed Catalan frequency list with another 1.5 million words added from books:
http://www.mirari.fr/1DYK
I see a couple things that could still use some formatting, but this one seems much better. It seems i still might need to add more words, though. After around 10k a lot of words have the same frequency count.
Edited by Crush on 26 June 2013 at 1:45am
2 persons have voted this message useful