kraemder Senior Member United States Joined 5176 days ago 1497 posts - 1648 votes Speaks: English* Studies: German, Spanish, Japanese
Message 969 of 1702 | 29 May 2013 at 6:59pm
I stumbled upon a word frequency generator that allegedly works for Japanese. After work I'll be using it on some texts; the Potter and Naruto scripts come to mind. I kind of wonder how it will parse the words effectively, since Japanese text hasn't got spaces.
Edited by kraemder on 30 May 2013 at 8:30am
|
kraemder
Message 970 of 1702 | 30 May 2013 at 12:56am
Well, I ran the frequency generator on Potter. It's remarkably easy to use, and I think the output looks pretty good, though not definitive by any means. There are no spaces in Japanese, so that makes it a pain to parse, I'm sure. This word comes up really high on the frequency list, but I've yet to see it in the book and I'm several chapters in now:
賀する がする (vs-s,vt) to congratulate
If you look at the kana you can see how a dictionary might confuse a lot of する verbs with this word, since there are no spaces. There are some other words too: 田 appears 245 times in Harry Potter according to the parser, and I haven't seen it yet either. However, if it failed to deconjugate a word, it might mistake the leftover ending for this vocabulary word. I think if I just ignore these strange words that come up as high frequency, then the list is pretty good? I'm gonna make a list of the top 100 words by frequency that I don't already know (and that aren't obviously false, although I'm including がする; I'm fascinated by this word for some reason). It can't hurt, I guess.
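To illustrate how this kind of false hit can arise, here is a toy sketch of greedy longest-match segmentation over kana. It is not the tool's actual algorithm (the thread doesn't say how it works); `LEXICON` and `greedy_segment` are hypothetical names, and the lexicon is a tiny hand-picked one.

```python
# Toy lexicon keyed by kana reading (hypothetical, for illustration only).
LEXICON = {
    "におい": "匂い",
    "がする": "賀する",  # rare verb that collides with the particle が + する
    "が": "が",
    "する": "する",
    "みる": "見る",
    "た": "田",          # noun whose reading collides with the past-tense ending た
}


def greedy_segment(text, lexicon):
    """Split text by always taking the longest lexicon entry at each position."""
    max_len = max(len(key) for key in lexicon)
    words = []
    i = 0
    while i < len(text):
        # Try the longest candidate first, then shrink.
        for n in range(min(max_len, len(text) - i), 0, -1):
            piece = text[i:i + n]
            if piece in lexicon:
                words.append(lexicon[piece])
                i += n
                break
        else:
            # Nothing matched: emit the raw character and move on.
            words.append(text[i])
            i += 1
    return words


print(greedy_segment("においがする", LEXICON))  # が + する gets swallowed by 賀する
print(greedy_segment("みた", LEXICON))          # the inflected form is not in the lexicon
```

With this toy lexicon, におい + が + する comes out as 匂い + 賀する, and the た of みた is counted as 田 because the inflected verb isn't listed. Real morphological analyzers weigh alternative splits against each other instead of matching greedily.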
|
kraemder
Message 971 of 1702 | 30 May 2013 at 1:45am
I think I'm giving up on this word frequency thing. There are way too many false words found, sorting the list is a pain, and there's no context given, so when I study a word later it won't recall an image at all. I'm not liking it.
|
kraemder
Message 972 of 1702 | 30 May 2013 at 8:01am
I'm still playing with the frequency tool. I ran it on all seven of the Potter books. It's depressing, but per the report I need to learn 23,960 words to read all seven books and know every word. I didn't think it would be that many. It's probably off by a bit, but clearly there are a lot of words here.
*edit*
2688 kanji used
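A figure like this can be sanity-checked without any word parsing, since counting distinct kanji only needs a character-range test. A minimal sketch, assuming "kanji" means the basic CJK Unified Ideographs block (U+4E00 to U+9FFF); `distinct_kanji` is a made-up helper name, and the iteration mark 々 and rarer extension blocks are deliberately left out.

```python
import re

# Basic CJK Unified Ideographs block only (an assumption; extensions are ignored).
KANJI = re.compile(r"[\u4e00-\u9fff]")


def distinct_kanji(text):
    """Return the set of distinct kanji characters found in text."""
    return set(KANJI.findall(text))


sample = "魔法の杖と魔法使い"
print(len(distinct_kanji(sample)), sorted(distinct_kanji(sample)))
```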
*another edit*
I gotta say, the number of words there is rather daunting. It makes sense to tackle only those that are high on the list. And it makes me want to use the prepared frequency lists you get from iknow.jp etc.
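One way to check how far the top of the list gets you is to compute how many of the most frequent words are needed to cover a given share of the running text. A minimal sketch, assuming the text has already been split into tokens by some tokenizer; `words_needed_for_coverage` and the toy token list are made up for illustration.

```python
from collections import Counter


def words_needed_for_coverage(tokens, target):
    """Return how many of the most frequent words cover `target` (0..1) of all tokens."""
    counts = Counter(tokens)
    total = sum(counts.values())
    covered = 0
    for rank, (_word, n) in enumerate(counts.most_common(), start=1):
        covered += n
        if covered / total >= target:
            return rank
    return len(counts)


# Toy data: 10 tokens, 4 distinct words.
tokens = ["の"] * 5 + ["は"] * 3 + ["魔法"] + ["杖"]
print(words_needed_for_coverage(tokens, 0.75))  # 2
print(words_needed_for_coverage(tokens, 1.0))   # 4
```

In this toy list, half the distinct words cover 80% of the tokens, which is the usual argument for working down a list from the top instead of aiming for every word.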
Edited by kraemder on 30 May 2013 at 8:29am
|
g-bod Diglot Senior Member United Kingdom Joined 5974 days ago 1485 posts - 2002 votes Speaks: English*, Japanese Studies: French, German
Message 973 of 1702 | 30 May 2013 at 7:27pm
Remember, however, that since any non-jouyou kanji is normally accompanied by a furigana
gloss, you don't have to worry about more than 2000 or so kanji anyway! It also sounds like
the text parser has some serious issues with word boundaries.
I think you'd probably have more success reading the old fashioned way and just looking stuff
up when you need to.
|
dampingwire Bilingual Triglot Senior Member United Kingdom Joined 4657 days ago 1185 posts - 1513 votes Speaks: English*, Italian*, French Studies: Japanese
Message 974 of 1702 | 31 May 2013 at 12:07am
kraemder wrote:
I changed my RTK deck to include sounds. Chinese sounds mostly but
occasionally Japanese sounds if
they're used in compounds or the Chinese ones just aren't used. I'm trying to keep it to
1 sound per kanji to make it more doable.
|
I built a list of kanji that have only one reading (or at least, only one reading in
KANJIDIC ... there are always exceptions to anything ...) and there were over 200 of
them. I'd originally decided that I wasn't going to bother learning readings, just "vocab", but
then I noticed a few single-reading kanji cropping up (like 電 and both of the kanji in
制服) and I found them to help me in remembering some vocab (for example, the last
kanji in 民主主義 only has the reading ギ).
Maybe once the N4 is out of the way, I'll get back to that and put them all in a deck
and see what that leads to.
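A list like this could be rebuilt along the following lines. This is only a sketch, not necessarily how the list above was made: it assumes the XML edition of the dictionary (KANJIDIC2), where each `<character>` has a `<literal>` and `<reading>` elements whose `r_type` is `ja_on` or `ja_kun` for Japanese readings. `SAMPLE` is a hand-trimmed illustrative snippet, not real file contents, and `single_reading_kanji` is a made-up name.

```python
import xml.etree.ElementTree as ET

# Hand-trimmed sample in the KANJIDIC2 layout (illustrative, not the real file).
SAMPLE = """<kanjidic2>
<character><literal>電</literal><reading_meaning><rmgroup>
<reading r_type="pinyin">dian4</reading>
<reading r_type="ja_on">デン</reading>
</rmgroup></reading_meaning></character>
<character><literal>山</literal><reading_meaning><rmgroup>
<reading r_type="ja_on">サン</reading>
<reading r_type="ja_on">セン</reading>
<reading r_type="ja_kun">やま</reading>
</rmgroup></reading_meaning></character>
</kanjidic2>"""


def single_reading_kanji(xml_text):
    """Map each kanji with exactly one Japanese (on or kun) reading to that reading."""
    root = ET.fromstring(xml_text)
    result = {}
    for character in root.iter("character"):
        literal = character.findtext("literal")
        readings = [
            r.text
            for r in character.iter("reading")
            if r.get("r_type") in ("ja_on", "ja_kun")  # skip pinyin, Korean, etc.
        ]
        if len(readings) == 1:
            result[literal] = readings[0]
    return result


print(single_reading_kanji(SAMPLE))
```

For the full file, `ET.parse(path).getroot()` would replace `ET.fromstring`; the path is left out here because it depends on where the dictionary was downloaded.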
|
dampingwire
Message 975 of 1702 | 31 May 2013 at 12:12am
kraemder wrote:
I stumbled upon a word frequency generator that allegedly works for
Japanese. After work I'll be using it on some texts.. Potter and Naruto scripts come to
mind. I kind of wonder how it will parse the words effectively since Japanese hasn't got
spaces in the text.
|
I've not made much progress trying to build up my concordance (and I probably won't work
on it until sometime in July) but a pointer to any kind of Japanese parser would be good.
In the unlikely event that it can cope with PDF input, then it might even be brilliant!
|
kraemder
Message 976 of 1702 | 31 May 2013 at 12:18am
Yeah, learning kanji sounds helps me a lot. But I haven't been able to make myself learn them just on
their own. At least not very many at a time. Mostly just a few. I am picking them up gradually as I do
vocabulary, so I'm getting there. To be honest I'm not too concerned by the number of kanji that the
frequency thing found. It's more the 20k words. Ouch. But I don't know how accurate it is, and I do know
that once you recognize enough words you get the context and can figure out the meaning of the words
you don't know. I get that with German. I don't already know all the words on a page on their own when I
read, but I'm often surprised to find I can say with confidence that I understand them all. So it probably
isn't as bad as it seems just looking at the numbers.
|