
How many Chinese characters do you need?

17 messages over 3 pages
Hertz
Pro Member
United States
Joined 4514 days ago

47 posts - 63 votes 
Speaks: English*
Studies: German, Spanish, Mandarin

 
 Message 9 of 17
12 March 2013 at 8:14pm
As the OP hinted, an exhaustive analysis would account for the multiple uses of each character in
conjunction with others. Knowing 人 doesn't mean you know 工人, 土人, 女人, 大人, and so on. You'd need
stats to say: "15% of the time, 人 is used in a way I understand."
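
A rough sketch of the kind of statistic described above: given word frequencies and the set of words a reader actually knows, estimate how often a character appears in a known word. The function name and the counts are made up for illustration; they are not real corpus data.

```python
def known_usage_share(char, word_counts, known_words):
    """Fraction of occurrences of `char` that fall inside words the reader knows."""
    total = 0
    known = 0
    for word, count in word_counts.items():
        # A character may appear more than once in a word, so count each occurrence.
        occurrences = word.count(char) * count
        total += occurrences
        if word in known_words:
            known += occurrences
    return known / total if total else 0.0


# Hypothetical frequencies, invented for this example only.
word_counts = {"人": 50, "工人": 20, "女人": 15, "大人": 10, "土人": 5}
known_words = {"人", "女人"}

share = known_usage_share("人", word_counts, known_words)
print(f"{share:.0%} of the time, 人 is used in a way I understand.")
```

With a real word-frequency list in place of the toy dictionary, the same function would give a per-character figure of the sort quoted above.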

Edited by Hertz on 12 March 2013 at 8:17pm

1 person has voted this message useful



Mountolive
Pro Member
United States
Joined 4460 days ago

10 posts - 29 votes
Speaks: English*
Studies: Spanish

 
 Message 10 of 17
25 March 2013 at 4:26am
I still have a copy of T.K. Ann's Cracking the Chinese Puzzles from an attempt to learn Mandarin some years ago which never really got off the ground.

In Chapter 6 of the first volume he discusses the results of a private survey of character usage in newspapers, conducted by 20 students over a period of one year. 1,411,088 characters were counted. The results he reports were as follows:

- There were a total of 4,687 different characters used.

- The 50 most common characters made up 27.5% of the 1.4 million characters counted.

- The first 500 most frequent characters accounted for 74.7% of the total.

- The first 2500 most frequent characters accounted for 98.8% of the total.

- Ann claims that a knowledge of 3650 characters will allow a reader to recognize 99% of the content of a Chinese newspaper.

Ann's statistics may be a bit dated (his books were published in 1982), but you might be able to use his results as one data point.
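
For anyone who wants to repeat this kind of count on their own texts, here is a minimal sketch of how such coverage figures can be computed. It is not Ann's method, which the book summary above does not describe in detail; the function name and the sample string are assumptions for illustration.

```python
from collections import Counter


def coverage_at(text, ranks):
    """Share of all characters in `text` covered by the top-N most frequent ones."""
    # Ignore whitespace; a real survey would also filter punctuation and digits.
    counts = Counter(ch for ch in text if not ch.isspace())
    total = sum(counts.values())
    if total == 0:
        return {r: 0.0 for r in ranks}
    ordered = [n for _, n in counts.most_common()]
    return {r: sum(ordered[:r]) / total for r in ranks}


# Toy input; replace with the contents of a large newspaper corpus.
sample = "的的的的一一一是是不"
cov = coverage_at(sample, [1, 2, 4])
for rank, share in cov.items():
    print(f"top {rank}: {share:.1%}")
```

Run on a large corpus with ranks such as 50, 500 and 2500, this would produce figures directly comparable to the ones listed above.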
4 persons have voted this message useful



OneEye
Diglot
Senior Member
Japan
Joined 6851 days ago

518 posts - 784 votes 
Speaks: English*, Mandarin
Studies: Japanese, Taiwanese, German, French

 
 Message 11 of 17
25 March 2013 at 5:45am
That sounds great and everything, but it still means that if you know 3650 characters, there will likely be a few in
every article that you don't know.

The list of characters learned by Taiwanese students linked to above is kind of weird. If you read up at the top,
the researchers analyzed reading material, dictionaries, and textbooks used by school students, and then split
the characters up into "levels" based on the frequency of use. Both characters of 蘑菇 show up in the ninth level,
which is weird to me because it's a pretty common word.

I have to say though, there's still an unsettling number of characters I don't know on that list.

The list of 4808 characters from the MOE can be found here.

I think that if you're aiming for advanced proficiency in Chinese, somewhere in the 4000-6000 range would be
suitable, depending on what you like to read. The more "highbrow," the more characters you'll need. I also have a
feeling that mainland writers tend to use fewer characters than those from Hong Kong or Taiwan, so if you plan
on limiting yourself to simplified material, maybe you can get away with fewer.

Edited by OneEye on 25 March 2013 at 6:31am

1 person has voted this message useful



shk00design
Triglot
Senior Member
Canada
Joined 4445 days ago

747 posts - 1123 votes 
Speaks: Cantonese*, English, Mandarin
Studies: French

 
 Message 12 of 17
25 March 2013 at 10:28pm
This topic came up at least once already. Always interesting to explore.

Unlike in English and other languages that use an alphabet, a lot of what you know off the top
of your head comes from frequent use. The more a character comes up, the easier it is to recognize. A
lot of times you recognize a character when you see it but can't remember how to write
it on the spot.

Reading a newspaper for instance, you will occasionally come across unfamiliar
characters. Knowing the characters around it in a sentence, you can make out what the
unknown character is.

When it comes to writing an E-mail, it is much simpler. You have dictionaries on your
computer to locate the proper characters. All you need to know is the meaning or
pronunciation. Reading a news article online is much the same. You can just copy and paste
the character to a computer dictionary for quick look-up.

Edited by shk00design on 25 March 2013 at 10:31pm

1 person has voted this message useful



cacue23
Triglot
Groupie
Canada
Joined 4300 days ago

89 posts - 122 votes 
Speaks: Shanghainese, Mandarin*, English
Studies: Cantonese

 
 Message 13 of 17
14 April 2013 at 5:20am
egill wrote:
Here is a list of characters that Taiwanese students supposedly learn (separated into
school grades 1-9). There's 5568 total but the first 8 sets (3526 total) would probably
suffice as a starting off point.


List


Oops, never thought traditional Chinese characters were that hard to recognize (I use the simplified version) until I saw that list...
1 person has voted this message useful



gaoyoude1
Diglot
Newbie
United Kingdom
fluentinmandarin.com
Joined 4214 days ago

6 posts - 16 votes
Speaks: English*, Mandarin
Studies: French, Spanish

 
 Message 14 of 17
14 May 2013 at 3:24am
I have learned more than 3000 Chinese characters systematically, and when I read modern
literature or Chinese newspapers, I hardly come across any characters at all that I don't
know, and when I do, they are generally obscure names of trees or fish etc.
I would say that the most common 3000 characters are more than enough unless you want to
get into ancient texts.
2 persons have voted this message useful



lorinth
Tetraglot
Senior Member
Belgium
Joined 4275 days ago

443 posts - 581 votes 
Speaks: French*, English, Spanish, Latin
Studies: Mandarin, Finnish

 
 Message 15 of 17
14 May 2013 at 9:28am
Yet another source of interesting statistics on this subject is Patrick Zein's site, based
on Jun Da's research on vast corpuses of literary and non-literary writings.

In a nutshell, with 3000 characters, you should recognize 99.2 % of characters used in
contemporary texts (which still leaves about 1 unrecognized character every 4 lines,
maybe 5 or 6 per page). In my experience, you can start reading contemporary prose
(though somewhat laboriously) with 2000 characters, which is supposed to amount to a
97.0 % understanding level. You will notice that the law of diminishing returns is in
full swing at the upper percentiles.
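
The arithmetic behind those figures can be checked with a short sketch. The characters-per-line and characters-per-page values below are assumptions chosen for illustration (they are not from Patrick Zein's site or Jun Da's data); real layouts vary a lot.

```python
def chars_between_unknowns(coverage):
    """Average number of characters read per unrecognized character."""
    return 1.0 / (1.0 - coverage)


# Assumed layout, for illustration only.
CHARS_PER_LINE = 30
CHARS_PER_PAGE = 700

for coverage in (0.97, 0.992):
    gap = chars_between_unknowns(coverage)
    lines_per_unknown = gap / CHARS_PER_LINE
    unknowns_per_page = CHARS_PER_PAGE / gap
    print(
        f"{coverage:.1%}: one unknown every {gap:.0f} characters, "
        f"about {lines_per_unknown:.1f} lines apart, "
        f"about {unknowns_per_page:.1f} per page"
    )
```

Under these assumptions, 99.2 % works out to roughly one unknown character every 125 characters, which is in line with the "1 every 4 lines, 5 or 6 per page" estimate, while 97.0 % means one roughly every 33 characters, or about one per line.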

And, of course, the big caveat is: "recognizing characters" does NOT mean "recognizing
words" or "understanding sentences".

Edited by lorinth on 14 May 2013 at 9:33am

2 persons have voted this message useful



lichtrausch
Triglot
Senior Member
United States
Joined 5961 days ago

525 posts - 1072 votes 
Speaks: English*, German, Japanese
Studies: Korean, Mandarin

 
 Message 16 of 17
14 May 2013 at 5:14pm
Unless I'm reading something with a pop-up dictionary, the most unknown characters I can
tolerate is ≈1 per page. It's just too much of a pain to be looking up characters on top
of looking up unknown words. So I probably won't be cracking open a novel until I'm
around 4000 characters. But at this point I'm making speedy progress so it probably isn't
far off. I actually have little idea how many characters I know since I don't use Anki or
Heisig or whatnot. If I were forced to guess, I'd say somewhere between 3000 and 3500.
It gets very hard to quantify at this level.


1 person has voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.