
Grammar Frequency

  Tags: Grammar
 Language Learning Forum : General discussion
DaraghM
Diglot
Senior Member
Ireland
Joined 6151 days ago

1947 posts - 2923 votes 
Speaks: English*, Spanish
Studies: French, Russian, Hungarian

 
 Message 1 of 8
07 September 2012 at 1:54pm
Does anyone know if the frequency of grammatical concepts has ever been derived for various languages?

In Spanish, the most frequent word is 'que'. However, this doesn't tell me about its actual usage. Is 'que' more likely to occur as a relative pronoun by itself, or combined as 'lo que'? Similarly, 'de' has numerous usages, but which is the most common: expressing possession, or in conjunction with verbs? Is the -ía ending for imperfect verbs more common than -aba? Is the imperfect more common than the preterite?

In Russian, what is the relative frequency of the cases, and of the case endings? Is the -у ending for verbs more common than -ю? E.g. я иду (I go), я читаю (I read).

I think an invaluable resource for language learners would be a frequency list of grammatical concepts. Most grammar books are either too brief or too detailed to learn from efficiently; but if the concepts were ordered by frequency, it would provide the greatest coverage in the shortest time possible.
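
To make the "greatest coverage in the shortest time" idea concrete, here is a minimal Python sketch. It assumes you already have token counts per grammatical concept; the concept names and counts below are invented purely for illustration, and `coverage_curve` is a hypothetical helper, not part of any existing tool.

```python
# Invented token counts per grammatical concept (not real corpus data).
CONCEPT_COUNTS = {
    "present": 500,
    "preterite": 200,
    "imperfect": 150,
    "future": 100,
    "conditional": 50,
}

def coverage_curve(counts):
    """Order concepts by frequency and return cumulative coverage of all tokens."""
    total = sum(counts.values())
    running = 0
    curve = []
    for concept, count in sorted(counts.items(), key=lambda kv: kv[1], reverse=True):
        running += count
        curve.append((concept, running / total))
    return curve

curve = coverage_curve(CONCEPT_COUNTS)
for concept, covered in curve:
    print(f"{concept:12s} {covered:.0%}")
```

With these toy numbers, the first two concepts would already account for 70% of the counted forms, which is the kind of ordering a frequency-based grammar list would give.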

What are your thoughts?


Edited by DaraghM on 07 September 2012 at 1:55pm

1 person has voted this message useful





Iversen
Super Polyglot
Moderator
Denmark
berejst.dk
Joined 6703 days ago

9078 posts - 16473 votes 
Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan
Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian
Personal Language Map

 
 Message 2 of 8
07 September 2012 at 3:32pm
I made a complete typological survey of subordinate phrases in Modern French as part of my final dissertation when I studied that language, and I remember that only one other person (whose name I don't remember now) had made something similar and published it in a book - I could probably find the name in my dissertation if need be.

As rebellious as ever, I had proposed my own reclassifications of the phrase types, and then I went through exactly 10,000 pages of literature (representing different genres and styles), marked all subordinates and some related constructions without finite verbs, and counted them. The main theme of the dissertation was something like "The correlative constructions in Modern French", but the statistics included much more than that, and I also described the development of the phrase types from Latin through the history of French, with ramifications into other Romance languages and beyond. I could easily have made it into a doctoral thesis, but I needed a thesis for my candidate degree first.

And then I discovered that I couldn't even get a job as a grammar teaching hireling at my own institute - I needed a course in pedagogics, but if I had taken that I would at best have ended up as a teacher in the 'gymnasium' (high school), most likely on a part time basis. The positions at the universities were taken by the 68'er generation. Therefore I stopped doing serious scientific research, and I never got around to publishing the results of my statistical analysis.

Edited by Iversen on 07 September 2012 at 9:08pm

5 persons have voted this message useful



Chung
Diglot
Senior Member
Joined 7156 days ago

4228 posts - 8259 votes 
20 sounds
Speaks: English*, French
Studies: Polish, Slovak, Uzbek, Turkish, Korean, Finnish

 
 Message 3 of 8
07 September 2012 at 8:03pm
I've never seen anything systematic considering frequency of several features in a language. I have read essays, studies or lists for some feature in a language that also list frequency of that feature for the language in question. These often involve corpus analysis as you'd deduce the frequency of a feature by searching a large sample of the language in use.

Here are a few examples:

Cases in Finnish (including frequency of cases' employment based on analysis of articles from Helsingin Sanomat)
The use of the partitive case in Finnish learner language: A corpus study
Word order variation in German main clauses: A corpus analysis

In general, most language courses that I've come across roughly follow frequency of use in the contemporary language. For a canonical nominative-accusative language (the only type of language alignment that I've experienced) I invariably start with present tense and nominative case, and then move on to the other tenses and cases respectively. However, I think it's hard to turn this into something applicable to several languages, since teachers and course authors have to consider not only the frequency of a feature but also, to a certain degree, how difficult or time-consuming it is for non-natives to figure out.
1 person has voted this message useful



Peregrinus
Senior Member
United States
Joined 4492 days ago

149 posts - 273 votes 
Speaks: English*

 
 Message 4 of 8
07 September 2012 at 8:51pm
For conjugated and inflected variations, surely the frequency of same can be derived from a non-lemmatized frequency list. Such lists in my experience tend to be heavily based on written sources, so perhaps there would be some difference in common speech.
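
A rough Python sketch of what that derivation could look like, assuming the list is available as (word form, count) pairs. The forms and counts here are invented, and `ending_frequencies` is a hypothetical helper. Note that plain suffix matching also catches non-verbs such as 'día', so the totals are only a first approximation.

```python
from collections import Counter

# Toy non-lemmatized frequency list: (word form, count). Counts are invented.
FORM_COUNTS = [
    ("hablaba", 40),
    ("estaba", 120),
    ("tenía", 150),
    ("había", 300),
    ("comía", 20),
    ("día", 200),  # a noun, not a verb: raw suffix matching counts it anyway
]

def ending_frequencies(form_counts, endings):
    """Sum the token counts of all forms that end in each given ending."""
    totals = Counter()
    for form, count in form_counts:
        for ending in endings:
            if form.endswith(ending):
                totals[ending] += count
                break  # credit each form to at most one ending
    return totals

totals = ending_frequencies(FORM_COUNTS, ["aba", "ía"])
print(totals)
```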

For phrases, as in the use of que in Spanish, you can look for studies of large corpora for lexical chunks ("n-grams" is the term often used in the academic studies that I have seen).

Here is such a site with files for English, Spanish and Portuguese:

http://www.ngrams.info/spanport.asp

It has a large number of files according to the number of words looked at, as in 2-grams, 3-grams, etc. The problem is that you have to manually select what you are looking for to remove mere collocations and other junk. For frequency purposes though, I don't remember if such data is included. Even if it is, I suspect it only applies to that specific number of words, but perhaps not.
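
As a sketch of how 2-gram data could answer the 'que' versus 'lo que' question: the Python below assumes the data can be loaded as (word1, word2, count) rows. That layout is my assumption, not the documented format of the files on that site, and the counts are invented.

```python
from collections import Counter

# Invented 2-gram counts as (word1, word2, count); not real corpus data.
BIGRAMS = [
    ("lo", "que", 50),
    ("el", "que", 30),
    ("dijo", "que", 70),
    ("casa", "que", 50),
    ("que", "no", 90),
]

def left_context_shares(bigrams, target):
    """For 2-grams whose second word is `target`, return each first word's share."""
    counts = Counter()
    for first, second, count in bigrams:
        if second == target:
            counts[first] += count
    total = sum(counts.values())
    return {first: count / total for first, count in counts.items()}

shares = left_context_shares(BIGRAMS, "que")
print(f"'lo que' share of all 'X que' 2-grams: {shares['lo']:.0%}")
```

This only looks one word to the left, so it says nothing about what 'que' is doing grammatically in the other 2-grams; that still needs manual sorting or a tagged corpus.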

1 person has voted this message useful





Iversen
Super Polyglot
Moderator
Denmark
berejst.dk
Joined 6703 days ago

9078 posts - 16473 votes 
Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan
Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian
Personal Language Map

 
 Message 5 of 8
07 September 2012 at 9:03pm
A non-lemmatized frequency list would lump words with similar endings, but different functions together - like the ending "ae" in Latin. To get something relevant you would have to do some grammatical analysis to separate those cases.
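
A small Python illustration of the problem, assuming a corpus where each token has already been tagged by hand with case and number. The tokens and tags are a toy sample made up for this example, and `functions_of_ending` is a hypothetical helper.

```python
from collections import Counter

# Hand-tagged toy tokens: (form, case, number). Invented for illustration.
TAGGED = [
    ("rosae", "genitive", "singular"),
    ("rosae", "nominative", "plural"),
    ("puellae", "dative", "singular"),
    ("puellae", "nominative", "plural"),
    ("viae", "genitive", "singular"),
    ("rosam", "accusative", "singular"),
]

def functions_of_ending(tagged, ending):
    """Count how often an ending occurs in each (case, number) function."""
    counts = Counter()
    for form, case, number in tagged:
        if form.endswith(ending):
            counts[(case, number)] += 1
    return counts

ae = functions_of_ending(TAGGED, "ae")
for function, count in ae.most_common():
    print(function, count)
```

A plain frequency list would report five tokens ending in "ae" here; only the tags show that they are spread over three different functions.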
1 person has voted this message useful



Peregrinus
Senior Member
United States
Joined 4492 days ago

149 posts - 273 votes 
Speaks: English*

 
 Message 6 of 8
07 September 2012 at 9:25pm
Iversen,

That is a good point. While it would be more pronounced in heavily inflected languages like Russian and Latin, it still would be present in Spanish with gender endings. Also in Spanish one would have to manually combine and mathematically derive a combined frequency for the alternate imperfect subjunctive endings and distinguish between those conjugated endings which are the same in more than one tense.

Edited by Peregrinus on 07 September 2012 at 9:27pm

1 person has voted this message useful



Serpent
Octoglot
Senior Member
Russian Federation
serpent-849.livejour
Joined 6597 days ago

9753 posts - 15779 votes 
4 sounds
Speaks: Russian*, English, FinnishC1, Latin, German, Italian, Spanish, Portuguese
Studies: Danish, Romanian, Polish, Belarusian, Ukrainian, Croatian, Slovenian, Catalan, Czech, Galician, Dutch, Swedish

 
 Message 7 of 8
07 September 2012 at 10:40pm
It would be the same in German, with the same form of the article appearing in various cases. Seriously, it would be EASIER if there were no repeats, like in Finnish!

Gunnemark wrote a bit on this subject, but it was more in terms of what concepts/meanings you should start with, just like the 400-500 words of the "active minimum".
1 person has voted this message useful



Jeffers
Senior Member
United Kingdom
Joined 4909 days ago

2151 posts - 3960 votes 
Speaks: English*
Studies: Hindi, Ancient Greek, French, Sanskrit, German

 
 Message 8 of 8
16 September 2012 at 9:57pm
I had a flashcard program which had the 5000 most frequent word-forms in the Greek NT,
arranged in order. It seemed like a good idea, but actually it was a bugger to learn the
forms out of context. Many forms have multiple uses, so to answer the card correctly you
should be able to name them all.

There is a lot of Bible software with each word grammatically tagged. I used to use one
called BibleWorks. It should be possible to create grammar frequency lists with software
like this, but it would need to be with texts important enough for people to have gone
through and tagged every single word (such as the Bible).

By the way, the software is sophisticated enough to find things such as, for example, a
particular verb within two words of any infinitive. And I bought it nearly 12 years ago.
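
For anyone curious what such a query amounts to, here is a minimal Python sketch over a tagged text. It is not how any Bible software actually works internally; the (word, lemma, tag) layout, the tag names and the `near_tag` helper are all my own assumptions, and the sample is a toy English sentence.

```python
from collections import Counter

# Toy tagged text: (word, lemma, tag). The tag scheme is made up.
TOKENS = [
    ("I", "I", "PRON"),
    ("want", "want", "VERB-FIN"),
    ("to", "to", "PART"),
    ("read", "read", "VERB-INF"),
    ("and", "and", "CONJ"),
    ("she", "she", "PRON"),
    ("wants", "want", "VERB-FIN"),
    ("it", "it", "PRON"),
]

def near_tag(tokens, lemma, tag, window=2):
    """Return positions where `lemma` occurs within `window` tokens of `tag`."""
    hits = []
    for i, (_, lem, _) in enumerate(tokens):
        if lem != lemma:
            continue
        start = max(0, i - window)
        neighbours = tokens[start:i] + tokens[i + 1:i + 1 + window]
        if any(t == tag for _, _, t in neighbours):
            hits.append(i)
    return hits

# A grammar frequency list is then just a count over the tags.
tag_frequencies = Counter(tag for _, _, tag in TOKENS)
hits = near_tag(TOKENS, "want", "VERB-INF")
print(tag_frequencies.most_common())
print(hits)
```

Once every word carries a tag, both the proximity search and the frequency list are short loops; the expensive part is the tagging itself.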


1 person has voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.