Iversen Super Polyglot Moderator Denmark berejst.dk Joined 6696 days ago 9078 posts - 16473 votes Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian Personal Language Map
| Message 3601 of 3959 10 May 2014 at 1:48am | IP Logged |
.. and now for a bit of statistics.
I had 75838 English words from 2008 and 2009 in my corpus, which I gave a first rough cleaning to remove long quotes and other conspicuous but unwanted elements. That brought it down to 73172 words, which I divided into two packages of 36304 and 36868 words respectively, and these were put into Excel for a bit of spreadsheet magic. I made a list of unique word forms from each bundle, and then I cleaned these lists. Foreign words, acronyms, nonsense and proper names were removed, though it has to be said that I probably should have kept at least the geographical proper names (after all, English is notorious for mangling place names from other languages), and it was also with mixed feelings that I kicked Youtube and Wikipedia out. But I kept derivations of these proper names. As for misspelled words, I corrected some and kicked out those that weren't obvious typos. After this I had 4421 and 4738 unique words left respectively. And when I combined the two lists and removed duplicates I had a list of 6804 words.
But this is not all: my goal was to count headwords. I used a somewhat homemade and fuzzy definition, which for instance made adverbs in -ly forms of the corresponding adjective. I also included all forms of verbs under the infinitive, but that gave some problems because you can't see whether, for instance, "train" is the verb or the substantive when you just see it in a list. I have generally let the derivations tell me that, and here I only found "training", so I only counted "train" as a verb - but in reality it may also have been the substantive in the original context. So there is a certain amount of slack in the definitions, but my two groups were reduced to 3498 and 3914 headwords respectively, and the combined list gave 5433 unique headwords (so there was obviously an overlap of 1979 headwords). But these words were the result of purging the two lists of unique word forms. To see what that would have meant for the original samples I used a trick: I made a look-up from the samples into the complete and revised word lists, and everything I didn't find was eliminated. That gave revised samples of 28725 and 32393 words respectively.
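The spreadsheet steps described above (unique lists, cleaning, merging, overlap and the look-up back into the samples) can be sketched with Python sets. This is only an illustration under stated assumptions, not the Excel procedure actually used: the function names, the toy sentences and the tiny `rejected` set are all hypothetical, and the headword grouping itself is left out because it was done by hand.

```python
def unique_forms(tokens):
    """Return the distinct word forms in a sample, ignoring case."""
    return {t.lower() for t in tokens}


def keep_listed(tokens, accepted):
    """The 'look-up' step: keep only tokens found in the cleaned word list."""
    return [t for t in tokens if t.lower() in accepted]


# Hypothetical toy samples standing in for the two packages of running text.
sample_a = "I train the dog and the dog likes training".split()
sample_b = "The cat likes Youtube and the cat sleeps".split()

# Hypothetical manual cleaning: forms thrown out of the unique lists.
rejected = {"youtube"}

list_a = unique_forms(sample_a) - rejected
list_b = unique_forms(sample_b) - rejected

combined = list_a | list_b   # merged list with duplicates removed
overlap = list_a & list_b    # forms present in both lists

# Revised sample: everything not found in the cleaned list is eliminated.
revised_b = keep_listed(sample_b, combined)

# The overlap follows from the three counts by inclusion-exclusion,
# which is also how the 1979 shared headwords in the post come about.
assert len(overlap) == len(list_a) + len(list_b) - len(combined)
assert 3498 + 3914 - 5433 == 1979
```

The same inclusion-exclusion check applied to the word-form counts (4421, 4738 and 6804) gives the overlap between the two lists of unique forms.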
And now we can draw a line: I had one sample of roughly 15,000 words in 2007, and it represented roughly 2600 unique headwords. With two independent samples of 28725 and 32393 words (after some cleaning-up) I found 3498 and 3914 headwords respectively, and the combined sample of 61118 words gave 5433 unique headwords.
And this is actually a somewhat unexpected result. I thought 75838 words were enough to give an adequate picture of my writing, but if that had been so the curve would have flattened, getting asymptotically closer and closer to some upper limit with growing sample sizes. In fact the interpolated line continues almost straight upwards, so it is clear that even the 75838 words in my original corpus (or 61118 after revision) aren't nearly enough to show a clear saturation effect. And the upper limit might then actually be the mysterious 'potentially active' vocabulary, which lies lower than a person's passive vocabulary - but heaven knows where.
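One rough way to look for saturation in these figures is to compute how many new headwords each additional 1000 running words brings between successive sample sizes. The sketch below is an assumption-laden illustration, not the calculation behind the post: `growth_rates` is a hypothetical helper, the two 2008-2009 samples are averaged into a single point because they are independent rather than cumulative, and the 2007 figures are the rounded ones quoted above.

```python
def growth_rates(points):
    """New headwords per 1000 running words between successive sample sizes."""
    pts = sorted(points)
    return [
        1000 * (h2 - h1) / (n2 - n1)
        for (n1, h1), (n2, h2) in zip(pts, pts[1:])
    ]


# (running words, headwords). The middle point is the average of the two
# independent samples; this averaging is an assumption made for the sketch.
half = ((28725 + 32393) / 2, (3498 + 3914) / 2)
points = [(15000, 2600), half, (61118, 5433)]

rates = growth_rates(points)
for (n, _), rate in zip(sorted(points)[1:], rates):
    print(f"up to {n:.0f} words: {rate:.1f} new headwords per 1000 words")
```

A curve close to saturation would show the rate dropping towards zero; with only three rough points, a rate that stays well above zero is in line with the reading that no clear saturation effect is visible yet.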
It would be interesting to make the same calculations on a language where my 'potentially active' vocabulary is lower (both in numbers and as a percentage of my passive vocabulary in the language in question), but apart from Danish I don't have nearly as much text in any other language on my computer. And that's a pity because it really seems that there would be something worth investigating here. My hunch is that there would be more of a saturation effect, but I could be wrong - my use of dictionaries when I write in 'weak' languages could hide such an effect.
In this project I have made it clear that I didn't expect to produce a scientific report; there are far too many loose ends for that. But I'm fairly sure that the question I have been investigating is right at the forefront of research in vocabulary sizes - and it is irritating to leave such a question unanswered.
PS: I have actually also calculated some word frequencies, but that took place before the revision processes so it would be necessary to repeat the calculations with the revised combined corpus. And they would probably just show that the usual high frequency words also are frequent in my multiconfused log thread.
Edited by Iversen on 18 August 2014 at 6:20pm
| montmorency Diglot Senior Member United Kingdom Joined 4821 days ago 2371 posts - 3676 votes Speaks: English*, German Studies: Danish, Welsh
| Message 3602 of 3959 10 May 2014 at 9:19pm | IP Logged |
I was trying to remember why the name "Melchisedech" seemed familiar, and Wikipedia has reminded me that he gets a name check in the liturgy of the Catholic Latin mass (not sure if he made it into the English translation): Melchizedek, which I must have heard a few hundred times when I was growing up.
I was intrigued by such an unusual name and kind of like it (although not enough to name one of my children after him :-) ). A favourite cat, perhaps.
Let's face it, if anyone on HTLAL was going to mention him, you kind of know it was going to be Iversen. :-) (meant as a sincere compliment, by the way :-) ).
I gather that the Eurovision song contest is coming from Denmark tonight, and that the actor that some of us know as "Kasper Juul" (Borgen) will be hosting it. As usual, we shall be watching it in ironic mode....and, oops, it's already started!
Iversen Super Polyglot Moderator Denmark berejst.dk Joined 6696 days ago 9078 posts - 16473 votes Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian Personal Language Map
| Message 3603 of 3959 11 May 2014 at 12:53pm | IP Logged |
I'm not going to watch the European Song contest - but I probably have to pay for it. The venue on "Refshaleøen" (a former shipyard as far as I know) where it is held has cost a lot of money - at least 240 million DKK (kroner) or 30 million € - and even though the city of Copenhagen hopes for some revenue from the people who visit it during the event (and pay exorbitant hotel prices), that money will not go to pay my bills. And we have a 'TV licence' here in Denmark, which means that more or less all of us cofinance the TV station DR (Danmarks Radio), which has to pay a large share of the expenses. The actual building of the premises has, however, been assigned to a halfway independent company.
I don't know who's going to win, but I hope it's somebody else this time. I don't even remember why we had to host it this time. What have 'we' done to deserve this - some 'Danish' song must have won last year? But probably in English, 'cause everybody except the French apparently sings in English now.
No, I definitely don't like the European song contest!
PS: right now I'm going to translate something I wrote in English at the polydog.forum into Latin - my recent research demonstrated that Latin was one of the languages that have been somewhat neglected lately due to other projects.
Edited by Iversen on 11 May 2014 at 12:59pm
Iversen Super Polyglot Moderator Denmark berejst.dk Joined 6696 days ago 9078 posts - 16473 votes Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian Personal Language Map
| Message 3604 of 3959 11 May 2014 at 6:43pm | IP Logged |
I could see from my corpora with words from this thread that I have neglected a number of languages, including Latin, to an intolerable extent. So I decided to do something. I have recently written a rant at the polydog forum about the origin of the universe and our chances (or maybe we should rather say risk) of meeting extraterrestrials, and I decided to translate it into Latin. And due to the subject I had to find Latin words for things like soap bubbles, plastic bags, plate tectonics and archaeal lithophile bacteria. My "New College Latin and English" dictionary by John C. Traupman proved again to be an invaluable aid, because it is basically written for people who live today and who might want to use the Latin language. My "Neues Latein Lexicon" from Libraria editoria vaticana gave a few valuable suggestions, but it has too many large holes in its vocabulary to stand alone. My Dansk-Latinsk dictionary by Ove Kjær proved once again to be as much a nuisance as a help. It looks comprehensive, but even the simplest words in modern Danish may be missing from it. It seems that it was compiled by taking a Latin-Danish dictionary and reversing the direction, not by taking the relevant Danish vocabulary and finding Latin translations. However, it was compiled in 1870 and it was a solid piece of scholarship for its age. The same cannot be said about the execrable junk which Gyldendal has published as their attempt at a Dansk-Latin dictionary. It has all the defects of the old Kjær dictionary (except that the Danish spelling has been modernized), but it has only a fraction of its words - and it is heartwrenching to think about the kind of dictionary we could have had if Gyldendal hadn't decided to cut corners and publish this worthless item, which now blocks the way for a decent successor. It is a very long time since I last threw a dictionary away, but today it happened - and if I had had a furnace at home I would have burnt it.
As a consolation I found out that the Latin Vicipaedia has been steadily increasing its collection of articles about scientific matters, and by switching from articles in English to articles in Latin you can solve many problems of scientific nomenclature and ways of expression. Besides, I have used Google in some cases, and in connection with this I found a pdf with some scientific vocabulary collected (or proposed) by David Morgan.
Just to illustrate the problems: the word "atmosphere" is translated by Traupman as "caelum" - but for me that means the sky as seen from the earth. I can't use that word here in 2014. The Vatican dictionary as usual prefers weird circumlocutions to simple words, and here it proposes "aer circumiectus". One more unpalatable choice. The old Kjær repeats the two preceding choices, but both Wikipedia and Morgan borrow the Greek word "atmosphaera" (which is also what the Romans themselves would have done), and I have followed in their footsteps.
So here goes:
LAT: Quid de theoria nominata anglice "big bang" dicere? Quomodo universum nostrum scilicet ortum est? Dicere quo deus eum fecit nihil explicat - ex quo veniret creator? Cardo discutionis clare est impossibilitas scientiae multum de ipsa origine dicere quia ut singularitas definienda est ubi leges naturae non plus validae sunt (forsitan idem leges mathematicae). Itaque non scimus cur origo prorsum sit ubi aliquod aliqui disploserit (quavis rationes extant immensarum '(mem)branarum' ultra universum et 'big bang' nostra qui concurrendae universa nova crearerent), tamen post fractionem secundae postea evolutionem universi subtiliter describere possum legibus physicibus qui magna ex parte indigationibus qui in Terra aut per investigationes astronomicas explorandibus sunt. Sicut indigationes suppositicae radiationae abscedentiae quae inter alia de prima aetate universi indicia nobis dat - exempli gratia indicationes momenti inflationis ubi universum repentine magnitudinum suam modo immodice auxerit, id quod dicitur uniformatitatem suam structuramque suam bullarum saponis explicare poteret.
Nimirum theoriae diversae extant, sed considerandum est quod omnis theoria multitudinem factorum iam collectorum explicare debet, et ut optio alternativa theoriae magnis fragoris melius facta ista explicare debet. Non satis est quaestiones operosas ponere de fragore magna et rationibus suae. Se ad quaedam quaestionem non responsum dare possum istud non rationem constituit ad theorias vestustas refutatasque revertere, sed contra theorias melioras quarere. Sicut accidit cum theoria relativitatis Einsteini, qui theoriae Newtoni successit sine eam - intra ambitos amplos - supervacanea reddere. Non ad theorias falsas Aristoteli regressi sumus.
De quaestionibus animentium extraterrestrium atque peregrinatium ad exoplanetas se Terra inhabitabilis fiat modo dicendum est: spatia immensa eo iter facendum quasi impossibile hoc efficit. Etiamsi existat alia planeta cum reis vivendis, manifeste non possumus eo pervenire intra spatium vitae humanum - et etiamsi pervenissimus expeditionem dimittere, non hic adesserimus se aliquando venisserit chartula (aut chartula digitalis). Ergo cur cruciamus? Futile est - "hortum suum cultivandum est", ut dixit Voltarius in libro "Candidus". Melius examinanda fuerit quid de saccos plasticos, calefactionem globalem facere atque hospites reales ex universo facere. Feliciter paret asteroidem parvam 'Aphophis' non Terram anno MMXXIX aut anno MMXXXVI percuturam iri, sed alia obiecta celesta recenter nobis praetervexerunt, certa vel sub satellitibus nostribus, et obiectum supra Chelyabinske in Russia anno 2014 displosit.
Etiam ad rem attinent quarere, cur omnia vita terrestra in RNA et DNA fundata esse paret. Quasi certum est quaecumque forma vivenda abhinc ultra tres billiones annuos adesse (etiamsi vestigia antiquissima non vere fossiles sunt, at potius relationes anormales isotopiorum et cetera - sed etiam vestigia stromotolitium vestustissima sunt). Nihilominus non scimus an prima constituo cum DNA (acidum desoxyribonucleicum) competitores habebat, et non scimus se vita in lacunis tenuibus lutulentis aut in fundo maris circa spiramentibus hydrotermalicis. Sed scimus quod bacteria prisca (archaea) in saxa longissime sub superficiam terrae et etiam in aqua salina fervidens, sicut origo vitae in multa circumiacentiis naturalibus possibilis sit. Mirabile est quod origen vitae accidit tantum maturum post creationem Terrae dum iam locus satis inhospitalis pro normis nostribus erat. Interdum dicitur vitam cum meteorides ex caelo afferta esse, sed istud non explicatio constituit - modo problemam in alteram locatationem transfert. Et dicere quo alicumque intelligentia illud fecit non quiquam ominum explicat. De unde intelligentia ista veniat? Debet facilius esse bacteriam archaeum cogitari aut vel creare, quam divum cogitari ac creare qui capabilis sit bacteriam archaeum creare.
In universo tanta stella et planetae sunt quod possibile est vita in aliquot alibus locationibus orta esse. Sed nihilominus eventum rarissimum essere potest. Ut Terrae conditiones jucundas vitae dare series eventorum inverisimilis necesse erat: opus erant planeta cum magnitudo apto per tectonicam laminarum, atmosphaeram aequam, campum magneticam, temperaturam iucundam, aquam (et libentissime aquam fluidam), concussio tempestiva ut lunam creare et magna planeta procul Terra ut asteroides cometasque divertere ex vicinitate nostri ... et etiam locum bonum ubi eventa fortuita proteina, RNA et DNA in haec ordine producerent. Ita non excluendum est alia Terra cum vita et fora interretialibus aliquo in universo existere, sed scilicet remota billiones (milliarda) de anniis lucis. Et forsitan interitus est billiones (milliarda) anniis ante creationem Terrae. Non scimus, et probabiliter numquam sciemus.
Edited by Iversen on 15 May 2014 at 1:18am
Iversen Super Polyglot Moderator Denmark berejst.dk Joined 6696 days ago 9078 posts - 16473 votes Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian Personal Language Map
| Message 3605 of 3959 14 May 2014 at 1:41pm | IP Logged |
Besides, I watched Croatian TV yesterday: a program about the upcoming EU election with contributions from other countries - for instance about youth unemployment, in Spanish with Croatian subtitles. So far I stick to Cyrillic when I write myself, but it is evident that Roman letters are more important because they are used for everything in Croatian and (probably) most of the stuff on the internet in Serbian.
And speaking about Romans: Yesterday I finished 'Asterix et la Zizanie' in Greek. It has taken a long time because I have worked intensively with the text, i.e. copied/retranslated etc., and the writing style is not among the easiest on the planet.
GR: Οταν εξετάζο ένα κειμένο εγώ εντακτικá αντιγρáφω και/ή επανα-μεταγρáφω κάθε πρόταση, και αυτó φυσικá θέλει πολύ χρóνο. Γι 'αυτó έχω δουλέψει με τον "Αστερíξ του γαλáτη: Η Δικονοíα"πάρα πολύ καιρό , αλλá τώρα επιτέλους του έχει τελειώσει. Χθες το βράδυ δούλευα αποτελεσματικά με τους κείμενο αρκετές ώρες για να φτάσετε στην τελευταία σελίδα με τη συνήθη γιορτή, που εíναι αυτή τη φóρα τα γενέθλια του αργικó Μαζεστíξ (ο βáρδος οπώς εíναι δεμένος Χειροπóδαδα - αλλά κάθεται στο τραπέζι, αντί να κρέμονται στο δέντρο). Ο υποκειμένο που πρóκαλα την διχονοíα αυτή τη φορá εíταν ο ρωμαíος Ζιζáνιος, που ήταν ειδικóς να πρακαλεíσει αντíθεση παντού.
Edited by Iversen on 14 May 2014 at 2:48pm
Iversen Super Polyglot Moderator Denmark berejst.dk Joined 6696 days ago 9078 posts - 16473 votes Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian Personal Language Map
| Message 3606 of 3959 15 May 2014 at 1:12am | IP Logged |
I have spent part of the evening on a study of the verbs in my Polish text book - in particular the use of perfective vs imperfective verbs. Besides, I have read the section about nominal inflection in my old Serbocroatian (!) grammar by Meillet and Vaillant. The funny thing about this book - apart from its attempt to describe a language which allegedly doesn't exist - is that it is talkative rather than addicted to tables, so if I didn't already know the inflection tables I wouldn't have learnt much from it. It would be like driving directions of the type "drive to the green tree, then left, then two roads further on to the left and then to the right one kilometer before the battered roadsign and then you are 'there'". But it is in French, which is nice after a lot of English grammar, and in between there are remarks which give me some insight I otherwise might have missed. Like the remarks about differences in vowel length between word forms which look the same on paper. I won't learn them from reading a book, but I now know there is something I have to listen out for .. later.
I would have written something about the Italian astronomy magazine I currently read in my bus-back-home-from-my-job and also about activities in Indonesian and Latin, but it is quite late now. Later...
Edited by Iversen on 15 May 2014 at 1:22am
Iversen Super Polyglot Moderator Denmark berejst.dk Joined 6696 days ago 9078 posts - 16473 votes Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian Personal Language Map
| Message 3607 of 3959 15 May 2014 at 3:09pm | IP Logged |
I was too tired at 1.22 AM to write more about that Italian magazine, but now I'm in the mood for some popular astronomy.
IT: Ho comprato la revista "Le Stelle" di Luglio 2011 durante un viaggio a Roma, Sardegna, Berlino e Copenaghen. E perché Copenaghen? Eh, perché la Federazione internazionale de ...
ESPeranto elektitis teni lian jaran mondan kongreson 2011 en la Bela Centro de Kopenhago, kaj ekde ne ekzistas vera Esperantistujo, tiu estis mia plej bona ŝanco por ajna trejnado. Sed en 2011 mi haud parolis Esperante, kaj sekve mi koncentris je la komprenado de la grandvaloran lingvon. Mi ankaŭ aĉetis kelkajn librojn, inkluzive la tradukon de La Hobito de Tolkien. En 2012, mi fine sukcesis babiladi Esperante dum konferenco en Galway, Irlando - almenaŭ dum la tago, en kiu mi restadis en la konferencejo. La resto de la tempo mi parolis Anglan lingvon.
IT: Ma stavamo parlando di astronomia. Uno degli articoli dice che forse ci sono più pianeti che stelle nell'universo, ma non possiamo vederli. Nello stesso articolo si parla anche della prima pianeta trovata con un minimo di sommiglianza alla Terra (Gliese d) - sebbene la sua stella è più piccola del nostro sole e lo stesso pianeta è più grande. Il 'd' significa che ci sono altri pianeti presso della stella Gliese, ma solo il 'd' si trova nella 'Goldilock zone', cioè la zona dove essere viventi potrebbero trovare temperature adeguate. I pianeti senza stella sono certamente ghiacciate e non adatte per la vita. C'è anche un articolo du Titan, la luna più grande di Saturno - e infatti talmente grande che possiede un atmosfera più densa di quella di Marte. Ma Titan è freddissima, brrrrr! Per una discussion sulle possibilità di gettare ewochi nei laghi di scoregge congelate su Titano, si prega di ascoltare QI XLI07 dove Ross Noble discute la possibilità con il professore Brian Cox e Sue Perkins.
Edited by Iversen on 16 May 2014 at 12:15am
Iversen Super Polyglot Moderator Denmark berejst.dk Joined 6696 days ago 9078 posts - 16473 votes Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian Personal Language Map
| Message 3608 of 3959 15 May 2014 at 11:47pm | IP Logged |
It isn't a big thing, but I have been reading about the composer Josquin's colleagues at a site named HOASM, and there I happened to read about the supremely prolific composer Orlando di Lasso (Rolandus Lassus):
The prize Franco-Netherlander was won by Duke Albert V of Bavaria, whose emissary persuaded Lassus to go from Antwerp to Munich as a singer in 1550.(...) That Lassus' relations with Albert's son, William V, were particularly friendly is shown by his letters to William, written in a cheerful mixture of languages, e.g., Italian, Latin, French, and German mingling in a single letter.
The interesting thing is not only that di Lasso could write in those languages, but also that he assumed that his regal penpal could read the result.