Bakunin Diglot Senior Member Switzerland outerkhmer.blogspot. Joined 5123 days ago 531 posts - 1126 votes Speaks: German*, Thai Studies: Khmer
| Message 9 of 54 08 March 2014 at 9:05pm | IP Logged |
How could I forget that most useful of tools for the learner bound by a 'No translation' principle: Google image search. It's like asking 'What is X?' or 'Show me X!'. It doesn't work for everything, but most nouns, basic verbs (especially verbs of motion) and key adjectives are well covered. I try not to overdo it because I don't want to sit in front of a screen when studying Turkish, but it's very helpful when I almost understand something and only miss a word or two. It's also great to confirm, or disprove, hunches.
I've thrown the Excel sheet out of the window and spent Friday on programming a basic but proper corpus and concordancer. I already have one for Thai, but Turkish is a very different language and has totally different requirements. I'll feed it sentences (and later texts) I fully understand, slowly building up a source of well-curated texts meaningful to me. This time the concordancer is command line based, making it much easier to use. I've only started feeding it today, but here's a sample search result:
Other than that, and the fact that I'm picking up words left, right and center - and dropping them almost as fast - there's nothing to report.
Edited by Bakunin on 08 March 2014 at 9:05pm
1 person has voted this message useful
| Bakunin Diglot Senior Member Switzerland outerkhmer.blogspot. Joined 5123 days ago 531 posts - 1126 votes Speaks: German*, Thai Studies: Khmer
| Message 10 of 54 11 March 2014 at 7:12pm | IP Logged |
I've reached a first - modest - milestone: I've completed the first cycle through the 29 science books for kids of the 'Tübitak Popüler Bilim Kitapları' series ('A', see here for a description and pictures). The book I thought was on goats was actually on domestic animals, the one on monkeys introduced rain forests etc., but most were what the cover promised.
I didn't do much other than read the books, i.e., move my eyes over the words, and look at the pictures. I went slowly, of course, and sometimes back and forth to confirm hunches ('Haven't I seen this word already in the previous sentence?'). Over the course of the ten days or so, I noticed a lot of improvement. In my first book, the one on bears, I started out understanding* two sentences out of maybe 100, and 10-15 words in total. In the last book, I completely understood maybe ten sentences and many more words, maybe 40 different ones if you want to put a number to it. I also picked up a bit of grammar, especially noun forms. Verbs are more of a mystery. Progress is tangible, and I'm now confident that I can pull it off. It just takes time and a high tolerance for ambiguity.
Apart from reading, I've started Anki-ing those elementary picture books ('Z'). There is some overlap with the science books (aimed at age 7+ according to the cover) but less than I had hoped for. The elementary picture books have more stuff from the immediate environment of a small child like toothbrush or bus. Bears and sharks on the other hand don't brush their teeth. I also continue to listen to Deutsche Welle's press review. In the beginning, I didn't understand much beyond names and countries, but now more and more isolated words I picked up reading are starting to stand out.
What's next? I will do one more cycle, and probably more, through those 29 books. As long as I don't get bored there's no need to move on, and I'm sure successive rounds will multiply the amount of words I understand.
I also worked a bit more on my corpus. I'm now pretty happy with the design, but I haven't started entering texts beyond testing. The setup requires me to identify the base form of a word. For nouns, the base form is usually obvious, but I'm struggling with verbs - there seems to be a whole zoo of verb forms. I'm not going to read up on Turkish verbs; I trust that over time things will become obvious.
*Understanding at my stage always means to have a guess.
3 persons have voted this message useful
| renaissancemedi Bilingual Triglot Senior Member Greece Joined 4351 days ago 941 posts - 1309 votes Speaks: Greek*, Ancient Greek*, EnglishC2 Studies: French, Russian, Turkish, Modern Hebrew
| Message 11 of 54 12 March 2014 at 9:01am | IP Logged |
Well done for the milestone. You'll end up with a ton of vocabulary!
1 person has voted this message useful
| Bakunin Diglot Senior Member Switzerland outerkhmer.blogspot. Joined 5123 days ago 531 posts - 1126 votes Speaks: German*, Thai Studies: Khmer
| Message 12 of 54 14 March 2014 at 8:38am | IP Logged |
@renaissancemedi: I hope so :) but see below for the fragility of my knowledge at this stage.
I've had a very busy week at work, but now I've finally gotten around to reading the book on bears again. Now that my corpus is set up, I decided to enter all the sentences I understand (for a disclaimer and an example, see below) into it. Since I'm not working with textbooks, I need other ways of measuring progress, and counting sentences is one way of doing so. Reading this book for the first time about two weeks ago, I understood 2 sentences. This time I was able to enter 28 sentences into my corpus. That's pretty good progress and demonstrates nicely that it is entirely possible to rely solely on context to acquire vocabulary from a base of zero, given appropriate texts with lots of pictures. The whole book contains an estimated 130 sentences, so I'm well on my way. But, of course, it's mostly the shorter, easier sentences I understand, so the ratio 28/130 might be a bit misleading.
As a disclaimer and an example, here's one sentence I rate as understood, plus what I make of it:
Kutup ayıları sualtında iki dakika nefeslerini tutabilirler.
Polar bears [are able to/can] [dive/hunt/fish] two minutes under water; I don't understand the word 'nefeslerini'.
(If someone with knowledge of Turkish reads this and sees that I'm totally off, please let me know :))
So, obviously I haven't gotten all the details down yet and even skipped a word. But I think I'm not fooling myself when I say I understand this sentence. This sentence is an example of the 'least clear' sentences I allow myself to rate as understood. Most other sentences are less ambiguous and don't contain unknown words.
Edited by Bakunin on 14 March 2014 at 8:40am
1 person has voted this message useful
| Bakunin Diglot Senior Member Switzerland outerkhmer.blogspot. Joined 5123 days ago 531 posts - 1126 votes Speaks: German*, Thai Studies: Khmer
| Message 13 of 54 14 March 2014 at 12:35pm | IP Logged |
The acquisition process in action... My understanding of the sentence in my last post was likely to be wrong. In a book about the Earth, I've just stumbled over the following sentence: Tüm canlıların - insanlar, hayvanlar ve bitkiler - nefes almak için havaya ihtiyacı vardır.
While I don't understand the sentence in its entirety, it's clear that it talks about people, animals and plants having to breathe. And there's the word 'nefes' again, which I now take for something like breath, or maybe 'nefes almak' as to breathe. In light of that, that other sentence from above, 'Kutup ayıları sualtında iki dakika nefeslerini tutabilirler', might rather mean something like 'The polar bear [is able to/can] hold his breath for two minutes under water'.
While not my sharpest moment, this little incident exemplifies in a rather honest manner how I'm moving along (and forward, hopefully). It's a very iterative process with lots of wrong hypotheses. Normally, I would avoid committing to a translation or meaning, but for my post I had to. Over time, incorrect guesses are weeded out, and the meaning of certain words is shaped and sharpened. At the moment, however, it's still mostly very blurry.
Edited by Bakunin on 14 March 2014 at 2:34pm
2 persons have voted this message useful
| Bakunin Diglot Senior Member Switzerland outerkhmer.blogspot. Joined 5123 days ago 531 posts - 1126 votes Speaks: German*, Thai Studies: Khmer
| Message 14 of 54 15 March 2014 at 10:30am | IP Logged |
Not sure if anybody is actually interested in those details, but here's another example of my work flow and thought process when I'm at my desk. I'll translate words below, but note that I only do this for the sake of describing it here; when working with the book I would often think in pictures and concepts instead of translations.
Here is a section of the book on penguins I'm currently reading:
There are penguins on something that looks like an iceberg or a headland. The text reads as follows:
Bu penguenler dev bir buz kütlesinin üzerindeler. Bu yüzen buz kütlelerine buzdağı denir.
Initially, my word for word understanding was as follows:
[This/Here] penguins [dev = ?] one ice [kütle = ?] [on/on top?]. [This/Here] swimming ice [kütle = ?] iceberg [call/say?].
I use square brackets to indicate several possibilities, and where there's a question mark, I only have a vague feeling or guess. The word 'kütle', appearing twice, is new to me. I've seen the word 'dev' before but don't remember what it might mean.
First I checked the word 'dev' on Google image search. 'Dev' led to Bengali actors and other people, but 'devler', which I take for something like a plural, resulted in pictures like the following:
So, 'dev' might mean 'huge' or 'beast' or something to that effect. I don't think 'beast' is appropriate when talking about cute penguins, so I will accept 'huge' as the most likely hypothesis for the moment.
Next I googled 'kütle', resulting in pictures like the following:
From this, I'd say 'kütle' means 'mass'.
My understanding of the sentences now has evolved to:
Here, penguins are on a huge ice mass. These swimming ice masses are called icebergs.
As we saw yesterday, I could still be way off. This doesn't worry me; over time things will clear up. The next time I encounter 'dev' or 'kütle' I will revisit my hypotheses and update them accordingly if need be; this is, of course, not always a conscious process - and even better if it isn't!
Edited by Bakunin on 15 March 2014 at 10:37am
2 persons have voted this message useful
| druckfehler Triglot Senior Member Germany Joined 4861 days ago 1181 posts - 1912 votes Speaks: German*, EnglishC2, Korean Studies: Persian
| Message 15 of 54 16 March 2014 at 12:09pm | IP Logged |
Your log is seriously beyond fascinating. Thanks for explaining your method with examples! The image search method seems extremely helpful. I really want to give your method a try now with one of my next Persian children's books and I'm glad some form of look-up is still possible, even without a dictionary. The results for nouns and adjectives look very promising.
I noticed that there are a couple of similar words in Turkish and Persian (not surprising) which I can understand, such as kitap (ketaab) and nefes (nafas) - your guess on the latter one seems solid :)
1 person has voted this message useful
| Bakunin Diglot Senior Member Switzerland outerkhmer.blogspot. Joined 5123 days ago 531 posts - 1126 votes Speaks: German*, Thai Studies: Khmer
| Message 16 of 54 23 March 2014 at 2:19pm | IP Logged |
@druckfehler: Thanks, it’s always nice to see you stopping by. Yeah, there must be many loanwords in both languages, considering how close these two cultures are. It’s cool to find cognates in new languages, isn’t it?
The first implementation of my corpus had two shortcomings which I had to fix, one related to the workflow and the other to computation time. Now my corpus is properly implemented using sqlite3 and database technology, and the workflow streamlined. I started adding texts on Friday and am excited to see it grow; it has the promise to become a major study tool for me, given the agglutinative nature of the language. On the internet, I found TS Corpus, a general-purpose open-access corpus containing half a billion POS-tagged tokens. It has a lot of functionality, including collocations, and is simply an impressive piece of work. While I prefer to build up my own corpus of curated, familiar texts, it’s good to know that something professional exists on the internet. I approached the guy behind TS Corpus, Taner Sezer, a researcher at Mersin University, to find out how to efficiently lemmatize words (= remove endings and get to the root, like [apples -> apple] or [caught -> catch]). Lemmatization in an agglutinative language, which tends to add one ending after the other to word stems, is extremely useful, and I decided I need this for my own corpus as well. Taner pointed me to TRmorph, a morphological analyzer for Turkish, which is a great tool and meets all my needs at this point.
So, my workflow at the moment looks as follows: I read my book, look at the pictures, and add to my corpus sentences I believe I understand to some extent. While adding those sentences, I use Google image search to check up on certain words or confirm/disprove hunches, and I also use TRmorph to get at the lemma (root, stem). Sometimes, I already use my fledgling corpus to investigate things I notice; at least for now, I still remember most of the pictures the sentences come with. A second line of work consists of just reading and looking at the pictures; I do that on the train to work. I also listen to a bit of spoken Turkish, but not very much.
To wrap up for today, here are three examples of what I already do with my three-day old corpus.
1) I look at individual words in sentences familiar to me:
In this specific example, I see that the word uzun is used with a great many different nouns: claws of a bear, the river Nile, time, wings of an albatross and even fish around underwater volcanoes. I use this technique to investigate words I’ve seen before and which I know have already sneaked into the corpus even though I’m not entirely sure what they mean, or to confirm hunches.
2) I look at specific constructions:
Looking at these examples, the pattern is pretty obvious.
3) I look at specific endings (rarely):
I use this mostly to investigate hunches about grammar. Here, I have the hunch that some kind of ‘ability to do something’ is expressed with this ending, but I need to see more examples in the wild. I try not to focus too much on grammar; I believe it’s more efficient to let it emerge at its own pace.
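The word and ending searches described above can be sketched as a small keyword-in-context (KWIC) routine. This is only an illustration under assumptions, not the concordancer from this log: the function names are made up, the sample sentences are the ones quoted earlier in the thread, and the ending 'ler' is an arbitrary choice for the demo, not the ending examined in the post.

```python
PUNCTUATION = ".,;:!?'\"()-"


def kwic(sentences, matches, width=30):
    """Return one aligned context line per word for which matches(word) is true."""
    lines = []
    for sentence in sentences:
        pos = 0
        for word in sentence.split():
            # Locate this occurrence of the word in the original sentence.
            start = sentence.index(word, pos)
            end = start + len(word)
            pos = end
            core = word.strip(PUNCTUATION)
            if core and matches(core):
                left = sentence[max(0, start - width):start]
                right = sentence[end:end + width]
                # Right-align the left context so the hits line up in a column.
                lines.append(f"{left:>{width}}{word}{right}")
    return lines


def is_word(target):
    """Match one exact word form."""
    return lambda word: word == target


def has_ending(ending):
    """Match words that end in `ending` and are longer than the ending itself."""
    return lambda word: word.endswith(ending) and len(word) > len(ending)


if __name__ == "__main__":
    sample = [
        "Bu penguenler dev bir buz kütlesinin üzerindeler.",
        "Kutup ayıları sualtında iki dakika nefeslerini tutabilirler.",
    ]
    for line in kwic(sample, is_word("buz")):
        print(line)
    for line in kwic(sample, has_ending("ler")):
        print(line)
```

Matching on raw endings like this is crude: it will also catch words that merely happen to end in the same letters, so the hits still need to be read in context, which is the point of the aligned output.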
Edited by Bakunin on 23 March 2014 at 2:48pm
1 person has voted this message useful