
Sources for Sentence Mining

 Language Learning Forum : Learning Techniques, Methods & Strategies
PaulLambeth
Senior Member
United Kingdom
Joined 5363 days ago

244 posts - 315 votes 
Speaks: English*
Studies: Icelandic, Hindi, Irish

 
 Message 1 of 6
05 November 2011 at 7:24am
Hi all,

I'm looking to start the sentence mining technique that Glossika (Mike Campbell) uses so frequently. If you don't know it, he talks about it in detail here: http://www.youtube.com/watch?v=XAdyAa4oHDA. Of course, I can't do anything like 5,000 sentences every day, but I can see its effectiveness.

The language I'm planning to use it for is Icelandic. Has anyone tried collecting mass sentences (a thousand or so at a minimum) and found particularly good resources? My thoughts:

- books which list sentences. Easier to find for important business languages.
- phrase books. Like smaller, simpler versions of the above. There really aren't any for Icelandic, and they tend to be a little too simple.
- own creation. I'd rather not have to resort to this, apart from careful manipulation of existing sentences, as I'm trying to increase my vocabulary where I know it's correctly used, and I don't know any Icelanders well enough to ask them to check mine.
- translating example sentences in other languages' textbooks. Again, this has the problem that 'own creation' does, but with the advantage of giving a range of sentences I might not think of normally.
- normal books. I have a couple, which are proving quite useful; however, one has to avoid the more poetic ones unless one is that advanced.
- Tatoeba (http://tatoeba.org/eng/). This has a good 8,800 sentences in Icelandic, although a lot of them are translations of poetic quotes or too simple. Definitely some possibilities here though.
- web resources, e.g. forums, Wikipedia and news websites (as well as newspapers). Again this could be handy. I've not looked there yet though. Newspapers often contain a lot of words I don't understand, but forums should be less taxing and more 'modern' too.

Any ideas for more? These resources can be used for any language of course, and the 'books which list sentences' source appears to be key.

I would like to apply the method to Hindi later too, but that'll be a while.

Paul

---

EDIT: This may be useful: http://www.forlagid.is/?p=5731. It has a whole book about the making of sentences, so I'd expect a few examples, but unfortunately it's out of print. Might have to check elsewhere for that.

Edited by PaulLambeth on 05 November 2011 at 7:41am

3 persons have voted this message useful



Doitsujin
Diglot
Senior Member
Germany
Joined 5310 days ago

1256 posts - 2363 votes 
Speaks: German*, English

 
 Message 2 of 6
05 November 2011 at 12:54pm
There was a similar thread last year.
1 person has voted this message useful



PaulLambeth
Senior Member
United Kingdom
Joined 5363 days ago

244 posts - 315 votes 
Speaks: English*
Studies: Icelandic, Hindi, Irish

 
 Message 3 of 6
05 November 2011 at 3:28pm
Doitsujin wrote:
There was a similar thread last year.


That's about as similar as it can get. Same method, same language. I had no idea it was called the 10,000 Sentences method.

I've read the whole thread, and what I gather on top of mine is:
- blogs (sorta had it under web resources, but hadn't thought of that one)
- movie subtitles

Movie subtitles are an interesting thought. I know Nói Albinói and 101 Reykjavík have some pretty regular dialogue, and if I can pick an other-language film that interests me then all the better. I won't use a script as suggested; I want to read and comprehend each sentence first, and also write it down in my notebook. I also don't agree with the posts in there about Old Norse; its grammar may be almost identical, but the subject matter and choice of words are archaic. Words like kingdom, humanity and horse are the most common in those texts. Hardly appropriate for downtown Reykjavík usage.

Thanks for pointing it out to me. I'd still like to know if there are any more useful sources.

I've been through a few on Tatoeba and it seems I was underestimating how good those sentences are. There are some translated Bible quotes, admittedly, but many are applicable. Latest one:
Ég fékk krampa í fótinn þegar ég hljóp niður tröppurnar til að ná lestinni og ég varð að setjast niður í miðjum tröppunum.
(I got cramp in my leg when I ran down the stairs to catch the train, and I had to sit down in the middle of the stairwell)
Another one I saw yesterday, which kinda proves my point:
Í heiminum mínum eru allir smáhestar og borða regnboga og kúka fiðrildum.
(In my world everyone’s a pony and they all eat rainbows and poop butterflies)
I kept it though. Mainly because it's a cute sentence, but partially because it's amusing to anyone who knows some Norwegian.
1 person has voted this message useful



jed
Newbie
United States
Joined 4806 days ago

12 posts - 33 votes
Speaks: English*

 
 Message 4 of 6
07 November 2011 at 6:33pm
I don't know if this is particularly helpful, but here are a couple of ideas.

I like any kind of dual reader as they make for less work - Wikipedia is nice because, although it is not strictly speaking a dual reader, you can often reference back to a very similar article in your L1, making life easier.

Other than finding sources that make the task as easy as possible, for me the most important part of mining sentences is choosing the sentences. I prefer contemporary prose, and then focus on sentences with a variety of what in English would be adjective clauses, adverb clauses and noun clauses. Other than vocab and special expressions, these clauses are what usually make reading/listening most difficult, especially if your target language is markedly different from your L1. The faster I can get my head around these types of clauses, the faster I progress.
1 person has voted this message useful



Splog
Diglot
Senior Member
Czech Republic
anthonylauder.c
Joined 5659 days ago

1062 posts - 3263 votes 
Speaks: English*, Czech
Studies: Mandarin

 
 Message 5 of 6
07 November 2011 at 7:28pm
I faced exactly this problem when learning Czech. It was very hard to find sources of
material to mine from. Until I realised that although there are few resources for learning
Czech, there are tons of resources for Czechs learning English. So, I bought several of
those books which contained thousands of paired sentences: Czech on one side and English
on the other. These books worked just as well in the opposite direction.

I imagine that, similarly, there are masses of books teaching English to speakers of
Icelandic, with pairs of sentences in both languages that you could use just as
effectively for mining Icelandic sentences.
2 persons have voted this message useful



montmorency
Diglot
Senior Member
United Kingdom
Joined 4818 days ago

2371 posts - 3676 votes 
Speaks: English*, German
Studies: Danish, Welsh

 
 Message 6 of 6
10 November 2011 at 2:10am
For those who sentence-mine, do you also use your SRS system for individual words?


1 person has voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.