
Building your own corpus

andras_farkas
Tetraglot
Groupie
Hungary
Joined 4695 days ago

56 posts - 165 votes 
Speaks: Hungarian*, Spanish, English, Italian

 
 Message 25 of 27
25 February 2014 at 7:03pm
Lizzern wrote:
kujichagulia wrote:
It's easier to have a collection of files such as
Word or LibreOffice documents, PDFs, or .txt files in a folder. But how do you search
all of those files? I'm not aware of any software that could do that.


The normal Windows search lets you get a list of files at least (it doesn't always
search inside pdfs though). Or if you keep them in the same folder, you can search the
folder itself and narrow down the list of files to those containing a certain word,
without having to open them. I'm probably going to use Word if I decide I want to build
my own corpus, and just use the search in the program, along with the Windows search.

Liz

Both of those options work, but they are really clumsy for regular use. The Windows
search feature gives you a list of file names without the actual content, so you have
to open the files one by one and find where the search term occurs. The search in Word
allows you to search in a single file only and even then you can only see one hit at a
time.


Dedicated software tools work much better, so that's what I recommend unless you are
unable or unwilling to install one.
Copernic, dtSearch et al. can search in PDF/Word/HTML/TXT and a bunch of other formats,
while Xbench and TMLookup can handle bilingual or multilingual texts and have better
search features. The downside is that you can't just dump files in a folder and be done
with it; you need to import your files, possibly after converting them to the right
format.
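
For anyone who would rather not install anything, here is a rough sketch of the "show me every hit with its context" idea in Python. It is not how any of the tools named above work internally; it only covers plain .txt files, and the folder name `my_corpus`, the search term and the function name `search_corpus` are all made up for illustration. PDFs and Word documents would first have to be converted to text.

```python
from pathlib import Path


def search_corpus(folder, term, encoding="utf-8"):
    """Yield (file name, line number, line text) for each line containing term.

    Matching is case-insensitive. Only .txt files are read; the folder is
    searched recursively, so subfolders are included.
    """
    needle = term.casefold()
    for path in sorted(Path(folder).rglob("*.txt")):
        # errors="replace" keeps the search going if a file has odd bytes.
        with path.open(encoding=encoding, errors="replace") as handle:
            for number, line in enumerate(handle, start=1):
                if needle in line.casefold():
                    yield path.name, number, line.strip()


if __name__ == "__main__":
    # "my_corpus" and "taberu" are placeholder values, not real data.
    for name, number, text in search_corpus("my_corpus", "taberu"):
        print(f"{name}:{number}: {text}")
```

Unlike the Windows search, this prints the matching lines themselves next to the file name, so you see all the hits at once without opening anything. It is still much cruder than a dedicated tool: no indexing, no bilingual alignment, and it will be slow on a very large collection.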

3 persons have voted this message useful



Lizzern
Diglot
Senior Member
Norway
Joined 5704 days ago

791 posts - 1053 votes 
Speaks: Norwegian*, English
Studies: Japanese

 
 Message 26 of 27
25 February 2014 at 7:11pm
andras_farkas wrote:
Both of those options work, but they are really clumsy for regular use. The Windows
search feature gives you a list of file names without the actual content, so you have
to open the files one by one and find where the search term occurs. The search in Word
allows you to search in a single file only and even then you can only see one hit at a
time.


Dedicated software tools work much better, so that's what I recommend unless you are
unable or unwilling to install one.
Copernic, dtSearch et al. can search in PDF/Word/HTML/TXT and a bunch of other formats,
while Xbench and TMLookup can handle bilingual or multilingual texts and have better
search features. The downside is that you can't just dump files in a folder and be done
with it; you need to import your files, possibly after converting them to the right
format.


Yep, it's not ideal. I just have an aversion to having to learn new software and how to convert files and so on, so I tend to just stick to the stone age method :-) I'm just not the technologically savvy type. That said, I probably wouldn't feel that way if I intended to have lots of separate files. I'm most likely going to compile texts into large files grouped in some way that makes sense (by subject, source, complexity...). So I might only have a few texts to search through. But if it feels too clunky I'll look into some of the other options :-)

Liz
1 person has voted this message useful



kujichagulia
Senior Member
Japan
Joined 4642 days ago

1031 posts - 1571 votes 
Speaks: English*
Studies: Japanese, Portuguese

 
 Message 27 of 27
26 February 2014 at 1:42am
Excellent advice here. Thank you very much!

It is probably too time-consuming for me to attempt right now, but now I know what to do if I am ever in the right situation.


1 person has voted this message useful


