
Is there software that extracts words?

  Tags: Word List | Software
 Language Learning Forum : Questions About Your Target Languages
audiolang
Diglot
Senior Member
Romania
Joined 5236 days ago

108 posts - 109 votes 
2 sounds
Speaks: Romanian*, English

 
 Message 1 of 8
30 September 2007 at 3:56am
I am curious to know whether there is software that can number the words in a text and arrange them in alphabetical order. I had to do this as a homework assignment once, but the product had too many limitations.
1 person has voted this message useful





Iversen
Super Polyglot
Moderator
Denmark
berejst.dk
Joined 5619 days ago

9078 posts - 16472 votes 
Speaks: Danish*, French, English, German, Italian, Spanish, Portuguese, Dutch, Swedish, Esperanto, Romanian, Catalan
Studies: Afrikaans, Greek, Norwegian, Russian, Serbian, Icelandic, Latin, Irish, Lowland Scots, Indonesian, Polish, Croatian
Personal Language Map

 
 Message 2 of 8
01 October 2007 at 4:40am
I don't know of such a piece of software - at least not if you want something like a dictionary for a particular book or article. You need a grammatical module if you want to lump all the different forms of a word together, and then you still have the problem of separating homonyms, especially if they belong to the same word class. Getting a simple list of all word forms is, however, an easy task. There are a few pitfalls, but the general procedure would be as follows:

1) get the text into a decent word processor
2) turn off automatic hyphenation (word division)
3) convert comma, period and other signs followed by a space to just the space
4) convert period and other signs followed by a new line to just a space
5) now convert all spaces to new line
6) move the whole bunch into a spreadsheet
6a) if there are too many words, apply the unique filter from step 7 to one section at a time, then combine the resulting lists
7) make a unique list (in Excel via Data, Filter, Advanced Filter, unique records only)
8) sort it alphabetically

... and the rest has to be done manually

9) lump different forms of the same word together by removing all forms but one
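
Steps 3 to 8 above can also be approximated with a short script instead of a word processor and spreadsheet. The following is only a minimal sketch in Python, not the procedure Iversen describes: the function name `unique_word_forms` is made up for illustration, and lowercasing every word is an assumption that will not suit every language (German nouns, for example). Step 9 still has to be done by hand.

```python
import re


def unique_word_forms(text):
    """Return the distinct word forms in `text`, sorted alphabetically."""
    # Steps 3-5: rather than replacing punctuation and spaces one by one,
    # pick out runs of letters; an apostrophe or hyphen is allowed inside a word.
    # Lowercasing is an assumption made here so "The" and "the" count as one form.
    words = re.findall(r"[^\W\d_]+(?:['-][^\W\d_]+)*", text.lower())
    # Step 7: a set keeps a single copy of each form.
    # Step 8: sorted() orders by character code, which matches plain
    # alphabetical order only for unaccented lowercase letters.
    return sorted(set(words))


sample = "The cat sat. The cats sat, and the dog's cat ran!"
print(unique_word_forms(sample))
# ['and', 'cat', 'cats', "dog's", 'ran', 'sat', 'the']
```

Note that "cat" and "cats" remain separate entries, which is exactly the lumping problem left for the manual step.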



Edited by Iversen on 01 October 2007 at 4:40am

2 persons have voted this message useful



slucido
Bilingual Diglot
Senior Member
Spain
https://goo.gl/126Yv
Joined 5591 days ago

1296 posts - 1781 votes 
4 sounds
Speaks: Spanish*, Catalan*
Studies: English

 
 Message 3 of 8
01 October 2007 at 7:06am
Maybe something like this?

http://www.wordcounter.com/

http://www.hermetic.ch/wfc/wfc.htm

Edited by slucido on 01 October 2007 at 7:12am

2 persons have voted this message useful



xtremelingo
Trilingual Triglot
Senior Member
Canada
Joined 5203 days ago

398 posts - 515 votes 
Speaks: English*, Hindi*, Punjabi*
Studies: German, French, Arabic (Written)

 
 Message 4 of 8
03 October 2007 at 7:20pm
Honestly Slucido has the BEST internet/software resources! Almost every link you have posted here has always been very impressive!

Thank you once again! :)

1 person has voted this message useful



TreoPaul
Senior Member
United States
Joined 5246 days ago

121 posts - 118 votes 
Speaks: English*
Studies: German

 
 Message 5 of 8
13 October 2007 at 4:44pm
slucido wrote:
Maybe something like this?

http://www.wordcounter.com/

http://www.hermetic.ch/wfc/wfc.htm


A similar program is TextStat, available at
http://www.niederlandistik.fu-berlin.de/textstat/software-en.html


Edited by TreoPaul on 13 October 2007 at 4:47pm

2 persons have voted this message useful



justinwilliams
Diglot
Senior Member
Canada
Joined 5605 days ago

321 posts - 327 votes 
3 sounds
Speaks: French*, English C2
Studies: German, Italian

 
 Message 6 of 8
14 October 2007 at 6:07pm
I just finished a programming assignment for school in which we had to implement something that calculates the average word length of a text as well as the standard deviation of the word lengths. But it's only in Java, so it's not really versatile... Hope to get a perfect grade though!
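
For readers curious what such a calculation looks like, here is a rough sketch in Python rather than Java. It is not the assignment's code: the function name `word_length_stats` is hypothetical, a "word" is assumed to be any run of letters, and the population standard deviation is assumed (the assignment may have asked for the sample version).

```python
import re
import statistics


def word_length_stats(text):
    """Return (mean word length, population standard deviation of lengths)."""
    # Assumption: a word is any unbroken run of letters.
    lengths = [len(word) for word in re.findall(r"[^\W\d_]+", text)]
    # statistics.mean raises StatisticsError if the text contains no words.
    return statistics.mean(lengths), statistics.pstdev(lengths)


mean_length, spread = word_length_stats("a bb ccc")
print(mean_length, round(spread, 4))
```

Swapping `statistics.pstdev` for `statistics.stdev` gives the sample standard deviation instead.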
1 person has voted this message useful



Eve
Triglot
Groupie
United States
Joined 5591 days ago

67 posts - 67 votes 
Speaks: Russian*, English, Spanish

 
 Message 7 of 8
28 October 2007 at 5:14am
TreoPaul wrote:

A similar program is TextStat, available at
http://www.niederlandistik.fu-berlin.de/textstat/software-en.html

This is pretty good free software!
I've just downloaded it, and it took less than 10 seconds to create a frequency list from a 210-page book. You can also choose how to sort the words: by frequency and/or alphabetically, etc.
Thanks a lot!
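
For anyone who would rather script it, the same kind of frequency list can be sketched in a few lines of Python. This is not how TextStat works internally; it is only an illustration, and the function name `frequency_list` and its `by` parameter are invented for the example.

```python
import re
from collections import Counter


def frequency_list(text, by="freq"):
    """Return (word, count) pairs sorted by frequency or alphabetically."""
    # Assumption: words are runs of letters, compared in lowercase.
    counts = Counter(re.findall(r"[^\W\d_]+", text.lower()))
    if by == "alpha":
        # Alphabetical order by word.
        return sorted(counts.items())
    # Most frequent first; words with equal counts fall back to alphabetical order.
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample = "the cat and the dog and the bird"
print(frequency_list(sample))
# [('the', 3), ('and', 2), ('bird', 1), ('cat', 1), ('dog', 1)]
print(frequency_list(sample, by="alpha"))
# [('and', 2), ('bird', 1), ('cat', 1), ('dog', 1), ('the', 3)]
```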

Edited by Eve on 28 October 2007 at 5:21am

1 person has voted this message useful



TreoPaul
Senior Member
United States
Joined 5246 days ago

121 posts - 118 votes 
Speaks: English*
Studies: German

 
 Message 8 of 8
28 October 2007 at 8:08am
Eve, I am glad you like it. I knew if I looked long and hard enough I'd find something that could do the job well and without an enormous price tag. For some reason it is not very well known, but I find it perfect for my needs.


1 person has voted this message useful



Copyright 2021 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.