Create editable text from scanned PDFs

  Tags: Gadget
Language Learning Forum : Learning Techniques, Methods & Strategies
Doitsujin
Diglot
Senior Member
Germany
Joined 5254 days ago

1256 posts - 2363 votes 
Speaks: German*, English

 
 Message 17 of 20
16 January 2012 at 9:25pm
fiziwig wrote:
The OCR is built right into Acrobat Reader, so you don't need a separate program.

Unfortunately, this information is not correct. While the full version of Acrobat has a built-in OCR option, in Acrobat Reader, you can only select text if the document was processed with an OCR program or originated from a word processor/DTP software file.
You cannot select text in a PDF file whose pages contain the text only as images.
2 persons have voted this message useful



fiziwig
Senior Member
United States
Joined 4799 days ago

297 posts - 618 votes 
Speaks: English*
Studies: Spanish

 
 Message 18 of 20
17 January 2012 at 5:19pm
Hmmm. I didn't know that. I've only used it with some old Spanish textbooks from around 1910 that were scanned into Google Books. It worked quite well with http://ia700306.us.archive.org/15/items/firstspanishcour00hilluoft/firstspanishcour00hilluoft.pdf for example, and you can even highlight and copy/paste text right in the browser. These old textbooks are all images of pages.
1 person has voted this message useful



Doitsujin
 Message 19 of 20
17 January 2012 at 7:33pm
fiziwig wrote:
Hmmm. I didn't know that. I've only used it with some old Spanish textbooks from around 1910 that were scanned into Google Books. [...] These old textbooks are all images of pages.

They're PDFs with a text layer on top of the images, because Google processes all Google Books with an OCR program. Luckily, the vast majority of older digitized textbooks available at archive.org have such a text layer. But some older textbooks found elsewhere on the Internet don't. In other words, don't expect Acrobat Reader to do the OCR for you in these cases.
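
If you want a quick guess at whether a given PDF has such a text layer before opening it, here is a minimal Python sketch. It rests on an assumption that is only a heuristic: a PDF with selectable text normally references at least one font resource (`/Font`), while a pure image scan usually does not. The function name `may_have_text_layer` is made up for this example, and the check can be wrong in both directions (encrypted files or unusual stream filters can hide a font reference, and a font can be present without any useful text). A real PDF library or an OCR tool will give a more reliable answer.

```python
import re
import zlib

# Matches the raw body of each PDF stream object.
STREAM_RE = re.compile(rb"stream\r?\n(.*?)endstream", re.DOTALL)


def may_have_text_layer(pdf_bytes: bytes) -> bool:
    """Heuristic: True if the file references any font resource.

    Assumption (not a guarantee): selectable text needs a font,
    so a file with no /Font reference is probably image-only.
    """
    # Case 1: the font reference sits in an uncompressed object.
    if b"/Font" in pdf_bytes:
        return True

    # Case 2: the reference is inside a Flate-compressed stream
    # (for example an object stream in newer PDF versions).
    for match in STREAM_RE.finditer(pdf_bytes):
        try:
            # decompressobj tolerates the trailing newline before "endstream".
            inflated = zlib.decompressobj().decompress(match.group(1))
        except zlib.error:
            continue  # not Flate data (e.g. a JPEG image), skip it
        if b"/Font" in inflated:
            return True
    return False
```

To try it on a downloaded file, read the file in binary mode and pass the bytes in, e.g. `may_have_text_layer(open("book.pdf", "rb").read())`, where `book.pdf` is a placeholder name. If it returns False, the file most likely needs a separate OCR step.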
1 person has voted this message useful



fiziwig
 Message 20 of 20
18 January 2012 at 6:24am
Doitsujin wrote:
fiziwig wrote:
Hmmm. I didn't know that. I've only used it with some old Spanish textbooks from around 1910 that were scanned into Google Books. [...] These old textbooks are all images of pages.

They're PDFs with a text layer on top of the images, because Google processes all Google Books with an OCR program. Luckily, the vast majority of older digitized textbooks available at archive.org have such a text layer. But some older textbooks found elsewhere on the Internet don't. In other words, don't expect Acrobat Reader to do the OCR for you in these cases.


I see. I was obviously misunderstanding what I was seeing when I selected text in those books. I didn't realize there was an extra step involved.


1 person has voted this message useful



Copyright 2024 FX Micheloud - All rights reserved
No part of this website may be copied by any means without my written authorization.