For the past year or so, I've been using ABBYY FineReader rather than Acrobat to scan documents, in part because I was often scanning multilingual texts (something FineReader handles quite well), and in part because the version of Acrobat I had been using (6, I believe) kept wanting to reset the scanner software to black-and-white and was generally annoying.
I thought, however, that I'd see what Acrobat 8 did, and I was pleasantly surprised. Acrobat 8 no longer wants to reset the scanner software to a useless black-and-white, but offers its own controls, which include specifics on OCR and target language, and gives the option of viewing the scanner software settings as well, which I recommend doing.
Acrobat 8 was amazingly easy to set up to scan English-language chapters and articles. I was able to switch from grayscale to color for pages with color illustrations, and the program runs the OCR automatically once you tell it you're done scanning. My test searches seemed to be quite accurate.
Acrobat 8 offers more languages than the version I had used before (it now includes Czech!), so for monolingual documents it seems to be very fast and easy. For documents with text in multiple languages (I am thinking mainly of names with umlauts, cedillas, and so forth), I suspect it would be best to use ABBYY or OmniPage on a multi-language setting to make sure these are properly searchable.
At a lower level of quality, there remains the option of taking existing photocopies and running them through a Digital Sender, then separately running an OCR program on the resulting scans to make them searchable. I still say the Digital Sender is an amazing machine; over the summer I used ours on hundreds more pages of old articles.
Sunday, August 24, 2008