Doing OCR Batch Processing Using The ScanSnap And ABBYY FineReader

January 5, 2010

Sometimes, when you have to scan a large number of documents at once, the step of doing OCR (making the PDF searchable) after each document can really slow things down. It may be preferable to scan them all in and then OCR them all in one big shot.

In the past I have posted about how to do batch OCR using Adobe Acrobat, and have posted an Acrobat AppleScript.

Over at the Optimality! blog, Tobi has posted a walkthrough of using ABBYY FineReader, which comes with the ScanSnap S1500M (and the S1500, for that matter), to do batch OCR.

The problem is that in the default setup, each scan is OCRed right after the scan, and depending on the age of your machine (my G5 is getting a little long in the tooth) it can take quite a while. When you’re in the process of scanning many hundreds of pages of paper documents, you don’t want to have to wait for the computer to do its OCR; you’d rather feed it all the documents and let it do the OCR while you’re doing something else.

Fortunately, this is possible. Reading all the way through the handbook as well as the ABBYY online help, I found out that you can scan to PDF only and then convert the PDFs with ABBYY FineReader afterwards.
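If you would rather not drag files around by hand, the "scan first, OCR later" idea can be roughed out in a short Python script. This is only a sketch under a few assumptions: the application name in APP_NAME is a guess (check what FineReader is actually called in your Applications folder), the "processed by FineReader" marker is based on the output file name a commenter describes below, and the function names are made up for illustration. It relies on the standard macOS `open -a` command, which hands files to an application much like dropping them on its Dock icon.

```python
import subprocess
import sys
from pathlib import Path

# Assumption: the name FineReader is installed under on your Mac.
APP_NAME = "ABBYY FineReader for ScanSnap"

# Assumption: FineReader's output files contain this text in their names,
# so files carrying it are treated as already OCRed and skipped.
PROCESSED_MARKER = "processed by FineReader"


def find_unprocessed_pdfs(folder):
    """Return the PDFs in `folder` that do not look like FineReader output."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file()
        and p.suffix.lower() == ".pdf"
        and PROCESSED_MARKER not in p.name
    )


def build_open_command(pdf_paths, app_name=APP_NAME):
    """Build the macOS `open` command that passes the PDFs to the application."""
    return ["open", "-a", app_name, *[str(p) for p in pdf_paths]]


if __name__ == "__main__":
    # Folder of plain (non-OCRed) scans; defaults to the current directory.
    folder = sys.argv[1] if len(sys.argv) > 1 else "."
    pdfs = find_unprocessed_pdfs(folder)
    if pdfs:
        # `open` returns right away; the OCR itself runs inside FineReader.
        subprocess.run(build_open_command(pdfs), check=True)
    else:
        print("No unprocessed PDFs found in", folder)
```

Saved under a name of your choosing and run with the scan folder as its argument, this passes every not-yet-processed PDF to FineReader in one go. It does not wait for the OCR to finish, and whether FineReader accepts the files still depends on them having been created by the ScanSnap.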

Check out the post here. Do you have any other tricks for doing batch OCR?

Comments

4 Responses to “Doing OCR Batch Processing Using The ScanSnap And ABBYY FineReader”

  1. Ron C on January 5th, 2010 3:15 pm

    We (that’s really my wife – I’m just IT support) use a ScanSnap along with DevonThink Pro and can do concurrent scanning and OCR-ing.

    DTP uses ABBYY FineReader to do its work, but the entire OCR process is under the control of DTP and not the Fujitsu driver.

    I can’t remember the exact configuration BUT it was pretty much straight out of the DTP playbook.

    Ron

  2. Michael F on January 6th, 2010 7:53 pm

    The easiest thing is just to scan everything to plain PDF, then run FineReader and drag a bunch of PDFs to its Dock icon. As long as they were created with the ScanSnap, it should OCR them one after another and save them as something like “Original_File_Name 1 processed by FineReader.pdf”. There is some limit to the number of files you can do at once, but it’s a fairly high one.

  3. Michael F on January 7th, 2010 3:54 am

    Whoops, now that I read the original post I see that is exactly what the linked article says. Duh.

  4. Leo on February 9th, 2010 1:55 pm

    Does FineReader come with the ScanSnap? I have the S510M and don't see FineReader on my Mac. Thanks!
