Doing OCR Batch Processing Using The ScanSnap And ABBYY FineReader
January 5, 2010
Sometimes, when you have to scan a large number of documents at once, the step of doing OCR (making the PDF searchable) after each document can really slow things down. It may be preferable to scan them all in and then OCR them all in one big shot.
In the past I have posted about how to do batch OCR using Adobe Acrobat, along with an Acrobat AppleScript.
Over at the Optimality! blog, Tobi has posted a walkthrough of using ABBYY FineReader, which comes with the ScanSnap S1500M (and S1500 for that matter), to do batch OCR.
The problem is that in the default setup, each scan is OCRed right after the scan, and depending on the age of your machine (my G5 is getting a little long in the tooth), it can take quite a while. When you’re in the process of scanning many hundreds of pages of paper documents, you don’t want to have to wait for the computer to do its OCR; you’d rather feed it all the documents and let it do OCR while you’re doing something else.
Fortunately, this is possible. Reading all the way through the handbook as well as through the ABBYY online help, I found out that you can scan to PDF only, and then afterwards convert the PDFs with ABBYY FineReader.
Check out the post here. Do you have any other tricks for doing batch OCR?
ABBYY FineReader For ScanSnap Update For Snow Leopard OSX 10.6 Now Available
November 18, 2009
When it rains it pours. When Fujitsu released their ScanSnap Update For Snow Leopard, the missing piece was the OCR provided by FineReader. They said it would be released by ABBYY soon, and as of today, it’s out.
The update is for the ScanSnap S1500M and S510M.
Click Here To Download The FineReader Snow Leopard Update. It’s down at the bottom.
You know the deal: let us know in the comments how the update worked out for you!
ABBYY FineReader And Snow Leopard – File Not Created With ScanSnap
August 31, 2009
One issue with the Fujitsu ScanSnap and OSX 10.6 Snow Leopard that I forgot to mention the other day is the ABBYY FineReader that comes bundled with it.
When scanning with the version of FineReader that ships with the ScanSnap S510M and S1500M, you may get an error message like “File not created with ScanSnap”.
This is a known issue and according to this bulletin from Fujitsu Support, it will be fixed “within 2009”.
Fujitsu has assured me that they’re working on it, so hopefully we’re not talking December 31 here!
I personally do not use FineReader. Does anyone have a workaround for the Snow Leopard issue? Leave a note in the comments.
Update: Thanks to reader Spike in the comments for the tip: ABBYY has released a version of FineReader Express Edition that supports Snow Leopard. More info here.
Update #2 Nov 19/09: The ABBYY FineReader for ScanSnap Snow Leopard Update is now available.
ABBYY FineReader and Adobe Acrobat – Why Does Fujitsu Include Both?
April 20, 2009

I have received a number of questions recently about the software that is included with the Fujitsu ScanSnap. For example, why does the ScanSnap come with both ABBYY FineReader and Adobe Acrobat? Aren’t they both for doing OCR?
I suspect part of the reason this question comes up is my posts about my ScanSnap workflow and my Adobe Acrobat OCR AppleScript. Is all that necessary?
Let me start by saying that I personally have the ScanSnap S300M. The S300M comes with neither ABBYY FineReader nor Adobe Acrobat. If you have the S1500 or S1500M, your scanner will come with both, and doing OCR is much more integrated than with the S300M, so my post-scan processing fun may not be necessary.
So What’s The Difference?
The ScanSnap comes with a special version of ABBYY FineReader called FineReader for ScanSnap. They’ve integrated that with ScanSnap Organizer, so if you are using the built-in automatic OCR’ing, that is what it is using.
If all you care about is having your PDFs searchable and don’t mind performing the OCR right after scanning, then the supplied FineReader is probably all you need.
To my mind, there are basically two main reasons why you will want to use Adobe Acrobat:
- You want to do PDF editing after the fact
- You want to batch your OCR after the fact
PDF Editing
So you have your scanned PDF. Now what? If you want to remove/rearrange pages and do a whole ton of other editing functions, Acrobat is a great tool. It is most definitely not just for making a PDF searchable.
You can see a bunch more information for Adobe Acrobat 9 (included with the ScanSnap S1500) and Acrobat 8 (included with the ScanSnap S1500M). You can see from the price that it’s a pretty good deal that this software is included with the ScanSnap.
Batch OCR
If you have a whole bunch of documents to scan in, it may be annoying to scan, sit there and wait for it to OCR, scan, OCR, scan, OCR, and so on. Some people prefer to scan all their documents to PDF in one shot, and then OCR them all at once. You can use Acrobat to do that instead of the included FineReader.
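If you like to script this kind of thing, here is a rough sketch in Python of the "scan everything first, OCR later" bookkeeping. It only works out which PDFs in a scan folder still need OCR and hands each one to a function you supply. The `_ocr` file suffix, the function names, and the `ocr_one` callback are all my own assumptions for illustration; the sketch does not call Acrobat or FineReader itself, so you would plug in whatever OCR step you actually use.

```python
from pathlib import Path
from typing import Callable, List


def pending_pdfs(scan_dir: Path, done_suffix: str = "_ocr") -> List[Path]:
    """Return the PDFs in scan_dir that do not have an OCRed copy yet.

    Assumed naming convention (hypothetical): the OCRed copy of
    "bill.pdf" is saved next to it as "bill_ocr.pdf".
    """
    pdfs = sorted(
        p for p in scan_dir.iterdir()
        if p.is_file() and p.suffix.lower() == ".pdf"
    )
    pending = []
    for pdf in pdfs:
        if pdf.stem.endswith(done_suffix):
            continue  # this file is itself an OCR output
        if pdf.with_name(pdf.stem + done_suffix + ".pdf").exists():
            continue  # an OCRed copy already exists
        pending.append(pdf)
    return pending


def run_batch(
    scan_dir: Path,
    ocr_one: Callable[[Path, Path], None],
    done_suffix: str = "_ocr",
) -> List[Path]:
    """Call ocr_one(source, target) for every pending PDF.

    ocr_one is a placeholder for your real OCR step; it must write
    the searchable PDF to the target path.
    """
    outputs = []
    for pdf in pending_pdfs(scan_dir, done_suffix):
        target = pdf.with_name(pdf.stem + done_suffix + ".pdf")
        ocr_one(pdf, target)
        outputs.append(target)
    return outputs
```

Because finished files are detected by name, you can run the batch again after the next scanning session and only the new scans get processed.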
So there you have it, some of the differences between the two. What are some of the reasons you use one over the other?

