Update: This post is now slightly out of date as I now use the ScanSnap S1300. You may want to sign up for my free 7-part e-Course, which will more comprehensively take you through the steps to go paperless.
This is Part 2 of the My ScanSnap Setup And Workflow series. Make sure to check out Part 1 – ScanSnap Settings.
Now that we have set up ScanSnap Manager with my four profiles, here is what I do with the files.
At first I started using DevonThink Pro Office, but I found that it was a little overkill for my needs. If I had a huge amount of documents that I needed regular access to it would be perfect, but for my home needs I wanted to go with something a little more lightweight.
One main drawback (maybe the only one) is that my ScanSnap S300M did not come with any OCR software in the box. I could download a form to have ReadIris Pro 11 mailed to me, but that didn’t help me at first.
Really Boring, Really Fast
Excited to start OCR’ing up a storm, I set ScanSnap Manager to output to Acrobat and away I went.
It worked quite well. I would hit the button, it would open the resulting file in Acrobat, and then I would go Document | OCR Text Recognition | Recognize Text Using OCR and follow the resulting menus.
I think this would be OK normally, but since I had a ton of things to scan from my file cabinet, sitting there and manually OCR’ing every document over and over again got really boring, really fast. I knew there had to be a better way.
AppleScript To The Rescue
I am a complete AppleScript newbie, but I found this great post from Macworld where the author made an AppleScript Folder Action that would watch a certain folder, and when a document got put in it, it would kick off Acrobat (or ReadIris Pro) and OCR it automatically.
I set ScanSnap Manager to save to a folder called ToProcess and gave that folder a Folder Action to run the Macworld script.
This worked quite well, and would possibly work OK on an ongoing basis, but again I ran into problems when doing my massive scan-a-thon: if I dropped a document into the folder while the previous Acrobat session was still OCR’ing, it would give error messages.
Droplets Are Fun
The solution I came up with was to change the script so that it became a droplet. To do this I ripped off part of the script referenced in this thread.
A droplet is just an Application that you save somewhere (I have it on my Dock). You run it by dragging a file onto its icon.
Here is the script that I cobbled together. Feel free to download and use as you please.
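For anyone who has not seen one before, the skeleton of a droplet looks roughly like this. It is a minimal sketch and not the downloadable script itself: the `ocrFile` handler is a hypothetical placeholder that marks where the Acrobat (or ReadIris) automation would go, and here it only logs the path of each file.

```applescript
-- Minimal droplet sketch. Save it as an Application from the script editor.
-- "ocrFile" is a hypothetical placeholder name, not a real Acrobat command.

on open droppedItems
	-- droppedItems is the list of files that were dragged onto the icon
	repeat with anItem in droppedItems
		-- handle one file at a time so OCR sessions never overlap
		my ocrFile(contents of anItem)
	end repeat
end open

on ocrFile(pdfFile)
	-- the real script would drive Acrobat here; this stub only logs the path
	log (POSIX path of pdfFile)
end ocrFile
```

Because the `open` handler walks through the dropped files in order, each document is dealt with before the next one starts, which is what avoids the overlapping-session errors I hit with the Folder Action.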
So now, I have the following workflow:
- Scan document using the ScanSnap, ScanSnap Manager saves the file in the ToProcess folder
- When I am done with my batch, I drag the PDF files onto the OCRIt icon, which kicks off Adobe Acrobat Professional and tells it to recognize the text in the document
- When that is done, process/move the files as needed
It is working quite well for me, but I guess if I wanted to avoid all this I could have just stuck with DevonThink, as it has built-in OCR. What is your workflow? How do you handle the Optical Character Recognition part?