OCR Your ScanSnap PDF Before Sending It To Evernote

Update: Of course, a few days after I posted this, Evernote announced that they would make PDFs searchable for Premium users. So if you are not a Premium user, this will help. Otherwise, just upload away.

One of the most popular posts on this site is on how to use the Fujitsu ScanSnap with Evernote. It describes how to set up a profile in ScanSnap Manager to send the resulting PDF to Evernote.

There is one problem with doing it this way – Evernote does not OCR PDFs. I assume they’ll be fixing this someday, but for now, if you want your document searchable within Evernote, you need to OCR it before sending it into Evernote.

How you do this depends on which ScanSnap model you have, and whether you are on Windows or a Mac.

ScanSnap For Windows

If you have the ScanSnap S300, S510, or S1500, your solution is pretty simple.

What we’re going to do is set Evernote to watch a folder so that it automatically imports anything it finds there. Then we set up ScanSnap to save files to that folder.

  • In Evernote, go to File -> Import -> File Import Wizard
  • Hit Next, select the Source folder that you want Evernote to watch, and choose the notebook to import into
  • Choose “Watch folder for changes and import files automatically”

Now set up ScanSnap normally to scan to that folder you just selected, and whatever files you save into that folder will be grabbed by Evernote.
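
If you are curious what a "watch folder" boils down to, here is a minimal Python sketch of the idea: look at a directory and report any PDFs that have not been seen before. It only illustrates the concept and is not how Evernote implements its import folder. The function name find_new_pdfs is made up for this example.

```python
from pathlib import Path


def find_new_pdfs(folder, seen):
    """Return PDFs in `folder` that are not yet in `seen`, and remember them.

    `seen` is a set of file names that have already been handled, so a
    second call only reports files added since the first call.
    """
    new_files = []
    for path in sorted(Path(folder).iterdir()):
        if path.suffix.lower() == ".pdf" and path.name not in seen:
            seen.add(path.name)
            new_files.append(path)
    return new_files
```

Calling this every few seconds in a loop, and handing each returned path to an importer, would approximate what a watched folder does.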

ScanSnap S510M or S1500M For Mac

For whatever reason, Evernote for the Mac does not have the Watch Folder functionality that the Windows client does (why not, Evernote?!). However, thanks to the magic of AppleScript, we can do the same thing.

This will work for the ScanSnap S510M or S1500M.

  • Download this file – AddToEvernote.scpt and save it to /Library/Scripts/Folder Action Scripts
  • Create or select a folder that you want scanned PDFs to go into. Right-click on it and select More and then Enable Folder Actions
  • Right-click on the folder again and select More and then Attach a Folder Action. Select the AddToEvernote script that you just saved

Now set up ScanSnap normally to scan to the folder that you just configured. When you add a PDF to it, the AppleScript will go through that folder and add the files into Evernote. Handy!
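
If you would rather drive the import from your own script instead of a Folder Action, the sketch below shows one way it might look in Python. It is not the contents of AddToEvernote.scpt. It hands each PDF to the macOS osascript command-line tool and relies on Evernote's "create note from file" AppleScript command. Converting the path with "POSIX file" is an assumption about what Evernote accepts, and the helper names build_command and add_to_evernote are invented for this example.

```python
import subprocess

# AppleScript that imports the file whose path arrives as the first argument.
# "create note from file" is Evernote's AppleScript command; converting the
# path with "POSIX file" is an assumption about how Evernote accepts it.
EVERNOTE_SCRIPT = """
on run argv
    set pdfFile to POSIX file (item 1 of argv)
    tell application "Evernote"
        create note from file pdfFile
    end tell
end run
"""


def build_command(pdf_path):
    """Build the osascript command line that imports one PDF."""
    return ["osascript", "-e", EVERNOTE_SCRIPT, str(pdf_path)]


def add_to_evernote(pdf_paths, run=subprocess.run):
    """Import each PDF into Evernote; `run` can be swapped out for testing."""
    for pdf_path in pdf_paths:
        run(build_command(pdf_path), check=True)
```

Passing the path as an argument, rather than pasting it into the script text, avoids having to escape quotes in file names.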

ScanSnap S300M For Mac

For whatever reason (I say that a lot), the ScanSnap S300M does not come with OCR software (why not, Fujitsu!?).

However, we’re in luck. Awesome DocumentSnap reader Sebastian Poll wrote this AppleScript that will use Adobe Acrobat to automatically OCR the PDF and then kick it straight into Evernote.

Obviously, it requires Acrobat. If you don’t have Acrobat, you can OCR the PDF with whatever method you currently use and then use the AddToEvernote script above to import it.

Note that Sebastian’s version was actually written with some of the code in German. I changed it to English, so if there are problems, it is probably my fault and not his.

  • Download this file – OCREvernote.scpt and save it to /Library/Scripts/Folder Action Scripts
  • Create or select a folder that you want scanned PDFs to go into. Right-click on it and select More and then Enable Folder Actions
  • Right-click on the folder again and select More and then Attach a Folder Action. Select the OCREvernote script that you just saved

Now set up ScanSnap normally to scan to the folder that you just configured. When you add a PDF to it, the AppleScript will go through that folder, OCR the files with Acrobat, and then add them into Evernote.

Do you use the ScanSnap with Evernote? Do you have any other methods of making PDFs searchable? Or do you not bother? Leave a message in the comments.

About the Author

Brooks Duncan helps individuals and small businesses go paperless. He's been an accountant, a software developer, a manager in a very large corporation, and has run DocumentSnap since 2008. You can find Brooks on Twitter at @documentsnap or @brooksduncan. Thanks for stopping by.

25 comments

Vincent - March 28, 2011

Hi, great little script. Just one question: how can I change the script so that the scans get sent to an Evernote notebook other than the default notebook? Is it possible to change the script so that it sends the scans to a specified notebook?

    Brooks Duncan - March 29, 2011

    Hi Vincent,

    According to this page: http://www.evernote.com/about/developer/mac.php I think it should be possible. Change this section:

    tell application "Evernote"
    activate
    create note from file this_item
    end tell

    To this:

    tell application "Evernote"
    activate
    set notebook1 to create notebook "MyNotebook"
    create note from file this_item notebook notebook1
    end tell

    I haven't personally tested it, but give it a try and hopefully it will work!

Ming - September 26, 2010

Just found out that you can open the PDF file in Evernote with the application 'convert to searchable PDF' (if you have FineReader installed). It will convert the PDF file to a searchable PDF file in place inside Evernote.

    Brooks Duncan - October 27, 2010

    Great suggestion Ming! Thanks!

James - October 31, 2009

Thanks for the awesome post!

mike - October 5, 2009

Thanks for this. I don't have the ScanSnap, but the AppleScript to create the watched folder on Mac was useful. Really don't know why they don't have that functionality in the Mac version.

    Brooks Duncan - October 5, 2009

    Great to hear it worked. Hopefully they'll add it at some point because it's obviously a useful feature.

keeril - August 24, 2009

Thanks for the quick reply!

keeril - August 24, 2009

Hi,

This link doesn't work any more:
http://www.documentsnap.com/loves/OCREvernote.scp
Could you update?

Thanks,
Keeril

    Brooks Duncan - August 24, 2009

    Weird, I could have sworn I fixed that before but maybe not. It's fixed now. Sorry about that!

Marc - August 14, 2009

Somebody help 🙁

Marc - August 10, 2009

Hi BrooksD,

since I’m not receiving any answer and there seems to be something weird with date and time when I’m posting, I just wanted to deactivate this thread in case you had not been notified with my 2 previous posts (although they aren’t, they seem to be older than your post).

Thanks a lot,

Marc

    Brooks Duncan - August 11, 2009

    Hi Marc,

    Not sure what you mean exactly but when you say you want to deactivate the thread, does that mean you figured it out? What was the issue?

      Marc - August 12, 2009

      Hi BrooksD,

      first of all, my apologies. "Deactivate" has a typo, I wanted to write "Reactivate"!! :), let me explain…

      I've been experiencing some really strange behaviour with this site. Last Friday I replied (from my iPhone) almost immediately when I was notified of your answer but, for some reason, my reply was posted before yours (my post was considered to be older than yours).

      Since I had not received any response by yesterday, I thought that, due to this weird behaviour, you might not have been notified of my response (the system was considering your post to be newer).

      To top it off, now from a computer, I see the real order (check my posts related to this strange behaviour, they all refer to it). There must be something wrong with my iPhone, I don't know…

      Let me paste here my previous post in which I answered to your detailed explanation:

      "Dear BrooksD,

      thanks a lot for your comment. It is almost as you describe, but there is no step 3. ScanSnap Manager v5.0 for some reason saves the non-OCRed PDF file to the desired folder, does the OCR process and automatically saves the OCR jpeg on the desktop without me doing any sort of process whatsoever.

      So after step 2 comes step 4 and I end up with a non-OCRed PDF in Evernote and the jpeg file on the desktop.

      Is v5.0 buggy? There doesn't seem to be much freedom of configuration and FineReader v4.0 is also poor with regard to customization."

      Hope I can readdress this issue,

      Thanks in advance,

        Brooks Duncan - August 14, 2009

        Hi Marc,

        Unfortunately, I don't have ScanSnap Manager 5.0, so there is only so much help I can give, but I'll do my best.

        First I guess I should ask – are you on Mac or Windows? I'm guessing Windows.

        There are a few things I don't understand:

        -If you are wanting something OCR'ed, I don't think you want to be scanning to a picture folder. Generally OCR is for PDFs and not JPGs. What happens if you just scan to a normal folder and not a "picture folder"?

        -What happens when you completely take Evernote out of the mix? Just scan to a folder that is not watched by Evernote or anything else. What do you get when you scan? Do you still get a JPG and a PDF?

        If you are still getting duplicate files after taking Evernote or anything else out of the mix, you may want to contact Fujitsu support and hopefully they can tell you why you are getting 2 files.

        Otherwise, if you want, take some screenshots of your setup and email them to brooks -at- documentsnap.com and I can try to take a look.

Marc - August 7, 2009

Hey BrooksD,

what weird behaviour, my answer has been posted before yours… I wonder how that has happened… 🙂

looking forward to hearing from you 🙂

marc

Marc - August 7, 2009

Dear BrooksD,

thanks a lot for your comment. It is almost as you describe, but there is no step 3. ScanSnap Manager v5.0 for some reason saves the non-OCRed PDF file to the desired folder, does the OCR process and automatically saves the OCR jpeg on the desktop without me doing any sort of process whatsoever.

So after step 2 comes step 4 and I end up with a non-OCRed PDF in Evernote and the jpeg file on the desktop.

Is v5.0 buggy? There doesn’t seem to be much freedom of configuration and FineReader v4.0 is also poor with regard to customization.

Thanks a lot. I really appreciate your help!!!

Marc

Marc - August 7, 2009

Somebody help 🙁

Marc - August 5, 2009

Hi there,

Andrew, I would very much appreciate you elaborating a little more on how you tuned FineReader… I'm experiencing the same problems and even something new. I scan all my business cards as jpeg using the Scan to Picture folder feature and I get a funny result. I get the temp non-OCR file in the watched folder and the searchable jpeg on the desktop… Guess the results… Please help

thanks a lot

    Brooks Duncan - August 7, 2009

    Hi Marc,

    Can you walk us through the process?

    1) You choose Scan To Picture Folder in ScanSnap Manager. That has just the default settings and scans to your Evernote watched folder

    2) You put the biz card in and hit scan. It saves it in the Evernote folder but it is a PDF that is not OCR'ed

    3) You run some sort of process and after that, a JPG is saved on your Desktop

    4) Evernote imports the non-OCR'ed PDF and not the JPG which you wanted in the first place

    Does that sound like an accurate representation of your problem?

Andrew - July 22, 2009

Thanks for the script

It works well, but I'm getting a non-OCR copy sent to Evernote straight after scanning. Any way to stop this from happening? I'm on OS X using an S510M

The OCR copy gets added to Evernote after local OCR, which is great

Thanks

Andrew

    Brooks Duncan - July 22, 2009

    Hey Andrew, that's weird. What do you have your profile set to? Scan to folder or Evernote?

    Maybe both ScanSnap Manager and the script are sending to Evernote.

      Andrew - July 22, 2009

      I've got it set to FineReader for ScanSnap saving to a local folder and the application is set to Scan to Searchable PDF

      I just tried it again and I'm still getting 2 copies

      For what it's worth, I have previously used the Evernote application, but I've switched back

        Brooks Duncan - July 23, 2009

        Hi Andrew. I have a theory about what is happening.

        When ScanSnap Manager is saving the file before OCRing, the folder action is seeing it and then kicking it into Evernote.

        Then when FineReader does the OCR and resaves the file to that same directory, the folder action is kicking off again and sending the now-OCR'ed file to Evernote again.

        I think what you'll need to do is have FineReader save the searchable PDF to a separate folder and have the folder action attached to THAT folder, so that only the searchable PDF gets imported, if that makes sense?

          Bluedog242 - July 23, 2009

          Hi Brooksd,

          Yes, you're right. I've changed my default folder to where I save my non-OCR copy, which I use as a backup. Then when I'm scanning I manually select the folder which has the script, and it works beautifully

          Thanks for the help

          Andrew
