PDFPen OCR Applescript To Automatically Make PDFs Searchable

I don’t know if it is because I have been glued to a computer since I was six years old, but my handwriting and printing are terrible. Really terrible. I think my five-year-old son and I have pretty similar handwriting skills.

Normally this is not a problem, except when I have to fill out a form. It’s a little embarrassing filling out some official form with my chicken scratch, which is one of the many reasons why I love PDFPen. Among many other things, it lets you fill out and edit any PDF document on your computer and then print it out.

However, that ability is not what this post is about. PDFPen will also OCR PDFs to make them searchable, and I wanted a way to OCR a bunch of documents automatically with an AppleScript, similar to what has been done with Adobe Acrobat and with ABBYY FineReader.

I found two scripts out there: one from David Sparks at MacSparky, which some users reported problems with in newer PDFPen versions, and one from Michael Tsai at C-Command Software, which will OCR a document with PDFPen and send it to EagleFiler.

Since both of these scripts were almost what I wanted, I decided to stand on the shoulders of giants and merge them together into this AppleScript.

Here is the script:
-- Downloaded From: http://www.documentsnap.com
-- Last Modified: 2010-09-28
-- Includes code from MacSparky http://www.macsparky.com/blog/2009/5/24/pdfpen-ocr-folder-action-script.html
-- Includes code from C-Command Software http://c-command.com/scripts/eaglefiler/ocr-with-pdfpen

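-- Folder Action entry point: runs whenever files are added to the watched folder
on adding folder items to this_folder after receiving added_items
    try
        repeat with added_item in added_items
            my ocr(added_item)
        end repeat
    on error errText
        display dialog "Error: " & errText
    end try
end adding folder items to


-- Opens one file in PDFpen, runs OCR, waits for it to finish, then saves and closes
on ocr(added_item)
    tell application "PDFpen"
        open added_item as alias
        tell document 1
            ocr
            -- Poll once a second until PDFpen reports that OCR is done
            repeat while performing ocr
                delay 1
            end repeat
            delay 1
            close with saving
        end tell
    end tell
end ocr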

PDFpen Users: Download The Text Script Here (Right-click and Save-As)
PDFpen Pro Users: Download The Text Script Here (Right-click and Save-As)

To implement, follow MacSparky’s excellent instructions.
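
If you want to try the OCR step without setting up a Folder Action first, the same handler can be wrapped in a droplet. This is only a rough sketch, not something I have packaged for download: the ocr handler is the one from the script above, while the processItems handler and the on open / on run wrappers are names I made up for illustration. It assumes the app is called "PDFpen"; if you use the Pro version, change the application name to "PDFpenPro".

```applescript
-- Sketch only: droplet wrapper around the ocr handler from the script above.
-- Save as an Application in the script editor, then drop PDFs onto its icon.

-- Called when files are dropped onto the saved application
on open dropped_items
    my processItems(dropped_items)
end open

-- Called when the application is double-clicked: ask which PDFs to OCR
on run
    set chosen_items to choose file with prompt "Choose PDFs to OCR:" of type {"com.adobe.pdf"} with multiple selections allowed
    my processItems(chosen_items)
end run

-- Hypothetical helper: OCR each item, reporting errors per file
on processItems(the_items)
    repeat with this_item in the_items
        try
            my ocr(this_item)
        on error errText
            display dialog "Error: " & errText
        end try
    end repeat
end processItems

-- Same handler as in the Folder Action script
on ocr(added_item)
    tell application "PDFpen" -- assumption: use "PDFpenPro" for the Pro version
        open added_item as alias
        tell document 1
            ocr
            repeat while performing ocr
                delay 1
            end repeat
            delay 1
            close with saving
        end tell
    end tell
end ocr
```

The error handling sits inside the loop here so that one bad file does not stop the rest of the batch. That is a design choice of this sketch, not how the Folder Action version behaves.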

I hope this is of use to someone, and thanks to David and Michael for their excellent AppleScripts.

16 Responses to “PDFPen OCR Applescript To Automatically Make PDFs Searchable”

  1. Josh September 29, 2010 at 12:50 pm #

    Awesome, thanks! Just thinking though….I'd love to use Hazel instead of Folder Actions if possible.

    Would it be possible to modify the script a bit so that Hazel can monitor for new files in the folder, then call a simpler AppleScript to tell PDFpen to OCR it?

    I wish I knew AppleScript…I would contribute!

    • BrooksD September 29, 2010 at 12:55 pm #

      Hey Josh, that should be doable. Just give me a few days and I should have something for you. Good idea.

    • BrooksD October 1, 2010 at 10:09 am #

      Hey there Josh, give this a try: http://www.documentsnap.com/hazel-rule-to-ocr-doc

      Hope it helps!

    • Thena December 22, 2011 at 6:25 pm #

      We definitely need more smart people like you around.

  2. ToddPeperkorn December 15, 2010 at 7:21 am #

    I am trying to use this script and it wants to ask what language to scan the document in. Is there a way of making that a part of the script?

    • BrooksD December 15, 2010 at 7:28 am #

      Hm strange, I'll take a look and let you know.

  3. ToddPeperkorn December 15, 2010 at 7:40 am #

    Something is also causing the script to force PDFPen to close. I will email you the error off-list if you would like to peek at it.

  4. Niv September 4, 2011 at 3:35 pm #

    Hi Brooks,

    I tried to save the script in Folder Action Scripts like D.Sparks recommends but I get the following error: The document “Untitled” could not be saved as “Untitled.scpt”.
    I have tried all sorts to save any script anywhere, but all instances fail.
    I am running Lion, does anyone else have this problem, or know of a solution?

    • Niv September 4, 2011 at 4:04 pm #

      Don't worry – I have solved it. Folder permissions!!!

      • BrooksD September 4, 2011 at 5:34 pm #

        Great Niv, and great to hear it works on Lion.

  5. Chris November 4, 2011 at 11:14 am #

    Brooks:

    I'm having 2 problems with this script I'm hoping you can help with.

    1. It asks me what language I want to use for OCR. This adds an unnecessary step. Can the script be modified to specify "English"?

    2. PDFPen quits unexpectedly at the end of the script.

    Thanks for any help you can provide.

  6. Chris November 4, 2011 at 11:36 am #

    Also: any chance of turning this into a droplet? Thanks!

  7. Patch January 28, 2012 at 6:28 am #

    I was inspired by this to create my own workflow which consists of the following steps:

    Scan with Fujitsu ScanSnap > OCR with PDFpenPro > Export to Yojimbo

    My goal was to automate all of this when the Scan button is pushed on the scanner. The following script accomplishes just that.

    If you save it as an application it also functions as a droplet. I'm sure it can be easily modified to export to other applications if you're not a Yojimbo user.

    Note: you will also need to DISable the preference to automatically OCR scanned documents in PDFpenPro, else you'll get that annoying dialog about language preference.

    -- SCRIPT --
    on open ScannedDocument
        tell application "PDFpenPro"
            activate
            open ScannedDocument
            ocr document 1
            repeat
                if performing ocr of document 1 is false then
                    exit repeat
                end if
            end repeat

            save document 1
            set documentPath to path of document 1

            tell application "Yojimbo"
                import documentPath
            end tell

            close document 1 --delete this if you want the doc to stay open
            quit --delete this if you want PDFpen to stay open
        end tell
    end open

  8. Patch January 28, 2012 at 6:29 am #

    Ah, one important step that I forgot, you also need to set the saved application as the target in the scan software preferences.

Trackbacks/Pingbacks

  1. My Doxie Go Wireless Automated Workflow | Tips To Learn How To Go Paperless | DocumentSnap Paperless Blog - July 24, 2012

    [...] This rule watches toPDF for any PDF files, and if it finds any, it runs an AppleScript which calls PDFPen to do the OCR. I mainly did it this way because I already had a script that performs OCR with PDFPen. [...]
