What is the holy grail of going paperless?
There are a lot of tricks out there for keeping your documents organized based on their location or filename, but the holy grail is to be able to keep them organized based on the actual contents of the documents themselves.
In other words, our computer does the work for us.
I have written before about how the Fujitsu ScanSnap allows you to use a highlighter pen to automatically assign keywords to a PDF.
However, once you have those keywords assigned, how does that help you?
If you’re on Windows, you can use the Distribute By Keyword feature of the included ScanSnap Organizer to move the files to a cabinet, but Mac users are out of luck there.
I humbly submit that using a highlighter, OCR, and the awesomeness that is Hazel, Mac users can one-up even the mighty ScanSnap Organizer.
What Is Hazel?
For years now, I have been engaged in a torrid love affair with a Mac application known as Hazel from Noodlesoft. At a very high level, it lets you create rules to automatically keep your files organized.
I have written about how you can use Hazel with Evernote, and David Sparks at Macsparky has a great guide for moving PDFs based on filename.
I wanted to do something that would marry the searchable goodness of the ScanSnap with the ninja skills of Hazel.
Set Up The ScanSnap For Keyword Highlighting
The first thing you’ll need to do is set up a ScanSnap Manager profile to read highlighted text and make keywords out of it.
First, on the Scanning tab, I have had the best luck setting the Image quality to “Best” (300dpi). At anything lower, the ScanSnap wasn’t picking up the keywords consistently.
Then on the File Option tab, make sure that “Set the marked text as a keyword for the PDF file” is checked. That will tell it to look for any highlighted text and turn it into a keyword in the PDF.
You will, of course, want to choose a folder to save the PDF to. Make a note of this folder because we will need it when we switch to Hazel. In my case it is called ToMove.
Get Out Your Highlighter
Is it Hi-liter or Highlighter? I never know. Anyway, now take your pen and highlight the word or phrase that you want to use to file the document.
Essentially what we will be doing is saying “if the PDF contains this keyword, do something with it”.
All I have handy are grocery receipts, so you can see I highlighted “EXTRA FOODS”.
Scan And Check Keywords
Now scan your document using your shiny new ScanSnap Manager profile. When it is done, open up your new PDF in Preview, go to Tools > Inspector (or hit Cmd-I), and click on the magnifying glass. If everything worked properly, you should see the text that you highlighted.
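If you would rather check from a script than from Preview, here is a minimal Python sketch. It assumes you are on a Mac, that Spotlight has already indexed the scan folder, and that the PDF keywords show up under the Spotlight attribute kMDItemKeywords. The helper names `parse_mdls_keywords` and `pdf_keywords` are made up for this example, and the parser assumes the usual parenthesized list that `mdls` prints for list attributes.

```python
import subprocess


def parse_mdls_keywords(output: str) -> list[str]:
    """Pull keyword strings out of `mdls -name kMDItemKeywords` output.

    Assumes a parenthesized list with one item per line; anything else
    (for example "(null)") yields an empty list.
    """
    keywords = []
    inside = False
    for line in output.splitlines():
        text = line.strip()
        if text.endswith("("):      # opening line: kMDItemKeywords = (
            inside = True
            continue
        if text.startswith(")"):    # closing line of the list
            break
        if inside and text:
            # Drop the trailing comma and the surrounding quotes, if any.
            keywords.append(text.rstrip(",").strip('"'))
    return keywords


def pdf_keywords(path: str) -> list[str]:
    """Ask Spotlight for the keywords of one file (macOS only)."""
    result = subprocess.run(
        ["mdls", "-name", "kMDItemKeywords", path],
        capture_output=True, text=True, check=True,
    )
    return parse_mdls_keywords(result.stdout)
```

Calling `pdf_keywords` on the freshly scanned receipt should give you back the text you highlighted. An empty list means the keyword did not make it into the PDF, or Spotlight has not indexed the file yet.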
Set Hazel To Move Based On Keyword
Let’s say we want to move any PDF with the keyword “EXTRA FOODS” to a folder called Filed Documents (we’d probably want to move it to a grocery-specific folder, but let’s just pretend).
Open up Hazel and on the left side, click the Plus to add a new folder. Add your ToMove folder that you used as a scan destination in ScanSnap Manager.
Now in the right pane, click the plus to add a new rule. Give it a name.
You can set a number of criteria and rules here, but to keep it simple we will leave it as “all conditions”, then set:
- Kind is PDF
- Keywords contain EXTRA FOODS
Next, set the action to Move the file to the folder Filed Documents.
Hit OK to save it. If you want to see what your rule will catch, you can click on the little Gear icon near the bottom and choose “Preview Rule Matches”. If everything is set up properly, your newly-scanned document should show there.
If it doesn’t show, check the PDF to make sure that it really has keywords and re-check your rule setup.
If your document shows in the preview, either wait for Hazel to do its thing, or click on the Hazel icon in the Menu bar, choose Run Rules, and choose the rule that you just created.
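To make the logic of that rule concrete, here is a rough Python sketch of what it is doing. This is not how Hazel works internally, just the same idea written out. The names `rule_matches` and `run_rule` are hypothetical, the keywords are passed in as a plain dictionary instead of being read from the PDF, and the case-insensitive comparison is my assumption.

```python
import shutil
import tempfile
from pathlib import Path


def rule_matches(path: Path, keywords: list[str], wanted: str) -> bool:
    """"All conditions": kind is PDF AND keywords contain `wanted`."""
    is_pdf = path.suffix.lower() == ".pdf"
    # Assumption: compare case-insensitively, as a substring of any keyword.
    has_keyword = any(wanted.lower() in k.lower() for k in keywords)
    return is_pdf and has_keyword


def run_rule(watched: Path, dest: Path,
             keyword_index: dict[str, list[str]], wanted: str) -> list[str]:
    """Move every matching file from `watched` to `dest`; return moved names."""
    dest.mkdir(parents=True, exist_ok=True)
    moved = []
    for path in sorted(watched.iterdir()):
        keywords = keyword_index.get(path.name, [])
        if path.is_file() and rule_matches(path, keywords, wanted):
            shutil.move(str(path), str(dest / path.name))
            moved.append(path.name)
    return moved


if __name__ == "__main__":
    # Try it in a throwaway folder so no real documents are touched.
    with tempfile.TemporaryDirectory() as tmp:
        to_move = Path(tmp) / "ToMove"
        to_move.mkdir()
        (to_move / "receipt.pdf").write_bytes(b"")
        index = {"receipt.pdf": ["EXTRA FOODS"]}
        print(run_rule(to_move, Path(tmp) / "Filed Documents", index, "EXTRA FOODS"))
```

Both conditions have to be true before anything moves, which is the same thing the “all conditions” setting asks for.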
Set Hazel To Rename Based On Keyword
Let’s say that instead of moving a file based on a certain keyword, we want to give our files a name based on the highlighted text. Is this possible? Why yes, yes it is. Let’s use our new Hazel Ninja powers and do it.
Create a new Hazel rule as we did before, but this time for the criteria, set this:
- Kind is PDF
- Keywords is not blank
Next, in the “Do the following” section, choose “Move file” to folder “Filed Documents” (if you like), and then set up the following:
- Choose Rename file
- In the with pattern section it will say “name” and then “extension”. Click on “name” and hit the delete key. We want to get rid of that.
- Let’s give the filename a date. Drag “date created” up before extension. If you prefer, click the little down arrow in “date created” and choose Edit Date Pattern and change to whatever pattern you choose.
- Drag “other” up between “date created” and “extension”. It will ask you to select a Spotlight Attribute. Scroll down to find Keywords and hit Select.
- If you prefer, click on the little down arrow in “keywords” and change which keywords are selected and how they are formatted.
- You might want to click between “date created” and “keywords” and put a dash, but that is up to you.
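Your final rule should end up with a rename pattern of “date created”, then “keywords”, then “extension”.
Now when we scan that same Extra Foods receipt, our Hazel rule will move the file to Filed Documents and rename it with the creation date followed by the highlighted keyword.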
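If it helps to see the rename pattern spelled out, here is a small Python sketch of the same idea. The function name `build_filename`, the year-month-day date format, and the dash separator are all my assumptions for illustration. Hazel lets you pick your own date pattern and keyword formatting, and the date in the usage line below is made up.

```python
from datetime import date


def build_filename(created: date, keywords: list[str],
                   extension: str, separator: str = "-") -> str:
    """Build "<date created><separator><keywords>.<extension>"."""
    stamp = created.strftime("%Y-%m-%d")   # assumed date pattern
    label = " ".join(keywords)             # assumed: keywords joined by spaces
    return f"{stamp}{separator}{label}.{extension}"


# A made-up creation date, just to show the shape of the result.
print(build_filename(date(2012, 3, 14), ["EXTRA FOODS"], "pdf"))
# prints "2012-03-14-EXTRA FOODS.pdf"
```

Putting the date first means the files sort chronologically in the Finder, which is a good reason to drag “date created” to the front of the pattern.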
Forget Keywords, Use Hazel To Move Based On Searchable Text
Let’s say you want to forget about this whole highlighter/keyword thing. You already have scanned and searchable PDFs. Can’t you just move based on the OCR’ed text in the documents? Let’s find out.
So you really, really like the vegetable kale and you want to move any scanned receipt that has the word Kale in it (can you tell all I had around for this demo is grocery receipts?).
First, here is our receipt:
Next, we obviously need to be using a ScanSnap Manager profile that has “Convert to searchable PDF” checked on the File Option tab. Again, you will have better results if you use 300dpi for Image quality.
Now we set up another Hazel rule, this time using the following criteria:
- Kind is PDF
- Contents contain Kale
Then do something with it such as move it to Filed Documents.
Now when you scan a document that has the word “Kale” in it, Hazel will move it.
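Here is a tiny Python sketch of what a “contents contain” check boils down to, assuming a simple case-insensitive text search (I am guessing at the exact matching Hazel does). The function name `contents_contain` and the sample receipt text are made up for this example.

```python
def contents_contain(ocr_text: str, term: str) -> bool:
    """True if `term` appears anywhere in the OCR'ed text, ignoring case."""
    return term.casefold() in ocr_text.casefold()


good_scan = "EXTRA FOODS\nKALE BUNCH 2.99\nTOTAL 2.99"
bad_scan = "EXTRA FOODS\nKA1E BUNCH 2.99\nTOTAL 2.99"  # OCR read the L as a 1

print(contents_contain(good_scan, "Kale"))  # prints True
print(contents_contain(bad_scan, "Kale"))   # prints False
```

The second receipt shows why scan quality matters: one misread character and the rule never fires.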
Bonus: You can even have Hazel read the dates from the text of the PDF and use them in your filename. Here is how to do that.
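As a rough idea of what pulling a date out of the text involves, here is a Python sketch. It is not Hazel's date matching. It assumes dates are printed as month/day/year with a four-digit year, and the name `first_date` and the sample text are hypothetical.

```python
import re
from datetime import date, datetime
from typing import Optional

# Assumed receipt format: month/day/year, e.g. 03/14/2012.
DATE_PATTERN = re.compile(r"\b(\d{1,2}/\d{1,2}/\d{4})\b")


def first_date(ocr_text: str) -> Optional[date]:
    """Return the first valid month/day/year date in the text, if any."""
    for match in DATE_PATTERN.finditer(ocr_text):
        try:
            return datetime.strptime(match.group(1), "%m/%d/%Y").date()
        except ValueError:
            continue  # looked like a date but was not one (e.g. 99/99/2012)
    return None


print(first_date("EXTRA FOODS 03/14/2012 KALE BUNCH 2.99"))
# prints "2012-03-14"
```

A date found this way could then be used in the filename in place of the file's creation date.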
(By the way, if you’re a Windows user, there is a similar tool called File Juggler.)
There Is A Lot You Can Do With Hazel
These were a few examples of things you can do in Hazel to be a document management ninja. Hopefully it will give you some ideas.
Remember that OCR is never 100% perfect, and the effectiveness of these rules will be dependent on the quality of the scan and OCR.
Do you have other Hazel-eriffic document tricks? Drop a comment and let us know.