Automate via Hazel | Share Your Paperless Workflow | Forum

 

Automate via Hazel


3:55 pm
May 19, 2011


wannabgeek

Member

posts 9

Hi,

As a convert to the principles of a paperless office, I have designed, revised and scrapped a number of attempts in my search for the perfect routine.

Previously I utilised OpenMeta tags to assist in renaming and filing documents, but ultimately I found that it was just another added step that did not return any real benefit.

I prefer to use a hierarchical file structure to store my documents, which, coupled with content and file name searches via Spotlight, is sufficient for file retrieval.

My process for converting documents starts with sorting the paper documents into 3 piles:

  • Single sided
  • Double sided, and
  • Multi page

I have profiles set up in ScanSnap Manager that correspond to each pile, making it a quick exercise to scan the documents, which I generally let build up and process weekly. I use the OCR function, restricted to the first page, as that will generally capture the information needed for file retrieval. The files are all saved to a folder called Intray.

I utilise Hazel rules that move the files to the appropriate folder under a main folder called Filing Cabinet. Each rule searches for content that is particular to regular bills and documents, e.g. supplier name or account number. I find that I can automatically file about 75% of the documents that I scan.
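To make the idea concrete outside of Hazel, here is a minimal Python sketch of content-based filing. It is not how Hazel is implemented, and the supplier names, account number and subfolders are made-up placeholders.

```python
from pathlib import Path

# Hypothetical rules: (text to look for in the OCRed page, subfolder to file under).
RULES = [
    ("acme energy", "Utilities/Acme Energy"),
    ("account 12345678", "Insurance/Example Mutual"),
]

def destination_for(ocr_text, cabinet=Path("Filing Cabinet")):
    """Return the target folder for a scan, or None to leave it in Intray."""
    text = ocr_text.lower()  # match case-insensitively
    for needle, subfolder in RULES:
        if needle in text:
            return cabinet / subfolder
    return None  # no rule matched, so the file needs manual filing
```

Anything that returns None corresponds to the roughly 25% of scans that still have to be dragged into place by hand.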

I have another Hazel rule that processes all files under the Filing Cabinet folder. First I rename the file to ‘-’, then I rename it to the file path, e.g. /users/(user name)/documents/Filing Cabinet/Category/Supplier Name. (I can have as many levels as I want; on some categories I add another directory indicating the financial year.)

I then rename the file to ‘date created – name #.pdf’. The date format I use is yymmdd, as this makes it possible to sort in date order by file name. I add the # as a unique identifier for those circumstances where multiple documents from the same supplier are processed on the same day.

I then use an Automator workflow to remove the leading part of the filename so that the file is renamed as yymmdd – Category/Supplier Name/#.pdf.

I then run another workflow that removes the ‘/’ characters, so the final name is yymmdd – category supplier #.pdf.
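For anyone who would rather script the renaming, the steps above collapse into one small function. This is only a sketch in Python, not the Hazel and Automator setup itself; the function name, the example path and the plain hyphen separator are my own assumptions.

```python
from datetime import date
from pathlib import PurePosixPath

def filed_name(path, created, seq, cabinet="Filing Cabinet"):
    """Build 'yymmdd - category supplier #.pdf' from a file's folder path."""
    parts = PurePosixPath(path).parent.parts
    # Keep only the folders below the Filing Cabinet root, dropping the
    # /Users/(user name)/Documents/ prefix that is the same for every file.
    below = parts[parts.index(cabinet) + 1:]
    # Joining with spaces does the job of removing the '/' characters.
    label = " ".join(below)
    return f"{created:%y%m%d} - {label} {seq}.pdf"
```

For example, `filed_name("/Users/me/Documents/Filing Cabinet/Utilities/Acme Energy/scan.pdf", date(2011, 5, 19), 1)` gives `110519 - Utilities Acme Energy 1.pdf`. Extra levels, such as a financial year folder, simply become extra words in the name.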

The final step is to add the Spotlight comment ‘Filed’. The Filing Cabinet rule ignores documents with that comment, which allows for manual naming of documents if a more descriptive title is required.

For those files that I acquire from other sources, primarily online or from emails, I have an Action Wizard set up in Adobe Acrobat that will OCR all selected files. This then triggers the Hazel rule on the Intray and starts the filing process.

For the remaining files that are not automatically filed, I just need to drag them to the appropriate folder under Filing Cabinet and the renaming function kicks in and does its magic.

Typically, I am still tinkering with this routine, but with 75% of my scans being filed and renamed without any interaction from me, I am pretty pleased.

Images and more details can be found at my site wannabgeek.com

3:57 pm
May 19, 2011


Brooks

Vancouver, BC

Admin

posts 203

This is awesome! Is your initial filing Hazel rule operating on the OCRed text, or the filename?

4:12 pm
May 19, 2011


wannabgeek

Member

posts 9

Post edited 5:14 pm – May 19, 2011 by wannabgeek


Hi,

It's on the OCRed text.

There are multiple rules, each looking for content that identifies the provider of the document and then filing it in the corresponding folder. This part is a bit of trial and error, but you can get it to recognise most documents, e.g. content contains american express (account number).

A tip for beginners is to put your credit card statement rules first in Hazel. This avoids false matches where a supplier's name appears on the statement because you paid that supplier via the credit card.
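The effect of rule order is easy to see in a small Python sketch. It assumes rules are checked top to bottom and the first match wins, and the supplier names and folders are invented for illustration.

```python
# Hypothetical rules, checked top to bottom; the first match wins.
RULES = [
    ("american express", "Credit Cards/American Express"),
    ("acme energy", "Utilities/Acme Energy"),
]

def first_match(ocr_text, rules=RULES):
    """Return the folder of the first rule whose text appears in the scan."""
    text = ocr_text.lower()
    for needle, folder in rules:
        if needle in text:
            return folder
    return None

# A card statement that also mentions a supplier paid with the card.
statement = "American Express statement: payment to Acme Energy"
```

With the card rule first, `first_match(statement)` returns the credit card folder. With the order reversed, the same statement would be filed under the utility supplier, which is the mismatch the tip is meant to avoid.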

5:36 pm
May 19, 2011


Brooks

Vancouver, BC

Admin

posts 203

Thanks for the tip! Having your documents automatically filed by the contents is pretty much the holy grail.

2:45 am
May 22, 2011


wannabgeek

Member

posts 9

Post edited 2:46 am – May 22, 2011 by wannabgeek


I added a video on my blog of the Hazel rules in action to demonstrate how my process works.

You can see that I scanned 4 documents and they were filed and renamed in the appropriate folders.

Cheers

7:18 pm
May 25, 2011


Alex Satrapa

Canberra, Australia

Member

posts 16

Is there some magic behind renaming the files three times in Hazel? At first glance it seems you're just renaming to "«date created»-«file pathname» «#».pdf".

12:49 am
May 26, 2011


wannabgeek

Member

posts 9

Hi,

When you rename to the file path, it puts the full path in, including Users/(user name)/Documents/Filing Cabinet/. As this would be the same for every file, it is unnecessary (plus it would make the name too long).

Hazel has the capacity to rename based on the folder name, but that is too limiting.

There might be a simpler way to do it, but it would need to be at least a 2 step process as far as I can see.

10:23 pm
August 30, 2011


c1cummin

New Member

posts 1

Sounds pretty cool. I think I'd like to try something like this, except maybe use Acrobat to OCR after scanning so I can batch that process. Would I be able to have Hazel rename those Acrobat OCR'd files instead of the ones directly from ScanSnap?

7:48 am
August 31, 2011


Brooks

Vancouver, BC

Admin

posts 203

Sure, I don't see why not. Hazel will be indifferent as to what software is actually OCRing the files.