Archive › July, 2008

Document Storage: The Yahoo or Google Philosophy?


Once you have your documents scanned you then of course need to put the PDFs somewhere.

There are basically two schools of thought for document storage: storing in a folder structure (the Old Yahoo model), or dumping it all in one place and letting search take over (the Google model).

Old Yahoo Model – Folders

If you have been around the Internet for a long time, you will remember that Yahoo started out as strictly a directory, where websites would get placed in a hierarchical category structure.

The folder model of document structure is sort of like that. Set up an elaborate folder structure, and when it is time to file away a document, you figure out which folder it should go in.

The advantages of this are that you don’t need any third-party application, and the folder concept is something that we have used for years and everyone understands.

The downside is that you then have to figure out which folder the file goes into, and when you are looking for a document, you have to go through and figure out where you saved it.

It also takes regular processing to go through and move the files to the right place in your structure.
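To make that "which folder does this go in?" step concrete, here is a minimal sketch of rule-based filing. The keywords, folder names, and the Documents root are all made up for illustration; your own structure will look different, and this only computes a destination path rather than moving anything.

```python
from pathlib import PurePosixPath

# Hypothetical rules: (keyword to look for in the filename, destination folder).
RULES = [
    ("invoice", "Finance/Invoices"),
    ("statement", "Finance/Statements"),
    ("manual", "Household/Manuals"),
]


def choose_folder(filename, rules=RULES, default="Unsorted"):
    """Return the folder of the first rule whose keyword appears in the filename."""
    name = filename.lower()
    for keyword, folder in rules:
        if keyword in name:
            return folder
    # Nothing matched: park the file somewhere obvious for manual filing later.
    return default


def destination(filename, root="Documents"):
    """Build the full path the file would be filed under (nothing is moved)."""
    return str(PurePosixPath(root) / choose_folder(filename) / filename)
```

Anything that matches no rule lands in "Unsorted", which is exactly the pile that needs the regular processing mentioned above.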

The Google Model – Search

Google’s advantage (among many) is that it didn’t have to rely on people putting websites in certain categories, and it didn’t rely on searchers knowing which category to find the site. Users could just type in a keyword and as long as Google indexed the site, it would show the result.

With the search model of document storage, PDFs are dumped in one or just a few folders, and then when you want to find something, you just do a keyword search to bring back documents containing that keyword.

This can be a very effective model as long as PDFs are consistently OCR’ed so that they are searchable, and you know what you are looking for.

Once you have a collection of searchable PDF files, you can use Windows Desktop Search, Google Desktop, or Spotlight on the Mac to search through the documents and find the right one.

You can also take it to the next level and use software like Yep, Evernote, DEVONthink, or OneNote to collect and store your documents and do the searching inside it.

The downside of using the search model is, as I said, you have to know what you are searching for before you search. It may be hard to remember certain keywords from the document.

Also, if you are searching for fairly generic keywords, your search may bring back a ton of results, making it a pain to wade through them.
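If you are curious what keyword search does under the hood, here is a toy sketch. It assumes the text has already been pulled out of your OCR’ed PDFs into plain strings (the extraction step is not shown), and it is nowhere near what Spotlight or Google Desktop actually do. The function names are my own.

```python
import re
from collections import defaultdict


def build_index(docs):
    """Map each lowercase word to the set of document names containing it.

    `docs` is a dict of {document name: already-extracted text}.
    """
    index = defaultdict(set)
    for name, text in docs.items():
        for word in re.findall(r"[a-z0-9]+", text.lower()):
            index[word].add(name)
    return index


def search(index, query):
    """Return the documents that contain every word in the query."""
    words = re.findall(r"[a-z0-9]+", query.lower())
    if not words:
        return set()
    results = set(index.get(words[0], set()))
    for word in words[1:]:
        results &= index.get(word, set())
    return results
```

With a few utility bills indexed, a search for "bill" would return all of them, while "electric bill" narrows it down. That is the generic-keyword problem in miniature: the more specific the words you can remember, the shorter the result list.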

Which Model Do You Use?

Personally, I use a hybrid.

I do have a folder structure, but I try to keep things high level without too many subfolders. I then make sure that documents are searchable by OCRing them once my Fujitsu ScanSnap has done its job.

When I am looking for a document, I generally use the search method because that is how I am used to finding information. It’s just nice to know that the folder structure is there as a backup.

What setup do you have for saving/finding your scanned PDFs?


Offline vs. Online Backups – Which is Better?

There has been a lot of debate lately about “living in the cloud” and whether to keep data and applications locally or stored out on the Internet with backup services like Mozy or Carbonite.

Which is better for backing up your documents?

Online Backup

Pros:

  • The data is (hopefully!) encrypted
  • Depending on the service you use, you may be able to get to your files via the Internet which can be very handy
  • If you have a fire, flood, or theft, your backup is offsite so you don’t have to worry about it
  • Chances are, your backup provider will have a much more advanced setup than you do with respect to replication etc.

Cons:

  • You don’t have direct control over your data
  • If your provider goes out of business, what happens to your data?
  • If you have a lot to upload, it could take a very long time to transfer the data
  • If your internet connection is down, so is your ability to back up or restore

Offline Backup

Pros:

  • The data is totally in your control
  • It’s on your network so access is fast
  • If it is a portable drive like a MyBook, you can take it to another location
  • You don’t have to worry about uptime/downtime or your internet connection.

Cons:

  • If you have a fire or flood, your backup might be damaged along with your computer
  • If you have a theft, and have a portable hard drive, your backup could be stolen
  • Hard drives fail (boy do they), so depending on your setup your backup could bite the dust

After all that, which is better? It depends on your needs and how nervous you are about storing your data on someone else’s servers. Personally, I am a big fan of online backups but that is just me.

The good news is, you don’t have to pick one or the other. Back up your critical files to an external hard drive, and then also send your most critical files up to Mozy or Carbonite. The best of both worlds!
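For the local half of that setup, here is a minimal sketch of a copy-what-changed backup to an external drive. The function name and folder layout are assumptions of mine, it assumes the target folder is not inside the source folder, and it is no substitute for real backup software. The online half is left to Mozy or Carbonite’s own client.

```python
import shutil
from pathlib import Path


def backup(source, target):
    """Copy files from source to target if they are missing or newer.

    Returns the relative paths of the files that were copied.
    Assumes target is NOT located inside source.
    """
    source, target = Path(source), Path(target)
    copied = []
    for src in sorted(source.rglob("*")):
        if not src.is_file():
            continue
        rel = src.relative_to(source)
        dst = target / rel
        # Skip files whose backup copy is already at least as recent.
        if dst.exists() and dst.stat().st_mtime >= src.stat().st_mtime:
            continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(str(rel))
    return copied
```

Running it a second time right away should copy nothing, since every file in the backup is already current. Note that it never deletes anything from the backup, which is the cautious choice for a sketch like this.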

How do you do your backups (or do you? :) )? Do you trust online providers? Let us know in the comments.


What Software Do I Need?

Possibly None

It could be that you already have all the software that you need. If you have a ScanSnap or other scanner, it will come with software to scan documents and possibly even convert them to searchable PDFs.

It might even come with some simple document management software.

If you have an external backup like a MyBook or a Time Capsule, it might already have software to do backups for you.

However, if you want to “take things to the next level” and have a full document management workflow, there is software out there that can help.

Manage Documents Like A Pro

Mac users are spoiled for choice when it comes to document management software. Here are a few favorites:

DEVONthink Pro Office


DEVONthink is called a “Personal Information Assistant”. There are a number of different flavors, but the one that works best with the ScanSnap is DEVONthink Pro Office.

It manages, classifies, and files documents automatically, and has very advanced OCR and search technology.

It automatically takes documents from the ScanSnap and turns them into searchable PDFs.

Yep


Yep is an iPhoto-like PDF file browser that allows you to add tags to documents in order to manage them. It will assign tags based on the folder that they’re stored in, and then you can of course add your own.

PDFPen


PDFPen is a PDF editing solution that lets you add comments, highlighting, and signatures, rearrange pages, and do other general document management tasks. It’s kind of like a scaled-down Acrobat for 1/6 of the price.

For Windows users one solution is Microsoft OneNote 2007


OneNote is a “digital notebook” that lets you bring in documents, images, media, etc. For documents, it will OCR them and allow you to search through them.

Another Windows program is Home Document Manager, which will scan and organize your documents and make them searchable.


Stay Safe – Backup

There are a ton of backup programs for Windows.

For local backups, SyncBackSE is a favorite.


If you want to go the online route, there are Mozy and Carbonite. Here is more information about online backup solutions.

For Mac, OS X Leopard comes with built-in backup software called Time Machine. You can use that with any external hard drive, or use a Time Capsule.

The online route for Mac users is a bit more limited, but Mozy is an extremely popular choice. The best part is that the first 2 GB of storage are free.

Do you have any other software for managing paper and documents that you can’t live without? Sound off in the comments.


10 Tips For Achieving Paper Zen

cleandesk.jpg
Photo by unimatrixZxero

Many people dream of the mythical “paperless office”. While these tips aren’t going to take you all the way there (and I don’t think anything truly will), they will take you a long way towards making friends with paper again.

1. Switch to paper-free options when possible. Whenever you can, stop the paper from coming in in the first place. Many banks and vendors will let you switch to online statements and bills, and where you can, pay your bills online via your bank’s website instead of writing a check.

2. Get a scanner with an automatic document feeder and duplexing. If you try to go paperless (or even just less paper) with a flatbed scanner, chances are you are eventually going to find it a pain. A scanner (like a Fujitsu ScanSnap) that lets you put in a stack of paper and scans both sides automatically at the push of a button will make life much easier.

3. Scan/process/shred right away. If you let things pile up too much, it becomes a chore and you won’t want to do it. Try to throw your document in the scanner/shredder right when you get it.

4. Have everything close at hand. This one is stolen from GTD: you are more likely to process everything right away and correctly if all your equipment, file folders, and other processing materials are right there within arm’s reach. If you have to walk to do something, you probably won’t.

5. Get buy-in from family/colleagues. Nothing is worse than coming up with a great system to reduce paper use, only to have your spouse or co-worker keep on with their hoarding and filing ways. Try to involve them in designing and implementing the new process so they have buy-in right from the start and it is “theirs” too.

6. Choose a folder/filename system that makes sense to you. Sure, you know what the receipt for your new USB turntable is now, but if you see a23422add.pdf in My Documents next year, will you know what it is without opening it up? Come up with a folder and naming system and stick to it.

7. Make your PDFs searchable. Similar to #6, don’t just scan things to a PDF image. Use your scanning software to make the PDF searchable. That way you can find it later just by doing a Spotlight or Google Desktop search.

8. Be careful what you scan & shred. Like this guy says, don’t get too carried away with what you scan and shred. Other people (like your girlfriend) might not be quite as impressed with your mad paperless skillz.

9. Combine the process with something else. If you don’t have the discipline to do #3 right away, try to combine your processing with something else. If you have a laptop and a portable scanner, do your scanning in a batch while watching the football game or something.

10. Automate backups. Nothing about a system like this will stress you out more than knowing you are one hard drive failure away from disaster. Put yourself in paper zen mode by knowing that all your data is safe and secure. Use a backup system and make it automated so that you don’t even need to think about it.
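As an example of the naming system from tip #6, here is one possible scheme sketched in code: date first (so files sort chronologically), then where the document came from, then what it is. The date_source_description pattern and the function name are just my illustration; pick whatever makes sense to you and stick to it.

```python
import re
from datetime import date


def make_filename(doc_date, source, description, ext="pdf"):
    """Build a name like 2008-07-15_example-store_usb-turntable-receipt.pdf."""

    def slug(text):
        # Lowercase, and collapse anything that isn't a letter or digit into "-".
        return re.sub(r"[^a-z0-9]+", "-", text.lower()).strip("-")

    return f"{doc_date.isoformat()}_{slug(source)}_{slug(description)}.{ext}"
```

Calling make_filename(date(2008, 7, 15), "Example Store", "USB Turntable Receipt") gives 2008-07-15_example-store_usb-turntable-receipt.pdf, which is a lot easier to recognize next year than a23422add.pdf.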

Do you have any other tips for achieving “paper zen”? Share in the comments.
