I installed my first Linux distribution back in the early 1990s. With my super-slow dial-up modem, I downloaded the 20 or so floppy disk images of Slackware, copied them to floppies, and installed it.[1]
Since then, I have used Linux off and on, but haven’t done so in quite a few years. In the early years of DocumentSnap, one of the more popular posts was about how to use the Fujitsu ScanSnap in Linux, which works (apparently) fairly well thanks to the SANE project.
It was with great interest that I came across this post by Nathan Willis over on Linux.com, entitled “Weekend Project: Create a Paperless Linux Office.”
Nathan takes us through how he uses gscan2pdf to scan documents to searchable PDF:
“That’s where optical character recognition (OCR) comes in. OCR recognizes letterforms in the scanned document image and outputs actual text, which is precisely what we’re after. But rather than run a command-line OCR program on every scanned image and produce a .txt file, it’s better to combine the two into a single document, and hopefully a single step. That’s the purpose of gscan2pdf, a lightweight GUI application that has a built-in SANE scanner interface, an OCR engine, and the ability to write PDF documents that embed the OCRed text and use the scanned image as a background for improved legibility.”
I wasn’t familiar with gscan2pdf before reading this. If you are looking to go paperless on Linux, check out the article.
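If you would rather script the scan-then-OCR idea than click through a GUI, here is a rough sketch in Python. To be clear, this is not how gscan2pdf works internally and it is not from Nathan's article. It simply chains two command-line tools, SANE's `scanimage` and the Tesseract OCR engine, and assumes both are installed. The function names and the `receipt.tiff` file name are made up for illustration, and the `--resolution` option depends on your scanner's SANE backend, so treat it as an assumption too.

```python
import shutil
import subprocess
from pathlib import Path


def scan_command(resolution_dpi: int = 300) -> list[str]:
    """Build a scanimage call that writes a TIFF image to stdout.

    Assumption: the scanner backend supports a --resolution option.
    """
    return ["scanimage", "--format=tiff", "--resolution", str(resolution_dpi)]


def ocr_command(image: Path, language: str = "eng") -> list[str]:
    """Build a tesseract call that writes a searchable PDF.

    Tesseract appends ".pdf" to the output base name itself, so the
    base is the image path without its extension.
    """
    out_base = image.with_suffix("")
    return ["tesseract", str(image), str(out_base), "-l", language, "pdf"]


def scan_to_searchable_pdf(image: Path, resolution_dpi: int = 300,
                           language: str = "eng") -> Path:
    """Scan one page to `image`, then OCR it into a PDF next to it."""
    for tool in ("scanimage", "tesseract"):
        if shutil.which(tool) is None:
            raise RuntimeError(f"{tool} is not installed or not on PATH")

    # scanimage emits the image data on stdout, so redirect it to the file.
    with image.open("wb") as out:
        subprocess.run(scan_command(resolution_dpi), stdout=out, check=True)

    subprocess.run(ocr_command(image, language), check=True)
    return image.with_suffix(".pdf")


if __name__ == "__main__":
    # Hypothetical file name; needs a scanner that SANE can see.
    print(scan_to_searchable_pdf(Path("receipt.tiff")))
```

The two small command-building functions are split out so you can print and check the exact commands before pointing them at a real scanner.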
How about you, Linux fanatics? What software do you use to go paperless? I’d love to hear about it in the comments.
(Photo by KobraSoft)
[1] I think it was on a 386, but don’t quote me on that.