Darken Scanned Text To Improve OCR Accuracy

Light Text

My dad is really into genealogy, and he often finds himself scanning old newspaper articles or other documents that contain faded text.

Often he takes the OCR’ed text from these documents and pastes it into his genealogy program, so OCR accuracy is pretty important. The problem is that when the text is too light, his OCR program (in this case ABBYY FineReader, which comes with his ScanSnap) has trouble recognizing it.

The solution (which he figured out by himself, impressively) is to set his scanner to darken the text before OCR is applied. The instructions in this post are for a Fujitsu ScanSnap, but if you have a different scanner, you should almost certainly be able to do the same thing in its software.

  • Fire up ScanSnap Manager by right-clicking on the ScanSnap icon in your Dock (Mac) or System Tray (Windows). Choose Settings or Scan Button Settings.

  • Choose your Profile, or if you are using the Quick Menu, hit Customize.

  • Go to the Scanning tab and make sure your Color mode is B&W. Hit the Option button.

ScanSnap Manager Scanning Tab

  • Drag the darkness slider way over to the right. Hit OK.

Scanning Button Darkness

  • Hit Apply.

Now your profile should scan black and white text much darker, and hopefully you’ll get better results from OCR. If the scan is too dark, just move the slider to the left a bit at a time until you find the best balance between darkness and OCR performance.
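If you're curious why darkening helps, here is a rough sketch in Python. It is not how ScanSnap Manager works internally (I don't know the details of that); it just assumes that black and white scanning boils down to picking a cutoff, where every pixel darker than the cutoff becomes black and everything else becomes white. The function name, the cutoff parameter (standing in for the darkness slider), and the sample pixel values are all made up for illustration.

```python
def to_black_and_white(pixels, cutoff):
    """Turn 8-bit grayscale values (0 = black, 255 = white) into pure black or white.

    `cutoff` is a hypothetical stand-in for the darkness slider: a higher
    cutoff turns more of the gray pixels black.
    """
    if not 0 <= cutoff <= 255:
        raise ValueError("cutoff must be between 0 and 255")
    return [0 if value < cutoff else 255 for value in pixels]


# One row of a faded scan: the "ink" is a washed-out gray (150-170)
# and the paper is a light gray (around 230).
faded_row = [230, 160, 150, 235, 170, 228]

print(to_black_and_white(faded_row, 128))  # [255, 255, 255, 255, 255, 255] -> text disappears
print(to_black_and_white(faded_row, 200))  # [255, 0, 0, 255, 0, 255]       -> text comes out black
print(to_black_and_white(faded_row, 240))  # [0, 0, 0, 0, 0, 0]             -> too dark, page goes black
```

With the low cutoff the faded ink is treated as paper, so the OCR program has nothing to read. Raising it brings the text back, and raising it too far blackens the paper as well, which is why backing the slider off a little can help.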

Any other tips for scanning old documents? Let us know in the comments.

(Photo by MichaelRiedel)

About the Author

Brooks Duncan helps individuals and small businesses go paperless. He's been an accountant, a software developer, a manager in a very large corporation, and has run DocumentSnap since 2008. You can find Brooks on Twitter at @documentsnap or @brooksduncan. Thanks for stopping by.

2 comments

Johannes - November 21, 2020

Hi,
I have the same problem as your dad, but my OCR program has different sliders, i.e. brightness, contrast and image gamma. Dragging those to full saturation gives me a black page when scanning,
so it would be so helpful if you could supply a sample of what a page that is optimized for OCR looks like.
Thanks for the tip though.
/Johannes

marge201 - February 4, 2016

This worked perfectly. THANK YOU SO MUCH!! My friend got a medical report and it’s so light that it’s illegible. IDIOTS.
