OCR Smackdown: ABBYY FineReader vs. Adobe Acrobat

A very common request that I get here at DocumentSnap is to compare the Optical Character Recognition (OCR) capabilities of ABBYY FineReader with Adobe Acrobat. Why? Well, for starters, both of them come included with models of the Fujitsu ScanSnap as well as other scanners.

I decided to do a quick test comparing the OCR of the two packages using the following criteria:

  • OCR Speed
  • Resulting File Size
  • Accuracy

The Hardware

For a scanner I used my ScanSnap S1300.

I used two computers for the test:

  • Windows: A new cheap Acer laptop with a Core i3 2.40 GHz processor and 4 GB RAM running Windows 7
  • Mac: An old 2.5 GHz Intel Core 2 Duo MacBook Pro with 4 GB RAM running Mac OS X Snow Leopard

The Software

Here are the packages I used:

  • Windows: ABBYY FineReader For ScanSnap 4.1 (called from ScanSnap Manager) vs. Adobe Acrobat 9 Pro
  • Mac: ABBYY FineReader For ScanSnap 4.1 (run standalone) vs. Adobe Acrobat 8 Pro

Yes, I realize that Adobe Acrobat X is out, but since I am not aware of any scanners that come bundled with it yet, I decided to stick with the versions that ship with the ScanSnap. I’ll cover Acrobat X in a later post.

The Document

I scanned a magazine article for this test. It probably would have been better to do this with a bunch of different documents to compare, but hey.

In all cases except one, I scanned without OCR so that I could run it standalone later. Here’s some info on the document that I used:

  • Pages: 2
  • Scan Quality: 300dpi, Color
  • Resulting File Size: 1.5 MB
  • Columns: 2, with some images

Maybe I am blind, but I couldn’t figure out a way to run ABBYY FineReader for ScanSnap on Windows standalone. If you know how, please leave a message in the comments. In that test, I re-scanned with “Create Searchable PDF” checked in the ScanSnap Manager settings.

The Settings

I tried not to use too many fancy settings, to keep things as “real-life” as possible. There were essentially three configurations:

ABBYY FineReader

ABBYY FineReader OCR Settings

I set Save Mode to “Text under page image” and Quality to High. These were the settings for ABBYY on the Mac, and I believe they are what ScanSnap Manager on Windows uses as well.

Adobe Acrobat (Normal)

Adobe Acrobat OCR Settings

I set the output style to “Searchable Image (Exact)” because, in my experience, leaving it as plain “Searchable Image” has caused some weird things to happen with the resulting PDF. I used these settings on both Windows and Mac.

Adobe Acrobat (With ClearScan)

Adobe Acrobat ClearScan

In Acrobat 9 there is a setting called ClearScan. I used that as an additional test to see what the difference is.

Speed

Windows

  • ABBYY Windows: 20.5 seconds
  • Acrobat 9: 13.9 seconds
  • Acrobat 9 With ClearScan: 17.6 seconds

Mac

  • ABBYY Mac: 44.7 seconds
  • Acrobat 8: 20.2 seconds

Winner: Acrobat!

Since they are different machines, you can’t directly compare the Windows and Mac times, but clearly in both cases Acrobat is faster.
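To put a number on "faster", here is a minimal Python sketch that computes how many times faster each Acrobat run was than ABBYY on the same machine. The timings are copied from the lists above; the dictionary layout and the function name `speedups_vs_abbyy` are just illustrative choices, and a single run on a two-page document is a very small sample.

```python
# Timings in seconds, copied from the results above.
timings = {
    "Windows": {"ABBYY": 20.5, "Acrobat 9": 13.9, "Acrobat 9 ClearScan": 17.6},
    "Mac": {"ABBYY": 44.7, "Acrobat 8": 20.2},
}


def speedups_vs_abbyy(platform_timings):
    """Return how many times faster each package is than ABBYY on the same machine."""
    baseline = platform_timings["ABBYY"]
    return {
        name: round(baseline / seconds, 2)
        for name, seconds in platform_timings.items()
        if name != "ABBYY"
    }


# Only compare within a platform, since the two machines differ.
for platform, results in timings.items():
    print(platform, speedups_vs_abbyy(results))
```

The comparison is kept within each platform on purpose, since the Windows and Mac numbers come from different hardware.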

File Size

The non-OCR’ed PDF was 1.5 MB.

Windows

  • ABBYY Windows: 1.7 MB (+.2 MB)
  • Acrobat 9: 1.5 MB (same)
  • Acrobat 9 With ClearScan: 315 KB (-1.16 MB)

Mac

  • ABBYY Mac: 1.4 MB (-.1 MB)
  • Acrobat 8: 1.5 MB (same)

Winner: Acrobat 9 with ClearScan!

With an astonishing 1.16 MB reduction in file size after OCR, Acrobat 9 with ClearScan is the winner. Wow.
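The size differences can also be expressed as percentages of the original file. The sketch below uses the rounded sizes listed above and assumes 1 MB = 1000 KB when converting the 315 KB ClearScan result, so the output is approximate; the names `sizes_mb` and `percent_change` are illustrative only.

```python
ORIGINAL_MB = 1.5  # size of the non-OCR'ed PDF, as reported above

# Sizes after OCR, in MB, copied from the results above.
sizes_mb = {
    "ABBYY Windows": 1.7,
    "Acrobat 9": 1.5,
    "Acrobat 9 ClearScan": 0.315,  # 315 KB, assuming 1 MB = 1000 KB
    "ABBYY Mac": 1.4,
    "Acrobat 8": 1.5,
}


def percent_change(new_mb, original_mb=ORIGINAL_MB):
    """Percentage change in file size relative to the original (negative means smaller)."""
    return (new_mb - original_mb) / original_mb * 100


for name, size in sizes_mb.items():
    print(f"{name}: {percent_change(size):+.0f}%")
```

With these rounded inputs, the ClearScan file comes out roughly 79% smaller than the original, while the other results stay within about 13% of it.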

Accuracy

Here is a passage from the article:

Article Text Before OCR

Let’s see how each of the packages did:

ABBYY Windows

The spreadsheet has become the virtual “slide rule” for CMAs. It’s used for everything from preliminary strategic plans to financial statements. As with any familiar method, it finds its way into numerous situations where better alternatives are available, mostsignificantly in itswidespread use as a de facto reporting tool.
The appeal of the spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably the most comfortable environment for a lot of financial professionals,” Alok Ajmera, vice-president, professional services withMississauga, Ont.-basedProphixSoftware, says. “There’s a very little learning curve, you can effectively do whatever you want with the data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

Acrobat 9 Windows

T he spreadsheet has become the virtual “slide rule” for CMAs. It’s used for everything from preliminary su·ategic plans to financial statements. As with any farniliar method, it finds its way into numerous situations where better alternatives are available, most significantly in its widespread use as a de facto reporting tool.
The appeal of tlle spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably tlle most comfortable environment for a lot of financial professionals,” AJok Ajmera, vice-president, professional services with Mississauga, Ont.-based Prophix Software, says. “There’s a very little learning curve, you can effectively do whatever you want witll tlle data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

Acrobat 9 With ClearScan

The spreadsheet has become the virtual “slide rule” for CMAs. It’s used for everything from preliminary su·ategic plans to financial statements. As with any farniliar method, it finds its way into numerous situations where better alternatives are available, most significantly in its widespread use as a de facto reporting tool.
The appeal of tlle spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably tlle most comfortable environment for a lot of financial professionals,” AJok Ajmera, vice-president, professional services with Mississauga, Ont.-based Prophix Software, says. “There’s a very little learning curve, you can effectively do whatever you want witll tlle data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

ABBYY Mac

The spreadsheet has become the virtual “slide rule” for CiMAs. It’s used for everything from preliminary strategic plans to financial statements. As with any familiar method, it finds its way into numerous situations where better alternatives are available, most significantly in its widespread use as a de facto reporting tool.
The appeal of die spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably the most comfortable environment for a lot of financial professionals,” Alok Ajmera, vice-president, professional sendees with Mississauga, Ont.-based Prophix Software, says. “There’s a very little learning curve, you can effectively do whatever you want with the data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

Acrobat 8 Mac

T he spreadsheet has become the virtual “slide rule” for CMAs. It’s used for everything frorn preliminary strategic plans to financial statements. Aswith any familiar method, it finds its way into numerous situations where better alterna tives are available, most significantly in its widespread use as a de facto reporting tool.
T he appeal of the spreadsheet as the quickest
way to get a report out is not hard to appreciate.
“Excel is probably the most comfortable
environment for a lot of financial professionals,” avaJlaun:.:,JIIU:::’l;)It;IIIULauuy1111l::>WIUC::>PU:C1U uocd::>
a de facto reporting tool. T he appeal of the spreadsheet as the quickest
way to get a report out is not hard to appreciate. “Excel is probably me most comfortable environment for a lot of financial professionals,” AJok Ajmera, vice-president, professional services with Mississauga, Ont.-based Prophix Software, says. “T here’s a very little learning curve, you can effectively do whatever you want with the data, and it works fairly well in smaller organiza tions.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

Winner: ABBYY FineReader for Mac looks the best to me. Acrobat 8 on the Mac is pretty terrible (in this example anyways).
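Judging accuracy by eye works for a short passage, but the same comparison can be made repeatable. Here is a minimal sketch using Python's standard `difflib` module to list the words where an OCR result differs from a reference. The reference sentence is an assumption (a hand-typed version of what the page appears to say), the OCR sample is taken from the Acrobat 9 output above, and `word_errors` is a hypothetical helper name.

```python
from difflib import SequenceMatcher


def word_errors(reference, ocr_output):
    """List (expected, got) word groups where the OCR output differs from the reference."""
    ref_words = reference.split()
    ocr_words = ocr_output.split()
    matcher = SequenceMatcher(a=ref_words, b=ocr_words, autojunk=False)
    errors = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            errors.append((" ".join(ref_words[i1:i2]), " ".join(ocr_words[j1:j2])))
    return errors


# Assumed ground truth, typed by hand from the article.
reference = "As with any familiar method, it finds its way into numerous situations"
# The same sentence as it appears in the Acrobat 9 output above.
acrobat_9 = "As with any farniliar method, it finds its way into numerous situations"

print(word_errors(reference, acrobat_9))
print(len(word_errors(reference, reference)))
```

Running this over the full passage for each package would give an error count per package. Because the comparison is word-based, merged words such as "mostsignificantly" show up as errors too, which seems fair for searchable PDFs.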

Conclusion

Is there a “best” choice? It seems that in this example anyways, Adobe Acrobat 9 with ClearScan turned on gives fast results with good OCR while dramatically reducing the file size.

If you don’t care so much about speed, FineReader produces good OCR results and, for ScanSnap users, has the additional benefit of being integrated with ScanSnap Manager.

As with most things, the best software is the one that works the best for you. Have you found similar results? Any other tests of your own to share? Leave a note in the comments.

(Photo by Polina Sergeeva)


21 Responses to “OCR Smackdown: ABBYY FineReader vs. Adobe Acrobat”

  1. Dave December 14, 2010 at 11:02 am #

    On Mac OS X, I've found a good compromise between accuracy and file size. First, I OCR the scanned document with ABBYY FineReader, then open it in Adobe Acrobat 8 to run through the "Optimize Scanned PDF" process. I get similar file size reductions, and the excellent accuracy of FineReader, even though it's an extra step in the process.

    • BrooksD December 14, 2010 at 11:14 am #

      Awesome David, great tip! Thanks!

  2. Ed Eubanks December 15, 2010 at 8:04 am #

    It's worth noting that DevonThink Pro Office (a Mac-only unstructured database) has the ABBYY Reader built-in, and applies it automatically when a ScanSnap (or other document scanner) is set up to send the scan to DevonThink. So, once set up, you get a one-button, automated, OCRed PDF of every document you scan.

  3. Donna December 15, 2010 at 9:49 am #

    I like Dave's suggestion about reducing the file size using Adobe Acrobat 8. However, Ed's use of DevonThink Pro Office is appealing as a "one-button" solution. I am concerned about the file sizes of the PDFs I scan. Is there a file reduction process similar to "optimize scanned PDF" in the DevonThink Pro Office software?

  4. Tom December 15, 2010 at 2:42 pm #

    Last week I purchased SmartOCR and I sincerely recommend it to anyone looking for accurate OCR for a low price: http://smartocr.com

    • Sojourner July 8, 2012 at 6:06 pm #

      Their prices seem similar to ABBYY. How do you find their quality?

  5. Marty December 17, 2010 at 4:19 am #

    How does Omnipage compare to ABBYY & Adobe Acrobat?

    • BrooksD December 17, 2010 at 6:27 am #

      If I can get my hands on a copy I'll try to include it in a future smackdown.

      • Marty December 17, 2010 at 8:22 am #

        I would think that if you contacted Nuance they would be happy to have you review their products, including OmniPage, PaperPort and PDF Converter Pro.

        • BrooksD December 17, 2010 at 9:17 am #

          Thanks Marty, will give it a go.

  6. Natalie December 20, 2010 at 3:43 pm #

    Awesome info…Nice to see perspective on how each work before investing the time and money to figure it out on my own! For now, I use an online OCR service that I recently discovered that's offered by Ricoh Innovations. http://beta.rii.ricoh.com/betalabs/content/docume

    Have you used this at all?

    • BrooksD December 20, 2010 at 4:08 pm #

      I hadn't seen it, thanks for the tip!!

  7. George July 28, 2011 at 12:22 am #

    I understand that PDF PenPro works well for OCR also. Any experience with it?

  8. pad November 10, 2011 at 11:42 am #

    I would suggest just cleaning it with warm water and a mild soap to remove any salt, and melt any ice that is stuck. You can soak it for a few minutes, then make sure it gets very dry, even between the pads.

  9. Admin August 6, 2012 at 5:02 pm #

    A simple two-page document is not complex enough to compare these programs. Scan a document with multiple pages (40-50) that contains tables with colored backgrounds, various text sizes and typefaces, and other embedded objects like pictures with text on them, and then compare. I am sure you'll see then which software is better. To me, your article is an absolute joke.

    • BrooksD August 6, 2012 at 5:28 pm #

      Thanks for stopping by and spending time commenting!

  10. ghuth October 17, 2012 at 5:49 am #

    If only there was a decent way to automate the Adobe clearscan OCR process. It's painful to have to open and manually batch it.
