ABBYY PDF Transformer+ Review

I’ve long been a fan of ABBYY’s OCR software, so I was interested when I saw they’ve released a more fully-featured PDF editing suite. It’s called ABBYY PDF Transformer+, and the company threw me a review copy to check out.

It is Windows only, so Mac readers are going to have to sit this one out.

PDF Transformer+

You can see from the home screen that you can open a file (or files) or scan an image in. You can then convert it to a number of different formats.

Despite the name, PDF Transformer+ is more than just a PDF converter. It will let you edit, manipulate, annotate, redact, and OCR PDF documents.

Convert To PDF

I scanned a document to JPG and then used the ABBYY program to convert it to a searchable PDF. I intentionally used a lower-quality scanner than my ScanSnap to see how it would handle it. I scanned to JPG at 300dpi.

When you import a file, it takes you to the main work screen.

PDF Transformer+ Work screen

As you can see below, you have many options when it comes to conversion. I chose the first: Searchable PDF Document.

PDF Transformer+ Convert

To give you an idea of quality, here are the files:

Merge And Manipulate PDFs

You can open multiple files and merge them together. You can change the order, remove files, and even apply OCR to the resulting scan.

PDF Transformer+ Merge

When you have a file open with multiple pages, you can move the pages around, remove pages that you don’t need anymore, and directly add pages from a scanner or other file right into the document that you are working on.

PDF Transformer+ Manipulate Pages

OCR Accuracy

I ran another article through PDF Transformer+’s searchable text conversion process. It is the same file that I used in my old OCR Smackdown post, if you want to compare the results to other applications.

Here is the relevant section of the article:

PDF Transformer+ OCR Text

Let’s see how PDF Transformer+ did:

The spreadsheet has become the virtual “slide rale” for CMAs. It’s used for everything from preliminary strategic plans to financial statements. As with any familiar method, it finds its way into numerous situations where better alternatives are available, most significantly in its widespread use as a de facto reporting tool.
The appeal ofthe spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably the most comfortable environment for a lot of financial professionals,” Alok Ajmera, vice-president, professional services with Mississauga, Ont.-based Prophix Software, says. “There’s a very little learning curve, you can effectively do whatever you want with the data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.

Other than “slide rale”, not bad at all. It is rare to get 100% accuracy with OCR.
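If you want to put a number on "not bad at all", one common approach is to compare the OCR output against a known-good transcription and count the edits needed to turn one into the other. Here is a minimal Python sketch of that idea. To be clear, this is not how PDF Transformer+ measures anything, and the reference sentence is my assumption about what the printed page says (that is, "slide rule"); the function names are made up for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance between two sequences (strings or lists of words)."""
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]
        for j, y in enumerate(b, start=1):
            cost = 0 if x == y else 1
            curr.append(min(prev[j] + 1,          # drop an item from a
                            curr[j - 1] + 1,      # add an item from b
                            prev[j - 1] + cost))  # substitute (or match)
        prev = curr
    return prev[-1]


def accuracy(ocr, reference, by_words=False):
    """1 - (edit distance / reference length), clamped at 0."""
    if by_words:
        ocr, reference = ocr.split(), reference.split()
    if not reference:
        return 1.0 if not ocr else 0.0
    return max(0.0, 1 - edit_distance(ocr, reference) / len(reference))


# OCR output from the sample above (straight quotes used for simplicity).
ocr_text = 'The spreadsheet has become the virtual "slide rale" for CMAs.'
# Assumed ground truth: what the printed article presumably says.
ref_text = 'The spreadsheet has become the virtual "slide rule" for CMAs.'

char_acc = accuracy(ocr_text, ref_text)
word_acc = accuracy(ocr_text, ref_text, by_words=True)
print(f"character accuracy: {char_acc:.1%}, word accuracy: {word_acc:.1%}")
```

Notice that a single wrong letter barely dents the character score but costs a whole word in the word score, which is why the two numbers can tell quite different stories about the same scan.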

Annotate PDFs

You can mark up, annotate, stamp, and redact PDF documents. You can highlight text, make notes, and create your own stamps. It does Bates Numbering for you lawyer types.

PDF Transformer+ Annotate

One feature I could not find is the ability to create a stamp from your own image. This can be handy for things like scanned signatures.

Edit PDF Text

PDF Transformer+ gives you the ability to edit the text in a PDF, but what you can do depends on the type of PDF you are working with.

If you are working with a PDF that has been downloaded from the web or generated from a word processing program, you can edit the text in the document on a line-by-line basis.

PDF Transformer+ Edit Regular Text

If you have a scanned document, the best you can do is put a text box or eraser over top of the image. Unfortunately, as far as I can tell, you can’t directly edit the text in a scanned document like you can with Acrobat or Nitro.

PDF Transformer+ Edit Image

Convert

As you saw earlier, you can convert an image or PDF document to a number of different formats including Word, Excel, PowerPoint, HTML, EPUB, text, RTF, CSV, and OpenOffice.

PDF Transformer+ Export

As a test, I converted the document I scanned at the beginning of the article to Word. Let’s see how it did.

PDF Transformer+ Word Conversion

Pretty good, and I was able to edit the text in the document. Here is the file if you want to take a look.

Scanning

On the home screen, there is a big fat Scan button. If you have a TWAIN-compliant scanner, you can scan from right within PDF Transformer+.

If you have, for example, a Fujitsu ScanSnap, which is not TWAIN-compatible, that is not a problem. You can set up a ScanSnap Manager Profile to scan directly to PDF Transformer+, or you can use the ScanSnap Folder functionality to access the scanner.

PDF Transformer+

I like PDF Transformer+ for working with PDFs on Windows. It is well designed, and from my testing seems fast and accurate. It is $79.99 USD, which is a heck of a lot less than Acrobat.

You can buy it directly from ABBYY, and if you use that link you’ll be buying me a samosa from my favourite place on 41st Avenue (thank you).





About the Author

Brooks Duncan helps individuals and small businesses go paperless. He's been an accountant, a software developer, a manager in a very large corporation, and has run DocumentSnap since 2008. You can find Brooks on Twitter at @documentsnap or @brooksduncan. Thanks for stopping by.

1 comment

Alex Bocast - November 24, 2015

My experience has not been good with build 12.0.104.225 of ABBYY’s PDF Transformer+. In terms of performance, ABBYY is extremely memory-intensive and extremely slow: for example, ABBYY takes minutes to open, minutes to save a file, and minutes to close. Yes, minutes. With Windows 10, I use ABBYY only when I must, and, when my task is done, I immediately close the program; otherwise, ABBYY memory leaks eventually bring Windows 10 to its knees and a cold reboot is needed before Windows 10 will run again. (This does not reflect well on Windows 10 either…)

I work across several languages, and, at best, ABBYY’s software chokes on anything that is not modern English from a modern press. In spite of its advertised ability to recognize dozens of languages, I have found this to be simply not true in my work with Latin, German, French, Italian, Spanish, and Dutch documents. In my experience, ABBYY has no language awareness whatsoever in any language. ABBYY simply does not recognize basic diacritical marks such as the umlaut in German and the accent grave in French. ABBYY sees “est” in Latin text–a fundamental and frequent word in Latin–and renders it as “efl” or some other weird character response that has never been a Latin word. ABBYY cannot handle any sort of ligature, such as the ligatures for “ae”, “oe”, and “ct”. By my measurements, ABBYY seldom correctly recognizes more than 80% of the characters or 50% of the words. ABBYY routinely confuses the letters ‘s’, ‘f’, and ‘l’. It is often faster to type in text from scratch than to correct the mistakes that ABBYY makes in its OCR conversions. Forget italic fonts, which terminally confuse ABBYY. Forget Fraktur or any old German or English font. ABBYY cannot even recognize standard punctuation characters: a semicolon will get you “j”, a comma will get you “”, a colon will get you a bullet and a dot or the letter ‘i’, and a period is likely to be recognized as a circumflex! (How is it even possible to confuse a period with a circumflex?) I have seen ABBYY give back dozens of different character strings when it encounters an ampersand, except, of course, an ampersand. All in all, ABBYY PDF Transformer+ may be better than nothing, but the advantage is not often obvious. If I sound frustrated, I am.

ABBYY does not know what to do with text in columns: it interleaves the text from columns on a page instead of reading one column at a time. When you ask ABBYY to “recognize” text in a document, that is, to make the text searchable, you may tell ABBYY what the language of the text is. As far as I have been able to determine, this setting has absolutely no effect on the output. Regardless of this language setting, ABBYY will consistently reject 50% to 70% of the pages in a document because ABBYY thinks that your language choice is wrong. Thus, when you try to search a text that has ostensibly been made searchable by ABBYY, most of the pages in the file will actually be skipped!

When you ask ABBYY to generate a Word file, it infects the Word document with dozens of weird styles that must then be deleted, individually, one by one, from the Word document. (This does not reflect well on Microsoft Word either…) I have found that it is much more productive to ask ABBYY to generate a TXT file, even though this requires reformatting a document from scratch.

On the plus side, although painfully slow, ABBYY does work unexceptionally when you need to create one PDF file from many or when you need to re-order or delete pages within a PDF document. However, to remove a page from a PDF file requires no more than removing a node from a linked list, consigning the unlinked node to garbage collection, and refreshing the display; with a decent machine, this is a task that should require no more than a few nanoseconds, yet I have had to wait, and wait, and wait, on ABBYY to complete this simple task.

In short, I cannot fathom the good reviews that ABBYY PDF Transformer+ has received because they do not at all reflect my experience.

The incident that precipitated this outburst (after long simmering) involves a search of a file that was just converted to be “searchable”, which took several minutes. The words I searched for: “clara ergo cognitio”. I knew the words were in this document, but I couldn’t recall just where, and that is why I converted the document. The search failed because “clara” is in italics! Thus my rant…