Most scanners these days come with an Optical Character Recognition, or OCR, program of some sort to make PDFs searchable. However, what if you don’t have an OCR program or you just want to do a quick and dirty file conversion without messing around with an application?
A few DocumentSnap commenters have pointed out that RICOH Innovations has created a number of Beta applications, one of which is an online Document Conversion tool.
As the site says:
The document conversion widget provides free OCR to convert your images into editable and searchable pdf, MsWord, HTML and text documents, providing capabilities such as pdf to doc conversion.
I thought I would put the tool to the test using the same parameters as in my ABBYY FineReader vs. Adobe Acrobat OCR comparison.
- Speed: Once I uploaded the file and hit Convert, it took 27.5 seconds to complete the PDF conversion
- File Size: The original was 1.5 MB, the converted copy was 160 KB
- Accuracy: Here is a screenshot from the original:
Here is the OCR’ed version:
The spreadsheet has become the virtual “slide rule” for CMAs. It’s used for everything from preliminary strategic plans to financial statements. As with any familiar method, it finds its way into numerous situations where better alternatives are available, most significantly in its widespread use as a de facto reporting tool.
The appeal of die spreadsheet as the quickest way to get a report out is not hard to appreciate. “Excel is probably the most comfortable environment for a lot of financial professionals,” Alok Ajmera, vice-president, professional services with Mississauga, Ont.-based Prophix Software, says. “There’s a very little learning curve, you can effectively do whatever you want with the data, and it works fairly well in smaller organizations.”
Periodic and complex reporting in processes like revenue management or cost management, however, is where the spreadsheet model really starts to break down.
Pretty good, I’d say.
- Quality: Here is where it gets dicey. Unlike the other tools reviewed, RICOH’s online tool doesn’t put a text layer behind the image; it converts the image itself to text. The results are pretty good, but this is not the tool for you if you want to keep your original PDF’s exact look. Here is a screenshot:
Given that this is a free tool in beta, it is not surprising that there are some limits. The maximum file size is 20 MB and you can only request 20 conversions per hour.
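If you ever find yourself feeding a batch of scans to a service with limits like these, it can help to check them on your own side before uploading. Here is a minimal Python sketch of that idea. Everything in it is my own illustration, not part of RICOH's tool: the names `within_size_limit` and `ConversionBudget` are hypothetical, I am assuming "20 MB" means 20 × 1024 × 1024 bytes, and I am assuming the hourly cap behaves like a rolling 60-minute window, which the site does not spell out.

```python
import time
from collections import deque

# Limits quoted in the post. Reading "20 MB" as 20 MiB is an assumption.
MAX_FILE_BYTES = 20 * 1024 * 1024
MAX_CONVERSIONS_PER_HOUR = 20
WINDOW_SECONDS = 3600


def within_size_limit(size_bytes, limit=MAX_FILE_BYTES):
    """Return True if a file of size_bytes fits under the upload limit."""
    return size_bytes <= limit


class ConversionBudget:
    """Local bookkeeping for a rolling-window request cap (hypothetical helper).

    The service's real accounting is unknown; this only tracks our own requests.
    """

    def __init__(self, max_requests=MAX_CONVERSIONS_PER_HOUR, window=WINDOW_SECONDS):
        self.max_requests = max_requests
        self.window = window
        self._stamps = deque()  # timestamps of requests still inside the window

    def try_acquire(self, now=None):
        """Record a request and return True, or return False if the cap is reached."""
        now = time.time() if now is None else now
        # Forget requests that have aged out of the window.
        while self._stamps and now - self._stamps[0] >= self.window:
            self._stamps.popleft()
        if len(self._stamps) >= self.max_requests:
            return False
        self._stamps.append(now)
        return True
```

In practice you would pass `os.path.getsize(path)` to `within_size_limit` and call `try_acquire()` before each upload, waiting a while whenever it returns `False`.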
You can choose to download the files immediately or have them emailed to you when they are ready.
Also, you may or may not want to use this tool to OCR sensitive documents. As they say, “In short, we will use submitted data to improve the service.”
Privacy issues aside, this looks to be a tool that can come in handy when you really need it and is worth playing around with.
If you know of other good online OCR products, leave a note in the comments.