Using Microsoft Office Document Imaging To OCR For Free

If you are a Windows user and already have Microsoft Office XP through 2007, chances are you already have the ability to OCR documents to get the text out of them.

It’s called Microsoft Office Document Imaging (MODI). I’m not going to lie: what I am about to show you is not exactly the best way to OCR documents. If you have OCR software that came with your scanner, I’d stick to that.

However, if you don’t already have OCR software and all you want to do is get some text out of an image, the software you already have is better than nothing at all.

Finding Microsoft Office Document Imaging

First, you want to check to see if you already have it installed. In Office 2007, go to Start > Programs > Microsoft Office > Microsoft Office Tools, and you should see Microsoft Office Document Imaging.

If you don’t see it there, never fear. It’s an optional part of the Office install. In Control Panel, go to Add/Remove Programs, select Microsoft Office, click Change, and then choose Add or Remove Features. You will find MODI under Microsoft Office Tools. Install it and you should be good to go.
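If you would rather check from a script than click through the Start menu, a rough sketch like the one below might do it. It assumes MODI registers a COM ProgID named `MODI.Document` (that is my understanding of how Office 2003/2007 installs it, not something I have verified on every version), and it simply looks for that name in the registry with Python's standard `winreg` module. The function name `modi_is_registered` is made up for this illustration.

```python
import sys


def modi_is_registered(prog_id="MODI.Document"):
    """Return True if the given COM ProgID is registered on this machine.

    "MODI.Document" is an assumed ProgID for Microsoft Office Document Imaging.
    """
    if not sys.platform.startswith("win"):
        return False  # MODI only exists on Windows

    import winreg  # standard library, Windows only

    try:
        # COM ProgIDs appear as keys directly under HKEY_CLASSES_ROOT.
        with winreg.OpenKey(winreg.HKEY_CLASSES_ROOT, prog_id):
            return True
    except OSError:
        # Key not found (or not readable): treat as "not installed".
        return False


if __name__ == "__main__":
    print("MODI registered:", modi_is_registered())
```

A `False` result only means the ProgID was not found; the Add/Remove Programs route above is still the way to install it.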

Ah Microsoft, I Love You

It probably won’t surprise you to learn that Microsoft Office Document Imaging will not import PDFs (why would they support an Adobe product?!). It will only import TIFFs and Microsoft’s own Microsoft Document Imaging format (.MDI).

In this example, I’m going to assume that we want to get the text out of a PDF that has not been OCR’ed already. Sure, you could use MODI to scan a document in, but I figure if you have the hardcopy document and a scanner, you’d probably just use the scanner’s software anyway.
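Since MODI will only open TIFF and MDI files, it can save some frustration to check a file's extension before trying to import it. Here is a tiny sketch of that check; the helper name `modi_can_import` and the exact extension list (`.tif`, `.tiff`, `.mdi`) are my own assumptions based on the formats mentioned above.

```python
from pathlib import Path

# Extensions for the two formats MODI is said to import: TIFF and MDI.
MODI_IMPORTABLE = {".tif", ".tiff", ".mdi"}


def modi_can_import(filename):
    """Return True if the file extension looks like something MODI can open."""
    return Path(filename).suffix.lower() in MODI_IMPORTABLE


if __name__ == "__main__":
    for name in ("scan.TIF", "notes.mdi", "report.pdf"):
        verdict = "import directly" if modi_can_import(name) else "needs the copy & paste route"
        print(name, "->", verdict)
```

This only looks at the file name, not the file contents, so treat it as a quick filter and nothing more.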

Copying A PDF In

Since we can’t actually import a PDF, we’re going to do some copy & paste magic.

Open up your PDF in Acrobat Reader or whatever PDF reader you are using, and either Select All or select just the portion you want to OCR. Then hit Copy.

Select Info In PDF

(By the way, that’s my picture of a Fung Wah bus that made it into New York Magazine. Aren’t you proud of me?).

Then switch to MODI. You would think you would go to Edit > Paste, right? Of course not! This is Microsoft!

Instead go to Page and then Paste Page. Voila, the image you just copied is now in Microsoft Office Document Imaging.

Saving The Text

So now that you have the image in MODI, what do you do with it? To OCR the text, go to Tools and then Recognize Text Using OCR.

You can then save it as a TIFF (though I understand that only MODI can read the OCR’ed text in that TIFF) or as an MDI file. Since that is more than a little useless, I’m going to cover sending the text to Word.

Send Text To Word

To send the text (and graphics, if you’d like), go up to Tools and then Send Text to Word. The OCR’ed text will then appear in a Word document, with all the images at the bottom if you checked the “Maintain Pictures in Output” box.
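If you have a pile of TIFFs and don't fancy clicking through the menus each time, the same recognition step can, as far as I know, be driven through MODI's COM interface. What follows is a sketch under several assumptions, not a tested recipe: it assumes the pywin32 package is installed, that the ProgID is `MODI.Document`, that the object model offers `Create`, `OCR`, `Images`, `Layout.Text` and `Close` as I remember them, and that `9` is the language ID for English. The `dispatch` parameter is a hypothetical hook I added so the function can be exercised without Office installed.

```python
MI_LANG_ENGLISH = 9  # assumed value of miLANG_ENGLISH in the MODI type library


def ocr_with_modi(image_path, lang_id=MI_LANG_ENGLISH, dispatch=None):
    """OCR a TIFF or MDI file through MODI and return the recognized text.

    `dispatch` is a hypothetical injection point; by default the real COM
    dispatcher from pywin32 is used (Windows with MODI installed only).
    """
    if dispatch is None:
        from win32com.client import Dispatch as dispatch  # requires pywin32

    doc = dispatch("MODI.Document")
    doc.Create(image_path)  # open the TIFF/MDI file
    try:
        # Run OCR on every page; the two flags ask MODI to auto-orient
        # and straighten the image first.
        doc.OCR(lang_id, True, True)
        pages = [
            doc.Images.Item(i).Layout.Text
            for i in range(doc.Images.Count)
        ]
    finally:
        doc.Close(False)  # False: do not save the OCR data back to the file
    return "\n".join(pages)


# Example (Windows only, path is a placeholder):
# text = ocr_with_modi(r"C:\scans\page1.tif")
```

The returned string is plain text, so you would lose the “Maintain Pictures in Output” option that Send Text to Word gives you.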

So, again, this is not the greatest OCR process in the whole world, but hey. If you’re a Windows user you probably already have Office, so it’s good to know what is available if you ever need it.

Photo: Naufragio

About the Author

Brooks Duncan helps individuals and small businesses go paperless. He's been an accountant, a software developer, a manager in a very large corporation, and has run DocumentSnap since 2008. You can find Brooks on Twitter at @documentsnap or @brooksduncan. Thanks for stopping by.

Murray Raff - October 2, 2018 Reply

I have used MS – Document Imaging with Windows 7 and found it really useful. The main problem was doing OCR on foreign-language documents and getting MS – Document Imaging to connect to the foreign dictionary imported by means of a language pack. I now use Windows 10 and the language pack problems appear to have been solved, but I can’t find an equivalent imaging system in Windows 10 – can anyone suggest where the equivalent of MS – Document Imaging can be found in the Windows 10 Office structure?

Arslan Wasi - July 10, 2018 Reply

great work

Marie Burgan - June 28, 2018 Reply

A new tool constructed for merging PDF documents doesn’t lag behind. You may combine pdfs online without extra efforts. In the best traditions of our platform, the procedure is self-explanatory and easy in usage. Our user-friendly interface attracts your attention to main moments and step-by-step leads you to the successful result. https://www.altomerge.com/

Miya Lenon - May 30, 2018 Reply

Thanks for this. I appreciate your guide. As for me, I frequently use LightPDF to convert images into editable text. This program is free. You can try it here: https://lightpdf.com/

Office Guys - May 3, 2018 Reply

This web site definitely has all of the info I needed about this subject and didn’t know who to ask.

Murray Raff - February 25, 2018 Reply

I have used MS Document Imaging extensively and it is a good solid product. One point on which more finesse would be very helpful is when doing OCR on materials published in languages other than English. This is especially helpful when translating material. French and Spanish dictionaries come with the English language pack, but installing other language packs (certainly so far as Windows 7) is a minefield and there is no certainty that installing the relevant language pack will translate into a dictionary available in MS Document Imaging to aid OCR. One would think this a higher priority for MS in a globalising world.

Kind regards, Murray

James Jensen - May 23, 2017 Reply

The Microsoft Office Document Imaging software recommended here really makes a difference when I have no time to install other tools but desperately want to get the text out of images. However, it only lets you import MDI or TIFF files, not PDFs, which I find inconvenient most of the time. If you have the time and energy to download and install OCR software, you can get more options for your OCR to Word process: http://www.ocrtoword.com

Tk - August 11, 2015 Reply

I am looking for a tool that will return the layout or coordinate information of words inside an image.
For example, if an image has some words in it, this tool should return the coordinates (x and y) of each word inside the image. I tried Microsoft OCR, but it is only for mobile applications.
Can you suggest anything?

Linda - August 1, 2015 Reply

Thank you so much for your very thorough article. It was very comprehensive and has given me a helpful direction for several problems with editing photos from my camera phone. Being notably smartphone (and OCR) illiterate, I was shocked to find myself the owner of an Android phone with a camera. Although I followed the camera instructions included with the phone, my first photo efforts were disappointing, and I was at a total loss as to how to process the pics for better reception.

As a seasoned photographer, I hoped my new smartphone camera would be fun. Having spent HOURS of my precious weekend time researching and trying to find the ‘burst’ collection of pictures I thought I took last evening, I was discouraged to discover that they are only single ‘snapshots’ and that my photo editor software really doesn’t do much for them.

I will redirect my energies in your suggested direction: the printer/scanner software I already own.

Linda

clement - November 25, 2014 Reply

Nice article: I used it in an enterprise setting without any problem. Very nice & clear!
Thanks!
MODI is here, and useful.
The integrated OCR from Microsoft is less powerful than a free OCR I found online, but it’s much, much easier to use.

One last point: I had a problem copying the PDF onto a page. At first my pages were blank.
In Acrobat, I had to copy the file to the clipboard right after copying. After that, everything went smoothly.

shre - October 18, 2012 Reply

thanks a lot!

Using Microsoft Office Document Imaging To OCR For Free « TrackBug - August 28, 2012 Reply

[…] Using Microsoft Office Document Imaging To OCR For Free […]

Bigg Frank - August 3, 2012 Reply

Great article. It's so nice to come to a site that explains things simply and fully. Thanks.

Sergio - May 21, 2012 Reply

Thanks for this article. Great tip (to use Page > Paste Page instead of Edit > Paste).

    Brooks Duncan - May 21, 2012 Reply

    Great Sergio, glad it helped.

zamir - January 24, 2012 Reply

Good article. To see how to implement MS-Office programmatically check: http://zamirsblog.blogspot.com/2010/12/ocr-using-

    Brooks Duncan - January 24, 2012 Reply

    Very nice zamir. Thanks!
