Tag Archives: windows7

How To Find PDFs That Are Not Searchable

Sometimes, especially when you are doing a big OCR project, you might want to find all the PDFs that are not searchable. That is to say, you want to find the PDFs that have not been OCR-ed.

It turns out that this is not as easy as you might think. Here are a few ways to “sort of” do it. As much as possible I wanted to limit this to search capabilities built into the operating system, or to applications that you might already have.

Mac OS X Spotlight

It occurred to me that, chances are, almost any PDF that has been made searchable will have at least one space in it. So, why not use Spotlight to find all PDFs that don’t have a space? Fire up Spotlight by pressing Command-Space and type the following:

kind:pdf NOT intext:" "

Is this a perfect test? No, but hopefully it will get you most of the way there.

Microsoft Windows

I had hoped to do the same thing with Windows Search in Windows 7, but it didn’t work. It doesn’t seem to let you search for just a space. The closest I could come was to search for the word “the”. Obviously this is English only, so hopefully your language has an equivalent word that appears in almost every document.

Start up Windows Search by pressing Windows Key-F and type the following:

ext:pdf NOT contents:the

That is not as likely to succeed as just searching for a space, but should get you most of the way there.

Adobe Acrobat

Adobe Acrobat has some features that may help. You can use Acrobat Pro’s Preflight feature, or even do a Batch Process Accessibility Report.

None of these searches are 100% guaranteed to succeed, but hopefully they will help you down the path. Thanks to DocumentSnap reader Matt for the idea for this post.
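If you are comfortable running a script, here is one more rough check in the same “sort of” spirit. It is a minimal sketch, not something from the searches above: it assumes that an image-only scan usually contains no font resources, so it flags any PDF whose raw bytes never mention `/Font`. The function names are made up for illustration. PDFs that store their objects in compressed streams can hide the marker and get flagged by mistake, and a scan with just a stamped page number can slip through, so treat the result as a list of candidates to check by hand.

```python
from pathlib import Path

# Assumption: a PDF with a text layer (typed or OCR-ed) references at least
# one font, and an image-only scan does not. This is a heuristic only.
FONT_MARKER = b"/Font"


def looks_unsearchable(pdf_bytes: bytes) -> bool:
    """Return True if the raw PDF bytes never mention a font resource."""
    return FONT_MARKER not in pdf_bytes


def find_unsearchable_pdfs(folder):
    """Walk a folder tree and list the PDFs that look like image-only scans."""
    candidates = []
    for path in sorted(Path(folder).rglob("*.pdf")):
        if looks_unsearchable(path.read_bytes()):
            candidates.append(path)
    return candidates
```

For example, `find_unsearchable_pdfs("Scans")` would return the paths under a hypothetical `Scans` folder that are worth opening and checking. Note that the `*.pdf` pattern may miss files named `.PDF` on case-sensitive file systems.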

Do you have any tricks for finding non-OCR’ed PDFs? Share in the comments.

(Photo by Dirigentens)

How To Fix PDF Search In Windows 7 64-Bit

One of the best things about modern operating systems like Mac OS X and Windows 7 is that search is built right in, specifically PDF search. You don’t need a third-party tool to search the contents of a searchable PDF; the OS will do it for you.

That is, unless you are running the 64-bit version of Windows 7.

It is fairly common for DocumentSnap readers to write in with questions/problems, but it is pretty handy when a reader writes in with both the problem and the solution, which is exactly what superstar DocumentSnap reader Matt did recently.

Matt had a problem: He was scanning all these OCR’ed PDFs, but Windows Search was not finding them when he typed a keyword from the document. It would only find a file if he typed in its name, which pretty much defeats the purpose of Optical Character Recognition. Not having a Windows machine at the time, I was flying blind, but we went back and forth and eventually he figured out what the issue was: an iFilter (but I am getting ahead of myself here).

What Is 64 Bit Windows And Do I Have It?

There are basically two types of Windows: 32-bit and 64-bit. I’ll let Microsoft describe the difference:

The terms 32-bit and 64-bit refer to the way a computer’s processor (also called a CPU), handles information. The 64-bit version of Windows handles large amounts of random access memory (RAM) more effectively than a 32-bit system.

It used to be that only high-end computers were 64-bit, but that has changed in the last year or two. This cheap Acer laptop I am writing this on is 64-bit, for example. How can you tell which kind of Windows you have?

  • Click the Start button
  • Right-click on Computer, choose Properties
  • You will see an entry for System Type which will give you the information that you need.
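If you would rather check from a script, the same information can be read from two environment variables that Windows sets. The following is a minimal sketch: the function name `windows_bitness` is made up for illustration, and it assumes the usual values of `PROCESSOR_ARCHITECTURE` (such as `x86` or `AMD64`) and that `PROCESSOR_ARCHITEW6432` is set only when a 32-bit program runs on 64-bit Windows.

```python
import os

# Architecture names that indicate a 64-bit version of Windows.
SIXTY_FOUR_BIT = {"AMD64", "IA64", "ARM64"}


def windows_bitness(env=os.environ):
    """Return '64-bit', '32-bit', or 'unknown' from Windows environment variables."""
    # A 32-bit program on 64-bit Windows sees PROCESSOR_ARCHITECTURE=x86,
    # so check PROCESSOR_ARCHITEW6432 first; it carries the real OS value.
    arch = env.get("PROCESSOR_ARCHITEW6432") or env.get("PROCESSOR_ARCHITECTURE")
    if not arch:
        return "unknown"  # not Windows, or the variables are missing
    arch = arch.upper()
    if arch in SIXTY_FOUR_BIT:
        return "64-bit"
    if arch == "X86":
        return "32-bit"
    return "unknown"


if __name__ == "__main__":
    print(windows_bitness())
```

The System Type entry in the Properties window remains the authoritative answer; the script is just a shortcut.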

If you are having problems with PDF search and your System type says 32-bit, you can probably stop reading. This post likely won’t help you.

What Is The Problem?

Windows 7’s search capabilities are pretty good, but for some reason the 64-bit version has a problem indexing PDF files. Windows Search uses something called an iFilter to help it index files, and the PDF iFilter for 64-bit Windows is missing. (This probably applies to 64-bit Vista and 64-bit XP too.)

Here is how to tell if you have the problem:

  • Click on the Start Menu and choose Control Panel
  • Change View By to Small Icons and click on Indexing Options
  • Click on the Advanced button
  • Click on the File Types tab
  • Scroll way down to pdf and you will probably see Registered IFilter Is Not Found

If you see that message, you have the iFilter problem.

As an additional test, download or scan a searchable PDF. You can see here that I am searching for the word “Westminster” in Acrobat Reader and it is finding it. When I search using the search box under the Start menu, it doesn’t find it.

Replace The Missing IFilter

To fix the problem, you need to download the missing iFilter.

Download Adobe PDF iFilter 9 for 64-bit platforms here

Note: You may notice that Windows 7 is not in the list of supported operating systems. While it worked fine for Matt and me, you need to decide for yourself whether you want to risk installing it.

Once you download it, unzip it and run the installer.

When the installer completes, go back and look at the file types list from above. It should now say “PDF Filter” instead of the “Registered IFilter Is Not Found” message. Yeah!

Test The New iFilter

Download or scan a new searchable PDF and find a word that is in the text and search on it in Acrobat Reader. For example, here I searched for the word “idyll”.

Now I will search for it in Windows Search, and it looks like it found it. Double Yeah!

Now let’s search for Westminster again:

Looks like it still didn’t find it. No!

It turns out that fixing the iFilter will only fix new documents, not the ones that Windows Search has already indexed.

Do A Re-Index

In order to fix this problem, we’ll need to tell Windows 7 to do a re-index. If you have a large hard drive, this could take a long time, so do it before you are going to bed or something.

  • Click on the Start Menu and choose Control Panel
  • Change View By to Small Icons and click on Indexing Options
  • Click the Advanced button
  • On the Indexing Settings tab, hit Rebuild

Once this is done, let’s try searching for Westminster again. Hopefully third time’s the charm?

It’s there!

Thanks again to Matt for doing the detective work on this one. Hopefully it will help one of you if you find that your 64-bit Windows isn’t finding your documents.

Why Is ScanSnap Organizer’s Search Box Greyed Out?

Most of you know that I typically use Macs more than Windows, but in the process of doing some consulting work (more on that later), I have been spending more time using the Windows programs that come with the Fujitsu ScanSnap.

For starters, I want to say that I really like ScanSnap Organizer. I wish the ScanSnap came with a Mac version. However, I came across what to me was a pretty weird issue.

I went to search some of my scanned-in PDFs and the Search box was greyed out!

After doing some digging, I found out why. It turns out that, surprisingly, ScanSnap Organizer doesn’t have PDF searching capabilities of its own. It needs to use either Adobe Acrobat or Windows Desktop Search.

For ScanSnap S1300 users on Windows XP (raising hand), this is a bit of a problem because Adobe Acrobat doesn’t come with the scanner.

So, here is what you need to do:

If You Use Windows Vista Or Windows 7

You shouldn’t have this problem because Windows 7 and Vista have Windows Search built in.

So, you can use either the built-in Windows Search or, if you have a ScanSnap S1500, the copy of Adobe Acrobat that comes with the scanner.

If You Use Windows XP

You can download the appropriate version of Windows Search 4.0 here.

If you don’t want to use Windows Search, you’ll have to use Adobe Acrobat. If you have a ScanSnap S1500 you’re set, as it comes with Acrobat. If not, you’ll need to get your hands on Adobe Acrobat 7.0 or later to search within ScanSnap Organizer.

More details on all this can be found if you search ScanSnap Organizer Help for “File Search”.

For the S1300 peeps who have Windows Vista or Windows 7, can you confirm that ScanSnap Organizer search is not greyed out for you without Acrobat installed? Please leave a comment and let us know.

Windows 7 Update For Fujitsu ScanSnap S510 and S500 Is Now Available

As we posted earlier, Windows 7 support wasn’t quite there for ScanSnap when the new operating system was released, and the Windows 7 Update for ScanSnap S1500 and S300 was released in December.

Yesterday, Fujitsu sent out a bulletin that the updates for the ScanSnap S510 and S500 have been released.

From the email:

The compatibility update for Windows 7 with ScanSnap S510 & S500 is now available for US based customers.

Please visit the on-line form link below and fill out the form completely.
https://www-s.fujitsu.com/us/services/computing/peripherals/scanners/w7_compform.html

After your submission is verified, you will receive an email within one business day with detailed instructions on how to download and install the ScanSnap applications for your Window 7 operating system.

So, for whatever reason, it looks like they are doing things differently this time and you have to fill out a form to be sent the instructions.

What About The f-Series?

According to this support bulletin, the S510 update is due “end of January 2010”.

As always, let us know in the comments how your update goes.

Windows 7 Updates For ScanSnap S1500 and S300 Now Available

As we posted earlier, Windows 7 support wasn’t quite there for ScanSnap when the new operating system was released.

Yesterday, Fujitsu sent out a bulletin that at least the updates for the ScanSnap S1500 and S300 have been released.

From the email:

The ScanSnap compatibility update for Windows 7 with ScanSnap S1500 and S300 is now posted! This update is for compatibility with select Windows 7 operating systems only. Proceed to the following site and go to the section labeled “ScanSnap/Organizer Service Packs” and locate the Windows 7 update for your model. Observe the download applicability notes and instructions for additional details related to installing the update.

The download page for the updates is here: http://www.fujitsu.com/us/services/computing/peripherals/scanners/support/downloads.html

One thing to note: You need to make sure you download both pieces, the ScanSnap Manager update and the ScanSnap Organizer update.

What About The S510?

According to this support bulletin, the S510 update is due “end of December”. Not sure what the difference is, but there you go. I’ll update when it drops.

As always, let us know in the comments how your update goes.

Fujitsu ScanSnap on Windows 7 – Your Experiences?

Well, Microsoft’s latest version of Windows, Windows 7, gets released today. Similar to what we did with OS X Snow Leopard, I thought I’d make a post so that we can share our experiences using the Fujitsu ScanSnap with it.

I haven’t seen anything official from Fujitsu, but if I had to guess I would say that the ScanSnap S1500 will probably be OK, while older models such as the S510 and S300 will take a bit more work to get running.

Since I have neither Windows 7 nor a Windows ScanSnap, I am relying on Google and you guys here.

This post from SevenForums seems to have a workaround to get the older ScanSnaps working.

Hopefully the Windows 7 release goes a bit more smoothly than the Snow Leopard one did. If you have any experience getting your ScanSnap working with Windows 7, leave a comment and let us know. I’ll post any relevant updates here.

Update: Fujitsu has posted a support bulletin outlining what will be supported when. Basically:

  • S1500: End of November 2009
  • S300: End of November 2009
  • S510: End of December 2009
  • S500: End of December 2009 (but it will not support 64-bit OS)

If you want to be notified with updates, you can sign up here.
