Why Every Business Needs OCR

Your boss gives you a hardcopy of a company document that needs updating. Your client hands you a printed magazine article and asks you to create an editable text version. You receive an electronic image of a brochure and need to update the text.

What do all these situations have in common? They could all involve you spending hours retyping manually and correcting typos. Or you could take a more modern approach and convert any and all of them into a digital format with fully editable text in a matter of minutes.

All you need is a scanner or digital camera (to create an image file of any printed document) or an existing electronic image (a .pdf, .jpg, .eps, .png or similar file), plus Optical Character Recognition (OCR) software, such as the OCR feature that comes standard in Foxit PhantomPDF.

What is OCR?

OCR is a software technology that converts scanned documents into documents with “live text,” aka readable, searchable text that you can change, copy, edit and basically do anything you regularly do to text.

How does OCR work?

OCR software generally uses one of two methods: matrix matching (the simpler and more common) and feature extraction.

Matrix Matching compares what your OCR software detects as a character with a library of character templates. When it finds a match, bingo! The OCR software matches that image to its corresponding ASCII character.
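To make the idea concrete, here is a minimal sketch of matrix matching in Python. The tiny 3x3 templates and the function names are made up for illustration; real OCR engines use far larger bitmaps and template libraries, and this is not how PhantomPDF is implemented.

```python
# Hypothetical 3x3 character templates: "1" is ink, "0" is background.
TEMPLATES = {
    "I": ("010", "010", "010"),
    "L": ("100", "100", "111"),
    "T": ("111", "010", "010"),
    "O": ("111", "101", "111"),
}


def match_score(glyph, template):
    """Count the pixels that agree between two equally sized bitmaps."""
    return sum(
        g == t
        for glyph_row, template_row in zip(glyph, template)
        for g, t in zip(glyph_row, template_row)
    )


def recognize(glyph):
    """Return the template character that shares the most pixels with glyph."""
    return max(TEMPLATES, key=lambda ch: match_score(glyph, TEMPLATES[ch]))


# An "L" with one pixel missing from its bottom stroke.
noisy_l = ("100", "100", "110")
print(recognize(noisy_l))  # prints "L"
```

Because the best-scoring template wins, a slightly damaged character can still be recognized as long as it resembles its own template more than any other.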

Feature Extraction analyzes the shape of each character, using computer intelligence to look for general features such as open areas, closed shapes, diagonal lines, line intersections, etc. It’s a much more versatile method, but it has more requirements for a successful outcome, such as a clean, straight image and a minimum resolution of 300 dpi. Matrix matching can still work well on less-than-ideal images, and it’s the method most common in PDF software like PhantomPDF.
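As a rough illustration of one such feature, the Python sketch below counts the “closed shapes” in a character bitmap, meaning background regions completely surrounded by ink. The bitmaps and the function name are assumptions for this example; a real feature-extraction engine combines many features and a trained classifier.

```python
def count_holes(glyph):
    """Count background regions that are fully enclosed by ink.

    glyph is a tuple of strings: "1" is ink, "0" is background.
    """
    rows, cols = len(glyph), len(glyph[0])
    seen = set()

    def region_touches_border(start):
        """Flood-fill one background region; report whether it reaches the edge."""
        stack = [start]
        seen.add(start)
        touches_border = False
        while stack:
            r, c = stack.pop()
            if r in (0, rows - 1) or c in (0, cols - 1):
                touches_border = True
            for nr, nc in ((r + 1, c), (r - 1, c), (r, c + 1), (r, c - 1)):
                if (0 <= nr < rows and 0 <= nc < cols
                        and glyph[nr][nc] == "0" and (nr, nc) not in seen):
                    seen.add((nr, nc))
                    stack.append((nr, nc))
        return touches_border

    holes = 0
    for r in range(rows):
        for c in range(cols):
            if glyph[r][c] == "0" and (r, c) not in seen:
                if not region_touches_border((r, c)):
                    holes += 1
    return holes


# Hypothetical bitmaps for three letters.
LETTER_L = ("100", "100", "111")
LETTER_O = ("111", "101", "111")
LETTER_B = ("111", "101", "111", "101", "111")

print(count_holes(LETTER_L), count_holes(LETTER_O), count_holes(LETTER_B))
```

The hole count does not depend on the exact size of the bitmap, which hints at why feature extraction can handle fonts that no stored template matches.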

Advantages of OCR

From faster searches and easier editing to saving digital and physical storage space, you’ll find many benefits to using OCR software to turn document images into searchable, editable text:

  • Au revoir retyping – Unless you’re a fan of extra time at the keyboard recreating documents that exist in printed or scanned format, you’ll love the time savings you get when converting those image files into searchable, editable text via OCR.
  • Speedy digital searches – By converting scanned text into a word processing file, OCR lets you search through documents using keywords or phrases. Got a few hundred invoices? Let your PC search for the client name you need faster than you can say “coffee break.”
  • Typing new text – If you need that image of a document to function like real text, where you can add new paragraphs, copy and paste, edit out an old reference, etc., OCR lets you do it. It’s ideal for everything from updating contracts to making changes to your archive of family recipes.
  • Saving space – If you’ve got reams of paper documents taking up space in your office, you can scan them into PDF files with the confidence that your OCR software will let you retrieve any of the text you need to work with, whenever you may need it. Goodbye big file cabinets, hello tidy little CDs of archived documents.
  • Accessibility – If you or someone you know is vision-impaired, OCR software can help turn books, magazines and other printed documents into accessible files that they can listen to with the help of a combination of word processing software and computer voice-over utilities.
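The search benefit is easy to picture in code. The Python sketch below assumes the OCR step has already produced plain text for each scanned file; the file names, invoice text and function name are all hypothetical.

```python
def find_documents(texts, keyword):
    """Return the names of documents whose text contains keyword, ignoring case."""
    needle = keyword.casefold()
    return sorted(
        name for name, text in texts.items() if needle in text.casefold()
    )


# Hypothetical OCR output, keyed by scanned file name.
ocr_texts = {
    "invoice_001.pdf": "Invoice for Acme Corp. Total due: $1,200",
    "invoice_002.pdf": "Invoice for Globex Inc. Total due: $850",
    "invoice_003.pdf": "Second notice: ACME CORP, balance outstanding",
}

print(find_documents(ocr_texts, "acme corp"))
```

Without OCR, the same files would be images only, and a search like this would have nothing to match against.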

So why not use the power of OCR in your PDF software to increase efficiency in your office? Once you start using it, you’re likely to find numerous ways to put it to work. And you’ll wonder how you ever worked without it.

To learn more about how to scan and OCR documents, visit the Foxit PhantomPDF product page.


10 Comments

  1. Ronald

     /  August 27, 2013

    While OCR usually works fine with English texts, it doesn’t with other languages like Italian. Do you provide language options (like FreeOcr.net)?

    • PhantomPDF supports a number of different languages, including Italian. Packages to support languages in OCR are very large, so many languages are not included in the basic PhantomPDF download. However, they can be downloaded as a separate module from the URL below, at no additional charge, to work with PhantomPDF.

      Look for the download called Foxit PhantomPDF Add-On (OCR) at
      http://www.foxitsoftware.com/downloads/

  2. mohammad reza arabestani

     /  August 27, 2013

    Dear Sir/Madam
    What should I do to get the OCR software? Please guide me.
    Best Regards

  3. Desenv01

     /  August 27, 2013

    Does Foxit have any library to work with OCR?
    Thanks a lot.

    • If you are looking for a product, PhantomPDF will provide you with OCR capability. If you’re looking for an OCR software library to integrate into an application, Foxit does not provide this capability at this time.

  4. Oren Yulevitch

     /  August 27, 2013

    OCR with Foxit is great… if you are using English.
    Do you have plans on having OCR also for NON-Latin characters with right-to-left direction? Hebrew, Arabic

    • PhantomPDF does support some non-Latin based languages today, but not Hebrew and Arabic yet. Those languages are planned for later this fall.

  5. how do I get ocr?
