Optical character recognition (OCR) is the process of converting an image, or portions of an image, into text. Its importance becomes apparent when you need to search a collection of images for a particular word. Without OCR, such a search is not possible because the images contain no searchable text. With OCR, full-text indexing and retrieval of images becomes possible.

When you perform OCR on an image, depending on the OCR method used, Alchemy either associates the full OCR text with the image or places the specified text in a designated profile field. When you search for OCR text, Alchemy lists the associated image files in the Search results pane. When you display an image, Alchemy highlights the search term in the image or displays the OCR text in the designated profile field. You can also search within an image file that has been OCR’ed using the Alchemy Viewer Find feature.

Note: Alchemy will OCR only TIFF, BMP, DCX, PCX, and JPEG file types. If you attempt to OCR an unsupported file type, Alchemy will appear to OCR the image, but in fact no OCR is performed.

In addition to OCR, you can perform barcode recognition. Barcode recognition functions similarly to OCR, except that instead of characters, Alchemy recognizes barcodes and translates the information into text data. 2D barcodes are not supported.
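Because an unsupported file type fails silently, it can help to check a batch of files before starting an OCR run. The following is a minimal sketch of such a check, done outside Alchemy. It is not part of Alchemy; the function name is hypothetical, and the mapping from the file types listed in the note to file extensions is an assumption.

```python
from pathlib import Path

# Assumed mapping from the file types named in the note to common extensions.
SUPPORTED_EXTENSIONS = {".tif", ".tiff", ".bmp", ".dcx", ".pcx", ".jpg", ".jpeg"}


def split_by_ocr_support(paths):
    """Return (supported, unsupported) lists, judged by file extension only."""
    supported, unsupported = [], []
    for path in map(Path, paths):
        # Compare case-insensitively so that "SCAN.TIF" is treated like "scan.tif".
        if path.suffix.lower() in SUPPORTED_EXTENSIONS:
            supported.append(path)
        else:
            unsupported.append(path)
    return supported, unsupported


if __name__ == "__main__":
    ok, skipped = split_by_ocr_support(
        ["invoice_001.TIF", "statement.pdf", "scan.jpeg"]
    )
    for path in skipped:
        print(f"Not an OCR-supported file type: {path}")
```

The check looks only at the extension, not at the file contents, so a misnamed file would still pass.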

Alchemy includes three methods for performing OCR on images, and two methods for performing barcode recognition:

Entire page: (OCR and barcode recognition) Performs OCR or barcode recognition using the Entire Page template. This provides a quick and easy way to create a full-text database from a collection of images, allowing for content-based retrieval from an Alchemy database of scanned images.

OCR template: (OCR and barcode recognition) Uses the concept of an OCR template to describe which portions of an image are to be converted to text and placed into profile fields. OCR templates are ideally suited for automating the process of populating profile fields with text extracted from fixed locations within a collection of similar images. This method should only be used on a collection of scanned images with identical structure, such as invoices and credit card statements.

Drag and drop: (OCR only) This method allows you to select a region of an image and drag and drop it into an Alchemy profile field. When the selection is dropped, it is automatically converted to text in the profile field.

About performing zone OCR and entire page OCR: If you intend to perform both Entire Page OCR and Zone OCR (which populates profile fields with the text inside corresponding zones) on an image, you must perform Zone OCR before Entire Page OCR. When you perform OCR on an image using the Entire Page template, the entire contents of the page are associated with the image. Likewise, when you perform OCR using a template that contains a zone defined by the Full Text field, all of the text contained in the zone is associated with the image. The result of each of these OCR processes overwrites the result of the other. Therefore, if you perform Entire Page OCR and then perform Zone OCR with a template that does not use the Full Text field, the Full Text field is interpreted as blank.
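The ordering rule above can be illustrated with a small toy model. This is only a sketch of the overwrite behavior as described in this section. The class and function names are hypothetical and do not correspond to any Alchemy interface, and the sample text and field names are invented for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ImageRecord:
    """Toy stand-in for an image entry; not an Alchemy data structure."""
    page_text: str                      # text that Entire Page OCR would find
    full_text: str = ""                 # text associated with the image
    profile: dict = field(default_factory=dict)


def entire_page_ocr(record):
    # Entire Page OCR associates the whole page contents with the image.
    record.full_text = record.page_text


def zone_ocr(record, template):
    """template maps a profile field name to the text recognized in its zone."""
    # The associated text is overwritten: it becomes the Full Text zone's
    # contents, or blank when the template has no Full Text zone.
    record.full_text = template.get("Full Text", "")
    for name, text in template.items():
        if name != "Full Text":
            record.profile[name] = text


template = {"Invoice Number": "1042"}   # no Full Text zone in this template

wrong_order = ImageRecord(page_text="Invoice 1042 Total 99.00")
entire_page_ocr(wrong_order)
zone_ocr(wrong_order, template)         # blanks the associated text

right_order = ImageRecord(page_text="Invoice 1042 Total 99.00")
zone_ocr(right_order, template)
entire_page_ocr(right_order)            # associated text is set last and kept
```

In this model, `wrong_order.full_text` ends up empty, while `right_order` keeps both the populated profile field and the full page text.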

About multi-page TIFF OCR: The OCR operation is the same for multi-page TIFF files as for any other document; however, the following items should be noted:

When zone template OCR is used, only the first page is converted to text and used to populate profile fields.

To perform OCR on all pages, the Entire Page option must be selected.

A single OCR file is created, regardless of the number of pages in the TIFF file.

When the results of a search locate a multi-page TIFF file, the image scrolls to the page containing the search term in the Viewer pane with the search term highlighted.

You should perform an OCR operation after adding, removing, or re-ordering pages in a multi-page TIFF file.

To improve the quality of the OCR and barcode recognition process, certain settings are provided to make adjustments to your documents before beginning the OCR and/or barcode recognition process. By default, none of these options are selected, but you may want to use them depending on the quality of the documents that you are scanning.

CAUTION: Selecting one or more of these settings typically makes the OCR and barcode recognition process more accurate, but also makes the process slower.

Settings – OCR

Select Tools > Settings to display the Settings dialog box.

Open the OCR tab.

Select one or more of the following options.

Note: The options on this tab are also available on the OCR tab of the Scan Settings dialog box (Scan > Settings), and on the OCR Settings tab of the OCR dialog box (Scan > OCR). You may have already saved OCR settings for a document type.

Deskew image before recognition: Straightens images that are slightly tilted before performing a build. CAUTION: Selecting this option may affect how Alchemy reads separator pages, so make sure that your separator pages are still read properly when using this option.

Automatically detect page orientation: Automatically orients images to be right-side up before performing OCR.

Enhance degraded images: Sharpens images that are blurred or faded before performing OCR.

Language: Select the language to be used for OCR.

Click OK to apply the settings.