OCR (Optical Character Recognition)

Figure: TEXTBRIDGE OCR program showing icons and text selection areas.

OCR is a method of using a scanner to read an existing document and turn it into a text document for use on the PC. Initially, when the document is scanned, the result is an image; the program then translates this into text by decoding the pixels in the image.

There are various image-to-text conversion programs, and they all use different techniques to produce a text document from an image. Basically, though, the OCR program goes through the same stages as a paint program does in scanning an image, and then attempts to discern what the document says. Some programs offer a language option, as not all documents you scan will be in English; others offer the option of taking an image from somewhere other than your scanner, for example from an existing file on your hard drive.

Note that the image an OCR program most often works with is a black and white image; if your existing image is anything other than this, it needs to be converted first. Some programs can only read certain image file types and can only produce certain text file types as a result.

Figure: The SAVE dialogue of TEXTBRIDGE showing the format options. It can also save directly to the copy buffer.
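
The conversion to black and white mentioned above can be pictured with a small sketch. This is only an illustration in plain Python, not how TEXTBRIDGE or any particular OCR program does it: the function names, the fixed threshold of 128 and the brightness weights are all assumptions chosen for the example, and real programs often use more adaptive methods.

```python
def rgb_to_gray(r, g, b):
    """Approximate brightness (0-255) from a colour pixel.

    Uses the common BT.601 luma weights in integer form (an assumption
    for this sketch; other weightings exist).
    """
    return (299 * r + 587 * g + 114 * b + 500) // 1000


def to_monochrome(gray_rows, threshold=128):
    """Map each grey value to 0 (black) or 1 (white).

    Pixels at or above the threshold become white, the rest black.
    The threshold of 128 is an illustrative default, not a standard.
    """
    return [[1 if value >= threshold else 0 for value in row] for row in gray_rows]


# A tiny 2x3 colour "scan": dark ink on light paper.
scan = [
    [(250, 250, 245), (20, 20, 30), (240, 238, 235)],
    [(15, 10, 10), (245, 245, 245), (30, 25, 20)],
]
gray = [[rgb_to_gray(*pixel) for pixel in row] for row in scan]
mono = to_monochrome(gray)
```

A single fixed threshold works for clean, evenly lit scans; faded or unevenly lit pages usually need the threshold adjusted.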

Generally speaking, 300 DPI is the minimum resolution required for a scanned document to be translated by OCR. Once the image is in the OCR program, it is usual to put boxes around the areas containing text; some programs can do this automatically, but they do not always get it right. Your OCR program may include a magnifier, which lets you see more clearly the areas that are selected for translation, and it may be able to rotate the image through 90 degrees.
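
If you only know the pixel dimensions of an existing image, the resolution can be worked out by dividing the pixels by the physical size of the original page in inches. The sketch below assumes you know that physical size; the function names and the `MIN_OCR_DPI` constant are made up for this example, and the 300 figure is simply the rule of thumb quoted above.

```python
MIN_OCR_DPI = 300  # rule-of-thumb minimum from the text, not a hard limit


def effective_dpi(pixels, inches):
    """Dots per inch along one edge: pixel count divided by physical length."""
    if inches <= 0:
        raise ValueError("physical size must be positive")
    return pixels / inches


def meets_ocr_minimum(width_px, height_px, width_in, height_in, minimum=MIN_OCR_DPI):
    """True if both the horizontal and vertical resolution reach the minimum."""
    horizontal = effective_dpi(width_px, width_in)
    vertical = effective_dpi(height_px, height_in)
    return min(horizontal, vertical) >= minimum


# An 8.5 x 11 inch page scanned to 2550 x 3300 pixels is 300 DPI each way.
letter_ok = meets_ocr_minimum(2550, 3300, 8.5, 11)

# The same page at 1275 x 1650 pixels is only 150 DPI and should be rescanned.
letter_low = meets_ocr_minimum(1275, 1650, 8.5, 11)
```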

The OCR program may show which words it has recognised or indicate how far through the document it is, and it may also continue through a set of pages, translating as it goes. Once it has succeeded in translating your image into text, it is then a matter of choosing the format in which you wish to save your document.

Points to remember about OCR:

Images must be in the correct format to be translated (monochrome, 300 DPI).
Select the areas on the image that are text for the program to decode.
Ensure the correct language option is chosen and the text areas selected.
OCR the image and then save the text in the chosen format.
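
The points above can be read as a checklist to run through before starting the translation. The following is a hypothetical sketch of such a check; `ScanJob`, `preflight` and the field names are invented for illustration and do not correspond to any real OCR program's settings.

```python
from dataclasses import dataclass, field


@dataclass
class ScanJob:
    """Hypothetical description of one page waiting to be OCRed."""
    mode: str                 # e.g. "monochrome", "grey" or "colour"
    dpi: int                  # scan resolution
    language: str             # empty string means no language chosen
    regions: list = field(default_factory=list)  # (left, top, right, bottom) boxes
    output_format: str = "txt"


def preflight(job, min_dpi=300):
    """Return a list of things to fix; an empty list means ready to OCR."""
    problems = []
    if job.mode != "monochrome":
        problems.append("convert image to monochrome")
    if job.dpi < min_dpi:
        problems.append(f"rescan at {min_dpi} DPI or higher")
    if not job.regions:
        problems.append("select at least one text area")
    if not job.language:
        problems.append("choose a language")
    if not job.output_format:
        problems.append("choose an output format")
    return problems


ready = ScanJob("monochrome", 300, "English", [(0, 0, 100, 50)])
not_ready = ScanJob("colour", 150, "")
```

Calling `preflight(ready)` gives an empty list, while `preflight(not_ready)` lists each point that still needs attention, in the same order as the checklist.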

See Also: Scanners and Cameras