Source data automation, or source data collection, refers to procedures and equipment designed to make the input process more efficient by eliminating the manual entry of data. Instead of a person entering data using a keyboard, source data automation equipment captures data directly from its original form. The original form is called a source document. In addition to making the input process more efficient, source data automation usually results in a higher input accuracy rate.

An image scanner, sometimes called a page scanner, is an input device that can electronically capture an entire page of text or images such as photographs or artwork. The scanner converts the text or images on the original document into digital data that can be stored on a disk and processed by the computer. The digitized data can be printed, displayed separately, or merged into another document for editing.

Optical recognition devices use a light source to read codes, marks, and characters and convert them into digital data that can be processed by a computer.

Optical codes use a pattern of symbols to represent data. The most common optical code is the bar code. Most of us are familiar with the "zebra-striped" Universal Product Code (UPC), which appears on most supermarket products. It consists of a set of vertical bars and spaces of different widths. The bar code reader uses the pattern of light reflected from the bars to identify the item. The UPC bar code, used for grocery and retail items, can be translated into a ten-digit number that identifies the product manufacturer and product number.
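To make the translation from bar code to number more concrete, here is a minimal sketch in Python. It assumes the classic 12-digit UPC-A layout, in which the ten digits for manufacturer and product sit between a leading number-system digit and a trailing check digit. The function names `upc_check_digit` and `parse_upc_a` are hypothetical and chosen only for this illustration.

```python
def upc_check_digit(first_eleven: str) -> int:
    """Compute the UPC-A check digit from the first 11 digits."""
    if len(first_eleven) != 11 or not (first_eleven.isascii() and first_eleven.isdigit()):
        raise ValueError("expected exactly 11 decimal digits")
    digits = [int(ch) for ch in first_eleven]
    # Counting positions from 1: odd positions are weighted 3, even positions 1.
    total = 3 * sum(digits[0::2]) + sum(digits[1::2])
    return (10 - total % 10) % 10


def parse_upc_a(code: str) -> dict:
    """Split a 12-digit UPC-A string into its fields (assumed 1-5-5-1 layout)."""
    if len(code) != 12 or not (code.isascii() and code.isdigit()):
        raise ValueError("expected exactly 12 decimal digits")
    if upc_check_digit(code[:11]) != int(code[11]):
        raise ValueError("check digit does not match; the scan may be faulty")
    return {
        "number_system": code[0],
        "manufacturer": code[1:6],   # five-digit manufacturer number
        "product": code[6:11],       # five-digit product number
        "check_digit": code[11],
    }


fields = parse_upc_a("036000291452")
print(fields["manufacturer"], fields["product"])  # 36000 29145
```

The check digit is what lets a reader reject a misread symbol instead of passing a wrong number to the computer, which is one reason automated capture tends to be more accurate than keying.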

Optical character recognition (OCR) devices are scanners that read typewritten, computer-printed, and in some cases hand-printed characters from ordinary documents.

A number of optical character recognition (OCR) systems are known. Typically, such a system comprises apparatus for scanning a page of printed text and software that performs character recognition on a bit-mapped image of the text, that is, a pixel-by-pixel representation of the page in binary form. The scanned image is supplied to a computer or other processing device, which runs an OCR algorithm on it. The recognition system reads a line of characters by framing and recognizing the individual characters within the image data. During recognition, the document is analyzed for several key factors such as layout, fonts, text, and graphics. The document may be in any of many languages, forms, and formats. The purpose of the OCR algorithm is to produce an electronic document made up of recognized words that can be edited with application software. Electronic reading machines based on OCR generally comprise a personal computer outfitted with a scanner, OCR software, and text-to-voice hardware or software.
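The "framing" step can be illustrated with a small sketch. It assumes a very simple approach, not necessarily the one any particular OCR product uses: the binary bitmap is scanned column by column, and every run of columns that contains ink is treated as one character frame. The function name `frame_characters` and the sample bitmap are invented for the example.

```python
def frame_characters(bitmap):
    """Return (start, end) column ranges, end exclusive, of the characters
    in a binary bitmap given as a list of equal-length rows (1 = ink)."""
    if not bitmap:
        return []
    width = len(bitmap[0])
    # A column "has ink" if at least one row holds a 1 in that column.
    has_ink = [any(row[x] for row in bitmap) for x in range(width)]

    frames = []
    start = None
    for x, ink in enumerate(has_ink):
        if ink and start is None:
            start = x                      # a character begins here
        elif not ink and start is not None:
            frames.append((start, x))      # a blank column ends the character
            start = None
    if start is not None:                  # character touching the right edge
        frames.append((start, width))
    return frames


# A tiny 3-row "scan" of two characters; '#' is ink, '.' is background.
rows = [
    "#.#..###.",
    "###...#..",
    "#.#...#..",
]
bitmap = [[1 if ch == "#" else 0 for ch in row] for row in rows]
print(frame_characters(bitmap))  # [(0, 3), (5, 8)]
```

Each frame can then be cut out of the bitmap and passed on to the recognition step. Real pages need more than this (skew correction, touching characters, multiple text lines), so the sketch shows the idea only.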

The OCR devices scan the shape of a character, compare it with a predefined shape stored in memory, and convert the character into the corresponding computer code. They use a light source to read special characters and convert them into electrical signals to be sent to the CPU. The characters can be read by both humans and machines. They are often found on sales tags in department stores or imprinted on credit card slips.
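The compare-with-stored-shape idea can be sketched as simple template matching. In this illustrative example the "memory" is a small dictionary of 3x5 bitmaps, the scanned shape is matched to the template that differs from it in the fewest pixels, and the result is converted into its character code with `ord`. The templates, the names `TEMPLATES`, `mismatches`, and `recognize`, and the pixel-count distance are all assumptions made for the sketch; real devices use far larger shape sets and more robust comparisons.

```python
# Predefined shapes "stored in memory": '#' is ink, '.' is background.
TEMPLATES = {
    "I": ("###", ".#.", ".#.", ".#.", "###"),
    "L": ("#..", "#..", "#..", "#..", "###"),
    "T": ("###", ".#.", ".#.", ".#.", ".#."),
}


def mismatches(glyph, template):
    """Count the pixels in which two equally sized bitmaps differ."""
    if len(glyph) != len(template) or any(
        len(g) != len(t) for g, t in zip(glyph, template)
    ):
        raise ValueError("glyph and template must have the same dimensions")
    return sum(
        pixel_g != pixel_t
        for row_g, row_t in zip(glyph, template)
        for pixel_g, pixel_t in zip(row_g, row_t)
    )


def recognize(glyph):
    """Return the best-matching character and its numeric character code."""
    best = min(TEMPLATES, key=lambda name: mismatches(glyph, TEMPLATES[name]))
    return best, ord(best)


# A scanned "T" with one pixel lost to a printing flaw.
scanned = ("###", ".#.", ".#.", "...", ".#.")
print(recognize(scanned))  # ('T', 84)
```

Picking the closest template, not demanding an exact match, is what lets the device tolerate small printing or scanning defects.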

In large department stores you can see a device called a wand reader, which can read OCR characters. After data from a retail tag has been read, the computer system can automatically and quickly pull together the information needed for billing purposes.

Wands are an extremely promising alternative to key input because they eliminate one more intermediate step between "data" and "processing" — that of key entry.

Magnetic ink character recognition (MICR) characters use special ink that can be magnetized during processing. MICR is used almost exclusively by the banking industry for processing checks. Blank checks already have the bank code, account number, and check number printed in MICR characters across the bottom. When the check is processed by the bank, the amount of the check is also printed in the lower right corner. Together, this information is read by MICR reader/sorter machines as part of the check-clearing process.


source document: the original (initial) document

to capture: to seize, to collect (data)

recognition: the act of recognizing or identifying

pixel: picture element, raster dot; the smallest addressable element of a two-dimensional raster image, whose color and brightness can be set independently of the other dots

bar code: a special code in which each character is made up of vertical dark and light bars of varying width, printed on product packaging and the like for automated entry of data about the products