Document capture in a production environment has several distinct steps. In PaperlessOffice, each
of these steps is integrated via internal workflow queues that provide exceptional
performance and flexibility. This allows PaperlessOffice to be used in virtually any
environment, from a single workstation up to high-volume enterprise installations with
multiple scanners feeding multiple OCR, index, rescan, and release stations. No matter how
large or small your capture needs, PaperlessOffice's robust internal routing guarantees
maximum efficiency from every station and every operator.
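As a rough illustration of queue-based routing between stations, the sketch below moves batches through one queue per stage. It is not PaperlessOffice's internal design: the stage names come from the steps described in this overview, while `STAGES`, `next_stage`, `route`, and the rule that a rescanned batch returns to indexing are all assumptions made for the example.

```python
from collections import deque

# Stage names follow the steps in this overview; the routing rules are hypothetical.
STAGES = ["scan", "ocr", "index", "release"]


def next_stage(stage, needs_rescan=False):
    """Return the queue a batch moves to after finishing `stage`, or None when done."""
    if needs_rescan:
        return "rescan"
    if stage == "rescan":
        return "index"  # assumption: a rescanned batch goes back to indexing
    position = STAGES.index(stage)
    return STAGES[position + 1] if position + 1 < len(STAGES) else None


def route(batch_ids, rescan_once=()):
    """Move every batch through the queues and record the path each one takes."""
    queues = {name: deque() for name in STAGES + ["rescan"]}
    queues["scan"].extend(batch_ids)
    history = {batch_id: [] for batch_id in batch_ids}
    pending_rescan = set(rescan_once)

    while any(queues.values()):
        for name in list(queues):
            if not queues[name]:
                continue
            batch = queues[name].popleft()
            history[batch].append(name)
            # An index operator tags the batch for rescan at most once here.
            flagged = name == "index" and batch in pending_rescan
            if flagged:
                pending_rescan.discard(batch)
            target = next_stage(name, needs_rescan=flagged)
            if target is not None:
                queues[target].append(batch)
    return history
```

Calling `route(["A", "B"], rescan_once=["B"])` sends batch A straight through the four stages, while batch B takes a detour through the rescan queue before it is released.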
1. Batch Preparation
The efficiency of production input systems comes from
their batch orientation. Typically, pages are prepped, sorted into batches of similar
documents (for example, purchase orders, invoices, etc.), and then scanned.
PaperlessOffice is highly batch-oriented and provides an administrator with the ability to
predefine multiple document classes, which allows a scanner operator to quickly tell the
system what type of document to expect. Documents can be automatically separated within
batches either with job separator sheets (pre-printed pages with standard patch codes on
them) or with bar codes printed directly on the pages.
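The separator-sheet approach can be sketched as a simple split over the scanned page sequence. This is only an illustration: the `SEPARATOR` marker and the `split_batch` function are hypothetical, and the sketch assumes that patch code detection has already labeled each separator sheet.

```python
SEPARATOR = "SEPARATOR"  # hypothetical marker for a detected job separator sheet


def split_batch(pages):
    """Split a scanned batch into documents, dropping the separator sheets."""
    documents = []
    current = []
    for page in pages:
        if page == SEPARATOR:
            # A separator closes the document collected so far, if any.
            if current:
                documents.append(current)
            current = []
        else:
            current.append(page)
    if current:
        documents.append(current)
    return documents
```

A batch scanned as separator, two invoice pages, separator, one purchase order page would come out as two documents, with the separator sheets themselves discarded.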
2. Scanning

PaperlessOffice supports a broad range of desktop production
scanners, including high-end video and SCSI models supported by Kofax accelerators and
low-end SCSI models supported by ISIS drivers. Supported scanners range from models with
speeds under 10 ppm all the way up to 160 ppm and above. PaperlessOffice supports both
simplex and duplex scanners and always allows you to run your scanners at their full rated
engine speed.
3. OCR
After a batch is scanned, it can be optionally queued to an OCR
station to assist indexing. PaperlessOffice incorporates deskewing, despeckling, line
removal, and other image enhancement functions to improve OCR accuracy and provides
support for both full text and zonal OCR using either software or hardware engines.
PaperlessOffice has been optimized for forms processing and can automatically queue
batches of forms to a single OCR station or to multiple OCR stations to eliminate
processing bottlenecks. Multi-language OCR support is included, which allows more accurate
OCR with non-English languages.
4. Indexing
Indexing is the most critical and time-consuming step in the
capture process, with a typical capture operation sometimes requiring as many as four
index stations for each scanner. The index is the key to retrieving the document, and
PaperlessOffice provides several methods to cut down on operator errors and speed the
indexing process:
OCR can be used to read indexing zones previously defined by the
document class. This allows the index operator to simply check the accuracy of the OCR
rather than keying every index field by hand.
Bar code recognition is a highly reliable method for indexing batches.
PaperlessOffice supports most popular bar code types.
Double key entry allows document batches to be queued to two index
operators sequentially. Only if both enter the same indexes is the document considered
finished and queued to the next process.
User-definable validation scripts catch both manual and OCR index
errors. For example, if a field is a Social Security number, validation rules can require
that every character be a digit, which catches cases where OCR has misread a "1" as
an "I". More sophisticated validation rules can go even further, requiring that
various fields on the document match each other (for example, matching telephone number
and address via a database lookup).
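Two of the checks described above, a digits-only validation rule and double key entry, can be sketched as follows. This is not the PaperlessOffice validation script interface, which this overview does not describe: `validate_ssn`, `double_key_mismatches`, and the assumed nine-digit layout with optional hyphens are hypothetical choices made for the example.

```python
import re

# Assumed layout: nine digits, with optional hyphens after the third and fifth.
SSN_PATTERN = re.compile(r"[0-9]{3}-?[0-9]{2}-?[0-9]{4}")


def validate_ssn(value):
    """Return True only when the value consists of digits in the assumed layout."""
    return SSN_PATTERN.fullmatch(value) is not None


def double_key_mismatches(first_entry, second_entry):
    """Return the sorted names of index fields on which two operators disagree."""
    fields = set(first_entry) | set(second_entry)
    return sorted(
        field for field in fields
        if first_entry.get(field) != second_entry.get(field)
    )
```

Under these assumptions, an OCR result such as `"I23-45-6789"` fails validation because of the letter "I", and a document would move to the next queue only when `double_key_mismatches` returns an empty list.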
5. Rescanning
No scanner is perfect, and rescanning is an integral
part of PaperlessOffice. Index operators or QA operators can easily tag documents or
individual pages for rescan, attaching notes that tell the scanner operator exactly what
the error is. The batch is then queued to a rescan station where the operator is prompted
for the specific pages or documents to be rescanned. PaperlessOffice automatically inserts
rescanned pages in the correct position within the batch.
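One way to picture the rescan step is as a replacement of tagged pages at their original positions. The sketch below assumes that each rescanned image replaces the tagged page at the same zero-based position in the batch; the function name `apply_rescans` and the dictionary shape are hypothetical.

```python
def apply_rescans(batch, rescans):
    """Return a new page list in which each tagged position holds its rescanned page.

    `batch` is the ordered list of pages; `rescans` maps a zero-based page
    position to the replacement page (both shapes are assumptions).
    """
    result = list(batch)  # leave the original batch untouched
    for position, new_page in rescans.items():
        if not 0 <= position < len(result):
            raise IndexError(f"no page at position {position} in this batch")
        result[position] = new_page
    return result
```

The page order of the batch is preserved, so downstream index and release stations see the corrected pages exactly where the faulty ones were.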
6. Release
The last stage in the capture process is to release the
documents in a batch either to long-term storage or to a workflow system. In the release
process, the image files are released to a file system (for example, a hierarchical
storage management system or a document manager like PC DOCS) and the indexes are written
to a database. PaperlessOffice is compatible with all SQL/ODBC-compliant databases. In
addition, PaperlessOffice allows users to write their own custom release scripts, either
to modify the standard release procedure or to release documents into a proprietary
back-end or non-ODBC database.
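The two halves of a release, copying the image file to storage and writing the indexes to a database, can be sketched as below. This does not show the PaperlessOffice release script interface, which is not described here. The function `release_document`, the `document_index` table, and its columns are hypothetical, and Python's built-in `sqlite3` module stands in for whatever database an installation actually uses.

```python
import shutil
import sqlite3
from pathlib import Path


def release_document(image_path, indexes, archive_dir, connection):
    """Copy one image into the archive directory and store its index fields."""
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    destination = archive_dir / Path(image_path).name
    shutil.copy2(image_path, destination)

    # Hypothetical schema: one row per index field of the released image.
    connection.execute(
        "CREATE TABLE IF NOT EXISTS document_index "
        "(image_path TEXT, field TEXT, value TEXT)"
    )
    connection.executemany(
        "INSERT INTO document_index VALUES (?, ?, ?)",
        [(str(destination), field, value) for field, value in indexes.items()],
    )
    connection.commit()
    return destination
```

A call such as `release_document("page.tif", {"invoice_no": "42"}, "archive", sqlite3.connect("index.db"))` would copy the image into the `archive` directory and record one index row for it.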