Document Processing Automation with iCdocs

About the Article

8.5.2020 Using the example of an intelligent system for digitization, OCR, classification, and reconciliation of accounting documents, we show how kt.team solves complex integration challenges. Reading time: 5 min.

"Paper shuffling" as a separate business process

One kt.team client, a large logistics company, handles thousands of shipments every day.

Each shipment is accompanied by a master agreement, goods invoice, goods and transport waybill (TTN), invoices, passes for driver-couriers, handover-acceptance acts...

The document package for a shipment is quite large and usually runs to multiple pages.

Most of these documents arrive on paper: the driver-courier brings them in and hands them over for validation to a dedicated accounting unit.

Documents are checked manually for completeness, for required attributes, and for whether they are filled out correctly.

After scanning, accounting staff manually enter into the Admin Tool the contract number that the document should be linked to, and manually "link" the individual pages of the document together.

The verified documents are sent to the accounting system (BluJay) and the electronic repository (Magellan).

In practice, dozens of employees spend their days shuffling paper around.

Thousands of person-hours are spent each month on repetitive work. The processing speed for each successive document batch inevitably drops - even experienced employees become less attentive over time. Human error is unavoidable.

Task: automate document validation

The client had long wanted to automate routine document tasks so employee time could be used more efficiently.
But it could not find an off-the-shelf software product or product suite on the market that would fit smoothly into the company's existing business process and IT infrastructure.
All existing off-the-shelf products could handle only one or two steps of a complex business process, which would not solve the problem.
After analyzing the situation, we proposed that the client develop a custom solution for the validation and digitization of paper documents.
Our team studied the business process and identified the stages that could be automated: scanning, determining the document type, checking document completeness, assembling the document package, integration with the accounting system, and integration with the electronic document repository.
Our solution was not to build a separate product for each stage, but to combine them into a single product - iCdocs (short for intelligent compiler of documents).

A solution for a complex task

The most difficult stage to automate was determining the document type.
To solve this problem, we tested several hypotheses.
The first hypothesis was image processing.
We planned to train a neural network on a specific set of patterns that correspond to document forms.
By comparing a scan of a specific document with reference patterns stored in memory, the neural network was supposed to determine the document type and the counterparty named in it.
Practice showed that this was a poor approach.
For many documents, such as waybills, there is no single widely accepted format.
The number of fields, the relative placement of elements, and the way required fields are filled in all differ from one counterparty to another.
Even long training that required significant system resources would not deliver an acceptable result, and identifying each document would take longer than manual processing.
Such a solution would not have been cost-effective from the customer's business perspective.
So instead of images, we decided to work with text.
Regardless of the format used by the counterparty, the goods and transport waybill always contains the document title, the TTN number, shipment and contract numbers, and other text information that makes correct processing possible. To determine document types, iCdocs combines a random forest machine learning model with metric-based vector analysis of word positions.
This approach proved to be more effective.
By analyzing the presence of the "right" words, we reached nearly 78% recognition accuracy right from the start, and iCdocs could identify the document type on its own, so the operator only had to confirm the results.
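To illustrate the idea of classifying by text rather than by page layout, here is a minimal sketch. It is not the iCdocs implementation: the article names a random forest and vector analysis of word positions, while this simplified stand-in counts keywords and compares them to per-type averages using cosine similarity. The vocabulary, training texts, and function names are all hypothetical.

```python
import math
import re
from collections import Counter

# Hypothetical keyword vocabulary; a real system would derive it from labeled scans.
VOCABULARY = ["waybill", "invoice", "act", "contract",
              "shipment", "acceptance", "payment", "carrier"]

def vectorize(text):
    """Turn OCR text into a vector of keyword counts over VOCABULARY."""
    words = Counter(re.findall(r"[a-z]+", text.lower()))
    return [words[term] for term in VOCABULARY]

def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def train_centroids(samples):
    """Average the keyword vectors of each document type."""
    grouped = {}
    for text, label in samples:
        grouped.setdefault(label, []).append(vectorize(text))
    return {label: [sum(col) / len(vecs) for col in zip(*vecs)]
            for label, vecs in grouped.items()}

def classify(text, centroids):
    """Return the best-matching document type and its similarity score."""
    vec = vectorize(text)
    scores = {label: cosine(vec, c) for label, c in centroids.items()}
    best = max(scores, key=scores.get)
    return best, scores[best]

# Hypothetical labeled examples standing in for recognized document text.
TRAINING = [
    ("Goods and transport waybill No 15, carrier, shipment 778", "waybill"),
    ("Waybill for shipment 12 under contract 4, carrier details", "waybill"),
    ("Invoice No 301, payment due under contract 4", "invoice"),
    ("Invoice for payment, contract 9", "invoice"),
    ("Handover-acceptance act for shipment 12", "act"),
]

centroids = train_centroids(TRAINING)
label, score = classify("Transport waybill 99, carrier Alfa, shipment 5", centroids)
print(label, round(score, 2))
```

The returned score could be shown to the operator next to the proposed type, which matches the workflow described above: the system proposes and the operator confirms.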

iCdocs features: digitization and verification

In addition to document type recognition, we implemented other functions in iCdocs.

Digitizing paper documents. Before iCdocs was implemented, document scanning was done manually or semi-automatically. The operator started the scan, retrieved the resulting files from the scanner software, and processed them. We wrote a scanner driver that starts scanning a document batch and sends the scanned images to iCdocs. The operator only loads the paper documents into the scanner.

Completion verification. After identifying the document type, iCdocs checks whether it is filled out correctly: whether all required fields are completed and whether the information they contain meets the standard. To perform this function correctly, the system must be trained, so at first document completion verification is done with an operator involved. By confirming or rejecting the correctness of the fields and the document as a whole, the operator teaches iCdocs to recognize "correct" documents and send incorrect ones back for revision.
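A required-field check of this kind can be sketched as follows. This is a simplified illustration with fixed rules, whereas the system described above learns from operator feedback. The field names, number formats, and the `verify_fields` function are assumptions, not part of iCdocs.

```python
import re

# Hypothetical rules: required field name -> pattern its value must fully match.
FIELD_RULES = {
    "waybill": {
        "ttn_number": r"\d{1,10}",
        "contract_number": r"[A-Z]{2}-\d{4}",
        "shipment_number": r"\d{1,10}",
    },
}

def verify_fields(doc_type, fields):
    """Return a list of problems; an empty list means the document passes."""
    problems = []
    for name, pattern in FIELD_RULES.get(doc_type, {}).items():
        value = fields.get(name, "").strip()
        if not value:
            problems.append(f"{name}: missing")
        elif not re.fullmatch(pattern, value):
            problems.append(f"{name}: unexpected format {value!r}")
    return problems

ok = verify_fields("waybill", {
    "ttn_number": "1542",
    "contract_number": "KT-0042",
    "shipment_number": "778",
})
bad = verify_fields("waybill", {
    "ttn_number": "15A2",          # letters where digits are expected
    "contract_number": "KT-0042",  # shipment_number is absent
})
print(ok)
print(bad)
```

A non-empty result would be the signal to send the document back for revision or to ask the operator for a decision.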

iCdocs features: sorting, completeness, and export

Sorting documents by counterparties and batches. To sort documents by counterparties and batches, iCdocs was integrated with the standard BluJay accounting system. The system requests active contract and shipment numbers, compares them with the data from the document package, and links the package to the corresponding counterparty and contract.
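The matching step might look like the sketch below. The `ActiveShipment` record and the `link_package` function are hypothetical, and the list of active records is hard-coded here rather than requested from BluJay, whose API the article does not describe.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActiveShipment:
    """One active contract/shipment pair, as the accounting system might report it."""
    contract_number: str
    shipment_number: str
    counterparty: str

def link_package(package_numbers, active_shipments):
    """Return the single active record whose contract and shipment numbers
    both occur in the package, or None if there is no unambiguous match."""
    matches = [
        s for s in active_shipments
        if s.contract_number in package_numbers
        and s.shipment_number in package_numbers
    ]
    return matches[0] if len(matches) == 1 else None

# Hypothetical data: active records and the numbers recognized in one package.
active = [
    ActiveShipment("KT-0042", "778", "Alfa LLC"),
    ActiveShipment("KT-0043", "779", "Beta LLC"),
]
numbers_found = {"KT-0042", "778", "1542"}

linked = link_package(numbers_found, active)
print(linked.counterparty if linked else "needs operator review")
```

Returning `None` for zero or multiple matches is a design choice in this sketch: ambiguous packages go to a person instead of being linked by guesswork.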
Checking document package completeness. Document package completeness is checked within iCdocs. The system verifies that all required documents are present according to the list and compares the nominal and actual page counts in the documents. If a document is missing from the package or pages are missing from a document, iCdocs alerts the operator to the issue.
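As a rough illustration of the completeness check, assume a package is represented as a mapping from document type to a pair of nominal and actual page counts. The required-document list and the `check_completeness` function are hypothetical.

```python
# Hypothetical list of documents every package must contain.
REQUIRED_DOCUMENTS = ["waybill", "invoice", "acceptance_act"]

def check_completeness(package):
    """package maps document type -> (nominal_pages, actual_pages).
    Returns a list of issues to show to the operator; empty means complete."""
    issues = []
    for doc_type in REQUIRED_DOCUMENTS:
        if doc_type not in package:
            issues.append(f"{doc_type}: document missing")
            continue
        nominal, actual = package[doc_type]
        if actual < nominal:
            issues.append(f"{doc_type}: {nominal - actual} page(s) missing")
    return issues

# Example: the invoice lacks a page and the acceptance act is absent.
package = {"waybill": (3, 3), "invoice": (2, 1)}
issues = check_completeness(package)
for issue in issues:
    print(issue)
```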
Data export. After document package verification is complete, iCdocs automatically exports the data to the company's information systems: accounting, CRM, and document archives. By the time of transfer, the documents are already grouped into a package and associated with the counterparty, contract, and shipment.
Backup. In addition to the standard storage, iCdocs keeps backup copies of documents.
As a result, the entire document package workflow, from digitization to archiving, is implemented in one product.

Not Just Logistics

iCdocs functionality can be applied not only in logistics but also in other industries.
Let's look at a few cases where it will be useful.
The organization receives and processes a large number of paper documents every day. Incoming document packages must be promptly checked for accuracy, sorted, sent for processing, and stored in the archive.
Several contracts are in place with each partner. Each contract may have different terms, for example different carriers, payment methods, or payment approaches, and each request or deal must be matched to the appropriate contract.
Partners operate through several legal entities with different organizational forms, OKVED codes, and separate contracts for each entity.
The document package must be checked for completeness, matching contract number, and accurate field completion. The legal entity, signature, and legal details must match those specified in the contract.
At the same time, iCdocs is not, strictly speaking, a universal product.
The process of handling paper documents is organized differently in each company, with different roles and information systems involved.
The first step in integrating iCdocs is always to study the relevant business process and adapt the product to the customer's needs and infrastructure.
