How we automated composition recognition from packaging by barcode

How we automated reading compositions from packaging, extracting the data, and preparing files for PIM and the catalog.

2026-01-29

Discuss a Project

Our clients

FSK Group

SMLT

Tochno

Sber City

Relief Center

Askona

Snezhnaia Koroleva

Muztorg

TVOE

Yandex

Lenta

International perfume and cosmetics brand

RAEC

L'Etoile

Key takeaways

How we automated reading compositions from packaging, extracting the data, and preparing files for PIM and the catalog.
Delivered by KT.Team. The CIS source page carries the full project story, metrics and interface screenshots.

Client

A large distributor of imported goods with a product range of hundreds of SKUs on the market.

The situation at the start of the project

Most brands in the client's portfolio had no ready digital data on product composition. The only source of information was the digital packaging artwork.

The challenge

Since 2024, there has been a requirement to upload product composition data to the national catalog. The digitization process was done manually, which significantly slowed the work down and led to errors.

Business Outcome

The composition digitization process became almost fully automated. Now it is enough to upload an image — the system does the rest: it extracts the text, finds the barcode, links the composition to a specific product, and builds structured data.

The finished files can be used immediately to upload into a PIM system or any other system holding product data, and then transferred to the national catalog. Manual work is reduced to verifying the result — this cuts time spent and lowers the risk of errors.

Comment from the KT.Team manager

"When we started, we had no comparable cases — this was the first project where we needed not just to recognize barcodes and text, but to precisely isolate the relevant areas on packaging, often positioned at an angle or in different projections. This required the team to deeply work through the approach and tune every stage of the pipeline.

We built the recognition service to fit the client's business. When new types of packaging come to market, the service can be extended to support them. As a result, our client is ready to expand its product range, while the team is free to focus on higher-priority tasks."

Review a similar project with an architect

clients@kt.team Telegram @kt_team_it

Business request at the start

Automate composition recognition across hundreds of packages to eliminate labor-intensive manual processing and speed up data preparation for upload to the national registry.

Development time

6 months from the first discussions and only 3 months from the start of development to launching a working system.

Result

The composition recognition service cut data processing time several times over: instead of 30 minutes per package, the system now automatically processes up to 10 images in 2 minutes. Recognition accuracy is 80–95%.

How it works

The service runs automatically on a schedule and goes through the following steps:

If any image could not be recognized, this is also recorded in the report — the user immediately sees which files require manual review.

The finished report can be uploaded into any master system — using the barcode, the data is correctly matched to the right product.

checks the uploaded PDF files and finds the product composition block (the "Ingredients" section);
recognizes the text and extracts the composition information;
finds the barcode on the packaging to precisely link the composition to the product;
builds a report in which each barcode is matched with its recognized composition.

Technology stack

The recognition service is built as a distributed solution using modern computer vision tools and architectural approaches:

Composition recognition and extraction use a multi-stage image processing pipeline that combines both classic computer vision algorithms and OCR tools:

Go development language for high performance.
Microservice architecture with asynchronous interaction through the RabbitMQ message queue: the service receives tasks, processes them, and returns the results back to the queue.
Integration with SMB/FTP for uploading PDF documents.
Two separate modules for working with files: one tracks new uploads, the other builds the final Excel reports.
Conversion: converting PDFs into images.
Contour detection: outlining object boundaries with OpenCV.
Contour hierarchy: detecting nested areas (for example, on packaging).
Clustering: grouping elements with DBSCAN to isolate text blocks.
Filtering: selecting image regions that are highly likely to contain text and a barcode.
Text recognition: using Tesseract OCR.
Composition extraction: searching for keywords (for example, "ingredients") allowing for distortions, normalizing text, and, when needed, inverting colors and rotating the image.
Barcode recognition: using the ZXing library or searching for a 13-digit sequence in the OCR results.

What happened next

The system is already running in production and helps the client process composition data quickly. To date, it has successfully recognized more than 500 packages. Improvements and updates are added as new products prepare to enter the market.

To enable automatic data transfer to the national catalog, the PIM system includes a tool that generates Excel templates tailored to each HS code (TN VED). The forms are filled automatically with data from PIM, accounting for transformations and mapping to reference tables. The user only needs to download the finished file and upload it to their national catalog account — but that is another story.

How we automated composition recognition from packaging by barcode

Clients and partners

Key takeaways

Client

The situation at the start of the project

The challenge

Business Outcome

Comment from the KT.Team manager

Review a similar project with an architect

Business request at the start

Development time

Result

Technology stack

What happened next

Related articles

Related Cases

Enabled Polaris to easily launch new products on marketplaces and update product information in just a few clicks

Built a supplier portal for the Fix Price chain and automated work with product data

Explore a similar task: How we automated…

Continue on the topic

Related Cases

Talend ESB for pharma integrations

Supplier portal for Fix Price

Brandquad PIM for Campari Rus

Related solutions

Related videos

News on the topic