AI tools
LLM & Security Gateway: one gateway to models
Companies connect dozens of LLMs but lose control over cost, reliability and personal data leaking into external models.
PII protection is not an output filter but a two-way contour: obfuscation replaces personal data with stable tokens before the model and de-obfuscation substitutes the original values into the response, so the model and the provider's logs see only anonymized text.
AI layer
An AI assistant must take action, not just reply with text
Core thesis of the AI block: a pilot with measurable impact, private data under control, agent actions logged, quality passes evals before scaling.
Assistant ≠ chatbot
A chatbot answers; an assistant checks the regulations, queries systems, records the deviation and proposes the next step.
Control plane
Agent registry, owner, permissions, memory, evals, trace logs, kill-switch and budget at the enterprise-layer level.
Data
RAG returns an answer with a source citation; LLM Gateway obfuscates personal data before the model and restores it after the response.
Industry solutions
What you can do with LLM & Security Gateway
Capabilities
LLM & Security Gateway capabilities
PII obfuscation before sending to the model
Names, phones, addresses and card numbers are replaced with stable tokens before sending; the model and its logs never get real customer data, removing GDPR fine risk
De-obfuscation in the response
Tokens in the model's response are restored to original values via a session map — the user sees real data that never left the perimeter
Model routing by price and quality
Cheap requests go to light models, complex ones to frontier; costs drop 45-85% with no noticeable quality loss
Budgets and limits per team and project
Token limits per key, team and project stop overspend before the provider bill — an end to surprises of 40% over forecast
Fallback and load balancing across providers
If one model fails or slows down, traffic shifts automatically to a backup — the AI service does not go down with the provider
Observability and cost attribution
Every request is logged with model, tokens, latency and cost via OpenTelemetry — you see who spends how much and where quality degrades
Semantic cache
Repeated requests are served from cache (40%+ hit rate in production) — lower bill and latency with no loss of answer quality
Unified key and access management
Provider keys are stored in the gateway; teams get virtual keys with instant revocation — secrets never leak into application code
Guardrails on input and output
Prompt injection detection and content filtering before and after the model reduce leak risk and toxic responses in production
Approach
How we implement LLM & Security Gateway
Minimal core modification
We don't fork or patch the LLM & Security Gateway core. LLM & Security Gateway stays on the standard, upgradable version — we move business logic into separate microservices alongside it, so platform updates don't break your customizations.
International Standards, Not Homegrown Hacks
Where a mature international solution exists, we use it instead of inventing our own protocol or platform. Before writing code, we study how the problem is already solved in the industry.
Transferability
The solution is loosely coupled and documented: it can be handed over between teams and contractors without rewriting. You are not tied to us.
AI compatibility
LLM & Security Gateway in the AI landscape
OpenAI-compatible API
A single OpenAI Chat Completions endpoint — apps switch to the gateway by changing the base URL, with no client code rewrite
Multi-provider support
OpenAI, Anthropic, Google, open-source and local models behind one interface; switching a model is a route config change, not a code change
OpenTelemetry and Prometheus
Gateway telemetry lands next to application telemetry in Datadog, Grafana and Splunk without custom adapters
Integration with RAG and agents
PII obfuscation works at the level of each call, so it protects both single-step requests and multi-step agent chains and RAG pipelines
Self-hosted deployment
The gateway and obfuscation map are deployed in the client's perimeter — PII and session mappings never leave the company boundary
Projects
Cases
AI ingredient recognition by barcode
- Processing sped up from 30 minutes to 2 per batch of 10 images
- Composition recognition accuracy is 80–95%
OSNO-VA: AI accountant
- LLM Gateway: implementation and integration.
AI analytics of the real estate market
- LLM Gateway: implementation and integration.
Contacts
Let's Discuss Your Project
Leave your current contact details and describe your task. We will come back with clarifying questions and a proposal for the next step.


