A practical guide to building agents
Introduction
Large language models (LLMs) are becoming increasingly capable of handling complex, multi-step tasks. Advances in reasoning, multimodality, and tool use have unlocked a new category of LLM-powered systems known as agents.
This guide is for product and engineering teams building their first agents. It distills practical recommendations and best practices: how to identify suitable use cases, how to design the agent's logic and orchestration, and how to keep its behavior safe and predictable.
After reading it, you'll have the foundational knowledge to start building agents with confidence.
What is an agent?
Conventional software helps users automate tasks; agents perform those tasks on the user's behalf.
An agent is a system that independently accomplishes workflows for the user with a high degree of autonomy.
Examples of workflows: resolving a customer issue, booking a restaurant, committing a code change to a repository, generating a report.
Important: if an LLM only handles individual steps without controlling the workflow itself (for example, a simple chatbot or a classifier), the application is not an agent.
Agent characteristics:
- It uses an LLM to make decisions and manage workflow execution. It recognizes when a workflow is complete, and it can correct its actions or hand control back to a human if it makes a mistake.
- It has access to tools (APIs, UI actions) and selects the appropriate ones based on the workflow's current state, always operating within defined guardrails.
When should you build an agent?
Building agents requires rethinking how your systems make decisions and handle complexity.
Unlike conventional automation, agents are most useful where rule-based systems fall short.
Example: a traditional system detects fraud by matching transactions against predefined rules, while an agent evaluates context, picks up on subtle or atypical signals, and reasons like an experienced investigator.
Ideal use cases for agents:
- Nuanced decisions: judgment calls amid ambiguity (for example, refund approvals in customer support).
- Complex rules: systems weighed down by large, unwieldy rulesets (for example, supplier verification).
- Unstructured data: interpreting text, dialogue, and documents (for example, processing insurance claims).
Fundamentals of agent design
An agent has three basic components:
- Model: the LLM that powers the agent's reasoning and decision-making.
- Tools: APIs, functions, and UI actions the agent can use.
- Instructions: a description of the agent's expected behavior and its constraints.
Example (Python):
# Assumes `from agents import Agent` and a get_weather function tool defined elsewhere.
weather_agent = Agent(
    name="Weather agent",
    instructions="You are a helpful agent who can talk to users about the weather.",
    tools=[get_weather],
)
Model selection
Different models suit different tasks; you don't always need the most capable one.
Approach:
- Build a prototype with the most capable model to establish a performance baseline.
- Then try swapping in smaller, less expensive models for individual tasks.
Principles:
- Set up evals and run them on every change.
- Reserve the more capable models for the logic that matters most.
- Optimize cost and latency by downsizing the model wherever results stay acceptable.
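For example, you can pin a different model to each agent. A minimal sketch, assuming the Agents SDK's model parameter (the model names are illustrative placeholders, not recommendations):
from agents import Agent

faq_agent = Agent(
    name="FAQ agent",
    instructions="Answer common questions using the provided help center context.",
    model="gpt-4o-mini",  # smaller, cheaper model for a narrow, well-evaluated task
)

triage_agent = Agent(
    name="Triage agent",
    instructions="Decide how to route the user's request.",
    model="gpt-4o",  # more capable model where decision quality matters most
)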
Defining tools
Tools are the APIs and functions an agent can call to get things done.
If a system exposes no API, the agent can fall back to UI interactions, operating the interface the way a human would.
Tool types:
- Data: retrieve the context a workflow needs (for example, query a database, read documents, search the web).
- Action: change state in other systems (for example, send an email, update a CRM record).
- Orchestration: agents themselves can serve as tools for other agents (see the manager pattern below).
Example:
from datetime import datetime

from agents import Agent, WebSearchTool, function_tool

@function_tool
def save_results(output):
    # Store the output with the current timestamp; db is assumed to be defined elsewhere
    db.insert({"output": output, "timestamp": datetime.now()})
    return "File saved"

search_agent = Agent(
    name="Search agent",
    instructions="Help the user search the internet and save results if asked.",
    tools=[WebSearchTool(), save_results],
)
Configure instructions
Clear instructions are the key to success: the more specific they are, the fewer errors the agent makes.
Best practices:
- Start from existing operating procedures and policy documents.
- Break tasks down into explicit step-by-step actions.
- Define a clear action and expected output for each step.
- Capture edge cases and branches (for example, what to do if the user doesn't provide required information).
An example of generating instructions:
“You're an expert at writing instructions for an LLM agent. Convert the following knowledge base document into step-by-step instructions in the form of a list. Make sure everything is clear and unambiguous.”
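As a sketch, you can run that prompt programmatically with the OpenAI API to bootstrap instructions from an existing document (a minimal example; the model name is a placeholder to replace based on your evals):
from openai import OpenAI

client = OpenAI()

META_PROMPT = (
    "You're an expert at writing instructions for an LLM agent. "
    "Convert the following knowledge base document into step-by-step "
    "instructions in the form of a list. Make sure everything is clear "
    "and unambiguous."
)

def generate_instructions(document: str) -> str:
    # Ask the model to turn a policy document into agent instructions
    response = client.chat.completions.create(
        model="gpt-4o",  # placeholder; pick a model based on your evals
        messages=[
            {"role": "system", "content": META_PROMPT},
            {"role": "user", "content": document},
        ],
    )
    return response.choices[0].message.content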
Orchestration
Orchestration is the structure through which an agent, or a group of agents, executes a workflow.
Variants:
- Single agent: one agent performs the entire workflow.
- Multiple agents: several agents split the responsibilities and hand work off to one another.
Single agent
A single agent manages all tools and logic itself, running in a loop until the task is complete.
Example (Python):
Agents.run(agent, [UserMessage("What's the capital of the USA?")])
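Under the hood, the runner keeps looping until an exit condition is met, such as a final answer or a maximum number of turns. A simplified sketch of that loop (call_llm and execute_tool are hypothetical helpers, not part of any SDK):
def run_agent(agent, messages, max_turns=10):
    for _ in range(max_turns):
        reply = call_llm(agent.instructions, messages, agent.tools)  # hypothetical helper
        if reply.tool_call:
            # The model requested a tool: execute it and feed the result back
            result = execute_tool(agent.tools, reply.tool_call)  # hypothetical helper
            messages.append({"role": "tool", "content": result})
        else:
            # No tool call: the model produced its final answer
            return reply.content
    raise RuntimeError("Max turns exceeded; consider escalating to a human")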
Prompt templates with variables help a single agent cover many scenarios:
"You are a call center agent talking to {{user_first_name}}. Their complaints are about {{user_complaint_categories}}..."
When to split into multiple agents:
- Complex logic (too many conditional branches in one prompt).
- Tool overload (too many, or too similar, tools for one agent to choose between reliably).
- Separation of concerns (search, generation, verification, and so on).
Multiple agents
Patterns:
- Manager: a central agent calls subordinate agents as tools.
- Decentralized: peer agents hand control off to one another.
Manager pattern
Example (Python):
manager_agent = Agent(
    name="Manager",
    instructions="If asked for multiple translations, call the relevant tools.",
    tools=[
        spanish_agent.as_tool(tool_name="translate_to_spanish", tool_description="..."),
        french_agent.as_tool(tool_name="translate_to_french", tool_description="..."),
        italian_agent.as_tool(tool_name="translate_to_italian", tool_description="..."),
    ],
)

await Runner.run(manager_agent, "Translate 'hello' to Spanish, French and Italian for me!")
Decentralized pattern
Example (Python):
triage_agent = Agent(
    name="Triage Agent",
    instructions="Assess the query and direct to appropriate agent.",
    handoffs=[technical_support_agent, sales_assistant_agent, order_management_agent],
)

await Runner.run(triage_agent, "Could you provide an update on the delivery?")
Protection mechanisms (Guardrails)
Guardrails are layered safeguards that keep an agent secure, private, and behaving correctly.
Protection types:
- Relevance classifier: flags and rejects off-topic queries.
- Safety classifier: detects jailbreaks and prompt injections.
- PII filter: checks outputs for leaked personal data.
- Moderation: filters harmful or toxic content.
- Tool risk assessment: high-risk tool calls require confirmation from a human.
- Rules-based filters and blocklists: regular expressions, length limits, known-bad phrases.
- Output validation: checks that responses match the brand's voice and tone.
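The rules-based protections in that list are often plain code rather than model calls. A minimal sketch (the limit, blocklist, and pattern below are illustrative assumptions):
import re

MAX_INPUT_LENGTH = 4000  # illustrative length limit
BLOCKLIST = {"ignore previous instructions"}  # illustrative known-bad phrase
SSN_PATTERN = re.compile(r"\b\d{3}-\d{2}-\d{4}\b")  # matches US SSN-like strings

def passes_rules_based_checks(text: str) -> bool:
    # Reject overlong inputs, blocklisted phrases, and obvious PII patterns
    if len(text) > MAX_INPUT_LENGTH:
        return False
    lowered = text.lower()
    if any(phrase in lowered for phrase in BLOCKLIST):
        return False
    if SSN_PATTERN.search(text):
        return False
    return True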
Example of a churn-detection guardrail (Python):
from pydantic import BaseModel
# Imports assume the OpenAI Agents SDK; exact names may vary by version.
from agents import (Agent, Guardrail, GuardrailFunctionOutput,
                    RunContextWrapper, Runner, input_guardrail)

class ChurnDetectionOutput(BaseModel):
    is_churn_risk: bool
    reasoning: str

# Agent that classifies whether a message signals churn risk
churn_detection_agent = Agent(
    name="Churn detection agent",
    instructions="Identify if the user message indicates a potential churn risk.",
    output_type=ChurnDetectionOutput,
)

@input_guardrail
async def churn_detection_tripwire(ctx: RunContextWrapper, agent: Agent, input: str):
    result = await Runner.run(churn_detection_agent, input, context=ctx.context)
    return GuardrailFunctionOutput(
        output_info=result.final_output,
        tripwire_triggered=result.final_output.is_churn_risk,
    )

customer_support_agent = Agent(
    name="Customer support agent",
    instructions="You help customers with their questions.",
    input_guardrails=[
        Guardrail(guardrail_function=churn_detection_tripwire),
    ],
)
Human intervention
Sometimes an agent can't resolve a situation on its own and needs to hand off to a human. This is especially important early on, when you're still collecting failure cases and improving the agent.
When to intervene:
- The retry limit has been exceeded (the agent keeps failing at the same step).
- The action carries high risk (for example, issuing a refund).
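A minimal sketch of both triggers (the threshold and tool names are illustrative assumptions):
MAX_ATTEMPTS = 3  # illustrative retry budget
HIGH_RISK_ACTIONS = {"issue_refund", "cancel_order"}  # illustrative tool names

def should_escalate_to_human(attempts: int, action: str) -> bool:
    # Hand off when the agent keeps failing or the action carries high risk
    return attempts >= MAX_ATTEMPTS or action in HIGH_RISK_ACTIONS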
Conclusion
Agents are a new stage of automation: systems that
- reason through ambiguity,
- act on the world through tools,
- and execute complex multi-step tasks with a high degree of autonomy.
Tips:
- Start with a single agent and scale up gradually.
- Keep it safe with guardrails.
- Choose models, tools, and instructions deliberately.
- Iterate: measure, learn, and improve.