# Trainer
```{warning}
Before running any code, ensure you are logged in to the Afnio backend (`afnio login`). See [Logging in to Afnio Backend](login) for details.
```
```{tip}
For full control over every training and evaluation step, use a manual optimization loop as shown in [Optimization Loop](optimization_loop). For most workflows, however, `Trainer` is the fastest and easiest way to train and evaluate agents in Afnio.
```
Afnio’s `Trainer` module provides a high-level interface for training, validating, and testing agents. It automates many aspects of the optimization loop, including experiment tracking, metric logging, checkpointing, and cost monitoring. If you want to get started quickly and focus on agent design rather than boilerplate training code, `Trainer` is the recommended approach.
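At a glance, a full run with `Trainer` boils down to a handful of calls. The sketch below assumes `agent`, the data loaders, and `llm_clients` have already been defined and a run has been initialized, exactly as in the complete example later on this page:
```python
from afnio.trainer import Trainer

# Create the trainer, then train/validate and finally test the agent
trainer = Trainer(max_epochs=5, enable_agent_summary=True)
trainer.fit(
    agent=agent,
    train_dataloader=train_dataloader,
    val_dataloader=val_dataloader,
    llm_clients=llm_clients,
)
trainer.test(agent=agent, test_dataloader=test_dataloader, llm_clients=llm_clients)
```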
---
## Why Use Trainer?
- **Automatic Experiment Tracking:** Metrics (loss, accuracy, etc.) are logged to [Tellurio Studio](https://platform.tellurio.ai/), giving you interactive plots and dashboards for each run.
- **LM Cost Tracking:** The Trainer automatically tracks and logs the cost of all language model (LM) calls, so you can monitor and optimize your budget.
- **Progress Bars and Summaries:** Built-in progress bars and agent summaries make it easy to follow training progress.
- **Checkpointing:** Trainer automatically saves agent checkpoints during training, allowing you to resume experiments, analyze results, or deploy the best-performing agent to production.
- **Less Boilerplate:** You don’t need to write custom loops for training, validation, or testing—just implement a few methods in your agent.
---
## Preparing Your Agent and Data
Before using the `Trainer` module, you should define your agent, dataset, and data loaders. See [Datasets and DataLoaders](datasets_and_dataloaders) and [Build the Agent or Workflow](build_agent_workflow) for details.
This example uses the same agent and dataset as in the [Optimization Loop](optimization_loop) page, but demonstrates training with the `Trainer` module for a streamlined workflow.
To work with `Trainer`, your agent (which should extend `cog.Module`) must implement the following methods:
| Method | Purpose | Input | Output/Return Value |
| ---------------------- | --------------------------------------- | ------------------------------------------------------------ | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- |
| `training_step`        | Defines logic for each training batch    | A batch `batch` of training data and its index `batch_idx`    | A dict with keys like `"loss"` and `"accuracy"`, where `"loss"` is a tuple `(score, explanation)` of `afnio.Variable` objects; alternatively, just the `(score, explanation)` tuple itself. |
| `validation_step` | Defines logic for each validation batch | A batch `batch` of validation data and its index `batch_idx` | Same format as `training_step` |
| `test_step` | Defines logic for each test batch | A batch `batch` of test data and its index `batch_idx` | Same format as `training_step` |
| `configure_optimizers` | Returns optimizer(s) for training | None | Optimizer instance(s), e.g., `afnio.optim.TGD` |
```{tip}
- The `batch` input is typically a tuple `(X, y)` or a dictionary, depending on your `DataLoader`.
- The `"loss"` output must be a tuple of two `afnio.Variable` objects: the numeric score and the explanation (used for gradients).
```
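The complete setup below downloads the dataset, builds the data loaders and model clients, and defines a `FacilitySupportAnalyzer` agent that implements all four methods: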
````python
import json
import os
import re
import afnio
import afnio.cognitive as cog
import afnio.cognitive.functional as F
import afnio.tellurio as te
from afnio.models.openai import AsyncOpenAI
from afnio.utils.data import DataLoader, WeightedRandomSampler
from afnio.utils.datasets import FacilitySupport
os.environ["OPENAI_API_KEY"] = "sk-..." # Replace with your actual key
def compute_sample_weights(data):
    with te.suppress_variable_notifications():
        labels = [y.data for _, (_, y, _) in data]
    counts = {label: labels.count(label) for label in set(labels)}
    total = len(data)
    return [total / counts[label] for label in labels]


training_data = FacilitySupport(split="train", root="data")
validation_data = FacilitySupport(split="val", root="data")
test_data = FacilitySupport(split="test", root="data")

weights = compute_sample_weights(training_data)
sampler = WeightedRandomSampler(
    weights, num_samples=len(training_data), replacement=True
)

BATCH_SIZE = 33
train_dataloader = DataLoader(training_data, sampler=sampler, batch_size=BATCH_SIZE)
val_dataloader = DataLoader(validation_data, batch_size=BATCH_SIZE, seed=42)
test_dataloader = DataLoader(test_data, batch_size=BATCH_SIZE, seed=42)

SENTIMENT_RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "strict": True,
        "name": "sentiment_response_schema",
        "schema": {
            "type": "object",
            "properties": {
                "sentiment": {
                    "type": "string",
                    "enum": ["positive", "neutral", "negative"],
                },
            },
            "additionalProperties": False,
            "required": ["sentiment"],
        },
    },
}

afnio.set_backward_model_client(
    "openai/gpt-5",
    completion_args={
        "temperature": 1.0,
        "max_completion_tokens": 32000,
        "reasoning_effort": "low",
    },
)

fw_model_client = AsyncOpenAI()
optim_model_client = AsyncOpenAI()


class FacilitySupportAnalyzer(cog.Module):
    def __init__(self):
        super().__init__()
        self.sentiment_task = cog.Parameter(
            data="Read the provided message and determine the sentiment.",
            role="system prompt for sentiment classification",
            requires_grad=True,
        )
        self.sentiment_user = afnio.Variable(
            data="**Message:**\n\n{message}\n\n",
            role="input template to sentiment classifier",
        )
        self.sentiment_classifier = cog.ChatCompletion()

    def forward(self, fwd_model, inputs, **completion_args):
        sentiment_messages = [
            {"role": "system", "content": [self.sentiment_task]},
            {"role": "user", "content": [self.sentiment_user]},
        ]
        return self.sentiment_classifier(
            fwd_model,
            sentiment_messages,
            inputs=inputs,
            response_format=SENTIMENT_RESPONSE_FORMAT,
            **completion_args,
        )

    def training_step(self, batch, batch_idx):
        X, y = batch
        _, gold_sentiment, _ = y
        pred_sentiment = self(
            fw_model_client,
            inputs={"message": X},
            model="gpt-4.1-nano",
            temperature=0.0,
        )
        pred_sentiment.data = [
            json.loads(re.sub(r"^```json\n|\n```$", "", item))["sentiment"].lower()
            for item in pred_sentiment.data
        ]
        loss = F.exact_match_evaluator(pred_sentiment, gold_sentiment)
        return {"loss": loss, "accuracy": loss[0].data / len(gold_sentiment.data)}

    def validation_step(self, batch, batch_idx):
        return self.training_step(batch, batch_idx)

    def test_step(self, batch, batch_idx):
        return self.validation_step(batch, batch_idx)

    def configure_optimizers(self):
        constraints = [
            afnio.Variable(
                data="The improved variable must never include or reference the characters `{` or `}`. Do not output them, mention them, or describe them in any way.",
                role="optimizer constraint",
            )
        ]
        optimizer = afnio.optim.TGD(
            self.parameters(),
            model_client=optim_model_client,
            constraints=constraints,
            momentum=3,
            model="gpt-5",
            temperature=1.0,
            max_completion_tokens=32000,
            reasoning_effort="low",
        )
        return optimizer


agent = FacilitySupportAnalyzer()
````
_Output:_
```output
INFO : API key provided and stored securely in local keyring.
INFO : Currently logged in as 'username' to 'http://localhost'. Use `afnio login --relogin` to force relogin.
INFO : Project with slug 'my-project' already exists in namespace 'username'.
Downloading https://raw.githubusercontent.com/meta-llama/llama-prompt-ops/refs/heads/main/use-cases/facility-support-analyzer/dataset.json to data/FacilitySupport/raw/dataset.json
Downloading ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 383.7/383.7 kB 1.1 MB/s 0:00:00
Using downloaded and verified file: data/FacilitySupport/raw/dataset.json
Using downloaded and verified file: data/FacilitySupport/raw/dataset.json
```
---
## Example: Training the Facility Support Sentiment Agent with Trainer
Below, we use the same agent and dataset as in the [Optimization Loop](optimization_loop) page, but this time we train using the `Trainer` module.
### 1. Define the Agent
````python
import json
import re
import afnio
import afnio.cognitive as cog
import afnio.cognitive.functional as F
from afnio.models.openai import AsyncOpenAI
class FacilitySupportAnalyzer(cog.Module):
    def __init__(self):
        super().__init__()
        self.sentiment_task = cog.Parameter(
            data="Read the provided message and determine the sentiment.",
            role="system prompt for sentiment classification",
            requires_grad=True,
        )
        self.sentiment_user = afnio.Variable(
            data="**Message:**\n\n{message}\n\n",
            role="input template to sentiment classifier",
        )
        self.sentiment_classifier = cog.ChatCompletion()

    def forward(self, fwd_model, inputs, **completion_args):
        sentiment_messages = [
            {"role": "system", "content": [self.sentiment_task]},
            {"role": "user", "content": [self.sentiment_user]},
        ]
        return self.sentiment_classifier(
            fwd_model,
            sentiment_messages,
            inputs=inputs,
            response_format={
                "type": "json_schema",
                "json_schema": {
                    "strict": True,
                    "name": "sentiment_response_schema",
                    "schema": {
                        "type": "object",
                        "properties": {
                            "sentiment": {
                                "type": "string",
                                "enum": ["positive", "neutral", "negative"],
                            },
                        },
                        "additionalProperties": False,
                        "required": ["sentiment"],
                    },
                },
            },
            **completion_args,
        )

    def training_step(self, batch, batch_idx):
        X, y = batch
        _, gold_sentiment, _ = y
        pred_sentiment = self(
            fw_model_client,
            inputs={"message": X},
            model="gpt-4.1-nano",
            temperature=0.0,
        )
        pred_sentiment.data = [
            json.loads(re.sub(r"^```json\n|\n```$", "", item))["sentiment"].lower()
            for item in pred_sentiment.data
        ]
        loss = F.exact_match_evaluator(pred_sentiment, gold_sentiment)
        return {"loss": loss, "accuracy": loss[0].data / len(gold_sentiment.data)}

    def validation_step(self, batch, batch_idx):
        return self.training_step(batch, batch_idx)

    def test_step(self, batch, batch_idx):
        return self.validation_step(batch, batch_idx)

    def configure_optimizers(self):
        constraints = [
            afnio.Variable(
                data="The improved variable must never include or reference the characters `{` or `}`. Do not output them, mention them, or describe them in any way.",
                role="optimizer constraint",
            )
        ]
        optimizer = afnio.optim.TGD(
            self.parameters(),
            model_client=optim_model_client,
            constraints=constraints,
            momentum=3,
            model="gpt-5",
            temperature=1.0,
            max_completion_tokens=32000,
            reasoning_effort="low",
        )
        return optimizer
````
### 2. Prepare Data and Model Clients
```python
from afnio.models.openai import AsyncOpenAI
from afnio.utils.data import DataLoader, WeightedRandomSampler
from afnio.utils.datasets import FacilitySupport
BATCH_SIZE = 33
training_data = FacilitySupport(split="train", root="data")
validation_data = FacilitySupport(split="val", root="data")
test_data = FacilitySupport(split="test", root="data")
def compute_sample_weights(data):
    labels = [y.data for _, (_, y, _) in data]
    counts = {label: labels.count(label) for label in set(labels)}
    total = len(data)
    return [total / counts[label] for label in labels]
weights = compute_sample_weights(training_data)
sampler = WeightedRandomSampler(weights, num_samples=len(training_data), replacement=True)
train_dataloader = DataLoader(training_data, sampler=sampler, batch_size=BATCH_SIZE)
val_dataloader = DataLoader(validation_data, batch_size=BATCH_SIZE, seed=42)
test_dataloader = DataLoader(test_data, batch_size=BATCH_SIZE, seed=42)
fw_model_client = AsyncOpenAI()
optim_model_client = AsyncOpenAI()
```
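Step 3 retrieves the backward model client with `afnio.get_backward_model_client()`, so it must already be configured. If you have not run the preparation code earlier on this page, set it up first:
```python
import afnio

# Configure the backward (gradient) model client used during optimization
afnio.set_backward_model_client(
    "openai/gpt-5",
    completion_args={
        "temperature": 1.0,
        "max_completion_tokens": 32000,
        "reasoning_effort": "low",
    },
)
```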
### 3. Train and Evaluate with Trainer
```python
import afnio
import afnio.tellurio as te
from afnio.trainer import Trainer
agent = FacilitySupportAnalyzer()
trainer = Trainer(max_epochs=5, enable_agent_summary=True)
llm_clients = [fw_model_client, afnio.get_backward_model_client(), optim_model_client]
# Log in and initialize your experiment run
te.login(api_key="YOUR_TELLURIO_API_KEY")
run = te.init("your-username", "Facility Support")
# Test baseline performance
trainer.test(agent=agent, test_dataloader=test_dataloader, llm_clients=llm_clients)
# Train and validate
trainer.fit(
    agent=agent,
    train_dataloader=train_dataloader,
    val_dataloader=val_dataloader,
    llm_clients=llm_clients,
)
# Test the trained agent
trainer.test(agent=agent, test_dataloader=test_dataloader, llm_clients=llm_clients)
run.finish()
```
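After `run.finish()` is called, the metrics and LM costs logged during the run remain available in [Tellurio Studio](https://platform.tellurio.ai/) for inspection and comparison across experiments, and the agent checkpoints saved by the `Trainer` can be reused later; see [Save, Load and Use Agent](save_load_use_agent) for details.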
## Key Differences from Manual Optimization Loop
- **Less Code:** You don’t need to write explicit loops for training, validation, or testing.
- **Automatic Logging:** All metrics and LM costs are logged to Tellurio Studio, giving you interactive plots and experiment tracking out of the box.
- **Checkpointing:** Trainer saves checkpoints automatically, so you can resume or analyze experiments later.
- **Progress Bar:** Trainer provides a rich progress bar and agent summary for each run.
- **Cost Tracking:** LM usage and cost are tracked and logged automatically.
If you want full control over every step, you can still use a manual optimization loop as shown in [Optimization Loop](optimization_loop). For most workflows, however, `Trainer` is the fastest and easiest way to train and evaluate agents in Afnio.
## Further Reading
- [Optimization Loop](optimization_loop)
- [Save, Load and Use Agent](save_load_use_agent)
- [Runs and Experiments](runs_and_experiments)