# Optimization Loop

```{warning}
Before running any code, ensure you are logged in to the Afnio backend (`afnio login`). See [Logging in to Afnio Backend](login) for details.
```

```{tip}
Afnio lets you build custom training, validation, and test loops for full control over your agent’s optimization process.
However, if you prefer a ready-made solution, Afnio provides the [`Trainer`](../../generated/afnio.trainer) class, which handles standard training, validation, and testing routines out of the box.
If you want to get started quickly, see [Trainer](trainer) for details and usage examples.
```

Training an AI agent or workflow is an iterative process: in each iteration, the agent makes a guess about the output, calculates the error in its guess (loss), collects feedback with respect to its parameters (see [Automatic Differentiation](automatic_differentiation)), and optimizes these parameters to improve future predictions. In Afnio, this means iteratively refining parameters—such as prompts, templates, or logic—based on feedback from evaluators or loss functions to better achieve your desired outcomes.

A typical optimization loop in Afnio consists of:

- **Forward Pass:** The agent or workflow processes input data and generates outputs.
- **Evaluation:** Outputs are compared to ground truth or assessed by evaluators, producing scores and semantic feedback (gradients).
- **Backward Pass:** Semantic feedback is backpropagated through the computational graph to accumulate gradients for learnable parameters.
- **Parameter Update:** The optimizer uses accumulated gradients to update parameters, improving the agent’s performance.
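
The four phases above can be sketched without any library at all. The toy below is only an illustration under stated assumptions: `forward`, `evaluate`, `backward`, and `step` are hypothetical stand-ins, not Afnio APIs, and the "optimizer" is a hard-coded rule rather than a language model rewriting the prompt.

```python
# Library-free toy of the four phases. All names here are hypothetical.

def forward(prompt, message):
    """Forward pass: a stand-in 'agent' that labels a message."""
    # The toy agent only spots complaints if its prompt mentions "negative".
    if "negative" in prompt and "broken" in message:
        return "negative"
    return "neutral"

def evaluate(pred, target):
    """Evaluation: return a numeric score and textual feedback."""
    if pred == target:
        return 1, "Prediction matches the target."
    return 0, f"Predicted '{pred}' but expected '{target}'."

def backward(feedback, grads):
    """Backward pass: accumulate textual feedback for the parameter."""
    grads.append(feedback)

def step(prompt, grads):
    """Parameter update: revise the prompt based on accumulated feedback."""
    if any("expected 'negative'" in g for g in grads):
        return prompt + " Label complaints as negative."
    return prompt

prompt = "Classify the sentiment."
data = [("The heater is broken again.", "negative")]
history = []

for epoch in range(2):
    for message, target in data:
        grads = []                                # fresh feedback each iteration
        pred = forward(prompt, message)           # forward pass
        score, feedback = evaluate(pred, target)  # evaluation
        backward(feedback, grads)                 # backward pass
        prompt = step(prompt, grads)              # parameter update
        history.append(score)

print(history)  # [0, 1]
print(prompt)
```

The score improves from 0 to 1 because the feedback from the first iteration changed the prompt used in the second.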

---

## Prerequisite Code

Before running the optimization loop, you should define your agent, dataset, and data loaders. See [Datasets and DataLoaders](datasets_and_dataloaders) and [Build the Agent or Workflow](build_agent_workflow) for details.

```python
import os

import afnio
import afnio.cognitive as cog
import afnio.tellurio as te
from afnio.models.openai import AsyncOpenAI
from afnio.utils.data import DataLoader, WeightedRandomSampler
from afnio.utils.datasets import FacilitySupport

os.environ["OPENAI_API_KEY"] = "sk-..."  # Replace with your actual key

def compute_sample_weights(data):
    with te.suppress_variable_notifications():
        labels = [y.data for _, (_, y, _) in data]
        counts = {label: labels.count(label) for label in set(labels)}
        total = len(data)
    return [total / counts[label] for label in labels]

training_data = FacilitySupport(split="train", root="data")
validation_data = FacilitySupport(split="val", root="data")
test_data = FacilitySupport(split="test", root="data")

weights = compute_sample_weights(training_data)
sampler = WeightedRandomSampler(
    weights, num_samples=len(training_data), replacement=True
)

BATCH_SIZE = 33
train_dataloader = DataLoader(training_data, sampler=sampler, batch_size=BATCH_SIZE)
val_dataloader = DataLoader(validation_data, batch_size=BATCH_SIZE, seed=42)
test_dataloader = DataLoader(test_data, batch_size=BATCH_SIZE, seed=42)

SENTIMENT_RESPONSE_FORMAT = {
    "type": "json_schema",
    "json_schema": {
        "strict": True,
        "name": "sentiment_response_schema",
        "schema": {
            "type": "object",
            "properties": {
                "sentiment": {
                    "type": "string",
                    "enum": ["positive", "neutral", "negative"],
                },
            },
            "additionalProperties": False,
            "required": ["sentiment"],
        },
    },
}

afnio.set_backward_model_client(
    "openai/gpt-5",
    completion_args={
        "temperature": 1.0,
        "max_completion_tokens": 32000,
        "reasoning_effort": "low",
    },
)
fw_model_client = AsyncOpenAI()
optim_model_client = AsyncOpenAI()

class FacilitySupportAnalyzer(cog.Module):

    def __init__(self):
        super().__init__()
        self.sentiment_task = cog.Parameter(
            data="Read the provided message and determine the sentiment.",
            role="system prompt for sentiment classification",
            requires_grad=True,
        )
        self.sentiment_user = afnio.Variable(
            data="**Message:**\n\n{message}\n\n",
            role="input template to sentiment classifier",
        )
        self.sentiment_classifier = cog.ChatCompletion()

    def forward(self, fwd_model, inputs, **completion_args):
        sentiment_messages = [
            {"role": "system", "content": [self.sentiment_task]},
            {"role": "user", "content": [self.sentiment_user]},
        ]
        return self.sentiment_classifier(
            fwd_model,
            sentiment_messages,
            inputs=inputs,
            response_format=SENTIMENT_RESPONSE_FORMAT,
            **completion_args,
        )

agent = FacilitySupportAnalyzer()
```

_Output:_

```output
INFO     : API key provided and stored securely in local keyring.
INFO     : Currently logged in as 'username' to 'http://localhost'. Use `afnio login --relogin` to force relogin.
INFO     : Project with slug 'my-project' already exists in namespace 'username'.
Downloading https://raw.githubusercontent.com/meta-llama/llama-prompt-ops/refs/heads/main/use-cases/facility-support-analyzer/dataset.json to data/FacilitySupport/raw/dataset.json
Downloading ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 100% 383.7/383.7 kB 1.1 MB/s 0:00:00

Using downloaded and verified file: data/FacilitySupport/raw/dataset.json

Using downloaded and verified file: data/FacilitySupport/raw/dataset.json

```
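
The `compute_sample_weights` helper in the prerequisite code gives each sample a weight of `total / count_of_its_class`. The sketch below reproduces that arithmetic on plain strings to show the effect. The label list is made up for illustration, and the reading that the sampler draws samples in proportion to these weights is an assumption based on the sampler's name, not something this page states.

```python
from collections import Counter

def inverse_frequency_weights(labels):
    """Weight each sample by total / (number of samples sharing its label)."""
    counts = Counter(labels)
    total = len(labels)
    return [total / counts[label] for label in labels]

# Hypothetical, imbalanced label list (not taken from the real dataset)
labels = ["positive", "positive", "positive", "negative", "neutral", "neutral"]
weights = inverse_frequency_weights(labels)
print(weights)  # [2.0, 2.0, 2.0, 6.0, 3.0, 3.0]

# Summing the weights per class shows every class carries the same total mass
class_mass = {}
for label, weight in zip(labels, weights):
    class_mass[label] = class_mass.get(label, 0.0) + weight
print(class_mass)  # {'positive': 6.0, 'negative': 6.0, 'neutral': 6.0}
```

Because every class ends up with the same total weight, a sampler that draws in proportion to these weights would pick each class about equally often, regardless of how rare it is in the data.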

---

## Hyperparameters

Hyperparameters are adjustable parameters that let you control the agent optimization process. Different hyperparameter values can impact agent training and convergence rates.

Common hyperparameters include:

- **Number of Epochs:** How many times the agent iterates over the entire dataset.
- **Batch Size:** Number of samples processed together before updating parameters.

**Example: Basic hyperparameter settings**

```python
MAX_EPOCHS = 5
BATCH_SIZE = 33  # same value used to build the data loaders in the prerequisite code
```
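
Together, these two values determine how many parameter updates a run performs. As a rough worked example, the sketch below uses the batch size of 33 from the prerequisite code and the 66 training samples reported in the training output later on this page. It assumes the last partial batch, if any, is kept rather than dropped.

```python
import math

MAX_EPOCHS = 5
BATCH_SIZE = 33
num_samples = 66  # training set size shown in the example output on this page

# Each epoch walks the whole dataset once, one batch at a time
batches_per_epoch = math.ceil(num_samples / BATCH_SIZE)

# One optimizer update per batch, repeated for every epoch
total_updates = MAX_EPOCHS * batches_per_epoch

print(batches_per_epoch)  # 2
print(total_updates)      # 10
```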

Other important hyperparameters in Afnio include:

- **Backward Engine Settings:** The language model (LM) and its parameters (such as temperature, max tokens, reasoning effort) passed to `set_backward_model_client`.
- **Optimizer Settings:** Parameters used by optimizers like `afnio.optim.TGD`, including constraints, momentum, and model selection.

**Example: Setting backward engine hyperparameters**

```python
afnio.set_backward_model_client(
    "openai/gpt-5",
    completion_args={
        "temperature": 1.0,
        "max_completion_tokens": 32000,
        "reasoning_effort": "low",
    },
)
```

**Example: Setting optimizer hyperparameters**

```python
optimizer = afnio.optim.TGD(
    agent.parameters(),
    model_client=AsyncOpenAI(),
    momentum=3,
    model="gpt-5",
    temperature=1.0,
    max_completion_tokens=32000,
    reasoning_effort="low",
)
```

You can quickly adjust these hyperparameters to experiment with and improve agent performance.

---

## Optimization Loop

Once your hyperparameters are set, you can train and optimize your agent using an optimization loop. Each iteration of this loop, a full pass over the training data followed by evaluation, is called an epoch.

Every epoch typically includes two main phases:

1. **Training Loop** – Iterate over the training dataset to update and improve the agent’s parameters.
2. **Validation/Test Loop** – Evaluate the agent on validation or test data to monitor performance and generalization on unseen data.

Below, we’ll introduce key concepts used in the training loop.  
If you prefer to see the complete workflow, you can jump ahead to the [End-to-End Training Workflow](#end-to-end-training-workflow) section.

### Loss Functions and Evaluators

In Afnio, **evaluators** serve as both loss functions and metrics for assessing your agent’s predictions. When you present training data to an untrained agent, its outputs may not match the desired targets. Evaluators measure how close the agent’s output is to the ground truth, providing both a numeric score and a semantic explanation (used as a gradient for optimization).

To compute the loss, you make a prediction using your agent and compare it to the true label using an evaluator. During training, you typically aim to maximize this score or minimize the error.

Common evaluators (used as loss functions) include:

- [`cog.ExactMatchEvaluator`](../../generated/afnio.cognitive.modules.exact_match_evaluator): for classification tasks.
- [`cog.DeterministicEvaluator`](../../generated/afnio.cognitive.modules.deterministic_evaluator): for custom deterministic criteria.
- [`cog.LMJudgeEvaluator`](../../generated/afnio.cognitive.modules.lm_judge_evaluator): for semantic or qualitative evaluation.

**Example: Initializing an evaluator for loss calculation**

```python
# Initialize the evaluator (used as a loss function)
loss_fn = cog.ExactMatchEvaluator()
```
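
To make the "score plus explanation" idea concrete, here is a minimal plain-Python sketch of what an exact-match comparison could return. The `exact_match` function is a hypothetical stand-in and not how `cog.ExactMatchEvaluator` is implemented. It assumes the score is the number of matching predictions in the batch, which is consistent with how the training loop on this page divides the score by the batch size to get accuracy.

```python
def exact_match(preds, targets):
    """Hypothetical stand-in: count exact matches and describe the misses."""
    pairs = list(zip(preds, targets))
    score = sum(pred == target for pred, target in pairs)
    misses = [
        f"sample {i}: predicted '{pred}', expected '{target}'"
        for i, (pred, target) in enumerate(pairs)
        if pred != target
    ]
    if misses:
        explanation = "Mismatches: " + "; ".join(misses)
    else:
        explanation = "All predictions match the targets."
    return score, explanation

score, explanation = exact_match(
    ["positive", "neutral", "negative"],
    ["positive", "negative", "negative"],
)
print(score)        # 2
print(explanation)  # Mismatches: sample 1: predicted 'neutral', expected 'negative'
```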

### Optimizer

Optimization is the process of updating agent parameters to minimize error and improve performance during training. In Afnio, optimization logic is encapsulated in the `optimizer` object, which manages how parameters are adjusted based on feedback.  
For example, [`afnio.optim.TGD`](../../generated/afnio.optim) uses [Textual Gradient Descent](https://arxiv.org/abs/2406.07496) to rewrite prompts using language model feedback.

To initialize the optimizer, you register the agent’s parameters to be trained and specify relevant hyperparameters and constraints.

**Example: Initializing an optimizer**

```python
# Initialize optimizer constraints
constraints = [
    afnio.Variable(
        data="The improved variable must never include or reference the characters `{` or `}`. Do not output them, mention them, or describe them in any way.",
        role="optimizer constraint",
    )
]

# Initialize the optimizer
optimizer = afnio.optim.TGD(
    agent.parameters(),
    model_client=optim_model_client,
    constraints=constraints,
    momentum=3,
    model="gpt-5",
    temperature=1.0,
    max_completion_tokens=32000,
    reasoning_effort="low",
)
```

During each training iteration, optimization involves three steps:

1. Call `optimizer.clear_grad()` to reset the textual gradients accumulated for the agent’s parameters. This prevents old feedback from biasing the next training iteration.
2. Call `loss_explanation.backward()` to backpropagate the loss explanation, which computes the gradients of the loss with respect to each parameter.
3. Call `optimizer.step()` to update the parameters using the newly collected gradients.

The training loop later on this page calls `clear_grad()` right after `step()` instead of before `backward()`. Because the loop repeats, the effect is the same.
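
Why clearing matters can be seen in a small library-free sketch. `ToyParameter`, `ToyOptimizer`, and `backward` below are hypothetical stand-ins that only borrow the method names from the steps above. The toy `step()` simply reports how many feedback items an update would consume.

```python
class ToyParameter:
    """Hypothetical stand-in for a learnable prompt (not an Afnio class)."""

    def __init__(self, data):
        self.data = data
        self.grads = []  # accumulated textual feedback

class ToyOptimizer:
    """Hypothetical optimizer that mirrors the two method names used above."""

    def __init__(self, params):
        self.params = list(params)

    def clear_grad(self):
        for param in self.params:
            param.grads = []

    def step(self):
        # Report how many feedback items each parameter's update would use
        return [len(param.grads) for param in self.params]

def backward(param, feedback):
    """Attach one piece of feedback to the parameter."""
    param.grads.append(feedback)

feedback_per_batch = ["handle sarcasm", "treat outages as negative"]

def run(clear):
    prompt = ToyParameter("Classify the sentiment.")
    optimizer = ToyOptimizer([prompt])
    used = []
    for feedback in feedback_per_batch:
        if clear:
            optimizer.clear_grad()
        backward(prompt, feedback)
        used.append(optimizer.step()[0])
    return used

with_clear = run(clear=True)
without_clear = run(clear=False)
print(with_clear)     # [1, 1]
print(without_clear)  # [1, 2]
```

Without the reset, the second update also sees the first batch's feedback, even though the parameter has already been revised in response to it.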

---

## End-to-End Training Workflow

We define a `train_loop` function that runs the optimization code over the training data, and a `test_loop` function that evaluates the agent’s performance against our test data.

### Training Loop

A typical training loop in Afnio looks like this:

````python
import json
import re

def train_loop(dataloader, agent, loss_fn, optimizer):
    size = len(dataloader.dataset)

    # Set the agent to training mode - important for some operations
    # Unnecessary in this situation but added for best practices
    agent.train()

    for batch, (X, y) in enumerate(dataloader):
        _, gold_sentiment, _ = y

        # Forward pass: agent processes input and generates output
        pred = agent(
            fw_model_client,
            inputs={"message": X},
            model="gpt-4.1-nano",
            temperature=0.0,
        )
        pred.data = [
            json.loads(re.sub(r"^```json\n|\n```$", "", item))["sentiment"].lower()
            for item in pred.data
        ]

        # Evaluation: compare prediction to ground truth
        loss_score, loss_explanation = loss_fn(pred, gold_sentiment)

        # Backward pass: propagate feedback
        loss_explanation.backward()

        # Update parameters using optimizer
        optimizer.step()

        # Reset gradients for next iteration
        optimizer.clear_grad()

        # Print loss and accuracy
        batch_len = len(X.data)
        current = batch * BATCH_SIZE + batch_len
        accuracy = loss_score.data / batch_len
        print(
            f"loss: {loss_score.data:>7f} - "
            f"accuracy: {accuracy:>7f}  [{current:>5d}/{size:>5d}]"
        )
````
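
The post-processing of `pred.data` in the loop above strips an optional Markdown code fence around the model's JSON reply before parsing it. The standalone sketch below isolates that step using only the standard library. The `extract_sentiment` helper and the sample strings are illustrative, and the fence is built programmatically only to keep this block easy to read.

```python
import json
import re

FENCE = "`" * 3  # three backticks
# Same pattern as in the training loop: a leading fence line or a trailing fence
PATTERN = rf"^{FENCE}json\n|\n{FENCE}$"

def extract_sentiment(raw):
    """Remove an optional JSON code fence, parse, and normalize the label."""
    cleaned = re.sub(PATTERN, "", raw)
    return json.loads(cleaned)["sentiment"].lower()

# Hypothetical model replies: one wrapped in a fence, one plain
wrapped = f'{FENCE}json\n{{"sentiment": "Negative"}}\n{FENCE}'
plain = '{"sentiment": "neutral"}'

print(extract_sentiment(wrapped))  # negative
print(extract_sentiment(plain))    # neutral
```

Lowercasing the label keeps the comparison with the ground truth from failing on capitalization alone.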

---

### Validation and Testing

After each epoch, you can validate and test your agent to monitor performance:

````python
def test_loop(dataloader, agent, loss_fn):
    size = len(dataloader.dataset)
    num_batches = len(dataloader)
    tot_loss, correct = 0, 0

    # Set the agent to evaluation mode - important for some operations
    # Unnecessary in this situation but added for best practices
    agent.eval()

    # Disable gradient computation during evaluation with afnio.no_grad()
    # to save memory and speed up inference
    with afnio.no_grad():
        for X, y in dataloader:
            _, gold_sentiment, _ = y

            # Forward pass: agent generates predictions for the test set
            pred = agent(
                fw_model_client,
                inputs={"message": X},
                model="gpt-4.1-nano",
                temperature=0.0,
            )
            pred.data = [
                json.loads(re.sub(r"^```json\n|\n```$", "", item))[
                    "sentiment"
                ].lower()
                for item in pred.data
            ]

            # Evaluate predictions against ground truth labels
            loss_score, _ = loss_fn(pred, gold_sentiment)

            # Accumulate loss and correct predictions
            # (the batch score doubles as the number of correct predictions)
            tot_loss += loss_score.data
            correct += loss_score.data

    # Print average loss and accuracy
    tot_loss /= num_batches
    accuracy = (correct / size) * 100
    print(
        f"Test Error: \n Accuracy: {(accuracy):>0.1f}%, "
        f"Avg loss: {tot_loss:>8f} \n"
    )
````
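
The two printed metrics come from the same running total but use different denominators: accuracy divides by the number of samples, while the average loss divides by the number of batches. The sketch below works through that arithmetic with hypothetical per-batch scores. The scores, the 68 test samples, and the 3 batches are assumptions, chosen only so that the totals agree with the Epoch 1 figures in the output further down this page.

```python
# Hypothetical per-batch scores (correct predictions per batch)
batch_scores = [22, 20, 4]
size = 68                      # assumed number of test samples
num_batches = len(batch_scores)

total = sum(batch_scores)      # running total accumulated over all batches

avg_loss = total / num_batches   # average score per batch
accuracy = total / size * 100    # share of correct predictions, in percent

print(f"Accuracy: {accuracy:>0.1f}%, Avg loss: {avg_loss:>8f}")
# Accuracy: 67.6%, Avg loss: 15.333333
```

Read this way, "Avg loss" is the average number of correct predictions per batch, so a higher value is better.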

### End-to-End Training & Evaluation

Below is a full example showing how to combine the training and testing loops for agent optimization in Afnio:

```{tip}
For a simpler way to run training and testing loops, track more metrics, and monitor granular LM costs, see the [Trainer](trainer) page. The Trainer class automates these routines and provides additional features for experiment tracking.
```

```python
loss_fn = cog.ExactMatchEvaluator()
constraints = [
    afnio.Variable(
        data="The improved variable must never include or reference the characters `{` or `}`. Do not output them, mention them, or describe them in any way.",
        role="optimizer constraint",
    )
]
optimizer = afnio.optim.TGD(
    agent.parameters(),
    model_client=optim_model_client,
    constraints=constraints,
    momentum=3,
    model="gpt-5",
    temperature=1.0,
    max_completion_tokens=32000,
    reasoning_effort="low",
)

epochs = 5
with te.init("username", "my-project"):  # replace "username" with your Tellurio Studio username (slug format)
    for t in range(epochs):
        print(f"Epoch {t+1}\n-------------------------------")
        train_loop(train_dataloader, agent, loss_fn, optimizer)
        test_loop(test_dataloader, agent, loss_fn)
    print("Done!")
```

_Output:_

```output
Epoch 1
-------------------------------
loss: 22.000000 - accuracy: 0.666667  [   33/   66]
loss: 23.000000 - accuracy: 0.696970  [   66/   66]
Test Error:
 Accuracy: 67.6%, Avg loss: 15.333333

Epoch 2
-------------------------------
loss: 16.000000 - accuracy: 0.484848  [   33/   66]
loss: 21.000000 - accuracy: 0.636364  [   66/   66]
Test Error:
 Accuracy: 79.4%, Avg loss: 18.000000

Epoch 3
-------------------------------
loss: 22.000000 - accuracy: 0.666667  [   33/   66]
loss: 23.000000 - accuracy: 0.696970  [   66/   66]
Test Error:
 Accuracy: 69.1%, Avg loss: 15.666667

Epoch 4
-------------------------------
loss: 25.000000 - accuracy: 0.757576  [   33/   66]
loss: 23.000000 - accuracy: 0.696970  [   66/   66]
Test Error:
 Accuracy: 76.5%, Avg loss: 17.333333

Epoch 5
-------------------------------
loss: 26.000000 - accuracy: 0.787879  [   33/   66]
loss: 21.000000 - accuracy: 0.636364  [   66/   66]
Test Error:
 Accuracy: 72.1%, Avg loss: 16.333333

Done!
```

---

## Further Reading

- [Trainer](trainer)
- [Save, Load and Use Agent](save_load_use_agent)
- [Runs and Experiments](runs_and_experiments)