afnio.models#
- class afnio.models.ChatCompletionModel(provider=None, **kwargs)[source]#
Bases: BaseModel
An abstraction for a language model that accepts a prompt composed of an array of messages containing instructions for the model. Each message can have a different role, influencing how the model interprets the input.
- async achat(messages, **kwargs)[source]#
Asynchronous method to handle chat-based interactions with the model.
- Parameters:
messages (List[Dict[str, str]]) – A list of messages, where each message is represented as a dictionary with “role” (e.g., “user”, “system”) and “content” (the text of the message).
**kwargs –
Additional parameters to configure the model's behavior during chat completion. This may include options such as:
- model (str): The model to use (e.g., "gpt-4o").
- temperature (float): Amount of randomness injected into the response.
- max_completion_tokens (int): Maximum number of tokens to generate.
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A string containing the model's response to the chat messages.
- Return type:
str
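The async variant can be awaited directly or driven with asyncio.run(). A minimal sketch of the calling pattern, using a hypothetical stub in place of a real provider-backed ChatCompletionModel subclass:

```python
import asyncio

# Hypothetical stand-in for a ChatCompletionModel subclass, used here only
# so the achat() calling pattern can be shown end to end without a provider.
class StubChatModel:
    async def achat(self, messages, **kwargs):
        await asyncio.sleep(0)  # simulate network latency
        return f"echo: {messages[-1]['content']}"

async def main():
    model = StubChatModel()
    return await model.achat([{"role": "user", "content": "ping"}])

print(asyncio.run(main()))  # echo: ping
```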
- chat(messages, **kwargs)[source]#
Synchronous method to handle chat-based interactions with the model.
- Parameters:
messages (List[Dict[str, str]]) – A list of messages, where each message is represented as a dictionary with “role” (e.g., “user”, “system”) and “content” (the text of the message).
**kwargs –
Additional parameters to configure the model's behavior during chat completion. This may include options such as:
- model (str): The model to use (e.g., "gpt-4o").
- temperature (float): Amount of randomness injected into the response.
- max_completion_tokens (int): Maximum number of tokens to generate.
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A string containing the model's response to the chat messages.
- Return type:
str
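The messages argument is plain Python data. A sketch of building a valid transcript (the model name and sampling settings in the commented call are illustrative assumptions):

```python
# Build a chat transcript in the format chat()/achat() expect:
# a list of {"role": ..., "content": ...} dictionaries.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize HTTP caching in one sentence."},
]

# Every message carries exactly these two keys.
assert all(set(m) == {"role", "content"} for m in messages)

# The call itself requires a configured provider, so it is shown as a comment:
# reply = model.chat(messages, model="gpt-4o", temperature=0.2)
# reply is a plain string with the model's answer
```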
- clear_usage()#
Clears the token usage statistics.
This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.
- get_config()#
Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.
- Returns:
A dictionary containing the model's configuration parameters.
- Return type:
dict
- get_provider()#
Returns the model provider name.
- get_usage()#
Retrieves the current token usage statistics and cost (in USD).
- Returns:
A dictionary containing cumulative token usage statistics since the model instance was initialized.
- Return type:
dict
Example
>>> model.get_usage()
{'prompt_tokens': 1500, 'completion_tokens': 1200, 'total_tokens': 2700,
 'cost': {'amount': 12.00, 'currency': 'USD'}}
- update_usage(usage, model_name=None)#
Updates the internal token usage statistics and cost.
Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.
- Behavior:
If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.
If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.
- Parameters:
- usage (dict) – A dictionary of token usage statistics, in the format returned by the model provider.
- model_name (str, optional) – The name of the model whose pricing should be used to compute cost. Defaults to None.
- Raises:
NotImplementedError – If called on the base class without an implementation.
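A subclass might aggregate usage along these lines. This is a minimal sketch, not the library's implementation; the class name and pricing table are hypothetical, and the usage keys mirror the get_usage() example above:

```python
# Hypothetical per-token prices, keyed by model name (illustrative values).
PRICING_PER_TOKEN = {"gpt-4o": {"prompt": 2.5e-06, "completion": 1e-05}}

class MyProviderModel:
    def __init__(self):
        self.usage = {
            "prompt_tokens": 0,
            "completion_tokens": 0,
            "total_tokens": 0,
            "cost": {"amount": 0.0, "currency": "USD"},
        }

    def update_usage(self, usage, model_name=None):
        # Aggregate token counts across calls.
        for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
            self.usage[key] += usage.get(key, 0)
        if model_name is not None:
            # Derive cost from token counts and the model's pricing.
            price = PRICING_PER_TOKEN[model_name]
            self.usage["cost"]["amount"] += (
                usage.get("prompt_tokens", 0) * price["prompt"]
                + usage.get("completion_tokens", 0) * price["completion"]
            )
        elif "cost" in usage:
            # No model name: copy cost directly, e.g. when restoring a checkpoint.
            self.usage["cost"]["amount"] = usage["cost"]["amount"]

model = MyProviderModel()
model.update_usage(
    {"prompt_tokens": 1000, "completion_tokens": 500, "total_tokens": 1500},
    model_name="gpt-4o",
)
print(model.usage["total_tokens"])  # 1500
```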
- class afnio.models.EmbeddingModel(provider=None, **kwargs)[source]#
Bases: BaseModel
An abstraction for a model that generates embeddings for input texts.
- async aembed(input, **kwargs)[source]#
Asynchronous method to generate embeddings for the given input texts.
- Parameters:
input (List[str]) – A list of input strings for which embeddings should be generated.
**kwargs –
Additional parameters to configure the model's behavior during embedding generation. This may include options such as:
- model (str): The embedding model to use (e.g., "text-embedding-3-small").
- dimensions (int): The number of dimensions for the output embeddings (where supported).
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A list of embeddings, where each embedding is represented as a list of floats corresponding to the input strings.
- Return type:
List[List[float]]
- clear_usage()#
Clears the token usage statistics.
This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.
- embed(input, **kwargs)[source]#
Synchronous method to generate embeddings for the given input texts.
- Parameters:
input (List[str]) – A list of input strings for which embeddings should be generated.
**kwargs –
Additional parameters to configure the model's behavior during embedding generation. This may include options such as:
- model (str): The embedding model to use (e.g., "text-embedding-3-small").
- dimensions (int): The number of dimensions for the output embeddings (where supported).
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A list of embeddings, where each embedding is represented as a list of floats corresponding to the input strings.
- Return type:
List[List[float]]
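The contract is List[str] in, List[List[float]] out. A sketch using a hypothetical stub subclass whose "embeddings" are trivial two-dimensional features, just to make the shapes concrete:

```python
# Hypothetical stand-in for an EmbeddingModel subclass, used only to show
# the input/output shapes of embed(): List[str] in, List[List[float]] out.
class StubEmbeddingModel:
    def embed(self, input, **kwargs):
        # Trivial 2-dim "embedding": string length and vowel count.
        return [
            [float(len(text)), float(sum(c in "aeiou" for c in text))]
            for text in input
        ]

model = StubEmbeddingModel()
vectors = model.embed(["hello", "world"])
print(vectors)  # [[5.0, 2.0], [5.0, 1.0]]
```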
- get_config()#
Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.
- Returns:
A dictionary containing the model's configuration parameters.
- Return type:
dict
- get_provider()#
Returns the model provider name.
- get_usage()#
Retrieves the current token usage statistics and cost (in USD).
- Returns:
A dictionary containing cumulative token usage statistics since the model instance was initialized.
- Return type:
dict
Example
>>> model.get_usage()
{'prompt_tokens': 1500, 'completion_tokens': 1200, 'total_tokens': 2700,
 'cost': {'amount': 12.00, 'currency': 'USD'}}
- update_usage(usage, model_name=None)#
Updates the internal token usage statistics and cost.
Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.
- Behavior:
If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.
If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.
- Parameters:
- usage (dict) – A dictionary of token usage statistics, in the format returned by the model provider.
- model_name (str, optional) – The name of the model whose pricing should be used to compute cost. Defaults to None.
- Raises:
NotImplementedError – If called on the base class without an implementation.
- class afnio.models.TextCompletionModel(provider=None, **kwargs)[source]#
Bases: BaseModel
An abstraction for a language model that accepts a prompt composed of a single text input and generates a textual completion.
- async acomplete(prompt, **kwargs)[source]#
Asynchronous method to generate a completion for the given prompt.
- Parameters:
prompt (str) – The input text for which the model should generate a completion.
**kwargs –
Additional parameters to configure the model's behavior during text completion. This may include options such as:
- model (str): The model to use (e.g., "gpt-4o").
- temperature (float): Amount of randomness injected into the response.
- max_completion_tokens (int): Maximum number of tokens to generate.
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A string containing the generated completion.
- Return type:
str
- clear_usage()#
Clears the token usage statistics.
This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.
- complete(prompt, **kwargs)[source]#
Synchronous method to generate a completion for the given prompt.
- Parameters:
prompt (str) – The input text for which the model should generate a completion.
**kwargs –
Additional parameters to configure the model's behavior during text completion. This may include options such as:
- model (str): The model to use (e.g., "gpt-4o").
- temperature (float): Amount of randomness injected into the response.
- max_completion_tokens (int): Maximum number of tokens to generate.
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A string containing the generated completion.
- Return type:
str
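A sketch of the single-prompt interface, using a hypothetical stub in place of a provider-backed TextCompletionModel subclass; unlike chat(), complete() takes one string rather than a list of role-tagged messages:

```python
# Hypothetical stand-in for a TextCompletionModel subclass, showing the
# single-string prompt interface of complete().
class StubCompletionModel:
    def complete(self, prompt, **kwargs):
        return prompt + " ... (completion)"

model = StubCompletionModel()
print(model.complete("Once upon a time"))  # Once upon a time ... (completion)
```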
- get_config()#
Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.
- Returns:
A dictionary containing the model's configuration parameters.
- Return type:
dict
- get_provider()#
Returns the model provider name.
- get_usage()#
Retrieves the current token usage statistics and cost (in USD).
- Returns:
A dictionary containing cumulative token usage statistics since the model instance was initialized.
- Return type:
dict
Example
>>> model.get_usage()
{'prompt_tokens': 1500, 'completion_tokens': 1200, 'total_tokens': 2700,
 'cost': {'amount': 12.00, 'currency': 'USD'}}
- update_usage(usage, model_name=None)#
Updates the internal token usage statistics and cost.
Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.
- Behavior:
If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.
If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.
- Parameters:
- usage (dict) – A dictionary of token usage statistics, in the format returned by the model provider.
- model_name (str, optional) – The name of the model whose pricing should be used to compute cost. Defaults to None.
- Raises:
NotImplementedError – If called on the base class without an implementation.
Modules