afnio.models.model#

Classes

BaseModel([provider, config, usage])

An abstraction for a model.

ChatCompletionModel([provider])

An abstraction for a language model that accepts a prompt composed of an array of messages containing instructions for the model.

EmbeddingModel([provider])

An abstraction for a model that generates embeddings for input texts.

TextCompletionModel([provider])

An abstraction for a language model that accepts a prompt composed of a single text input and generates a textual completion.

class afnio.models.model.BaseModel(provider=None, config=None, usage=None)[source]#

Bases: ABC

An abstraction for a model.

clear_usage()[source]#

Clears the token usage statistics.

This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.
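The reset described above (zero the numbers, keep the shape) can be sketched as a recursive walk over the usage dictionary. This is a standalone illustration of the documented behavior, not afnio's actual implementation:

```python
def clear_nested_usage(usage):
    """Reset every numeric value in a (possibly nested) dict to zero,
    preserving the dictionary structure and any non-numeric values."""
    for key, value in usage.items():
        if isinstance(value, dict):
            clear_nested_usage(value)  # recurse into nested dicts like "cost"
        elif isinstance(value, (int, float)):
            usage[key] = 0
    return usage

usage = {
    "prompt_tokens": 1500,
    "completion_tokens": 1200,
    "total_tokens": 2700,
    "cost": {"amount": 12.00, "currency": "USD"},
}
clear_nested_usage(usage)
# All counters and the nested cost amount become 0;
# the currency string is left untouched.
```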

get_config()[source]#

Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.

Returns:

A dictionary containing the model’s configuration parameters.

Return type:

dict

get_provider()[source]#

Returns the model provider name.

get_usage()[source]#

Retrieves the current token usage statistics and cost (in USD).

Returns:

A dictionary containing cumulative token usage statistics since the model instance was initialized.

Return type:

Dict[str, Any]

Example

>>> model.get_usage()
{
    'prompt_tokens': 1500,
    'completion_tokens': 1200,
    'total_tokens': 2700,
    'cost': {'amount': 12.00, 'currency': 'USD'}
}

update_usage(usage, model_name=None)[source]#

Updates the internal token usage statistics and cost.

Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.

Behavior:
  • If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.

  • If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.

Parameters:
  • usage (Dict[str, Any]) – A dictionary containing token usage metrics, such as prompt_tokens, completion_tokens, and total_tokens, and optionally a nested cost entry.

  • model_name (str, optional) – The name of the model for which the usage is being updated. If None, cost is copied from usage if available.

Raises:

NotImplementedError – If called on the base class without an implementation.
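Both branches described above can be sketched in a minimal stand-in class. The pricing table, its units, and the aggregation arithmetic below are illustrative assumptions, not afnio's actual logic:

```python
from typing import Dict, Optional

# Hypothetical pricing table: USD per 1M tokens (values are made up).
PRICING = {"example-model": {"prompt": 2.50, "completion": 10.00}}

class UsageTracker:
    """Minimal stand-in for BaseModel's usage bookkeeping."""

    def __init__(self):
        self.usage = {
            "prompt_tokens": 0,
            "completion_tokens": 0,
            "total_tokens": 0,
            "cost": {"amount": 0.0, "currency": "USD"},
        }

    def update_usage(self, usage: Dict, model_name: Optional[str] = None) -> None:
        # Always aggregate the raw token counters.
        for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
            self.usage[key] += usage.get(key, 0)
        if model_name is not None:
            # Branch 1: derive cost from token counts and model pricing.
            price = PRICING[model_name]
            self.usage["cost"]["amount"] += (
                usage.get("prompt_tokens", 0) * price["prompt"]
                + usage.get("completion_tokens", 0) * price["completion"]
            ) / 1_000_000
        elif "cost" in usage:
            # Branch 2: restoring from a checkpoint, copy the cost verbatim.
            self.usage["cost"] = dict(usage["cost"])
```

A real subclass would additionally normalize provider-specific usage payloads (OpenAI and Anthropic report different field names) before aggregating.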

class afnio.models.model.ChatCompletionModel(provider=None, **kwargs)[source]#

Bases: BaseModel

An abstraction for a language model that accepts a prompt composed of an array of messages containing instructions for the model. Each message can have a different role, influencing how the model interprets the input.

async achat(messages, **kwargs)[source]#

Asynchronous method to handle chat-based interactions with the model.

Parameters:
  • messages (List[Dict[str, str]]) – A list of messages, where each message is represented as a dictionary with “role” (e.g., “user”, “system”) and “content” (the text of the message).

  • **kwargs

    Additional parameters to configure the model’s behavior during chat completion. This may include options such as:
      • model (str): The model to use (e.g., “gpt-4o”).
      • temperature (float): Amount of randomness injected into the response.
      • max_completion_tokens (int): Maximum number of tokens to generate.

    For a complete list of supported parameters for each model, refer to the respective API documentation.

Returns:

A string containing the model’s response to the chat messages.

Return type:

str
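Calling the asynchronous variant follows the usual asyncio pattern. The class below is a hypothetical echo stand-in with the same signature, not a real afnio model:

```python
import asyncio
from typing import Dict, List

class AsyncEchoChatModel:
    """Hypothetical stand-in mirroring achat()'s signature."""

    async def achat(self, messages: List[Dict[str, str]], **kwargs) -> str:
        # A real implementation would await the provider's HTTP client here.
        return messages[-1]["content"].upper()

async def main() -> str:
    model = AsyncEchoChatModel()
    return await model.achat([{"role": "user", "content": "ping"}])

reply = asyncio.run(main())
```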

chat(messages, **kwargs)[source]#

Synchronous method to handle chat-based interactions with the model.

Parameters:
  • messages (List[Dict[str, str]]) – A list of messages, where each message is represented as a dictionary with “role” (e.g., “user”, “system”) and “content” (the text of the message).

  • **kwargs

    Additional parameters to configure the model’s behavior during chat completion. This may include options such as:
      • model (str): The model to use (e.g., “gpt-4o”).
      • temperature (float): Amount of randomness injected into the response.
      • max_completion_tokens (int): Maximum number of tokens to generate.

    For a complete list of supported parameters for each model, refer to the respective API documentation.

Returns:

A string containing the model’s response to the chat messages.

Return type:

str
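The message structure (a list of role/content dictionaries) can be illustrated with a toy synchronous stand-in. The echo behavior below is hypothetical and only mirrors the documented signature:

```python
from typing import Dict, List

class EchoChatModel:
    """Hypothetical stand-in mirroring chat()'s signature."""

    def chat(self, messages: List[Dict[str, str]], **kwargs) -> str:
        # A real model generates a completion; this stub echoes the
        # last user message so the call shape is easy to see.
        last_user = [m for m in messages if m["role"] == "user"][-1]
        return f"You said: {last_user['content']}"

model = EchoChatModel()
reply = model.chat(
    [
        {"role": "system", "content": "You are a helpful assistant."},
        {"role": "user", "content": "Hello!"},
    ],
    temperature=0.2,  # provider-specific kwargs pass through unchanged
)
```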

clear_usage()#

Clears the token usage statistics.

This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.

get_config()#

Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.

Returns:

A dictionary containing the model’s configuration parameters.

Return type:

dict

get_provider()#

Returns the model provider name.

get_usage()#

Retrieves the current token usage statistics and cost (in USD).

Returns:

A dictionary containing cumulative token usage statistics since the model instance was initialized.

Return type:

Dict[str, Any]

Example

>>> model.get_usage()
{
    'prompt_tokens': 1500,
    'completion_tokens': 1200,
    'total_tokens': 2700,
    'cost': {'amount': 12.00, 'currency': 'USD'}
}

update_usage(usage, model_name=None)#

Updates the internal token usage statistics and cost.

Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.

Behavior:
  • If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.

  • If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.

Parameters:
  • usage (Dict[str, Any]) – A dictionary containing token usage metrics, such as prompt_tokens, completion_tokens, and total_tokens, and optionally a nested cost entry.

  • model_name (str, optional) – The name of the model for which the usage is being updated. If None, cost is copied from usage if available.

Raises:

NotImplementedError – If called on the base class without an implementation.

class afnio.models.model.EmbeddingModel(provider=None, **kwargs)[source]#

Bases: BaseModel

An abstraction for a model that generates embeddings for input texts.

async aembed(input, **kwargs)[source]#

Asynchronous method to generate embeddings for the given input texts.

Parameters:
  • input (List[str]) – A list of input strings for which embeddings should be generated.

  • **kwargs

    Additional parameters to configure the model’s behavior during embedding generation. This may include options such as:
      • model (str): The embedding model to use (e.g., “text-embedding-3-small”).
      • dimensions (int): The number of dimensions for the output embeddings, where the provider supports it.

    For a complete list of supported parameters for each model, refer to the respective API documentation.

Returns:

A list of embeddings, where each embedding is represented

as a list of floats corresponding to the input strings.

Return type:

List[List[float]]

clear_usage()#

Clears the token usage statistics.

This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.

embed(input, **kwargs)[source]#

Synchronous method to generate embeddings for the given input texts.

Parameters:
  • input (List[str]) – A list of input strings for which embeddings should be generated.

  • **kwargs

    Additional parameters to configure the model’s behavior during embedding generation. This may include options such as:
      • model (str): The embedding model to use (e.g., “text-embedding-3-small”).
      • dimensions (int): The number of dimensions for the output embeddings, where the provider supports it.

    For a complete list of supported parameters for each model, refer to the respective API documentation.

Returns:

A list of embeddings, where each embedding is represented

as a list of floats corresponding to the input strings.

Return type:

List[List[float]]
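The input/output shape (a list of strings in, one float vector per string out) can be shown with a toy deterministic stand-in. Real embedding models return high-dimensional vectors from a provider API; the two-component vector below is purely illustrative:

```python
from typing import List

class ToyEmbeddingModel:
    """Hypothetical stand-in mirroring embed()'s signature."""

    def embed(self, input: List[str], **kwargs) -> List[List[float]]:
        # One vector per input string: [length, vowel count] as floats.
        return [
            [float(len(text)), float(sum(c in "aeiou" for c in text.lower()))]
            for text in input
        ]

model = ToyEmbeddingModel()
vectors = model.embed(["hello", "world"])
# len(vectors) == len(input); each entry is a list of floats.
```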

get_config()#

Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.

Returns:

A dictionary containing the model’s configuration parameters.

Return type:

dict

get_provider()#

Returns the model provider name.

get_usage()#

Retrieves the current token usage statistics and cost (in USD).

Returns:

A dictionary containing cumulative token usage statistics since the model instance was initialized.

Return type:

Dict[str, Any]

Example

>>> model.get_usage()
{
    'prompt_tokens': 1500,
    'completion_tokens': 1200,
    'total_tokens': 2700,
    'cost': {'amount': 12.00, 'currency': 'USD'}
}

update_usage(usage, model_name=None)#

Updates the internal token usage statistics and cost.

Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.

Behavior:
  • If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.

  • If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.

Parameters:
  • usage (Dict[str, Any]) – A dictionary containing token usage metrics, such as prompt_tokens, completion_tokens, and total_tokens, and optionally a nested cost entry.

  • model_name (str, optional) – The name of the model for which the usage is being updated. If None, cost is copied from usage if available.

Raises:

NotImplementedError – If called on the base class without an implementation.

class afnio.models.model.TextCompletionModel(provider=None, **kwargs)[source]#

Bases: BaseModel

An abstraction for a language model that accepts a prompt composed of a single text input and generates a textual completion.

async acomplete(prompt, **kwargs)[source]#

Asynchronous method to generate a completion for the given prompt.

Parameters:
  • prompt (str) – The input text for which the model should generate a completion.

  • **kwargs

    Additional parameters to configure the model’s behavior during text completion. This may include options such as:
      • model (str): The model to use.
      • temperature (float): Amount of randomness injected into the response.
      • max_completion_tokens (int): Maximum number of tokens to generate.

    For a complete list of supported parameters for each model, refer to the respective API documentation.

Returns:

A string containing the generated completion.

Return type:

str

clear_usage()#

Clears the token usage statistics.

This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.

complete(prompt, **kwargs)[source]#

Synchronous method to generate a completion for the given prompt.

Parameters:
  • prompt (str) – The input text for which the model should generate a completion.

  • **kwargs

    Additional parameters to configure the model’s behavior during text completion. This may include options such as:
      • model (str): The model to use.
      • temperature (float): Amount of randomness injected into the response.
      • max_completion_tokens (int): Maximum number of tokens to generate.

    For a complete list of supported parameters for each model, refer to the respective API documentation.

Returns:

A string containing the generated completion.

Return type:

str
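The single-string prompt interface can be shown with a toy stand-in. The suffix it appends is arbitrary and only demonstrates the documented call shape:

```python
class ToyCompletionModel:
    """Hypothetical stand-in mirroring complete()'s signature."""

    def complete(self, prompt: str, **kwargs) -> str:
        # A real model would continue the prompt; this stub appends a marker.
        return prompt + " [completion]"

model = ToyCompletionModel()
text = model.complete("Once upon a time,", max_completion_tokens=32)
```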

get_config()#

Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.

Returns:

A dictionary containing the model’s configuration parameters.

Return type:

dict

get_provider()#

Returns the model provider name.

get_usage()#

Retrieves the current token usage statistics and cost (in USD).

Returns:

A dictionary containing cumulative token usage statistics since the model instance was initialized.

Return type:

Dict[str, Any]

Example

>>> model.get_usage()
{
    'prompt_tokens': 1500,
    'completion_tokens': 1200,
    'total_tokens': 2700,
    'cost': {'amount': 12.00, 'currency': 'USD'}
}

update_usage(usage, model_name=None)#

Updates the internal token usage statistics and cost.

Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.

Behavior:
  • If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.

  • If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.

Parameters:
  • usage (Dict[str, Any]) – A dictionary containing token usage metrics, such as prompt_tokens, completion_tokens, and total_tokens, and optionally a nested cost entry.

  • model_name (str, optional) – The name of the model for which the usage is being updated. If None, cost is copied from usage if available.

Raises:

NotImplementedError – If called on the base class without an implementation.