afnio.models#
- class afnio.models.ChatCompletionModel(provider=None, **kwargs)[source]#
Bases: BaseModel
An abstraction for a language model that accepts a prompt composed of an array of messages containing instructions for the model. Each message can have a different role, influencing how the model interprets the input.
- async achat(messages, **kwargs)[source]#
Asynchronous method to handle chat-based interactions with the model.
- Parameters:
messages (List[Dict[str, str]]) – A list of messages, where each message is represented as a dictionary with “role” (e.g., “user”, “system”) and “content” (the text of the message).
**kwargs –
Additional parameters to configure the model's behavior during chat completion. This may include options such as:
- model (str): The model to use (e.g., "gpt-4o").
- temperature (float): Amount of randomness injected into the response.
- max_completion_tokens (int): Maximum number of tokens to generate.
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A string containing the model's response to the chat messages.
- Return type:
str
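The async variant can be awaited directly or driven with asyncio.run(). A minimal sketch of the calling pattern, using a hypothetical stub in place of a real provider-backed ChatCompletionModel subclass:

```python
import asyncio

# Hypothetical stand-in for a ChatCompletionModel subclass, used here only
# so the achat() calling pattern can be shown end to end without a provider.
class StubChatModel:
    async def achat(self, messages, **kwargs):
        await asyncio.sleep(0)  # simulate network latency
        return f"echo: {messages[-1]['content']}"

async def main():
    model = StubChatModel()
    return await model.achat([{"role": "user", "content": "ping"}])

print(asyncio.run(main()))  # echo: ping
```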
- chat(messages, **kwargs)[source]#
Synchronous method to handle chat-based interactions with the model.
- Parameters:
messages (List[Dict[str, str]]) – A list of messages, where each message is represented as a dictionary with “role” (e.g., “user”, “system”) and “content” (the text of the message).
**kwargs –
Additional parameters to configure the model's behavior during chat completion. This may include options such as:
- model (str): The model to use (e.g., "gpt-4o").
- temperature (float): Amount of randomness injected into the response.
- max_completion_tokens (int): Maximum number of tokens to generate.
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A string containing the model's response to the chat messages.
- Return type:
str
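The messages argument is plain Python data. A sketch of building a valid transcript (the model name and sampling settings in the commented call are illustrative assumptions):

```python
# Build a chat transcript in the format chat()/achat() expect:
# a list of {"role": ..., "content": ...} dictionaries.
messages = [
    {"role": "system", "content": "You are a concise assistant."},
    {"role": "user", "content": "Summarize HTTP caching in one sentence."},
]

# Every message carries exactly these two keys.
assert all(set(m) == {"role", "content"} for m in messages)

# The call itself requires a configured provider, so it is shown as a comment:
# reply = model.chat(messages, model="gpt-4o", temperature=0.2)
# reply is a plain string with the model's answer
```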
- clear_usage()#
Clears the token usage statistics.
This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.
- get_config()#
Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.
- Returns:
A dictionary containing the model's configuration parameters.
- Return type:
dict
- get_provider()#
Returns the model provider name.
- get_usage()#
Retrieves the current token usage statistics and cost (in USD).
- Returns:
A dictionary containing cumulative token usage statistics since the model instance was initialized.
- Return type:
dict
Example
>>> model.get_usage()
{'prompt_tokens': 1500, 'completion_tokens': 1200, 'total_tokens': 2700,
 'cost': {'amount': 12.00, 'currency': 'USD'}}
- update_usage(usage, model_name=None)#
Updates the internal token usage statistics and cost.
Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.
- Behavior:
If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.
If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.
- Parameters:
- usage (dict) – A dictionary of token usage statistics, in the format returned by the model provider.
- model_name (str, optional) – The name of the model whose pricing should be used to compute cost. Defaults to None.
- Raises:
NotImplementedError – If called on the base class without an implementation.
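A subclass might aggregate usage along these lines. This is a minimal sketch, not the library's implementation; the class name and pricing table are hypothetical, and the usage keys mirror the get_usage() example above:

```python
# Hypothetical per-token prices, keyed by model name (illustrative values).
PRICING_PER_TOKEN = {"gpt-4o": {"prompt": 2.5e-06, "completion": 1e-05}}

class MyProviderModel:
    def __init__(self):
        self.usage = {
            "prompt_tokens": 0,
            "completion_tokens": 0,
            "total_tokens": 0,
            "cost": {"amount": 0.0, "currency": "USD"},
        }

    def update_usage(self, usage, model_name=None):
        # Aggregate token counts across calls.
        for key in ("prompt_tokens", "completion_tokens", "total_tokens"):
            self.usage[key] += usage.get(key, 0)
        if model_name is not None:
            # Derive cost from token counts and the model's pricing.
            price = PRICING_PER_TOKEN[model_name]
            self.usage["cost"]["amount"] += (
                usage.get("prompt_tokens", 0) * price["prompt"]
                + usage.get("completion_tokens", 0) * price["completion"]
            )
        elif "cost" in usage:
            # No model name: copy cost directly, e.g. when restoring a checkpoint.
            self.usage["cost"]["amount"] = usage["cost"]["amount"]

model = MyProviderModel()
model.update_usage(
    {"prompt_tokens": 1000, "completion_tokens": 500, "total_tokens": 1500},
    model_name="gpt-4o",
)
print(model.usage["total_tokens"])  # 1500
```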
- class afnio.models.EmbeddingModel(provider=None, **kwargs)[source]#
Bases: BaseModel
An abstraction for a model that generates embeddings for input texts.
- async aembed(input, **kwargs)[source]#
Asynchronous method to generate embeddings for the given input texts.
- Parameters:
input (List[str]) – A list of input strings for which embeddings should be generated.
**kwargs –
Additional parameters to configure the model's behavior during embedding generation. This may include options such as:
- model (str): The embedding model to use (e.g., "text-embedding-3-small").
- dimensions (int): The number of dimensions for the output embeddings (where supported).
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A list of embeddings, where each embedding is represented as a list of floats corresponding to the input strings.
- Return type:
List[List[float]]
- clear_usage()#
Clears the token usage statistics.
This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.
- embed(input, **kwargs)[source]#
Synchronous method to generate embeddings for the given input texts.
- Parameters:
input (List[str]) – A list of input strings for which embeddings should be generated.
**kwargs –
Additional parameters to configure the model's behavior during embedding generation. This may include options such as:
- model (str): The embedding model to use (e.g., "text-embedding-3-small").
- dimensions (int): The number of dimensions for the output embeddings (where supported).
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A list of embeddings, where each embedding is represented as a list of floats corresponding to the input strings.
- Return type:
List[List[float]]
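The contract is List[str] in, List[List[float]] out. A sketch using a hypothetical stub subclass whose "embeddings" are trivial two-dimensional features, just to make the shapes concrete:

```python
# Hypothetical stand-in for an EmbeddingModel subclass, used only to show
# the input/output shapes of embed(): List[str] in, List[List[float]] out.
class StubEmbeddingModel:
    def embed(self, input, **kwargs):
        # Trivial 2-dim "embedding": string length and vowel count.
        return [
            [float(len(text)), float(sum(c in "aeiou" for c in text))]
            for text in input
        ]

model = StubEmbeddingModel()
vectors = model.embed(["hello", "world"])
print(vectors)  # [[5.0, 2.0], [5.0, 1.0]]
```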
- get_config()#
Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.
- Returns:
A dictionary containing the model's configuration parameters.
- Return type:
dict
- get_provider()#
Returns the model provider name.
- get_usage()#
Retrieves the current token usage statistics and cost (in USD).
- Returns:
A dictionary containing cumulative token usage statistics since the model instance was initialized.
- Return type:
dict
Example
>>> model.get_usage()
{'prompt_tokens': 1500, 'completion_tokens': 1200, 'total_tokens': 2700,
 'cost': {'amount': 12.00, 'currency': 'USD'}}
- update_usage(usage, model_name=None)#
Updates the internal token usage statistics and cost.
Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.
- Behavior:
If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.
If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.
- Parameters:
- usage (dict) – A dictionary of token usage statistics, in the format returned by the model provider.
- model_name (str, optional) – The name of the model whose pricing should be used to compute cost. Defaults to None.
- Raises:
NotImplementedError – If called on the base class without an implementation.
- class afnio.models.TextCompletionModel(provider=None, **kwargs)[source]#
Bases: BaseModel
An abstraction for a language model that accepts a prompt composed of a single text input and generates a textual completion.
- async acomplete(prompt, **kwargs)[source]#
Asynchronous method to generate a completion for the given prompt.
- Parameters:
prompt (str) – The input text for which the model should generate a completion.
**kwargs –
Additional parameters to configure the model's behavior during text completion. This may include options such as:
- model (str): The model to use (e.g., "gpt-4o").
- temperature (float): Amount of randomness injected into the response.
- max_completion_tokens (int): Maximum number of tokens to generate.
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A string containing the generated completion.
- Return type:
str
- clear_usage()#
Clears the token usage statistics.
This resets all numerical values in the usage dictionary to zero (including nested values), while preserving the dictionary structure.
- complete(prompt, **kwargs)[source]#
Synchronous method to generate a completion for the given prompt.
- Parameters:
prompt (str) – The input text for which the model should generate a completion.
**kwargs –
Additional parameters to configure the model's behavior during text completion. This may include options such as:
- model (str): The model to use (e.g., "gpt-4o").
- temperature (float): Amount of randomness injected into the response.
- max_completion_tokens (int): Maximum number of tokens to generate.
For a complete list of supported parameters for each model, refer to the respective API documentation.
- Returns:
A string containing the generated completion.
- Return type:
str
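A sketch of the single-prompt interface, using a hypothetical stub in place of a provider-backed TextCompletionModel subclass; unlike chat(), complete() takes one string rather than a list of role-tagged messages:

```python
# Hypothetical stand-in for a TextCompletionModel subclass, showing the
# single-string prompt interface of complete().
class StubCompletionModel:
    def complete(self, prompt, **kwargs):
        return prompt + " ... (completion)"

model = StubCompletionModel()
print(model.complete("Once upon a time"))  # Once upon a time ... (completion)
```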
- get_config()#
Returns the model configuration. This includes the model name, temperature, max tokens, and other parameters that are used to configure the model’s behavior.
- Returns:
A dictionary containing the model's configuration parameters.
- Return type:
dict
- get_provider()#
Returns the model provider name.
- get_usage()#
Retrieves the current token usage statistics and cost (in USD).
- Returns:
A dictionary containing cumulative token usage statistics since the model instance was initialized.
- Return type:
dict
Example
>>> model.get_usage()
{'prompt_tokens': 1500, 'completion_tokens': 1200, 'total_tokens': 2700,
 'cost': {'amount': 12.00, 'currency': 'USD'}}
- update_usage(usage, model_name=None)#
Updates the internal token usage statistics and cost.
Each model provider (e.g., OpenAI, Anthropic) may have a different usage format. This method should be implemented by subclasses to ensure correct parsing and aggregation of token usage.
- Behavior:
If model_name is provided, the method dynamically calculates and updates the cost based on the usage metrics and the pricing for the specified model.
If model_name is None, the method copies the cost value directly from the usage dictionary (if present), which is typically used when restoring state from a checkpoint.
- Parameters:
- usage (dict) – A dictionary of token usage statistics, in the format returned by the model provider.
- model_name (str, optional) – The name of the model whose pricing should be used to compute cost. Defaults to None.
- Raises:
NotImplementedError – If called on the base class without an implementation.
Modules