afnio.optim.optimizer#
Classes
- Optimizer(params, defaults) – Base class for all optimizers.
- class afnio.optim.optimizer.Optimizer(params, defaults)[source]#
Bases: object
Base class for all optimizers.
Warning
Parameters need to be specified as collections that have a deterministic ordering that is consistent between runs. Examples of objects that don’t satisfy those properties are sets and iterators over values of dictionaries.
- Parameters:
params (iterable) – An iterable of afnio.Variables or dicts. Specifies what Variables should be optimized.
defaults (Dict[str, Any]) – A dict containing default values of optimization options (used when a parameter group doesn't specify them).
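The constructor pattern described above (normalizing a flat iterable of parameters into param groups and filling in defaults) can be sketched in plain Python. SketchOptimizer and all names below are illustrative assumptions, not afnio's actual implementation:

```python
class SketchOptimizer:
    """Minimal sketch of the Optimizer base-class pattern (hypothetical)."""

    def __init__(self, params, defaults):
        self.defaults = defaults
        self.param_groups = []
        params = list(params)
        if len(params) == 0:
            raise ValueError("optimizer got an empty parameter list")
        # Accept either a flat iterable of parameters or a list of dicts.
        if not isinstance(params[0], dict):
            params = [{"params": params}]
        for group in params:
            self.add_param_group(group)

    def add_param_group(self, group):
        # Fill in any optimization option the group does not override.
        for name, default in self.defaults.items():
            group.setdefault(name, default)
        self.param_groups.append(group)


# Strings stand in for afnio.Variables here.
opt = SketchOptimizer(["sys_prompt", "user_prompt"], defaults={"momentum": 2})
```

Note that a list (rather than a set or dict-values iterator) keeps the deterministic ordering the warning above requires.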
- add_param_group(param_group)[source]#
Add a param group to the Optimizer's param_groups.
This can be useful when fine-tuning a pre-trained network, as frozen layers can be made trainable and added to the Optimizer as training progresses.
- Parameters:
param_group (dict) – Specifies what Variables should be optimized along with group specific optimization options.
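The fine-tuning flow above can be sketched as follows; the dict-based helper and the group names are hypothetical stand-ins, not afnio API:

```python
def add_param_group(param_groups, group, defaults):
    # Group-specific options win; missing options fall back to defaults.
    for name, default in defaults.items():
        group.setdefault(name, default)
    param_groups.append(group)


defaults = {"momentum": 2}
# Initially only the head prompt is trainable.
param_groups = [{"params": ["head_prompt"], "momentum": 2}]

# Later in training: unfreeze another prompt with its own options.
add_param_group(param_groups,
                {"params": ["backbone_prompt"], "momentum": 1},
                defaults)
```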
- clear_grad()[source]#
Resets the gradients of all optimized Variables by setting the .grad attribute of each parameter to an empty list.
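The semantics are worth noting: gradients are reset to an empty list, not to None or zero as in numeric frameworks. A minimal sketch with a hypothetical parameter stand-in:

```python
class FakeParam:
    """Hypothetical stand-in for an afnio Parameter with a .grad list."""

    def __init__(self):
        self.grad = []


def clear_grad(param_groups):
    # Reset every optimized parameter's gradient list.
    for group in param_groups:
        for p in group["params"]:
            p.grad = []


p = FakeParam()
p.grad.append("some textual gradient")
clear_grad([{"params": [p]}])
```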
- load_state_dict(state_dict, model_clients=None)[source]#
Loads the optimizer state.
- Parameters:
state_dict (dict) – Optimizer state. Should be an object returned from a call to state_dict().
model_clients (dict, optional) – A dictionary mapping model client keys (e.g., 'fw_model_client') to their respective instances of BaseModel. These instances will be used to reconstruct any model clients referenced within the optimizer state. If a required model client is missing, an error will be raised with instructions on how to provide the missing client.
- Raises:
ValueError – If the provided state_dict is invalid, such as when the parameter groups or their sizes do not match the current optimizer configuration.
ValueError – If a required model client is missing from the model_clients dictionary, with details about the expected model client type and key.
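The missing-client error described above can be sketched like this; resolve_model_client is a hypothetical helper, not afnio's actual function:

```python
def resolve_model_client(key, expected_type, model_clients):
    """Look up a model client by key, raising a helpful error if absent."""
    if not model_clients or key not in model_clients:
        raise ValueError(
            f"Missing model client for key '{key}'. Pass it as "
            f"model_clients={{'{key}': <{expected_type} instance>}}."
        )
    return model_clients[key]


clients = {"fw_model_client": object()}
ok = resolve_model_client("fw_model_client", "AsyncOpenAI", clients)

# A missing client raises ValueError with instructions.
try:
    resolve_model_client("fw_model_client", "AsyncOpenAI", None)
    err = None
except ValueError as e:
    err = str(e)
```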
Example
>>> openai_client = AsyncOpenAI()
>>> optimizer.load_state_dict(saved_state_dict, model_clients={
...     'model_client': openai_client
... })
- state: DefaultDict[Variable, Any] = {}#
- state_dict()[source]#
Returns the state of the optimizer as a dict.
It contains two entries:
- state: a Dict holding current optimization state. Its content differs between optimizer classes, but some common characteristics hold. For example, state is saved per parameter, and the parameter itself is NOT saved. state is a Dictionary mapping parameter ids to a Dict with state corresponding to each parameter.
- param_groups: a List containing all parameter groups where each parameter group is a Dict. Each parameter group contains metadata specific to the optimizer, such as learning rate and momentum, as well as a List of parameter IDs of the parameters in the group.
NOTE: The parameter IDs may look like indices, but they are just IDs associating state with param_group. When loading from a state_dict, the optimizer will zip the param_group params (int IDs) and the optimizer param_groups (actual cog.Parameters) in order to match state WITHOUT additional verification.
A returned state dict might look something like:
{
    'state': {
        0: {
            'momentum_buffer': [
                (
                    Parameter(data='You are...', role='system prompt', requires_grad=True),
                    [Variable(data='The system prompt should...', role='gradient for system prompt')]
                )
            ]
        },
        1: {
            'momentum_buffer': [
                (
                    Parameter(data='Answer this...', role='instruction prompt', requires_grad=True),
                    [Variable(data='The instruction prompt must...', role='gradient to instruction prompt')]
                )
            ]
        }
    },
    'param_groups': [
        {
            'model_client': {'class_type': 'AsyncOpenAI'},
            'messages': [
                {
                    'role': 'system',
                    'content': [Variable(data='You are part of an optimization system...', role='optimizer system prompt', requires_grad=False)]
                },
                {
                    'role': 'user',
                    'content': [Variable(data='Here is the variable you need...', role='optimizer user prompt', requires_grad=False)]
                }
            ],
            'inputs': {},
            'constraints': [],
            'momentum': 2,
            'completion_args': {'model': 'gpt-4o'},
            'params': [0, 1]
        }
    ]
}
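The id-based packing described in the NOTE above can be sketched in plain Python. pack_state_dict and the use of Python's built-in id() as the identity key are illustrative assumptions, not afnio's actual code:

```python
def pack_state_dict(param_groups, state):
    """Replace parameter objects with small integer IDs (hypothetical sketch)."""
    param_to_id = {}
    packed_groups = []
    for group in param_groups:
        ids = []
        for p in group["params"]:
            # Assign the next free integer ID on first sight of this parameter.
            param_to_id.setdefault(id(p), len(param_to_id))
            ids.append(param_to_id[id(p)])
        packed = {k: v for k, v in group.items() if k != "params"}
        packed["params"] = ids
        packed_groups.append(packed)
    # Re-key per-parameter state by integer ID; the parameter itself is NOT saved.
    packed_state = {param_to_id[id(p)]: s for p, s in state.items()
                    if id(p) in param_to_id}
    return {"state": packed_state, "param_groups": packed_groups}


class P:  # stand-in for a Parameter
    pass


p0, p1 = P(), P()
sd = pack_state_dict([{"params": [p0, p1], "momentum": 2}],
                     {p0: {"momentum_buffer": []}})
```

On load, zipping the integer IDs back against the optimizer's own param_groups, in order, is what makes the "WITHOUT additional verification" caveat above matter: the groups must line up exactly.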
- step(closure=None)[source]#
Performs a single optimization step (parameter update).
- Parameters:
closure (Callable, optional) – A closure that reevaluates the model and returns the loss as a tuple containing a numerical score and a textual explanation. This closure is optional for most optimizers.
Note
Unless otherwise specified, this function should not modify the .grad field of the parameters.
Note
Some optimization algorithms need to reevaluate the function multiple times, so you have to pass in a closure that allows them to recompute your model. The closure should clear the gradients, compute the loss, and return it.
Example:
for input, target in dataset:
    def closure():
        optimizer.clear_grad()
        output = model(input)
        loss = loss_fn(output, target)
        loss.backward()
        return loss
    optimizer.step(closure)