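afnio.autodiff.evaluator
Classes
DeterministicEvaluator | Evaluates predictions deterministically using a user-defined evaluation function within the afnio framework, supporting automatic differentiation.
ExactMatchEvaluator | Evaluates predictions using exact matching within the afnio framework, supporting automatic differentiation.
LMJudgeEvaluator | Implements an evaluation of a model prediction using a language model (LM) as the judge within the afnio framework, supporting automatic differentiation.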
- class afnio.autodiff.evaluator.DeterministicEvaluator(*args, **kwargs)
Bases: Function
Evaluates predictions deterministically using a user-defined evaluation function within the afnio framework, supporting automatic differentiation.
This class inherits from Function and requires both the forward and backward methods to be defined.
The DeterministicEvaluator function computes a score and an explanation based on the prediction and target inputs using a user-defined evaluation function (eval_fn). The evaluation function’s purpose is described by eval_fn_purpose. Outputs include a numerical or textual score and a textual explanation, both wrapped as Variable objects.
The prediction is a Variable. The target can be a string, a list of strings, or a Variable. Each Variable passed as an input argument can have either a scalar or a list .data field, supporting both individual samples and batch processing. For batch processing, the lengths of prediction and target must match.
The success_fn parameter is a user-defined function that returns True when all predictions evaluated by eval_fn are considered successful, and False otherwise. If success_fn returns True, the backward pass will skip gradient calculations and directly return an empty gradient, saving computation time.
The reduction_fn parameter specifies the aggregation function to use for scores across a batch of predictions and targets. When specified, the reduction function’s purpose is described using reduction_fn_purpose. If aggregation is not desired, set reduction_fn and reduction_fn_purpose to None.
Example with scalar inputs:
>>> prediction = Variable(
...     data="green",
...     role="color prediction",
...     requires_grad=True
... )
>>> target = "red"
>>> def exact_match_fn(p: str, t: str) -> int:
...     return 1 if p == t else 0
>>> score, explanation = DeterministicEvaluator.apply(
...     prediction,
...     target,
...     exact_match_fn,
...     "exact match",
... )
>>> score.data
0
>>> explanation.data
"The evaluation function, designed for 'exact match', compared the <DATA> field of the predicted variable ('green') with the <DATA> field of the target variable ('red'), resulting in a score: 0."
>>> explanation.backward()
>>> prediction.grad[0].data
"Reassess the criteria that led to the initial prediction of 'green'."
Example with batched inputs:
>>> prediction = Variable(
...     data=["green", "blue"],
...     role="color prediction",
...     requires_grad=True
... )
>>> target = ["red", "blue"]
>>> def exact_match_fn(p: str, t: str) -> int:
...     return 1 if p == t else 0
>>> score, explanation = DeterministicEvaluator.apply(
...     prediction,
...     target,
...     exact_match_fn,
...     "exact match",
...     reduction_fn=sum,
...     reduction_fn_purpose="summation"
... )
>>> score.data
1
>>> explanation.data
"The evaluation function, designed for 'exact match', compared the <DATA> fields of the predicted variable and the target variable across all samples in the batch, generating individual scores for each pair. These scores were then aggregated using the reduction function 'summation', resulting in a final aggregated score: 1."
>>> explanation.backward()
>>> prediction.grad[0].data
"Reassess the criteria that led to the initial prediction of 'green'."
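The scoring semantics described above (one eval_fn call per prediction/target pair, an optional reduction, and matching batch lengths) can be sketched in plain Python. This is a minimal illustration, not afnio's implementation: the helper name evaluate is hypothetical, plain strings stand in for Variable objects, and no explanation or gradient is produced.

```python
from typing import Callable, List, Optional, Sequence, Union

Score = Union[int, float, str]


def evaluate(
    prediction: Union[str, Sequence[str]],
    target: Union[str, Sequence[str]],
    eval_fn: Callable[[str, str], Score],
    reduction_fn: Optional[Callable[[List[Score]], Score]] = None,
) -> Union[Score, List[Score]]:
    """Hypothetical helper: score scalar or batched inputs with eval_fn."""
    # Scalar case: both inputs are single strings, so eval_fn runs once.
    if isinstance(prediction, str) and isinstance(target, str):
        return eval_fn(prediction, target)

    # Batch case: wrap any lone string, then require equal lengths.
    preds = [prediction] if isinstance(prediction, str) else list(prediction)
    targets = [target] if isinstance(target, str) else list(target)
    if len(preds) != len(targets):
        raise ValueError("prediction and target must have the same length")

    scores = [eval_fn(p, t) for p, t in zip(preds, targets)]
    # Without a reduction, the per-sample scores are returned as a list.
    return reduction_fn(scores) if reduction_fn is not None else scores


def exact_match_fn(p: str, t: str) -> int:
    return 1 if p == t else 0


scalar_score = evaluate("green", "red", exact_match_fn)
batch_score = evaluate(["green", "blue"], ["red", "blue"], exact_match_fn, sum)
per_sample = evaluate(["green", "blue"], ["red", "blue"], exact_match_fn)
```

Here scalar_score and batch_score correspond to the score.data values in the two examples above, while per_sample shows what an unreduced batch could look like under these assumptions.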
- classmethod apply(*args, **kwargs)
Applies the forward function of the custom Function class.
This method handles cases where setup_context is defined to set up the ctx (context) object separately or within the forward method itself.
- static backward(ctx, score_grad_output, explanation_grad_output)
Define a formula for differentiating the operation with backward mode automatic differentiation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as the forward() returned (None will be passed in for non-variable outputs of the forward function), and it should return as many variables as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Variable or is a Variable not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve variables saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
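The backward contract above (one returned slot per forward input, None for inputs that need no gradient) can be illustrated with a toy function. This is a sketch under stated assumptions, not afnio's code: ctx is a plain SimpleNamespace, the ctx.success attribute is a hypothetical place to store the success_fn result, gradients are plain strings, and the "empty gradient" is represented as None.

```python
from types import SimpleNamespace
from typing import Optional, Tuple


def backward(
    ctx: SimpleNamespace,
    score_grad_output: Optional[str],
    explanation_grad_output: Optional[str],
) -> Tuple[Optional[str], ...]:
    """Toy backward: returns one slot per input of the forward pass."""
    n_inputs = len(ctx.needs_input_grad)

    # Assumed short-circuit: forward stored the success_fn result on ctx.
    if ctx.success:
        return (None,) * n_inputs

    # Only inputs flagged in needs_input_grad receive textual feedback.
    return tuple(
        f"Feedback for input {i}: {explanation_grad_output}" if needed else None
        for i, needed in enumerate(ctx.needs_input_grad)
    )


# Toy forward(prediction, target): only the prediction requires a gradient.
failed = SimpleNamespace(needs_input_grad=(True, False), success=False)
passed = SimpleNamespace(needs_input_grad=(True, False), success=True)

grads_failed = backward(failed, None, "prediction differs from target")
grads_passed = backward(passed, None, None)
```

In this sketch grads_failed carries feedback only in its first slot, and grads_passed contains only None values because the success flag skips the gradient computation.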
- static forward(ctx, prediction, target, eval_fn, eval_fn_purpose, success_fn, reduction_fn, reduction_fn_purpose)
Define the forward of the custom autodiff Function.
This function is to be overridden by all subclasses. There are two ways to define forward:
Usage 1 (Combined forward and ctx):
    @staticmethod
    def forward(ctx: Any, *args: Any, **kwargs: Any) -> Any:
        pass
It must accept a context ctx as the first argument, followed by any number of arguments (variables or other types).
Usage 2 (Separate forward and ctx):
    @staticmethod
    def forward(*args: Any, **kwargs: Any) -> Any:
        pass

    @staticmethod
    def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
        pass
The forward no longer accepts a ctx argument.
Instead, you must also override the afnio.autodiff.Function.setup_context() staticmethod to handle setting up the ctx object. output is the output of the forward, inputs are a Tuple of inputs to the forward.
The context can be used to store arbitrary data that can then be retrieved during the backward pass. Variables should not be stored directly on ctx. Instead, variables should be saved with ctx.save_for_backward() if they are intended to be used in backward.
- static setup_context(ctx, inputs, output)
There are two ways to define the forward pass of an autodiff.Function.
Either:
  - Override forward with the signature forward(ctx, *args, **kwargs). setup_context is not overridden. Setting up the ctx for backward happens inside the forward.
  - Override forward with the signature forward(*args, **kwargs) and override setup_context. Setting up the ctx for backward happens inside setup_context (as opposed to inside the forward).
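The two definition styles can be compared side by side with a toy base class. This is only a sketch of how an apply method might dispatch between them: ToyFunction, Combined, and Separate are hypothetical names, the ctx object is a SimpleNamespace, and apply returns ctx purely so the example can inspect it.

```python
from types import SimpleNamespace
from typing import Any, Tuple


class ToyFunction:
    """Hypothetical stand-in for a Function base class (not afnio's code)."""

    @staticmethod
    def forward(*args: Any, **kwargs: Any) -> Any:
        raise NotImplementedError

    @staticmethod
    def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
        raise NotImplementedError

    @classmethod
    def apply(cls, *args: Any) -> Tuple[Any, Any]:
        ctx = SimpleNamespace()
        if cls.setup_context is not ToyFunction.setup_context:
            # Style 2: forward takes no ctx; setup_context fills it afterwards.
            output = cls.forward(*args)
            cls.setup_context(ctx, args, output)
        else:
            # Style 1: forward receives ctx and sets it up itself.
            output = cls.forward(ctx, *args)
        return output, ctx


class Combined(ToyFunction):
    @staticmethod
    def forward(ctx: Any, x: str, y: str) -> str:
        ctx.saved = (x, y)
        return x + y


class Separate(ToyFunction):
    @staticmethod
    def forward(x: str, y: str) -> str:
        return x + y

    @staticmethod
    def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
        ctx.saved = inputs


out_combined, ctx_combined = Combined.apply("a", "b")
out_separate, ctx_separate = Separate.apply("a", "b")
```

Both subclasses produce the same output and the same saved context in this sketch; the difference is only where the context is populated.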
- class afnio.autodiff.evaluator.ExactMatchEvaluator(*args, **kwargs)
Bases: Function
Evaluates predictions using exact matching within the afnio framework, supporting automatic differentiation.
This class inherits from Function and requires both the forward and backward methods to be defined.
The ExactMatchEvaluator function computes a score and an explanation by comparing the data fields of a prediction and a target for an exact match. For each sample:
  - A score of 1 is assigned for an exact match.
  - A score of 0 is assigned otherwise.
The prediction is a Variable. The target can be a string, a list of strings, or a Variable. Each Variable passed as an input argument can have either a scalar or a list .data field, supporting both individual samples and batch processing. For batch processing, the lengths of prediction and target must match.
If batched inputs are provided, the scores can be aggregated using an optional reduction_fn, such as sum. The purpose of the reduction is described using reduction_fn_purpose. If aggregation is not desired, set reduction_fn and reduction_fn_purpose to None.
Example with scalar inputs:
>>> prediction = Variable(
...     data="green",
...     role="color prediction",
...     requires_grad=True
... )
>>> target = "red"
>>> score, explanation = ExactMatchEvaluator.apply(prediction, target)
>>> score.data
0
>>> explanation.data
"The evaluation function, designed for 'exact match', compared the <DATA> field of the predicted variable ('green') with the <DATA> field of the target variable ('red'), resulting in a score: 0."
>>> explanation.backward()
>>> prediction.grad[0].data
"Reassess the criteria that led to the initial prediction of 'green'."
Example with batched inputs:
>>> prediction = Variable(
...     data=["green", "blue"],
...     role="color prediction",
...     requires_grad=True
... )
>>> target = ["red", "blue"]
>>> score, explanation = ExactMatchEvaluator.apply(prediction, target)
>>> score.data
1
>>> explanation.data
"The evaluation function, designed for 'exact match', compared the <DATA> fields of the predicted variable and the target variable across all samples in the batch, generating individual scores for each pair. These scores were then aggregated using the reduction function 'summation', resulting in a final aggregated score: 1."
>>> explanation.backward()
>>> prediction.grad[0].data
"Reassess the criteria that led to the initial prediction of 'green'."
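Because the target may be a string, a list of strings, or a Variable, an exact-match comparison first has to unwrap it to plain data. The following sketch shows one way this could work; the Var dataclass is a hypothetical stand-in that models only the .data field, and exact_match is an illustrative helper rather than the library's implementation.

```python
from dataclasses import dataclass
from typing import List, Union

Data = Union[str, List[str]]


@dataclass
class Var:
    """Hypothetical minimal stand-in for a Variable: only a .data field."""

    data: Data


def unwrap(value: Union[Var, Data]) -> Data:
    # Accept either a wrapped value or plain data.
    return value.data if isinstance(value, Var) else value


def exact_match(prediction: Var, target: Union[Var, Data]) -> Union[int, List[int]]:
    pred, tgt = prediction.data, unwrap(target)
    if isinstance(pred, list):
        # Batch case: the target must be a list of the same length.
        if not isinstance(tgt, list) or len(pred) != len(tgt):
            raise ValueError("batched prediction and target lengths must match")
        return [int(p == t) for p, t in zip(pred, tgt)]
    # Scalar case: 1 for an exact match, 0 otherwise.
    return int(pred == tgt)


scalar = exact_match(Var("green"), "red")
batch = exact_match(Var(["green", "blue"]), Var(["red", "blue"]))
total = sum(batch)
```

Summing the per-sample scores, as in the last line, mirrors the default sum reduction shown in the forward signature.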
- classmethod apply(*args, **kwargs)
Applies the forward function of the custom Function class.
This method handles cases where setup_context is defined to set up the ctx (context) object separately or within the forward method itself.
- static backward(ctx, score_grad_output, explanation_grad_output)
Define a formula for differentiating the operation with backward mode automatic differentiation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as the forward() returned (None will be passed in for non-variable outputs of the forward function), and it should return as many variables as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Variable or is a Variable not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve variables saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
- static forward(ctx, prediction, target, reduction_fn=<built-in function sum>, reduction_fn_purpose='summation')
Define the forward of the custom autodiff Function.
This function is to be overridden by all subclasses. There are two ways to define forward:
Usage 1 (Combined forward and ctx):
    @staticmethod
    def forward(ctx: Any, *args: Any, **kwargs: Any) -> Any:
        pass
It must accept a context ctx as the first argument, followed by any number of arguments (variables or other types).
Usage 2 (Separate forward and ctx):
    @staticmethod
    def forward(*args: Any, **kwargs: Any) -> Any:
        pass

    @staticmethod
    def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
        pass
The forward no longer accepts a ctx argument.
Instead, you must also override the afnio.autodiff.Function.setup_context() staticmethod to handle setting up the ctx object. output is the output of the forward, inputs are a Tuple of inputs to the forward.
The context can be used to store arbitrary data that can then be retrieved during the backward pass. Variables should not be stored directly on ctx. Instead, variables should be saved with ctx.save_for_backward() if they are intended to be used in backward.
- static setup_context(ctx, inputs, output)
There are two ways to define the forward pass of an autodiff.Function.
Either:
  - Override forward with the signature forward(ctx, *args, **kwargs). setup_context is not overridden. Setting up the ctx for backward happens inside the forward.
  - Override forward with the signature forward(*args, **kwargs) and override setup_context. Setting up the ctx for backward happens inside setup_context (as opposed to inside the forward).
- class afnio.autodiff.evaluator.LMJudgeEvaluator(*args, **kwargs)
Bases: Function
Implements an evaluation of a model prediction using a language model (LM) as the judge within the afnio framework, supporting automatic differentiation.
This class inherits from Function and requires both the forward and backward methods to be defined.
This function returns a score and an explanation, both as Variable objects, by comparing a prediction against a target (when present) using a composite prompt. The prompt is constructed from a list of messages and optional inputs, which can dynamically populate placeholders in the message templates. The evaluation process leverages the specified forward_model_client to perform the LM-based assessment.
The prediction is a Variable. The target can be a string, a list of strings, or a Variable. Similarly, the inputs dictionary can include strings, lists of strings, or Variables. Each Variable passed as an input argument can have either a scalar or a list .data field, supporting both individual samples and batch processing. For batch processing, the lengths of prediction, target, and any batched inputs must match.
The success_fn parameter is a user-defined function that returns True when all predictions evaluated by the LM as Judge are considered successful, and False otherwise. If success_fn returns True, the backward pass will skip gradient calculations and directly return an empty gradient, saving computation time.
If you are processing a batch of predictions and targets, you can use the reduction_fn to aggregate individual scores (e.g., using sum to compute a total score). The reduction_fn_purpose parameter is a brief description of the aggregation’s purpose (e.g., “summation”). If you don’t want any aggregation, set both reduction_fn and reduction_fn_purpose to None.
The function operates in two modes controlled by eval_mode:
  - eval_mode=True (default): Computes gradients for prediction only. Use it for direct feedback on predictions.
  - eval_mode=False: Computes gradients for messages and inputs. Use it to optimize the evaluator or align with human evaluation datasets.
Additional model parameters, such as temperature, max tokens, or seed values, can be passed through completion_args to customize the LLM’s behavior.
Example with scalar inputs:
>>> task = Variable(
...     "Evaluate if the translation is accurate.",
...     role="evaluation task",
...     requires_grad=True
... )
>>> format = Variable(
...     "Provide 'score' (true/false) and 'explanation' in JSON.",
...     role="output format"
... )
>>> user = Variable(
...     "<PREDICTION>{prediction}</PREDICTION><TARGET>{target}</TARGET>",
...     role="user query"
... )
>>> prediction = Variable(
...     "Hola Mundo",
...     role="translated text",
...     requires_grad=True
... )
>>> target = Variable("Ciao Mondo", role="expected output")
>>> messages = [
...     {"role": "system", "content": [task, format]},
...     {"role": "user", "content": [user]}
... ]
>>> score, explanation = LMJudgeEvaluator.apply(
...     model,
...     messages,
...     prediction,
...     target,
...     temperature=0.5,
... )
>>> score.data
False
>>> explanation.data
'The translated text is in Spanish, but the expected is in Italian.'
>>> explanation.backward()
>>> prediction.grad[0].data
'The translated text should be in Italian.'
Example with batched inputs:
>>> task = Variable(
...     "Evaluate if the translation is accurate.",
...     role="evaluation task",
...     requires_grad=True
... )
>>> format = Variable(
...     "Provide 'score' (true/false) and 'explanation' in JSON.",
...     role="output format"
... )
>>> user = Variable(
...     "<PREDICTION>{prediction}</PREDICTION><TARGET>{target}</TARGET>",
...     role="user query"
... )
>>> prediction = Variable(
...     data=["Hola Mundo", "Salve a tutti"],
...     role="translated text",
...     requires_grad=True,
... )
>>> target = ["Ciao Mondo", "Salve a tutti"]
>>> messages = [
...     {"role": "system", "content": [task, format]},
...     {"role": "user", "content": [user]}
... ]
>>> score, explanation = LMJudgeEvaluator.apply(
...     model,
...     messages,
...     prediction,
...     target,
...     reduction_fn=sum,
...     reduction_fn_purpose="summation",
... )
>>> score.data
1
>>> explanation.data
"The evaluation function, designed using an LM as the judge, compared the <DATA> fields of the predicted variable and the target variable across all samples in the batch. These scores were then aggregated using the reduction function 'summation', resulting in a final aggregated score: 1."
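The way messages and inputs populate placeholders such as {prediction} and {target} can be approximated with Python string formatting. This is a sketch only: render_messages is a hypothetical helper, message content is given as plain strings instead of Variable objects, joining content parts with newlines is an assumption, and no language model is called.

```python
from typing import Any, Dict, List, Optional


def render_messages(
    messages: List[Dict[str, Any]],
    prediction: str,
    target: Optional[str] = None,
    inputs: Optional[Dict[str, str]] = None,
) -> List[Dict[str, str]]:
    """Hypothetical helper: fill template placeholders for one sample."""
    values = dict(inputs or {})
    values["prediction"] = prediction
    if target is not None:
        values["target"] = target

    rendered = []
    for message in messages:
        # Assumption: the parts of a message are joined with newlines.
        template = "\n".join(message["content"])
        rendered.append(
            {"role": message["role"], "content": template.format(**values)}
        )
    return rendered


messages = [
    {
        "role": "system",
        "content": [
            "Evaluate if the translation is accurate.",
            "Provide 'score' (true/false) and 'explanation' in JSON.",
        ],
    },
    {
        "role": "user",
        "content": ["<PREDICTION>{prediction}</PREDICTION><TARGET>{target}</TARGET>"],
    },
]

predictions = ["Hola Mundo", "Salve a tutti"]
targets = ["Ciao Mondo", "Salve a tutti"]

# One rendered prompt per sample in the batch.
prompts = [render_messages(messages, p, t) for p, t in zip(predictions, targets)]
```

Each entry of prompts is a list of role/content dictionaries for one prediction/target pair, which a judge model could then score individually before any reduction is applied.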
- classmethod apply(*args, **kwargs)
Applies the forward function of the custom Function class.
This method handles cases where setup_context is defined to set up the ctx (context) object separately or within the forward method itself.
- static backward(ctx, score_grad_output, explanation_grad_output)
Define a formula for differentiating the operation with backward mode automatic differentiation.
This function is to be overridden by all subclasses.
It must accept a context ctx as the first argument, followed by as many outputs as the forward() returned (None will be passed in for non-variable outputs of the forward function), and it should return as many variables as there were inputs to forward(). Each argument is the gradient w.r.t. the given output, and each returned value should be the gradient w.r.t. the corresponding input. If an input is not a Variable or is a Variable not requiring grads, you can just pass None as a gradient for that input.
The context can be used to retrieve variables saved during the forward pass. It also has an attribute ctx.needs_input_grad as a tuple of booleans representing whether each input needs gradient. E.g., backward() will have ctx.needs_input_grad[0] = True if the first input to forward() needs gradient computed w.r.t. the output.
- static forward(ctx, forward_model_client, messages, prediction, target=None, inputs=None, success_fn=None, reduction_fn=<built-in function sum>, reduction_fn_purpose='summation', eval_mode=True, **completion_args)
Define the forward of the custom autodiff Function.
This function is to be overridden by all subclasses. There are two ways to define forward:
Usage 1 (Combined forward and ctx):
    @staticmethod
    def forward(ctx: Any, *args: Any, **kwargs: Any) -> Any:
        pass
It must accept a context ctx as the first argument, followed by any number of arguments (variables or other types).
Usage 2 (Separate forward and ctx):
    @staticmethod
    def forward(*args: Any, **kwargs: Any) -> Any:
        pass

    @staticmethod
    def setup_context(ctx: Any, inputs: Tuple[Any, ...], output: Any) -> None:
        pass
The forward no longer accepts a ctx argument.
Instead, you must also override the afnio.autodiff.Function.setup_context() staticmethod to handle setting up the ctx object. output is the output of the forward, inputs are a Tuple of inputs to the forward.
The context can be used to store arbitrary data that can then be retrieved during the backward pass. Variables should not be stored directly on ctx. Instead, variables should be saved with ctx.save_for_backward() if they are intended to be used in backward.
- static setup_context(ctx, inputs, output)
There are two ways to define the forward pass of an autodiff.Function.
Either:
  - Override forward with the signature forward(ctx, *args, **kwargs). setup_context is not overridden. Setting up the ctx for backward happens inside the forward.
  - Override forward with the signature forward(*args, **kwargs) and override setup_context. Setting up the ctx for backward happens inside setup_context (as opposed to inside the forward).