afnio.autodiff

Functions

backward(variables[, grad_variables, ...])

Computes the sum of gradients of given variables with respect to graph leaves.

afnio.autodiff.backward(variables, grad_variables=None, retain_graph=None, create_graph=False, inputs=None)

Computes the sum of gradients of given variables with respect to graph leaves.

The graph is differentiated using the chain rule. If any of the variables are non-scalar (i.e. their data has more than one element) and require gradient, then the Jacobian-vector product will be computed; in this case the function additionally requires specifying grad_variables. It should be a sequence of matching length that contains the “vector” in the Jacobian-vector product, usually the gradient of the differentiated function w.r.t. the corresponding variables (None is an acceptable value for all variables that don’t need gradient variables).

This function accumulates gradients in the leaves - you might need to zero .grad attributes or set them to None before calling it.
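
For illustration, a minimal sketch of a typical call. It assumes the hf alias used in the no_grad example further down this page; the names x and loss are purely illustrative:

>>> import afnio.autodiff
>>> x = hf.Variable("abc", role="variable", requires_grad=True)
>>> loss = x + x
>>> x.grad = None                     # clear any previously accumulated gradient
>>> afnio.autodiff.backward(loss)     # gradient of loss w.r.t. x is accumulated into x.grad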

Note

Using this method with create_graph=True will create a reference cycle between the parameter and its gradient which can cause a memory leak. We recommend using autodiff.grad when creating the graph to avoid this. If you have to use this function, make sure to reset the .grad fields of your parameters to None after use to break the cycle and avoid the leak.

Note

When inputs are provided, each input must be a leaf variable. If any input is not a leaf, a RuntimeError is raised.

Parameters:
  • variables (Sequence[Variable] or Variable) – Variables of which the derivative will be computed.

  • grad_variables (Sequence[Variable or None] or Variable, optional) – The “vector” in the Jacobian-vector product, usually gradients w.r.t. each element of corresponding variables. None values can be specified for scalar Variables or ones that don’t require grad. If a None value would be acceptable for all grad_variables, then this argument is optional.

  • retain_graph (bool, optional) – If False, the graph used to compute the grads will be freed. Setting this to True retains the graph, allowing additional backward calls on the same graph, which is useful for example in multi-task learning with multiple losses (see the sketch after this parameter list). However, in nearly all cases retaining the graph is not needed and can be avoided in a more efficient way. Defaults to the value of create_graph.

  • create_graph (bool, optional) – If True, the graph of the derivative will be constructed, allowing higher-order derivative products to be computed. Defaults to False.

  • inputs (Sequence[Variable] or Variable or Sequence[GradientEdge], optional) – Inputs w.r.t. which the gradient will be accumulated into .grad. All other Variables will be ignored. If not provided, the gradient is accumulated into all the leaf Variables that were used to compute the variables.
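
As referenced in the retain_graph description above, a hedged sketch of two backward calls over the same graph; the Variables and the need for a second pass are purely illustrative:

>>> import afnio.autodiff
>>> x = hf.Variable("abc", role="variable", requires_grad=True)
>>> loss = x + x
>>> afnio.autodiff.backward(loss, retain_graph=True)  # keep the graph for another backward call
>>> afnio.autodiff.backward(loss)                     # second pass; the graph is freed afterwards
>>> # gradients from both passes are accumulated into x.grad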

afnio.autodiff.is_grad_enabled()

Check whether gradient tracking is currently enabled.
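
A brief sketch of how this check interacts with the no_grad context manager documented below; the default-enabled state shown outside the block is an assumption consistent with the no_grad example:

>>> import afnio.autodiff as autodiff
>>> autodiff.is_grad_enabled()    # assumed to be enabled by default
True
>>> with autodiff.no_grad():
...     autodiff.is_grad_enabled()
False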

afnio.autodiff.no_grad()

Context manager that disables gradient calculation. All operations within this block will not track gradients, making them more memory-efficient.

Disabling gradient calculation is useful for inference, when you are sure that you will not call Variable.backward(). It will reduce memory consumption for computations that would otherwise have requires_grad=True.

In this mode, the result of every computation will have requires_grad=False, even when the inputs have requires_grad=True. There is an exception! All factory functions, or functions that create a new Variable and take a requires_grad kwarg, will NOT be affected by this mode.

This context manager is thread local; it will not affect computation in other threads.

Also functions as a decorator.

Example:
>>> x = hf.Variable("abc", role="variable", requires_grad=True)
>>> with hf.no_grad():
...     y = x + x
>>> y.requires_grad
False
>>> @hf.no_grad()
... def doubler(x):
...     return x + x
>>> z = doubler(x)
>>> z.requires_grad
False
>>> @hf.no_grad
... def tripler(x):
...     return x + x + x
>>> z = tripler(x)
>>> z.requires_grad
False
>>> # factory function exception
>>> with hf.no_grad():
...     a = hf.cognitive.Parameter("xyz")
>>> a.requires_grad
True

afnio.autodiff.set_grad_enabled(mode)

Set the global state of gradient tracking.
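
A hedged sketch of toggling the global state together with is_grad_enabled; the boolean mode argument and the discarded return value are assumptions based on the signature above:

>>> import afnio.autodiff as autodiff
>>> _ = autodiff.set_grad_enabled(False)   # used purely as a setter; any return value is discarded
>>> autodiff.is_grad_enabled()
False
>>> _ = autodiff.set_grad_enabled(True)    # re-enable gradient tracking globally
>>> autodiff.is_grad_enabled()
True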

Modules