
Gradient of OpFromGraph fails #1


Closed
ricardoV94 opened this issue Nov 17, 2022 · 3 comments · Fixed by #723
Labels: bug, gradients, OpFromGraph


ricardoV94 commented Nov 17, 2022

The gradients of OpFromGraph seem a bit fragile. I saw the following failures:

Multiple outputs

from pytensor.compile.builders import OpFromGraph
import pytensor.tensor as at

x, y = at.scalars("x", "y")
out1 = x + y
out2 = x * y
op = OpFromGraph([x, y], [out1, out2])
outs = op(x, y)
at.grad(outs[0].sum(), x)  # Fails
Traceback (most recent call last):
  File "/home/ricardo/Documents/Projects/aesara/venv/lib/python3.8/site-packages/IPython/core/interactiveshell.py", line 3441, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-ebcb546bdac3>", line 9, in <module>
    at.grad(outs[0].sum(), x)
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 623, in grad
    _rval: Sequence[Variable] = _populate_grad_dict(
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 1434, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 1434, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 1387, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/ricardo/Documents/Projects/aesara/aesara/gradient.py", line 1213, in access_term_cache
    input_grads = node.op.L_op(inputs, node.outputs, new_output_grads)
  File "/home/ricardo/Documents/Projects/aesara/aesara/compile/builders.py", line 744, in L_op
    ret_ofg_l = self._lop_op(*inps, return_list=True)
  File "/home/ricardo/Documents/Projects/aesara/aesara/compile/builders.py", line 769, in __call__
    return super().__call__(*actual_inputs, **kwargs)
  File "/home/ricardo/Documents/Projects/aesara/aesara/graph/op.py", line 297, in __call__
    node = self.make_node(*inputs, **kwargs)
  File "/home/ricardo/Documents/Projects/aesara/aesara/compile/builders.py", line 784, in make_node
    non_shared_inputs = [
  File "/home/ricardo/Documents/Projects/aesara/aesara/compile/builders.py", line 785, in <listcomp>
    inp_t.filter_variable(inp)
  File "/home/ricardo/Documents/Projects/aesara/aesara/tensor/type.py", line 262, in filter_variable
    other2 = self.convert_variable(other)
  File "/home/ricardo/Documents/Projects/aesara/aesara/tensor/type.py", line 328, in convert_variable
    if (self.ndim == var.type.ndim) and (self.dtype == var.type.dtype):
AttributeError: 'DisconnectedType' object has no attribute 'ndim'
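
The DisconnectedType in the traceback is the output gradient of out2, which does not feed into the cost (only outs[0] is summed). If that reading is right, a cost that touches both outputs should not produce a disconnected output gradient in the first place. A minimal, untested sketch under that assumption:

# Untested sketch: with both outputs feeding the cost, no DisconnectedType
# output gradient should reach the default L_op.
cost = outs[0].sum() + outs[1].sum()
at.grad(cost, x)  # expected to succeed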

Single output, involving a discrete Elemwise input

from aesara.compile.builders import OpFromGraph
import aesara.tensor as at

x = at.scalar("x")
y = at.lscalar("y")
out1 = x + at.switch(at.eq(y, 0), -1, 1)
at.grad(out1, x)  # Fine

op = OpFromGraph([x, y], [out1])
out2 = op(x, y)
at.grad(out2, x)  # Fails
Traceback (most recent call last):
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/IPython/core/interactiveshell.py", line 3433, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)
  File "<ipython-input-2-b1c4038d13ee>", line 11, in <module>
    at.grad(out2, x)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 521, in grad
    var_to_app_to_idx = _populate_var_to_app_to_idx(outputs, _wrt, consider_constant)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 968, in _populate_var_to_app_to_idx
    account_for(output)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 939, in account_for
    connection_pattern = _node_to_pattern(app)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 817, in _node_to_pattern
    connection_pattern = node.op.connection_pattern(node)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/compile/builders.py", line 851, in connection_pattern
    lop_op = self.get_lop_op()
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/compile/builders.py", line 700, in get_lop_op
    self._recompute_lop_op()
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/configparser.py", line 47, in res
    return f(*args, **kwargs)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/compile/builders.py", line 495, in _recompute_lop_op
    gdefaults_l = fn_grad(wrt=local_inputs)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 623, in grad
    _rval: Sequence[Variable] = _populate_grad_dict(
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1434, in _populate_grad_dict
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1434, in <listcomp>
    rval = [access_grad_cache(elem) for elem in wrt]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1387, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1058, in access_term_cache
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1058, in <listcomp>
    output_grads = [access_grad_cache(var) for var in node.outputs]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1387, in access_grad_cache
    term = access_term_cache(node)[idx]
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/gradient.py", line 1213, in access_term_cache
    input_grads = node.op.L_op(inputs, node.outputs, new_output_grads)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/tensor/elemwise.py", line 548, in L_op
    rval = self._bgrad(inputs, outs, ograds)
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/tensor/elemwise.py", line 648, in _bgrad
    ret.append(transform(scalar_igrad))
  File "/home/ricardo/miniconda3/envs/pymcx/lib/python3.10/site-packages/aesara/tensor/elemwise.py", line 621, in transform
    if isinstance(r.type, (NullType, DisconnectedType)):
AttributeError: 'float' object has no attribute 'type'
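
The final AttributeError shows a bare Python float reaching Elemwise._bgrad, which expects every inner scalar gradient to be a Variable with a .type. The switch output here is integer-typed (the condition is discrete and both branches are ints), so the failure is presumably confined to the discrete gradient path. A possible workaround, untested, is to make the switch produce a floating-point output:

# Untested workaround sketch: with float branches the switch output is no
# longer discrete, so its gradient path should yield proper Variables.
out1f = x + at.switch(at.eq(y, 0), -1.0, 1.0)
opf = OpFromGraph([x, y], [out1f])
at.grad(opf(x, y), x)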
ricardoV94 changed the title from "Gradient of multi-output OpFromGraph fails" to "Gradient of OpFromGraph fails" on Nov 21, 2022
ricardoV94 added the bug label on Nov 21, 2022
ricardoV94 commented:
Apparently this was a known limitation:

- grad() make it support DisconnectedType and the new interface
- add support for NullType and DisconnectedType when R_op supports them
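
One plausible shape for the first to-do item, sketched here as an assumption rather than as the actual change made in #723: substitute zeros for DisconnectedType output gradients before they reach the inner gradient graph (replace_disconnected_grads is a hypothetical helper, not an existing function):

# Hypothetical sketch, not the actual fix: filter disconnected output
# gradients before building the inner gradient graph.
from aesara.gradient import DisconnectedType

def replace_disconnected_grads(outputs, output_grads):
    safe_grads = []
    for out, g in zip(outputs, output_grads):
        if isinstance(g.type, DisconnectedType):
            # A disconnected output contributes nothing to the gradient,
            # so a zero of the output's type is a safe stand-in.
            safe_grads.append(out.zeros_like())
        else:
            safe_grads.append(g)
    return safe_grads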

ricardoV94 commented Apr 20, 2024

The second failure surfaced through OpFromGraph but was actually caused by #331, which has since been fixed.
