It's useful to know when `arange` or `eye` is in a graph, but if these have constant inputs they will be constant folded, and we would then need to analyze every constant in the graph to find out whether it corresponds to an `arange`/`eye`, which can be rather costly.
Should we prevent these from being constant folded until a later phase? We may also want a single pass at the beginning that converts constants to the equivalent `arange`, `eye`, or `alloc`. As I mentioned, this can be costly, but at least this way the result is "cached" because it becomes part of the graph. Otherwise, a rewrite looking for aranges would have to repeat the analysis for every constant every time it is called (or, equally unsatisfactory, ignore constants).
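To make the cost of that single pass concrete, here is a minimal NumPy-only sketch of what recognizing a constant as an `arange`, `eye`, or `alloc` could look like. The helper names (`match_arange`, `match_eye`, `match_alloc`, `recover_structure`) are hypothetical and not part of PyTensor; the sketch only handles integer ranges and ignores edge cases such as NaN fills.

```python
import numpy as np


def match_arange(value):
    """Return (start, stop, step) if `value` equals np.arange(start, stop, step)."""
    value = np.asarray(value)
    # Assumption: only 1-d integer constants with at least two entries qualify.
    if value.ndim != 1 or value.size < 2 or not np.issubdtype(value.dtype, np.integer):
        return None
    diffs = np.diff(value.astype(np.int64))
    step = int(diffs[0])
    if step == 0 or not np.all(diffs == step):
        return None
    start = int(value[0])
    return start, start + step * value.size, step


def match_eye(value):
    """Return (n, m, k) if `value` equals np.eye(n, m, k)."""
    value = np.asarray(value)
    if value.ndim != 2 or value.size == 0:
        return None
    n, m = value.shape
    rows, cols = np.nonzero(value)
    if rows.size == 0:
        return None
    # The first nonzero entry fixes the candidate diagonal offset.
    k = int(cols[0] - rows[0])
    if np.array_equal(value, np.eye(n, m, k, dtype=value.dtype)):
        return n, m, k
    return None


def match_alloc(value):
    """Return the fill value if every entry of `value` is identical."""
    value = np.asarray(value)
    if value.size == 0:
        return None
    fill = value.flat[0]
    return fill if np.all(value == fill) else None


def recover_structure(value):
    """Try each matcher in turn; every check scans the whole constant."""
    matchers = (("alloc", match_alloc), ("arange", match_arange), ("eye", match_eye))
    for name, matcher in matchers:
        params = matcher(value)
        if params is not None:
            return name, params
    return None
```

Each matcher is linear in the size of the constant, which is why running it once and storing the result in the graph is more attractive than repeating it inside every rewrite.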
This is actually one of the hardest aspects of the kind of destructive/eager optimization we do: constant_fold destroys information, but it can also simplify graphs very quickly.
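The trade-off can be illustrated with a toy graph. This is only a sketch: `Const`, `Apply`, `OPS`, and the `protected` argument are hypothetical names for illustration and do not reflect PyTensor's actual graph classes or rewrite API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Const:
    value: object


@dataclass(frozen=True)
class Apply:
    op: str
    inputs: tuple


# Hypothetical evaluators for the toy ops.
OPS = {
    "arange": lambda n: tuple(range(n)),
    "neg": lambda xs: tuple(-x for x in xs),
}


def constant_fold(node, protected=frozenset()):
    """Fold every op whose inputs are all constants, except ops in `protected`."""
    if isinstance(node, Const):
        return node
    inputs = tuple(constant_fold(i, protected) for i in node.inputs)
    if node.op not in protected and all(isinstance(i, Const) for i in inputs):
        return Const(OPS[node.op](*(i.value for i in inputs)))
    return Apply(node.op, inputs)


graph = Apply("neg", (Apply("arange", (Const(4),)),))

# Eager folding: the graph collapses at once, but the arange is gone.
eager = constant_fold(graph)

# Deferred folding: arange stays visible to rewrites in the early phase...
early = constant_fold(graph, protected=frozenset({"arange"}))

# ...and is folded away in a later phase.
late = constant_fold(early)
```

With eager folding, a later rewrite only sees `Const((0, -1, -2, -3))` and has to rediscover the range; with the protected set, the `arange` node survives until the rewrites that care about it have run.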
For reference, note that there already exists special logic in `alloc` that tries not to constant_fold it in a few cases where folding would be worse. This is just a bit myopic and won't scale for the kind of rewrites we are interested in #573
pytensor/pytensor/tensor/basic.py
Lines 1675 to 1713 in 781527f