[SPARK-37290][SQL] - Exponential planning time in case of non-deterministic function
### What changes were proposed in this pull request?
When a projection contains non-deterministic functions, the method `getAllValidConstraints` can generate so many expressions that planning runs out of memory (OOM):
```
protected def getAllValidConstraints(projectList: Seq[NamedExpression]): ExpressionSet = {
  var allConstraints = child.constraints
  projectList.foreach {
    case a @ Alias(l: Literal, _) =>
      allConstraints += EqualNullSafe(a.toAttribute, l)
    case a @ Alias(e, _) =>
      // For every alias in `projectList`, replace the reference in constraints by its attribute.
      allConstraints ++= allConstraints.map(_ transform {
        case expr: Expression if expr.semanticEquals(e) =>
          a.toAttribute
      })
      allConstraints += EqualNullSafe(e, a.toAttribute)
    case _ => // Don't change.
  }
  allConstraints
}
```
In particular, the line `allConstraints ++= allConstraints.map(...)` can generate an exponential number of expressions.
This is because non-deterministic expressions are always treated as distinct elements in an `ExpressionSet`, so the mapped copies are never deduplicated.
Therefore, the number of non-deterministic expressions doubles every time we go through this line.
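The doubling can be illustrated with a small toy model. This is only a sketch and not Spark code: `Expr`, `ToySet`, `step`, and `sizesAfter` are hypothetical names, and `ToySet` assumes the set semantics described above (deterministic entries are deduplicated, non-deterministic entries are always kept as separate elements). It also assumes the rewrite leaves each constraint unchanged, which is the case when no constraint matches the alias.

```scala
object ConstraintGrowthSketch {
  // Toy stand-in for an expression: a label plus a determinism flag.
  final case class Expr(label: String, deterministic: Boolean)

  // Toy stand-in for ExpressionSet (assumed semantics, not the real class):
  // deterministic entries are deduplicated, non-deterministic ones never are.
  final case class ToySet(det: Set[Expr], nonDet: Vector[Expr]) {
    def size: Int = det.size + nonDet.size
    def ++(other: ToySet): ToySet = ToySet(det ++ other.det, nonDet ++ other.nonDet)
  }

  // One pass of `allConstraints ++= allConstraints.map(rewrite)`,
  // where the rewrite is assumed to return each constraint unchanged.
  def step(s: ToySet): ToySet = s ++ s

  // Set sizes before the first alias and after each of `aliases` passes.
  def sizesAfter(aliases: Int, start: ToySet): List[Int] =
    Iterator.iterate(start)(step).take(aliases + 1).map(_.size).toList
}
```

Starting from one deterministic and one non-deterministic constraint, `sizesAfter(4, start)` yields `List(2, 3, 5, 9, 17)`: the deterministic entry stays unique while the non-deterministic one grows as 1, 2, 4, 8, 16.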
We can filter and keep only deterministic expressions because:
1 - `semanticEquals` never matches a non-deterministic expression, so the rewrite cannot apply to them
2 - this method is used in only one code path, which keeps only deterministic expressions anyway
```
lazy val constraints: ExpressionSet = {
  if (conf.constraintPropagationEnabled) {
    validConstraints
      .union(inferAdditionalConstraints(validConstraints))
      .union(constructIsNotNullConstraints(validConstraints, output))
      .filter { c =>
        c.references.nonEmpty && c.references.subsetOf(outputSet) && c.deterministic
      }
  } else {
    ExpressionSet()
  }
}
```
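To see why dropping non-deterministic expressions early is safe, the following toy sketch compares the two orderings. It is not the patch itself: `Expr`, `union`, and `propagate` are hypothetical names, the set semantics are the assumed ones described above, and the placement of the early filter is only one possible choice. Both orderings end with the same deterministic constraints, but only the unfiltered one lets the intermediate set grow.

```scala
object EarlyFilterSketch {
  // Toy stand-in for an expression: a label plus a determinism flag.
  final case class Expr(label: String, deterministic: Boolean)

  // Assumed set semantics: deterministic entries are deduplicated,
  // non-deterministic entries are always kept as separate elements.
  def union(a: Vector[Expr], b: Vector[Expr]): Vector[Expr] = {
    val all = a ++ b
    all.filter(_.deterministic).distinct ++ all.filterNot(_.deterministic)
  }

  // Runs `aliases` passes of the self-union and returns the final
  // deterministic constraints together with the peak intermediate size.
  def propagate(start: Vector[Expr], aliases: Int, filterEarly: Boolean): (Vector[Expr], Int) = {
    var current = start
    var peak = current.size
    for (_ <- 1 to aliases) {
      // Hypothetical early filter: only deterministic entries are copied.
      val source = if (filterEarly) current.filter(_.deterministic) else current
      current = union(current, source)
      peak = math.max(peak, current.size)
    }
    // Final filter, mirroring the `c.deterministic` check in `constraints`.
    (current.filter(_.deterministic), peak)
  }
}
```

With one deterministic and one non-deterministic starting constraint and 10 aliases, the unfiltered run peaks at 1025 entries while the filtered run never exceeds 2, and both return the same single deterministic constraint.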
### Why are the changes needed?
Without this change, planning can generate an exponential number of expressions and/or run out of memory.
### Does this PR introduce _any_ user-facing change?
No
### How was this patch tested?
Local test
Closes #35233 from Stelyus/SPARK-37290.
Authored-by: Franck Thang <[email protected]>
Signed-off-by: Wenchen Fan <[email protected]>
(cherry picked from commit 881f562)
Signed-off-by: Wenchen Fan <[email protected]>