-
Notifications
You must be signed in to change notification settings - Fork 1.7k
C++: identify back-edges in the control flow graph. #639
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is some beautiful code. I wonder if this applies to the IR at all, given that there's no tree form of the IR. It's always a graph.
Some of the comments in https://git.semmle.com/Semmle/code/pull/23551 continue to apply here.
I don't understand how continue
and goto
are handled in this code.
exprparents(child,i,parent) or | ||
stmtparents(child,i,parent) or | ||
stmt_decl_bind(parent,i,child) or | ||
(initialisers(child,parent,_,_) and i = 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After #26937 we no longer call dbscheme relations directly from outside the AST classes. I've written almost this predicate recently as part of my QL CFG work. My version is just missing the i
. Also note the comment about ConditionDeclExpr
, which doesn't apply here. It would need to be added.
private class Node extends ControlFlowNodeBase {
/**
* Gets the nearest control-flow node that's a parent of this one, never
* crossing function boundaries.
*/
final Node getParentNode() {
result = this.(Expr).getParent()
or
result = this.(Stmt).getParent()
or
// An Initializer under a ConditionDeclExpr is not part of the CFG.
result.(DeclStmt).getADeclaration().(LocalVariable) = this.(Initializer).getDeclaration()
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really like the idea of building this on top of your QL CFG work. The purpose of this predicate is really to define the order of evaluation within an expression tree. It makes sense that you have written almost the same logic for the CFG work. So it would be much better to reuse your logic here.
// happen that the same element is listed multiple times with | ||
// different indices. For example, in the expression `a?:b` `a` has | ||
// index 0 and 1. | ||
i = min (int j | parents(element, j, child) | j) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note to @ian-semmle : this is another workaround for https://jira.semmle.com/browse/CPP-297
| (child = loop.getInitialization() and i = 0) or | ||
(child = loop.getCondition() and i = 1) or | ||
(child = loop.getStmt() and i = 2) or | ||
(child = loop.getUpdate() and i = 3)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There should probably also be a case for RangeBasedForStmt
.
element instanceof VlaDimensionStmt or | ||
element instanceof AsmStmt or | ||
element instanceof DeclStmt or | ||
element instanceof Initializer) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are there any statements deliberately missing from this list? If all statements should be here, except a few special ones, then I suggest flipping the logic around to explicitly list the special ones.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote this list by trial-and-error. You might be right that it should apply to all statements.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If that's the case, then I think this list is almost exactly identical to the predicate I have in the QL CFG code that describes all the CFG nodes that should be visited before their children.
I think this PR is valuable, but for #633 it's IR back edges we want. Can we construct a good approximation of those while building the IR CFG? On |
I agree that what #633 needs is IR back edges. I'm not sure that we should construct them while building the IR CFG; @dave-bartolomeo and I have discussed inserting inference operators/pi nodes that might require manipulating the CFG while building later stages of the IR. |
If we need the IR back-edge identification now, I don't see why a hypothetical feature should block the simplest solution. If we want to add the hypothetical feature in the future, we'd need to pay for the loss of simplicity at that point. |
#648 might also invalidate back-edge identification, but I haven't thought hard about it yet. |
I'd like to see some tests for this, covering things like [nested] loops, |
Draft library for identifying back-edges in the control flow graph. This is an improved version of the draft which I previously posted on our internal repo (#23551). This version works pretty well on the examples that I have looked at. I haven't checked all the corner cases though.
To handle all the corner cases correctly, it might be easier to build this on top of the new IR.
@rdmarsh2: I think you need something like this for #633.