Skip to content

C++: identify back-edges in the control flow graph. #639

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

kevinbackhouse
Copy link
Contributor

Draft library for identifying back-edges in the control flow graph. This is an improved version of the draft which I previously posted on our internal repo (#23551). This version works pretty well on the examples that I have looked at. I haven't checked all the corner cases though.

To handle all the corner cases correctly, it might be easier to build this on top of the new IR.

@rdmarsh2: I think you need something like this for #633.

@kevinbackhouse kevinbackhouse added the WIP This is a work-in-progress, do not merge yet! label Dec 7, 2018
@kevinbackhouse kevinbackhouse requested a review from a team as a code owner December 7, 2018 12:29
Copy link
Contributor

@jbj jbj left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is some beautiful code. I wonder if this applies to the IR at all, given that there's no tree form of the IR. It's always a graph.

Some of the comments in https://git.semmle.com/Semmle/code/pull/23551 continue to apply here.

I don't understand how continue and goto are handled in this code.

exprparents(child,i,parent) or
stmtparents(child,i,parent) or
stmt_decl_bind(parent,i,child) or
(initialisers(child,parent,_,_) and i = 0)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

After #26937 we no longer call dbscheme relations directly from outside the AST classes. I've written almost this predicate recently as part of my QL CFG work. My version is just missing the i. Also note the comment about ConditionDeclExpr, which doesn't apply here. It would need to be added.

private class Node extends ControlFlowNodeBase {
  /**
   * Gets the nearest control-flow node that's a parent of this one, never
   * crossing function boundaries.
   */
  final Node getParentNode() {
    result = this.(Expr).getParent()
    or
    result = this.(Stmt).getParent()
    or
    // An Initializer under a ConditionDeclExpr is not part of the CFG.
    result.(DeclStmt).getADeclaration().(LocalVariable) = this.(Initializer).getDeclaration()
  }
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I really like the idea of building this on top of your QL CFG work. The purpose of this predicate is really to define the order of evaluation within an expression tree. It makes sense that you have written almost the same logic for the CFG work. So it would be much better to reuse your logic here.

// happen that the same element is listed multiple times with
// different indices. For example, in the expression `a?:b` `a` has
// index 0 and 1.
i = min (int j | parents(element, j, child) | j)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to @ian-semmle : this is another workaround for https://jira.semmle.com/browse/CPP-297

| (child = loop.getInitialization() and i = 0) or
(child = loop.getCondition() and i = 1) or
(child = loop.getStmt() and i = 2) or
(child = loop.getUpdate() and i = 3))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should probably also be a case for RangeBasedForStmt.

element instanceof VlaDimensionStmt or
element instanceof AsmStmt or
element instanceof DeclStmt or
element instanceof Initializer)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are there any statements deliberately missing from this list? If all statements should be here, except a few special ones, then I suggest flipping the logic around to explicitly list the special ones.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wrote this list by trial-and-error. You might be right that it should apply to all statements.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If that's the case, then I think this list is almost exactly identical to the predicate I have in the QL CFG code that describes all the CFG nodes that should be visited before their children.

@jbj
Copy link
Contributor

jbj commented Dec 7, 2018

I think this PR is valuable, but for #633 it's IR back edges we want. Can we construct a good approximation of those while building the IR CFG? On TranslatedStmt there can be a predicate to enumerate the back edges from it just like there are predicates to enumerate other edges. This will be something like edges going around a loop, edges from continue, and edges from goto that are not obviously moving forward.

@rdmarsh2
Copy link
Contributor

rdmarsh2 commented Dec 7, 2018

I agree that what #633 needs is IR back edges. I'm not sure that we should construct them while building the IR CFG; @dave-bartolomeo and I have discussed inserting inference operators/pi nodes that might require manipulating the CFG while building later stages of the IR.

@jbj
Copy link
Contributor

jbj commented Dec 10, 2018

If we need the IR back-edge identification now, I don't see why a hypothetical feature should block the simplest solution. If we want to add the hypothetical feature in the future, we'd need to pay for the loss of simplicity at that point.

@rdmarsh2
Copy link
Contributor

#648 might also invalidate back-edge identification, but I haven't thought hard about it yet.

@geoffw0
Copy link
Contributor

geoffw0 commented Dec 12, 2018

I'd like to see some tests for this, covering things like [nested] loops, continue, break and goto.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
WIP This is a work-in-progress, do not merge yet!
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants