-
Notifications
You must be signed in to change notification settings - Fork 1.7k
Java/C++/C#: Misc. dataflow fixes. #2038
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
@@ -213,6 +213,16 @@ private module ImplCommon { | |||
argumentValueFlowsThrough(node1, node2, _) | |||
} | |||
|
|||
private predicate parameterValueFlowNoCtx(ParameterNode p, Node node) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't quite follow this rewrite; can you elaborate on why we need both parameterValueFlowNoCtx
and parameterValueFlowCand
, and in particular why the uses of parameterValueFlowNoCtx
no longer should use flow through callables?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
parameterValueFlowNoCtx
still uses flow through callables as that's included in localValueStep
. The difference between parameterValueFlowCand
and parameterValueFlowNoCtx
is that the latter is a slight precision improvement due to call context being used in the nested flow through callables. I wanted to remove the direct dependency on parameterValueFlowCand
so it can be private in the nested module that's introduced in the next commit, and with that change it makes more sense to define parameterValueFlowNoCtx
like this based on the resulting argumentValueFlowsThrough(node1, node2, _)
- a side-effect is then a slight precision improvement for the detection of getters and setters.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Would be nice with a test case that covers this.
pragma[noinline] | ||
private ParameterNode getAParameter(DataFlowCallable c) { result.getEnclosingCallable() = c } | ||
private module FlowThrough_v1 { | ||
private predicate step = simpleLocalFlowStep/2; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This alias is only called twice, do we really need it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The FlowThrough_*
modules are essentially two instantiations of an implicit parameterized module. The step
relation is the input, and the only thing that differs between _v1
and _v2
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's also what I reckoned, but I think we should then at least have a comment saying that, when not using pyrameterization.
} | ||
|
||
cached | ||
module FlowThrough_v2 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps add a comment about how this modules is different from FlowThrough_v1
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain why we need two FlowThrough
modules? AFAICT, they both restrict to value-preserving flow, and v2 adds local flow through fields (localStoreReadStep
); why is local flow through fields not allowed in the localValueStep
relation (which is the only use of FlowThrough_v1
)?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The localValueStep
relation is used to define getters and setters and making it use field flow would turn everything into a big messy SCC. So the basic dependency order is: Basic flow-through, which is then refined with call contexts (that's _v1
), use this to define localValueStep
, use this to define getters and setters, use getters and setters to define the store-localflow-read step as an additional local flow step, with this additional local flow through fields recalculate flow through methods (that's _v2
). The reason why we want this additional flow included in the final value-flow-through relation is that the store-localflow-read step is already used in DataFlowImpl
to calculate taint-flow-through. This means that until now value-flow-through that makes use of store-localflow-read has been miscategorized as a taint-summary rather than a value-summary. In follow-up work I intend to fix the taint-flow-through calculation to no longer accidentally include value-flow-through, and that means that without this change we'd lose value-flow-through for those callables that are relying on a store-localflow-read step.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I second @hvitved 's request to turn this explanation into code comments.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, but please turn your answers to @hvitved's questions into code comments.
A number of mostly independent minor fixes to dataflow (commit-by-commit review is encouraged).
The "Improve taint/value distinction for flow through with fields" commit is the largest code-wise, but doesn't actually do much right now - it merely recategorises flow through methods that require a
localStoreReadStep
from being a summarised as a taint step (which was slightly wrong, but didn't matter much) to being summarised as a value step. The main reason for this change is to prepare for the introduction of taint-getters and taint-setters, which will rewrite the current taint-flow-through summarisation.