Java: Add data-flow consistency checks.#2921
Conversation
jbj
left a comment
There was a problem hiding this comment.
I'm happy with the proposed resolution for the two unresolved questions in the PR description.
The next step can be to duplicate this for other languages and import the new module from a few new one-line ql files in the dataflow test directories.
max-schaefer
left a comment
There was a problem hiding this comment.
@aschackmull, I have questions about two of the consistency checks. They currently don't hold for Go (because I went with a very lazy implementation of post-update nodes), and I'm wondering how bad that is.
| } | ||
|
|
||
| query predicate postIsNotPre(PostUpdateNode n, string msg) { | ||
| n.getPreUpdateNode() = n and msg = "PostUpdateNode should not equal its pre-update node." |
There was a problem hiding this comment.
Could you comment briefly on what happens if this condition is violated?
There was a problem hiding this comment.
Flow that exits a method call through a side-effect on an argument may re-enter the method. E.g. in the following you'll get wrong flow from source to sink if you have a call to someMethod where the PostUpdateNode for the argument equals the ArgumentNode:
someMethod(A a) {
sink(a.f);
a.f = source();
}
| exists(int c | | ||
| c = count(n.getPreUpdateNode()) and | ||
| c != 1 and | ||
| msg = "PostUpdateNode should have one pre-update node but has " + c + "." |
There was a problem hiding this comment.
One example is that a store to a PostUpdateNode that doesn't have a corresponding pre-update won't be able to exit through a parameter: in someMethod(A a) { a.f = source(); } the store targets the post-update read of a in the qualifier of a.f, combined with local value flow from the ParameterNode to the corresponding pre-update node, the store is observed as a side-effect of the method and can then exit through the parameter and reach a PostUpdateNode of an argument in a call to someMethod. I don't really have an overview of what would happen if a PostUpdateNode has multiple pre-update nodes.
There was a problem hiding this comment.
Right, I was mostly asking about the case of multiple pre-update nodes. Glad to hear it's not known to mess things up massively.
|
Thanks for your responses, @aschackmull. I think we'll stick with our current simple implementation of post-update nodes for Go, and switch to a more sophisticated model when we start seeing too much imprecision. |
|
Should the new module be named |
For these kinds of queries "consistency" is the preferred term over "sanity". |
hvitved
left a comment
There was a problem hiding this comment.
Great initiative, @aschackmull .
| query predicate uniqueTypeRepr(Node n, string msg) { | ||
| exists(int c | | ||
| n instanceof RelevantNode and | ||
| c = count(getErasedRepr(n.getTypeBound())) and |
There was a problem hiding this comment.
You could use getErasedNodeTypeBound here instead...
There was a problem hiding this comment.
That requires an additional import of DataFlowImplCommon, so perhaps just leave as-is?
|
As it stands, this PR just introduces a lot of unreachable code in qll files. Is there any value to merging that before we have ql/qlref files that reference it? For C/C++ (AST and IR), I suggest putting a |
|
I've added the C++ tests as well as 4 additional checks related to |
|
Let's get it in. I will add C# tests in a follow-up PR. |
This adds a number of consistency queries (i.e. queries that ought to return zero results) for the shared data-flow implementation. The main file is language-independent and can be synced across languages. The idea is that this should help document expectations and invariants in the shared implementation. For Java, these queries mostly give zero results, but extractor issues can result in nodes without callables, and the TypeFlow library can give multiple type bounds in some corner cases.
A few unresolved questions:
isImmutableOrUnobservable(Node n)to disregard arguments withoutPostUpdateNodes when they have been consciously excluded. Currently I've put this inDataFlowPrivate.qll, but perhaps this predicate ought to go in its own file?<lang>/ql/test/instead? A related question is how our plumming for consistency queries work at the moment and whether that gives some restrictions?When we are happy with the structure, I can sync this to C# and C++.