JS: Add flow label for tainted objects and sharpen NosqlInjection#305
Conversation
There was a problem hiding this comment.
This looks very nice.
Food for thought, in addition to the inline comments:
Since the deeply tainted objects origin from JSON, an attacker is unable to add new methods, so most method calls on a deeply tainted object gives information about its type, and hence could be thought of as implicit sanitizers. E.g.
if (tainted.substr(2, 3) == x){
tainted; // sanitized
}
Interestingly, the implicit exception makes tainted.substr(2, 3) a sanitizer for all the nodes it dominates:
tainted.substr(2, 3)
tainted; // sanitized
(I have observed a few cases for js/different-kinds-comparison-bypass where such a sanitizer would have been useful)
| } | ||
|
|
||
| /** | ||
| * Gets the flow label representing a deeply tainted objects. |
| /** | ||
| * Gets the flow label representing a deeply tainted objects. | ||
| * | ||
| * A "tainted object" is an array or object whose values are all assumed to be tainted as well. |
| } | ||
| } | ||
|
|
||
| // TODO: string tests should be classified as sanitizer guards; need support for flow labels on guards |
| /** | ||
| * Holds if this barrier guard only blocks specific labels, and `label` is one of them. | ||
| */ | ||
| predicate blocksSpecificLabel(FlowLabel label) { none() } |
There was a problem hiding this comment.
blocksAllLabels and blocksSpecificLabel are implied by each other, do we really need both?
We should at least document that none or both should be overridden.
There was a problem hiding this comment.
I've added documentation about the overrides.
I agree it's a bit ugly to have both, but defaulting to any() is expensive. Inefficient APIs tend to get hacks and workarounds built on top of them, which I think is the greater evil.
There was a problem hiding this comment.
blocks(FlowLabel label) clashes with blocks(DataFlow::Node node) as QL doesn't have type-based overloading.
isSpecificTo(FlowLabel label)?
There was a problem hiding this comment.
That's a shame. I note, though, that blocks(DataFlow::Node) is marked internal, so we can rename it.
There was a problem hiding this comment.
I'm not convinced about blocks. Can we please agree on another name?
If the TaintKind alias means we can't name predicates properly then we should remove the alias instead of being indecisive about it.
There was a problem hiding this comment.
Sure, I'm not particularly attached to calling it blocks, I just want to keep the name simple. You seem to have much stronger opinions on this than me, so I'll just leave it up to you.
That's a good point, and I think it's feasible to do it. However, if that's the only sanitizer standing in the way of an exploit I think it just might be worth reporting the alert anyway. |
xiemaisi
left a comment
There was a problem hiding this comment.
LGTM overall, I have a few suggestions about names and other minor things, and of course I'd like to see the results of the evaluation.
I'm also wondering whether NoSQL databases perhaps sometimes interpret user-supplied strings as regular expressions or some other kind of pattern? Do we want to count that as an injection vulnerability?
| /** | ||
| * Holds if flow with label `lbl` cannot flow into `node`. | ||
| */ | ||
| predicate isBarrierForLabel(DataFlow::Node node, FlowLabel lbl) { |
There was a problem hiding this comment.
How about renaming this to isBarrierFor? It's shorter and avoids the flow label vs taint kind terminology controversy. We should probably also rename the previous predicate for consistency (it's just been introduced, so no compatibility problems).
| * To block specific labels only, subclasses should override this with `none()` and | ||
| * also override `blocksSpecificLabel`. | ||
| */ | ||
| predicate blocksAllLabels() { any() } |
| /** | ||
| * Holds if this barrier guard only blocks specific labels, and `label` is one of them. | ||
| */ | ||
| predicate blocksSpecificLabel(FlowLabel label) { none() } |
| or | ||
| vp = false and | ||
| (predlbl = FlowLabel::data() or predlbl = FlowLabel::taint()) and | ||
| succlbl = FlowLabel::taint() |
| } | ||
|
|
||
| /** | ||
| * A source of a user-controlled deep object object. |
| /** | ||
| * Holds if this can be a user-controlled deep object, such as a JSON object parsed from user-controlled data. | ||
| */ | ||
| predicate isDeepObject() { none() } |
There was a problem hiding this comment.
I don't like the name much; the word "deep" by itself doesn't seem to bring out the intended meaning of this predicate. (Similarly, I think the qldoc comment needs to be a bit more explicit about what "deep" means in this context.)
There was a problem hiding this comment.
isUserControlledObject?
|
Evaluation shows a bunch of changes to client-side URL redirection due to the change I made. It's hard to see if it's an improvement overall. |
|
I've introduced Please take another look. |
xiemaisi
left a comment
There was a problem hiding this comment.
Great stuff, I really like the new API!
Did you want to look in more detail at the result changes, or is this good to merge?
|
I think it's good to merge. |
Desugar array literals to `::Array.[]`
Kotlin: Give context to diagnostics
Please ignore the first two commits, they are from #299 currently in flight.
This adds support for tracking objects that "deeply" user-controlled, i.e. the user controls the properties of the object and recursively the values of those properties. These are simply called "tainted objects". The result of
JSON.parse(x)is such a tainted object ifxis tainted.This mechanism is then used in
NosqlInjectionto avoid false positives from things likereq.params.xwhich can only be strings.This PR does a bunch of things at once, as I'm hoping this makes it easier to check the validity of the approach. I'm happy to split out some of the commits later, such as the bugfix commit.
This definitely needs a thorough evaluation, but I would like to have the design reviewed first. It's also blocked on https://jira.semmle.com/browse/QL-716 at the moment.