Skip to content

JS: Add flow label for tainted objects and sharpen NosqlInjection#305

Merged
xiemaisi merged 12 commits into
github:masterfrom
asger-semmle:json-taint-kind
Oct 22, 2018
Merged

JS: Add flow label for tainted objects and sharpen NosqlInjection#305
xiemaisi merged 12 commits into
github:masterfrom
asger-semmle:json-taint-kind

Conversation

@asger-semmle
Copy link
Copy Markdown
Contributor

Please ignore the first two commits, they are from #299 currently in flight.

This adds support for tracking objects that "deeply" user-controlled, i.e. the user controls the properties of the object and recursively the values of those properties. These are simply called "tainted objects". The result of JSON.parse(x) is such a tainted object if x is tainted.

This mechanism is then used in NosqlInjection to avoid false positives from things like req.params.x which can only be strings.

This PR does a bunch of things at once, as I'm hoping this makes it easier to check the validity of the approach. I'm happy to split out some of the commits later, such as the bugfix commit.

This definitely needs a thorough evaluation, but I would like to have the design reviewed first. It's also blocked on https://jira.semmle.com/browse/QL-716 at the moment.

@asger-semmle asger-semmle requested a review from a team as a code owner October 10, 2018 18:04
@ghost ghost self-assigned this Oct 11, 2018
Copy link
Copy Markdown

@ghost ghost left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks very nice.

Food for thought, in addition to the inline comments:

Since the deeply tainted objects origin from JSON, an attacker is unable to add new methods, so most method calls on a deeply tainted object gives information about its type, and hence could be thought of as implicit sanitizers. E.g.

if (tainted.substr(2, 3) == x){
  tainted; // sanitized
} 

Interestingly, the implicit exception makes tainted.substr(2, 3) a sanitizer for all the nodes it dominates:

tainted.substr(2, 3)
tainted; // sanitized

(I have observed a few cases for js/different-kinds-comparison-bypass where such a sanitizer would have been useful)

}

/**
* Gets the flow label representing a deeply tainted objects.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/objects/object

/**
* Gets the flow label representing a deeply tainted objects.
*
* A "tainted object" is an array or object whose values are all assumed to be tainted as well.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

whose property values

}
}

// TODO: string tests should be classified as sanitizer guards; need support for flow labels on guards
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(delete)

/**
* Holds if this barrier guard only blocks specific labels, and `label` is one of them.
*/
predicate blocksSpecificLabel(FlowLabel label) { none() }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocksAllLabels and blocksSpecificLabel are implied by each other, do we really need both?
We should at least document that none or both should be overridden.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added documentation about the overrides.

I agree it's a bit ugly to have both, but defaulting to any() is expensive. Inefficient APIs tend to get hacks and workarounds built on top of them, which I think is the greater evil.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocks(FlowLabel label)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocks(FlowLabel label) clashes with blocks(DataFlow::Node node) as QL doesn't have type-based overloading.

isSpecificTo(FlowLabel label)?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's a shame. I note, though, that blocks(DataFlow::Node) is marked internal, so we can rename it.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced about blocks. Can we please agree on another name?

If the TaintKind alias means we can't name predicates properly then we should remove the alias instead of being indecisive about it.

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, I'm not particularly attached to calling it blocks, I just want to keep the name simple. You seem to have much stronger opinions on this than me, so I'll just leave it up to you.

@asger-semmle
Copy link
Copy Markdown
Contributor Author

Interestingly, the implicit exception makes tainted.substr(2, 3) a sanitizer for all the nodes it dominates:

That's a good point, and I think it's feasible to do it. However, if that's the only sanitizer standing in the way of an exploit I think it just might be worth reporting the alert anyway.

@ghost ghost added JS WIP This is a work-in-progress, do not merge yet! labels Oct 12, 2018
Copy link
Copy Markdown

@xiemaisi xiemaisi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM overall, I have a few suggestions about names and other minor things, and of course I'd like to see the results of the evaluation.

I'm also wondering whether NoSQL databases perhaps sometimes interpret user-supplied strings as regular expressions or some other kind of pattern? Do we want to count that as an injection vulnerability?

/**
* Holds if flow with label `lbl` cannot flow into `node`.
*/
predicate isBarrierForLabel(DataFlow::Node node, FlowLabel lbl) {
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about renaming this to isBarrierFor? It's shorter and avoids the flow label vs taint kind terminology controversy. We should probably also rename the previous predicate for consistency (it's just been introduced, so no compatibility problems).

* To block specific labels only, subclasses should override this with `none()` and
* also override `blocksSpecificLabel`.
*/
predicate blocksAllLabels() { any() }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As above, I'd prefer blocksAll.

/**
* Holds if this barrier guard only blocks specific labels, and `label` is one of them.
*/
predicate blocksSpecificLabel(FlowLabel label) { none() }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

blocks(FlowLabel label)

or
vp = false and
(predlbl = FlowLabel::data() or predlbl = FlowLabel::taint()) and
succlbl = FlowLabel::taint()
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

}

/**
* A source of a user-controlled deep object object.
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/object object/object/

/**
* Holds if this can be a user-controlled deep object, such as a JSON object parsed from user-controlled data.
*/
predicate isDeepObject() { none() }
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't like the name much; the word "deep" by itself doesn't seem to bring out the intended meaning of this predicate. (Similarly, I think the qldoc comment needs to be a bit more explicit about what "deep" means in this context.)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

isUserControlledObject?

Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

@asger-semmle
Copy link
Copy Markdown
Contributor Author

Evaluation shows a bunch of changes to client-side URL redirection due to the change I made. It's hard to see if it's an improvement overall.

@asger-semmle
Copy link
Copy Markdown
Contributor Author

I've introduced LabeledBarrierGuardNode and LabeledSanitizerGuardNode instead of the somewhat confusing blocksAll() and blocksSpecificLabel() API.

Please take another look.

Copy link
Copy Markdown

@xiemaisi xiemaisi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great stuff, I really like the new API!

Did you want to look in more detail at the result changes, or is this good to merge?

@asger-semmle
Copy link
Copy Markdown
Contributor Author

I think it's good to merge.

@asger-semmle asger-semmle removed the WIP This is a work-in-progress, do not merge yet! label Oct 22, 2018
@xiemaisi xiemaisi merged commit 7702b58 into github:master Oct 22, 2018
@kamarcum kamarcum unassigned ghost Apr 28, 2020
aibaars pushed a commit that referenced this pull request Oct 14, 2021
smowton pushed a commit to smowton/codeql that referenced this pull request Apr 16, 2022
Kotlin: Give context to diagnostics
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants