JS: add sources and sinks for typeahead.js#2429
Conversation
esbena
left a comment
There was a problem hiding this comment.
This is very nice, thank you for modelling how the source part of typeahead works.
My main concern is about the semantic consequences of the well-meaning improvements to the path-explanations.
Other things to consider:
- performance evaluation since we have added default taint steps
- change notes
- is the model general enough to support other suggestion providers than
bloodhound? (I think it is, but please double check, perhaps make a note of other common suggestion providers that may be worth modelling later).
| module Typeahead { | ||
| /** | ||
| * A reference to the Bloodhound class, which is a utility-class for generating auto-complete suggestions. | ||
| * Sometimes these suggestions can originate from remote sources. |
There was a problem hiding this comment.
Sometimes these suggestions can originate from remote sources.
This part of the docstring can be dropped. We care about remote sources elsewhere.
| */ | ||
| class Bloodhound extends DataFlow::SourceNode { | ||
| Bloodhound() { | ||
| this = DataFlow::moduleImport("typeahead.js/dist/bloodhound.js") |
There was a problem hiding this comment.
😱
I think the more correct way to import bloodhound is as require('bloodhound-js') this package: https://www.npmjs.com/package/bloodhound-js. Can we support that as well?
| * or an object containing an "url" property. | ||
| */ | ||
| override DataFlow::Node getUrl() { | ||
| if exists(option.getALocalSource().getAPropertyWrite("url")) |
There was a problem hiding this comment.
It is usually fine to just formulate this as result = option.getALocalSource().getAPropertyWrite("url").getRhs() or result = option. I can't recall any examples right now though. The more I think about this, the more it feels like a deja vu discussion...
Both formulations have drawbacks, the current formulation would fail to find urlB in the following case:
new Bloodhound({remote: advanced? {url: urlA, ...}: urlB})
My proposed formulation would also find {url: urlA, ...} to be the URL,
There was a problem hiding this comment.
If the getUrl() is used together with mayHaveStringValue(..), then the spurious value from your proposed formulation is not an issue..
I'll change it to your suggestion.
| // the first occurrence of the responseDataNode can be very disconnected from the instantiation of Bloodhound | ||
| // So I do this trick to get a taint-path that is readable to a developer. | ||
| // The above (possibly with added type-tracking) would be the correct way, but which gives unhelpful feedback to developers. | ||
| result = this |
There was a problem hiding this comment.
This is starting to become a common concern, I also mentioned this lack of fully explained path in in #2324. The problem is a UI issue though, and may be resolved by having multiple path explanations for each result. I think we should steer clear of this in QL for now. We can loop @sj in the next time we encounter this.
As such, I would prefer the other solution (using local dataflow for starters) and then letting definitions.ql aid the developer in figuring out why the flow is remote.
Besides, the current solution is semantically wrong: the response data node is not the instance, period. Treating ClientRequest as a remote flow source is just begging for unrelated problems when the request itself flows to some other sink. I agree that it could make sense to add any(RemoteBloodhoundClientRequest r) to Configuration::isSink(s) in a custom query, but in general we should stick to the semantically correct formulation.
|
|
||
| TypeaheadSuggestionFunction() { | ||
| typeaheadCall = JQuery::objectRef().getAMethodCall("typeahead") and | ||
| this = typeaheadCall |
There was a problem hiding this comment.
this matches $(...).typeahead(..., { templates: { suggestion: <this> } }), right? Can we add that as a comment?
| ( | ||
| pred = this | ||
| or | ||
| pred = this.getAFunctionValue().getParameter(1).getACall().getAnArgument() |
There was a problem hiding this comment.
Can we add a source code example for this disjunct?
It would also be nice if another comment could explain why it makes sense that pred is defined by two disjuncts that are so different from each other. It smells like we are trying very hard to provide useful paths for the programmer, which we unfortunately always can if it means that we are cutting semantic corners (c.f. discussion for the RemoteBloodhoundClientRequest class).
There was a problem hiding this comment.
After changing to the semantically correct solution (discussion above), this disjunct will no longer be necessary.
| * A taint step that models a call to `.ttAdapter()` on an instance of Bloodhound. | ||
| */ | ||
| class BloodHoundAdapterStep extends TaintTracking::AdditionalTaintStep, BloodhoundInstance { | ||
| DataFlow::Node successor; |
| /** | ||
| * A taint step that models a call to `.ttAdapter()` on an instance of Bloodhound. | ||
| */ | ||
| class BloodHoundAdapterStep extends TaintTracking::AdditionalTaintStep, BloodhoundInstance { |
There was a problem hiding this comment.
This entire class should be removed if we use typetracking in RemoteBloodhoundClientRequest::getAResponseDataNode, right.
|
|
||
|
|
||
| from Configuration cfg, DataFlow::PathNode source, DataFlow::PathNode sink | ||
| where cfg.hasFlowPath(source, sink) |
There was a problem hiding this comment.
Can we add:
and source.getNode() instanceof RemoteServerResponse
or perhaps:
and source.getNode() instanceof HeuristicSource
to avoid reporting the ordinary XSS results twice in the tests?
| import semmle.javascript.frameworks.Electron | ||
| import semmle.javascript.frameworks.Files | ||
| import semmle.javascript.frameworks.Firebase | ||
| import semmle.javascript.frameworks.typeahead |
There was a problem hiding this comment.
I think we want this spelled as Typeahead, jQuery is a special case. Also, this list of imports happens to be sorted.
esbena
left a comment
There was a problem hiding this comment.
Getting close.
One general thing:
Can you have a pass over all the predicates and consider which ones we really want to expose to the public?
I think I would like to play it safe in the name of compatibility maintenance with this library and make everything but TypeaheadSuggestionFunction private.
| exists(DataFlow::TypeTracker t2 | result = ref(t2).track(t2, t)) | ||
| } | ||
|
|
||
| DataFlow::SourceNode ref() { result = ref(DataFlow::TypeTracker::end()) } |
There was a problem hiding this comment.
See comment for the above predicate.
Co-Authored-By: Esben Sparre Andreasen <esbena@github.com>
The model does not support other third party suggestion providers. But it does support suggestion providers that are part of the client itself. |
esbena
left a comment
There was a problem hiding this comment.
One final refactoring suggestion (changes semantics slightly)
| DataFlow::CallNode getTypeaheadCall() { result = typeaheadCall } | ||
| } | ||
|
|
||
| /** |
There was a problem hiding this comment.
I am sorry for the repeated refactoring of this class. I think we can untangle the concepts a bit more:
/**
* A `source` option for a typeahead.js plugin instance.
*/
private class TypeaheadSource extends DataFlow::ValueNode {
DataFlow::CallNode typeaheadCall;
TypeaheadSource() {
typeaheadCall = JQuery::objectRef().getAMethodCall("typeahead") and
this = typeaheadCall.getOptionArgument(1, "source")
}
/** Gets a node for a suggestion that this source motivates. */
DataFlow::Node getASuggestion() {
exists(TypeaheadSuggestionFunction suggestionCallback |
suggestionCallback.getTypeaheadCall() = typeaheadCall and
result = suggestionCallback.getParameter(0)
)
}
}
- Now the source exists independently of a suggestion function, which seems semantically cleaner.
- The genericly named
getSuccessorhas a more descriptive name
(If we were to redesign this typeahead model. I think I would introduce a class namedTypeaheadInstance, defined as JQuery::objectRef().getAMethodCall("typeahead"), this class would then expose the connection between the TypeaheadSource and TypeaheadSuggestionFunction, but the current visibility choice does not prevent us from doing that later.)
There was a problem hiding this comment.
(If we were to redesign this typeahead model. I think I would introduce a class namedTypeaheadInstance, defined as
JQuery::objectRef().getAMethodCall("typeahead")
With your change JQuery::objectRef().getAMethodCall("typeahead") is already twice in the model, so I'll go ahead and refactor it.
|
Performance is not that bad, but its also not perfect. |
|
Approved. But lets do a performance evaluation that includes a few more (taint) queries (or just run the security suite) before merging. |
On it. |
Looks ok to me. |
|
Lets rerun the two |
I did the 5 slowest projects instead of just 2. |
|
And now I also got the tests to pass |
The library: typeahead.js
This PR is inspired a CVE that is not flagged as a result of this PR (missing source).
There are multiple parts in this PR, the most important being the XSS sink, demonstrated by the below code:
The
sourcecan be more complicated, especially if a library is used to create it.The below is an example of that, where the
Bloodhoundclass (part of thetypeahead.jslibrary) is used to fetch results from a remote URL:I've modelled this source using the
ClientRequestclass, but I've cheated.The
getAResponseDataNodemethod is supposed to return the first Node where the response occurs. In the above example that would be thevalparameter.However, presenting a developer with a taint-flow that is entirely inside the
suggestionfunction, when the actual source is within theBloodhoundclass, is hardly useful.I've therefore made the
getAResponseDataNodemethod return theBloodhoundinstance itself, even though that is technically incorrect, because it gives a more useful taint-path.The
getAResponseDataNodealso contain some skeleton of the code that would be required to do it the "right way" (it still needs added type-tracking).I've found some TP's using this source/sink pair: https://lgtm.com/query/7028618148576176487/
(They require that
semmle.javascript.heuristics.AdditionalSourcesis imported).None of them are really dangerous.