Skip to content

JS: add MembershipTests.qll#3415

Merged
semmle-qlci merged 6 commits into
github:masterfrom
esbena:js/membershiptest
Jun 2, 2020
Merged

JS: add MembershipTests.qll#3415
semmle-qlci merged 6 commits into
github:masterfrom
esbena:js/membershiptest

Conversation

@esbena

@esbena esbena commented May 5, 2020

Copy link
Copy Markdown
Contributor

Addresses https://github.com/github/codeql-javascript-team/issues/93

Adds MembershipTest, as a generalization of InclusionTest. I am open to merging it all into the existing InclusionTest, but given the array- and string-focused docstrings on InclusionTest, I would rather avoid breakage and simply have a slightly ambigous pair of class names.

The new class improves our taint sanitizers and simplifies js/incomplete-url-scheme-check. The most controversial part of this PR is that equality tests are now treated as sanitizers regardless of whether they have a constant side or not.

The sanitizer improvement apparently triggers a performance boost. I am re-running the latest version (with a fix for the missing prototype-pollution results).

I have built this on top of #3391, so if this is reviewed early, the last commit should to be reviewed instead of the entire diff.

@esbena esbena added JS Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish labels May 5, 2020
@esbena esbena requested a review from a team as a code owner May 5, 2020 13:23

@asgerf asgerf left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool! Very comprehensive. 👏

It seems like a lot of code to save 9 lines of code 😅 but I guess time will tell if it pays off.

I couldn't help poke at the tuple counts a bit to see what might explain the speed-up. It seems barrier guards now rely slightly less on magic so maybe that's it? My previous attempts at getting rid of that magic cascaded into other misoptimizations so kudos for wobbling the optimizer in the right direction 😄.

Do we still need the ConstantComparison sanitizer guard or is it completely generalized by MembershipTest? (For retaining backwards compatibility, it should suffice to just remove its AdditionalSanitizerGuard base class while deprecating it).

Comment thread javascript/ql/src/semmle/javascript/MembershipTests.qll Outdated
/**
* A test for whether a candidate is a member of an array.
*/
class ArrayMembershipTest extends OrdinaryInclusionTest {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Either make this private or expose array via a predicate.

Did you mean for every class here to be public? It's a lot to add to the global scope brought in by import javascript.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have made most classes private, and wrapped them all in MembershipTestImplementations.

class SwitchCaseMembershipTest extends MembershipTest::Range, DataFlow::ValueNode {
SwitchStmt switch;

SwitchCaseMembershipTest() { this = switch.getACase().getExpr().flow() }

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm, I think we're missing a specification of what happens with the result of the check. From the description of MembershipTest I'd expect this to be the result of the check (i.e. a boolean) but here, this refers to the case expression.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That is an excellent point. It does not look like we have any nodes for the implicit comparison that is performed at the case, so the choice of this here is arbitrary.

To avoid feature creep, I have removed this class from the PR.

Looking forward: do we lean towards add a new synthetic data flow node for the implicit case comparisons, or towards complicating MembershipTest with an optional getResultNode (this won't solve the problem of where the test occurs)?

or
exists(EqualityTest eq, Expr null |
eq.flow() = this and
polarity = eq.getPolarity() and

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't there be a booleanNot here? (Equal to null implies falsy)

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Eagle eyes!

@esbena

esbena commented May 5, 2020

Copy link
Copy Markdown
Contributor Author

It seems like a lot of code to save 9 lines of code sweat_smile but I guess time will tell if it pays off.

I think it will, we are moving towards spotting blacklist usage in general, and the added lines are not just a refactoring: "javascript|data".split("|").indexOf(scheme) !== -1) is now recognized by js/incomplete-url-scheme-check for instance...

I couldn't help poke at the tuple counts a bit to see what might explain the speed-up. It seems barrier guards now rely slightly less on magic so maybe that's it? My previous attempts at getting rid of that magic cascaded into other misoptimizations so kudos for wobbling the optimizer in the right direction smile.

Yeah, I got lucky with the wobble. We'll know more when nightly/security/dpm finishes tomorrow.

Do we still need the ConstantComparison sanitizer guard or is it completely generalized by MembershipTest? (For retaining backwards compatibility, it should suffice to just remove its AdditionalSanitizerGuard base class while deprecating it).

It is completely generalized.
Technically, it is a breaking change to remove the base class. But breakage is unlikely, so I will make the change anyway.

*/
class ArrayMembershipTest extends OrdinaryInclusionTest {
DataFlow::ArrayCreationNode array;
module MembershipTestImplementations {

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm a little sad to see yet another naming convention emerge here.

This is public API (it's used by a query) so it shouldn't be half-hidden behind a name like "implementations". Looking at the use-case in the prototype pollution query, it doesn't strike me as an implementation detail that there are different kinds of membership tests, so I think we should just merge it with the MembershipTest module itself.

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point. I have merged the two modules.

@esbena

esbena commented May 6, 2020

Copy link
Copy Markdown
Contributor Author

The dpm evaluation is more neutral, so it looks like the rebase, 7139422, or solar flares unwobbled the optimizer.
WDYT? Should we try to reproduce it?

@asgerf

asgerf commented May 6, 2020

Copy link
Copy Markdown
Contributor

Hm, likely a fluke then but maybe run a smoke-test to verify if addressing the last few commits caused it.

What's up with amphtml failing the baseline, and missing results, though? The missing results look like a potential polarity issue.

@esbena

esbena commented May 6, 2020 via email

Copy link
Copy Markdown
Contributor Author

@esbena

esbena commented May 7, 2020

Copy link
Copy Markdown
Contributor Author

Hm, likely a fluke then but maybe run a smoke-test to verify if addressing the last few commits caused it.

Will do.

What's up with amphtml failing the baseline, and missing results, though? The missing results look like a potential polarity issue.

Two test-FPs that have disappeared due to the following pattern where we wrongly classify something as a whitelist check.

   newNumber = prefix + Math.floor(Math.random() * 10000); // TAINTED
      if (currentList.indexOf(newNumber) === -1) { // SANITIZED (wrongly)
        currentList.push(newNumber);

We could fiddle (later) with the membership tests to ensure that this non-duplication pattern is supported, but that shouldn't matter in general for taint analysis.

@asgerf

asgerf commented May 7, 2020

Copy link
Copy Markdown
Contributor

My worry is that the following is now being seen as a sanitizer in the true case, which it clearly shouldn't be:

if (currentList.indexOf(newNumber) === -1)

From your comment it's not clear to me why this isn't considered a problem. It's a bug that was introduced by this PR.

My theory is now that the above is three kinds of membership tests at once:

  • OrdinaryInclusionTest
  • EqualityLeftMembershipTest
  • EqualityRightMembershipTest

As a result, the call to getPolarity returns both true and false, and getCandidate returns -1, currentList.indexOf(newNumber) and newNumber.

I remember writing a comment about this in my original review of the two equality membership classes and I'm quite frustrated to find that it has simply disappeared.

The problem is that when classes have overlapping this values there is no way to call two member predicates on an instance, without getting a mix-and-match of values contributed by two different classes. The only way I can see to fix this is to back MembershipTest by a newtype instead.

@asgerf

asgerf commented May 7, 2020

Copy link
Copy Markdown
Contributor

Coming back at this, maybe a less invasive change is to explicitly exclude equality tests that are also InclusionTest from the Equality{Left,Right}MembershipTest classes?

@esbena

esbena commented May 15, 2020

Copy link
Copy Markdown
Contributor Author

How about changing the class structure such that the candidate is used in the charpred? That ought to be functional.

class MembershipTestCandidate ... {
    DataFlow::Node getTest() { ... }
    boolean getTestPolarity(){ ... }
    DataFlow::Node getATestMemberNode() { ... }
    string getATestMemberString() { ... }

}

@asgerf

asgerf commented May 15, 2020

Copy link
Copy Markdown
Contributor

That might work. Technically we could even bring back the switch variant and just let getTest have no result.

@esbena esbena force-pushed the js/membershiptest branch 2 times, most recently from d733d24 to 3cc6ae4 Compare May 18, 2020 09:46
@esbena esbena force-pushed the js/membershiptest branch from 3cc6ae4 to b3691cd Compare May 18, 2020 09:51
@esbena

esbena commented May 18, 2020

Copy link
Copy Markdown
Contributor Author

How about changing the class structure such that the candidate is used in the charpred?

Done. The changes are mainly textual and then ther required candidate/test swapping.
It is probably easiest to see the differences tor the clients:

we could even bring back the switch variant

Done.


A new evaluation is required.

@esbena

esbena commented May 20, 2020

Copy link
Copy Markdown
Contributor Author

Evaluation on nightly/security has excellent performance.

@esbena esbena removed the Awaiting evaluation Do not merge yet, this PR is waiting for an evaluation to finish label May 20, 2020

@asgerf asgerf left a comment

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very minor nit then LGTM

Comment thread javascript/ql/test/query-tests/Security/CWE-020/IncompleteUrlSchemeCheck.js Outdated
…chemeCheck.js

Co-authored-by: Asger F <asgerf@github.com>
asgerf
asgerf previously approved these changes May 22, 2020
@erik-krogh

Copy link
Copy Markdown
Contributor

After this is merged we could look at a generalization to support the sanitizer guard in https://github.com/github/codeql-javascript-CVE-coverage/issues/714.

if (typeof type !== 'undefined' && RRecordTypes.indexOf(type) === -1) {
  console.error('Invalid rrtype:', type);
  return null;
}

@esbena

esbena commented Jun 2, 2020

Copy link
Copy Markdown
Contributor Author

Can get a thumbs up on the conflict resolution?

@semmle-qlci semmle-qlci merged commit e7800d4 into github:master Jun 2, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants