Skip to content

C++: Query for multiplications used in allocations.#4810

Merged
MathiasVP merged 6 commits intogithub:mainfrom
geoffw0:multtoalloc
Jan 7, 2021
Merged

C++: Query for multiplications used in allocations.#4810
MathiasVP merged 6 commits intogithub:mainfrom
geoffw0:multtoalloc

Conversation

@geoffw0
Copy link
Contributor

@geoffw0 geoffw0 commented Dec 10, 2020

Experimental and rather quickly put together query for (non-constant) multiplication results that are used as the size for an allocation. This is a practically motivated query intended to be a fast way to find a common pattern for potential vulnerabilities - e.g. it showed some promise on VLC.

Related:

  • cpp/integer-multiplication-cast-to-long "Multiplication result converted to larger type"
  • cpp/uncontrolled-allocation-size "Overflow in uncontrolled allocation size"

TODO now:

  • review real world results (for potential; they don't need to be perfect for experimental) and set query precision
  • write a proper violation message and query headers
  • add some tests
  • add basic qhelp with an example

TODO later (i.e. not required for merging into experimental):

  • apply range analysis on the multiplication to build confidence it may actually overflow? (@MathiasVP said: "There’s nothing inherently wrong with the result of a multiplication flowing into an allocation, so it would probably need to be more constrained." ... "If we preserve interesting results when we restrict the multiplications to be potentially overflowing I think it could be a good experimental query")
  • apply taint tracking on the multiplication to ensure its user controlled??? (but this might just turn the query results into a subset of cpp/uncontrolled-allocation-size, i.e. remove many of the interesting results)
  • we've also looked at similar cases that flow into a call to memset. In the long run, perhaps 'allocation' could be generalized to something like 'size of any buffer operation'?

@jbj I'm interested in your thoughts on this. Note that the target is only experimental, at least for now.

@geoffw0 geoffw0 added the C++ label Dec 10, 2020
@geoffw0
Copy link
Contributor Author

geoffw0 commented Dec 17, 2020

Looking at a small sample of results on LGTM, I think there are quite a lot of false positives where the numbers come from things that are probably known to be small. I've set the @precision of the query to low for now - and I think we'll need to add range analysis or perhaps taint tracking or something similar before this query could possibly come out of experimental.

@geoffw0 geoffw0 marked this pull request as ready for review December 18, 2020 10:13
@geoffw0 geoffw0 requested a review from a team as a code owner December 18, 2020 10:13
@geoffw0
Copy link
Contributor Author

geoffw0 commented Dec 18, 2020

Added qhelp and changed this from a draft PR to 'ready for review'.

I plan to create a Jira issue for the points about improving results, and potentially moving this query out of experimental, but what we have now is useful (as proved in the hackathon) and complete.

@geoffw0
Copy link
Contributor Author

geoffw0 commented Dec 18, 2020

@hubwriter please could I have a quick documentation review. For context, this is only being merged into 'experimental' (which has lower standards than other queries) for now, but it may be moved out in future so I'd prefer the qhelp be up to scratch.

Copy link
Contributor

@hubwriter hubwriter left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

query help LGTM 👍

@geoffw0
Copy link
Contributor Author

geoffw0 commented Jan 7, 2021

I believe this PR can be merged.

Copy link
Contributor

@MathiasVP MathiasVP left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed, this query (as is) has demonstrated its value in the hackathon. Let's investigate the changes needed to bring it out of experimental in another PR.

@MathiasVP MathiasVP merged commit 13a67c9 into github:main Jan 7, 2021
@geoffw0
Copy link
Contributor Author

geoffw0 commented Jan 7, 2021

Created an issue for further improvements: https://github.com/github/codeql-c-analysis-team/issues/203

@intrigus-lgtm
Copy link
Contributor

Indeed, this query (as is) has demonstrated its value in the hackathon.

I assume this was an internal hackathon?

@MathiasVP
Copy link
Contributor

I assume this was an internal hackathon?

Yep. (Sorry for not making that clear in my comment.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants