Skip to content

Python: model os path file accesses#6741

Merged
codeql-ci merged 6 commits into
github:mainfrom
yoff:python/model-os-path-file-accesses
Oct 13, 2021
Merged

Python: model os path file accesses#6741
codeql-ci merged 6 commits into
github:mainfrom
yoff:python/model-os-path-file-accesses

Conversation

@yoff

@yoff yoff commented Sep 23, 2021

Copy link
Copy Markdown
Contributor

This PR includes a number of functions in os.path into the concept FileSystemAccess. These all test for the existence of files in some way.

There are a few more functions that could reveal information about the file system, by telling the last modification time of a file, e.g. getmtime. This will raise an error if the file does not exist, so could also reveal the existence of files depending on the applications error handling. These are currently not included.

@yoff yoff requested a review from a team as a code owner September 23, 2021 13:30

@RasmusWL RasmusWL left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There are a few more functions that could reveal information about the file system, by telling the last modification time of a file, e.g. getmtime. This will raise an error if the file does not exist, so could also reveal the existence of files depending on the applications error handling. These are currently not included.

Can we please model these as well? they fit the FileSystemAccess concept as far as I can tell.

@yoff yoff requested a review from RasmusWL September 30, 2021 11:38
Comment thread python/ql/lib/semmle/python/frameworks/Stdlib.qll Outdated
Co-authored-by: Rasmus Wriedt Larsen <rasmuswriedtlarsen@gmail.com>
@yoff

yoff commented Oct 4, 2021

Copy link
Copy Markdown
Contributor Author

There was already a taint-step for os.path.realpath, but we were missing some other ones...

@yoff yoff requested a review from RasmusWL October 4, 2021 17:13

@RasmusWL RasmusWL left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

a few minor things

Comment on lines +254 to +258
private string pathComputation() {
result in [
"abspath", "basename", "commonpath", "dirname", "expanduser", "expandvars", "join",
"normcase", "normpath", "realpath", "relpath", "split"
]

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
private string pathComputation() {
result in [
"abspath", "basename", "commonpath", "dirname", "expanduser", "expandvars", "join",
"normcase", "normpath", "realpath", "relpath", "split"
]
private string pathComputation() {
result in [
"abspath", "basename", "commonpath", "commonprefix", "dirname", "expanduser", "expandvars", "join",
"normcase", "normpath", "realpath", "relpath", "split", "splitdrive", "splitext"
]

or was there a reason not to add commonprefix? I'm wondering if you did not include it since it might be used as a form of sanitizer? -- I don't feel doubt about splitdrive and splitext should have taint-steps

NIT: Since this predicate is only used once (in the char-pred of OsPathComputation), could we please inline it there?

Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The reasons for not including these are given in the list "Functions that need summaries". These functions all do not map tainted input to tainted output but rather map some content to some content.

Comment on lines +249 to +253
// Functions that need summaries:
// - os.path.commonpath(paths): takes a sequence
// - os.path.commonprefix(list): takes a list argument
// - os.path.splitdrive: retunrs a tuple
// - os.path.splittext: returns a tuple

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This list is incomplete (for example, missing split). Since all our current taint-modeling would need proper flow-summaries in the future, I would suggest we remove this, and deal with it once we remodel things. it just seems a bit too brittle to my liking 😐

Suggested change
// Functions that need summaries:
// - os.path.commonpath(paths): takes a sequence
// - os.path.commonprefix(list): takes a list argument
// - os.path.splitdrive: retunrs a tuple
// - os.path.splittext: returns a tuple

@yoff yoff requested a review from RasmusWL October 12, 2021 13:17

@RasmusWL RasmusWL left a comment

Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM 👍

@codeql-ci codeql-ci merged commit 2b0415e into github:main Oct 13, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants