path validation: `char` is not signed by default. by ethomson · Pull Request #4805 · libgit2/libgit2

ethomson · 2018-09-12T09:57:38Z

ARM treats `char` as an unsigned type by default; as a result, testing whether a `char` value is `< 0` is always false. This is a warning on ARM, which is promoted to an error given our use of `-Werror`.

Per ISO 9899:1999, section "6.2.5 Types":

> The three types char, signed char, and unsigned char are collectively
> called the character types. The implementation shall define char to
> have the same range, representation, and behavior as either signed
> char or unsigned char.
>
> ...
>
> Irrespective of the choice made, char is a separate type from the other
> two and is not compatible with either.
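As a rough illustration of the portability issue described above, the sketch below shows a check that does not depend on whether plain `char` is signed. The helper names `is_non_ascii` and `char_is_signed` are hypothetical and exist only for this example; they are not libgit2 functions.

```c
#include <limits.h>
#include <stdbool.h>

/*
 * Hypothetical helper: true when the byte lies outside the 7-bit ASCII range.
 * Casting to unsigned char first makes the comparison behave the same
 * whether plain char is signed (e.g. x86) or unsigned (e.g. ARM).
 */
bool is_non_ascii(char c)
{
	return (unsigned char)c > 127;
}

/*
 * Hypothetical helper: reports whether plain char is signed on this target.
 * CHAR_MIN is 0 when char is unsigned and negative when it is signed.
 */
bool char_is_signed(void)
{
	return CHAR_MIN < 0;
}
```

By contrast, a bare `c < 0` on a plain `char` can never be true where `char` is unsigned, which is what triggers the compiler warning.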


ethomson · 2018-09-12T10:02:48Z

Note that this whole test seems rather pointless anyway, since the test immediately following it will return true for any instance where this test also returns true. Is the thinking that the tolower test is more expensive and we could skip it for bytes outside the ASCII range? If so, I would argue that most filenames are mostly ASCII so we've added a test to this loop that will rarely actually catch anything, making the loop more expensive overall.

So my inclination is to actually remove this test entirely.

ethomson · 2018-09-12T10:03:04Z

/cc @carlosmn in case I'm missing something.

carlosmn · 2018-09-17T11:30:12Z

One place where we'd go above 127 is for the special chars in extended ASCII, like accented vowels and other "European" characters. I think we probably only ever get UTF-8 here, but since we're looking at individual bytes, we might get one with a continuation here. The code in git says

			/*
			 * We know our needles contain only ASCII, so we clamp
			 * here to make the results of tolower() sane.
			 */

which I guess makes sense, maybe, though I'm not sure what tolower() would return that would cause the next case to also not be false. I guess on Windows your active old-timey codepage might affect it?

Other than that, I guess we'd avoid hitting tolower more often if our input contains combining characters and stuff that goes into the bit of UTF-8 made to fit past value 127. Non-roman filenames would hit this fairly often I would imagine.
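To make the "clamp before `tolower()`" idea concrete, here is a minimal sketch of a case-insensitive comparison against an ASCII-only needle. The C standard leaves the behavior of `tolower()` undefined for arguments that are neither `EOF` nor representable as `unsigned char`, so rejecting bytes above 127 first keeps the call well defined. The function name `ascii_prefix_matches` and its signature are assumptions made for this example, not the code touched by this PR; it also assumes `needle` is lowercase ASCII with at least `len` characters and that `name` is NUL-terminated or at least `len` bytes long.

```c
#include <ctype.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * Hypothetical helper: true when the first `len` bytes of `name` match
 * the lowercase ASCII `needle`, ignoring case.
 */
bool ascii_prefix_matches(const char *name, const char *needle, size_t len)
{
	size_t i;

	for (i = 0; i < len; i++) {
		unsigned char c = (unsigned char)name[i];

		/* Bytes above 127 (such as UTF-8 lead or continuation
		 * bytes) can never match an ASCII-only needle. */
		if (c > 127)
			return false;

		/* c is in 0..127 here, so tolower() gets a valid argument. */
		if (tolower(c) != (unsigned char)needle[i])
			return false;
	}

	return true;
}
```

With this shape, a name containing multi-byte UTF-8 sequences is rejected at the first high byte without ever reaching `tolower()`.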

ethomson · 2018-09-17T12:10:02Z

OK, I think the bigger issue was that I wanted to make sure that I had understood the problem that you were trying to solve. I think that you're right that we shouldn't try to divine what tolower might do on all the platforms. You OK with this as-is?

carlosmn · 2018-09-17T13:00:42Z

I think I would slightly favour `(signed char)name[i] < -1` to keep the `-1`, but it's fine either way.

ethomson · 2018-09-18T01:59:41Z

Personally, I think that the `> 127` is more readable, but putting that preference aside: I've actually already cherry-picked this into a different branch, so without a strong reason not to, I'm going to keep this as-is to prevent having to redo that.
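For readers weighing an unsigned-cast spelling against a signed-cast one, the sketch below compares two ways of writing a high-bit test across all 256 byte values. Note the assumptions: the signed variant here uses `< 0`, which is a choice made for this sketch rather than a quote from the thread, and converting an out-of-range value to `signed char` is implementation-defined before C23 (it wraps on the common two's-complement compilers). All function names are hypothetical.

```c
#include <stdbool.h>

/* Unsigned-cast spelling: compare the byte value against 127. */
bool high_bit_unsigned(char c)
{
	return (unsigned char)c > 127;
}

/*
 * Signed-cast spelling (assumption of this sketch): relies on values
 * above 127 converting to negative signed char values.
 */
bool high_bit_signed(char c)
{
	return (signed char)c < 0;
}

/* Returns how many of the 256 byte values the two spellings disagree on. */
int count_disagreements(void)
{
	int b, n = 0;

	for (b = 0; b < 256; b++) {
		char c = (char)b;

		if (high_bit_unsigned(c) != high_bit_signed(c))
			n++;
	}

	return n;
}
```

On typical two's-complement targets the two spellings flag the same bytes, so the choice between them is mostly about readability.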

ethomson merged commit 744d838 into master Sep 18, 2018

ethomson deleted the signed_char branch October 26, 2018 13:37

snyk-bot mentioned this pull request Feb 23, 2020

[Snyk] Upgrade nodegit from 0.4.1 to 0.26.4 saurabharch/Breezeblocks#1

Open

snyk-bot mentioned this pull request Apr 22, 2020

[Snyk] Upgrade nodegit from 0.24.3 to 0.26.5 aminatakonate000/Graviton-App#4

Open

snyk-bot mentioned this pull request May 5, 2020

[Snyk] Upgrade nodegit from 0.24.3 to 0.26.5 Barnstorm-Online/ngp-openapi-generator#1

Open
