Conversation
|
You're welcome to license the fuzzer under GPL. I'm not attached to authorship metadata in Git on it. I was concurrently working on #4729 which contains a C port of the fuzzer. I haven't had a chance to run it for long but it seems to be working in brief testing. You're welcome to cherry-pick commits from that onto this or handle it however else makes sense to you. |
|
Sorry for duplicating the work. I'll see if I can scavenge some bits from your PR. One question I currently have is where we want to store corporas. We can either store them in the oss-fuzz repo (like it is currently done for the download fuzzer) or in our own repo (like it is done for the pack fuzzer). While the files for the download fuzzers are around 8kB only, the other one is more than a whole megabyte. So it's reasonably sized, but it still got me thinking. |
7e21ae9 to
b95ae5f
Compare
|
I've decided to remove the USE_SANITIZER and USE_COVERAGE options and adjusted the documentation to recommend setting those CFLAGS via environment variables. I think providing options for all the different variations a user might want to build libgit2 is not really reasonable. In fact we cannot even use that option with oss-fuzz, as they want to pass in the used sanitizers via CFLAGS themselves. |
|
Oops, hit the button too early. Anyway, I think this PR is ready now. I have a commit waiting at pks-t/oss-fuzz which switches oss-fuzz to use our internal fuzzers and corpora instead of using the one currently hosted at google/oss-fuzz. I'll create a PR as soon as this one here is merged. |
lhchavez
left a comment
There was a problem hiding this comment.
👍. i just realized there was a tiny naming nit in my original PR u_u.
thanks for doing this :D
| @@ -0,0 +1,118 @@ | |||
| /* | |||
| * libgit2 raw packfile fuzz target. | |||
There was a problem hiding this comment.
oops, i just realized that i named the file wrong. it should have been raw_packfile in the filename for consistency with this comment.
the reason is that originally i wanted to split this into raw and cooked (with a valid hash at the end) packfiles, and it made more sense to name them packfile_{raw,cooked}. but then I figured out it was better to use the MSB of the first byte to change behavior, so I ended not using the cooked version.
Do you mind renaming this file to raw_packfile_fuzzer? or just packfile_fuzzer to avoid adding an unneeded qualifier to the name.
There was a problem hiding this comment.
Sure. I also prefer "packfile_fuzzer", so I've renamed it
4bd4859 to
5416952
Compare
|
I think you'll want to also limit the pack object count for |
|
Right. I've also lowered the limit for the other fuzzer |
|
I consider this PR ready. I've verified that oss-fuzz works with
both our fuzzing targets as well as that the standalone fuzzers
work as expected. I'd like to push this to master rather quick,
so that we finally fix the OOM issues and make our fuzzers a lot
more efficient. Also, there is no impact to our existing code, so
I guess there's not a lot of risk involved in fast-forwarding
this.
|
This change adds support for building a fuzz target for exercising the packfile parser, as well as documentation. It also runs the fuzzers in Travis to avoid regressions.
Our layout uses names like "examples" or "tests" which is why the "fuzz" directory doesn't really fit in here. Rename the directory to be called "fuzzers" instead. Furthermore, we rename the fuzzer "fuzz_packfile_raw" to "packfile_raw_fuzzer", which is also in line with the already existing fuzzer at google/oss-fuzz. While at it, rename the "packfile_raw" fuzzer to instead just be called "packfile" fuzzer.
Like all our other internal code, we want to force the use of C90 for our fuzzers. Do so by setting the "C_STANDARD" property of our fuzzing targets.
We do want to notify users compiling our source code early on if they try to use C flags which aren't supported. Add a new macro `AddCFlag`, which results in a fatal error in case the flag is not supported, and use it for our fuzzing flags.
Right now, users are being instrucded to add the "-DCMAKE_EXE_LINKER_FLAGS=-fsanitize=fuzzer" flag when they want to build our fuzzers. This is error-prone and user unfriendly. Instead, just add the flag to our fuzzers' build instructions so that it happens automatically. Adjust the README accordingly.
Both the USE_SANITIZER and USE_COVERAGE options are convenience options that turn on a set of CFLAGS. Despite our own set of CFLAGS required to build libgit2, we have no real business to mess with them, though, as they can easily be passed in by the user via specifying the CFLAGS environment variable. The reasoning behind not providing them is that as soon as we start adding those for some usecases, users might ask for other sets of CFLAGS catering to their specific need in another usecase. Thus, we do not want to support them here.
The packfile_raw fuzzer is using some internal APIs from libgit2, which makes it hard to compile it as part of the oss-fuzz project. As oss-fuzz requires us to link against the C++ FuzzingEngine library, we cannot use "-DBUILD_FUZZERS=ON" directly but instead have to first compile an object from our fuzzers and then link against the C++ library. Compiling the fuzzer objects thus requires an external invocation of CC, and we certainly don't want to do further black magic by adding libgit2's private source directory to the header include path. To fix the issue, convert the code to not use any internal APIs. Besides some headers which we have to add now, this also requires us to change to the hashing function of the ODB. Note that this will change the hashing result, as we have previously not prepended the object header to the data that is to be hashed. But this shouldn't matter in practice, as we don't care for the hash value anyway.
This is a direct copy of the code from google/oss-fuzz, written by Nelson Elhage (@nelhage). Note that due to the ".cc" ending, the file will not yet be picked up by the build system. This is intended, as currently that file is partly written in C++, requiring a conversion to C.
Convert the "download_refs" fuzzer from C++ to C. Rename the source file to have it be picked up by our build system.
By default, libgit2 allows up to 2^32 objects when downloading a packfile from a remote. For each of these objects, libgit2 will allocate up to two small structs, which in total adds up to quite a lot of memory. As a result, our fuzzers might run out of memory rather quick in case where they receive as input a packfile with such a huge count of objects. Limit the packfile object count to 10M objects. This is sufficiently big to still work with most largish repos (linux.git has around 6M objects as of now), but small enough to not cause the fuzzer to OOM.
When using VSTS-based builds, we are in a different location than when
doing Travis builds. Due to this, the relative path to our fuzzer
corpora does not work on VSTS. Fix it by using `${SOURCE_DIR}` instead.
|
I've created google/oss-fuzz#1684 to use our own, upstreamed fuzzers in oss-fuzz |
This includes the fuzzers from @lhchavez and @nelhage, superseding #4433 and superseding the current fuzzers stored in google/oss-fuzz.
I've made some adjustments to the build instructions. Most notably, users don't have to manually add "-DCMAKE_EXE_LINKER_FLAGS=-fsanitize=fuzzer" anymore, and the build errors out during the configuration stage if any of the required sanitizer flags is not supported.
@nelhage, I'd please ask you to give me permissions to add a GPL-2.0 license header to your original fuzzer. It was missing one, and I want to be on the safe side ;) I've also mentioned that the fuzzer is originally yours, but if you wish I can change author information on that commit to pretend it was you who actually committed the C++ one to libgit2. Please tell me your preference.
Please do not merge until @nelhage confirms we're allowed to distribute his code under GPLv2.