git_refdb API fixes by tiennou · Pull Request #5106 · libgit2/libgit2

tiennou · 2019-06-10T13:48:13Z

This is the preliminary part to the reflog message PR (#4316), and is limited to documentation, QOL improvements, NFC, and some bugfixes.

(Please make sure it's in sync with #4316 on merge, since it's likely to get rebased again.)

ethomson · 2019-06-10T13:51:42Z

+	    !backend->reflog_rename || !backend->reflog_delete ||
+	    (backend->lock && !backend->unlock)) {
+		git_error_set(GIT_ERROR_REFERENCE, "incomplete refdb backend implementation");
+		return GIT_EINVALID;


I would have made this an assertion instead of a soft error, personally. There's no recovery for end-users here.

Good point. I didn't want to fail too hard, since we were missing the check in the first place, but maybe an assert is warranted.

But if we used an assert, then we wouldn't error at all on non-debug builds.

Yeah but this is a truism throughout our codebase. We generally assert in places that are only a result of the consumer of libgit2 missing the preconditions for our api or writing bad or incomplete code.

I think I prefer runtime checks/error return to static asserts for things like this, as I'd be raising exceptions (NSInternalInconsistencyException FTW) if I had those. But asserts don't really work that way. I just had a feeling that the user might not be in control of that code as well (think copy-pasta from libgit2-backends).

As another data point, struct version checks are currently runtime errors, not asserts.

I think moving asserts to something else is a reasonable concern. But I think it should be a separate PR instead of trying to figure out what that next thing is in this one, and I think we should move the entire code base en masse.

So I think that we should assert for now so that we're consistent with the rest of the code base and so that it's easier to mechanically replace when we decide on what we do want to do.

If that's possible, I'd like to keep it as is. This is kind of a backward-incompatible change, but 1) it covers something we forgot to enforce, and 2) it's only there to be more helpful: right now "bad" users would eat a segfault anyway, so I feel it would be more helpful for unsuspecting end-users to get a normal "error" (even if it's not quite expected).

To be clear, I'm trying to fix the few bugs in the refdb layer for which I have an almost two-year-old patch set, so if it's really that contentious, I'll open an issue about the general cleanup and drop it from here. With the missing version check in, at least this struct is safe for 1.0.
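For readers following the thread, here is a minimal sketch of the "soft error" approach under discussion. It is not libgit2 code: `sketch_backend`, `sketch_set_backend`, `sketch_error_last` and `SKETCH_EINVALID` are hypothetical names, and the callback table is far smaller than the real `git_refdb_backend`. An incomplete table is reported through a return code, while a NULL table pointer is still treated as a violated precondition.

```c
#include <assert.h>
#include <stddef.h>

/* Local placeholder error code; not libgit2's actual GIT_EINVALID value. */
#define SKETCH_EINVALID (-2)

/* Hypothetical backend: a table of callbacks, some of them optional. */
typedef struct {
	int (*lookup)(const char *name);
	int (*write)(const char *name);
	int (*lock)(const char *name);   /* optional */
	int (*unlock)(const char *name); /* mandatory only when lock is set */
} sketch_backend;

static const char *sketch_last_error = NULL;

/* Returns the message recorded by the last failed call, or NULL. */
const char *sketch_error_last(void)
{
	return sketch_last_error;
}

/* Returns 0 when the table is complete, SKETCH_EINVALID otherwise. */
int sketch_set_backend(const sketch_backend *b)
{
	/* A NULL table is a caller bug, so it is asserted rather than reported. */
	assert(b != NULL);

	if (!b->lookup || !b->write || (b->lock && !b->unlock)) {
		sketch_last_error = "incomplete backend implementation";
		return SKETCH_EINVALID;
	}
	return 0;
}

/* Trivial callback used to fill a table when trying the sketch out. */
int sketch_noop(const char *name)
{
	(void)name;
	return 0;
}
```

With this shape, a table that sets `lock` but not `unlock` is rejected at setup time instead of crashing later when the missing callback is invoked.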

ethomson · 2019-06-10T14:12:13Z


-	error = packed_write(backend);
+	if (found)
+		error = packed_write(backend);


I wonder if it's worth making sure that GIT_ENOTFOUND cannot be propagated back from either git_sortedcache_wlock or packed_write. A quick glance suggests that those will never return it, and it's hard to imagine a world where they would be refactored to a point where GIT_ENOTFOUND would be meaningful, but it feels Not Impossible. So we might want to be strict about converting those to a -1 if we want to treat them as a true failure here. 🤔

OTOH, it will probably never matter.

That… went over my head. What's the concern? That a GIT_ENOTFOUND from wlock or write could be misinterpreted as meaning the ref doesn't exist, while the issue could be more severe?
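One way to read the concern is sketched below, under loud assumptions: `demo_strict`, `demo_packed_delete` and `DEMO_ENOTFOUND` are hypothetical, and the results of the lock and write helpers are passed in as plain integers so the control flow can be followed without any real I/O. Any failure from a helper is collapsed to a generic -1, so the "not found" code can only ever mean that the ref was absent from the packed file.

```c
#include <assert.h>

/* Local placeholder for a "not found" error code. */
#define DEMO_ENOTFOUND (-3)

/* Collapse any helper failure into a generic -1. */
static int demo_strict(int rc)
{
	return rc < 0 ? -1 : 0;
}

/*
 * lock_rc and write_rc stand in for the return values of the locking and
 * writing helpers; found says whether the ref was present in the cache.
 */
int demo_packed_delete(int lock_rc, int found, int write_rc)
{
	/* Helpers are assumed to return 0 on success, negative on failure. */
	assert(lock_rc <= 0 && write_rc <= 0);

	if (demo_strict(lock_rc) < 0)
		return -1;

	if (!found)
		return DEMO_ENOTFOUND;

	return demo_strict(write_rc);
}
```

Whether this strictness is worth having is exactly what the comment leaves open, since the helpers are not expected to return that code in practice.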

pks-t · 2019-06-13T10:33:22Z


 	git_sortedcache_wunlock(backend->refcache);

-	error = packed_write(backend);


It's not only about clobbering error values, is it? Previously, we would've tried to write the packed file even in cases where we didn't remove any reference at all, right?

Yes, from a functional POV it's slightly different in that the packed file will not be rewritten. AFAIR, this was also some kind of performance improvement, in that a packed-ref lookup failure when deleting would cause a "spurious" rewrite of an innocent file (but please double-check me, I'm having a hard time with how the locking is handled). I'll move that to the commit message to clarify that it's not only about the clobbering.
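The behavioural difference can be illustrated with a toy in-memory model. This is only a sketch: `toy_packed`, `toy_packed_write` and `toy_packed_delete` are hypothetical, the "file" is an array of names, and locking is left out entirely. The `writes` counter shows that a failed lookup no longer triggers a rewrite and that the not-found result is no longer overwritten by the write's return value.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_ENOTFOUND (-3) /* local placeholder error code */
#define TOY_MAX_REFS 8

typedef struct {
	const char *names[TOY_MAX_REFS];
	size_t count;
	int writes; /* how many times the "file" was rewritten */
} toy_packed;

/* Stand-in for rewriting the packed file; always succeeds here. */
static int toy_packed_write(toy_packed *p)
{
	p->writes++;
	return 0;
}

int toy_packed_delete(toy_packed *p, const char *name)
{
	size_t i;
	int found = 0;

	assert(p->count <= TOY_MAX_REFS);

	for (i = 0; i < p->count; i++) {
		if (strcmp(p->names[i], name) == 0) {
			/* Remove by moving the last entry into the hole. */
			p->names[i] = p->names[--p->count];
			found = 1;
			break;
		}
	}

	/* Only rewrite when something was actually removed. */
	if (!found)
		return TOY_ENOTFOUND;

	return toy_packed_write(p);
}
```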

pks-t · 2019-06-13T18:50:29Z

On Thu, Jun 13, 2019 at 04:24:06AM -0700, Edward Thomson wrote:
> ethomson commented on this pull request.
>
> > @@ -68,6 +68,16 @@ int git_refdb_set_backend(git_refdb *db, git_refdb_backend *backend)
> >  {
> >  	GIT_ERROR_CHECK_VERSION(backend, GIT_REFDB_BACKEND_VERSION, "git_refdb_backend");
> > +	if (!backend->exists || !backend->lookup || !backend->iterator ||
> > +	    !backend->write || !backend->rename || !backend->del ||
> > +	    !backend->has_log || !backend->ensure_log || !backend->free ||
> > +	    !backend->reflog_read || !backend->reflog_write ||
> > +	    !backend->reflog_rename || !backend->reflog_delete ||
> > +	    (backend->lock && !backend->unlock)) {
> > +		git_error_set(GIT_ERROR_REFERENCE, "incomplete refdb backend implementation");
> > +		return GIT_EINVALID;
>
> Yeah but this is a truism throughout our codebase. We generally assert in places that are only a result of the consumer of libgit2 missing the preconditions for our api or writing bad or incomplete code.

Fair enough. The question is whether we want to say "We do so everywhere, so let's just introduce more places where we may run into hard crashes in production". I know, this is kind of framing this in a rather biased way, but I feel like we should just use error codes. The current mainstream opinion seems to be "never cause segfaults in a library", from what I've read in the last few years. Alternatively, we could also introduce a new macro `GIT_FATAL(GIT_EINVALID, msg)`. On debug builds this may trigger an assert, while it may cause an error return code on release builds. But I don't know whether that's too magical; at least I'm not too thrilled by that.
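A rough sketch of what such a macro could look like follows. Nothing here exists in libgit2: `SKETCH_FATAL`, `SKETCH_DEBUG_ASSERTS`, `SKETCH_EINVALID` and `sketch_set_callback` are all hypothetical, and the switch is a dedicated macro rather than `NDEBUG` so that the example behaves the same however it is compiled.

```c
#include <assert.h>
#include <stddef.h>

/* Local placeholder error code; not libgit2's actual GIT_EINVALID value. */
#define SKETCH_EINVALID (-2)

/* Message recorded by the non-asserting variant. */
static const char *sketch_fatal_msg = NULL;

#ifdef SKETCH_DEBUG_ASSERTS
/* Debug flavour: abort loudly; the return only matters if asserts are off. */
# define SKETCH_FATAL(code, msg) \
	do { assert(!(msg)); return (code); } while (0)
#else
/* Release flavour: record the message and hand back an error code. */
# define SKETCH_FATAL(code, msg) \
	do { sketch_fatal_msg = (msg); return (code); } while (0)
#endif

/* Example caller: the callback is a mandatory argument. */
int sketch_set_callback(int (*cb)(void))
{
	if (!cb)
		SKETCH_FATAL(SKETCH_EINVALID, "callback is mandatory");
	return 0;
}

/* Trivial callback for trying the sketch out. */
int sketch_ok_cb(void)
{
	return 0;
}
```

The "too magical" worry is visible even at this size: the same call site either aborts the process or returns, depending on a build flag the caller may not control.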

ethomson · 2019-06-13T20:32:03Z

> The current mainstream opinion seems to be "never cause segfaults in a library", from what I've read in the last few years.

I think this is reasonable but we should split this out into a separate conversation, IMO.

pks-t · 2019-06-14T06:13:15Z

On Thu, Jun 13, 2019 at 01:32:13PM -0700, Edward Thomson wrote:
> > The current mainstream opinion seems to be "never cause segfaults in a library", from what I've read in the last few years.
>
> I think this is reasonable but we should split this out into a separate conversation, IMO.

Agreed, no need to keep this PR on hold for that. For the sake of having an easier path of conversion I'd be fine with raising an assert for now.

pks-t · 2019-09-27T09:16:16Z

Thanks a lot, @tiennou!

tiennou changed the title ~~git_redb API fixes~~ git_refdb API fixes Jun 10, 2019

ethomson reviewed Jun 10, 2019

pks-t approved these changes Jun 13, 2019

tiennou force-pushed the fix/ref-api-fixes branch 2 times, most recently from f01b370 to 240a369 Compare June 14, 2019 06:45

pks-t mentioned this pull request Jul 18, 2019

remote: git_cred_acquire_cb crashes if credential is not set #5171

Closed

tiennou added 8 commits September 5, 2019 10:26

refdb: documentation

8db9fd3

refdb: check the version of the backend we're about to set

c2cf984

refdb: ensure all mandatory functions are provided at setup time

baf411e

refdb: make low-level deletion helpers explicit

0a88c83

refdb: fix packed_delete clobbering some errors

9b25cf1

In the case of a failed lookup, we'd paper over that by writing back the packed-refs successfully.

refdb: reorder parameters for consistency

8fd855f

refdb: repurpose filesystem prune function

171116e

refdb: make sure to remove packed refs first

8c14224

This fixes part of the issue where, given a concurrent `git pack-refs`, a ref lookup could return an old, vestigial value from the packed file, as the valid loose one would have been deleted.
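The ordering argument can be made concrete with a toy model. The names `race_ref`, `race_lookup`, `race_delete_packed` and `race_delete_loose` are hypothetical, a ref is reduced to two optional values, and concurrency is modelled only by looking at the state between the two deletion steps. The one assumption carried over from Git is that a loose ref takes precedence over a packed one.

```c
#include <assert.h>
#include <stddef.h>

typedef struct {
	const char *loose;  /* value stored in the loose ref, or NULL */
	const char *packed; /* value stored in the packed file, or NULL */
} race_ref;

/* Loose refs shadow packed ones; NULL means the ref does not exist. */
const char *race_lookup(const race_ref *r)
{
	assert(r != NULL);
	return r->loose ? r->loose : r->packed;
}

void race_delete_packed(race_ref *r)
{
	r->packed = NULL;
}

void race_delete_loose(race_ref *r)
{
	r->loose = NULL;
}
```

Deleting the packed entry first means an observer between the two steps still sees the current loose value; deleting the loose one first would briefly expose the older packed value.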

tiennou force-pushed the fix/ref-api-fixes branch from 240a369 to 8c14224 Compare September 5, 2019 09:08

pks-t merged commit 7032537 into libgit2:master Sep 27, 2019

pks-t mentioned this pull request Sep 27, 2019

Potential loose/packed ref race when deleting packed refs #5003

Closed

tiennou deleted the fix/ref-api-fixes branch December 6, 2019 21:41

snyk-bot mentioned this pull request Feb 23, 2020

[Snyk] Upgrade nodegit from 0.4.1 to 0.26.4 saurabharch/Breezeblocks#1

Open

snyk-bot mentioned this pull request Mar 12, 2020

[Snyk] Upgrade nodegit from 0.26.0 to 0.26.4 thisconnect/nodegit-kit#95

Merged

snyk-bot mentioned this pull request Apr 22, 2020

[Snyk] Upgrade nodegit from 0.24.3 to 0.26.5 aminatakonate000/Graviton-App#4

Open

snyk-bot mentioned this pull request May 5, 2020

[Snyk] Upgrade nodegit from 0.24.3 to 0.26.5 Barnstorm-Online/ngp-openapi-generator#1

Open
