Skip to content

khash cleanups#4092

Merged
pks-t merged 24 commits intolibgit2:masterfrom
pks-t:pks/khash-cleanups
Feb 17, 2017
Merged

khash cleanups#4092
pks-t merged 24 commits intolibgit2:masterfrom
pks-t:pks/khash-cleanups

Conversation

@pks-t
Copy link
Copy Markdown
Member

@pks-t pks-t commented Jan 25, 2017

So this is a first batch of cleanups for khash-related code. It mainly only transforms all calls to kh_ functions to instead use the type-specific functions (e.g. git_strmap_put instead of kh_put). As is, this does not really change anything in behavior, but it improves encapsulation here.

So my goal later on is to convert the types to use real functions instead. I'm not yet sure if this goal is worthwhile after all and how it would impact performance. But the benefit is two-fold:

  1. We do not have to expose all these externals regarding the macro-magic, for example the nasty GIT__USE_STRMAP thing and others. Instead, I want to include it inside of the respective implementation file, e.g. src/strmap.c and be done with it. This might even reduce the resulting binary size (if the linker didn't fold the different implementations together).

  2. It gets rid of those "magic" macros which looked like normal function calls, but did not behave as such, which were initially bothering me. As ever so often, when one starts to fix up a small annoyance one ends up walking a rather long path, trying to fix everything as you go. But, well. Why not.

I'd appreciate opinions on whether you think the overall goal is worthwhile or not. Disregarding of whether the overall goal (that is creating real functions) is worthwhile, I guess this PR is still worth being merged.

@pks-t pks-t force-pushed the pks/khash-cleanups branch 5 times, most recently from 4bee801 to bc3dd7b Compare January 27, 2017 14:39
@pks-t
Copy link
Copy Markdown
Member Author

pks-t commented Jan 27, 2017

I've started the conversion to get an estimate on how it influences the code. Currently, with 2/5 maps converted, we save around 2% in size for the libgit2_clar:

Before:

   text	   data	    bss	    dec	    hex	filename
6773806	 827000	 856704	8457510	 810d26	libgit2_clar

After:

   text	   data	    bss	    dec	    hex	filename
6697415	 820224	 847264	8364903	 7fa367	libgit2_clar

Guess it'll amount to around 5% in the end. No performance measurements yet. Will do something more involved to see if this actually imposes a real performance hit for us.

But most importantly, it looks a lot nicer and is easier to grok. See https://github.com/pks-t/libgit2/tree/pks/map-macros-to-functions.

@pks-t pks-t changed the title Preparatory khash cleanups khash cleanups Feb 2, 2017
@pks-t
Copy link
Copy Markdown
Member Author

pks-t commented Feb 2, 2017

In contrast to what I've said initially, I've now also included the actual conversion from macros to functions in addition to the cleanups.

There wasn't much gained in terms of resulting code size by the additional two maps being converted to actual functions. Guess this results from the fact that they were both just used in one file - so actually it stays around 2% decreased code size.

I've done some small performance tests. The conversion at least has no impact on our testing suite in terms of performance, which is good. I also did some tests benchmarking the maps directly by calling map-modifying functions over 10 million iterations. There was no performance impact, as well.

While these numbers are not really representative for real-world usage, the at least show that there's no immediate performance penalty by not having them inlined anymore. And actually, I think the result is much easier to work with and reason about than what we previously had.

@pks-t
Copy link
Copy Markdown
Member Author

pks-t commented Feb 2, 2017

By the way, the failures are the usual threading/network related issues

@ethomson
Copy link
Copy Markdown
Member

I'm really happy with this; I think that this is a really big improvement.

If I could pick just one nit: we use git__* names for things that don't have a namespace and don't really belong anywhere else. I would like to see us pull the free functions into the namespace (such as it is) and use the double-underscore after the namespace to indicate privateness, like:

  • git__idxmap_free -> git_idxmap__free
  • git__offmap_free -> git_offmap__free
  • git__strmap_free -> git_strmap__free
  • git__oidmap_free -> git_oidmap__free

(I realize this is quite nitpicky; sorry.)

@pks-t pks-t force-pushed the pks/khash-cleanups branch from 6b15959 to 8f1ff26 Compare February 17, 2017 10:41
@pks-t pks-t merged commit d0c72a9 into libgit2:master Feb 17, 2017
@pks-t pks-t deleted the pks/khash-cleanups branch February 21, 2017 10:39
hackhaslam added a commit to stinb/libgit2 that referenced this pull request Feb 21, 2017
This is a crash on 32-bit and I assume that it doesn't do the right thing on
64-bit either. MSVC emits a warning for this, but of course, it's easy to get
lost among all of the similar 'possible loss of data' warnings.
@pks-t pks-t mentioned this pull request Feb 22, 2017
hackhaslam added a commit to stinb/libgit2 that referenced this pull request Feb 22, 2017
Fixes a regression from libgit2#4092. This is a crash on 32-bit and I assume that
it doesn't do the right thing on 64-bit either. MSVC emits a warning for this,
but of course, it's easy to get lost among all of the similar 'possible loss
of data' warnings.
pks-t added a commit that referenced this pull request Feb 24, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants