diff mbox series

[3/3] commit-reach.h: add missing declarations (hdr-check)

Message ID 88102e3a-8a02-fca4-4daf-ab428008afc7@ramsayjones.plus.com (mailing list archive)
State New, archived
Headers show
Series some more hdr-check clean headers | expand

Commit Message

Ramsay Jones Oct. 27, 2018, 1:53 a.m. UTC
Add the necessary #includes and forward declarations to allow the header
file to pass the 'hdr-check' target.

Note that, since this header includes the commit-slab implementation
header file (indirectly via commit-slab.h), some of the commit-slab
inline functions (e.g contains_cache_at_peek()) will not compile without
the complete type of 'struct commit'. Hence, we replace the forward
declaration of 'struct commit' with the an #include of the 'commit.h'
header file.

It is possible, using the 'commit-slab-{decl,impl}.h' files, to avoid
this inclusion of the 'commit.h' header. Commit a9f1f1f9f8 ("commit-slab.h:
code split", 2018-05-19) separated the commit-slab interface from its
implementation, to allow for the definition of a public commit-slab data
structure. This enabled us to avoid including the commit-slab implementation
in a header file, which could result in the replication of the commit-slab
functions in each compilation unit in which it was included.

Indeed, if you compile with optimizations disabled, then run this script:

  $ cat -n dup-static.sh
       1 #!/bin/sh
       2
       3 nm $1 | grep ' t ' | cut -d' ' -f3 | sort | uniq -c |
       4 	sort -rn | grep -v '      1'
  $

  $ ./dup-static.sh git | grep contains
       24 init_contains_cache_with_stride
       24 init_contains_cache
       24 contains_cache_peek
       24 contains_cache_at_peek
       24 contains_cache_at
       24 clear_contains_cache
  $

you will find 24 copies of the commit-slab routines for the contains_cache.
Of course, when you enable optimizations again, these duplicate static
functions (mostly) disappear. Compiling with gcc at -O2, leaves two static
functions, thus:

  $ nm commit-reach.o | grep contains_cache
  0000000000000870 t contains_cache_at_peek.isra.1.constprop.6
  $ nm ref-filter.o | grep contains_cache
  00000000000002b0 t clear_contains_cache.isra.14
  $

However, using a shared 'contains_cache' would result in all six of the
above functions as external public functions in the git binary. At present,
only three of these functions are actually called, so the trade-off
seems to favour letting the compiler inline the commit-slab functions.

Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
---
 commit-reach.h | 5 +++--
 1 file changed, 3 insertions(+), 2 deletions(-)

Comments

Junio C Hamano Oct. 29, 2018, 1:13 a.m. UTC | #1
Ramsay Jones <ramsay@ramsayjones.plus.com> writes:

> ...
>        24 clear_contains_cache
>   $
>
> you will find 24 copies of the commit-slab routines for the contains_cache.
> Of course, when you enable optimizations again, these duplicate static
> functions (mostly) disappear. Compiling with gcc at -O2, leaves two static
> functions, thus:
>
>   $ nm commit-reach.o | grep contains_cache
>   0000000000000870 t contains_cache_at_peek.isra.1.constprop.6
>   $ nm ref-filter.o | grep contains_cache
>   00000000000002b0 t clear_contains_cache.isra.14
>   $
>
> However, using a shared 'contains_cache' would result in all six of the
> above functions as external public functions in the git binary.

Sorry, but you lost me here.  Where does the 6 in above 'all six'
come from?  I saw 24 (from unoptimized), and I saw 2 (from
optimized), but...

Ah, OK, the slab system gives us a familiy of 6 helper functions to
deal with the "contains_cache" slab, and we call only 3 of them
(i.e. the implementation of other three are left unused).  Makes
sense.

> At present,
> only three of these functions are actually called, so the trade-off
> seems to favour letting the compiler inline the commit-slab functions.

OK.  I vaguely recall using a linker that can excise the
implementation an unused public function out of the resulting
executable in the past, which may tip the balance in the opposite
direction, but the above reasonong certainly makes sense to me.

> Signed-off-by: Ramsay Jones <ramsay@ramsayjones.plus.com>
> ---
>  commit-reach.h | 5 +++--
>  1 file changed, 3 insertions(+), 2 deletions(-)
>
> diff --git a/commit-reach.h b/commit-reach.h
> index 7d313e2975..f41d8f6ba3 100644
> --- a/commit-reach.h
> +++ b/commit-reach.h
> @@ -1,12 +1,13 @@
>  #ifndef __COMMIT_REACH_H__
>  #define __COMMIT_REACH_H__
>  
> +#include "commit.h"
>  #include "commit-slab.h"
>  
> -struct commit;
>  struct commit_list;
> -struct contains_cache;
>  struct ref_filter;
> +struct object_id;
> +struct object_array;
>  
>  struct commit_list *get_merge_bases_many(struct commit *one,
>  					 int n,
Ramsay Jones Oct. 29, 2018, 11:59 p.m. UTC | #2
On 29/10/2018 01:13, Junio C Hamano wrote:
> Ramsay Jones <ramsay@ramsayjones.plus.com> writes:
> 
>> ...
>>        24 clear_contains_cache
>>   $
>>
>> you will find 24 copies of the commit-slab routines for the contains_cache.
>> Of course, when you enable optimizations again, these duplicate static
>> functions (mostly) disappear. Compiling with gcc at -O2, leaves two static
>> functions, thus:
>>
>>   $ nm commit-reach.o | grep contains_cache
>>   0000000000000870 t contains_cache_at_peek.isra.1.constprop.6
>>   $ nm ref-filter.o | grep contains_cache
>>   00000000000002b0 t clear_contains_cache.isra.14
>>   $
>>
>> However, using a shared 'contains_cache' would result in all six of the
>> above functions as external public functions in the git binary.
> 
> Sorry, but you lost me here.  Where does the 6 in above 'all six'
> come from?  I saw 24 (from unoptimized), and I saw 2 (from
> optimized), but...

As you already gathered, the 'six of the above functions' was the
list of 6 functions which were each duplicated 24 times (you only
left the last one un-snipped above) in the unoptimized git binary.

> Ah, OK, the slab system gives us a familiy of 6 helper functions to
> deal with the "contains_cache" slab, and we call only 3 of them
> (i.e. the implementation of other three are left unused).  Makes
> sense.

Yep, only clear_contains_cache(), contains_cache_at() and
init_contains_cache() are called.

>> At present,
>> only three of these functions are actually called, so the trade-off
>> seems to favour letting the compiler inline the commit-slab functions.
> 
> OK.  I vaguely recall using a linker that can excise the
> implementation an unused public function out of the resulting
> executable in the past, which may tip the balance in the opposite
> direction, 

Yes, and I thought I was using such a linker - I was surprised that
seems not to be the case! ;-) [I know I have used such a linker, and
I could have sworn it was on Linux ... ]

As I said in [1], the first version of this patch actually used a
shared contains_cache (so as not to #include "commit.h"). I changed
that just before sending it out, because I felt the patch conflated
two issues - I fully intended to send a "let's use a shared contains
cache instead" follow up patch! (Again, see [1] for the initial version
of that follow up patch).

But then Derrick said he preferred this version of the patch and I
couldn't really justify the follow up patch, other than to say "you
are making your compiler work harder than it needs to ..." - not very
convincing! :-P

For example, applying the RFC/PATCH from [1] on top of this patch
we can see:

  $ nm git | grep contains_cache
  00000000000d21f0 T clear_contains_cache
  00000000000d2400 T contains_cache_at
  00000000000d2240 T contains_cache_at_peek
  00000000000d2410 T contains_cache_peek
  00000000000d21d0 T init_contains_cache
  00000000000d2190 T init_contains_cache_with_stride
  $ size git
     text	   data	    bss	    dec	    hex	filename
  2531234	  70736	 274832	2876802	 2be582	git
  $ 

whereas, without that patch on top (ie this patch):

  $ nm git | grep contains_cache
  0000000000161e30 t clear_contains_cache.isra.14
  00000000000d2190 t contains_cache_at_peek.isra.1.constprop.6
  $ size git
     text	   data	    bss	    dec	    hex	filename
  2530962	  70736	 274832	2876530	 2be472	git
  $ 

which means the 'shared contains_cache' patch leads to a +272 bytes
of bloat in text section. (from the 3 unused external functions).


[1] https://public-inbox.org/git/2809251f-6eba-b0ac-68fe-b972809ccff7@ramsayjones.plus.com/

>             but the above reasonong certainly makes sense to me.

Thanks!

ATB,
Ramsay Jones
diff mbox series

Patch

diff --git a/commit-reach.h b/commit-reach.h
index 7d313e2975..f41d8f6ba3 100644
--- a/commit-reach.h
+++ b/commit-reach.h
@@ -1,12 +1,13 @@ 
 #ifndef __COMMIT_REACH_H__
 #define __COMMIT_REACH_H__
 
+#include "commit.h"
 #include "commit-slab.h"
 
-struct commit;
 struct commit_list;
-struct contains_cache;
 struct ref_filter;
+struct object_id;
+struct object_array;
 
 struct commit_list *get_merge_bases_many(struct commit *one,
 					 int n,