diff mbox series

[v2,1/6] midx: add MIDX_PROGRESS flag Add the MIDX_PROGRESS flag and update the write|verify|expire|repack functions in midx.h to accept a flags parameter. The MIDX_PROGRESS flag indicates whether the caller of the function would like progress information

Message ID 6badd9ceaf4851b2984e78a5cfd0cb8ec0c810f5.1568998427.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series multi-pack-index: add --no-progress | expand

Commit Message

Linus Arver via GitGitGadget Sept. 20, 2019, 4:53 p.m. UTC
From: William Baker <William.Baker@microsoft.com>

Signed-off-by: William Baker <William.Baker@microsoft.com>
---
 builtin/multi-pack-index.c |  8 ++++----
 builtin/repack.c           |  2 +-
 midx.c                     |  8 ++++----
 midx.h                     | 10 ++++++----
 4 files changed, 15 insertions(+), 13 deletions(-)

Comments

Junio C Hamano Sept. 20, 2019, 8:01 p.m. UTC | #1
"William Baker via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: William Baker <William.Baker@microsoft.com>
>
> Signed-off-by: William Baker <William.Baker@microsoft.com>
> ---
>  builtin/multi-pack-index.c |  8 ++++----
>  builtin/repack.c           |  2 +-
>  midx.c                     |  8 ++++----
>  midx.h                     | 10 ++++++----
>  4 files changed, 15 insertions(+), 13 deletions(-)
>
> diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
> index b1ea1a6aa1..e86b8cd17d 100644
> --- a/builtin/multi-pack-index.c
> +++ b/builtin/multi-pack-index.c
> @@ -47,16 +47,16 @@ int cmd_multi_pack_index(int argc, const char **argv,
>  	trace2_cmd_mode(argv[0]);
>  
>  	if (!strcmp(argv[0], "repack"))
> -		return midx_repack(the_repository, opts.object_dir, (size_t)opts.batch_size);
> +		return midx_repack(the_repository, opts.object_dir, (size_t)opts.batch_size, 0);
>  	if (opts.batch_size)
>  		die(_("--batch-size option is only for 'repack' subcommand"));
>  
>  	if (!strcmp(argv[0], "write"))
> -		return write_midx_file(opts.object_dir);
> +		return write_midx_file(opts.object_dir, 0);
>  	if (!strcmp(argv[0], "verify"))
> -		return verify_midx_file(the_repository, opts.object_dir);
> +		return verify_midx_file(the_repository, opts.object_dir, 0);
>  	if (!strcmp(argv[0], "expire"))
> -		return expire_midx_packs(the_repository, opts.object_dir);
> +		return expire_midx_packs(the_repository, opts.object_dir, 0);

Hmm, I actually would have expected to see a new .progress field in
the opts structure and these lower-level functions would start
taking a pointer to the whole opts structure, instead of just a
pointer to a single member opts.object_dir.  That way, we can later
extend and enrich the interface between this caller and the
underlying functions without much patch noise in the dispatch layer
(i.e. here).

This step is meant to be just preparing by extending the interface
and passing the new argument throughout the callchain, I believe,
and it looks correctly done, assuming that we are OK to take this
"add a separate 'progress' parameter, when we need more parameters
later, the interface will gain more and more parameters" approach.
Junio C Hamano Sept. 20, 2019, 8:16 p.m. UTC | #2
Junio C Hamano <gitster@pobox.com> writes:

> This step is meant to be just preparing by extending the interface
> and passing the new argument throughout the callchain, I believe,
> and it looks correctly done, assuming that we are OK to take this
> "add a separate 'progress' parameter, when we need more parameters
> later, the interface will gain more and more parameters" approach.

After re-reading the remainder of the series, I think the structure
this patch series takes (i.e. adding a new parameter to these callees)
is better than the alternative (i.e. making sure everybody takes the
pointer to the opts structure).

Thanks.  I couldn't review the proposed log messages of these
commits, which were unreadable, at all, though.
SZEDER Gábor Sept. 21, 2019, 12:11 p.m. UTC | #3
On Fri, Sep 20, 2019 at 09:53:48AM -0700, William Baker via GitGitGadget wrote:
> diff --git a/midx.h b/midx.h
> index f0ae656b5d..e6fa356b5c 100644
> --- a/midx.h
> +++ b/midx.h
> @@ -37,6 +37,8 @@ struct multi_pack_index {
>  	char object_dir[FLEX_ARRAY];
>  };
>  
> +#define MIDX_PROGRESS     (1 << 0)

Please consider using an enum.

A preprocessor constant is just that: a number.  It is shown as a
number while debugging, and there is no clear indication what related
values belong in the same group (apart from the prefix of the macro
name), or what possible values an 'unsigned flags' variable might have
(though in this particular case there is only a single possible
value...).  

An enum, however, is much friendlier to humans when debugging, because
even 'gdb' shows the value using the symbolic names, e.g.:

  write_commit_graph_reachable (obj_dir=0x9ab870 ".git/objects", 
      flags=(COMMIT_GRAPH_WRITE_APPEND | COMMIT_GRAPH_WRITE_PROGRESS), 
      split_opts=0x9670e0 <split_opts>) at commit-graph.c:1141

and it's quite clear what values a variable of type 'enum
commit_graph_write_flags' can have.
William Baker Sept. 23, 2019, 9:55 p.m. UTC | #4
On 9/21/19 5:11 AM, SZEDER Gábor wrote:
>>  
>> +#define MIDX_PROGRESS     (1 << 0)
> 
> Please consider using an enum.
> 
> A preprocessor constant is just that: a number.  It is shown as a
> number while debugging, and there is no clear indication what related
> values belong in the same group (apart from the prefix of the macro
> name), or what possible values an 'unsigned flags' variable might have
> (though in this particular case there is only a single possible
> value...).  
> 
> An enum, however, is much friendlier to humans when debugging, because
> even 'gdb' shows the value using the symbolic names, e.g.:

Thanks for this suggestion.  I agree on the benefits of using an enum
here and will make the switch in my next set of changes.

-William
Junio C Hamano Sept. 28, 2019, 3:40 a.m. UTC | #5
SZEDER Gábor <szeder.dev@gmail.com> writes:

> On Fri, Sep 20, 2019 at 09:53:48AM -0700, William Baker via GitGitGadget wrote:
>> diff --git a/midx.h b/midx.h
>> index f0ae656b5d..e6fa356b5c 100644
>> --- a/midx.h
>> +++ b/midx.h
>> @@ -37,6 +37,8 @@ struct multi_pack_index {
>>  	char object_dir[FLEX_ARRAY];
>>  };
>>  
>> +#define MIDX_PROGRESS     (1 << 0)
>
> Please consider using an enum.

If they are used by assiging one of their values, definitely a good
idea to use an enum.  Are debuggers clever enough that they can
tell, when they see something like this:

	enum gress {
		PROGRESS = 1,
		REGRESS = 2,
	};

	void func(enum gress v);

	...

        void caller(void)
	{
		func(PROGRESS | REGRESS);
		func(PROGRESS + REGRESS);
		func(PROGRESS * 3);
	}

how caller came about to give 3?
William Baker Sept. 30, 2019, 5:01 p.m. UTC | #6
On 9/27/19 8:40 PM, Junio C Hamano wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
> 
>>> +#define MIDX_PROGRESS     (1 << 0)
>>
>> Please consider using an enum.
> 
> If they are used by assiging one of their values, definitely a good
> idea to use an enum.  Are debuggers clever enough that they can
> tell, when they see something like this:
> 
> 	enum gress {
> 		PROGRESS = 1,
> 		REGRESS = 2,
> 	};
> 
> 	void func(enum gress v);
> 
> 	...
> 
>         void caller(void)
> 	{
> 		func(PROGRESS | REGRESS);
> 		func(PROGRESS + REGRESS);
> 		func(PROGRESS * 3);
> 	}
> 
> how caller came about to give 3?
> 

My debugger was not smart enough to figure out what flags were combined
to give the value of 3 in this example.  

I saw that the code base is currently a mix of #define and enums when it
comes to flags  (e.g. dir_struct.flags and rebase_options.flags are both
enums) and so using one here would not be something new stylistically.

Although my debugger might not be the smartest, I haven't noticed any
downsides to switching this to an enum.

Thanks,
William
Junio C Hamano Oct. 2, 2019, 5:38 a.m. UTC | #7
William Baker <williamtbakeremail@gmail.com> writes:

> I saw that the code base is currently a mix of #define and enums when it
> comes to flags  (e.g. dir_struct.flags and rebase_options.flags are both
> enums) and so using one here would not be something new stylistically.

Yes.  But it is a different matter to spread a bad practice to
parts of the code that haven't been contaminated by it.
Junio C Hamano Oct. 2, 2019, 5:43 a.m. UTC | #8
William Baker <williamtbakeremail@gmail.com> writes:

> Although my debugger might not be the smartest, I haven't noticed any
> downsides to switching this to an enum.

Well, if you write

	enum { BIT_0 = 1, BIT_1 = 2, BIT_3 = 4 } var;

it's pretty much a promise that the normal value for the var is one
of these listed values to your readers.  But bit flags are meant to
be used combined (after all, they are cheaper alternative for 1-bit
wide bitfields in a structure), so it is misleading to use enum as
such.
Junio C Hamano Oct. 2, 2019, 7:04 a.m. UTC | #9
Junio C Hamano <gitster@pobox.com> writes:

> William Baker <williamtbakeremail@gmail.com> writes:
>
>> Although my debugger might not be the smartest, I haven't noticed any
>> downsides to switching this to an enum.
>
> Well, if you write
>
> 	enum { BIT_0 = 1, BIT_1 = 2, BIT_3 = 4 } var;
>
> it's pretty much a promise that the normal value for the var is one
> of these listed values to your readers.

... that is why compilers give a warning when you write

	switch (var) {
	case ...:
	...
	}

and do not have case arms for all the declared enum values without
having the 'default' arm.
William Baker Oct. 2, 2019, 3:56 p.m. UTC | #10
On 10/2/19 12:04 AM, Junio C Hamano wrote>>
>> Well, if you write
>>
>> 	enum { BIT_0 = 1, BIT_1 = 2, BIT_3 = 4 } var;
>>
>> it's pretty much a promise that the normal value for the var is one
>> of these listed values to your readers.
> 
> ... that is why compilers give a warning when you write
> 
> 	switch (var) {
> 	case ...:
> 	...
> 	}
> 
> and do not have case arms for all the declared enum values without
> having the 'default' arm.
> 

Thanks for all of the details and feedback here.  I can leave the
flag as-is (and not switch to enum).

Thanks,
William
SZEDER Gábor Oct. 7, 2019, 5:12 p.m. UTC | #11
On Wed, Oct 02, 2019 at 02:43:48PM +0900, Junio C Hamano wrote:
> William Baker <williamtbakeremail@gmail.com> writes:
> 
> > Although my debugger might not be the smartest, I haven't noticed any
> > downsides to switching this to an enum.
> 
> Well, if you write
> 
> 	enum { BIT_0 = 1, BIT_1 = 2, BIT_3 = 4 } var;
> 
> it's pretty much a promise that the normal value for the var is one
> of these listed values to your readers.  But bit flags are meant to
> be used combined (after all, they are cheaper alternative for 1-bit
> wide bitfields in a structure), so it is misleading to use enum as
> such.

Having the combination of enum constants with power-of-two values is
not misleading, but rather an idiom.

Back when debugging the racy split index issues I only saw gibberish
like this:

  (gdb) p ce->ce_flags 
  $2 = 469762048

With an enum I now have this instead:

  (gdb) p ce->ce_flags
  $2 = (CE_MATCHED | CE_UPDATE_IN_BASE | CE_STRIP_NAME)

The latter is about as many times more readable as the int value of
that 'ce_flags'.

The sooner everyone gets on board with this the better.
SZEDER Gábor Oct. 7, 2019, 5:29 p.m. UTC | #12
On Sat, Sep 28, 2019 at 12:40:52PM +0900, Junio C Hamano wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
> 
> > On Fri, Sep 20, 2019 at 09:53:48AM -0700, William Baker via GitGitGadget wrote:
> >> diff --git a/midx.h b/midx.h
> >> index f0ae656b5d..e6fa356b5c 100644
> >> --- a/midx.h
> >> +++ b/midx.h
> >> @@ -37,6 +37,8 @@ struct multi_pack_index {
> >>  	char object_dir[FLEX_ARRAY];
> >>  };
> >>  
> >> +#define MIDX_PROGRESS     (1 << 0)
> >
> > Please consider using an enum.
> 
> If they are used by assiging one of their values, definitely a good
> idea to use an enum.  Are debuggers clever enough that they can
> tell, when they see something like this:
> 
> 	enum gress {
> 		PROGRESS = 1,
> 		REGRESS = 2,
> 	};
> 
> 	void func(enum gress v);
> 
> 	...
> 
>         void caller(void)
> 	{
> 		func(PROGRESS | REGRESS);
> 		func(PROGRESS + REGRESS);
> 		func(PROGRESS * 3);
> 	}
> 
> how caller came about to give 3?

No, they tend to show (PROGRESS | REGRESS), at least both gdb and lldb
do.  If the enum has only constants with power-of-two values, then that
is the right way to write it, and the other two are asking for trouble
(e.g. after someone changes the value of REGRESS from 2 to 8,
'PROGRESS * 3' will mean something different).
Junio C Hamano Oct. 8, 2019, 4:30 a.m. UTC | #13
SZEDER Gábor <szeder.dev@gmail.com> writes:

>> 		func(PROGRESS | REGRESS);
>> 		func(PROGRESS + REGRESS);
>> 		func(PROGRESS * 3);
>> 	}
>> 
>> how caller came about to give 3?
>
> No, they tend to show (PROGRESS | REGRESS), at least both gdb and lldb
> do.

OK.

>  If the enum has only constants with power-of-two values, then that
> is the right way to write it, and the other two are asking for trouble

If the programmer and the debugger knows the constants are used to
represent bits that can be or'ed together, what you say is correct,
but that is entirely irrelevant.

What I was worried about is that the constants that are used to
represent something that are *NOT* set of bits (hence "PROGRESS * 3"
may be perfectly a reasonable thing for such an application) may be
mistaken by an overly clever debugger and "3" may end up getting
shown as "PROGRESS | REGRESS".  When there are only two constants
(PROGRESS=1 and REGRESS=2), we humans nor debuggers can tell if that
is to represent two bits that can be or'ed together, or it is an
enumeration.

Until we gain the third constant, that is, at which time the third
one may likely be 3 (if enumeration) or 4 (if bits).
William Baker Oct. 8, 2019, 4:24 p.m. UTC | #14
On 10/7/19 9:30 PM, Junio C Hamano wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
> 
>>> 		func(PROGRESS | REGRESS);
>>> 		func(PROGRESS + REGRESS);
>>> 		func(PROGRESS * 3);
>>> 	}
>>>
>>> how caller came about to give 3?
>>
>> No, they tend to show (PROGRESS | REGRESS), at least both gdb and lldb
>> do.
> 
> OK.
> 
>>  If the enum has only constants with power-of-two values, then that
>> is the right way to write it, and the other two are asking for trouble
> 
> If the programmer and the debugger knows the constants are used to
> represent bits that can be or'ed together, what you say is correct,
> but that is entirely irrelevant.
> 
> What I was worried about is that the constants that are used to
> represent something that are *NOT* set of bits (hence "PROGRESS * 3"
> may be perfectly a reasonable thing for such an application) may be
> mistaken by an overly clever debugger and "3" may end up getting
> shown as "PROGRESS | REGRESS".  When there are only two constants
> (PROGRESS=1 and REGRESS=2), we humans nor debuggers can tell if that
> is to represent two bits that can be or'ed together, or it is an
> enumeration.
> 
> Until we gain the third constant, that is, at which time the third
> one may likely be 3 (if enumeration) or 4 (if bits).
> 

I tried to see how lldb would handle the "PROGRESS * 3" scenario
but I was unable to get lldb to display the "PROGRESS | REGRESS" format
even when ORing the flags:

(lldb) l 399
   399 		enum test_flags {
   400 			TEST_FLAG_1 = 1 << 0,
   401 			TEST_FLAG_2 = 1 << 1,
   402 		};
   403 		
   404 		enum test_flags flags_1 = TEST_FLAG_1;
   405 		enum test_flags flags_2 = TEST_FLAG_2;
   406 		enum test_flags flags_both = TEST_FLAG_1 | TEST_FLAG_2;
   407 		
   408 		if (flags_1 || flags_2 || flags_both)
(lldb) p flags_1
(test_flags) $0 = TEST_FLAG_1
(lldb) p flags_2
(test_flags) $1 = TEST_FLAG_2
(lldb) p flags_both
(test_flags) $2 = 3
(lldb) fr v flags_both
(test_flags) flags_both = 3
(lldb) fr v --format enum flags_both
(test_flags) flags_both = 3
(lldb) version
lldb-902.0.79.7
  Swift-4.1

Is there something that needs to be adjusted in the config or with
--format to display "TEST_FLAG_1 | TEST_FLAG_2" in this example?

Thanks,
William
SZEDER Gábor Oct. 9, 2019, 12:16 a.m. UTC | #15
On Tue, Oct 08, 2019 at 09:24:56AM -0700, William Baker wrote:
> On 10/7/19 9:30 PM, Junio C Hamano wrote:
> > SZEDER Gábor <szeder.dev@gmail.com> writes:
> > 
> >>> 		func(PROGRESS | REGRESS);
> >>> 		func(PROGRESS + REGRESS);
> >>> 		func(PROGRESS * 3);
> >>> 	}
> >>>
> >>> how caller came about to give 3?
> >>
> >> No, they tend to show (PROGRESS | REGRESS), at least both gdb and lldb
> >> do.

> I tried to see how lldb would handle the "PROGRESS * 3" scenario
> but I was unable to get lldb to display the "PROGRESS | REGRESS" format
> even when ORing the flags:
> 
> (lldb) l 399
>    399 		enum test_flags {
>    400 			TEST_FLAG_1 = 1 << 0,
>    401 			TEST_FLAG_2 = 1 << 1,
>    402 		};
>    403 		
>    404 		enum test_flags flags_1 = TEST_FLAG_1;
>    405 		enum test_flags flags_2 = TEST_FLAG_2;
>    406 		enum test_flags flags_both = TEST_FLAG_1 | TEST_FLAG_2;
>    407 		
>    408 		if (flags_1 || flags_2 || flags_both)
> (lldb) p flags_1
> (test_flags) $0 = TEST_FLAG_1
> (lldb) p flags_2
> (test_flags) $1 = TEST_FLAG_2
> (lldb) p flags_both
> (test_flags) $2 = 3
> (lldb) fr v flags_both
> (test_flags) flags_both = 3
> (lldb) fr v --format enum flags_both
> (test_flags) flags_both = 3
> (lldb) version
> lldb-902.0.79.7
>   Swift-4.1
> 
> Is there something that needs to be adjusted in the config or with
> --format to display "TEST_FLAG_1 | TEST_FLAG_2" in this example?

I'm afraid you are right and lldb doesn't do this, at least not by
default, what a pity.  I tried different versions of gdb and lldb,
distro-shipped older ones and some more recent in containers a couple
of days ago; now I tried it again and see the same behavior as you, so
I must have misremembered.  Sorry for the confusion.

gdb still does the user-friendly thing, though.
SZEDER Gábor Oct. 9, 2019, 1:32 a.m. UTC | #16
On Tue, Oct 08, 2019 at 01:30:34PM +0900, Junio C Hamano wrote:
> SZEDER Gábor <szeder.dev@gmail.com> writes:
> 
> >> 		func(PROGRESS | REGRESS);
> >> 		func(PROGRESS + REGRESS);
> >> 		func(PROGRESS * 3);
> >> 	}
> >> 
> >> how caller came about to give 3?
> >
> > No, they tend to show (PROGRESS | REGRESS), at least both gdb and lldb
> > do.

I was wrong here, gdb does this, but lldb, unfortunately, doesn't; see
my other reply in this thread.

> OK.
> 
> >  If the enum has only constants with power-of-two values, then that
> > is the right way to write it, and the other two are asking for trouble
> 
> If the programmer and the debugger knows the constants are used to
> represent bits that can be or'ed together, what you say is correct,
> but that is entirely irrelevant.
> 
> What I was worried about is that the constants that are used to
> represent something that are *NOT* set of bits (hence "PROGRESS * 3"
> may be perfectly a reasonable thing for such an application)

I don't really see how that could be reasonable, it's prone to break
when changing the values of the enum constants.

> may be
> mistaken by an overly clever debugger and "3" may end up getting
> shown as "PROGRESS | REGRESS".  When there are only two constants
> (PROGRESS=1 and REGRESS=2), we humans nor debuggers can tell if that
> is to represent two bits that can be or'ed together, or it is an
> enumeration.
> 
> Until we gain the third constant, that is, at which time the third
> one may likely be 3 (if enumeration) or 4 (if bits).

Humans benefit from context: they understand the name of the enum type
(e.g. does it end with "_flags"?), the name of the enum constants, and
the comment above the enum's definition (if any), and can then infer
whether those constants represent OR-able bits or not.  If they can't
find this out, then that enum is poorly named and/or documented, which
should be fixed.  As for the patch that I originally commented on, I
would expect the enum to be called e.g. 'midx_flags', and thus already
with that single constant in there it'll be clear that it is intended
as a collection of related OR-able bits.

As for the debugger, if it sees a variable of an enum type whose value
doesn't match any of the enum constants, then there are basically
three possibilities:

  - All constants in that enum have power-of-two values.  In this case
    it's reasonable from the debugger to assume that those constants
    are OR'ed together, and is extremely helpful to display the value
    that way.

  - The constants are just a set of values (1, 2, 3, 42, etc).  In
    this case the variable shouldn't have a value that doesn't match
    one of the constants in the first place, and I would first suspect
    that the program might be buggy.

  - A "mostly" power-of-two enum might contain shorthand constants for
    combinations of a set of other constants, e.g.:

      enum flags {
              BIT0 = (1 << 0),
              BIT1 = (1 << 1),
              BIT2 = (1 << 2),

              FIRST_TWO = (BIT0 | BIT1),
      };
      enum flags f0 = BIT0;
      enum flags f1 = BIT0 | BIT1;
      enum flags f2 = BIT0 | BIT2;
      enum flags f3 = BIT0 | BIT1 | BIT2;

    In this case, sadly, gdb shows only matching constants:

      (gdb) p f0
      $1 = BIT0
      (gdb) p f1
      $2 = FIRST_TWO
      (gdb) p f2
      $3 = 5
      (gdb) p f3
      $4 = 7
William Baker Oct. 15, 2019, 8 p.m. UTC | #17
On 10/8/19 6:32 PM, SZEDER Gábor wrote:
>>> No, they tend to show (PROGRESS | REGRESS), at least both gdb and lldb
>>> do.
> 
> I was wrong here, gdb does this, but lldb, unfortunately, doesn't; see
> my other reply in this thread.
>
>> What I was worried about is that the constants that are used to
>> represent something that are *NOT* set of bits (hence "PROGRESS * 3"
>> may be perfectly a reasonable thing for such an application)
> 
> I don't really see how that could be reasonable, it's prone to break
> when changing the values of the enum constants.
> 
>> may be
>> mistaken by an overly clever debugger and "3" may end up getting
>> shown as "PROGRESS | REGRESS".  When there are only two constants
>> (PROGRESS=1 and REGRESS=2), we humans nor debuggers can tell if that
>> is to represent two bits that can be or'ed together, or it is an
>> enumeration.
>>
>> Until we gain the third constant, that is, at which time the third
>> one may likely be 3 (if enumeration) or 4 (if bits).
> 
> Humans benefit from context: they understand the name of the enum type
> (e.g. does it end with "_flags"?), the name of the enum constants, and
> the comment above the enum's definition (if any), and can then infer
> whether those constants represent OR-able bits or not.  If they can't
> find this out, then that enum is poorly named and/or documented, which
> should be fixed.  As for the patch that I originally commented on, I
> would expect the enum to be called e.g. 'midx_flags', and thus already
> with that single constant in there it'll be clear that it is intended
> as a collection of related OR-able bits.
> 
> As for the debugger, if it sees a variable of an enum type whose value
> doesn't match any of the enum constants, then there are basically
> three possibilities:
> 
>   - All constants in that enum have power-of-two values.  In this case
>     it's reasonable from the debugger to assume that those constants
>     are OR'ed together, and is extremely helpful to display the value
>     that way.
> 
>   - The constants are just a set of values (1, 2, 3, 42, etc).  In
>     this case the variable shouldn't have a value that doesn't match
>     one of the constants in the first place, and I would first suspect
>     that the program might be buggy.
> 
>   - A "mostly" power-of-two enum might contain shorthand constants for
>     combinations of a set of other constants, e.g.:
> 
>       enum flags {
>               BIT0 = (1 << 0),
>               BIT1 = (1 << 1),
>               BIT2 = (1 << 2),
> 
>               FIRST_TWO = (BIT0 | BIT1),
>       };
>       enum flags f0 = BIT0;
>       enum flags f1 = BIT0 | BIT1;
>       enum flags f2 = BIT0 | BIT2;
>       enum flags f3 = BIT0 | BIT1 | BIT2;
> 
>     In this case, sadly, gdb shows only matching constants:
> 
>       (gdb) p f0
>       $1 = BIT0
>       (gdb) p f1
>       $2 = FIRST_TWO
>       (gdb) p f2
>       $3 = 5
>       (gdb) p f3
>       $4 = 7
> 

I believe this is the last open thread/discussion on the "midx: add MIDX_PROGRESS
flag" patch series.  

A quick summary of where the discussion stands:

- The patch series currently uses #define for the new MIDX_PROGRESS flag.
- The git codebase uses both #defines and enums for flags.
- The debugger always shows the enum value's name if there is a match, if values
  are OR-ed together there a few possibilities (see discussion above and earlier
  in the thread).
- There are concerns regarding misinterpreting constants that are not a set of
  bits (e.g. "PROGRESS * 3").

Are there any other details I can provide that would help in reaching a
conclusion as to how to proceed?

At this time there are no other MIDX_* flags and so there's always the option
to revisit the best way to handle multiple MIDX_* flags if/when additional
flags are added.

Thanks,
William
Junio C Hamano Oct. 16, 2019, 2:09 a.m. UTC | #18
William Baker <williamtbakeremail@gmail.com> writes:

> At this time there are no other MIDX_* flags and so there's always the option
> to revisit the best way to handle multiple MIDX_* flags if/when additional
> flags are added.

I do not care too deeply either way, but if you wrote it in one way,
how about completing the series without changing it in the middle,
and leave the clean-ups to a follow-up series (if needed)?
William Baker Oct. 16, 2019, 7:48 p.m. UTC | #19
On 10/15/19 7:09 PM, Junio C Hamano wrote:
>> At this time there are no other MIDX_* flags and so there's always the option
>> to revisit the best way to handle multiple MIDX_* flags if/when additional
>> flags are added.
> 
> I do not care too deeply either way, but if you wrote it in one way,
> how about completing the series without changing it in the middle,
> and leave the clean-ups to a follow-up series (if needed)?


That plan sounds good to me.  The most recent series (v3) should be ready
to go, I don't believe there is any outstanding feedback to address.

Thanks,
William
William Baker Oct. 18, 2019, 9:35 p.m. UTC | #20
On 10/16/19 12:48 PM, William Baker wrote:>> I do not care too deeply either way, but if you wrote it in one way,
>> how about completing the series without changing it in the middle,
>> and leave the clean-ups to a follow-up series (if needed)?
> 
> 
> That plan sounds good to me.  The most recent series (v3) should be ready
> to go, I don't believe there is any outstanding feedback to address.

Follow-up question on this patch series.

I noticed there is a branch named 'wb/midx-progress' in
https://github.com/gitster/git but it does not appear to have the commits
from this patch series and I have not seen it mentioned in
"What's cooking in git.git" emails.

Is there anything else I should do to get these changes picked up in 'pu'?

Thanks again,
William
diff mbox series

Patch

diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c
index b1ea1a6aa1..e86b8cd17d 100644
--- a/builtin/multi-pack-index.c
+++ b/builtin/multi-pack-index.c
@@ -47,16 +47,16 @@  int cmd_multi_pack_index(int argc, const char **argv,
 	trace2_cmd_mode(argv[0]);
 
 	if (!strcmp(argv[0], "repack"))
-		return midx_repack(the_repository, opts.object_dir, (size_t)opts.batch_size);
+		return midx_repack(the_repository, opts.object_dir, (size_t)opts.batch_size, 0);
 	if (opts.batch_size)
 		die(_("--batch-size option is only for 'repack' subcommand"));
 
 	if (!strcmp(argv[0], "write"))
-		return write_midx_file(opts.object_dir);
+		return write_midx_file(opts.object_dir, 0);
 	if (!strcmp(argv[0], "verify"))
-		return verify_midx_file(the_repository, opts.object_dir);
+		return verify_midx_file(the_repository, opts.object_dir, 0);
 	if (!strcmp(argv[0], "expire"))
-		return expire_midx_packs(the_repository, opts.object_dir);
+		return expire_midx_packs(the_repository, opts.object_dir, 0);
 
 	die(_("unrecognized subcommand: %s"), argv[0]);
 }
diff --git a/builtin/repack.c b/builtin/repack.c
index 632c0c0a79..5b3623337f 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -561,7 +561,7 @@  int cmd_repack(int argc, const char **argv, const char *prefix)
 	remove_temporary_files();
 
 	if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0))
-		write_midx_file(get_object_directory());
+		write_midx_file(get_object_directory(), 0);
 
 	string_list_clear(&names, 0);
 	string_list_clear(&rollback, 0);
diff --git a/midx.c b/midx.c
index d649644420..b2673f52e8 100644
--- a/midx.c
+++ b/midx.c
@@ -1017,7 +1017,7 @@  static int write_midx_internal(const char *object_dir, struct multi_pack_index *
 	return result;
 }
 
-int write_midx_file(const char *object_dir)
+int write_midx_file(const char *object_dir, unsigned flags)
 {
 	return write_midx_internal(object_dir, NULL, NULL);
 }
@@ -1077,7 +1077,7 @@  static int compare_pair_pos_vs_id(const void *_a, const void *_b)
 			display_progress(progress, _n); \
 	} while (0)
 
-int verify_midx_file(struct repository *r, const char *object_dir)
+int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags)
 {
 	struct pair_pos_vs_id *pairs = NULL;
 	uint32_t i;
@@ -1184,7 +1184,7 @@  int verify_midx_file(struct repository *r, const char *object_dir)
 	return verify_midx_error;
 }
 
-int expire_midx_packs(struct repository *r, const char *object_dir)
+int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags)
 {
 	uint32_t i, *count, result = 0;
 	struct string_list packs_to_drop = STRING_LIST_INIT_DUP;
@@ -1316,7 +1316,7 @@  static int fill_included_packs_batch(struct repository *r,
 	return 0;
 }
 
-int midx_repack(struct repository *r, const char *object_dir, size_t batch_size)
+int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags)
 {
 	int result = 0;
 	uint32_t i;
diff --git a/midx.h b/midx.h
index f0ae656b5d..e6fa356b5c 100644
--- a/midx.h
+++ b/midx.h
@@ -37,6 +37,8 @@  struct multi_pack_index {
 	char object_dir[FLEX_ARRAY];
 };
 
+#define MIDX_PROGRESS     (1 << 0)
+
 struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local);
 int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);
 int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);
@@ -47,11 +49,11 @@  int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pa
 int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name);
 int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local);
 
-int write_midx_file(const char *object_dir);
+int write_midx_file(const char *object_dir, unsigned flags);
 void clear_midx_file(struct repository *r);
-int verify_midx_file(struct repository *r, const char *object_dir);
-int expire_midx_packs(struct repository *r, const char *object_dir);
-int midx_repack(struct repository *r, const char *object_dir, size_t batch_size);
+int verify_midx_file(struct repository *r, const char *object_dir, unsigned flags);
+int expire_midx_packs(struct repository *r, const char *object_dir, unsigned flags);
+int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, unsigned flags);
 
 void close_midx(struct multi_pack_index *m);