diff mbox series

[GSoC,v3] apply: Change #define to enum and variable types from int to enum

Message ID 20230215091950.2976-1-vinayakdev.sci@gmail.com (mailing list archive)
State New, archived
Headers show
Series [GSoC,v3] apply: Change #define to enum and variable types from int to enum | expand

Commit Message

Vinayak Dev Feb. 15, 2023, 9:19 a.m. UTC
From: vinayakdsci <vinayakdev.sci@gmail.com>

Change #define constants to enum and variable types from int to enum
in apply.c

Enum constants have an advantage over #define macro constants in that
modern debuggers are able to report them symbolically instead of just
as simple numbers. This makes debugging and catching undercover errors
easier, and can many a times save quite some time and inconvenience.

Signed-off-by: Vinayak Dev <vinayakdev.sci@gmail.com>

---
 apply.c | 17 +++++++++++------
 1 file changed, 11 insertions(+), 6 deletions(-)


base-commit: b1485644f936ee83a995ec24d23f713f4230a1ae

Comments

Junio C Hamano Feb. 15, 2023, 5:49 p.m. UTC | #1
Vinayak Dev <vinayakdev.sci@gmail.com> writes:

> diff --git a/apply.c b/apply.c
> index 5cc5479c9c..b2a03d9fc3 100644
> --- a/apply.c
> +++ b/apply.c
> @@ -205,8 +205,10 @@ struct fragment {
>   * or deflated "literal".
>   */
>  #define binary_patch_method leading

Notice and explain what this line is doing, perhaps?

> -#define BINARY_DELTA_DEFLATED	1
> -#define BINARY_LITERAL_DEFLATED 2
> +enum binary_type_deflated {
> +	BINARY_DELTA_DEFLATED = 1,
> +	BINARY_LITERAL_DEFLATED
> +};

These days, we not just allow but encourage enum definitions to have
a trailing comma after the last item, UNLESS we want to signal that
the one at the end right now MUST stay to be the last one (e.g. a
sentinel at the end).

A patch that adds a new item to, removes an existing item from, or
shuffles existing items in the list can be free of unnecessary patch
noise to omit the last comma.

As a faithful rewrite, forcing the same values to be given as before
by saying that "_DEFLATED must be 1" is a good thing to do, but once
the dust settled from the patch, it would be a good idea to go back
to the code and see if the values MUST be these, or if it is fine to
use any value as long as they are distinct.  If it is the latter,
then it would make a good follow-up patch to remove "= 1", with an
explanation why it is a safe thing to do.

>  static void free_fragment_list(struct fragment *list)
>  {
> @@ -918,14 +920,17 @@ static int gitdiff_hdrend(struct gitdiff_data *state UNUSED,
>   * their names against any previous information, just
>   * to make sure..
>   */
> -#define DIFF_OLD_NAME 0
> -#define DIFF_NEW_NAME 1
> +
> +enum diff_name {
> +	DIFF_OLD_NAME = 0,
> +	DIFF_NEW_NAME
> +};

Ditto.

>  static int gitdiff_verify_name(struct gitdiff_data *state,
>  			       const char *line,
>  			       int isnull,
>  			       char **name,
> -			       int side)
> +			       enum diff_name side)
>  {
>  	if (!*name && !isnull) {
>  		*name = find_name(state->root, line, NULL, state->p_value, TERM_TAB);
> @@ -1910,7 +1915,7 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
>  	int llen, used;
>  	unsigned long size = *sz_p;
>  	char *buffer = *buf_p;
> -	int patch_method;
> +	enum binary_type_deflated patch_method;

This is not quite sufficient to achieve the goal of helping "modern
debuggers" that was stated in the proposed log message, is it?
parse_binary_hunk() copies the value from this local variable to a
member in the fragment being parsed, like so:

	frag->binary_patch_method = patch_method;

but the thing is, as we have seen earlier, a compiler macro is used
to (ab)use the "leading" member and call it "binary_patch_method".
The type of that member is "unsigned long".

Now if our ultimate goal were to use enum instead of macro, then an
obvious "solution" would be to stop abusing "leading".  Instead, you
would add "enum binary_type_deflated binary_patch_method" member to
the fragment struct and use the enum throughout.

But is it worth it?

Using enum instead of macro is *NOT* a goal.  If doing so makes our
code better, we may do so---which tells us that use of enum is not a
goal but a means to make our code better.  Is adding an extra member
that is used only for binary patches improve our code?  I dunno.

Thanks.
Vinayak Dev Feb. 16, 2023, 2:04 p.m. UTC | #2
On Wed, 15 Feb 2023 at 23:19, Junio C Hamano <gitster@pobox.com> wrote:
>
> Vinayak Dev <vinayakdev.sci@gmail.com> writes:
>
> > diff --git a/apply.c b/apply.c
> > index 5cc5479c9c..b2a03d9fc3 100644
> > --- a/apply.c
> > +++ b/apply.c
> > @@ -205,8 +205,10 @@ struct fragment {
> >   * or deflated "literal".
> >   */
> >  #define binary_patch_method leading
>
> Notice and explain what this line is doing, perhaps?

It does appear that the macro binary_patch_method actually refers to
an unsigned long.
I think leading represents the number of leading context lines in a
hunk/fragment,
and the variable is reused to store the type for a binary fragment.

> > -#define BINARY_DELTA_DEFLATED        1
> > -#define BINARY_LITERAL_DEFLATED 2
> > +enum binary_type_deflated {
> > +     BINARY_DELTA_DEFLATED = 1,
> > +     BINARY_LITERAL_DEFLATED
> > +};
>
> These days, we not just allow but encourage enum definitions to have
> a trailing comma after the last item, UNLESS we want to signal that
> the one at the end right now MUST stay to be the last one (e.g. a
> sentinel at the end).

> A patch that adds a new item to, removes an existing item from, or
> shuffles existing items in the list can be free of unnecessary patch
> noise to omit the last comma.

Ok. Point noted.

> As a faithful rewrite, forcing the same values to be given as before
> by saying that "_DEFLATED must be 1" is a good thing to do, but once
> the dust settled from the patch, it would be a good idea to go back
> to the code and see if the values MUST be these, or if it is fine to
> use any value as long as they are distinct.  If it is the latter,
> then it would make a good follow-up patch to remove "= 1", with an
> explanation why it is a safe thing to do.

Removing the 1 _may_ be a safe thing to do, because the value
fragment->patch_method
is finally used in a switch statement, which makes it apparent that
the difference in the values
is actually necessary, and maybe not the actual value they hold (Also,
random but distinct
values still pass the test cases).

> >  static void free_fragment_list(struct fragment *list)
> >  {
> > @@ -918,14 +920,17 @@ static int gitdiff_hdrend(struct gitdiff_data *state UNUSED,
> >   * their names against any previous information, just
> >   * to make sure..
> >   */
> > -#define DIFF_OLD_NAME 0
> > -#define DIFF_NEW_NAME 1
> > +
> > +enum diff_name {
> > +     DIFF_OLD_NAME = 0,
> > +     DIFF_NEW_NAME
> > +};
>
> Ditto.

I think that since enums start with zero by default, you are right in
saying that the '=0' here can be removed.
I will do so.

> >  static int gitdiff_verify_name(struct gitdiff_data *state,
> >                              const char *line,
> >                              int isnull,
> >                              char **name,
> > -                            int side)
> > +                            enum diff_name side)
> >  {
> >       if (!*name && !isnull) {
> >               *name = find_name(state->root, line, NULL, state->p_value, TERM_TAB);
> > @@ -1910,7 +1915,7 @@ static struct fragment *parse_binary_hunk(struct apply_state *state,
> >       int llen, used;
> >       unsigned long size = *sz_p;
> >       char *buffer = *buf_p;
> > -     int patch_method;
> > +     enum binary_type_deflated patch_method;
>
> This is not quite sufficient to achieve the goal of helping "modern
> debuggers" that was stated in the proposed log message, is it?
> parse_binary_hunk() copies the value from this local variable to a
> member in the fragment being parsed, like so:
>
>         frag->binary_patch_method = patch_method;
>
> but the thing is, as we have seen earlier, a compiler macro is used
> to (ab)use the "leading" member and call it "binary_patch_method".
> The type of that member is "unsigned long".
>
> Now if our ultimate goal were to use enum instead of macro, then an
> obvious "solution" would be to stop abusing "leading".  Instead, you
> would add "enum binary_type_deflated binary_patch_method" member to
> the fragment struct and use the enum throughout.

It does appear to be so. Introducing a new enum binary_type_deflated
binary_patch_method inside of
struct fragment would indeed be a solution.
This could be plausible, because struct fragment does not appear outside of
apply.c. Your suggestion appears to be the only right way to do it, so
as to achieve the
goal of this patch series.

> But is it worth it?
>
> Using enum instead of macro is *NOT* a goal.  If doing so makes our
> code better, we may do so---which tells us that use of enum is not a
> goal but a means to make our code better.  Is adding an extra member
> that is used only for binary patches improve our code?  I dunno.

apply.c actually seemed to be the best place to start such a change, because the
macros in the file suit such a change the best for a start. It would
have brought
unnecessary overhead in many other files, specifically those which
would affect a _lot_ of
the source code. It is just that the changes appear contained enough
to not cause a lot of
noise.
Your suggestions are very correct, I will make sure to keep them in
mind in the future.

Thanks a lot!
Vinayak
Junio C Hamano Feb. 16, 2023, 5:11 p.m. UTC | #3
Vinayak Dev <vinayakdev.sci@gmail.com> writes:

>> As a faithful rewrite, forcing the same values to be given as before
>> by saying that "_DEFLATED must be 1" is a good thing to do, but once
>> the dust settled from the patch, it would be a good idea to go back
>> to the code and see if the values MUST be these, or if it is fine to
>> use any value as long as they are distinct.  If it is the latter,
>> then it would make a good follow-up patch to remove "= 1", with an
>> explanation why it is a safe thing to do.
>
> Removing the 1 _may_ be a safe thing to do, because ...

I didn't mean to suggest you think about it _NOW_ in the context of
working on this patch.  Rather the opposite.  Let's have a faithful
rewrite first and then as a follow-on work after this patch becomes
part of "git", a further clean-up like that can be a good idea.

>> > +enum diff_name {
>> > +     DIFF_OLD_NAME = 0,
>> > +     DIFF_NEW_NAME
>> > +};
>>
>> Ditto.
>
> I think that since enums start with zero by default, you are right in
> saying that the '=0' here can be removed.

Not what I meant.  I was referring to the lack of trailing comma.

> I will do so.

Please don't.

Thanks.
Vinayak Dev Feb. 16, 2023, 5:21 p.m. UTC | #4
On Thu, 16 Feb 2023 at 22:41, Junio C Hamano <gitster@pobox.com> wrote:
>
> Vinayak Dev <vinayakdev.sci@gmail.com> writes:
>
> >> As a faithful rewrite, forcing the same values to be given as before
> >> by saying that "_DEFLATED must be 1" is a good thing to do, but once
> >> the dust settled from the patch, it would be a good idea to go back
> >> to the code and see if the values MUST be these, or if it is fine to
> >> use any value as long as they are distinct.  If it is the latter,
> >> then it would make a good follow-up patch to remove "= 1", with an
> >> explanation why it is a safe thing to do.
> >
> > Removing the 1 _may_ be a safe thing to do, because ...
>
> I didn't mean to suggest you think about it _NOW_ in the context of
> working on this patch.  Rather the opposite.  Let's have a faithful
> rewrite first and then as a follow-on work after this patch becomes
> part of "git", a further clean-up like that can be a good idea.

I am sorry, I just got a little too excited! I will follow-up at the
right time, I assure you.

> >> > +enum diff_name {
> >> > +     DIFF_OLD_NAME = 0,
> >> > +     DIFF_NEW_NAME
> >> > +};
> >>
> >> Ditto.
> >
> > I think that since enums start with zero by default, you are right in
> > saying that the '=0' here can be removed.
>
> Not what I meant.  I was referring to the lack of trailing comma.

Ok! Understood.

> > I will do so.
>
> Please don't.

Ok! I won't. :)

Thanks a lot for your reply!
Vinayak
diff mbox series

Patch

diff --git a/apply.c b/apply.c
index 5cc5479c9c..b2a03d9fc3 100644
--- a/apply.c
+++ b/apply.c
@@ -205,8 +205,10 @@  struct fragment {
  * or deflated "literal".
  */
 #define binary_patch_method leading
-#define BINARY_DELTA_DEFLATED	1
-#define BINARY_LITERAL_DEFLATED 2
+enum binary_type_deflated {
+	BINARY_DELTA_DEFLATED = 1,
+	BINARY_LITERAL_DEFLATED
+};
 
 static void free_fragment_list(struct fragment *list)
 {
@@ -918,14 +920,17 @@  static int gitdiff_hdrend(struct gitdiff_data *state UNUSED,
  * their names against any previous information, just
  * to make sure..
  */
-#define DIFF_OLD_NAME 0
-#define DIFF_NEW_NAME 1
+
+enum diff_name {
+	DIFF_OLD_NAME = 0,
+	DIFF_NEW_NAME
+};
 
 static int gitdiff_verify_name(struct gitdiff_data *state,
 			       const char *line,
 			       int isnull,
 			       char **name,
-			       int side)
+			       enum diff_name side)
 {
 	if (!*name && !isnull) {
 		*name = find_name(state->root, line, NULL, state->p_value, TERM_TAB);
@@ -1910,7 +1915,7 @@  static struct fragment *parse_binary_hunk(struct apply_state *state,
 	int llen, used;
 	unsigned long size = *sz_p;
 	char *buffer = *buf_p;
-	int patch_method;
+	enum binary_type_deflated patch_method;
 	unsigned long origlen;
 	char *data = NULL;
 	int hunk_size = 0;