diff mbox series

[v2,1/3] diff: return diff_filepair from diff queue helpers

Message ID 20250212041825.2455031-2-jltobler@gmail.com (mailing list archive)
State New
Headers show
Series batch blob diff generation | expand

Commit Message

Justin Tobler Feb. 12, 2025, 4:18 a.m. UTC
The `diff_addremove()` and `diff_change()` functions setup and queue
diffs, but do not return the `diff_filepair` added to the queue. In a
subsequent commit, modifications to `diff_filepair` need to take place
in certain cases after being queued.

Split out the queuing operations into `diff_filepair_addremove()` and
`diff_filepair_change()` which also return a handle to the queued
`diff_filepair`.

Signed-off-by: Justin Tobler <jltobler@gmail.com>
---
 diff.c | 66 +++++++++++++++++++++++++++++++++++++++++-----------------
 diff.h | 15 +++++++++++++
 2 files changed, 62 insertions(+), 19 deletions(-)

Comments

Karthik Nayak Feb. 12, 2025, 9:06 a.m. UTC | #1
Justin Tobler <jltobler@gmail.com> writes:

> The `diff_addremove()` and `diff_change()` functions setup and queue
> diffs, but do not return the `diff_filepair` added to the queue. In a
> subsequent commit, modifications to `diff_filepair` need to take place
> in certain cases after being queued.
>
> Split out the queuing operations into `diff_filepair_addremove()` and
> `diff_filepair_change()` which also return a handle to the queued
> `diff_filepair`.
>

This patch keeps `diff_addremove()` and `diff_change()` while
introducing two new functions which return the `diff_filepair`. Just a
thought, why not replace them? The users `diff_addremove()` and
`diff_change()` could simply call the new functions and ignore the
return value?

This would be messy if there were a lot of users of `diff_addremove()`
and `diff_change()`, but I only see a few callers. Wouldn't it be
cleaner to just replace?

The patch looks good to me otherwise.

[snip]

> diff --git a/diff.h b/diff.h
> index 0a566f5531..6ea63f01e7 100644
> --- a/diff.h
> +++ b/diff.h
> @@ -508,6 +508,21 @@ void diff_set_default_prefix(struct diff_options *options);
>
>  int diff_can_quit_early(struct diff_options *);
>
> +struct diff_filepair *diff_filepair_addremove(struct diff_options *,
> +					      int addremove, unsigned mode,
> +					      const struct object_id *oid,
> +					      int oid_valid, const char *fullpath,
> +					      unsigned dirty_submodule);
> +
> +struct diff_filepair *diff_filepair_change(struct diff_options *,
> +					   unsigned mode1, unsigned mode2,
> +					   const struct object_id *old_oid,
> +					   const struct object_id *new_oid,
> +					   int old_oid_valid, int new_oid_valid,
> +					   const char *fullpath,
> +					   unsigned dirty_submodule1,
> +					   unsigned dirty_submodule2);
> +

Nit: would be nice to have some comments to describe what these
functions do.

>  void diff_addremove(struct diff_options *,
>  		    int addremove,
>  		    unsigned mode,
> --
> 2.48.1
Patrick Steinhardt Feb. 12, 2025, 9:23 a.m. UTC | #2
On Tue, Feb 11, 2025 at 10:18:23PM -0600, Justin Tobler wrote:
> The `diff_addremove()` and `diff_change()` functions setup and queue
> diffs, but do not return the `diff_filepair` added to the queue. In a
> subsequent commit, modifications to `diff_filepair` need to take place
> in certain cases after being queued.
> 
> Split out the queuing operations into `diff_filepair_addremove()` and
> `diff_filepair_change()` which also return a handle to the queued
> `diff_filepair`.

One of the things that puzzled me a bit is that we keep the old-style
functions, where the only difference is the return value. Wouldn't it
make more sense to instead adapt these existing functions to reduce the
amount of duplication?

At the same time, while we're already at it, do we maybe also want to
adapt the functions so that they get the `diff_queue` as input instead
of relying on the global queue? That would make them more generally
useful and be a step into the right direction regarding libification. If
so, it would indeed make sense to also rename the function into e.g.
`diff_queue_addremove()`.

Patrick
Justin Tobler Feb. 12, 2025, 5:24 p.m. UTC | #3
On 25/02/12 10:23AM, Patrick Steinhardt wrote:
> On Tue, Feb 11, 2025 at 10:18:23PM -0600, Justin Tobler wrote:
> > The `diff_addremove()` and `diff_change()` functions setup and queue
> > diffs, but do not return the `diff_filepair` added to the queue. In a
> > subsequent commit, modifications to `diff_filepair` need to take place
> > in certain cases after being queued.
> > 
> > Split out the queuing operations into `diff_filepair_addremove()` and
> > `diff_filepair_change()` which also return a handle to the queued
> > `diff_filepair`.
> 
> One of the things that puzzled me a bit is that we keep the old-style
> functions, where the only difference is the return value. Wouldn't it
> make more sense to instead adapt these existing functions to reduce the
> amount of duplication?

This is what I considered doing initially. I noticed though that both
`diff_addremove()` and `diff_change()` are stored as callbacks in
`diff_options` as types `add_remove_fn_t` and `change_fn_t`. The diff
options configured for pruning use `file_add_remove()` and
`file_change()` instead. Returning `diff_filepair` doesn't seems to make
much sense in the context of `file_add_remove()` and `file_change()` as
no filepairs ever get queued, so I opted to factor out the logic into
separate functions instead of adapting the function signatures for all.

This may not be the best option, so I can also change it if that is
best.

> At the same time, while we're already at it, do we maybe also want to
> adapt the functions so that they get the `diff_queue` as input instead
> of relying on the global queue? That would make them more generally
> useful and be a step into the right direction regarding libification. If
> so, it would indeed make sense to also rename the function into e.g.
> `diff_queue_addremove()`.

Thanks for the suggestion. I'll adapt the next version accordingly.

-Justin
Justin Tobler Feb. 12, 2025, 5:35 p.m. UTC | #4
On 25/02/12 01:06AM, Karthik Nayak wrote:
> Justin Tobler <jltobler@gmail.com> writes:
> 
> > The `diff_addremove()` and `diff_change()` functions setup and queue
> > diffs, but do not return the `diff_filepair` added to the queue. In a
> > subsequent commit, modifications to `diff_filepair` need to take place
> > in certain cases after being queued.
> >
> > Split out the queuing operations into `diff_filepair_addremove()` and
> > `diff_filepair_change()` which also return a handle to the queued
> > `diff_filepair`.
> >
> 
> This patch keeps `diff_addremove()` and `diff_change()` while
> introducing two new functions which return the `diff_filepair`. Just a
> thought, why not replace them? The users `diff_addremove()` and
> `diff_change()` could simply call the new functions and ignore the
> return value?

This was mostly to avoid changing the `add_remove_fn_t` and
`change_fn_t` types that store `diff_addremove()` and `diff_change()` in
`diff_options`. The `file_add_remove()` and `file_change()` functions,
which also can be set in `diff_options`, do not ever queue file pairs so
I don't think returning `diff_filepair` makes much sense there.

> This would be messy if there were a lot of users of `diff_addremove()`
> and `diff_change()`, but I only see a few callers. Wouldn't it be
> cleaner to just replace?

Patrick has suggested we avoid using the global `diff_queue_struct`
implicitly. Currently, in the next version I'm planning to keep the
separate functions as `diff_queue_addremove()` and
`diff_queue_change()`, but also accept `diff_queue_struct` as an
argument.

> The patch looks good to me otherwise.
> 
> [snip]
> 
> > diff --git a/diff.h b/diff.h
> > index 0a566f5531..6ea63f01e7 100644
> > --- a/diff.h
> > +++ b/diff.h
> > @@ -508,6 +508,21 @@ void diff_set_default_prefix(struct diff_options *options);
> >
> >  int diff_can_quit_early(struct diff_options *);
> >
> > +struct diff_filepair *diff_filepair_addremove(struct diff_options *,
> > +					      int addremove, unsigned mode,
> > +					      const struct object_id *oid,
> > +					      int oid_valid, const char *fullpath,
> > +					      unsigned dirty_submodule);
> > +
> > +struct diff_filepair *diff_filepair_change(struct diff_options *,
> > +					   unsigned mode1, unsigned mode2,
> > +					   const struct object_id *old_oid,
> > +					   const struct object_id *new_oid,
> > +					   int old_oid_valid, int new_oid_valid,
> > +					   const char *fullpath,
> > +					   unsigned dirty_submodule1,
> > +					   unsigned dirty_submodule2);
> > +
> 
> Nit: would be nice to have some comments to describe what these
> functions do.

I'll add in the next version. Thanks

-Justin
Patrick Steinhardt Feb. 13, 2025, 5:45 a.m. UTC | #5
On Wed, Feb 12, 2025 at 11:24:55AM -0600, Justin Tobler wrote:
> On 25/02/12 10:23AM, Patrick Steinhardt wrote:
> > On Tue, Feb 11, 2025 at 10:18:23PM -0600, Justin Tobler wrote:
> > > The `diff_addremove()` and `diff_change()` functions setup and queue
> > > diffs, but do not return the `diff_filepair` added to the queue. In a
> > > subsequent commit, modifications to `diff_filepair` need to take place
> > > in certain cases after being queued.
> > > 
> > > Split out the queuing operations into `diff_filepair_addremove()` and
> > > `diff_filepair_change()` which also return a handle to the queued
> > > `diff_filepair`.
> > 
> > One of the things that puzzled me a bit is that we keep the old-style
> > functions, where the only difference is the return value. Wouldn't it
> > make more sense to instead adapt these existing functions to reduce the
> > amount of duplication?
> 
> This is what I considered doing initially. I noticed though that both
> `diff_addremove()` and `diff_change()` are stored as callbacks in
> `diff_options` as types `add_remove_fn_t` and `change_fn_t`. The diff
> options configured for pruning use `file_add_remove()` and
> `file_change()` instead. Returning `diff_filepair` doesn't seems to make
> much sense in the context of `file_add_remove()` and `file_change()` as
> no filepairs ever get queued, so I opted to factor out the logic into
> separate functions instead of adapting the function signatures for all.
> 
> This may not be the best option, so I can also change it if that is
> best.

Okay. This is context that should probably be part of the commit
message, as it is quite an important detail to understand the
implementation.

Patrick
diff mbox series

Patch

diff --git a/diff.c b/diff.c
index 019fb893a7..afbb892c26 100644
--- a/diff.c
+++ b/diff.c
@@ -7157,16 +7157,18 @@  void compute_diffstat(struct diff_options *options,
 	options->found_changes = !!diffstat->nr;
 }
 
-void diff_addremove(struct diff_options *options,
-		    int addremove, unsigned mode,
-		    const struct object_id *oid,
-		    int oid_valid,
-		    const char *concatpath, unsigned dirty_submodule)
+struct diff_filepair *diff_filepair_addremove(struct diff_options *options,
+					      int addremove, unsigned mode,
+					      const struct object_id *oid,
+					      int oid_valid,
+					      const char *concatpath,
+					      unsigned dirty_submodule)
 {
 	struct diff_filespec *one, *two;
+	struct diff_filepair *pair;
 
 	if (S_ISGITLINK(mode) && is_submodule_ignored(concatpath, options))
-		return;
+		return NULL;
 
 	/* This may look odd, but it is a preparation for
 	 * feeding "there are unchanged files which should
@@ -7186,7 +7188,7 @@  void diff_addremove(struct diff_options *options,
 
 	if (options->prefix &&
 	    strncmp(concatpath, options->prefix, options->prefix_length))
-		return;
+		return NULL;
 
 	one = alloc_filespec(concatpath);
 	two = alloc_filespec(concatpath);
@@ -7198,25 +7200,28 @@  void diff_addremove(struct diff_options *options,
 		two->dirty_submodule = dirty_submodule;
 	}
 
-	diff_queue(&diff_queued_diff, one, two);
+	pair = diff_queue(&diff_queued_diff, one, two);
 	if (!options->flags.diff_from_contents)
 		options->flags.has_changes = 1;
+
+	return pair;
 }
 
-void diff_change(struct diff_options *options,
-		 unsigned old_mode, unsigned new_mode,
-		 const struct object_id *old_oid,
-		 const struct object_id *new_oid,
-		 int old_oid_valid, int new_oid_valid,
-		 const char *concatpath,
-		 unsigned old_dirty_submodule, unsigned new_dirty_submodule)
+struct diff_filepair *diff_filepair_change(struct diff_options *options,
+					   unsigned old_mode, unsigned new_mode,
+					   const struct object_id *old_oid,
+					   const struct object_id *new_oid,
+					   int old_oid_valid, int new_oid_valid,
+					   const char *concatpath,
+					   unsigned old_dirty_submodule,
+					   unsigned new_dirty_submodule)
 {
 	struct diff_filespec *one, *two;
 	struct diff_filepair *p;
 
 	if (S_ISGITLINK(old_mode) && S_ISGITLINK(new_mode) &&
 	    is_submodule_ignored(concatpath, options))
-		return;
+		return NULL;
 
 	if (options->flags.reverse_diff) {
 		SWAP(old_mode, new_mode);
@@ -7227,7 +7232,7 @@  void diff_change(struct diff_options *options,
 
 	if (options->prefix &&
 	    strncmp(concatpath, options->prefix, options->prefix_length))
-		return;
+		return NULL;
 
 	one = alloc_filespec(concatpath);
 	two = alloc_filespec(concatpath);
@@ -7238,16 +7243,39 @@  void diff_change(struct diff_options *options,
 	p = diff_queue(&diff_queued_diff, one, two);
 
 	if (options->flags.diff_from_contents)
-		return;
+		return p;
 
 	if (options->flags.quick && options->skip_stat_unmatch &&
 	    !diff_filespec_check_stat_unmatch(options->repo, p)) {
 		diff_free_filespec_data(p->one);
 		diff_free_filespec_data(p->two);
-		return;
+		return p;
 	}
 
 	options->flags.has_changes = 1;
+
+	return p;
+}
+
+void diff_addremove(struct diff_options *options, int addremove, unsigned mode,
+		    const struct object_id *oid, int oid_valid,
+		    const char *concatpath, unsigned dirty_submodule)
+{
+	diff_filepair_addremove(options, addremove, mode, oid, oid_valid,
+				concatpath, dirty_submodule);
+}
+
+void diff_change(struct diff_options *options,
+		 unsigned old_mode, unsigned new_mode,
+		 const struct object_id *old_oid,
+		 const struct object_id *new_oid,
+		 int old_oid_valid, int new_oid_valid,
+		 const char *concatpath,
+		 unsigned old_dirty_submodule, unsigned new_dirty_submodule)
+{
+	diff_filepair_change(options, old_mode, new_mode, old_oid, new_oid,
+			     old_oid_valid, new_oid_valid, concatpath,
+			     old_dirty_submodule, new_dirty_submodule);
 }
 
 struct diff_filepair *diff_unmerge(struct diff_options *options, const char *path)
diff --git a/diff.h b/diff.h
index 0a566f5531..6ea63f01e7 100644
--- a/diff.h
+++ b/diff.h
@@ -508,6 +508,21 @@  void diff_set_default_prefix(struct diff_options *options);
 
 int diff_can_quit_early(struct diff_options *);
 
+struct diff_filepair *diff_filepair_addremove(struct diff_options *,
+					      int addremove, unsigned mode,
+					      const struct object_id *oid,
+					      int oid_valid, const char *fullpath,
+					      unsigned dirty_submodule);
+
+struct diff_filepair *diff_filepair_change(struct diff_options *,
+					   unsigned mode1, unsigned mode2,
+					   const struct object_id *old_oid,
+					   const struct object_id *new_oid,
+					   int old_oid_valid, int new_oid_valid,
+					   const char *fullpath,
+					   unsigned dirty_submodule1,
+					   unsigned dirty_submodule2);
+
 void diff_addremove(struct diff_options *,
 		    int addremove,
 		    unsigned mode,