mbox series

[0/3] More add_submodule_odb() cleanup in merge code

Message ID cover.1631123754.git.jonathantanmy@google.com (mailing list archive)
Headers show
Series More add_submodule_odb() cleanup in merge code | expand

Message

Jonathan Tan Sept. 8, 2021, 6:18 p.m. UTC
(CC-ing Elijah in case he has insight into the merge-ort part.)

While continuing the effort into removing add_submodule_odb() (as part
of the submodule partial clone effort) I came across this part of the
merge code that specifies the repository being operated on in two ways:
one as a struct repository pointer and the other as a path. This patch
set unifies the two.

I normally would not send refactoring patches unless I also have a
feature patch that uses the results of said refactoring, but in this
case, I think that these patches are worth having since they clarify a
potentially unclear part of the API.

Note that these patches mean that the merging code no longer supports
submodules that have their .git dirs in the worktree, but from what I
can tell, this seems to be the direction we're going in
(repo_submodule_init() does not support such submodules).

Patch 3 is included to show how I'm verifying some things. Including
something like that in the master branch would probably require
conditional compilation (to exclude the additional field in struct
object used for checking, among other things), so I'm just including it
here for informational purposes.

All these patches work under GIT_TEST_MERGE_ALGORITHM=recursive and
GIT_TEST_MERGE_ALGORITHM=ort (and when that envvar is unset, for good
measure).

Jonathan Tan (3):
  t6437: run absorbgitdirs on repos
  revision: remove "submodule" from opt struct
  DO NOT SUBMIT commit-reach,revision: verify non-mixing

 alloc.c                    |  2 +
 commit-reach.c             | 60 +++++++++++++++++-------
 merge-ort.c                | 55 +++++++++++++++-------
 merge-recursive.c          | 51 +++++++++++++-------
 object.h                   |  1 +
 revision.c                 | 96 ++++++++++++++++++++++----------------
 revision.h                 |  1 -
 t/t6437-submodule-merge.sh |  9 ++--
 8 files changed, 179 insertions(+), 96 deletions(-)

Comments

Elijah Newren Sept. 9, 2021, 3:26 p.m. UTC | #1
On Wed, Sep 8, 2021 at 11:18 AM Jonathan Tan <jonathantanmy@google.com> wrote:
>
> (CC-ing Elijah in case he has insight into the merge-ort part.)

All the submodule merging related functions were lifted from
merge-recursive.c and minimally modified to fit the new structure.
The only substantive change I made was to fix the merge result for
virtual merge bases, but that's like one or two lines of code.  In
particular, everything relating to submodule objects was totally
untouched...and I think that's reflected in the fact that your PATCH 2
basically is the same patch twice, once for merge-recursive and once
for merge-ort.

I read over PATCH 2 and I didn't find anything that looked
problematic, but I'm not up-to-speed on the add_submodule_odb and repo
handling bits of the codebase so I'm not sure I would catch anything.
But I am encouraged by the fact that it looks like you did the same
stuff to merge-recursive and merge-ort; I'd be worried you missed
something if that weren't the case.


As a sidenote, though...

This does remind me that I noticed that the following functions from
object-store.h do not take an explicit repository:

write_object_file()
hash_object_file()
hash_object_file_literally()
force_object_loose()

I have a patch sitting around somewhere (possibly only still
referenced in my 'temp' branch) to make repo_*() variants of the above
functions, and turn the above into simple wrappers of the repo_*()
variants which just pass the_repository (much like someone else did
with read_object_file() and repo_read_object_file()).  It also updates
merge-ort to use the new repo_*() functions.  However, I ended up
excluding it from my merge-ort submissions since it wasn't necessary.
Would this be of interest in your submodule work, though?  I guess
it'd only matter if we started doing real merges of submodules as part
of a parent repo merge.  (As opposed to the fast-forward-only merges
that merge-recursive/merge-ort do right now for submodules.)

> While continuing the effort into removing add_submodule_odb() (as part
> of the submodule partial clone effort) I came across this part of the
> merge code that specifies the repository being operated on in two ways:
> one as a struct repository pointer and the other as a path. This patch
> set unifies the two.
>
> I normally would not send refactoring patches unless I also have a
> feature patch that uses the results of said refactoring, but in this
> case, I think that these patches are worth having since they clarify a
> potentially unclear part of the API.
>
> Note that these patches mean that the merging code no longer supports
> submodules that have their .git dirs in the worktree, but from what I
> can tell, this seems to be the direction we're going in
> (repo_submodule_init() does not support such submodules).
>
> Patch 3 is included to show how I'm verifying some things. Including
> something like that in the master branch would probably require
> conditional compilation (to exclude the additional field in struct
> object used for checking, among other things), so I'm just including it
> here for informational purposes.
>
> All these patches work under GIT_TEST_MERGE_ALGORITHM=recursive and
> GIT_TEST_MERGE_ALGORITHM=ort (and when that envvar is unset, for good
> measure).
>
> Jonathan Tan (3):
>   t6437: run absorbgitdirs on repos
>   revision: remove "submodule" from opt struct
>   DO NOT SUBMIT commit-reach,revision: verify non-mixing
>
>  alloc.c                    |  2 +
>  commit-reach.c             | 60 +++++++++++++++++-------
>  merge-ort.c                | 55 +++++++++++++++-------
>  merge-recursive.c          | 51 +++++++++++++-------
>  object.h                   |  1 +
>  revision.c                 | 96 ++++++++++++++++++++++----------------
>  revision.h                 |  1 -
>  t/t6437-submodule-merge.sh |  9 ++--
>  8 files changed, 179 insertions(+), 96 deletions(-)
>
> --
> 2.33.0.309.g3052b89438-goog
>
Jonathan Tan Sept. 9, 2021, 5:51 p.m. UTC | #2
> On Wed, Sep 8, 2021 at 11:18 AM Jonathan Tan <jonathantanmy@google.com> wrote:
> >
> > (CC-ing Elijah in case he has insight into the merge-ort part.)
> 
> All the submodule merging related functions were lifted from
> merge-recursive.c and minimally modified to fit the new structure.
> The only substantive change I made was to fix the merge result for
> virtual merge bases, but that's like one or two lines of code.  In
> particular, everything relating to submodule objects was totally
> untouched...and I think that's reflected in the fact that your PATCH 2
> basically is the same patch twice, once for merge-recursive and once
> for merge-ort.
> 
> I read over PATCH 2 and I didn't find anything that looked
> problematic, but I'm not up-to-speed on the add_submodule_odb and repo
> handling bits of the codebase so I'm not sure I would catch anything.
> But I am encouraged by the fact that it looks like you did the same
> stuff to merge-recursive and merge-ort; I'd be worried you missed
> something if that weren't the case.

Thanks for taking a look.

> As a sidenote, though...
> 
> This does remind me that I noticed that the following functions from
> object-store.h do not take an explicit repository:
> 
> write_object_file()
> hash_object_file()
> hash_object_file_literally()
> force_object_loose()
> 
> I have a patch sitting around somewhere (possibly only still
> referenced in my 'temp' branch) to make repo_*() variants of the above
> functions, and turn the above into simple wrappers of the repo_*()
> variants which just pass the_repository (much like someone else did
> with read_object_file() and repo_read_object_file()).  It also updates
> merge-ort to use the new repo_*() functions.  However, I ended up
> excluding it from my merge-ort submissions since it wasn't necessary.
> Would this be of interest in your submodule work, though?  I guess
> it'd only matter if we started doing real merges of submodules as part
> of a parent repo merge.  (As opposed to the fast-forward-only merges
> that merge-recursive/merge-ort do right now for submodules.)

I think that these functions would be useful if we were to write into
submodules (for example, in a real merge of submodules, as you said).
Right now my submodule work is just making it support partial clones, so
I'm not planning to introduce any new submodule writes. (Well, besides
the fact that a lazy-fetch in a submodule would end up with new objects,
but that's already handled because we do the fetch in a separate process
in the submodule's gitdir.)