mbox series

[v10,00/15] Upstreaming the Scalar command

Message ID pull.1005.v10.git.1638538470.gitgitgadget@gmail.com (mailing list archive)
Headers show
Series Upstreaming the Scalar command | expand

Message

John Passaro via GitGitGadget Dec. 3, 2021, 1:34 p.m. UTC
tl;dr: This series contributes the core part of the Scalar command to the
Git project. This command provides a convenient way to clone/initialize very
large repositories (think: monorepos).

Note: This patch series' focus is entirely on Scalar, on choosing sensible
defaults and offering a delightful user experience around working with
monorepos, and not about changing any existing paradigms for contrib/ (even
if catching up on the mail thread is likely to give interested readers that
false impression).

Changes since v9:

 * The patches to build Scalar and run its tests as part of Git's CI/PR,
   have been dropped because a recent unrelated patch series does not
   interact well with them.

Changes since v8:

 * The rebase on top of v2.34.0, which changed the default merge strategy to
   ORT, should have changed the default for merge.renames to true. This is
   now the case.
 * Accommodate preemptively for ab/ci-updates which invalidates assumptions
   made by this patch series that would still hold true with v2.34.0 but are
   no longer valid in seen and would trigger CI build breakages.

Changes since v7:

 * Clarified in the commit message why we cannot easily encapsulate the
   Scalar part of the CMake configuration in contrib/scalar/.
 * Improved the README.md.

Changes since v6:

 * Rebased on top of v2.34.0.
 * Inserted a commit that adds contrib/scalar/README.md, containing the
   roadmap of what I have planned for Scalar.
 * The Scalar test's definition of GIT_TEST_MAINT_SCHEDULER has been
   adjusted to accommodate for a change in v2.32.0..v2.34.0.
 * The config setting defaults now include fetch.showForcedUpdates=false,
   which has been identified as helping with a performance issue in large
   repositories.
 * To avoid mistaking the current patch series for being feature-complete
   enough to unleash onto end users, I moved the Makefile rules to build
   HTML/manual pages to a later patch series.
 * The patch that adds support for -c <key>=<value> and -C <directory> was
   moved to its own add-on patch series: While it is obvious that those
   options are valuable to have, an open question is whether there are other
   "pre-command" options in git that would be useful, too, and I would like
   to postpone that discussion to that date.
 * I added two patches that I had planned on keeping in an add-on patch
   series for later, to build and test Scalar as part of the CI. I am still
   not 100% certain that it is a good idea to do so already now, but let's
   see what the reviewers have to say.

Changes since v5:

 * Fixed the commit message talking about make -C contrib/scalar/Makefile.
 * Fixed the git ls-tree invocation suggested in the manual for scalar
   clone.
 * Invoking make -C contrib/scalar, then changing a source file of libgit.a
   and then immediately invoking make -C contrib/scalar again will now
   implicitly rebuild libgit.a.

Changes since v4:

 * scalar delete now refuses to delete anything if it was started from
   within the enlistment.
 * scalar delete releases any handles to the object store before deleting
   the enlistment.
 * The OBJECTS list in the Makefile will now include Scalar.
 * scalar register now supports secondary worktrees, in addition to the
   primary worktree.

Changes since v3:

 * Moved the "Changes since" section to the top, to make it easier to see
   what changed.
 * Reworded the commit message of the first patch.
 * Removed the [RFC] prefix because I did not hear any objections against
   putting this into contrib/.

Changes since v2:

 * Adjusted the description of the list command in the manual page , as
   suggested by Bagas.
 * Addressed two style nits in cmd_run().
 * The documentation of git reconfigure -a was improved.

Changes since v1:

 * A couple typos were fixed
 * The code parsing the output of ls-remote was made more readable
 * The indentation used in scalar.txt now consistently uses tabs
 * We no longer hard-code core.bare = false when registering with Scalar


Background
==========

Microsoft invested a lot of effort into scaling Git to the needs of the
Windows operating system source code. Based on the experience of the first
approach, VFS for Git, the Scalar project was started. Scalar specifically
has as its core goal to funnel all improvements into core Git.


The present
===========

The Scalar project provides a completely functional non-virtual experience
for monorepos. But why stop there. The Scalar project was designed to be a
self-destructing vehicle to allow those key concepts to be moved into core
Git itself for the benefit of all. For example, partial clone,
sparse-checkout, and scheduled background maintenance have already been
upstreamed and removed from Scalar proper. This patch series provides a
C-based implementation of the final remaining portions of the Scalar
command. This will make it easier for users to experiment with the Scalar
command. It will also make it substantially easier to experiment with moving
functionality from Scalar into core Git, while maintaining
backwards-compatibility for existing Scalar users.

The C-based Scalar has been shipped to Scalar users, and can be tested by
any interested reader: https://github.com/microsoft/git/releases/ (it offers
a Git for Windows installer, a macOS package and an Ubuntu package, Scalar
has been included since v2.33.0.vfs.0.0).


Next steps
==========

Since there are existing Scalar users, I want to ensure
backwards-compatibility with its existing command-line interface. Keeping
that in mind, everything in this series is up for discussion.

I obviously believe that Scalar brings a huge benefit, and think that it
would be ideal for all of Scalar's learnings to end up in git clone/git
init/git maintenance eventually. It is also conceivable, however, that the
scalar command could graduate to be a core part of Git at some stage in the
future (such a decision would probably depend highly on users' feedback).
See also the discussion about the architecture of Scalar
[https://lore.kernel.org/git/b67bbef4-e4c3-b6a7-1c7f-7d405902ef8b@gmail.com/],
kicked off by Stolee.

On top of this patch series, I have lined up a few more:

 1. Implement a scalar diagnose command.
 2. Use the built-in FSMonitor (that patch series obviously needs to wait
    for FSMonitor to be integrated).
 3. Modify the config machinery to be more generous about concurrent writes,
    say, to the user-wide config.
 4. A few patches to optionally build and install scalar as part of a
    regular Git install (also teaching git help scalar to find the Scalar
    documentation

These are included in my vfs-with-scalar branch thicket
[https://github.com/dscho/git/commits/vfs-with-scalar]. On top of that, this
branch thicket also includes patches I do not plan on upstreaming, mainly
because they are too specific either to VFS for Git, or they support Azure
Repos (which does not offer partial clones but speaks the GVFS protocol,
which can be used to emulate partial clones).

One other thing is very interesting about that vfs-with-scalar branch
thicket: it contains a GitHub workflow which will run Scalar's quite
extensive Functional Tests suite. This test suite is quite comprehensive and
caught us a lot of bugs in the past, not only in the Scalar code, but also
core Git.


Epilogue
========

Now, to address some questions that I imagine every reader has who made it
this far:

 * Why not put the Scalar functionality directly into core Git, even a
   built-in? I wanted to provide an easy way for Git contributors to "play
   with" Scalar, without forcing a new top-level command into Git.
 * Why implement the Scalar command in the Git code base? Apart from
   simplifying Scalar maintenance in the Microsoft port of Git, the tight
   version coupling between Git and Scalar reduces the maintenance burden
   even further. Besides, I believe that it will make it much easier to
   shift functionality from Scalar into core Git, once we took the hurdle of
   accepting the Scalar code into the code base.
 * Why contribute Scalar to the Git project? We are biased, of course, yet
   our data-driven approach provides evidence that Scalar helps handling
   huge repositories with ease. By contributing it to the core Git project,
   we are able to share it with more users, especially some users who do not
   want to install Microsoft's fork of Git. We also hope that a lot of
   Scalar (maybe all of it) will end up in core Git, to benefit even more
   users.

Derrick Stolee (4):
  scalar: 'register' sets recommended config and starts maintenance
  scalar: 'unregister' stops background maintenance
  scalar: implement 'scalar list'
  scalar: implement the `run` command

Johannes Schindelin (10):
  scalar: add a README with a roadmap
  scalar: create a rudimentary executable
  scalar: start documenting the command
  scalar: create test infrastructure
  scalar: let 'unregister' handle a deleted enlistment directory
    gracefully
  scalar: implement the `clone` subcommand
  scalar: teach 'clone' to support the --single-branch option
  scalar: allow reconfiguring an existing enlistment
  scalar: teach 'reconfigure' to optionally handle all registered
    enlistments
  scalar: implement the `version` command

Matthew John Cheetham (1):
  scalar: implement the `delete` command

 Makefile                         |   9 +
 contrib/scalar/.gitignore        |   2 +
 contrib/scalar/Makefile          |  45 ++
 contrib/scalar/README.md         |  82 +++
 contrib/scalar/scalar.c          | 826 +++++++++++++++++++++++++++++++
 contrib/scalar/scalar.txt        | 145 ++++++
 contrib/scalar/t/Makefile        |  78 +++
 contrib/scalar/t/t9099-scalar.sh |  88 ++++
 8 files changed, 1275 insertions(+)
 create mode 100644 contrib/scalar/.gitignore
 create mode 100644 contrib/scalar/Makefile
 create mode 100644 contrib/scalar/README.md
 create mode 100644 contrib/scalar/scalar.c
 create mode 100644 contrib/scalar/scalar.txt
 create mode 100644 contrib/scalar/t/Makefile
 create mode 100755 contrib/scalar/t/t9099-scalar.sh


base-commit: cd3e606211bb1cf8bc57f7d76bab98cc17a150bc
Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-1005%2Fdscho%2Fscalar-the-beginning-v10
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-1005/dscho/scalar-the-beginning-v10
Pull-Request: https://github.com/gitgitgadget/git/pull/1005

Range-diff vs v9:

  1:  3a2e28275f1 =  1:  3a2e28275f1 scalar: add a README with a roadmap
  2:  50160d61a41 =  2:  50160d61a41 scalar: create a rudimentary executable
  3:  74cd6410931 =  3:  74cd6410931 scalar: start documenting the command
  4:  37231a4dd07 =  4:  37231a4dd07 scalar: create test infrastructure
  5:  a39b9c81214 <  -:  ----------- cmake: optionally build `scalar`, too
  6:  8c6762def30 <  -:  ----------- ci: also run the `scalar` tests
  7:  936ee0475ad =  5:  4439ab4de0b scalar: 'register' sets recommended config and starts maintenance
  8:  09a15f86c3d =  6:  376056066a0 scalar: 'unregister' stops background maintenance
  9:  42121a5764d =  7:  c865e89beb3 scalar: let 'unregister' handle a deleted enlistment directory gracefully
 10:  6afb2eb4163 =  8:  3f8b0abd7d6 scalar: implement 'scalar list'
 11:  dd4e3a4b761 =  9:  60659c47196 scalar: implement the `clone` subcommand
 12:  abd9c8827cd = 10:  45aca840764 scalar: teach 'clone' to support the --single-branch option
 13:  5601f82dbe1 = 11:  15e649a1734 scalar: implement the `run` command
 14:  08e4f548aa8 = 12:  2a3fb40bd9a scalar: allow reconfiguring an existing enlistment
 15:  0cec6dbd2cb = 13:  efd808a0c4a scalar: teach 'reconfigure' to optionally handle all registered enlistments
 16:  835f1c79792 = 14:  8b69462b906 scalar: implement the `delete` command
 17:  4ee1b701c7b = 15:  b5f416d79b4 scalar: implement the `version` command

Comments

Elijah Newren Dec. 3, 2021, 3:48 p.m. UTC | #1
On Fri, Dec 3, 2021 at 5:34 AM Johannes Schindelin via GitGitGadget
<gitgitgadget@gmail.com> wrote:
>
> tl;dr: This series contributes the core part of the Scalar command to the
> Git project. This command provides a convenient way to clone/initialize very
> large repositories (think: monorepos).
>
> Note: This patch series' focus is entirely on Scalar, on choosing sensible
> defaults and offering a delightful user experience around working with
> monorepos, and not about changing any existing paradigms for contrib/ (even
> if catching up on the mail thread is likely to give interested readers that
> false impression).
>
> Changes since v9:
>
>  * The patches to build Scalar and run its tests as part of Git's CI/PR,
>    have been dropped because a recent unrelated patch series does not
>    interact well with them.


i.e. basically undoing this:

...
> Changes since v6:
...
>  * I added two patches that I had planned on keeping in an add-on patch
>    series for later, to build and test Scalar as part of the CI. I am still
>    not 100% certain that it is a good idea to do so already now, but let's
>    see what the reviewers have to say.

...and returning to the original plan:

...
> On top of this patch series, I have lined up a few more:
...
>  4. A few patches to optionally build and install scalar as part of a
>     regular Git install (also teaching git help scalar to find the Scalar
>     documentation

Avoiding the issues and adding the CI later seems reasonable to me.
You addressed the last of my points in v9; I think this version is
good to go.  But one quick comment...

> These are included in my vfs-with-scalar branch thicket
> [https://github.com/dscho/git/commits/vfs-with-scalar]. On top of that, this
> branch thicket also includes patches I do not plan on upstreaming, mainly
> because they are too specific either to VFS for Git, or they support Azure
> Repos (which does not offer partial clones but speaks the GVFS protocol,
> which can be used to emulate partial clones).
>
> One other thing is very interesting about that vfs-with-scalar branch
> thicket: it contains a GitHub workflow which will run Scalar's quite
> extensive Functional Tests suite. This test suite is quite comprehensive and
> caught us a lot of bugs in the past, not only in the Scalar code, but also
> core Git.

From your wording it sounds like the plan might not include moving
these tests over.  Perhaps it doesn't make sense to move them all
over, but since they've caught problems in both Scalar and core Git,
it would be nice to see many of those tests come to Git as well as
part of a future follow on series.
Junio C Hamano Dec. 5, 2021, 10:02 a.m. UTC | #2
Elijah Newren <newren@gmail.com> writes:

> From your wording it sounds like the plan might not include moving
> these tests over.  Perhaps it doesn't make sense to move them all
> over, but since they've caught problems in both Scalar and core Git,
> it would be nice to see many of those tests come to Git as well as
> part of a future follow on series.

Yeah, we may be initially queuing this without tests for expediency,
but a production code cannot go forever without CI tests to ensure
continued code health.  People make changes in other parts of the
system Scalar may depend on and unknowingly break some assumption
Scalar makes on it.
Ævar Arnfjörð Bjarmason Dec. 7, 2021, 8:05 p.m. UTC | #3
On Sun, Dec 05 2021, Junio C Hamano wrote:

> Elijah Newren <newren@gmail.com> writes:
>
>> From your wording it sounds like the plan might not include moving
>> these tests over.  Perhaps it doesn't make sense to move them all
>> over, but since they've caught problems in both Scalar and core Git,
>> it would be nice to see many of those tests come to Git as well as
>> part of a future follow on series.
>
> Yeah, we may be initially queuing this without tests for expediency,
> but a production code cannot go forever without CI tests to ensure
> continued code health.  People make changes in other parts of the
> system Scalar may depend on and unknowingly break some assumption
> Scalar makes on it.

In this case there really isn't any reason not to have the tests go in
at the same time. The explanation in the v10 CL is:
    
    Changes since v9:
    
     * The patches to build Scalar and run its tests as part of Git's CI/PR,
       have been dropped because a recent unrelated patch series does not
       interact well with them.

That assessment isn't correct.

The change in v8->v9 of adding a "make &&" before the "test" was only
necessary because of a logic error in the v8 version. Yes it broke
because the "scalar test" target didn't know how to build its
prerequisites, but the real underlying issue is that it was even trying
at that point. It had no business running in the static-analysis target
where we hadn't built git already.

Now v9->v10 has
dropped the tests entirely, allegedly due to an interaction with my
ab/ci-updates, but there's nothing new there that isn't also the case on
"master".

But we can have our cake and eat it too.

The below patch on top of v9 would make the scalar tests do the right
thing. I.e. whenever we do a "make test" we'll run the scalar tests
too.

The code changes somewhat with ab/ci-updates, but the conflict with
js/scalar is mostly textual, not semantic (and as I've pointed out, to
the extent that ab/ci-updates changed anything it made things a bit
better for js/scalar).

I'd really like to see this scalar series land, but I really don't see
why it's necessary to entirety eject the CI test coverage due to what's
a rather trivilly solved issue.

As I've noted ad-nauseum at this point I think the necessity for the
below patch is rather silly, this should just nicely integrate with
"make test", but <brokenrecord.gif>. But even without that IMO better
approach it's clearly rather trivial to make this series have test
coverage.

It was just broken because it added a test run to the "pedantic" run,
and didn't properly integrate with the multi-"make test" runs on
"master" , both of which are addressed by the patch below.

diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 2ef9fbfdd38..af99699f82b 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -15,6 +15,26 @@ then
 	export DEVOPTS=pedantic
 fi
 
+make() {
+	scalar_tests=
+	for target
+	do
+		if test $target = "test"
+		then
+			scalar_tests=t
+		fi
+	done
+
+	# Do whatever we would have done with "make"
+	command make "$@"
+
+	# Running tests? Run scalar tests too
+	if test -n "$scalar_tests"
+	then
+		command make -C contrib/scalar test
+	fi
+}
+
 make
 case "$jobname" in
 linux-gcc)
@@ -52,6 +72,4 @@ esac
 
 check_unignored_build_artifacts
 
-make && make -C contrib/scalar test
-
 save_good_tree
Johannes Schindelin Dec. 8, 2021, 11:15 a.m. UTC | #4
Hi Junio,

On Sun, 5 Dec 2021, Junio C Hamano wrote:

> Elijah Newren <newren@gmail.com> writes:
>
> > From your wording it sounds like the plan might not include moving
> > these tests over.  Perhaps it doesn't make sense to move them all
> > over, but since they've caught problems in both Scalar and core Git,
> > it would be nice to see many of those tests come to Git as well as
> > part of a future follow on series.
>
> Yeah, we may be initially queuing this without tests for expediency,
> but a production code cannot go forever without CI tests to ensure
> continued code health.  People make changes in other parts of the
> system Scalar may depend on and unknowingly break some assumption
> Scalar makes on it.

The Scalar Functional Tests were designed with Azure Repos in mind, i.e.
they specifically verify that the `gvfs-helper` (emulating Partial Clone
using the predecessor of Partial Clone, the GVFS protocol) manages to
access the repositories in the intended way.

I do not know off-hand how entangled the GVFS part is in the test suite,
but from what I recall, every single test starts with cloning a test
repository. From Azure Repos. Using the `gvfs-helper`.

Which means that the `gvfs-helper` would need to be upstreamed and be
maintained in the git.git repository proper.

Previously I was under the impression that that might be met with grumpy
rejection.

I do realize, though, that clarity of intention has been missing from this
mail thread all around, so let me ask point blank: Junio, do you want me
to include upstreaming `gvfs-helper` in the overall Scalar plan?

Ciao,
Dscho
Ævar Arnfjörð Bjarmason Dec. 8, 2021, 1:04 p.m. UTC | #5
On Wed, Dec 08 2021, Johannes Schindelin wrote:

> Hi Junio,
>
> On Sun, 5 Dec 2021, Junio C Hamano wrote:
>
>> Elijah Newren <newren@gmail.com> writes:
>>
>> > From your wording it sounds like the plan might not include moving
>> > these tests over.  Perhaps it doesn't make sense to move them all
>> > over, but since they've caught problems in both Scalar and core Git,
>> > it would be nice to see many of those tests come to Git as well as
>> > part of a future follow on series.
>>
>> Yeah, we may be initially queuing this without tests for expediency,
>> but a production code cannot go forever without CI tests to ensure
>> continued code health.  People make changes in other parts of the
>> system Scalar may depend on and unknowingly break some assumption
>> Scalar makes on it.
>
> The Scalar Functional Tests were designed with Azure Repos in mind, i.e.
> they specifically verify that the `gvfs-helper` (emulating Partial Clone
> using the predecessor of Partial Clone, the GVFS protocol) manages to
> access the repositories in the intended way.
>
> I do not know off-hand how entangled the GVFS part is in the test suite,
> but from what I recall, every single test starts with cloning a test
> repository. From Azure Repos. Using the `gvfs-helper`.
>
> Which means that the `gvfs-helper` would need to be upstreamed and be
> maintained in the git.git repository proper.
>
> Previously I was under the impression that that might be met with grumpy
> rejection.
>
> I do realize, though, that clarity of intention has been missing from this
> mail thread all around, so let me ask point blank: Junio, do you want me
> to include upstreaming `gvfs-helper` in the overall Scalar plan?

An alternate way would be be to have our own tests build git, and then
clone and build those third party repos and test them.

I had a patch to do that for git-annex. I think it would be a good idea
to pursue it in general for prominent downstream projects as part of
some extended integration testing:
https://lore.kernel.org/git/20170516203712.15921-1-avarab@gmail.com/
Derrick Stolee Dec. 8, 2021, 2:17 p.m. UTC | #6
On 12/8/2021 6:15 AM, Johannes Schindelin wrote:
> Hi Junio,
> 
> On Sun, 5 Dec 2021, Junio C Hamano wrote:
> 
>> Elijah Newren <newren@gmail.com> writes:
>>
>>> From your wording it sounds like the plan might not include moving
>>> these tests over.  Perhaps it doesn't make sense to move them all
>>> over, but since they've caught problems in both Scalar and core Git,
>>> it would be nice to see many of those tests come to Git as well as
>>> part of a future follow on series.

Moving the C# test suite over doesn't make a lot of sense. We also
are re-using the test suite from VFS for Git, which is probably overkill
here. Those tests were created due to issues that arose with the virtual
filesystem (paired with the GVFS protocol for finding missing objects)
and most of them probably don't test anything interesting in Scalar.

When we _do_ find something interesting in that suite, we port over the
test as a normal Git test so the regression is avoided in the future.

We work to test the -rc0 version of every release with our custom patches
in microsoft/git and then run them through the Scalar and VFS for Git
functional tests as a necessary step before releasing to our internal
users. Since we are doing that already, it is a better use of time to
port tests that actually matter when they come up rather than port the
entire test suite.

>> Yeah, we may be initially queuing this without tests for expediency,
>> but a production code cannot go forever without CI tests to ensure
>> continued code health.  People make changes in other parts of the
>> system Scalar may depend on and unknowingly break some assumption
>> Scalar makes on it.

I think it is important to keep in mind that the Scalar features that
are being submitted here are getting Git-style tests included. The only
thing that is missing right now is a firm link with Git's CI system,
which can be added quickly once things have calmed down in the build
system.

If we are interested in doing something more substantial that is
closer to the Scalar functional tests, then it is important to know
that those tests are running against a production server to clone
data and fetch it dynamically throughout. That is not exactly something
we have done in the Git test suite before.

In fact, I don't think Scalar introduces anything novel here: if we
want to add more coverage of running Git commands while in a
sparse-checkout _and_ partial clone _and_ have a lot of optional config
set, then we can do that independently of Scalar. 'scalar clone' just
sets up a repository in a state that an expert user could do themselves,
so should we spend a lot of effort creating that environment in our
test suite?

We have this already in some form:

1. t1091 and t1092 try to cover important sparse-checkout behavior.

2. t0410, t5616, and others try to cover important partial clone
   behavior.

3. Our GIT_TEST_* variables that are enabled in one of our CI runs
   test many of the advanced config options enabled by Scalar.

The thing that is missing is "all of these things at once" which
would be difficult to do across the test suite with our current test
design. I'm happy to provide the service of checking the Scalar
functional tests before each release as an expensive way to check
that combination of configuration without adding that cost to every
CI run and developer inner loop.
 
> The Scalar Functional Tests were designed with Azure Repos in mind, i.e.
> they specifically verify that the `gvfs-helper` (emulating Partial Clone
> using the predecessor of Partial Clone, the GVFS protocol) manages to
> access the repositories in the intended way.
> 
> I do not know off-hand how entangled the GVFS part is in the test suite,
> but from what I recall, every single test starts with cloning a test
> repository. From Azure Repos. Using the `gvfs-helper`.

There is a single test that checks that 'scalar clone' against github.com
works appropriately [1]. We don't duplicate all of the other tests in
this environment.

[1] https://github.com/microsoft/scalar/blob/68b6e70d77f1c7c13be9f35848a042604f3fb2f1/Scalar.FunctionalTests/Tests/MultiEnlistmentTests/ScalarCloneFromGithub.cs

> Which means that the `gvfs-helper` would need to be upstreamed and be
> maintained in the git.git repository proper.
> 
> Previously I was under the impression that that might be met with grumpy
> rejection.
> 
> I do realize, though, that clarity of intention has been missing from this
> mail thread all around, so let me ask point blank: Junio, do you want me
> to include upstreaming `gvfs-helper` in the overall Scalar plan?

I, for one, don't think that has much value for the core Git project.

Thanks,
-Stolee
Elijah Newren Dec. 8, 2021, 6:29 p.m. UTC | #7
I know this was directed to Junio, but I feel like it was my earlier
comment that accidentally opened this can of worms, so if my opinion
helps resolve it at all...

On Wed, Dec 8, 2021 at 3:15 AM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Junio,
>
> On Sun, 5 Dec 2021, Junio C Hamano wrote:
>
> > Elijah Newren <newren@gmail.com> writes:
> >
> > > From your wording it sounds like the plan might not include moving
> > > these tests over.  Perhaps it doesn't make sense to move them all
> > > over, but since they've caught problems in both Scalar and core Git,
> > > it would be nice to see many of those tests come to Git as well as
> > > part of a future follow on series.
> >
> > Yeah, we may be initially queuing this without tests for expediency,
> > but a production code cannot go forever without CI tests to ensure
> > continued code health.  People make changes in other parts of the
> > system Scalar may depend on and unknowingly break some assumption
> > Scalar makes on it.
>
> The Scalar Functional Tests were designed with Azure Repos in mind, i.e.
> they specifically verify that the `gvfs-helper` (emulating Partial Clone
> using the predecessor of Partial Clone, the GVFS protocol) manages to
> access the repositories in the intended way.
>
> I do not know off-hand how entangled the GVFS part is in the test suite,
> but from what I recall, every single test starts with cloning a test
> repository. From Azure Repos. Using the `gvfs-helper`.
>
> Which means that the `gvfs-helper` would need to be upstreamed and be
> maintained in the git.git repository proper.

Ah, sorry, I was remembering this from an earlier cover letter of yours:

"""
But it was realized that many of these key concepts were independent of the
actual VFS and its projection of the working directory. The Scalar project
was created to make that separation, refine the key concepts, and then
extract those features into the new Scalar command.
"""

when I read

"""
One other thing is very interesting about that vfs-with-scalar branch
thicket: it contains a GitHub workflow which will run Scalar's quite
extensive Functional Tests suite. This test suite is quite comprehensive and
caught us a lot of bugs in the past, not only in the Scalar code, but also
core Git.
"""

and I was thinking (despite the branch name) that you had some scalar
+ git (w/o gvfs) tests that were interesting but not planning to
upstream.  I agree that if they're gvfs + scalar + git then they make
sense to keep internal to your work, though I hope that for any bugs
your internal testcases find in git, that you find an upstreamable
testcase to submit.  I believe Stolee has done exactly that in the
past, so just more of that would be good.

> Previously I was under the impression that that might be met with grumpy
> rejection.
>
> I do realize, though, that clarity of intention has been missing from this
> mail thread all around, so let me ask point blank: Junio, do you want me
> to include upstreaming `gvfs-helper` in the overall Scalar plan?

I know I'm not Junio, but if my opinion matters, I don't think that
needs to be part of the plan.
Junio C Hamano Dec. 8, 2021, 7:55 p.m. UTC | #8
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

> It was just broken because it added a test run to the "pedantic" run,
> and didn't properly integrate with the multi-"make test" runs on
> "master" , both of which are addressed by the patch below.
>
> diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
> index 2ef9fbfdd38..af99699f82b 100755
> --- a/ci/run-build-and-tests.sh
> +++ b/ci/run-build-and-tests.sh
> @@ -15,6 +15,26 @@ then
>  	export DEVOPTS=pedantic
>  fi
>  
> +make() {
> +	scalar_tests=
> +	for target
> +	do
> +		if test $target = "test"
> +		then
> +			scalar_tests=t
> +		fi
> +	done
> +
> +	# Do whatever we would have done with "make"
> +	command make "$@"
> +
> +	# Running tests? Run scalar tests too
> +	if test -n "$scalar_tests"
> +	then
> +		command make -C contrib/scalar test
> +	fi
> +}
> +
>  make
>  case "$jobname" in
>  linux-gcc)
> @@ -52,6 +72,4 @@ esac
>  
>  check_unignored_build_artifacts
>  
> -make && make -C contrib/scalar test
> -
>  save_good_tree

That is an interesting way to demonstrate how orthogonal the issues
are, which in turn means that it is not such a big deal to add back
the coverage to the part that goes to contrib/scalar/.  As the actual
implementation, it is a bit too icky, though.
Junio C Hamano Dec. 9, 2021, 3:52 a.m. UTC | #9
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:

> The Scalar Functional Tests were designed with Azure Repos in mind, i.e.
> they specifically verify that the `gvfs-helper` (emulating Partial Clone
> using the predecessor of Partial Clone, the GVFS protocol) manages to
> access the repositories in the intended way.
> ...
> I do realize, though, that clarity of intention has been missing from this
> mail thread all around, so let me ask point blank: Junio, do you want me
> to include upstreaming `gvfs-helper` in the overall Scalar plan?

Sorry, I do not follow.

What I was lamenting about was the lack of CI test coverage of stuff
that is already being considered to go 'next'.  Specifically, since
contrib/scalar/Makefile in 'seen' has a 'test' target, it would be a
shame not to exercise it, when we should be able to do so in the CI
fairly easily.

I fail to see what gvfs-helper has to do with anything in the
context of advancing the js/scalar topic as we have today.  If "The
Scalar Functional Tests" that were designed with Azure Repos in mind
is not a good fit to come into contrib/scalar/, it is fine not to
have it here---lack of it would not make the test target you have in
contrib/scalar/Makefile any less valuable, I would think.

Unless you are saying that "make -C contrib/scalar test" is useless,
that is.  But I do not think that is the case.
Johannes Schindelin Dec. 11, 2021, 12:29 a.m. UTC | #10
Hi Junio,

On Wed, 8 Dec 2021, Junio C Hamano wrote:

> Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
>
> > The Scalar Functional Tests were designed with Azure Repos in mind, i.e.
> > they specifically verify that the `gvfs-helper` (emulating Partial Clone
> > using the predecessor of Partial Clone, the GVFS protocol) manages to
> > access the repositories in the intended way.
> > ...
> > I do realize, though, that clarity of intention has been missing from this
> > mail thread all around, so let me ask point blank: Junio, do you want me
> > to include upstreaming `gvfs-helper` in the overall Scalar plan?
>
> Sorry, I do not follow.

In
https://lore.kernel.org/git/CABPp-BGpe9Q5k22Yu8a=1xwu=pZYSeNQoqEgf+DN07cU4EB1ew@mail.gmail.com/
(i.e. in the great great grand parent of this mail), you specifically
replied to my mentioning Scalar's Functional Test suite:

	> > One other thing is very interesting about that vfs-with-scalar
	> > branch thicket: it contains a GitHub workflow which will run
	> > Scalar's quite extensive Functional Tests suite. This test
	> > suite is quite comprehensive and caught us a lot of bugs in
	> > the past, not only in the Scalar code, but also core Git.
	>
	> From your wording it sounds like the plan might not include
	> moving these tests over.  Perhaps it doesn't make sense to move
	> them all over, but since they've caught problems in both Scalar
	> and core Git, it would be nice to see many of those tests come
	> to Git as well as part of a future follow on series.

I had mentioned a couple of times that I had no intention to move Scalar's
Function Tests into contrib/scalar/, and your wording "it would be nice to
see many of those tests come to Git as well" made it sound as if you
disagreed with that intention.

But it was not a clear "please do port them over" nor a "nah, we don't
want that test suite implemented in C# and requiring, for the most part,
access to a dedicacted Azure Repo".

Hence I was asking for a clear answer to the question whether you want me
to spend time on preparing a patch series to contribute Scalar's
Functional Tests to contrib/scalar/ as well.

I _suspect_ your clear answer, if you are willing to give it as clearly,
to be "no, we do not do integration tests here, and besides, C# is not a
language we want to add to Git's tree".

> What I was lamenting about was the lack of CI test coverage of stuff
> that is already being considered to go 'next'.  Specifically, since
> contrib/scalar/Makefile in 'seen' has a 'test' target, it would be a
> shame not to exercise it, when we should be able to do so in the CI
> fairly easily.

We do have a very different understanding of "fairly easily" in that case.
Three iterations, and three weeks time spent on implementing what you
suggest, only to see broken by the merge of the `ab/ci-updates` patch
series, suggesting a fixup for the incorrect merge, seeing that fixup
rejected, and then more discussing, all of that does not strike me as
"fairly easily". It strikes me as "a lot of time and effort was spent,
mostly stepping on toes".

Granted, if `ab/ci-updates` would not have happened, it would have been
much easier. Or if `ab/ci-updates` had waited until `js/scalar` advanced
to `next`. But the way it happened was (unnecessarily?) un-easy.

> I fail to see what gvfs-helper has to do with anything in the
> context of advancing the js/scalar topic as we have today.

Okay, okay! I was just asking about gvfs-helper because that would be
required to port over Scalar's Functional Tests. The same Functional Tests
that I heard you mentioning would be "nice to see" to "come to Git as
well".

> If "The Scalar Functional Tests" that were designed with Azure Repos in
> mind is not a good fit to come into contrib/scalar/, it is fine not to
> have it here---lack of it would not make the test target you have in
> contrib/scalar/Makefile any less valuable, I would think.

The test target won't go anywhere, no worries. Just like the test target
in contrib/subtree/ does not go anywhere.

And just like `contrib/subtree/`, it does not have to be run as part of
Git's CI build.

> Unless you are saying that "make -C contrib/scalar test" is useless,
> that is.  But I do not think that is the case.

It is as useful as `make -C contrib/subtree test`. Which, as Ævar will
readily offer, is broken, because it does not ensure that top-level `make
all` is executed and therefore in a fresh checkout will fail.

Of course, I disagree that it is "broken". It works as designed. It is in
the contrib/ part of the tree, i.e. safely in the realm of "you have to
build Git first, and then the thing in contrib/". In other words, the idea
to "fix" this kind of "broken"ness is a solution in search of a problem.

And as I have said multiple times, I still think that having Scalar's code
in contrib/ is a good spot to experiment with it. It sends the right
signal of "this is not really something we promise to maintain just yet".
It is a logical place for code that developers can build themselves, but
that is not built and installed with Git by default.

Having it in the Git tree will give interested developers a chance who
want to clone a large repository on Linux, without having to touch
anything with "Microsoft" in its repository name.

Having it in the Git tree will give interested developers a chance to
experiment with things like "let's try to let `scalar clone` _not_
clone into `<enlistment>/src/`, but instead create a bare clone in
`<enlistment>/.git` and make `<enlistment>/src/` a worktree". Things like
that.

I would find those things quite a bit more useful than to force regular
Git contributors who want to change libgit.a (even if it is just pointless
refactoring) to pay attention to contrib/scalar/ in CI, when there is
still no clear answer whether Scalar will even become a first-class Git
command eventually (which I hope it will, of course).

Ciao,
Dscho
Ævar Arnfjörð Bjarmason Dec. 11, 2021, 1:07 a.m. UTC | #11
On Sat, Dec 11 2021, Johannes Schindelin wrote:

> Hi Junio,
> [...]
> We do have a very different understanding of "fairly easily" in that case.
> Three iterations, and three weeks time spent on implementing what you
> suggest, only to see broken by the merge of the `ab/ci-updates` patch
> series, suggesting a fixup for the incorrect merge, seeing that fixup
> rejected, and then more discussing, all of that does not strike me as
> "fairly easily". It strikes me as "a lot of time and effort was spent,
> mostly stepping on toes".

I sent you a working path to a fixup in [1] on the 23rd of November
where we won't go from running zero tests in compile-only to running
just the scalar test.

Junio replied[2] ("the above" referring to [1]):

    I think the above shows that it is a bug in the topic itself,

You didn't reply further in that fixup thread, and then your v9 re-roll
a week later still had the same issue[3] discussed therein. I again
pointed that out[4]:

    Is it intentional that the previously compile-only "pedantic" job is now
    running the scalar tests?

You didn't reply, but in your v10 decided to make the current iteration
of this series have no CI testing at all, and cited the interaction with
ab/ci-updates[4]:

    because a recent unrelated patch series does not interact well with them.

Which I think is clearly inaccurate, because...

> Granted, if `ab/ci-updates` would not have happened, it would have been
> much easier. Or if `ab/ci-updates` had waited until `js/scalar` advanced
> to `next`. But the way it happened was (unnecessarily?) un-easy.

...your initial patch to run the scalar tests in CI[5] was part of v7, and
had the issue described above. It pre-dates the v1 of ab/ci-updates
being on-list by a couple of days[6].

So yes, I do think it was "easy", as in that was an easy fix-up. You
just didn't follow up on it and submitted re-rolls with the already
noted breakage.

I don't blame you for that, maybe you were busy, it slipped through
etc.

But I don't accept that delays in this topic are my fault, or something
to the effect that that this whole saga represents some failure of the
review process.

Our topics textually/semantically conflicted, it happens. I offered a
fixup & way forward. Fixing it was trivial, and still is. You just
didn't follow-up.

> [...]
>> If "The Scalar Functional Tests" that were designed with Azure Repos in
>> mind is not a good fit to come into contrib/scalar/, it is fine not to
>> have it here---lack of it would not make the test target you have in
>> contrib/scalar/Makefile any less valuable, I would think.
>
> The test target won't go anywhere, no worries. Just like the test target
> in contrib/subtree/ does not go anywhere.
>
> And just like `contrib/subtree/`, it does not have to be run as part of
> Git's CI build.

But unlike contrib/completion, which we do run as part of Git's CI
build[7]?

>> Unless you are saying that "make -C contrib/scalar test" is useless,
>> that is.  But I do not think that is the case.
>
> It is as useful as `make -C contrib/subtree test`. Which, as Ævar will
> readily offer, is broken, because it does not ensure that top-level `make
> all` is executed and therefore in a fresh checkout will fail.

Before the scalar topic there was only one "make" entry point to build
libgit.a, contrib/scalar/Makefile makes that two. That was the immediate
prompt for the fixup discussion in [1].

So no, I won't offer that "make -C contrib/subtree test" is broken, it
doesn't try to build libgit.a and errors out right away if git isn't
built.

Your scalar patches do try, get most of the way there, and fail.

Your bicycle isn't broken if it doesn't make coffee, but if your fridge
has a built-in coffee maker and it doesn't work it's broken, at least as
it pertains to its coffee making function.

I think I made that distinction clear in [8], but apparently not clear
enough, as you seem to be under the impression that I was conveying the
opposite of the idea I was trying to get across.

> Of course, I disagree that it is "broken". It works as designed. It is in
> the contrib/ part of the tree, i.e. safely in the realm of "you have to
> build Git first, and then the thing in contrib/". In other words, the idea
> to "fix" this kind of "broken"ness is a solution in search of a problem.

I agree with that, but it's your proposed patches that contain the build
integration you're describing as unnecessary for "contrib/subtree/". In
v8->v8 of the series you changed the CI integration from:

    make -C contrib/scalar test

To:

    make && make -C contrib/scalar test

While keeping the bits in contrib/scalar/Makefile that made it go most
of the way towards a working "libgit.a" useful for testing, but it
breaks before we get everything we need to run the "test" target.

Which I find to be odd given the above comparison to contib/subtree/. If
you have to build git first at the top level why is it trying and
failing to build git? "contrib/subtree" doesn't.

> [...]
> I would find those things quite a bit more useful than to force regular
> Git contributors who want to change libgit.a (even if it is just pointless
> refactoring) to pay attention to contrib/scalar/ in CI, when there is
> still no clear answer whether Scalar will even become a first-class Git
> command eventually (which I hope it will, of course).

It's in-tree, scalar.c is compiled by default, so they'll have to choice
but to pay attention to it.

The question is whether we should have test and CI coverage for code in
that state.

1. https://lore.kernel.org/git/211123.86ilwjujmd.gmgdl@evledraar.gmail.com/
2. https://lore.kernel.org/git/xmqqo86a92jm.fsf@gitster.g/
3. https://lore.kernel.org/git/pull.1005.v10.git.1638538470.gitgitgadget@gmail.com/
4. https://lore.kernel.org/git/211130.861r2xelmx.gmgdl@evledraar.gmail.com/
5. https://lore.kernel.org/git/1b0328fa236a35c2427b82f53c32944e513580d3.1637158762.git.gitgitgadget@gmail.com/
6. https://lore.kernel.org/git/cover-0.2-00000000000-20211119T135343Z-avarab@gmail.com/
7. https://lore.kernel.org/git/211210.86a6h9duay.gmgdl@evledraar.gmail.com/
8. https://lore.kernel.org/git/211123.86ee77uj18.gmgdl@evledraar.gmail.com/
9. https://lore.kernel.org/git/pull.1005.v9.git.1638273289.gitgitgadget@gmail.com/
Elijah Newren Dec. 11, 2021, 5:15 a.m. UTC | #12
Hi Dscho,

On Fri, Dec 10, 2021 at 4:29 PM Johannes Schindelin
<Johannes.Schindelin@gmx.de> wrote:
>
> Hi Junio,
>
> On Wed, 8 Dec 2021, Junio C Hamano wrote:
>
> > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> >
> > > The Scalar Functional Tests were designed with Azure Repos in mind, i.e.
> > > they specifically verify that the `gvfs-helper` (emulating Partial Clone
> > > using the predecessor of Partial Clone, the GVFS protocol) manages to
> > > access the repositories in the intended way.
> > > ...
> > > I do realize, though, that clarity of intention has been missing from this
> > > mail thread all around, so let me ask point blank: Junio, do you want me
> > > to include upstreaming `gvfs-helper` in the overall Scalar plan?
> >
> > Sorry, I do not follow.
>
> In
> https://lore.kernel.org/git/CABPp-BGpe9Q5k22Yu8a=1xwu=pZYSeNQoqEgf+DN07cU4EB1ew@mail.gmail.com/
> (i.e. in the great great grand parent of this mail), you specifically
> replied to my mentioning Scalar's Functional Test suite:
>
>         > > One other thing is very interesting about that vfs-with-scalar
>         > > branch thicket: it contains a GitHub workflow which will run
>         > > Scalar's quite extensive Functional Tests suite. This test
>         > > suite is quite comprehensive and caught us a lot of bugs in
>         > > the past, not only in the Scalar code, but also core Git.
>         >
>         > From your wording it sounds like the plan might not include
>         > moving these tests over.  Perhaps it doesn't make sense to move
>         > them all over, but since they've caught problems in both Scalar
>         > and core Git, it would be nice to see many of those tests come
>         > to Git as well as part of a future follow on series.

This is me and my email you are quoting; these aren't Junio's words.
I'm afraid my confusion may have snowballed for others here.  Sorry
about that.

I simply misunderstood at the time -- I thought there were scalar-only
tests (rather than scalar+gvfs tests) that were not being considered
for upstreaming.  As I mentioned before[1], I'm sorry for the
confusion and seemingly opening an unrelated can of worms.  I agree
that we don't need gvfs tests, or tests that combine gvfs with other
things like scalar, or c# tests.

[1] https://lore.kernel.org/git/CABPp-BFmNiqY=NfN7Ys3XE8wYBn1EQ_War+0QLq96Tk7FO6zfg@mail.gmail.com/
Johannes Schindelin Dec. 11, 2021, 1:46 p.m. UTC | #13
Hi Elijah,

On Fri, 10 Dec 2021, Elijah Newren wrote:

> On Fri, Dec 10, 2021 at 4:29 PM Johannes Schindelin
> <Johannes.Schindelin@gmx.de> wrote:
> >
> > On Wed, 8 Dec 2021, Junio C Hamano wrote:
> >
> > > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes:
> > >
> > > > The Scalar Functional Tests were designed with Azure Repos in mind, i.e.
> > > > they specifically verify that the `gvfs-helper` (emulating Partial Clone
> > > > using the predecessor of Partial Clone, the GVFS protocol) manages to
> > > > access the repositories in the intended way.
> > > > ...
> > > > I do realize, though, that clarity of intention has been missing from this
> > > > mail thread all around, so let me ask point blank: Junio, do you want me
> > > > to include upstreaming `gvfs-helper` in the overall Scalar plan?
> > >
> > > Sorry, I do not follow.
> >
> > In
> > https://lore.kernel.org/git/CABPp-BGpe9Q5k22Yu8a=1xwu=pZYSeNQoqEgf+DN07cU4EB1ew@mail.gmail.com/
> > (i.e. in the great great grand parent of this mail), you specifically
> > replied to my mentioning Scalar's Functional Test suite:
> >
> >         > > One other thing is very interesting about that vfs-with-scalar
> >         > > branch thicket: it contains a GitHub workflow which will run
> >         > > Scalar's quite extensive Functional Tests suite. This test
> >         > > suite is quite comprehensive and caught us a lot of bugs in
> >         > > the past, not only in the Scalar code, but also core Git.
> >         >
> >         > From your wording it sounds like the plan might not include
> >         > moving these tests over.  Perhaps it doesn't make sense to move
> >         > them all over, but since they've caught problems in both Scalar
> >         > and core Git, it would be nice to see many of those tests come
> >         > to Git as well as part of a future follow on series.
>
> This is me and my email you are quoting; these aren't Junio's words.
> I'm afraid my confusion may have snowballed for others here.  Sorry
> about that.
>
> I simply misunderstood at the time -- I thought there were scalar-only
> tests (rather than scalar+gvfs tests) that were not being considered
> for upstreaming.  As I mentioned before[1], I'm sorry for the
> confusion and seemingly opening an unrelated can of worms.  I agree
> that we don't need gvfs tests, or tests that combine gvfs with other
> things like scalar, or c# tests.
>
> [1] https://lore.kernel.org/git/CABPp-BFmNiqY=NfN7Ys3XE8wYBn1EQ_War+0QLq96Tk7FO6zfg@mail.gmail.com/

No worries, I am glad it is sorted out now.

Ciao,
Dscho