diff mbox series

[v5,11/14] core.fsyncmethod: tests for batch mode

Message ID 1a4aff8c350b5ffe3c7760faa4accc88c83ce11c.1648616734.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series core.fsyncmethod: add 'batch' mode for faster fsyncing of multiple objects | expand

Commit Message

Neeraj Singh (WINDOWS-SFS) March 30, 2022, 5:05 a.m. UTC
From: Neeraj Singh <neerajsi@microsoft.com>

Add test cases to exercise batch mode for:
 * 'git add'
 * 'git stash'
 * 'git update-index'
 * 'git unpack-objects'

These tests ensure that the added data winds up in the object database.

In this change we introduce a new test helper lib-unique-files.sh. The
goal of this library is to create a tree of files that have different
oids from any other files that may have been created in the current test
repo. This helps us avoid missing validation of an object being added
due to it already being in the repo.

Signed-off-by: Neeraj Singh <neerajsi@microsoft.com>
---
 t/lib-unique-files.sh  | 34 ++++++++++++++++++++++++++++++++++
 t/t3700-add.sh         | 28 ++++++++++++++++++++++++++++
 t/t3903-stash.sh       | 20 ++++++++++++++++++++
 t/t5300-pack-object.sh | 41 +++++++++++++++++++++++++++--------------
 4 files changed, 109 insertions(+), 14 deletions(-)
 create mode 100644 t/lib-unique-files.sh

Comments

Junio C Hamano March 30, 2022, 6:13 p.m. UTC | #1
"Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:

> diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
> new file mode 100644
> index 00000000000..34c01a65256
> --- /dev/null
> +++ b/t/lib-unique-files.sh
> @@ -0,0 +1,34 @@
> +# Helper to create files with unique contents
> +
> +# Create multiple files with unique contents within this test run. Takes the
> +# number of directories, the number of files in each directory, and the base
> +# directory.
> +#
> +# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
> +#					 each in my_dir, all with contents
> +#					 different from previous invocations
> +#					 of this command in this run.
> +
> +test_create_unique_files () {
> +	test "$#" -ne 3 && BUG "3 param"
> +
> +	local dirs="$1" &&
> +	local files="$2" &&
> +	local basedir="$3" &&
> +	local counter=0 &&

The cover letter mentioned that in this round, "local var=val" is
avoided?

I've seen instances of local being "curious", e.g.
  https://lore.kernel.org/git/20220125092419.cgtfw32nk2niazfk@carbon/
and the discussion indicates that it may still be relevant.

> +	local i &&
> +	local j &&
> +	test_tick &&
> +	local basedata=$basedir$test_tick &&
> +	rm -rf "$basedir" &&
> +	for i in $(test_seq $dirs)
> +	do
> +		local dir=$basedir/dir$i &&

This, too.

To summarize the findings from the thread is:

 - very old releases of /bin/dash that predates Git, like 0.3.8,
   would not correctly handle assignment on "local" at all.  It may
   not matter to us.

 - semi-old /bin/dash 0.5.10, which is still in use, mishandled
   'local var=$val', but 'local var="$val"' is an acceptable
   workaround for the bug.  "git grep" tells us that we use this
   form in our shell scripts quite a lot, so we may be OK.

 - /bin/dash 0.5.11, which was tagged mid 2020, and newer would glok
   'local var=$val' just fine even $val has $IFS whitespace in it.

So, I'd say the safe practice we should adopt is to use "local" one
per variable, assignment to the variable on the same line of "local"
is OK, but unlike the regular assignment, older dash may mishandle
the right-hand-side unless it is quoted, i.e.

    local var='string literal'
    local var="$variable interpolation"
Neeraj Singh March 31, 2022, 3:55 a.m. UTC | #2
On Wed, Mar 30, 2022 at 11:13 AM Junio C Hamano <gitster@pobox.com> wrote:
>
> "Neeraj Singh via GitGitGadget" <gitgitgadget@gmail.com> writes:
>
> > diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
> > new file mode 100644
> > index 00000000000..34c01a65256
> > --- /dev/null
> > +++ b/t/lib-unique-files.sh
> > @@ -0,0 +1,34 @@
> > +# Helper to create files with unique contents
> > +
> > +# Create multiple files with unique contents within this test run. Takes the
> > +# number of directories, the number of files in each directory, and the base
> > +# directory.
> > +#
> > +# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
> > +#                                     each in my_dir, all with contents
> > +#                                     different from previous invocations
> > +#                                     of this command in this run.
> > +
> > +test_create_unique_files () {
> > +     test "$#" -ne 3 && BUG "3 param"
> > +
> > +     local dirs="$1" &&
> > +     local files="$2" &&
> > +     local basedir="$3" &&
> > +     local counter=0 &&
>
> The cover letter mentioned that in this round, "local var=val" is
> avoided?
>

Whoops.. I guess I wasn't looking widely enough.  I noticed failures
on my Ubuntu system that uses dash.

> I've seen instances of local being "curious", e.g.
>   https://lore.kernel.org/git/20220125092419.cgtfw32nk2niazfk@carbon/
> and the discussion indicates that it may still be relevant.
>
> > +     local i &&
> > +     local j &&
> > +     test_tick &&
> > +     local basedata=$basedir$test_tick &&
> > +     rm -rf "$basedir" &&
> > +     for i in $(test_seq $dirs)
> > +     do
> > +             local dir=$basedir/dir$i &&
>
> This, too.
>
> To summarize the findings from the thread is:
>
>  - very old releases of /bin/dash that predates Git, like 0.3.8,
>    would not correctly handle assignment on "local" at all.  It may
>    not matter to us.
>
>  - semi-old /bin/dash 0.5.10, which is still in use, mishandled
>    'local var=$val', but 'local var="$val"' is an acceptable
>    workaround for the bug.  "git grep" tells us that we use this
>    form in our shell scripts quite a lot, so we may be OK.
>
>  - /bin/dash 0.5.11, which was tagged mid 2020, and newer would glok
>    'local var=$val' just fine even $val has $IFS whitespace in it.
>
> So, I'd say the safe practice we should adopt is to use "local" one
> per variable, assignment to the variable on the same line of "local"
> is OK, but unlike the regular assignment, older dash may mishandle
> the right-hand-side unless it is quoted, i.e.
>
>     local var='string literal'
>     local var="$variable interpolation"
>

Thanks for updating me on the documentation.   I think I'll go back to
one-line assignments with the proper quotations and single variables,
just to keep things short and consistent. I've been looking up
everything as I go with these shell scripts, but there's a lot of
subtlety.  CMD batch files are even uglier, but I've had many years to
learn the common practices there.

I'll ask a question on the perf test patch on a related shell topic.

Thanks,
Neeraj
diff mbox series

Patch

diff --git a/t/lib-unique-files.sh b/t/lib-unique-files.sh
new file mode 100644
index 00000000000..34c01a65256
--- /dev/null
+++ b/t/lib-unique-files.sh
@@ -0,0 +1,34 @@ 
+# Helper to create files with unique contents
+
+# Create multiple files with unique contents within this test run. Takes the
+# number of directories, the number of files in each directory, and the base
+# directory.
+#
+# test_create_unique_files 2 3 my_dir -- Creates 2 directories with 3 files
+#					 each in my_dir, all with contents
+#					 different from previous invocations
+#					 of this command in this run.
+
+test_create_unique_files () {
+	test "$#" -ne 3 && BUG "3 param"
+
+	local dirs="$1" &&
+	local files="$2" &&
+	local basedir="$3" &&
+	local counter=0 &&
+	local i &&
+	local j &&
+	test_tick &&
+	local basedata=$basedir$test_tick &&
+	rm -rf "$basedir" &&
+	for i in $(test_seq $dirs)
+	do
+		local dir=$basedir/dir$i &&
+		mkdir -p "$dir" &&
+		for j in $(test_seq $files)
+		do
+			counter=$((counter + 1)) &&
+			echo "$basedata.$counter">"$dir/file$j.txt"
+		done
+	done
+}
diff --git a/t/t3700-add.sh b/t/t3700-add.sh
index b1f90ba3250..8979c8a5f03 100755
--- a/t/t3700-add.sh
+++ b/t/t3700-add.sh
@@ -8,6 +8,8 @@  test_description='Test of git add, including the -- option.'
 TEST_PASSES_SANITIZE_LEAK=true
 . ./test-lib.sh
 
+. $TEST_DIRECTORY/lib-unique-files.sh
+
 # Test the file mode "$1" of the file "$2" in the index.
 test_mode_in_index () {
 	case "$(git ls-files -s "$2")" in
@@ -34,6 +36,32 @@  test_expect_success \
     'Test that "git add -- -q" works' \
     'touch -- -q && git add -- -q'
 
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'git add: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir1 &&
+	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION add -- ./files_base_dir1/ &&
+	git ls-files --stage files_base_dir1/ |
+	test_parse_ls_files_stage_oids >added_files_oids &&
+
+	# We created 2 subdirs with 4 files each (8 files total) above
+	test_line_count = 8 added_files_oids &&
+	git cat-file --batch-check='%(objectname)' <added_files_oids >added_files_actual &&
+	test_cmp added_files_oids added_files_actual
+"
+
+test_expect_success 'git update-index: core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir2 &&
+	find files_base_dir2 ! -type d -print | xargs git $BATCH_CONFIGURATION update-index --add -- &&
+	git ls-files --stage files_base_dir2 |
+	test_parse_ls_files_stage_oids >added_files2_oids &&
+
+	# We created 2 subdirs with 4 files each (8 files total) above
+	test_line_count = 8 added_files2_oids &&
+	git cat-file --batch-check='%(objectname)' <added_files2_oids >added_files2_actual &&
+	test_cmp added_files2_oids added_files2_actual
+"
+
 test_expect_success \
 	'git add: Test that executable bit is not used if core.filemode=0' \
 	'git config core.filemode 0 &&
diff --git a/t/t3903-stash.sh b/t/t3903-stash.sh
index 4abbc8fccae..20e94881964 100755
--- a/t/t3903-stash.sh
+++ b/t/t3903-stash.sh
@@ -9,6 +9,7 @@  GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=main
 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME
 
 . ./test-lib.sh
+. $TEST_DIRECTORY/lib-unique-files.sh
 
 test_expect_success 'usage on cmd and subcommand invalid option' '
 	test_expect_code 129 git stash --invalid-option 2>usage &&
@@ -1410,6 +1411,25 @@  test_expect_success 'stash handles skip-worktree entries nicely' '
 	git rev-parse --verify refs/stash:A.t
 '
 
+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'stash with core.fsyncmethod=batch' "
+	test_create_unique_files 2 4 files_base_dir &&
+	GIT_TEST_FSYNC=1 git $BATCH_CONFIGURATION stash push -u -- ./files_base_dir/ &&
+
+	# The files were untracked, so use the third parent,
+	# which contains the untracked files
+	git ls-tree -r stash^3 -- ./files_base_dir/ |
+	test_parse_ls_tree_oids >stashed_files_oids &&
+
+	# We created 2 dirs with 4 files each (8 files total) above
+	test_line_count = 8 stashed_files_oids &&
+	git cat-file --batch-check='%(objectname)' <stashed_files_oids >stashed_files_actual &&
+	test_cmp stashed_files_oids stashed_files_actual
+"
+
+
 test_expect_success 'git stash succeeds despite directory/file change' '
 	test_create_repo directory_file_switch_v1 &&
 	(
diff --git a/t/t5300-pack-object.sh b/t/t5300-pack-object.sh
index a11d61206ad..f8a0f309e2d 100755
--- a/t/t5300-pack-object.sh
+++ b/t/t5300-pack-object.sh
@@ -161,22 +161,27 @@  test_expect_success 'pack-objects with bogus arguments' '
 '
 
 check_unpack () {
+	local packname="$1" &&
+	local object_list="$2" &&
+	local git_config="$3" &&
 	test_when_finished "rm -rf git2" &&
-	git init --bare git2 &&
-	git -C git2 unpack-objects -n <"$1".pack &&
-	git -C git2 unpack-objects <"$1".pack &&
-	(cd .git && find objects -type f -print) |
-	while read path
-	do
-		cmp git2/$path .git/$path || {
-			echo $path differs.
-			return 1
-		}
-	done
+	git $git_config init --bare git2 &&
+	(
+		git $git_config -C git2 unpack-objects -n <"$packname".pack &&
+		git $git_config -C git2 unpack-objects <"$packname".pack &&
+		git $git_config -C git2 cat-file --batch-check="%(objectname)"
+	) <"$object_list" >current &&
+	cmp "$object_list" current
 }
 
 test_expect_success 'unpack without delta' '
-	check_unpack test-1-${packname_1}
+	check_unpack test-1-${packname_1} obj-list
+'
+
+BATCH_CONFIGURATION='-c core.fsync=loose-object -c core.fsyncmethod=batch'
+
+test_expect_success 'unpack without delta (core.fsyncmethod=batch)' '
+	check_unpack test-1-${packname_1} obj-list "$BATCH_CONFIGURATION"
 '
 
 test_expect_success 'pack with REF_DELTA' '
@@ -185,7 +190,11 @@  test_expect_success 'pack with REF_DELTA' '
 '
 
 test_expect_success 'unpack with REF_DELTA' '
-	check_unpack test-2-${packname_2}
+	check_unpack test-2-${packname_2} obj-list
+'
+
+test_expect_success 'unpack with REF_DELTA (core.fsyncmethod=batch)' '
+       check_unpack test-2-${packname_2} obj-list "$BATCH_CONFIGURATION"
 '
 
 test_expect_success 'pack with OFS_DELTA' '
@@ -195,7 +204,11 @@  test_expect_success 'pack with OFS_DELTA' '
 '
 
 test_expect_success 'unpack with OFS_DELTA' '
-	check_unpack test-3-${packname_3}
+	check_unpack test-3-${packname_3} obj-list
+'
+
+test_expect_success 'unpack with OFS_DELTA (core.fsyncmethod=batch)' '
+       check_unpack test-3-${packname_3} obj-list "$BATCH_CONFIGURATION"
 '
 
 test_expect_success 'compare delta flavors' '