diff mbox series

[1/9] sparse-checkout: create builtin with 'list' subcommand

Message ID c37b5f2c29d42e7c0c727596625d4eb50b6ddb4f.1566313865.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series New sparse-checkout builtin and "cone" mode | expand

Commit Message

John Passaro via GitGitGadget Aug. 20, 2019, 3:11 p.m. UTC
From: Derrick Stolee <dstolee@microsoft.com>

The sparse-checkout feature is mostly hidden to users, as its
only documentation is supplementary information in the docs for
'git read-tree'. In addition, users need to know how to edit the
.git/info/sparse-checkout file with the right patterns, then run
the appropriate 'git read-tree -mu HEAD' command. Keeping the
working directory in sync with the sparse-checkout file requires
care.

Begin an effort to make the sparse-checkout feature a porcelain
feature by creating a new 'git sparse-checkout' builtin. This
builtin will be the preferred mechanism for manipulating the
sparse-checkout file and syncing the working directory.

For now, create the basics of the builtin. Includes a single
subcommand, "git sparse-checkout list", that lists the patterns
currently in the sparse-checkout file. Test that these patterns
are parsed and written correctly to the output.

The documentation provided is adapted from the "git read-tree"
documentation with a few edits for clarity in the new context.
Extra sections are added to hint toward a future change to
a moer restricted pattern set.

Signed-off-by: Derrick Stolee <dstolee@microsoft.com>
---
 .gitignore                            |  1 +
 Documentation/git-read-tree.txt       |  2 +-
 Documentation/git-sparse-checkout.txt | 90 +++++++++++++++++++++++++++
 Makefile                              |  1 +
 builtin.h                             |  1 +
 builtin/sparse-checkout.c             | 90 +++++++++++++++++++++++++++
 git.c                                 |  1 +
 t/t1091-sparse-checkout-builtin.sh    | 50 +++++++++++++++
 8 files changed, 235 insertions(+), 1 deletion(-)
 create mode 100644 Documentation/git-sparse-checkout.txt
 create mode 100644 builtin/sparse-checkout.c
 create mode 100755 t/t1091-sparse-checkout-builtin.sh

Comments

Elijah Newren Aug. 23, 2019, 10:30 p.m. UTC | #1
On Tue, Aug 20, 2019 at 8:13 AM Derrick Stolee via GitGitGadget
<gitgitgadget@gmail.com> wrote:
<snip>
> The documentation provided is adapted from the "git read-tree"
> documentation with a few edits for clarity in the new context.
> Extra sections are added to hint toward a future change to
> a moer restricted pattern set.

s/moer/more/

<snip>
> +"Sparse checkout" allows populating the working directory sparsely.
> +It uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell
> +Git whether a file in the working directory is worth looking at. If
> +the skip-worktree bit is set, then the file is ignored in the working
> +directory. Git will not populate the contents of those files, which
> +makes a sparse checkout helpful when working in a repository with many
> +files, but only a few are important to the current user.
> +
> +The `$GIT_DIR/info/sparse-checkout` file is used to define the
> +skip-worktree reference bitmap. When Git updates the working
> +directory, it resets the skip-worktree bit in the index based on this
> +file. If an entry
> +matches a pattern in this file, skip-worktree will not be set on
> +that entry. Otherwise, skip-worktree will be set.
> +
> +Then it compares the new skip-worktree value with the previous one. If
> +skip-worktree turns from set to unset, it will add the corresponding
> +file back. If it turns from unset to set, that file will be removed.

I had to read this twice for it to make sense.  Not sure I have a real
good suggestion here, the name "skip-worktree" instead of e.g
"wanted-in-worktree" just naturally leads us into sentences with one
negation automatically and we sometimes have to add more.

Maybe just replace the last two paragraphs with:

The `$GIT_DIR/info/sparse-checkout` file is used to define the
skip-worktree reference bitmap. When Git updates the working
directory, it updates the skip-worktree bits in the index based on this
file and removes or restores files in the working copy to match.

> +
> +## FULL PATTERN SET
> +
> +By default, the sparse-checkout file uses the same syntax as `.gitignore`
> +files.
> +
> +While `$GIT_DIR/info/sparse-checkout` is usually used to specify what
> +files are in, you can also specify what files are _not_ in, using
> +negate patterns. For example, to remove the file `unwanted`:

s/in/included/?

> +Another tricky thing is fully repopulating the working directory when you
> +no longer want sparse checkout. You cannot just disable "sparse
> +checkout" because skip-worktree bits are still in the index and your working
> +directory is still sparsely populated. You should re-populate the working
> +directory with the `$GIT_DIR/info/sparse-checkout` file content as
> +follows:
> +
> +----------------
> +/*
> +----------------

Can we just get rid of this part of the documentation, since there
will be a sparse-checkout command to disable/undo/reset?  However, it
could be useful to mention cases when disabling/undoing/resetting a
sparse-checkout won't work, if there are some.  For example, with the
previous read-tree implementation, you could not undo the sparse
checkout if the index had any unstaged entries, and you couldn't undo
it if there were any files present that corresponding to the sparse
patterns (for fear they'd be overwritten -- however, every once in a
while someone tried to desparsify, it failed e.g. due to the disk
becoming full, and then after freeing up space there were zillions of
files that exactly matched what de-sparsifying would have put there
but the command wanted the user to manually delete them first.)

> +
> +Then you can disable sparse checkout. Sparse checkout support in 'git
> +read-tree' and similar commands is disabled by default. You need to
> +set `core.sparseCheckout` to `true` in order to have sparse checkout
> +support.

...and get rid of this paragraph because I'd expect git
sparse-checkout to come with a subcommand (init/add/whatever) to set
this for the user?  Unless maybe you want to add some words about why
the command sets core.sparseCheckout...and related workspace related
stuff as we talked about elsewhere.

> +test_expect_success 'git sparse-checkout list (empty)' '
> +       git -C repo sparse-checkout list >list 2>err &&
> +       test_line_count = 0 list &&
> +       test_i18ngrep "failed to parse sparse-checkout file; it may not exist" err

Is that the error we want, rather than something like "This worktree
is not sparse (no sparse-checkout file exists and core.sparseCheckout
is false"?

> +'
> +
> +test_expect_success 'git sparse-checkout list (populated)' '
> +       test_when_finished rm -f repo/.git/info/sparse-checkout &&
> +       cat >repo/.git/info/sparse-checkout <<-EOF &&
> +               /folder1/*
> +               /deep/
> +               **/a
> +               !*bin*
> +       EOF
> +       git -C repo sparse-checkout list >list &&
> +       cat >expect <<-EOF &&
> +               /folder1/*
> +               /deep/
> +               **/a
> +               !*bin*
> +       EOF
> +       test_cmp expect list

I have a `./sparsify --stats` that reports
   You are now in a sparse checkout with only 3499 of the 53625 files.
or
  You are not in a sparse checkout.

and I have a --info option that reports both the stats and the list of
sparse paths similar to this sparse-checkout list command.  The stats
aren't important, I guess, but seem nice for the user.  I don't know
if you want to include anything like that in this or another command,
but just thought I'd mention it.
diff mbox series

Patch

diff --git a/.gitignore b/.gitignore
index 521d8f4fb4..9411522c4f 100644
--- a/.gitignore
+++ b/.gitignore
@@ -158,6 +158,7 @@ 
 /git-show-branch
 /git-show-index
 /git-show-ref
+/git-sparse-checkout
 /git-stage
 /git-stash
 /git-status
diff --git a/Documentation/git-read-tree.txt b/Documentation/git-read-tree.txt
index d271842608..da33f84f33 100644
--- a/Documentation/git-read-tree.txt
+++ b/Documentation/git-read-tree.txt
@@ -436,7 +436,7 @@  support.
 SEE ALSO
 --------
 linkgit:git-write-tree[1]; linkgit:git-ls-files[1];
-linkgit:gitignore[5]
+linkgit:gitignore[5]; linkgit:git-sparse-checkout[1];
 
 GIT
 ---
diff --git a/Documentation/git-sparse-checkout.txt b/Documentation/git-sparse-checkout.txt
new file mode 100644
index 0000000000..ca0ca6a12f
--- /dev/null
+++ b/Documentation/git-sparse-checkout.txt
@@ -0,0 +1,90 @@ 
+git-sparse-checkout(1)
+=======================
+
+NAME
+----
+git-sparse-checkout - Initialize and modify the sparse-checkout
+configuration, which reduces the checkout to a set of directories
+given by a list of prefixes.
+
+
+SYNOPSIS
+--------
+[verse]
+'git sparse-checkout <subcommand> [options]'
+
+
+DESCRIPTION
+-----------
+
+Initialize and modify the sparse-checkout configuration, which reduces
+the checkout to a set of directories given by a list of prefixes.
+
+
+COMMANDS
+--------
+'list'::
+	Provide a list of the contents in the sparse-checkout file.
+
+
+SPARSE CHECKOUT
+----------------
+
+"Sparse checkout" allows populating the working directory sparsely.
+It uses the skip-worktree bit (see linkgit:git-update-index[1]) to tell
+Git whether a file in the working directory is worth looking at. If
+the skip-worktree bit is set, then the file is ignored in the working
+directory. Git will not populate the contents of those files, which
+makes a sparse checkout helpful when working in a repository with many
+files, but only a few are important to the current user.
+
+The `$GIT_DIR/info/sparse-checkout` file is used to define the
+skip-worktree reference bitmap. When Git updates the working
+directory, it resets the skip-worktree bit in the index based on this
+file. If an entry
+matches a pattern in this file, skip-worktree will not be set on
+that entry. Otherwise, skip-worktree will be set.
+
+Then it compares the new skip-worktree value with the previous one. If
+skip-worktree turns from set to unset, it will add the corresponding
+file back. If it turns from unset to set, that file will be removed.
+
+## FULL PATTERN SET
+
+By default, the sparse-checkout file uses the same syntax as `.gitignore`
+files.
+
+While `$GIT_DIR/info/sparse-checkout` is usually used to specify what
+files are in, you can also specify what files are _not_ in, using
+negate patterns. For example, to remove the file `unwanted`:
+
+----------------
+/*
+!unwanted
+----------------
+
+Another tricky thing is fully repopulating the working directory when you
+no longer want sparse checkout. You cannot just disable "sparse
+checkout" because skip-worktree bits are still in the index and your working
+directory is still sparsely populated. You should re-populate the working
+directory with the `$GIT_DIR/info/sparse-checkout` file content as
+follows:
+
+----------------
+/*
+----------------
+
+Then you can disable sparse checkout. Sparse checkout support in 'git
+read-tree' and similar commands is disabled by default. You need to
+set `core.sparseCheckout` to `true` in order to have sparse checkout
+support.
+
+SEE ALSO
+--------
+
+linkgit:git-read-tree[1]
+linkgit:gitignore[5]
+
+GIT
+---
+Part of the linkgit:git[1] suite
diff --git a/Makefile b/Makefile
index f9255344ae..d1d067ab65 100644
--- a/Makefile
+++ b/Makefile
@@ -1126,6 +1126,7 @@  BUILTIN_OBJS += builtin/shortlog.o
 BUILTIN_OBJS += builtin/show-branch.o
 BUILTIN_OBJS += builtin/show-index.o
 BUILTIN_OBJS += builtin/show-ref.o
+BUILTIN_OBJS += builtin/sparse-checkout.o
 BUILTIN_OBJS += builtin/stash.o
 BUILTIN_OBJS += builtin/stripspace.o
 BUILTIN_OBJS += builtin/submodule--helper.o
diff --git a/builtin.h b/builtin.h
index 5cf5df69f7..2b25a80cde 100644
--- a/builtin.h
+++ b/builtin.h
@@ -225,6 +225,7 @@  int cmd_shortlog(int argc, const char **argv, const char *prefix);
 int cmd_show(int argc, const char **argv, const char *prefix);
 int cmd_show_branch(int argc, const char **argv, const char *prefix);
 int cmd_show_index(int argc, const char **argv, const char *prefix);
+int cmd_sparse_checkout(int argc, const char **argv, const char *prefix);
 int cmd_status(int argc, const char **argv, const char *prefix);
 int cmd_stash(int argc, const char **argv, const char *prefix);
 int cmd_stripspace(int argc, const char **argv, const char *prefix);
diff --git a/builtin/sparse-checkout.c b/builtin/sparse-checkout.c
new file mode 100644
index 0000000000..6477a6ed9c
--- /dev/null
+++ b/builtin/sparse-checkout.c
@@ -0,0 +1,90 @@ 
+#include "builtin.h"
+#include "config.h"
+#include "dir.h"
+#include "parse-options.h"
+#include "pathspec.h"
+#include "repository.h"
+#include "run-command.h"
+#include "strbuf.h"
+
+static char const * const builtin_sparse_checkout_usage[] = {
+	N_("git sparse-checkout [list]"),
+	NULL
+};
+
+struct opts_sparse_checkout {
+	const char *subcommand;
+	int read_stdin;
+} opts;
+
+static char *get_sparse_checkout_filename(void)
+{
+	return git_pathdup("info/sparse-checkout");
+}
+
+static void write_excludes_to_file(FILE *fp, struct exclude_list *el)
+{
+	int i;
+
+	for (i = 0; i < el->nr; i++) {
+		struct exclude *x = el->excludes[i];
+
+		if (x->flags & EXC_FLAG_NEGATIVE)
+			fprintf(fp, "!");
+
+		fprintf(fp, "%s", x->pattern);
+
+		if (x->flags & EXC_FLAG_MUSTBEDIR)
+			fprintf(fp, "/");
+
+		fprintf(fp, "\n");
+	}
+}
+
+static int sparse_checkout_list(int argc, const char **argv)
+{
+	struct exclude_list el;
+	char *sparse_filename;
+	int res;
+
+	memset(&el, 0, sizeof(el));
+
+	sparse_filename = get_sparse_checkout_filename();
+	res = add_excludes_from_file_to_list(sparse_filename, "", 0, &el, NULL);
+	free(sparse_filename);
+
+	if (res < 0) {
+		warning(_("failed to parse sparse-checkout file; it may not exist"));
+		return 0;
+	}
+
+	write_excludes_to_file(stdout, &el);
+	clear_exclude_list(&el);
+
+	return 0;
+}
+
+int cmd_sparse_checkout(int argc, const char **argv, const char *prefix)
+{
+	static struct option builtin_sparse_checkout_options[] = {
+		OPT_END(),
+	};
+
+	if (argc == 2 && !strcmp(argv[1], "-h"))
+		usage_with_options(builtin_sparse_checkout_usage,
+				   builtin_sparse_checkout_options);
+
+	git_config(git_default_config, NULL);
+	argc = parse_options(argc, argv, prefix,
+			     builtin_sparse_checkout_options,
+			     builtin_sparse_checkout_usage,
+			     PARSE_OPT_STOP_AT_NON_OPTION);
+
+	if (argc > 0) {
+		if (!strcmp(argv[0], "list"))
+			return sparse_checkout_list(argc, argv);
+	}
+
+	usage_with_options(builtin_sparse_checkout_usage,
+			   builtin_sparse_checkout_options);
+}
diff --git a/git.c b/git.c
index c1ee7124ed..dfb680e15d 100644
--- a/git.c
+++ b/git.c
@@ -573,6 +573,7 @@  static struct cmd_struct commands[] = {
 	{ "show-branch", cmd_show_branch, RUN_SETUP },
 	{ "show-index", cmd_show_index },
 	{ "show-ref", cmd_show_ref, RUN_SETUP },
+	{ "sparse-checkout", cmd_sparse_checkout, RUN_SETUP | NEED_WORK_TREE },
 	{ "stage", cmd_add, RUN_SETUP | NEED_WORK_TREE },
 	/*
 	 * NEEDSWORK: Until the builtin stash is thoroughly robust and no
diff --git a/t/t1091-sparse-checkout-builtin.sh b/t/t1091-sparse-checkout-builtin.sh
new file mode 100755
index 0000000000..ba6928c641
--- /dev/null
+++ b/t/t1091-sparse-checkout-builtin.sh
@@ -0,0 +1,50 @@ 
+#!/bin/sh
+
+test_description='sparse checkout builtin tests'
+
+. ./test-lib.sh
+
+test_expect_success 'setup' '
+	git init repo &&
+	(
+		cd repo &&
+		echo "initial" >a &&
+		mkdir folder1 folder2 deep &&
+		mkdir deep/deeper1 deep/deeper2 &&
+		mkdir deep/deeper1/deepest &&
+		cp a folder1 &&
+		cp a folder2 &&
+		cp a deep &&
+		cp a deep/deeper1 &&
+		cp a deep/deeper2 &&
+		cp a deep/deeper1/deepest &&
+		git add . &&
+		git commit -m "initial commit"
+	)
+'
+
+test_expect_success 'git sparse-checkout list (empty)' '
+	git -C repo sparse-checkout list >list 2>err &&
+	test_line_count = 0 list &&
+	test_i18ngrep "failed to parse sparse-checkout file; it may not exist" err
+'
+
+test_expect_success 'git sparse-checkout list (populated)' '
+	test_when_finished rm -f repo/.git/info/sparse-checkout &&
+	cat >repo/.git/info/sparse-checkout <<-EOF &&
+		/folder1/*
+		/deep/
+		**/a
+		!*bin*
+	EOF
+	git -C repo sparse-checkout list >list &&
+	cat >expect <<-EOF &&
+		/folder1/*
+		/deep/
+		**/a
+		!*bin*
+	EOF
+	test_cmp expect list
+'
+
+test_done
\ No newline at end of file