diff mbox series

rev-parse: add option for absolute or relative path formatting

Message ID 20200908185017.2005159-1-sandals@crustytoothpaste.net (mailing list archive)
State New, archived
Headers show
Series rev-parse: add option for absolute or relative path formatting | expand

Commit Message

brian m. carlson Sept. 8, 2020, 6:50 p.m. UTC
git rev-parse has several options which print various paths.  Some of
these paths are printed relative to the current working directory, and
some are absolute.

Normally, this is not a problem, but there are times when one wants
paths entirely in one format or another.  This can be done trivially if
the paths are canonical, but canonicalizing paths is not possible on
some shell scripting environments which lack realpath(1) and also in Go,
which lacks functions that properly canonicalize paths on Windows.

To help out the scripter, let's provide an option which turns most of
the paths printed by git rev-parse to be either relative to the current
working directory or absolute and canonical.  Document which options are
affected and which are not so that users are not confused.

This approach is cleaner and tidier than providing duplicates of
existing options which are either relative or absolute.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
This impetus for this patch is Git LFS, which is written in Go.  Go
lacks a cross-platform way to canonicalize paths in the same way that
Git does, so when Git produces relative paths, such as in some cases
with --git-common-dir, we end up with problems when users are doing
things with unusual paths on Windows, such as when using SUBST or
OneDrive or various other edge cases.  Go upstream does not appear eager
to address this problem.

The obvious solution to this is to have Git canonicalize all paths, and
rather than adding yet another one-off, I decided to implement a more
generic solution.  This can also be valuable for shell scripting
environments which lack realpath(1) (e.g. macOS, IIRC).

Dscho CC'd for visibility into how this will work on Windows.

 Documentation/git-rev-parse.txt | 71 ++++++++++++++++------------
 builtin/rev-parse.c             | 84 ++++++++++++++++++++++++++++-----
 t/t1500-rev-parse.sh            | 36 +++++++++++++-
 3 files changed, 148 insertions(+), 43 deletions(-)

Comments

Johannes Schindelin Sept. 9, 2020, 3:23 a.m. UTC | #1
Hi brian,

On Tue, 8 Sep 2020, brian m. carlson wrote:

> git rev-parse has several options which print various paths.  Some of
> these paths are printed relative to the current working directory, and
> some are absolute.
>
> Normally, this is not a problem, but there are times when one wants
> paths entirely in one format or another.  This can be done trivially if
> the paths are canonical, but canonicalizing paths is not possible on
> some shell scripting environments which lack realpath(1) and also in Go,
> which lacks functions that properly canonicalize paths on Windows.
>
> To help out the scripter, let's provide an option which turns most of
> the paths printed by git rev-parse to be either relative to the current
> working directory or absolute and canonical.  Document which options are
> affected and which are not so that users are not confused.
>
> This approach is cleaner and tidier than providing duplicates of
> existing options which are either relative or absolute.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---

Nicely explained and implemented.

> This impetus for this patch is Git LFS, which is written in Go.  Go
> lacks a cross-platform way to canonicalize paths in the same way that
> Git does, so when Git produces relative paths, such as in some cases
> with --git-common-dir, we end up with problems when users are doing
> things with unusual paths on Windows, such as when using SUBST or
> OneDrive or various other edge cases.  Go upstream does not appear eager
> to address this problem.
>
> The obvious solution to this is to have Git canonicalize all paths, and
> rather than adding yet another one-off, I decided to implement a more
> generic solution.  This can also be valuable for shell scripting
> environments which lack realpath(1) (e.g. macOS, IIRC).
>
> Dscho CC'd for visibility into how this will work on Windows.

On Windows, at least in the version from git-for-windows/git,
`strbuf_realpath()` uses the Win32 API function
`GetFinalPathNameByHandleW()` to canonicalize paths (whenever possible),
meaning that the `subst` issue you mentioned above will be addressed
adequately.

Note: this platform-specific real path handling has not yet made it
upstream, it is still "battle-tested" in Git for Windows.

Thanks,
Dscho

>
>  Documentation/git-rev-parse.txt | 71 ++++++++++++++++------------
>  builtin/rev-parse.c             | 84 ++++++++++++++++++++++++++++-----
>  t/t1500-rev-parse.sh            | 36 +++++++++++++-
>  3 files changed, 148 insertions(+), 43 deletions(-)
>
> diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
> index 19b12b6d43..6b95292559 100644
> --- a/Documentation/git-rev-parse.txt
> +++ b/Documentation/git-rev-parse.txt
> @@ -208,6 +208,15 @@ Options for Files
>  	Only the names of the variables are listed, not their value,
>  	even if they are set.
>
> +--path-format=(absolute|relative)::
> +	Controls the behavior of certain other options following it on the command
> +	line. If specified as absolute, the paths printed by those options will be
> +	absolute and canonical. If specified as relative, the paths will be relative
> +	to the current working directory if that is possible.  The default is option
> +	specific.
> +
> +The following options are modified by `--path-format`:
> +
>  --git-dir::
>  	Show `$GIT_DIR` if defined. Otherwise show the path to
>  	the .git directory. The path shown, when relative, is
> @@ -217,13 +226,42 @@ If `$GIT_DIR` is not defined and the current directory
>  is not detected to lie in a Git repository or work tree
>  print a message to stderr and exit with nonzero status.
>
> +--git-common-dir::
> +	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
> +
> +--resolve-git-dir <path>::
> +	Check if <path> is a valid repository or a gitfile that
> +	points at a valid repository, and print the location of the
> +	repository.  If <path> is a gitfile then the resolved path
> +	to the real repository is printed.
> +
> +--git-path <path>::
> +	Resolve "$GIT_DIR/<path>" and takes other path relocation
> +	variables such as $GIT_OBJECT_DIRECTORY,
> +	$GIT_INDEX_FILE... into account. For example, if
> +	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
> +	--git-path objects/abc" returns /foo/bar/abc.
> +
> +--show-toplevel::
> +	Show the (by default, absolute) path of the top-level directory
> +	of the working tree. If there is no working tree, report an error.
> +
> +--show-superproject-working-tree::
> +	Show the absolute path of the root of the superproject's
> +	working tree (if exists) that uses the current repository as
> +	its submodule.  Outputs nothing if the current repository is
> +	not used as a submodule by any project.
> +
> +--shared-index-path::
> +	Show the path to the shared index file in split index mode, or
> +	empty if not in split-index mode.
> +
> +The following options are unaffected by `--path-format`:
> +
>  --absolute-git-dir::
>  	Like `--git-dir`, but its output is always the canonicalized
>  	absolute path.
>
> ---git-common-dir::
> -	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
> -
>  --is-inside-git-dir::
>  	When the current working directory is below the repository
>  	directory print "true", otherwise "false".
> @@ -238,19 +276,6 @@ print a message to stderr and exit with nonzero status.
>  --is-shallow-repository::
>  	When the repository is shallow print "true", otherwise "false".
>
> ---resolve-git-dir <path>::
> -	Check if <path> is a valid repository or a gitfile that
> -	points at a valid repository, and print the location of the
> -	repository.  If <path> is a gitfile then the resolved path
> -	to the real repository is printed.
> -
> ---git-path <path>::
> -	Resolve "$GIT_DIR/<path>" and takes other path relocation
> -	variables such as $GIT_OBJECT_DIRECTORY,
> -	$GIT_INDEX_FILE... into account. For example, if
> -	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
> -	--git-path objects/abc" returns /foo/bar/abc.
> -
>  --show-cdup::
>  	When the command is invoked from a subdirectory, show the
>  	path of the top-level directory relative to the current
> @@ -261,20 +286,6 @@ print a message to stderr and exit with nonzero status.
>  	path of the current directory relative to the top-level
>  	directory.
>
> ---show-toplevel::
> -	Show the absolute path of the top-level directory of the working
> -	tree. If there is no working tree, report an error.
> -
> ---show-superproject-working-tree::
> -	Show the absolute path of the root of the superproject's
> -	working tree (if exists) that uses the current repository as
> -	its submodule.  Outputs nothing if the current repository is
> -	not used as a submodule by any project.
> -
> ---shared-index-path::
> -	Show the path to the shared index file in split index mode, or
> -	empty if not in split-index mode.
> -
>  --show-object-format[=(storage|input|output)]::
>  	Show the object format (hash algorithm) used for the repository
>  	for storage inside the `.git` directory, input, or output. For
> diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
> index 669dd2fd6f..311a56ba01 100644
> --- a/builtin/rev-parse.c
> +++ b/builtin/rev-parse.c
> @@ -583,6 +583,55 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
>  	clear_ref_exclusion(&ref_excludes);
>  }
>
> +enum format_type {
> +	/* We would like a relative path. */
> +	FORMAT_RELATIVE,
> +	/* We would like a canonical absolute path. */
> +	FORMAT_CANONICAL,
> +	/* We would like the default behavior. */
> +	FORMAT_DEFAULT,
> +};
> +
> +enum default_type {
> +	/* Our default is a relative path. */
> +	DEFAULT_RELATIVE,
> +	/* Our default is a relative path if there's a shared root. */
> +	DEFAULT_RELATIVE_IF_SHARED,
> +	/* Our default is a canonical absolute path. */
> +	DEFAULT_CANONICAL,
> +	/* Our default is not to modify the item. */
> +	DEFAULT_UNMODIFIED,
> +};
> +
> +static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
> +{
> +	char *cwd = NULL;
> +	/*
> +	 * We don't ever produce a relative path if prefix is NULL, so set the
> +	 * prefix to the current directory so that we can produce a relative
> +	 * path whenever possible.  If we're using RELATIVE_IF_SHARED mode, then
> +	 * we want an absolute path unless the two share a common prefix, so don't
> +	 * set it in that case, since doing so causes a relative path to always
> +	 * be produced if possible.
> +	 */
> +	if (!prefix && (format != FORMAT_DEFAULT || def != DEFAULT_RELATIVE_IF_SHARED))
> +		prefix = cwd = xgetcwd();
> +	if (format == FORMAT_DEFAULT && def == DEFAULT_UNMODIFIED) {
> +		puts(path);
> +	} else if (format == FORMAT_RELATIVE ||
> +		  (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE) ||
> +		  (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE_IF_SHARED)) {
> +		struct strbuf buf = STRBUF_INIT;
> +		puts(relative_path(path, prefix, &buf));
> +		strbuf_reset(&buf);
> +	} else {
> +		char *real = real_pathdup(path, 1);
> +		puts(real);
> +		free(real);
> +	}
> +	free(cwd);
> +}
> +
>  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  {
>  	int i, as_is = 0, verify = 0, quiet = 0, revs_count = 0, type = 0;
> @@ -595,6 +644,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  	struct object_context unused;
>  	struct strbuf buf = STRBUF_INIT;
>  	const int hexsz = the_hash_algo->hexsz;
> +	enum format_type format = FORMAT_DEFAULT;
>
>  	if (argc > 1 && !strcmp("--parseopt", argv[1]))
>  		return cmd_parseopt(argc - 1, argv + 1, prefix);
> @@ -650,8 +700,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  			if (!argv[i + 1])
>  				die("--git-path requires an argument");
>  			strbuf_reset(&buf);
> -			puts(relative_path(git_path("%s", argv[i + 1]),
> -					   prefix, &buf));
> +			print_path(git_path("%s", argv[i + 1]), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
>  			i++;
>  			continue;
>  		}
> @@ -683,6 +732,16 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  					show_file(arg, 0);
>  				continue;
>  			}
> +			if (opt_with_value(arg, "--path-format", &arg)) {
> +				if (!strcmp(arg, "absolute")) {
> +					format = FORMAT_CANONICAL;
> +				} else if (!strcmp(arg, "relative")) {
> +					format = FORMAT_RELATIVE;
> +				} else {
> +					die("unknown argument to --path-format: %s", arg);
> +				}
> +				continue;
> +			}
>  			if (!strcmp(arg, "--default")) {
>  				def = argv[++i];
>  				if (!def)
> @@ -803,7 +862,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  			if (!strcmp(arg, "--show-toplevel")) {
>  				const char *work_tree = get_git_work_tree();
>  				if (work_tree)
> -					puts(work_tree);
> +					print_path(work_tree, prefix, format, DEFAULT_CANONICAL);
>  				else
>  					die("this operation must be run in a work tree");
>  				continue;
> @@ -811,7 +870,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  			if (!strcmp(arg, "--show-superproject-working-tree")) {
>  				struct strbuf superproject = STRBUF_INIT;
>  				if (get_superproject_working_tree(&superproject))
> -					puts(superproject.buf);
> +					print_path(superproject.buf, prefix, format, DEFAULT_CANONICAL);
>  				strbuf_release(&superproject);
>  				continue;
>  			}
> @@ -846,16 +905,18 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  				const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
>  				char *cwd;
>  				int len;
> +				enum format_type wanted = format;
>  				if (arg[2] == 'g') {	/* --git-dir */
>  					if (gitdir) {
> -						puts(gitdir);
> +						print_path(gitdir, prefix, format, DEFAULT_UNMODIFIED);
>  						continue;
>  					}
>  					if (!prefix) {
> -						puts(".git");
> +						print_path(".git", prefix, format, DEFAULT_UNMODIFIED);
>  						continue;
>  					}
>  				} else {		/* --absolute-git-dir */
> +					wanted = FORMAT_CANONICAL;
>  					if (!gitdir && !prefix)
>  						gitdir = ".git";
>  					if (gitdir) {
> @@ -868,14 +929,14 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  				}
>  				cwd = xgetcwd();
>  				len = strlen(cwd);
> -				printf("%s%s.git\n", cwd, len && cwd[len-1] != '/' ? "/" : "");
> +				strbuf_reset(&buf);
> +				strbuf_addf(&buf, "%s%s.git", cwd, len && cwd[len-1] != '/' ? "/" : "");
>  				free(cwd);
> +				print_path(buf.buf, prefix, wanted, DEFAULT_CANONICAL);
>  				continue;
>  			}
>  			if (!strcmp(arg, "--git-common-dir")) {
> -				strbuf_reset(&buf);
> -				puts(relative_path(get_git_common_dir(),
> -						   prefix, &buf));
> +				print_path(get_git_common_dir(), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
>  				continue;
>  			}
>  			if (!strcmp(arg, "--is-inside-git-dir")) {
> @@ -905,8 +966,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  				if (the_index.split_index) {
>  					const struct object_id *oid = &the_index.split_index->base_oid;
>  					const char *path = git_path("sharedindex.%s", oid_to_hex(oid));
> -					strbuf_reset(&buf);
> -					puts(relative_path(path, prefix, &buf));
> +					print_path(path, prefix, format, DEFAULT_RELATIVE);
>  				}
>  				continue;
>  			}
> diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
> index 408b97d5af..6f3811d189 100755
> --- a/t/t1500-rev-parse.sh
> +++ b/t/t1500-rev-parse.sh
> @@ -3,6 +3,16 @@
>  test_description='test git rev-parse'
>  . ./test-lib.sh
>
> +test_one () {
> +	dir="$1" &&
> +	expect="$2" &&
> +	shift &&
> +	shift &&
> +	echo "$expect" >expect &&
> +	git -C "$dir" rev-parse "$@" >actual
> +	test_cmp expect actual
> +}
> +
>  # usage: [options] label is-bare is-inside-git is-inside-work prefix git-dir absolute-git-dir
>  test_rev_parse () {
>  	d=
> @@ -60,7 +70,13 @@ ROOT=$(pwd)
>
>  test_expect_success 'setup' '
>  	mkdir -p sub/dir work &&
> -	cp -R .git repo.git
> +	cp -R .git repo.git &&
> +	git checkout -b main &&
> +	test_commit abc &&
> +	git checkout -b side &&
> +	test_commit def &&
> +	git checkout main &&
> +	git worktree add worktree side
>  '
>
>  test_rev_parse toplevel false false true '' .git "$ROOT/.git"
> @@ -88,6 +104,24 @@ test_rev_parse -C work -g ../repo.git -b t 'GIT_DIR=../repo.git, core.bare = tru
>
>  test_rev_parse -C work -g ../repo.git -b u 'GIT_DIR=../repo.git, core.bare undefined' false false true ''
>
> +test_expect_success 'rev-parse --path-format=absolute' '
> +	test_one "." "$ROOT/.git" --path-format=absolute --git-dir &&
> +	test_one "." "$ROOT/.git" --path-format=absolute --git-common-dir &&
> +	test_one "worktree" "$ROOT/.git/worktrees/worktree" --path-format=absolute --git-dir &&
> +	test_one "worktree" "$ROOT/.git" --path-format=absolute --git-common-dir &&
> +	test_one "." "$ROOT" --path-format=absolute --show-toplevel &&
> +	test_one "." "$ROOT/.git/objects" --path-format=absolute --git-path objects
> +'
> +
> +test_expect_success 'rev-parse --path-format=relative' '
> +	test_one "." ".git" --path-format=relative --git-dir &&
> +	test_one "." ".git" --path-format=relative --git-common-dir &&
> +	test_one "worktree" "../.git/worktrees/worktree" --path-format=relative --git-dir &&
> +	test_one "worktree" "../.git" --path-format=relative --git-common-dir &&
> +	test_one "." "./" --path-format=relative --show-toplevel &&
> +	test_one "." ".git/objects" --path-format=relative --git-path objects
> +'
> +
>  test_expect_success 'git-common-dir from worktree root' '
>  	echo .git >expect &&
>  	git rev-parse --git-common-dir >actual &&
>
SZEDER Gábor Sept. 9, 2020, 2:51 p.m. UTC | #2
On Tue, Sep 08, 2020 at 06:50:17PM +0000, brian m. carlson wrote:
> git rev-parse has several options which print various paths.  Some of
> these paths are printed relative to the current working directory, and
> some are absolute.
> 
> Normally, this is not a problem, but there are times when one wants
> paths entirely in one format or another.  This can be done trivially if
> the paths are canonical, but canonicalizing paths is not possible on
> some shell scripting environments which lack realpath(1) and also in Go,
> which lacks functions that properly canonicalize paths on Windows.
> 
> To help out the scripter, let's provide an option which turns most of
> the paths printed by git rev-parse to be either relative to the current
> working directory or absolute and canonical.  Document which options are
> affected and which are not so that users are not confused.
> 
> This approach is cleaner and tidier than providing duplicates of
> existing options which are either relative or absolute.

Indeed.  I added '--absolute-git-dir' as a one-off, because I didn't
see a use case for other absolute paths, and because I saw some design
issues with a generic solution and I didn't want to go there back
then.  In particular:

> diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
> index 19b12b6d43..6b95292559 100644
> --- a/Documentation/git-rev-parse.txt
> +++ b/Documentation/git-rev-parse.txt
> @@ -208,6 +208,15 @@ Options for Files
>  	Only the names of the variables are listed, not their value,
>  	even if they are set.
>  
> +--path-format=(absolute|relative)::
> +	Controls the behavior of certain other options following it on the command
> +	line.

Does it affect only one subsequent such option on the command line, or
all such options?  IOW, while standing in the top directory of the
worktree, would the following command

  git rev-parse --path-format=absolute --git-dir --git-path foo --show-toplevel

print three absolute paths, or one absolute paths and two relative
paths?

The wording here is not clear on this, the commit message doesn't
mention it, and the tests added in this patch only check one such
option, but looking at the code and doing some testing of my own I
found that it affects all subsequent such options.

>       If specified as absolute, the paths printed by those options will be
> +	absolute and canonical. If specified as relative, the paths will be relative
> +	to the current working directory if that is possible.  The default is option
> +	specific.
> +
> +The following options are modified by `--path-format`:
> +
>  --git-dir::
>  	Show `$GIT_DIR` if defined. Otherwise show the path to
>  	the .git directory. The path shown, when relative, is


> diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
> index 669dd2fd6f..311a56ba01 100644
> --- a/builtin/rev-parse.c
> +++ b/builtin/rev-parse.c
> @@ -583,6 +583,55 @@ static void handle_ref_opt(const char *pattern, const char *prefix)
>  	clear_ref_exclusion(&ref_excludes);
>  }
>  
> +enum format_type {
> +	/* We would like a relative path. */
> +	FORMAT_RELATIVE,
> +	/* We would like a canonical absolute path. */
> +	FORMAT_CANONICAL,
> +	/* We would like the default behavior. */
> +	FORMAT_DEFAULT,
> +};
> +
> +enum default_type {
> +	/* Our default is a relative path. */
> +	DEFAULT_RELATIVE,
> +	/* Our default is a relative path if there's a shared root. */
> +	DEFAULT_RELATIVE_IF_SHARED,
> +	/* Our default is a canonical absolute path. */
> +	DEFAULT_CANONICAL,
> +	/* Our default is not to modify the item. */
> +	DEFAULT_UNMODIFIED,
> +};
> +
> +static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
> +{
> +	char *cwd = NULL;
> +	/*
> +	 * We don't ever produce a relative path if prefix is NULL, so set the
> +	 * prefix to the current directory so that we can produce a relative
> +	 * path whenever possible.  If we're using RELATIVE_IF_SHARED mode, then
> +	 * we want an absolute path unless the two share a common prefix, so don't
> +	 * set it in that case, since doing so causes a relative path to always
> +	 * be produced if possible.
> +	 */
> +	if (!prefix && (format != FORMAT_DEFAULT || def != DEFAULT_RELATIVE_IF_SHARED))
> +		prefix = cwd = xgetcwd();
> +	if (format == FORMAT_DEFAULT && def == DEFAULT_UNMODIFIED) {
> +		puts(path);
> +	} else if (format == FORMAT_RELATIVE ||
> +		  (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE) ||
> +		  (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE_IF_SHARED)) {
> +		struct strbuf buf = STRBUF_INIT;
> +		puts(relative_path(path, prefix, &buf));
> +		strbuf_reset(&buf);
> +	} else {
> +		char *real = real_pathdup(path, 1);
> +		puts(real);
> +		free(real);
> +	}
> +	free(cwd);
> +}
> +
>  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  {
>  	int i, as_is = 0, verify = 0, quiet = 0, revs_count = 0, type = 0;
> @@ -595,6 +644,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  	struct object_context unused;
>  	struct strbuf buf = STRBUF_INIT;
>  	const int hexsz = the_hash_algo->hexsz;
> +	enum format_type format = FORMAT_DEFAULT;
>  
>  	if (argc > 1 && !strcmp("--parseopt", argv[1]))
>  		return cmd_parseopt(argc - 1, argv + 1, prefix);
> @@ -650,8 +700,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  			if (!argv[i + 1])
>  				die("--git-path requires an argument");
>  			strbuf_reset(&buf);
> -			puts(relative_path(git_path("%s", argv[i + 1]),
> -					   prefix, &buf));
> +			print_path(git_path("%s", argv[i + 1]), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
>  			i++;
>  			continue;
>  		}
> @@ -683,6 +732,16 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  					show_file(arg, 0);
>  				continue;
>  			}
> +			if (opt_with_value(arg, "--path-format", &arg)) {
> +				if (!strcmp(arg, "absolute")) {
> +					format = FORMAT_CANONICAL;
> +				} else if (!strcmp(arg, "relative")) {
> +					format = FORMAT_RELATIVE;
> +				} else {
> +					die("unknown argument to --path-format: %s", arg);
> +				}
> +				continue;
> +			}
>  			if (!strcmp(arg, "--default")) {
>  				def = argv[++i];
>  				if (!def)
> @@ -803,7 +862,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  			if (!strcmp(arg, "--show-toplevel")) {
>  				const char *work_tree = get_git_work_tree();
>  				if (work_tree)
> -					puts(work_tree);
> +					print_path(work_tree, prefix, format, DEFAULT_CANONICAL);
>  				else
>  					die("this operation must be run in a work tree");
>  				continue;
> @@ -811,7 +870,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  			if (!strcmp(arg, "--show-superproject-working-tree")) {
>  				struct strbuf superproject = STRBUF_INIT;
>  				if (get_superproject_working_tree(&superproject))
> -					puts(superproject.buf);
> +					print_path(superproject.buf, prefix, format, DEFAULT_CANONICAL);
>  				strbuf_release(&superproject);
>  				continue;
>  			}
> @@ -846,16 +905,18 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  				const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
>  				char *cwd;
>  				int len;
> +				enum format_type wanted = format;
>  				if (arg[2] == 'g') {	/* --git-dir */
>  					if (gitdir) {
> -						puts(gitdir);
> +						print_path(gitdir, prefix, format, DEFAULT_UNMODIFIED);
>  						continue;
>  					}
>  					if (!prefix) {
> -						puts(".git");
> +						print_path(".git", prefix, format, DEFAULT_UNMODIFIED);
>  						continue;
>  					}
>  				} else {		/* --absolute-git-dir */
> +					wanted = FORMAT_CANONICAL;
>  					if (!gitdir && !prefix)
>  						gitdir = ".git";
>  					if (gitdir) {
> @@ -868,14 +929,14 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  				}
>  				cwd = xgetcwd();
>  				len = strlen(cwd);
> -				printf("%s%s.git\n", cwd, len && cwd[len-1] != '/' ? "/" : "");
> +				strbuf_reset(&buf);
> +				strbuf_addf(&buf, "%s%s.git", cwd, len && cwd[len-1] != '/' ? "/" : "");
>  				free(cwd);
> +				print_path(buf.buf, prefix, wanted, DEFAULT_CANONICAL);
>  				continue;
>  			}
>  			if (!strcmp(arg, "--git-common-dir")) {
> -				strbuf_reset(&buf);
> -				puts(relative_path(get_git_common_dir(),
> -						   prefix, &buf));
> +				print_path(get_git_common_dir(), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
>  				continue;
>  			}
>  			if (!strcmp(arg, "--is-inside-git-dir")) {
> @@ -905,8 +966,7 @@ int cmd_rev_parse(int argc, const char **argv, const char *prefix)
>  				if (the_index.split_index) {
>  					const struct object_id *oid = &the_index.split_index->base_oid;
>  					const char *path = git_path("sharedindex.%s", oid_to_hex(oid));
> -					strbuf_reset(&buf);
> -					puts(relative_path(path, prefix, &buf));
> +					print_path(path, prefix, format, DEFAULT_RELATIVE);
>  				}
>  				continue;
>  			}
> diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
> index 408b97d5af..6f3811d189 100755
> --- a/t/t1500-rev-parse.sh
> +++ b/t/t1500-rev-parse.sh
> @@ -3,6 +3,16 @@
>  test_description='test git rev-parse'
>  . ./test-lib.sh
>  
> +test_one () {
> +	dir="$1" &&
> +	expect="$2" &&
> +	shift &&
> +	shift &&
> +	echo "$expect" >expect &&
> +	git -C "$dir" rev-parse "$@" >actual

Broken && chain.

> +	test_cmp expect actual
> +}
> +
>  # usage: [options] label is-bare is-inside-git is-inside-work prefix git-dir absolute-git-dir
>  test_rev_parse () {
>  	d=
> @@ -60,7 +70,13 @@ ROOT=$(pwd)
>  
>  test_expect_success 'setup' '
>  	mkdir -p sub/dir work &&
> -	cp -R .git repo.git
> +	cp -R .git repo.git &&
> +	git checkout -b main &&
> +	test_commit abc &&
> +	git checkout -b side &&
> +	test_commit def &&
> +	git checkout main &&
> +	git worktree add worktree side
>  '
>  
>  test_rev_parse toplevel false false true '' .git "$ROOT/.git"
> @@ -88,6 +104,24 @@ test_rev_parse -C work -g ../repo.git -b t 'GIT_DIR=../repo.git, core.bare = tru
>  
>  test_rev_parse -C work -g ../repo.git -b u 'GIT_DIR=../repo.git, core.bare undefined' false false true ''
>  
> +test_expect_success 'rev-parse --path-format=absolute' '
> +	test_one "." "$ROOT/.git" --path-format=absolute --git-dir &&
> +	test_one "." "$ROOT/.git" --path-format=absolute --git-common-dir &&
> +	test_one "worktree" "$ROOT/.git/worktrees/worktree" --path-format=absolute --git-dir &&
> +	test_one "worktree" "$ROOT/.git" --path-format=absolute --git-common-dir &&
> +	test_one "." "$ROOT" --path-format=absolute --show-toplevel &&
> +	test_one "." "$ROOT/.git/objects" --path-format=absolute --git-path objects
> +'
> +
> +test_expect_success 'rev-parse --path-format=relative' '
> +	test_one "." ".git" --path-format=relative --git-dir &&
> +	test_one "." ".git" --path-format=relative --git-common-dir &&
> +	test_one "worktree" "../.git/worktrees/worktree" --path-format=relative --git-dir &&
> +	test_one "worktree" "../.git" --path-format=relative --git-common-dir &&
> +	test_one "." "./" --path-format=relative --show-toplevel &&
> +	test_one "." ".git/objects" --path-format=relative --git-path objects
> +'

I would like to see a test that demonstrates that '--absolute-git-dir'
is unaffected by '--path-format=relative', just to be sure.

There are some cases where this new option doesn't do what I would
expect:

  $ ./git -C Documentation/ rev-parse --git-dir --show-toplevel
  /home/szeder/src/git/.git
  /home/szeder/src/git
  $ ./git -C Documentation/ rev-parse --path-format=absolute --git-dir --show-toplevel
  /home/szeder/src/git/.git
  /home/szeder/src/git
  # So far so good, but:
  $ ./git -C Documentation/ rev-parse --path-format=relative --git-dir --show-toplevel
  /home/szeder/src/git/.git
  /home/szeder/src/git


  $ ls -l .git/foo
  ls: cannot access '.git/foo': No such file or directory
  $ ./git rev-parse --git-path foo
  .git/foo
  $ ./git rev-parse --path-format=relative --git-path foo
  .git/foo
  $ ./git rev-parse --path-format=absolute --git-path foo
  /home/szeder/src/git/.git/foo
  $ ./git rev-parse --git-path foo/bar
  .git/foo/bar
  $ ./git rev-parse --path-format=relative --git-path foo/bar
  .git/foo/bar
  # So far so good, but:
  $ ./git rev-parse --path-format=absolute --git-path foo/bar
  fatal: Invalid path '/home/szeder/src/git/.git/foo': No such file or directory
brian m. carlson Sept. 9, 2020, 10:23 p.m. UTC | #3
On 2020-09-09 at 14:51:14, SZEDER Gábor wrote:
> On Tue, Sep 08, 2020 at 06:50:17PM +0000, brian m. carlson wrote:
> > diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
> > index 19b12b6d43..6b95292559 100644
> > --- a/Documentation/git-rev-parse.txt
> > +++ b/Documentation/git-rev-parse.txt
> > @@ -208,6 +208,15 @@ Options for Files
> >  	Only the names of the variables are listed, not their value,
> >  	even if they are set.
> >  
> > +--path-format=(absolute|relative)::
> > +	Controls the behavior of certain other options following it on the command
> > +	line.
> 
> Does it affect only one subsequent such option on the command line, or
> all such options?  IOW, while standing in the top directory of the
> worktree, would the following command
> 
>   git rev-parse --path-format=absolute --git-dir --git-path foo --show-toplevel
> 
> print three absolute paths, or one absolute paths and two relative
> paths?
> 
> The wording here is not clear on this, the commit message doesn't
> mention it, and the tests added in this patch only check one such
> option, but looking at the code and doing some testing of my own I
> found that it affects all subsequent such options.

It affects all subsequent options.  Moreover, I believe it's possible to
switch in the middle of the command line if you want some things
relative and some absolute.  That seemed to be both the easiest solution
and the most flexible, so I went with it.  I'll add more tests for this
case and improve the commit message.

> > diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
> > index 408b97d5af..6f3811d189 100755
> > --- a/t/t1500-rev-parse.sh
> > +++ b/t/t1500-rev-parse.sh
> > @@ -3,6 +3,16 @@
> >  test_description='test git rev-parse'
> >  . ./test-lib.sh
> >  
> > +test_one () {
> > +	dir="$1" &&
> > +	expect="$2" &&
> > +	shift &&
> > +	shift &&
> > +	echo "$expect" >expect &&
> > +	git -C "$dir" rev-parse "$@" >actual
> 
> Broken && chain.

Thanks, will fix.

> > +	test_cmp expect actual
> > +}
> > +
> >  # usage: [options] label is-bare is-inside-git is-inside-work prefix git-dir absolute-git-dir
> >  test_rev_parse () {
> >  	d=
> > @@ -60,7 +70,13 @@ ROOT=$(pwd)
> >  
> >  test_expect_success 'setup' '
> >  	mkdir -p sub/dir work &&
> > -	cp -R .git repo.git
> > +	cp -R .git repo.git &&
> > +	git checkout -b main &&
> > +	test_commit abc &&
> > +	git checkout -b side &&
> > +	test_commit def &&
> > +	git checkout main &&
> > +	git worktree add worktree side
> >  '
> >  
> >  test_rev_parse toplevel false false true '' .git "$ROOT/.git"
> > @@ -88,6 +104,24 @@ test_rev_parse -C work -g ../repo.git -b t 'GIT_DIR=../repo.git, core.bare = tru
> >  
> >  test_rev_parse -C work -g ../repo.git -b u 'GIT_DIR=../repo.git, core.bare undefined' false false true ''
> >  
> > +test_expect_success 'rev-parse --path-format=absolute' '
> > +	test_one "." "$ROOT/.git" --path-format=absolute --git-dir &&
> > +	test_one "." "$ROOT/.git" --path-format=absolute --git-common-dir &&
> > +	test_one "worktree" "$ROOT/.git/worktrees/worktree" --path-format=absolute --git-dir &&
> > +	test_one "worktree" "$ROOT/.git" --path-format=absolute --git-common-dir &&
> > +	test_one "." "$ROOT" --path-format=absolute --show-toplevel &&
> > +	test_one "." "$ROOT/.git/objects" --path-format=absolute --git-path objects
> > +'
> > +
> > +test_expect_success 'rev-parse --path-format=relative' '
> > +	test_one "." ".git" --path-format=relative --git-dir &&
> > +	test_one "." ".git" --path-format=relative --git-common-dir &&
> > +	test_one "worktree" "../.git/worktrees/worktree" --path-format=relative --git-dir &&
> > +	test_one "worktree" "../.git" --path-format=relative --git-common-dir &&
> > +	test_one "." "./" --path-format=relative --show-toplevel &&
> > +	test_one "." ".git/objects" --path-format=relative --git-path objects
> > +'
> 
> I would like to see a test that demonstrates that '--absolute-git-dir'
> is unaffected by '--path-format=relative', just to be sure.
> 
> There are some cases where this new option doesn't do what I would
> expect:
> 
>   $ ./git -C Documentation/ rev-parse --git-dir --show-toplevel
>   /home/szeder/src/git/.git
>   /home/szeder/src/git
>   $ ./git -C Documentation/ rev-parse --path-format=absolute --git-dir --show-toplevel
>   /home/szeder/src/git/.git
>   /home/szeder/src/git
>   # So far so good, but:
>   $ ./git -C Documentation/ rev-parse --path-format=relative --git-dir --show-toplevel
>   /home/szeder/src/git/.git
>   /home/szeder/src/git
> 
> 
>   $ ls -l .git/foo
>   ls: cannot access '.git/foo': No such file or directory
>   $ ./git rev-parse --git-path foo
>   .git/foo
>   $ ./git rev-parse --path-format=relative --git-path foo
>   .git/foo
>   $ ./git rev-parse --path-format=absolute --git-path foo
>   /home/szeder/src/git/.git/foo
>   $ ./git rev-parse --git-path foo/bar
>   .git/foo/bar
>   $ ./git rev-parse --path-format=relative --git-path foo/bar
>   .git/foo/bar
>   # So far so good, but:
>   $ ./git rev-parse --path-format=absolute --git-path foo/bar
>   fatal: Invalid path '/home/szeder/src/git/.git/foo': No such file or directory

That's going to be a little tricky to fix, but I'll look into it.
brian m. carlson Sept. 9, 2020, 10:31 p.m. UTC | #4
On 2020-09-09 at 03:23:32, Johannes Schindelin wrote:
> On Tue, 8 Sep 2020, brian m. carlson wrote:
> 
> > git rev-parse has several options which print various paths.  Some of
> > these paths are printed relative to the current working directory, and
> > some are absolute.
> >
> > Normally, this is not a problem, but there are times when one wants
> > paths entirely in one format or another.  This can be done trivially if
> > the paths are canonical, but canonicalizing paths is not possible on
> > some shell scripting environments which lack realpath(1) and also in Go,
> > which lacks functions that properly canonicalize paths on Windows.
> >
> > To help out the scripter, let's provide an option which turns most of
> > the paths printed by git rev-parse to be either relative to the current
> > working directory or absolute and canonical.  Document which options are
> > affected and which are not so that users are not confused.
> >
> > This approach is cleaner and tidier than providing duplicates of
> > existing options which are either relative or absolute.
> >
> > Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> > ---
> 
> Nicely explained and implemented.

Thanks.

> > This impetus for this patch is Git LFS, which is written in Go.  Go
> > lacks a cross-platform way to canonicalize paths in the same way that
> > Git does, so when Git produces relative paths, such as in some cases
> > with --git-common-dir, we end up with problems when users are doing
> > things with unusual paths on Windows, such as when using SUBST or
> > OneDrive or various other edge cases.  Go upstream does not appear eager
> > to address this problem.
> >
> > The obvious solution to this is to have Git canonicalize all paths, and
> > rather than adding yet another one-off, I decided to implement a more
> > generic solution.  This can also be valuable for shell scripting
> > environments which lack realpath(1) (e.g. macOS, IIRC).
> >
> > Dscho CC'd for visibility into how this will work on Windows.
> 
> On Windows, at least in the version from git-for-windows/git,
> `strbuf_realpath()` uses the Win32 API function
> `GetFinalPathNameByHandleW()` to canonicalize paths (whenever possible),
> meaning that the `subst` issue you mentioned above will be addressed
> adequately.

Yeah, I looked into the function Git for Windows used, and it's also
used by Rust's std::path::canonicalize, so I think there's wide
agreement that this is the right function to be using here.

I think for _most_ of our cases, it matters a little less how we're
handling paths than that we're handling them consistently, but if we can
fix all of the cases, not just most, that's preferable.
SZEDER Gábor Sept. 10, 2020, 3:19 p.m. UTC | #5
On Wed, Sep 09, 2020 at 10:23:33PM +0000, brian m. carlson wrote:
> On 2020-09-09 at 14:51:14, SZEDER Gábor wrote:
> > On Tue, Sep 08, 2020 at 06:50:17PM +0000, brian m. carlson wrote:
> > > diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
> > > index 19b12b6d43..6b95292559 100644
> > > --- a/Documentation/git-rev-parse.txt
> > > +++ b/Documentation/git-rev-parse.txt
> > > @@ -208,6 +208,15 @@ Options for Files
> > >  	Only the names of the variables are listed, not their value,
> > >  	even if they are set.
> > >  
> > > +--path-format=(absolute|relative)::
> > > +	Controls the behavior of certain other options following it on the command
> > > +	line.
> > 
> > Does it affect only one subsequent such option on the command line, or
> > all such options?  IOW, while standing in the top directory of the
> > worktree, would the following command
> > 
> >   git rev-parse --path-format=absolute --git-dir --git-path foo --show-toplevel
> > 
> > print three absolute paths, or one absolute paths and two relative
> > paths?
> > 
> > The wording here is not clear on this, the commit message doesn't
> > mention it, and the tests added in this patch only check one such
> > option, but looking at the code and doing some testing of my own I
> > found that it affects all subsequent such options.
> 
> It affects all subsequent options.  Moreover, I believe it's possible to
> switch in the middle of the command line if you want some things
> relative and some absolute.  That seemed to be both the easiest solution
> and the most flexible, so I went with it.  I'll add more tests for this
> case and improve the commit message.

Well, on first sight it seems nice that scripts have to specify
'--path-format=absolute' only once when querying multiple paths at
once, though on second thought no prudent script would query multiple
paths at once, because they might contain newlines.

> >   $ ./git -C Documentation/ rev-parse --path-format=relative --git-dir --show-toplevel
> >   /home/szeder/src/git/.git
> >   /home/szeder/src/git

> >   $ ./git rev-parse --path-format=absolute --git-path foo/bar
> >   fatal: Invalid path '/home/szeder/src/git/.git/foo': No such file or directory
> 
> That's going to be a little tricky to fix, but I'll look into it.

Yeah, I think I found trouble that way, too.  I wonder whether
'--path-format=relative' is really worth having, though.  Clearly
there's a need for absolute paths, because getting relative paths
causes difficulties for some scripts; I described one such use case in
a2f5a87626 (rev-parse: add '--absolute-git-dir' option, 2017-02-03),
and you just ran into another with Git LFS.  However, is there really
a use case that requires relative paths, because an absolute path
would cause similar difficulties?  I couldn't come with any.

So perhaps it would be sufficient to introduce only
'--path-format=absolute' (or equivalent) for now, and add a relative
path variant only when there will be an actual compelling reason to do
so.  And that would save you from the pain of addressing these bugs
shown above.
brian m. carlson Sept. 11, 2020, 12:03 a.m. UTC | #6
On 2020-09-10 at 15:19:35, SZEDER Gábor wrote:
> Well, on first sight it seems nice that scripts have to specify
> '--path-format=absolute' only once when querying multiple paths at
> once, though on second thought no prudent script would query multiple
> paths at once, because they might contain newlines.

We do that in Git LFS at the cost of not handling paths with newlines
because calling out to Git multiple times is expensive, very especially
on Windows.  Of course, Windows doesn't allow newlines in paths, but it
seems silly to have two different code paths for the same thing,
especially on top of the different paths we have for versions of Git
with and without worktrees (and therefore --git-common-dir).

If you're writing a script, you're not expecting great performance (or
you wouldn't be writing shell), so it's less of a problem and you can
actually afford the cost of being a little more correct.

I think if we wanted to prudently handle paths with newlines, we'd want
to add a -z option to rev-parse (think about the case in shell where the
path ends with a trailing newline).  I may get around to that at some
point, but anyone is welcome to pick it up before me.

> Yeah, I think I found trouble that way, too.  I wonder whether
> '--path-format=relative' is really worth having, though.  Clearly
> there's a need for absolute paths, because getting relative paths
> causes difficulties for some scripts; I described one such use case in
> a2f5a87626 (rev-parse: add '--absolute-git-dir' option, 2017-02-03),
> and you just ran into another with Git LFS.  However, is there really
> a use case that requires relative paths, because an absolute path
> would cause similar difficulties?  I couldn't come with any.

The only case I can come up with is possibly wanting to know if a path
is within another path, or if it's outside, and doing that with relative
paths may be easier.

> So perhaps it would be sufficient to introduce only
> '--path-format=absolute' (or equivalent) for now, and add a relative
> path variant only when there will be an actual compelling reason to do
> so.  And that would save you from the pain of addressing these bugs
> shown above.

I'll consider it when I do my reroll, which will probably be this
weekend.
Jonathan Nieder Nov. 4, 2020, 10:16 p.m. UTC | #7
brian m. carlson wrote:

> This impetus for this patch is Git LFS, which is written in Go.  Go
> lacks a cross-platform way to canonicalize paths in the same way that
> Git does, so when Git produces relative paths, such as in some cases
> with --git-common-dir, we end up with problems when users are doing
> things with unusual paths on Windows, such as when using SUBST or
> OneDrive or various other edge cases.  Go upstream does not appear eager
> to address this problem.

Can you describe the user-facing symptom?  While reviewing
https://lore.kernel.org/git/20201009191511.267461-1-sandals@crustytoothpaste.net/
I'm trying to understand the motivation and I'm getting stuck at
trying to understand the basics of the problem being solved.

E.g. is this about being able to compare paths, being able to open
files, resolving symlinks, something else?

Thanks,
Jonathan
brian m. carlson Nov. 5, 2020, 3:11 a.m. UTC | #8
On 2020-11-04 at 22:16:59, Jonathan Nieder wrote:
> brian m. carlson wrote:
> 
> > This impetus for this patch is Git LFS, which is written in Go.  Go
> > lacks a cross-platform way to canonicalize paths in the same way that
> > Git does, so when Git produces relative paths, such as in some cases
> > with --git-common-dir, we end up with problems when users are doing
> > things with unusual paths on Windows, such as when using SUBST or
> > OneDrive or various other edge cases.  Go upstream does not appear eager
> > to address this problem.
> 
> Can you describe the user-facing symptom?  While reviewing
> https://lore.kernel.org/git/20201009191511.267461-1-sandals@crustytoothpaste.net/
> I'm trying to understand the motivation and I'm getting stuck at
> trying to understand the basics of the problem being solved.
> 
> E.g. is this about being able to compare paths, being able to open
> files, resolving symlinks, something else?

The goal is to resolve paths the way Git does and allow verify that a
path is within the repository.

Some paths, such as SUBST drives, are canonicalized by Git using
GetFinalPathNameByHandle, which is the standard way of canonicalizing
paths on Windows.

Go provides filepath.EvalSymlinks, which does not handle these paths,
and consequently, a check that a given path is within the repository (or
git directory) fails because Go sees the path as being on one drive
(after canonicalization) and Git sees the repository as being on
another.  This is in part because the .git directory we've canonicalized
in Go is totally different from the path that Git has for it, and
consequently they aren't comparable.

That's the user-facing symptom.

There are a variety of other Windows paths that are handled differently
as well beyond the SUBST paths.  I'm not an expert on Windows or Windows
paths, so I can't enumerate all the cases where one case works and the
other fails.  Ideally Go would learn to canonicalize paths the way the
system does using a cross-platform function.  I believe there's an issue
upstream to deal with this at https://github.com/golang/go/issues/42201.

This is also generally applicable for scripting, where realpath(1) is
not always available (e.g., on macOS), but mostly this is here to make
Windows work more nicely, since it has more complex path functionality.
Jonathan Nieder Nov. 6, 2020, 12:51 a.m. UTC | #9
brian m. carlson wrote:
> On 2020-11-04 at 22:16:59, Jonathan Nieder wrote:
> > brian m. carlson wrote:

>>> This impetus for this patch is Git LFS, which is written in Go.  Go
>>> lacks a cross-platform way to canonicalize paths in the same way that
>>> Git does, so when Git produces relative paths, such as in some cases
>>> with --git-common-dir, we end up with problems when users are doing
>>> things with unusual paths on Windows, such as when using SUBST
[...]
>> Can you describe the user-facing symptom?  While reviewing
>> https://lore.kernel.org/git/20201009191511.267461-1-sandals@crustytoothpaste.net/
>> I'm trying to understand the motivation and I'm getting stuck at
>> trying to understand the basics of the problem being solved.
[...]
> The goal is to resolve paths the way Git does and allow verify that a
> path is within the repository.

Ah, thank you.  So if I am understanding the above and [1] correctly,
this means:

- when a path is within a repository, converting it to a path relative
  to the repository root

- when a path is not within a repository, learning so

and that making relative paths with ../../ portion that exits the
repository is *not* an important part of this use case (though it
could be useful for other things).

[...]
> This is also generally applicable for scripting, where realpath(1) is
> not always available (e.g., on macOS), but mostly this is here to make
> Windows work more nicely, since it has more complex path functionality.

Thanks much.  I think this tells me enough to understand the series.

Sincerely,
Jonathan

[1] https://github.com/git-lfs/git-lfs/issues/4012 --- thanks to Emily
for the pointer in [2].
[2] https://lore.kernel.org/git/20201104230157.GH2774782@google.com/
brian m. carlson Nov. 6, 2020, 1:57 a.m. UTC | #10
On 2020-11-06 at 00:51:10, Jonathan Nieder wrote:
> Ah, thank you.  So if I am understanding the above and [1] correctly,
> this means:
> 
> - when a path is within a repository, converting it to a path relative
>   to the repository root
> 
> - when a path is not within a repository, learning so
> 
> and that making relative paths with ../../ portion that exits the
> repository is *not* an important part of this use case (though it
> could be useful for other things).

Correct.  I felt that adding an --absolute option would be less
beneficial than both an absolute and a relative option, so I implemented
a more generic solution, even though my use case didn't require it.

> Thanks much.  I think this tells me enough to understand the series.

Sure, thanks for asking.
diff mbox series

Patch

diff --git a/Documentation/git-rev-parse.txt b/Documentation/git-rev-parse.txt
index 19b12b6d43..6b95292559 100644
--- a/Documentation/git-rev-parse.txt
+++ b/Documentation/git-rev-parse.txt
@@ -208,6 +208,15 @@  Options for Files
 	Only the names of the variables are listed, not their value,
 	even if they are set.
 
+--path-format=(absolute|relative)::
+	Controls the behavior of certain other options following it on the command
+	line. If specified as absolute, the paths printed by those options will be
+	absolute and canonical. If specified as relative, the paths will be relative
+	to the current working directory if that is possible.  The default is option
+	specific.
+
+The following options are modified by `--path-format`:
+
 --git-dir::
 	Show `$GIT_DIR` if defined. Otherwise show the path to
 	the .git directory. The path shown, when relative, is
@@ -217,13 +226,42 @@  If `$GIT_DIR` is not defined and the current directory
 is not detected to lie in a Git repository or work tree
 print a message to stderr and exit with nonzero status.
 
+--git-common-dir::
+	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
+
+--resolve-git-dir <path>::
+	Check if <path> is a valid repository or a gitfile that
+	points at a valid repository, and print the location of the
+	repository.  If <path> is a gitfile then the resolved path
+	to the real repository is printed.
+
+--git-path <path>::
+	Resolve "$GIT_DIR/<path>" and takes other path relocation
+	variables such as $GIT_OBJECT_DIRECTORY,
+	$GIT_INDEX_FILE... into account. For example, if
+	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
+	--git-path objects/abc" returns /foo/bar/abc.
+
+--show-toplevel::
+	Show the (by default, absolute) path of the top-level directory
+	of the working tree. If there is no working tree, report an error.
+
+--show-superproject-working-tree::
+	Show the absolute path of the root of the superproject's
+	working tree (if exists) that uses the current repository as
+	its submodule.  Outputs nothing if the current repository is
+	not used as a submodule by any project.
+
+--shared-index-path::
+	Show the path to the shared index file in split index mode, or
+	empty if not in split-index mode.
+
+The following options are unaffected by `--path-format`:
+
 --absolute-git-dir::
 	Like `--git-dir`, but its output is always the canonicalized
 	absolute path.
 
---git-common-dir::
-	Show `$GIT_COMMON_DIR` if defined, else `$GIT_DIR`.
-
 --is-inside-git-dir::
 	When the current working directory is below the repository
 	directory print "true", otherwise "false".
@@ -238,19 +276,6 @@  print a message to stderr and exit with nonzero status.
 --is-shallow-repository::
 	When the repository is shallow print "true", otherwise "false".
 
---resolve-git-dir <path>::
-	Check if <path> is a valid repository or a gitfile that
-	points at a valid repository, and print the location of the
-	repository.  If <path> is a gitfile then the resolved path
-	to the real repository is printed.
-
---git-path <path>::
-	Resolve "$GIT_DIR/<path>" and takes other path relocation
-	variables such as $GIT_OBJECT_DIRECTORY,
-	$GIT_INDEX_FILE... into account. For example, if
-	$GIT_OBJECT_DIRECTORY is set to /foo/bar then "git rev-parse
-	--git-path objects/abc" returns /foo/bar/abc.
-
 --show-cdup::
 	When the command is invoked from a subdirectory, show the
 	path of the top-level directory relative to the current
@@ -261,20 +286,6 @@  print a message to stderr and exit with nonzero status.
 	path of the current directory relative to the top-level
 	directory.
 
---show-toplevel::
-	Show the absolute path of the top-level directory of the working
-	tree. If there is no working tree, report an error.
-
---show-superproject-working-tree::
-	Show the absolute path of the root of the superproject's
-	working tree (if exists) that uses the current repository as
-	its submodule.  Outputs nothing if the current repository is
-	not used as a submodule by any project.
-
---shared-index-path::
-	Show the path to the shared index file in split index mode, or
-	empty if not in split-index mode.
-
 --show-object-format[=(storage|input|output)]::
 	Show the object format (hash algorithm) used for the repository
 	for storage inside the `.git` directory, input, or output. For
diff --git a/builtin/rev-parse.c b/builtin/rev-parse.c
index 669dd2fd6f..311a56ba01 100644
--- a/builtin/rev-parse.c
+++ b/builtin/rev-parse.c
@@ -583,6 +583,55 @@  static void handle_ref_opt(const char *pattern, const char *prefix)
 	clear_ref_exclusion(&ref_excludes);
 }
 
+enum format_type {
+	/* We would like a relative path. */
+	FORMAT_RELATIVE,
+	/* We would like a canonical absolute path. */
+	FORMAT_CANONICAL,
+	/* We would like the default behavior. */
+	FORMAT_DEFAULT,
+};
+
+enum default_type {
+	/* Our default is a relative path. */
+	DEFAULT_RELATIVE,
+	/* Our default is a relative path if there's a shared root. */
+	DEFAULT_RELATIVE_IF_SHARED,
+	/* Our default is a canonical absolute path. */
+	DEFAULT_CANONICAL,
+	/* Our default is not to modify the item. */
+	DEFAULT_UNMODIFIED,
+};
+
+static void print_path(const char *path, const char *prefix, enum format_type format, enum default_type def)
+{
+	char *cwd = NULL;
+	/*
+	 * We don't ever produce a relative path if prefix is NULL, so set the
+	 * prefix to the current directory so that we can produce a relative
+	 * path whenever possible.  If we're using RELATIVE_IF_SHARED mode, then
+	 * we want an absolute path unless the two share a common prefix, so don't
+	 * set it in that case, since doing so causes a relative path to always
+	 * be produced if possible.
+	 */
+	if (!prefix && (format != FORMAT_DEFAULT || def != DEFAULT_RELATIVE_IF_SHARED))
+		prefix = cwd = xgetcwd();
+	if (format == FORMAT_DEFAULT && def == DEFAULT_UNMODIFIED) {
+		puts(path);
+	} else if (format == FORMAT_RELATIVE ||
+		  (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE) ||
+		  (format == FORMAT_DEFAULT && def == DEFAULT_RELATIVE_IF_SHARED)) {
+		struct strbuf buf = STRBUF_INIT;
+		puts(relative_path(path, prefix, &buf));
+		strbuf_reset(&buf);
+	} else {
+		char *real = real_pathdup(path, 1);
+		puts(real);
+		free(real);
+	}
+	free(cwd);
+}
+
 int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 {
 	int i, as_is = 0, verify = 0, quiet = 0, revs_count = 0, type = 0;
@@ -595,6 +644,7 @@  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 	struct object_context unused;
 	struct strbuf buf = STRBUF_INIT;
 	const int hexsz = the_hash_algo->hexsz;
+	enum format_type format = FORMAT_DEFAULT;
 
 	if (argc > 1 && !strcmp("--parseopt", argv[1]))
 		return cmd_parseopt(argc - 1, argv + 1, prefix);
@@ -650,8 +700,7 @@  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			if (!argv[i + 1])
 				die("--git-path requires an argument");
 			strbuf_reset(&buf);
-			puts(relative_path(git_path("%s", argv[i + 1]),
-					   prefix, &buf));
+			print_path(git_path("%s", argv[i + 1]), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
 			i++;
 			continue;
 		}
@@ -683,6 +732,16 @@  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 					show_file(arg, 0);
 				continue;
 			}
+			if (opt_with_value(arg, "--path-format", &arg)) {
+				if (!strcmp(arg, "absolute")) {
+					format = FORMAT_CANONICAL;
+				} else if (!strcmp(arg, "relative")) {
+					format = FORMAT_RELATIVE;
+				} else {
+					die("unknown argument to --path-format: %s", arg);
+				}
+				continue;
+			}
 			if (!strcmp(arg, "--default")) {
 				def = argv[++i];
 				if (!def)
@@ -803,7 +862,7 @@  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			if (!strcmp(arg, "--show-toplevel")) {
 				const char *work_tree = get_git_work_tree();
 				if (work_tree)
-					puts(work_tree);
+					print_path(work_tree, prefix, format, DEFAULT_CANONICAL);
 				else
 					die("this operation must be run in a work tree");
 				continue;
@@ -811,7 +870,7 @@  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 			if (!strcmp(arg, "--show-superproject-working-tree")) {
 				struct strbuf superproject = STRBUF_INIT;
 				if (get_superproject_working_tree(&superproject))
-					puts(superproject.buf);
+					print_path(superproject.buf, prefix, format, DEFAULT_CANONICAL);
 				strbuf_release(&superproject);
 				continue;
 			}
@@ -846,16 +905,18 @@  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				const char *gitdir = getenv(GIT_DIR_ENVIRONMENT);
 				char *cwd;
 				int len;
+				enum format_type wanted = format;
 				if (arg[2] == 'g') {	/* --git-dir */
 					if (gitdir) {
-						puts(gitdir);
+						print_path(gitdir, prefix, format, DEFAULT_UNMODIFIED);
 						continue;
 					}
 					if (!prefix) {
-						puts(".git");
+						print_path(".git", prefix, format, DEFAULT_UNMODIFIED);
 						continue;
 					}
 				} else {		/* --absolute-git-dir */
+					wanted = FORMAT_CANONICAL;
 					if (!gitdir && !prefix)
 						gitdir = ".git";
 					if (gitdir) {
@@ -868,14 +929,14 @@  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				}
 				cwd = xgetcwd();
 				len = strlen(cwd);
-				printf("%s%s.git\n", cwd, len && cwd[len-1] != '/' ? "/" : "");
+				strbuf_reset(&buf);
+				strbuf_addf(&buf, "%s%s.git", cwd, len && cwd[len-1] != '/' ? "/" : "");
 				free(cwd);
+				print_path(buf.buf, prefix, wanted, DEFAULT_CANONICAL);
 				continue;
 			}
 			if (!strcmp(arg, "--git-common-dir")) {
-				strbuf_reset(&buf);
-				puts(relative_path(get_git_common_dir(),
-						   prefix, &buf));
+				print_path(get_git_common_dir(), prefix, format, DEFAULT_RELATIVE_IF_SHARED);
 				continue;
 			}
 			if (!strcmp(arg, "--is-inside-git-dir")) {
@@ -905,8 +966,7 @@  int cmd_rev_parse(int argc, const char **argv, const char *prefix)
 				if (the_index.split_index) {
 					const struct object_id *oid = &the_index.split_index->base_oid;
 					const char *path = git_path("sharedindex.%s", oid_to_hex(oid));
-					strbuf_reset(&buf);
-					puts(relative_path(path, prefix, &buf));
+					print_path(path, prefix, format, DEFAULT_RELATIVE);
 				}
 				continue;
 			}
diff --git a/t/t1500-rev-parse.sh b/t/t1500-rev-parse.sh
index 408b97d5af..6f3811d189 100755
--- a/t/t1500-rev-parse.sh
+++ b/t/t1500-rev-parse.sh
@@ -3,6 +3,16 @@ 
 test_description='test git rev-parse'
 . ./test-lib.sh
 
+test_one () {
+	dir="$1" &&
+	expect="$2" &&
+	shift &&
+	shift &&
+	echo "$expect" >expect &&
+	git -C "$dir" rev-parse "$@" >actual
+	test_cmp expect actual
+}
+
 # usage: [options] label is-bare is-inside-git is-inside-work prefix git-dir absolute-git-dir
 test_rev_parse () {
 	d=
@@ -60,7 +70,13 @@  ROOT=$(pwd)
 
 test_expect_success 'setup' '
 	mkdir -p sub/dir work &&
-	cp -R .git repo.git
+	cp -R .git repo.git &&
+	git checkout -b main &&
+	test_commit abc &&
+	git checkout -b side &&
+	test_commit def &&
+	git checkout main &&
+	git worktree add worktree side
 '
 
 test_rev_parse toplevel false false true '' .git "$ROOT/.git"
@@ -88,6 +104,24 @@  test_rev_parse -C work -g ../repo.git -b t 'GIT_DIR=../repo.git, core.bare = tru
 
 test_rev_parse -C work -g ../repo.git -b u 'GIT_DIR=../repo.git, core.bare undefined' false false true ''
 
+test_expect_success 'rev-parse --path-format=absolute' '
+	test_one "." "$ROOT/.git" --path-format=absolute --git-dir &&
+	test_one "." "$ROOT/.git" --path-format=absolute --git-common-dir &&
+	test_one "worktree" "$ROOT/.git/worktrees/worktree" --path-format=absolute --git-dir &&
+	test_one "worktree" "$ROOT/.git" --path-format=absolute --git-common-dir &&
+	test_one "." "$ROOT" --path-format=absolute --show-toplevel &&
+	test_one "." "$ROOT/.git/objects" --path-format=absolute --git-path objects
+'
+
+test_expect_success 'rev-parse --path-format=relative' '
+	test_one "." ".git" --path-format=relative --git-dir &&
+	test_one "." ".git" --path-format=relative --git-common-dir &&
+	test_one "worktree" "../.git/worktrees/worktree" --path-format=relative --git-dir &&
+	test_one "worktree" "../.git" --path-format=relative --git-common-dir &&
+	test_one "." "./" --path-format=relative --show-toplevel &&
+	test_one "." ".git/objects" --path-format=relative --git-path objects
+'
+
 test_expect_success 'git-common-dir from worktree root' '
 	echo .git >expect &&
 	git rev-parse --git-common-dir >actual &&