diff mbox series

[1/2] cat-file: configurable number of symlink resolutions

Message ID cbf38c7281de33289f622c9926c75744323311af.1718615028.git.gitgitgadget@gmail.com (mailing list archive)
State New
Headers show
Series Symlink resolutions: limits and return modes | expand

Commit Message

Miguel Ángel Pastor Olivar June 17, 2024, 9:03 a.m. UTC
From: =?UTF-8?q?Miguel=20=C3=81ngel=20Pastor=20Olivar?= <migue@github.com>

Sometimes, it can be useful to limit the number of symlink resolutions
performed while looking for a tree entry.

The goal is to provide the ability to resolve up to a particular depth,
instead of reaching the end of the link chain.

The current code already provides a limit to the maximum number of
resolutions that can be performed
(GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS). This patch introduces a new
config setting to make the previous property configurable. No logical
changes are introduced in this patch

Signed-off-by: Miguel Ángel Pastor Olivar <migue@github.com>
---
 Documentation/config/core.txt |  5 +++++
 config.c                      |  5 +++++
 environment.c                 |  1 +
 environment.h                 |  1 +
 t/t1006-cat-file.sh           | 17 +++++++++++++++++
 tree-walk.c                   |  7 ++++++-
 6 files changed, 35 insertions(+), 1 deletion(-)

Comments

Junio C Hamano June 17, 2024, 7:33 p.m. UTC | #1
"Miguel Ángel Pastor Olivar via GitGitGadget"
<gitgitgadget@gmail.com> writes:

> diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
> index 93d65e1dfd2..ca2d1eede52 100644
> --- a/Documentation/config/core.txt
> +++ b/Documentation/config/core.txt
> @@ -757,3 +757,8 @@ core.maxTreeDepth::
>  	tree (e.g., "a/b/cde/f" has a depth of 4). This is a fail-safe
>  	to allow Git to abort cleanly, and should not generally need to
>  	be adjusted. The default is 4096.
> +
> +core.maxSymlinkDepth::
> +	The maximum number of symlinks Git is willing to resolve while
> +	looking for a tree entry.
> +	The default is GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS.
> \ No newline at end of file

Style: please do not end our text files with an incomplete line.

Regarding the patch contents, this is an end-user facing document.
How would they learn what the actual value is?

Is there a "valid range" the users are allowed to set this value to?
If so, what is the range?  What do users get when they set it outside
the allowed range?  Do they get warned?  Do they get die()?  Is the
value silently ignored?

If there is no upper limit for the "valid range", how does a user
set it to "infinity", and what's the downside of doing so?  What
happens when the user sets it to 0, or a negative value, if there is
no lower limit for the "valid range"?  The questions in this
paragraph your updated documentation text do not have to answer if
your "valid range" does have both upper and lower limit, but the
documentation must answer questions in the previous paragraph.

> diff --git a/config.c b/config.c
> index abce05b7744..d69e9a3ae6b 100644
> --- a/config.c
> +++ b/config.c
> @@ -1682,6 +1682,11 @@ static int git_default_core_config(const char *var, const char *value,
>  		return 0;
>  	}
>  
> +	if (!strcmp(var, "core.maxsymlinkdepth")) {
> +		max_symlink_depth = git_config_int(var, value, ctx->kvi);
> +		return 0;
> +	}
> +
>  	/* Add other config variables here and to Documentation/config.txt. */
>  	return platform_core_config(var, value, ctx, cb);
>  }
> diff --git a/environment.c b/environment.c
> index 701d5151354..6d7a5001eb1 100644
> --- a/environment.c
> +++ b/environment.c
> @@ -95,6 +95,7 @@ int max_allowed_tree_depth =
>  #else
>  	2048;
>  #endif
> +int max_symlink_depth = -1;

Why set it to -1 here, instead of initializing it to the
GET_TREE_ENTRY_FOLLOW_SYMLINKS?  By introducing a configuration
variable (which by the way I am not convinced is necessarily a good
idea to begin with), you are surfacing that built-in default value
as a more prominent thing, not hidden away in a little corner of
tree-walk.c implementation detail.  If you do define a "valid range
of values", the code that parses core.maxsymlinkdepth in config.c
may want to learn what the value of GET_TREE_ENTRY_FOLLOW_SYMLINKS
is, which means the symbol may need to be visible in some common
header file anyway.

By the way, this is not a new problem this patch introduces, as the
default GET_TREE_ENTRY_FOLLOW_SYMLINKS came from 275721c2
(tree-walk: learn get_tree_entry_follow_symlinks, 2015-05-20), but I
wonder if the default number should somehow be aligned with the
other upper limit, SYMREF_MAXDEPTH for a symbolic ref pointing at
another symbolic ref pointing at yet another ...

> +test_expect_success 'git cat-file --batch --follow-symlink stop resolving symlinks' '
> +	printf "loop 22\nHEAD:link-to-symlink-3\n">expect &&
> +	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=1 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&

Style: a redirection operator needs a single SP before it and no SP
between it and its target, i.e.

	printf "loop 22..." >expect &&
	printf "HEAD:link ..." |
        git ... cat-file ... >actual &&

Also fold overly long line after "|" pipeline.

> diff --git a/tree-walk.c b/tree-walk.c
> index 6565d9ad993..3ec2302309e 100644
> --- a/tree-walk.c
> +++ b/tree-walk.c
> @@ -664,7 +664,12 @@ enum get_oid_result get_tree_entry_follow_symlinks(struct repository *r,
>  	struct object_id current_tree_oid;
>  	struct strbuf namebuf = STRBUF_INIT;
>  	struct tree_desc t;
> -	int follows_remaining = GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;
> +	int follows_remaining =
> +		max_symlink_depth > -1 &&
> +				max_symlink_depth <=
> +					GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS ?
> +			max_symlink_depth :
> +			GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;

Strange indentation.

If you range-limit at the place the configuration was parsed, you do
not have to do any of this here, but if you insist hiding
GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS from others (yet still use
it in the end-user facing documentation???), then

	int follows_remaining =
		(-1 < max_symlink_depth &&
		 max_symlink_depth <= GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS)
		? max_symlink_depth
		: GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;

or perhaps a lot easier to read form, i.e.

	int follows_remaining = max_symlink_depth;

        if (follows_remaining < -1 ||
            GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS < follows_remaining)
		follows_remaining = GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;

>  	init_tree_desc(&t, NULL, NULL, 0UL);
>  	strbuf_addstr(&namebuf, name);


Thanks.
diff mbox series

Patch

diff --git a/Documentation/config/core.txt b/Documentation/config/core.txt
index 93d65e1dfd2..ca2d1eede52 100644
--- a/Documentation/config/core.txt
+++ b/Documentation/config/core.txt
@@ -757,3 +757,8 @@  core.maxTreeDepth::
 	tree (e.g., "a/b/cde/f" has a depth of 4). This is a fail-safe
 	to allow Git to abort cleanly, and should not generally need to
 	be adjusted. The default is 4096.
+
+core.maxSymlinkDepth::
+	The maximum number of symlinks Git is willing to resolve while
+	looking for a tree entry.
+	The default is GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS.
\ No newline at end of file
diff --git a/config.c b/config.c
index abce05b7744..d69e9a3ae6b 100644
--- a/config.c
+++ b/config.c
@@ -1682,6 +1682,11 @@  static int git_default_core_config(const char *var, const char *value,
 		return 0;
 	}
 
+	if (!strcmp(var, "core.maxsymlinkdepth")) {
+		max_symlink_depth = git_config_int(var, value, ctx->kvi);
+		return 0;
+	}
+
 	/* Add other config variables here and to Documentation/config.txt. */
 	return platform_core_config(var, value, ctx, cb);
 }
diff --git a/environment.c b/environment.c
index 701d5151354..6d7a5001eb1 100644
--- a/environment.c
+++ b/environment.c
@@ -95,6 +95,7 @@  int max_allowed_tree_depth =
 #else
 	2048;
 #endif
+int max_symlink_depth = -1;
 
 #ifndef PROTECT_HFS_DEFAULT
 #define PROTECT_HFS_DEFAULT 0
diff --git a/environment.h b/environment.h
index e9f01d4d11c..ea39c2887b1 100644
--- a/environment.h
+++ b/environment.h
@@ -141,6 +141,7 @@  extern size_t delta_base_cache_limit;
 extern unsigned long big_file_threshold;
 extern unsigned long pack_size_limit_cfg;
 extern int max_allowed_tree_depth;
+extern int max_symlink_depth;
 
 /*
  * Accessors for the core.sharedrepository config which lazy-load the value
diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh
index e12b2219721..fd7ab1d1eff 100755
--- a/t/t1006-cat-file.sh
+++ b/t/t1006-cat-file.sh
@@ -878,6 +878,9 @@  test_expect_success 'cat-file -t and -s on corrupt loose object' '
 test_expect_success 'prep for symlink tests' '
 	echo_without_newline "$hello_content" >morx &&
 	test_ln_s_add morx same-dir-link &&
+	test_ln_s_add same-dir-link link-to-symlink-1 &&
+	test_ln_s_add link-to-symlink-1 link-to-symlink-2 &&
+	test_ln_s_add link-to-symlink-2 link-to-symlink-3 &&
 	test_ln_s_add dir link-to-dir &&
 	test_ln_s_add ../fleem out-of-repo-link &&
 	test_ln_s_add .. out-of-repo-link-dir &&
@@ -1096,6 +1099,20 @@  test_expect_success 'git cat-file --batch --follow-symlink returns correct sha a
 	test_cmp expect actual
 '
 
+test_expect_success 'git cat-file --batch --follow-symlink stop resolving symlinks' '
+	printf "loop 22\nHEAD:link-to-symlink-3\n">expect &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=1 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=2 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=3 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual &&
+	oid=$(git rev-parse HEAD:morx) &&
+	printf "${oid} blob 11\nHello World\n" >expect &&
+	printf 'HEAD:link-to-symlink-3' | git -c core.maxsymlinkdepth=4 cat-file --batch="%(objectname) %(objecttype) %(objectsize)" --follow-symlinks > actual &&
+	test_cmp expect actual
+'
+
 test_expect_success 'cat-file --batch-all-objects shows all objects' '
 	# make new repos so we know the full set of objects; we will
 	# also make sure that there are some packed and some loose
diff --git a/tree-walk.c b/tree-walk.c
index 6565d9ad993..3ec2302309e 100644
--- a/tree-walk.c
+++ b/tree-walk.c
@@ -664,7 +664,12 @@  enum get_oid_result get_tree_entry_follow_symlinks(struct repository *r,
 	struct object_id current_tree_oid;
 	struct strbuf namebuf = STRBUF_INIT;
 	struct tree_desc t;
-	int follows_remaining = GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;
+	int follows_remaining =
+		max_symlink_depth > -1 &&
+				max_symlink_depth <=
+					GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS ?
+			max_symlink_depth :
+			GET_TREE_ENTRY_FOLLOW_SYMLINKS_MAX_LINKS;
 
 	init_tree_desc(&t, NULL, NULL, 0UL);
 	strbuf_addstr(&namebuf, name);