[v2,0/4] add ref content check for files backend

Message ID Zs348uXMBdCuwF-2@ArchLinux (mailing list archive)

Message

shejialuo Aug. 27, 2024, 4:04 p.m. UTC
Hi All:

This new version addresses the following review comments:

1. Following Junio's advice, just use "{0}" to zero-initialize the
"fsck_ref_report" structure. This is handled in [PATCH v2 1/4].
2. Following Patrick's advice, use "strrchr" instead of a loop to make
the code cleaner in [PATCH v2 3/4].
3. Use "goto" to reduce indentation.
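
As a rough illustration of items 1 and 2, here is a standalone sketch. It
is not the code from the patches: "ref_report_sketch" is a simplified
stand-in for "fsck_ref_report" (only the three field names come from this
series), and "new_report" and "missing_final_newline" are hypothetical
helper names.

```c
#include <string.h>

/*
 * Simplified stand-in for git's "fsck_ref_report". The real "oid" field
 * has a git-specific type; "const void *" is used here as a placeholder.
 */
struct ref_report_sketch {
	const char *path;
	const void *oid;
	const char *referent;
};

/* Item 1: "{0}" leaves every field zero-initialized (NULL pointers). */
struct ref_report_sketch new_report(void)
{
	struct ref_report_sketch report = {0};
	return report;
}

/*
 * Item 2: find the last newline with strrchr() instead of looping.
 * Returns 1 when there is no newline at all, or when the last newline
 * is not the final character of the content; returns 0 otherwise.
 */
int missing_final_newline(const char *content)
{
	const char *newline_pos = strrchr(content, '\n');

	return !newline_pos || *(newline_pos + 1) != '\0';
}
```

With this helper, "ref: refs/heads/master\n" is fine, while content with no
newline, or with anything after the last newline, is flagged.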

However, the most important question for this series is which fsck
message type to choose. I have recorded the reasoning in the commit
message, but I also want to explain the motivation in the cover letter
to make it easier for reviewers to follow.

During the review of the first version, Junio thought we should use
"FSCK_INFO" and Patrick thought we should use "FSCK_WARN". That raised
a question: what is the difference between "FSCK_INFO" and "FSCK_WARN",
given that the "fsck.c::fsck_vreport" function converts "FSCK_INFO" to
"FSCK_WARN" as follows:

    static int fsck_vreport(...)
    {
        enum fsck_msg_type msg_type = fsck_msg_type(msg_id, options);

        if (msg_type == FSCK_FATAL)
            msg_type = FSCK_ERROR;
        if (msg_type == FSCK_INFO)
            msg_type = FSCK_WARN;
        ...
    }

So I went back through the history. The fsck message types were first
set up in f27d05b170 (fsck: allow upgrading fsck warnings to errors,
2015-06-22):

  https://lore.kernel.org/git/cover.1418055173.git.johannes.schindelin@gmx.de/

Now I understand why we need "FSCK_INFO". When the "strict" field in
"fsck_options" is set, all fsck warnings become fsck errors, but fsck
infos do not. For example, this change supports my understanding:
4dd3b045f5 (fsck: downgrade tree badFilemode to "info", 2022-08-10).

As you can see, this restriction makes the code safer. So I agree with
Junio: for now, we should use "FSCK_INFO" for trailing garbage and for
ref content that ends without a newline.
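
To make the difference concrete, here is a small standalone model of the
behaviour described above. It is a sketch under stated assumptions, not
git's code: the "sketch_" names are hypothetical, and it assumes that the
"strict" upgrade is applied before the two conversions quoted from
"fsck_vreport".

```c
/* Hypothetical stand-ins for the fsck message types discussed above. */
enum sketch_msg_type {
	SKETCH_FATAL,
	SKETCH_ERROR,
	SKETCH_WARN,
	SKETCH_INFO
};

/*
 * Step 1 (assumed ordering): with "strict" set, a message whose default
 * type is WARN is upgraded to ERROR. INFO is left alone.
 */
enum sketch_msg_type sketch_resolve(enum sketch_msg_type def, int strict)
{
	if (strict && def == SKETCH_WARN)
		return SKETCH_ERROR;
	return def;
}

/* Step 2: the two conversions quoted from "fsck_vreport". */
enum sketch_msg_type sketch_report_as(enum sketch_msg_type type)
{
	if (type == SKETCH_FATAL)
		type = SKETCH_ERROR;
	if (type == SKETCH_INFO)
		type = SKETCH_WARN;
	return type;
}

/* What the user finally sees for a message with default type "def". */
enum sketch_msg_type sketch_effective(enum sketch_msg_type def, int strict)
{
	return sketch_report_as(sketch_resolve(def, strict));
}
```

In this model a WARN message becomes an error under "strict", while an
INFO message is still shown as a warning. That is the property wanted for
"refMissingNewline" and "trailingRefContent".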

However, we should report fsck errors in the following two situations,
because "git-fsck(1)" already reports errors for them when it implicitly
checks the consistency of the ref database:

1. "parse_loose_ref_contents" fails.
2. The symref content is bad (cannot be parsed).

Thanks,
Jialuo

shejialuo (4):
  ref: initialize "fsck_ref_report" with zero
  ref: add regular ref content check for files backend
  ref: add symbolic ref content check for files backend
  ref: add symlink ref check for files backend

 Documentation/fsck-msgids.txt |  12 +++
 fsck.h                        |   4 +
 refs.c                        |   2 +-
 refs/files-backend.c          | 179 +++++++++++++++++++++++++++++++-
 refs/refs-internal.h          |   2 +-
 t/t0602-reffiles-fsck.sh      | 185 ++++++++++++++++++++++++++++++++++
 6 files changed, 379 insertions(+), 5 deletions(-)

Range-diff against v1:
1:  9ed3026ac5 ! 1:  0367904c81 fsck: introduce "FSCK_REF_REPORT_DEFAULT" macro
    @@ Metadata
     Author: shejialuo <shejialuo@gmail.com>
     
      ## Commit message ##
    -    fsck: introduce "FSCK_REF_REPORT_DEFAULT" macro
    +    ref: initialize "fsck_ref_report" with zero
     
         In "fsck.c::fsck_refs_error_function", we need to tell whether "oid" and
         "referent" is NULL. So, we need to always initialize these parameters to
         NULL instead of letting them point to anywhere when creating a new
         "fsck_ref_report" structure.
     
    -    In order to conveniently create a new "fsck_ref_report", add a new macro
    -    "FSCK_REF_REPORT_DEFAULT".
    +    The original code explicitly specifies the ".path" field to initialize
    +    the "fsck_ref_report" structure. However, it introduces confusion how we
    +    initialize the other fields. In order to avoid this, initialize the
    +    "fsck_ref_report" with zero to make clear that everything in
    +    "fsck_ref_report" is zero initialized.
     
         Mentored-by: Patrick Steinhardt <ps@pks.im>
         Mentored-by: Karthik Nayak <karthik.188@gmail.com>
         Signed-off-by: shejialuo <shejialuo@gmail.com>
     
    - ## fsck.h ##
    -@@ fsck.h: struct fsck_ref_report {
    - 	const char *referent;
    - };
    - 
    -+#define FSCK_REF_REPORT_DEFAULT { \
    -+	.path = NULL, \
    -+	.oid = NULL, \
    -+	.referent = NULL, \
    -+}
    -+
    - struct fsck_options {
    - 	fsck_walk_func walk;
    - 	fsck_error error_func;
    -
      ## refs/files-backend.c ##
     @@ refs/files-backend.c: static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
      		goto cleanup;
      
      	if (check_refname_format(iter->basename, REFNAME_ALLOW_ONELEVEL)) {
     -		struct fsck_ref_report report = { .path = NULL };
    -+		struct fsck_ref_report report = FSCK_REF_REPORT_DEFAULT;
    ++		struct fsck_ref_report report = {0};
      
      		strbuf_addf(&sb, "%s/%s", refs_check_dir, iter->relative_path);
      		report.path = sb.buf;
2:  714284cf2b ! 2:  7b6f4145cd ref: add regular ref content check for files backend
    @@ Metadata
      ## Commit message ##
         ref: add regular ref content check for files backend
     
    -    We implicitly reply on "git-fsck(1)" to check the consistency of regular
    -    refs. However, when parsing the regular refs for files backend, we allow
    -    the ref content to end with no newline or contain some garbages. We
    -    should warn the user about above situations.
    +    We implicitly rely on "git-fsck(1)" to check the consistency of regular
    +    refs. However, when parsing the regular refs for files backend by using
    +    "files-backend.c::parse_loose_ref_contents", we allow the ref content to
    +    be end with no newline or contain some garbages.
     
    -    In order to provide above functionality, enhance the "git-refs verify"
    -    command by adding consistency check for regular refs for files backend.
    +    It may seem that we should report an error or warn fsck message to the
    +    user about above situations. However, there may be some third-party
    +    tools customizing the content of refs. We should not report an error
    +    fsck message.
     
    -    Add the following three fsck messages to represent the above situations:
    +    And we cannot either report a warn fsck message to the user. This is
    +    because for "git-receive-pack(1)" and "git-fetch-pack(1)", they will
    +    parse the fsck message type and check the message type by
    +    "fsck.c::is_valid_msg_type". Only the fsck infos are not valid. If we
    +    make the fsck message type to be warn, the user could upgrade the fsck
    +    warnings to errors. And the user can also set the "strict" field in
    +    "fsck_options" to upgrade the fsck warnings to errors.
     
    -    1. "badRefContent(ERROR)": A ref has a bad content.
    -    2. "refMissingNewline(WARN)": A valid ref does not end with newline.
    -    3. "trailingRefContent(WARN)": A ref has trailing contents.
    +    We should not allow the user to upgrade the fsck warnings to errors. It
    +    might cause compatibility issue which will break the legacy repository.
    +    So we add the following two fsck infos to represent the situation where
    +    the ref content ends without newline or has garbages:
    +
    +    1. "refMissingNewline(INFO)": A valid ref does not end with newline.
    +    2. "trailingRefContent(INFO)": A ref has trailing contents.
    +
    +    In "fsck.c::fsck_vreport", we will convert "FSCK_INFO" to "FSCK_WARN",
    +    and we can still warn the user about these situations when using
    +    "git-refs verify" without introducing compatibility issue.
    +
    +    In current "git-fsck(1)", it will report an error when the ref content
    +    is bad, so we should following this to report an error to the user when
    +    "parse_loose_ref_contents" fails. And we add a new fsck error message
    +    called "badRefContent(ERROR)" to represent that a ref has a bad content.
     
         In order to tell whether the ref has trailing content, add a new
         parameter "trailing" to "parse_loose_ref_contents". Then introduce a new
    -    function "files_fsck_refs_content" to check the regular refs.
    +    function "files_fsck_refs_content" to check the regular refs to enhance
    +    the "git-refs verify".
     
         Mentored-by: Patrick Steinhardt <ps@pks.im>
         Mentored-by: Karthik Nayak <karthik.188@gmail.com>
    @@ Documentation/fsck-msgids.txt
      	(WARN) Tree contains entries pointing to a null sha1.
      
     +`refMissingNewline`::
    -+	(WARN) A valid ref does not end with newline.
    ++	(INFO) A valid ref does not end with newline.
     +
     +`trailingRefContent`::
    -+	(WARN) A ref has trailing contents.
    ++	(INFO) A ref has trailing contents.
     +
      `treeNotSorted`::
      	(ERROR) A tree is not properly sorted.
    @@ fsck.h: enum fsck_msg_type {
      	FUNC(BAD_REF_NAME, ERROR) \
      	FUNC(BAD_TIMEZONE, ERROR) \
     @@ fsck.h: enum fsck_msg_type {
    - 	FUNC(HAS_DOTDOT, WARN) \
    - 	FUNC(HAS_DOTGIT, WARN) \
    - 	FUNC(NULL_SHA1, WARN) \
    -+	FUNC(REF_MISSING_NEWLINE, WARN) \
    -+	FUNC(TRAILING_REF_CONTENT, WARN) \
    - 	FUNC(ZERO_PADDED_FILEMODE, WARN) \
    - 	FUNC(NUL_IN_COMMIT, WARN) \
    - 	FUNC(LARGE_PATHNAME, WARN) \
    + 	FUNC(MAILMAP_SYMLINK, INFO) \
    + 	FUNC(BAD_TAG_NAME, INFO) \
    + 	FUNC(MISSING_TAGGER_ENTRY, INFO) \
    ++	FUNC(REF_MISSING_NEWLINE, INFO) \
    ++	FUNC(TRAILING_REF_CONTENT, INFO) \
    + 	/* ignored (elevated when requested) */ \
    + 	FUNC(EXTRA_HEADER_ENTRY, IGNORE)
    + 
     
      ## refs.c ##
     @@ refs.c: static int refs_read_special_head(struct ref_store *ref_store,
    @@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
     +				   const char *refs_check_dir,
     +				   struct dir_iterator *iter)
     +{
    -+	struct fsck_ref_report report = FSCK_REF_REPORT_DEFAULT;
     +	struct strbuf ref_content = STRBUF_INIT;
     +	struct strbuf referent = STRBUF_INIT;
     +	struct strbuf refname = STRBUF_INIT;
    ++	struct fsck_ref_report report = {0};
     +	const char *trailing = NULL;
     +	unsigned int type = 0;
     +	int failure_errno = 0;
    @@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
     +		}
     +
     +		if (parse_loose_ref_contents(ref_store->repo->hash_algo,
    -+					    ref_content.buf, &oid, &referent,
    -+					    &type, &trailing, &failure_errno)) {
    ++					     ref_content.buf, &oid, &referent,
    ++					     &type, &trailing, &failure_errno)) {
     +			ret = fsck_report_ref(o, &report,
     +					      FSCK_MSG_BAD_REF_CONTENT,
     +					      "invalid ref content");
    @@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
     +				goto cleanup;
     +			}
     +		}
    ++		goto cleanup;
     +	}
     +
     +cleanup:
3:  032b0d6a64 ! 3:  20d8556902 ref: add symbolic ref content check for files backend
    @@ Commit message
         3. "ref: refs/heads/master\n\n"
     
         But we do not allow any non-null trailing garbage. The following are bad
    -    symbolic contents.
    +    symbolic contents which will be reported as fsck error by "git-fsck(1)".
     
         1. "ref: refs/heads/master garbage\n"
         2. "ref: refs/heads/master \n\n\n garbage  "
     
    -    In order to provide above checks, we will traverse the "pointee" to
    -    report the user whether this is null-garbage or no newline. And if
    -    symbolic refs contain non-null garbage, we will report
    -    "FSCK_MSG_BAD_REF_CONTENT" to the user.
    -
    -    Then, we will check the name of the "pointee" is correct by using
    -    "check_refname_format". And then if we can access the "pointee_path" in
    -    the file system, we should ensure that the file type is correct.
    +    In order to provide above checks, we will use "strrchr" to check whether
    +    we have newline in the ref content. Then we will check the name of the
    +    "pointee" is correct by using "check_refname_format". If the function
    +    fails, we need to trim the "pointee" to see whether the null-garbage
    +    causes the function fails. If so, we need to report that there is
    +    null-garbage in the symref content. Otherwise, we should report the user
    +    the "pointee" is bad.
     
         Mentored-by: Patrick Steinhardt <ps@pks.im>
         Mentored-by: Karthik Nayak <karthik.188@gmail.com>
    @@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
     +				    struct strbuf *pointee_name,
     +				    struct strbuf *pointee_path)
     +{
    -+	unsigned int newline_num = 0;
    -+	unsigned int space_num = 0;
    ++	const char *newline_pos = NULL;
     +	const char *p = NULL;
     +	struct stat st;
     +	int ret = 0;
    @@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
     +		goto out;
     +	}
     +
    -+	while (*p != '\0') {
    -+		if ((space_num || newline_num) && !isspace(*p)) {
    -+			ret = fsck_report_ref(o, report,
    -+					      FSCK_MSG_BAD_REF_CONTENT,
    -+					      "contains non-null garbage");
    -+			goto out;
    -+		}
    -+
    -+		if (*p == '\n') {
    -+			newline_num++;
    -+		} else if (*p == ' ') {
    -+			space_num++;
    -+		}
    -+		p++;
    -+	}
    -+
    -+	if (space_num || newline_num > 1) {
    -+		ret = fsck_report_ref(o, report,
    -+				      FSCK_MSG_TRAILING_REF_CONTENT,
    -+				      "trailing null-garbage");
    -+	} else if (!newline_num) {
    ++	newline_pos = strrchr(p, '\n');
    ++	if (!newline_pos || *(newline_pos + 1)) {
     +		ret = fsck_report_ref(o, report,
     +				      FSCK_MSG_REF_MISSING_NEWLINE,
     +				      "missing newline");
     +	}
     +
    -+	strbuf_rtrim(pointee_name);
    -+
     +	if (check_refname_format(pointee_name->buf, 0)) {
    ++		/*
    ++		 * When containing null-garbage, "check_refname_format" will
    ++		 * fail, we should trim the "pointee" to check again.
    ++		 */
    ++		strbuf_rtrim(pointee_name);
    ++		if (!check_refname_format(pointee_name->buf, 0)) {
    ++			ret = fsck_report_ref(o, report,
    ++					      FSCK_MSG_TRAILING_REF_CONTENT,
    ++					      "trailing null-garbage");
    ++			goto out;
    ++		}
    ++
     +		ret = fsck_report_ref(o, report,
     +				      FSCK_MSG_BAD_SYMREF_POINTEE,
     +				      "points to refname with invalid format");
    @@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
      				   const char *refs_check_dir,
      				   struct dir_iterator *iter)
      {
    - 	struct fsck_ref_report report = FSCK_REF_REPORT_DEFAULT;
     +	struct strbuf pointee_path = STRBUF_INIT;
      	struct strbuf ref_content = STRBUF_INIT;
      	struct strbuf referent = STRBUF_INIT;
    @@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_s
     +						       &referent,
     +						       &pointee_path);
      		}
    + 		goto cleanup;
      	}
    - 
     @@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_store,
      	strbuf_release(&refname);
      	strbuf_release(&ref_content);
    @@ t/t0602-reffiles-fsck.sh: test_expect_success 'regular ref content should be che
     +	printf "ref: refs/heads/branch     " > $branch_dir_prefix/a/b/branch-trailing &&
     +	git refs verify 2>err &&
     +	cat >expect <<-EOF &&
    ++	warning: refs/heads/a/b/branch-trailing: refMissingNewline: missing newline
     +	warning: refs/heads/a/b/branch-trailing: trailingRefContent: trailing null-garbage
     +	EOF
     +	rm $branch_dir_prefix/a/b/branch-trailing &&
    @@ t/t0602-reffiles-fsck.sh: test_expect_success 'regular ref content should be che
     +	printf "ref: refs/heads/branch \n\n " > $branch_dir_prefix/a/b/branch-trailing &&
     +	git refs verify 2>err &&
     +	cat >expect <<-EOF &&
    ++	warning: refs/heads/a/b/branch-trailing: refMissingNewline: missing newline
     +	warning: refs/heads/a/b/branch-trailing: trailingRefContent: trailing null-garbage
     +	EOF
     +	rm $branch_dir_prefix/a/b/branch-trailing &&
4:  147a873958 ! 4:  d9867c5f87 ref: add symlink ref consistency check for files backend
    @@ Metadata
     Author: shejialuo <shejialuo@gmail.com>
     
      ## Commit message ##
    -    ref: add symlink ref consistency check for files backend
    +    ref: add symlink ref check for files backend
     
         We have already introduced "files_fsck_symref_target". We should reuse
         this function to handle the symrefs which are legacy symbolic links. We
    @@ refs/files-backend.c: typedef int (*files_fsck_refs_fn)(struct ref_store *ref_st
     +				    struct strbuf *pointee_path,
     +				    unsigned int symbolic_link)
      {
    - 	unsigned int newline_num = 0;
    - 	unsigned int space_num = 0;
    + 	const char *newline_pos = NULL;
    + 	const char *p = NULL;
     @@ refs/files-backend.c: static int files_fsck_symref_target(struct fsck_options *o,
      		goto out;
      	}
      
    --	while (*p != '\0') {
    --		if ((space_num || newline_num) && !isspace(*p)) {
    --			ret = fsck_report_ref(o, report,
    --					      FSCK_MSG_BAD_REF_CONTENT,
    --					      "contains non-null garbage");
    --			goto out;
    +-	newline_pos = strrchr(p, '\n');
    +-	if (!newline_pos || *(newline_pos + 1)) {
    +-		ret = fsck_report_ref(o, report,
    +-				      FSCK_MSG_REF_MISSING_NEWLINE,
    +-				      "missing newline");
     +	if (!symbolic_link) {
    -+		while (*p != '\0') {
    -+			if ((space_num || newline_num) && !isspace(*p)) {
    -+				ret = fsck_report_ref(o, report,
    -+						      FSCK_MSG_BAD_REF_CONTENT,
    -+						      "contains non-null garbage");
    -+				goto out;
    -+			}
    -+
    -+			if (*p == '\n') {
    -+				newline_num++;
    -+			} else if (*p == ' ') {
    -+				space_num++;
    -+			}
    -+			p++;
    - 		}
    - 
    --		if (*p == '\n') {
    --			newline_num++;
    --		} else if (*p == ' ') {
    --			space_num++;
    -+		if (space_num || newline_num > 1) {
    -+			ret = fsck_report_ref(o, report,
    -+					      FSCK_MSG_TRAILING_REF_CONTENT,
    -+					      "trailing null-garbage");
    -+		} else if (!newline_num) {
    ++		newline_pos = strrchr(p, '\n');
    ++		if (!newline_pos || *(newline_pos + 1)) {
     +			ret = fsck_report_ref(o, report,
     +					      FSCK_MSG_REF_MISSING_NEWLINE,
     +					      "missing newline");
    - 		}
    --		p++;
    --	}
    - 
    --	if (space_num || newline_num > 1) {
    --		ret = fsck_report_ref(o, report,
    --				      FSCK_MSG_TRAILING_REF_CONTENT,
    --				      "trailing null-garbage");
    --	} else if (!newline_num) {
    --		ret = fsck_report_ref(o, report,
    --				      FSCK_MSG_REF_MISSING_NEWLINE,
    --				      "missing newline");
    -+		strbuf_rtrim(pointee_name);
    ++		}
      	}
      
    --	strbuf_rtrim(pointee_name);
    --
      	if (check_refname_format(pointee_name->buf, 0)) {
    +-		/*
    +-		 * When containing null-garbage, "check_refname_format" will
    +-		 * fail, we should trim the "pointee" to check again.
    +-		 */
    +-		strbuf_rtrim(pointee_name);
    +-		if (!check_refname_format(pointee_name->buf, 0)) {
    +-			ret = fsck_report_ref(o, report,
    +-					      FSCK_MSG_TRAILING_REF_CONTENT,
    +-					      "trailing null-garbage");
    +-			goto out;
    ++		if (!symbolic_link) {
    ++			/*
    ++			* When containing null-garbage, "check_refname_format" will
    ++			* fail, we should trim the "pointee" to check again.
    ++			*/
    ++			strbuf_rtrim(pointee_name);
    ++			if (!check_refname_format(pointee_name->buf, 0)) {
    ++				ret = fsck_report_ref(o, report,
    ++						      FSCK_MSG_TRAILING_REF_CONTENT,
    ++						      "trailing null-garbage");
    ++				goto out;
    ++			}
    + 		}
    + 
      		ret = fsck_report_ref(o, report,
    - 				      FSCK_MSG_BAD_SYMREF_POINTEE,
     @@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_store,
    - 	struct fsck_ref_report report = FSCK_REF_REPORT_DEFAULT;
    + {
      	struct strbuf pointee_path = STRBUF_INIT;
      	struct strbuf ref_content = STRBUF_INIT;
     +	struct strbuf abs_gitdir = STRBUF_INIT;
      	struct strbuf referent = STRBUF_INIT;
      	struct strbuf refname = STRBUF_INIT;
    + 	struct fsck_ref_report report = {0};
    ++	const char *pointee_name = NULL;
     +	unsigned int symbolic_link = 0;
      	const char *trailing = NULL;
      	unsigned int type = 0;
    @@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_s
     -						       &pointee_path);
     +						       &pointee_path,
     +						       symbolic_link);
    -+		}
    -+	} else if (S_ISLNK(iter->st.st_mode)) {
    -+		const char *pointee_name = NULL;
    + 		}
    + 		goto cleanup;
    + 	}
    + 
    ++	symbolic_link = 1;
     +
    -+		symbolic_link = 1;
    ++	strbuf_add_real_path(&pointee_path, iter->path.buf);
    ++	strbuf_add_absolute_path(&abs_gitdir, ref_store->gitdir);
    ++	strbuf_normalize_path(&abs_gitdir);
    ++	if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
    ++		strbuf_addch(&abs_gitdir, '/');
     +
    -+		strbuf_add_real_path(&pointee_path, iter->path.buf);
    -+		strbuf_add_absolute_path(&abs_gitdir, ref_store->gitdir);
    -+		strbuf_normalize_path(&abs_gitdir);
    -+		if (!is_dir_sep(abs_gitdir.buf[abs_gitdir.len - 1]))
    -+			strbuf_addch(&abs_gitdir, '/');
    ++	if (!skip_prefix(pointee_path.buf, abs_gitdir.buf, &pointee_name)) {
    ++		ret = fsck_report_ref(o, &report,
    ++				      FSCK_MSG_BAD_SYMREF_POINTEE,
    ++				      "point to target outside gitdir");
    ++		goto cleanup;
    ++	}
     +
    -+		if (!skip_prefix(pointee_path.buf,
    -+				 abs_gitdir.buf, &pointee_name)) {
    -+			ret = fsck_report_ref(o, &report,
    -+					       FSCK_MSG_BAD_SYMREF_POINTEE,
    -+					       "point to target outside gitdir");
    -+			goto cleanup;
    - 		}
    ++	strbuf_addstr(&referent, pointee_name);
    ++	ret = files_fsck_symref_target(o, &report, refname.buf,
    ++				       &referent, &pointee_path,
    ++				       symbolic_link);
     +
    -+		strbuf_addstr(&referent, pointee_name);
    -+		ret = files_fsck_symref_target(o, &report, refname.buf,
    -+					       &referent, &pointee_path,
    -+					       symbolic_link);
    - 	}
    - 
      cleanup:
    -@@ refs/files-backend.c: static int files_fsck_refs_content(struct ref_store *ref_store,
    + 	strbuf_release(&refname);
      	strbuf_release(&ref_content);
      	strbuf_release(&referent);
      	strbuf_release(&pointee_path);

Comments

Junio C Hamano Aug. 28, 2024, 9:28 p.m. UTC | #1
Here is another one.

By the way, Peff, do we have MAYBE_UNUSED that can be used in a case
like this one?  Platforms without symbolic links supported may well
define NO_SYMLINK_HEAD, which makes the incoming parameters unused.

static int create_ref_symlink(struct ref_lock *lock, const char *target)
{
	int ret = -1;
#ifndef NO_SYMLINK_HEAD
	char *ref_path = get_locked_file_path(&lock->lk);
	unlink(ref_path);
	ret = symlink(target, ref_path);
	free(ref_path);

	if (ret)
		fprintf(stderr, "no symlink - falling back to symbolic ref\n");
#endif
	return ret;
}

We can of course do the attached, which I'll let shejialuo squash
into an appropriate patch in the series.

Thanks.


 refs/files-backend.c | 7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)

diff --git c/refs/files-backend.c w/refs/files-backend.c
index 69dd283c9d..110af32788 100644
--- c/refs/files-backend.c
+++ w/refs/files-backend.c
@@ -1951,10 +1951,13 @@ static int commit_ref_update(struct files_ref_store *refs,
 	return 0;
 }
 
+#ifdef NO_SYMLINK_HEAD
+#define create_ref_symlink(lock, referent) (-1)
+#else
 static int create_ref_symlink(struct ref_lock *lock, const char *target)
 {
 	int ret = -1;
-#ifndef NO_SYMLINK_HEAD
+
 	char *ref_path = get_locked_file_path(&lock->lk);
 	unlink(ref_path);
 	ret = symlink(target, ref_path);
@@ -1962,9 +1965,9 @@ static int create_ref_symlink(struct ref_lock *lock, const char *target)
 
 	if (ret)
 		fprintf(stderr, "no symlink - falling back to symbolic ref\n");
-#endif
 	return ret;
 }
+#endif
 
 static int create_symref_lock(struct ref_lock *lock, const char *target,
 			      struct strbuf *err)
Jeff King Aug. 29, 2024, 4:02 a.m. UTC | #2
On Wed, Aug 28, 2024 at 02:28:47PM -0700, Junio C Hamano wrote:

> By the way, Peff, do we have MAYBE_UNUSED that can be used in a case
> like this one?  Platforms without symbolic links supported may well
> define NO_SYMLINK_HEAD, which makes the incoming parameters unused.

Yes, it would be fine to use MAYBE_UNUSED in a case like this.

The other option, and what I did for a conditional compilation in
imap-send.c, is to just mention the variable like:

  /* mark as used to appease -Wunused-parameter with NO_SYMLINK_HEAD */
  (void)lock;
  (void)target;

In retrospect I think MAYBE_UNUSED is probably a little less magical,
and I perhaps should have used it there.

In this particular case, though, where there's no actual code in one
half of the #ifdef, I think just defining two separate functions is
cleaner. I.e., what you did with a macro below, though I'd probably have
just used a real function with UNUSED markers.
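
The three options mentioned above can be compared side by side in a
standalone sketch. None of this is git's code: "SKETCH_MAYBE_UNUSED" is a
hypothetical macro (an assumption about how such an annotation could be
defined for GCC-compatible compilers), "SKETCH_NO_SYMLINK" stands in for
NO_SYMLINK_HEAD, and the function bodies are placeholders that create no
symlinks.

```c
/* Hypothetical annotation; expands to nothing on unknown compilers. */
#if defined(__GNUC__)
#define SKETCH_MAYBE_UNUSED __attribute__((unused))
#else
#define SKETCH_MAYBE_UNUSED
#endif

/* Pretend we are on a platform without symlink support. */
#define SKETCH_NO_SYMLINK 1

/* Option 1: annotate parameters that may go unused in one configuration. */
int create_link_attr(const char *path SKETCH_MAYBE_UNUSED,
		     const char *target SKETCH_MAYBE_UNUSED)
{
	int ret = -1;
#ifndef SKETCH_NO_SYMLINK
	ret = (path && target) ? 0 : -1; /* placeholder for the real work */
#endif
	return ret;
}

/* Option 2: mention the parameters so they count as used. */
int create_link_cast(const char *path, const char *target)
{
	int ret = -1;

	(void)path;
	(void)target;
#ifndef SKETCH_NO_SYMLINK
	ret = (path && target) ? 0 : -1; /* placeholder for the real work */
#endif
	return ret;
}

/* Option 3: two separate definitions, one per configuration. */
#ifdef SKETCH_NO_SYMLINK
int create_link_split(const char *path SKETCH_MAYBE_UNUSED,
		      const char *target SKETCH_MAYBE_UNUSED)
{
	return -1;
}
#else
int create_link_split(const char *path, const char *target)
{
	return (path && target) ? 0 : -1; /* placeholder for the real work */
}
#endif
```

With "SKETCH_NO_SYMLINK" defined, all three variants simply return -1.
The difference is only in where the conditional compilation lives and how
the unused parameters are marked.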

As an aside, I wonder if we should consider deprecating and eventually
dropping support for core.prefersymlinkrefs. I can't think of a reason
anybody would want to use it, and of course it makes no sense as we move
on to alternate backends like reftables. I sent patches ages ago:

  https://lore.kernel.org/git/20151229060055.GA17047@sigill.intra.peff.net/

but I think it may have just gotten lost in the shuffle, and I've
somehow been meaning to re-submit them for 9 years. :-/

-Peff
Junio C Hamano Aug. 29, 2024, 4:59 a.m. UTC | #3
Jeff King <peff@peff.net> writes:

> As an aside, I wonder if we should consider deprecating and eventually
> dropping support for core.prefersymlinkrefs. I can't think of a reason
> anybody would want to use it, and of course it makes no sense as we move
> on to alternate backends like reftables.

Yup.  Perhaps add an entry or two to BreakingChanges document?

 Documentation/BreakingChanges.txt | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git c/Documentation/BreakingChanges.txt w/Documentation/BreakingChanges.txt
index 0532bfcf7f..2a85740f3c 100644
--- c/Documentation/BreakingChanges.txt
+++ w/Documentation/BreakingChanges.txt
@@ -115,6 +115,12 @@ info/grafts as outdated, 2014-03-05) and will be removed.
 +
 Cf. <20140304174806.GA11561@sigill.intra.peff.net>.
 
+* Support for core.prefersymlinkrefs will be dropped.  Support for
+  existing repositories that use symbolic links to represent a
+  symbolic ref may or may not be dropped.
++
+Cf. <20240829040215.GA4054823@coredump.intra.peff.net>
+
 == Superseded features that will not be deprecated
 
 Some features have gained newer replacements that aim to improve the design in
Patrick Steinhardt Aug. 29, 2024, 7 a.m. UTC | #4
On Wed, Aug 28, 2024 at 09:59:58PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > As an aside, I wonder if we should consider deprecating and eventually
> > dropping support for core.prefersymlinkrefs. I can't think of a reason
> > anybody would want to use it, and of course it makes no sense as we move
> > on to alternate backends like reftables.
> 
> Yup.  Perhaps add an entry or two to BreakingChanges document?
> 
>  Documentation/BreakingChanges.txt | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git c/Documentation/BreakingChanges.txt w/Documentation/BreakingChanges.txt
> index 0532bfcf7f..2a85740f3c 100644
> --- c/Documentation/BreakingChanges.txt
> +++ w/Documentation/BreakingChanges.txt
> @@ -115,6 +115,12 @@ info/grafts as outdated, 2014-03-05) and will be removed.
>  +
>  Cf. <20140304174806.GA11561@sigill.intra.peff.net>.
>  
> +* Support for core.prefersymlinkrefs will be dropped.  Support for
> +  existing repositories that use symbolic links to represent a
> +  symbolic ref may or may not be dropped.
> ++
> +Cf. <20240829040215.GA4054823@coredump.intra.peff.net>
> +
>  == Superseded features that will not be deprecated

Yes, I'm very much in favor of that. As Peff said, I don't see a single
reason why it would make sense to use symlinks nowadays. We have also
supported the "new" syntax for ages now, and I'd be surprised if there
were repos out there using it on purpose.

We should probably do the above together with a new check that starts to
warn about symbolic links in "refs/" such that users become aware of
this deprecation. We'd have to grow the infrastructure to also scan root
refs though, which to the best of my knowledge we don't currently scan.

Patrick
Junio C Hamano Aug. 29, 2024, 3:07 p.m. UTC | #5
Patrick Steinhardt <ps@pks.im> writes:

>> +* Support for core.prefersymlinkrefs will be dropped.  Support for
>> +  existing repositories that use symbolic links to represent a
>> +  symbolic ref may or may not be dropped.
>> ++
>> +Cf. <20240829040215.GA4054823@coredump.intra.peff.net>
>> +
>>  == Superseded features that will not be deprecated
> ...
> We should probably do the above together with a new check that starts to
> warn about symbolic links in "refs/" such that users become aware of
> this deprecation. We'd have to grow the infrastructure to also scan root
> refs though, which to the best of my knowledge we don't currently scan.

Yup, that is why the above suggestion is on _this_ thread that is
about the "check for curiously formatted symrefs, in the hope that
we can retroactively tighten our checks later" topic.
shejialuo Aug. 29, 2024, 3:48 p.m. UTC | #6
On Wed, Aug 28, 2024 at 09:59:58PM -0700, Junio C Hamano wrote:
> Jeff King <peff@peff.net> writes:
> 
> > As an aside, I wonder if we should consider deprecating and eventually
> > dropping support for core.prefersymlinkrefs. I can't think of a reason
> > anybody would want to use it, and of course it makes no sense as we move
> > on to alternate backends like reftables.
> 
> Yup.  Perhaps add an entry or two to BreakingChanges document?
> 
>  Documentation/BreakingChanges.txt | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git c/Documentation/BreakingChanges.txt w/Documentation/BreakingChanges.txt
> index 0532bfcf7f..2a85740f3c 100644
> --- c/Documentation/BreakingChanges.txt
> +++ w/Documentation/BreakingChanges.txt
> @@ -115,6 +115,12 @@ info/grafts as outdated, 2014-03-05) and will be removed.
>  +
>  Cf. <20140304174806.GA11561@sigill.intra.peff.net>.
>  
> +* Support for core.prefersymlinkrefs will be dropped.  Support for
> +  existing repositories that use symbolic links to represent a
> +  symbolic ref may or may not be dropped.
> ++
> +Cf. <20240829040215.GA4054823@coredump.intra.peff.net>
> +
>  == Superseded features that will not be deprecated
>  
>  Some features have gained newer replacements that aim to improve the design in

From my current understanding, I think I need to rebase the two patches
you provided here:

  https://lore.kernel.org/git/xmqqle0gzdyh.fsf_-_@gitster.g/
  https://lore.kernel.org/git/xmqqbk1cz69c.fsf@gitster.g/

I think this patch just informs the user that we will drop
"core.prefersymlinkrefs" later, so I do not need to be concerned with
this patch or with [PATCH 8/6].

Thanks,
Jialuo
Junio C Hamano Aug. 29, 2024, 4:12 p.m. UTC | #7
shejialuo <shejialuo@gmail.com> writes:

> From my current understanding, I think I need to rebase the two patches
> you provided here:
>
>   https://lore.kernel.org/git/xmqqle0gzdyh.fsf_-_@gitster.g/
>   https://lore.kernel.org/git/xmqqbk1cz69c.fsf@gitster.g/

They are to be squashed into your patch, "suggested edit" for your
changes, not "to be rebased".  In other words, we do not want to see
a patch (from your v2 as-is) to create problems and then another
patch (taken from one of these links) applied on top to remedy them.
We instead want to see a patch (start from your v2 but with the
changes from these links) that does not introduce problems in the
first place.
Jeff King Aug. 29, 2024, 7:48 p.m. UTC | #8
On Thu, Aug 29, 2024 at 09:00:58AM +0200, Patrick Steinhardt wrote:

> > diff --git c/Documentation/BreakingChanges.txt w/Documentation/BreakingChanges.txt
> > index 0532bfcf7f..2a85740f3c 100644
> > --- c/Documentation/BreakingChanges.txt
> > +++ w/Documentation/BreakingChanges.txt
> > @@ -115,6 +115,12 @@ info/grafts as outdated, 2014-03-05) and will be removed.
> >  +
> >  Cf. <20140304174806.GA11561@sigill.intra.peff.net>.
> >  
> > +* Support for core.prefersymlinkrefs will be dropped.  Support for
> > +  existing repositories that use symbolic links to represent a
> > +  symbolic ref may or may not be dropped.
> > ++
> > +Cf. <20240829040215.GA4054823@coredump.intra.peff.net>
> > +
> >  == Superseded features that will not be deprecated
> 
> Yes, I'm very much in favor of that. As Peff said, I don't see a single
> reason why it would make sense to use symlinks nowadays. We have also
> supported the "new" syntax for ages now, and I'd be surprised if there
> were repos out there using it on purpose.
> 
> We should probably do the above together with a new check that starts to
> warn about symbolic links in "refs/" such that users become aware of
> this deprecation. We'd have to grow the infrastructure to also scan root
> refs though, which to the best of my knowledge we don't currently scan.

I think the first step of the proposal (and what I had written in the
patches that I linked) was just that we would stop _writing_ symlinks.
And there we'd only need to warn people who have that config option set.

Whether to drop the reading side is less clear to me. I think in the
long run it is good as a cleanup (and one less source of weird behavior
that malicious local repos can trigger). But that decision can be made
separately. I think it would be OK to just issue a deprecation warning
whenever we actually follow a symlink (because I think we do so
manually, since we need to know the target name as part of the
resolution process).

-Peff