diff mbox series

[v3,2/4] ref: add regular ref content check for files backend

Message ID Ztb_HqLg-WvwA2I0@ArchLinux (mailing list archive)
State Superseded
Headers show
Series add ref content check for files backend | expand

Commit Message

shejialuo Sept. 3, 2024, 12:20 p.m. UTC
We implicitly rely on "git-fsck(1)" to check the consistency of regular
refs. However, when parsing the regular refs for files backend by using
"files-backend.c::parse_loose_ref_contents", we allow the ref content to
end with no newline or to contain some garbages.

Even though we never create such loose refs ourselves, we have accepted
such loose refs. So, it is entirely possible that some third-party tools
may rely on such loose refs being valid. We should not report an error
fsck message at current. But let's notice such a "curiously formatted"
loose refs being valid and tell the user our findings, so we can access
the possible extent of damage when we tighten the parsing rules in the
future.

And it's not suitable to either report a warn fsck message to the user.
This is because if the caller set the "strict" field in "fsck_options",
fsck warns will be automatically upgraded to errors. We should not allow
user to specify the "--strict" flag to upgrade the fsck warnings to
errors at current. It might cause compatibility issue which may break
the legacy repository. So we add the following two fsck infos to
represent the situation where the ref content ends without newline or has
garbages:

1. "refMissingNewline(INFO)": A ref does not end with newline. This kind
   of ref may be considered ERROR in the future.
2. "trailingRefContent(INFO)": A ref has trailing contents. This kind of
   ref may be considered ERROR in the future.

It may seem that we could not give the user any warnings by creating
fsck infos. However, in "fsck.c::fsck_vreport", we will convert
"FSCK_INFO" to "FSCK_WARN" and we can still warn the user about these
situations when using "git-refs verify" without introducing
compatibility issue.

In current "git-fsck(1)", it will report an error when the ref content
is bad, so we should following this to report an error to the user when
"parse_loose_ref_contents" fails. And we add a new fsck error message
called "badRefContent(ERROR)" to represent that a ref has a bad content.

In order to tell whether the ref has trailing content, add a new
parameter "trailing" to "parse_loose_ref_contents". Then introduce a new
function "files_fsck_refs_content" to check the regular refs to enhance
the "git-refs verify".

Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 Documentation/fsck-msgids.txt |  11 ++++
 fsck.h                        |   3 +
 refs.c                        |   2 +-
 refs/files-backend.c          |  68 ++++++++++++++++++-
 refs/refs-internal.h          |   2 +-
 t/t0602-reffiles-fsck.sh      | 120 ++++++++++++++++++++++++++++++++++
 6 files changed, 202 insertions(+), 4 deletions(-)

Comments

Patrick Steinhardt Sept. 9, 2024, 3:04 p.m. UTC | #1
On Tue, Sep 03, 2024 at 08:20:46PM +0800, shejialuo wrote:
> We implicitly rely on "git-fsck(1)" to check the consistency of regular
> refs. However, when parsing the regular refs for files backend by using
> "files-backend.c::parse_loose_ref_contents", we allow the ref content to
> end with no newline or to contain some garbages.
> 
> Even though we never create such loose refs ourselves, we have accepted
> such loose refs. So, it is entirely possible that some third-party tools
> may rely on such loose refs being valid. We should not report an error
> fsck message at current. But let's notice such a "curiously formatted"
> loose refs being valid and tell the user our findings, so we can access
> the possible extent of damage when we tighten the parsing rules in the
> future.
> 
> And it's not suitable to either report a warn fsck message to the user.

s/to either/either to

> This is because if the caller set the "strict" field in "fsck_options",
> fsck warns will be automatically upgraded to errors. We should not allow
> user to specify the "--strict" flag to upgrade the fsck warnings to
> errors at current.

This is formulated a bit curiously: it reads as if we wanted to limit
what the user can do, but what we really want to ensure is that the
`--strict` flag doesn't convert it into an error. So maybe something
like this instead of the second sentence:

    We don't (yet) want the "--strict" flag that controls this bit to
    end up generating errors for such weirdly-formatted reference
    contents, as we first want to assess whether this retroactive
    tightening will cause issues for any tools out there.

> It might cause compatibility issue which may break

s/issue/issues

> the legacy repository. So we add the following two fsck infos to

I wouldn't call it "legacy" just yet, as we didn't yet decide whether
we're going to make this formatting invalid in the first place. It's
rather a test balloon.

> represent the situation where the ref content ends without newline or has
> garbages:

s/garbages/trailing garbage

> 1. "refMissingNewline(INFO)": A ref does not end with newline. This kind
>    of ref may be considered ERROR in the future.
> 2. "trailingRefContent(INFO)": A ref has trailing contents. This kind of
>    ref may be considered ERROR in the future.

In both cases, "may be considered ERROR" -> "may be considered an
error". Also in the actual messages.

> It may seem that we could not give the user any warnings by creating
> fsck infos. However, in "fsck.c::fsck_vreport", we will convert
> "FSCK_INFO" to "FSCK_WARN" and we can still warn the user about these
> situations when using "git-refs verify" without introducing

s/"git-refs verify"/"git refs verify". We don't use dashed builtins
nowadays anymore.

> compatibility issue.

s/issue/issues

> In current "git-fsck(1)", it will report an error when the ref content
> is bad, so we should following this to report an error to the user when
> "parse_loose_ref_contents" fails. And we add a new fsck error message
> called "badRefContent(ERROR)" to represent that a ref has a bad content.

Okay, so this is basically porting over behaviour that git-fsck(1)
already has to `git refs verify` and should thus not cause new issues
anywhere. I think it would have made sense to do so in a first step and
then introduce the tightened rules in a separate commit.

Will we eventually remove those checks from git-fsck(1) when we adapt it
to call `git refs verify`? If so, we should likely note that in the
commit message.

> In order to tell whether the ref has trailing content, add a new
> parameter "trailing" to "parse_loose_ref_contents". Then introduce a new
> function "files_fsck_refs_content" to check the regular refs to enhance
> the "git-refs verify".

This paragraph only re-explains what the diff already tells us, so it
can likely be removed.

> Mentored-by: Patrick Steinhardt <ps@pks.im>
> Mentored-by: Karthik Nayak <karthik.188@gmail.com>
> Signed-off-by: shejialuo <shejialuo@gmail.com>
> ---
>  Documentation/fsck-msgids.txt |  11 ++++
>  fsck.h                        |   3 +
>  refs.c                        |   2 +-
>  refs/files-backend.c          |  68 ++++++++++++++++++-
>  refs/refs-internal.h          |   2 +-
>  t/t0602-reffiles-fsck.sh      | 120 ++++++++++++++++++++++++++++++++++
>  6 files changed, 202 insertions(+), 4 deletions(-)
> 
> diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
> index 68a2801f15..06d045ac48 100644
> --- a/Documentation/fsck-msgids.txt
> +++ b/Documentation/fsck-msgids.txt
> @@ -19,6 +19,9 @@
>  `badParentSha1`::
>  	(ERROR) A commit object has a bad parent sha1.
>  
> +`badRefContent`::
> +	(ERROR) A ref has a bad content.
> +

s/a bad content/bad content

>  `badRefFiletype`::
>  	(ERROR) A ref has a bad file type.
>  
> @@ -170,6 +173,14 @@
>  `nullSha1`::
>  	(WARN) Tree contains entries pointing to a null sha1.
>  
> +`refMissingNewline`::
> +	(INFO) A ref does not end with newline. This kind of ref may
> +	be considered ERROR in the future.
> +

I'd reformulate the second sentence to "This will be considered an error
in the future". This indicates that we have the intent to tighten this
check to any user and would urge them to speak up in case they disagree
with such a tightening.

> +`trailingRefContent`::
> +	(INFO) A ref has trailing contents. This kind of ref may be
> +	considered ERROR in the future.

Same.

> @@ -3430,6 +3434,65 @@ typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
>  				  const char *refs_check_dir,
>  				  struct dir_iterator *iter);
>  
> +static int files_fsck_refs_content(struct ref_store *ref_store,
> +				   struct fsck_options *o,
> +				   const char *refs_check_dir,
> +				   struct dir_iterator *iter)
> +{
> +	struct strbuf ref_content = STRBUF_INIT;
> +	struct strbuf referent = STRBUF_INIT;
> +	struct strbuf refname = STRBUF_INIT;
> +	struct fsck_ref_report report = {0};
> +	const char *trailing = NULL;
> +	unsigned int type = 0;
> +	int failure_errno = 0;
> +	struct object_id oid;
> +	int ret = 0;
> +
> +	strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path);
> +	report.path = refname.buf;
> +
> +	if (S_ISLNK(iter->st.st_mode))
> +		goto cleanup;
> +
> +	if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
> +		ret = error_errno(_("%s/%s: unable to read the ref"),
> +				  refs_check_dir, iter->relative_path);

We typically have the name of things we read trailing and not leading in
error messages. So this should rather be "unable do read ref '%s/%s'".

> +		goto cleanup;
> +	}
> +
> +	if (parse_loose_ref_contents(ref_store->repo->hash_algo,
> +				     ref_content.buf, &oid, &referent,
> +				     &type, &trailing, &failure_errno)) {
> +		ret = fsck_report_ref(o, &report,
> +				      FSCK_MSG_BAD_REF_CONTENT,
> +				      "invalid ref content");
> +		goto cleanup;
> +	}
> +
> +	if (!(type & REF_ISSYMREF)) {

Coming back to my comment further up, I guess this whole block here
could be introduced in a separate commit. So the first commit introduces
the infra to check loose ref contents as an obvious step because we
simply port over rules that already exist in git-fsck(1). And the second
step could then do this retroactive tightening with the justification
you have spelt out in the commit message.

> +		if (*trailing == '\0') {

`if (!*trailing)`

Patrick
shejialuo Sept. 10, 2024, 7:42 a.m. UTC | #2
On Mon, Sep 09, 2024 at 05:04:07PM +0200, Patrick Steinhardt wrote:
> > This is because if the caller set the "strict" field in "fsck_options",
> > fsck warns will be automatically upgraded to errors. We should not allow
> > user to specify the "--strict" flag to upgrade the fsck warnings to
> > errors at current.
> 
> This is formulated a bit curiously: it reads as if we wanted to limit
> what the user can do, but what we really want to ensure is that the
> `--strict` flag doesn't convert it into an error. So maybe something
> like this instead of the second sentence:
> 
>     We don't (yet) want the "--strict" flag that controls this bit to
>     end up generating errors for such weirdly-formatted reference
>     contents, as we first want to assess whether this retroactive
>     tightening will cause issues for any tools out there.
> 

Thanks, I will improve this in the next version.

> > the legacy repository. So we add the following two fsck infos to
> 
> I wouldn't call it "legacy" just yet, as we didn't yet decide whether
> we're going to make this formatting invalid in the first place. It's
> rather a test balloon.
> 

I agree, we should drop "legacy" here.

> > In current "git-fsck(1)", it will report an error when the ref content
> > is bad, so we should following this to report an error to the user when
> > "parse_loose_ref_contents" fails. And we add a new fsck error message
> > called "badRefContent(ERROR)" to represent that a ref has a bad content.
> 
> Okay, so this is basically porting over behaviour that git-fsck(1)
> already has to `git refs verify` and should thus not cause new issues
> anywhere. I think it would have made sense to do so in a first step and
> then introduce the tightened rules in a separate commit.
> 

By reading the whole comments, we'd better create a commit which ports
the existing checks to "git refs verify" both for regular refs and
symrefs.

So, I will add more commits in the next version with the following
sequences:

1. Set up the infrastructure to check the contents for refs.
2. Port existing checks in "git-fsck(1)" to "git refs verify".
3. Introduce the tightened rules.

> Will we eventually remove those checks from git-fsck(1) when we adapt it
> to call `git refs verify`? If so, we should likely note that in the
> commit message.

We should do this, as we have discussed before, "git-fsck(1)" implicitly
checks some refs which makes the code hard to understand.

> Coming back to my comment further up, I guess this whole block here
> could be introduced in a separate commit. So the first commit introduces
> the infra to check loose ref contents as an obvious step because we
> simply port over rules that already exist in git-fsck(1). And the second
> step could then do this retroactive tightening with the justification
> you have spelt out in the commit message.

Yes, it will be much more clear. So, I should not simply classify the
situations by the type of refs.

Thanks,
Jialuo
karthik nayak Sept. 10, 2024, 4:07 p.m. UTC | #3
shejialuo <shejialuo@gmail.com> writes:

> We implicitly rely on "git-fsck(1)" to check the consistency of regular
> refs. However, when parsing the regular refs for files backend by using

Nit: s/for/in the/

> "files-backend.c::parse_loose_ref_contents", we allow the ref content to
> end with no newline or to contain some garbages.

The 'no newline' reads a bit odd, perhaps, "we allow the ref's content
to end with garbage or without a newline."


> Even though we never create such loose refs ourselves, we have accepted
> such loose refs. So, it is entirely possible that some third-party tools
> may rely on such loose refs being valid. We should not report an error
> fsck message at current. But let's notice such a "curiously formatted"

s/such a/such/ since the next line uses 'refs' in plural form.

> loose refs being valid and tell the user our findings, so we can access

s/access/assess

> the possible extent of damage when we tighten the parsing rules in the
> future.
>

We could also rewrite the last sentence to make it a little more clearer
as "We should notify the users about such 'curiously formatted' loose
refs so that adequate care is taken before we decide to tighter the rules
in the future."

> And it's not suitable to either report a warn fsck message to the user.
> This is because if the caller set the "strict" field in "fsck_options",
> fsck warns will be automatically upgraded to errors. We should not allow
> user to specify the "--strict" flag to upgrade the fsck warnings to
> errors at current. It might cause compatibility issue which may break
> the legacy repository. So we add the following two fsck infos to

I think Patrick touched base here and I agree with his comments.

> represent the situation where the ref content ends without newline or has
> garbages:
>
> 1. "refMissingNewline(INFO)": A ref does not end with newline. This kind
>    of ref may be considered ERROR in the future.
> 2. "trailingRefContent(INFO)": A ref has trailing contents. This kind of

s/contents/content

>    ref may be considered ERROR in the future.
>
> It may seem that we could not give the user any warnings by creating

s/could/would

> fsck infos. However, in "fsck.c::fsck_vreport", we will convert

I think we can also rephrase this first sentence a little better,
perhaps:

    It might appear that we can't provide the user with any warnings by
    using FSCK_INFO.

> "FSCK_INFO" to "FSCK_WARN" and we can still warn the user about these
> situations when using "git-refs verify" without introducing
> compatibility issue.

s/issue/issues

> In current "git-fsck(1)", it will report an error when the ref content
> is bad, so we should following this to report an error to the user when
> "parse_loose_ref_contents" fails. And we add a new fsck error message
> called "badRefContent(ERROR)" to represent that a ref has a bad content.

I would rephrase this a bit, as:

    The "git-fsck(1)" command reports an error when the ref content is
    invalid. Following this, add a similar check to "git refs verify". A
    a new fsck error message called "badRefContent(ERROR)" to represent
    that a ref has a invalid content.

[snip]

> +static int files_fsck_refs_content(struct ref_store *ref_store,
> +				   struct fsck_options *o,
> +				   const char *refs_check_dir,
> +				   struct dir_iterator *iter)
> +{
> +	struct strbuf ref_content = STRBUF_INIT;
> +	struct strbuf referent = STRBUF_INIT;
> +	struct strbuf refname = STRBUF_INIT;
> +	struct fsck_ref_report report = {0};
> +	const char *trailing = NULL;
> +	unsigned int type = 0;
> +	int failure_errno = 0;
> +	struct object_id oid;
> +	int ret = 0;
> +
> +	strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path);
> +	report.path = refname.buf;
> +
> +	if (S_ISLNK(iter->st.st_mode))
> +		goto cleanup;

Since we iterate over all refs, we don't need to check the target for a
symbolic link. So we skip all symbolic links. Makes sense. Would be nice
to have a comment here.

[snip]
shejialuo Sept. 13, 2024, 10:25 a.m. UTC | #4
On Tue, Sep 10, 2024 at 09:07:15AM -0700, karthik nayak wrote:

[snip]

> > +static int files_fsck_refs_content(struct ref_store *ref_store,
> > +				   struct fsck_options *o,
> > +				   const char *refs_check_dir,
> > +				   struct dir_iterator *iter)
> > +{
> > +	struct strbuf ref_content = STRBUF_INIT;
> > +	struct strbuf referent = STRBUF_INIT;
> > +	struct strbuf refname = STRBUF_INIT;
> > +	struct fsck_ref_report report = {0};
> > +	const char *trailing = NULL;
> > +	unsigned int type = 0;
> > +	int failure_errno = 0;
> > +	struct object_id oid;
> > +	int ret = 0;
> > +
> > +	strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path);
> > +	report.path = refname.buf;
> > +
> > +	if (S_ISLNK(iter->st.st_mode))
> > +		goto cleanup;
> 
> Since we iterate over all refs, we don't need to check the target for a
> symbolic link. So we skip all symbolic links. Makes sense. Would be nice
> to have a comment here.
> 

Today I am handling the reviews, there is a misunderstanding here. It's
correct that "We don't need to check the target for a symbolic link".
But we do need to check the symbolic links. It might be a symlink
symref. In here, we just ignore the implementation and will be
implemented in the later patch.
diff mbox series

Patch

diff --git a/Documentation/fsck-msgids.txt b/Documentation/fsck-msgids.txt
index 68a2801f15..06d045ac48 100644
--- a/Documentation/fsck-msgids.txt
+++ b/Documentation/fsck-msgids.txt
@@ -19,6 +19,9 @@ 
 `badParentSha1`::
 	(ERROR) A commit object has a bad parent sha1.
 
+`badRefContent`::
+	(ERROR) A ref has a bad content.
+
 `badRefFiletype`::
 	(ERROR) A ref has a bad file type.
 
@@ -170,6 +173,14 @@ 
 `nullSha1`::
 	(WARN) Tree contains entries pointing to a null sha1.
 
+`refMissingNewline`::
+	(INFO) A ref does not end with newline. This kind of ref may
+	be considered ERROR in the future.
+
+`trailingRefContent`::
+	(INFO) A ref has trailing contents. This kind of ref may be
+	considered ERROR in the future.
+
 `treeNotSorted`::
 	(ERROR) A tree is not properly sorted.
 
diff --git a/fsck.h b/fsck.h
index 500b4c04d2..b85072df57 100644
--- a/fsck.h
+++ b/fsck.h
@@ -31,6 +31,7 @@  enum fsck_msg_type {
 	FUNC(BAD_NAME, ERROR) \
 	FUNC(BAD_OBJECT_SHA1, ERROR) \
 	FUNC(BAD_PARENT_SHA1, ERROR) \
+	FUNC(BAD_REF_CONTENT, ERROR) \
 	FUNC(BAD_REF_FILETYPE, ERROR) \
 	FUNC(BAD_REF_NAME, ERROR) \
 	FUNC(BAD_TIMEZONE, ERROR) \
@@ -84,6 +85,8 @@  enum fsck_msg_type {
 	FUNC(MAILMAP_SYMLINK, INFO) \
 	FUNC(BAD_TAG_NAME, INFO) \
 	FUNC(MISSING_TAGGER_ENTRY, INFO) \
+	FUNC(REF_MISSING_NEWLINE, INFO) \
+	FUNC(TRAILING_REF_CONTENT, INFO) \
 	/* ignored (elevated when requested) */ \
 	FUNC(EXTRA_HEADER_ENTRY, IGNORE)
 
diff --git a/refs.c b/refs.c
index 74de3d3009..5e74881945 100644
--- a/refs.c
+++ b/refs.c
@@ -1758,7 +1758,7 @@  static int refs_read_special_head(struct ref_store *ref_store,
 	}
 
 	result = parse_loose_ref_contents(ref_store->repo->hash_algo, content.buf,
-					  oid, referent, type, failure_errno);
+					  oid, referent, type, NULL, failure_errno);
 
 done:
 	strbuf_release(&full_path);
diff --git a/refs/files-backend.c b/refs/files-backend.c
index 890d0324e1..0187b85c5f 100644
--- a/refs/files-backend.c
+++ b/refs/files-backend.c
@@ -560,7 +560,7 @@  static int read_ref_internal(struct ref_store *ref_store, const char *refname,
 	buf = sb_contents.buf;
 
 	ret = parse_loose_ref_contents(ref_store->repo->hash_algo, buf,
-				       oid, referent, type, &myerr);
+				       oid, referent, type, NULL, &myerr);
 
 out:
 	if (ret && !myerr)
@@ -597,7 +597,7 @@  static int files_read_symbolic_ref(struct ref_store *ref_store, const char *refn
 int parse_loose_ref_contents(const struct git_hash_algo *algop,
 			     const char *buf, struct object_id *oid,
 			     struct strbuf *referent, unsigned int *type,
-			     int *failure_errno)
+			     const char **trailing, int *failure_errno)
 {
 	const char *p;
 	if (skip_prefix(buf, "ref:", &buf)) {
@@ -619,6 +619,10 @@  int parse_loose_ref_contents(const struct git_hash_algo *algop,
 		*failure_errno = EINVAL;
 		return -1;
 	}
+
+	if (trailing)
+		*trailing = p;
+
 	return 0;
 }
 
@@ -3430,6 +3434,65 @@  typedef int (*files_fsck_refs_fn)(struct ref_store *ref_store,
 				  const char *refs_check_dir,
 				  struct dir_iterator *iter);
 
+static int files_fsck_refs_content(struct ref_store *ref_store,
+				   struct fsck_options *o,
+				   const char *refs_check_dir,
+				   struct dir_iterator *iter)
+{
+	struct strbuf ref_content = STRBUF_INIT;
+	struct strbuf referent = STRBUF_INIT;
+	struct strbuf refname = STRBUF_INIT;
+	struct fsck_ref_report report = {0};
+	const char *trailing = NULL;
+	unsigned int type = 0;
+	int failure_errno = 0;
+	struct object_id oid;
+	int ret = 0;
+
+	strbuf_addf(&refname, "%s/%s", refs_check_dir, iter->relative_path);
+	report.path = refname.buf;
+
+	if (S_ISLNK(iter->st.st_mode))
+		goto cleanup;
+
+	if (strbuf_read_file(&ref_content, iter->path.buf, 0) < 0) {
+		ret = error_errno(_("%s/%s: unable to read the ref"),
+				  refs_check_dir, iter->relative_path);
+		goto cleanup;
+	}
+
+	if (parse_loose_ref_contents(ref_store->repo->hash_algo,
+				     ref_content.buf, &oid, &referent,
+				     &type, &trailing, &failure_errno)) {
+		ret = fsck_report_ref(o, &report,
+				      FSCK_MSG_BAD_REF_CONTENT,
+				      "invalid ref content");
+		goto cleanup;
+	}
+
+	if (!(type & REF_ISSYMREF)) {
+		if (*trailing == '\0') {
+			ret = fsck_report_ref(o, &report,
+					      FSCK_MSG_REF_MISSING_NEWLINE,
+					      "missing newline");
+			goto cleanup;
+		}
+
+		if (*trailing != '\n' || (*(trailing + 1) != '\0')) {
+			ret = fsck_report_ref(o, &report,
+					      FSCK_MSG_TRAILING_REF_CONTENT,
+					      "trailing garbage in ref");
+			goto cleanup;
+		}
+	}
+
+cleanup:
+	strbuf_release(&refname);
+	strbuf_release(&ref_content);
+	strbuf_release(&referent);
+	return ret;
+}
+
 static int files_fsck_refs_name(struct ref_store *ref_store UNUSED,
 				struct fsck_options *o,
 				const char *refs_check_dir,
@@ -3512,6 +3575,7 @@  static int files_fsck_refs(struct ref_store *ref_store,
 {
 	files_fsck_refs_fn fsck_refs_fn[]= {
 		files_fsck_refs_name,
+		files_fsck_refs_content,
 		NULL,
 	};
 
diff --git a/refs/refs-internal.h b/refs/refs-internal.h
index 2313c830d8..73b05f971b 100644
--- a/refs/refs-internal.h
+++ b/refs/refs-internal.h
@@ -715,7 +715,7 @@  struct ref_store {
 int parse_loose_ref_contents(const struct git_hash_algo *algop,
 			     const char *buf, struct object_id *oid,
 			     struct strbuf *referent, unsigned int *type,
-			     int *failure_errno);
+			     const char **trailing, int *failure_errno);
 
 /*
  * Fill in the generic part of refs and add it to our collection of
diff --git a/t/t0602-reffiles-fsck.sh b/t/t0602-reffiles-fsck.sh
index 71a4d1a5ae..a06ad044f2 100755
--- a/t/t0602-reffiles-fsck.sh
+++ b/t/t0602-reffiles-fsck.sh
@@ -89,4 +89,124 @@  test_expect_success 'ref name check should be adapted into fsck messages' '
 	test_must_be_empty err
 '
 
+test_expect_success 'regular ref content should be checked (individual)' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	branch_dir_prefix=.git/refs/heads &&
+	tag_dir_prefix=.git/refs/tags &&
+	cd repo &&
+	test_commit default &&
+	mkdir -p "$branch_dir_prefix/a/b" &&
+
+	git refs verify 2>err &&
+	test_must_be_empty err &&
+
+	printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+	git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	warning: refs/heads/branch-no-newline: refMissingNewline: missing newline
+	EOF
+	rm $branch_dir_prefix/branch-no-newline &&
+	test_cmp expect err &&
+
+	printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+	git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	warning: refs/heads/branch-garbage: trailingRefContent: trailing garbage in ref
+	EOF
+	rm $branch_dir_prefix/branch-garbage &&
+	test_cmp expect err &&
+
+	printf "%s\n\n\n" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-1 &&
+	git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	warning: refs/tags/tag-garbage-1: trailingRefContent: trailing garbage in ref
+	EOF
+	rm $tag_dir_prefix/tag-garbage-1 &&
+	test_cmp expect err &&
+
+	printf "%s\n\n\n  garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-2 &&
+	git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	warning: refs/tags/tag-garbage-2: trailingRefContent: trailing garbage in ref
+	EOF
+	rm $tag_dir_prefix/tag-garbage-2 &&
+	test_cmp expect err &&
+
+	printf "%s    garbage\n\na" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-3 &&
+	git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	warning: refs/tags/tag-garbage-3: trailingRefContent: trailing garbage in ref
+	EOF
+	rm $tag_dir_prefix/tag-garbage-3 &&
+	test_cmp expect err &&
+
+	printf "%s garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-4 &&
+	test_must_fail git -c fsck.trailingRefContent=error refs verify 2>err &&
+	cat >expect <<-EOF &&
+	error: refs/tags/tag-garbage-4: trailingRefContent: trailing garbage in ref
+	EOF
+	rm $tag_dir_prefix/tag-garbage-4 &&
+	test_cmp expect err &&
+
+	printf "%sx" "$(git rev-parse main)" >$tag_dir_prefix/tag-bad-1 &&
+	test_must_fail git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	error: refs/tags/tag-bad-1: badRefContent: invalid ref content
+	EOF
+	rm $tag_dir_prefix/tag-bad-1 &&
+	test_cmp expect err &&
+
+	printf "xfsazqfxcadas" >$tag_dir_prefix/tag-bad-2 &&
+	test_must_fail git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	error: refs/tags/tag-bad-2: badRefContent: invalid ref content
+	EOF
+	rm $tag_dir_prefix/tag-bad-2 &&
+	test_cmp expect err &&
+
+	printf "xfsazqfxcadas" >$branch_dir_prefix/a/b/branch-bad &&
+	test_must_fail git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	error: refs/heads/a/b/branch-bad: badRefContent: invalid ref content
+	EOF
+	rm $branch_dir_prefix/a/b/branch-bad &&
+	test_cmp expect err
+'
+
+test_expect_success 'regular ref content should be checked (aggregate)' '
+	test_when_finished "rm -rf repo" &&
+	git init repo &&
+	branch_dir_prefix=.git/refs/heads &&
+	tag_dir_prefix=.git/refs/tags &&
+	cd repo &&
+	test_commit default &&
+	mkdir -p "$branch_dir_prefix/a/b" &&
+
+	printf "%s" "$(git rev-parse main)" >$branch_dir_prefix/branch-no-newline &&
+	printf "%s garbage" "$(git rev-parse main)" >$branch_dir_prefix/branch-garbage &&
+	printf "%s\n\n\n" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-1 &&
+	printf "%s\n\n\n  garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-2 &&
+	printf "%s    garbage\n\na" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-3 &&
+	printf "%s garbage" "$(git rev-parse main)" >$tag_dir_prefix/tag-garbage-4 &&
+	printf "%sx" "$(git rev-parse main)" >$tag_dir_prefix/tag-bad-1 &&
+	printf "xfsazqfxcadas" >$tag_dir_prefix/tag-bad-2 &&
+	printf "xfsazqfxcadas" >$branch_dir_prefix/a/b/branch-bad &&
+
+	test_must_fail git refs verify 2>err &&
+	cat >expect <<-EOF &&
+	error: refs/heads/a/b/branch-bad: badRefContent: invalid ref content
+	error: refs/tags/tag-bad-1: badRefContent: invalid ref content
+	error: refs/tags/tag-bad-2: badRefContent: invalid ref content
+	warning: refs/heads/branch-garbage: trailingRefContent: trailing garbage in ref
+	warning: refs/heads/branch-no-newline: refMissingNewline: missing newline
+	warning: refs/tags/tag-garbage-1: trailingRefContent: trailing garbage in ref
+	warning: refs/tags/tag-garbage-2: trailingRefContent: trailing garbage in ref
+	warning: refs/tags/tag-garbage-3: trailingRefContent: trailing garbage in ref
+	warning: refs/tags/tag-garbage-4: trailingRefContent: trailing garbage in ref
+	EOF
+	sort err >sorted_err &&
+	test_cmp expect sorted_err
+'
+
 test_done