diff mbox series

[v2,5/8] packed-backend: check whether the refname contains NUL characters

Message ID Z5r7EkDwEsxuLJzn@ArchLinux (mailing list archive)
State New
Headers show
Series add more ref consistency checks | expand

Commit Message

shejialuo Jan. 30, 2025, 4:07 a.m. UTC
We have already implemented the header consistency check for the raw
"packed-refs" file. Before we implement the consistency check for each
ref entry, let's analysis [1] which reports that "git fsck" cannot
detect some NUL characters.

"packed-backend.c::next_record" will use "check_refname_format" to check
the consistency of the refname. If it is not OK, the program will die.
So, we already have the code path and we must miss out something.

We use the following code to get the refname:

    strbuf_add(&iter->refname_buf, p, eol - p);
    iter->base.refname = iter->refname_buf.buf

In the above code, `p` is the start pointer of the refname and `eol` is
the next newline pointer. We calculate the length of the refname by
subtracting the two pointers. Then we add the memory range between `p`
and `eol` to get the refname.

However, if there are some NUL characters in the memory range between `p`
and `eol`, we will see the refname as a valid ref name as long as the
memory range between `p` and first occurred NUL character is valid.

In order to catch above corruption, create a new function
"refname_contains_nul" by searching the first NUL character. If it is
not at the end of the string, there must be some NUL characters in the
refname.

Use this function in "next_record" function to die the program if
"refname_contains_nul" returns true.

[1] https://lore.kernel.org/git/6cfee0e4-3285-4f18-91ff-d097da9de737@rd10.de/

Reported-by: R. Diez <rdiez-temp3@rd10.de>
Mentored-by: Patrick Steinhardt <ps@pks.im>
Mentored-by: Karthik Nayak <karthik.188@gmail.com>
Signed-off-by: shejialuo <shejialuo@gmail.com>
---
 refs/packed-backend.c | 19 +++++++++++++++++++
 1 file changed, 19 insertions(+)

Comments

Patrick Steinhardt Feb. 3, 2025, 8:40 a.m. UTC | #1
On Thu, Jan 30, 2025 at 12:07:46PM +0800, shejialuo wrote:
> We have already implemented the header consistency check for the raw
> "packed-refs" file. Before we implement the consistency check for each
> ref entry, let's analysis [1] which reports that "git fsck" cannot
> detect some NUL characters.

This paragraph doesn't quite parse. I think it can simply be left out,
as the remainder of the commit message already explains in more than
enough detail what you're doing.

> "packed-backend.c::next_record" will use "check_refname_format" to check
> the consistency of the refname. If it is not OK, the program will die.
> So, we already have the code path and we must miss out something.
> 
> We use the following code to get the refname:
> 
>     strbuf_add(&iter->refname_buf, p, eol - p);
>     iter->base.refname = iter->refname_buf.buf
> 
> In the above code, `p` is the start pointer of the refname and `eol` is
> the next newline pointer. We calculate the length of the refname by
> subtracting the two pointers. Then we add the memory range between `p`
> and `eol` to get the refname.
> 
> However, if there are some NUL characters in the memory range between `p`
> and `eol`, we will see the refname as a valid ref name as long as the
> memory range between `p` and first occurred NUL character is valid.
> 
> In order to catch above corruption, create a new function
> "refname_contains_nul" by searching the first NUL character. If it is
> not at the end of the string, there must be some NUL characters in the
> refname.
> 
> Use this function in "next_record" function to die the program if
> "refname_contains_nul" returns true.

Yeah, makes sense to me. NUL bytes are invalid, and nothing good can
come out of it.

> diff --git a/refs/packed-backend.c b/refs/packed-backend.c
> index 883189f3a1..870c8e7aaa 100644
> --- a/refs/packed-backend.c
> +++ b/refs/packed-backend.c
> @@ -494,6 +494,22 @@ static void verify_buffer_safe(struct snapshot *snapshot)
>  				 last_line, eof - last_line);
>  }
>  
> +/*
> + * When parsing the "packed-refs" file, we will parse it line by line.
> + * Because we know the start pointer of the refname and the next
> + * newline pointer, we could calculate the length of the refname by
> + * subtracting the two pointers. However, there is a corner case where
> + * the refname contains corrupted embedded NUL characters. And
> + * `check_refname_format()` will not catch this when the truncated
> + * refname is still a valid refname. To prevent this, we need to check
> + * whether the refname contains the NUL characters.
> + */
> +static int refname_contains_nul(struct strbuf *refname)
> +{
> +	const char *pos = memchr(refname->buf, '\0', refname->len + 1);
> +	return pos < refname->buf + refname->len;
> +}

This can be simplified to:

    return !!memchr(refname->buf, '\0', refname->len);

Ideally, we'd be amending `check_refname_format()` to do the checking
for us. But we can't without a wider refactoring because that function
gets a C string, and C strings are naturally terminadet by NUL
characters.

I think that adding a new function for this is a bit over the top
though, as the check is unlikely to be useful in a lot of places and the
logic is rather trivial. So I'd just inline the check into
`next_record()`.

Patrick
diff mbox series

Patch

diff --git a/refs/packed-backend.c b/refs/packed-backend.c
index 883189f3a1..870c8e7aaa 100644
--- a/refs/packed-backend.c
+++ b/refs/packed-backend.c
@@ -494,6 +494,22 @@  static void verify_buffer_safe(struct snapshot *snapshot)
 				 last_line, eof - last_line);
 }
 
+/*
+ * When parsing the "packed-refs" file, we will parse it line by line.
+ * Because we know the start pointer of the refname and the next
+ * newline pointer, we could calculate the length of the refname by
+ * subtracting the two pointers. However, there is a corner case where
+ * the refname contains corrupted embedded NUL characters. And
+ * `check_refname_format()` will not catch this when the truncated
+ * refname is still a valid refname. To prevent this, we need to check
+ * whether the refname contains the NUL characters.
+ */
+static int refname_contains_nul(struct strbuf *refname)
+{
+	const char *pos = memchr(refname->buf, '\0', refname->len + 1);
+	return pos < refname->buf + refname->len;
+}
+
 #define SMALL_FILE_SIZE (32*1024)
 
 /*
@@ -895,6 +911,9 @@  static int next_record(struct packed_ref_iterator *iter)
 	strbuf_add(&iter->refname_buf, p, eol - p);
 	iter->base.refname = iter->refname_buf.buf;
 
+	if (refname_contains_nul(&iter->refname_buf))
+		die("packed refname contains embedded NULL: %s", iter->base.refname);
+
 	if (check_refname_format(iter->base.refname, REFNAME_ALLOW_ONELEVEL)) {
 		if (!refname_is_safe(iter->base.refname))
 			die("packed refname is dangerous: %s",