From patchwork Tue Feb 25 06:28:24 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989346 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 2E7A3257437 for ; Tue, 25 Feb 2025 06:28:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740464908; cv=none; b=Oz/WCAdFIhHZu6H+TDlMw6bzoKezh5yNxdDaBsQIOXCrh8iPibu8nI6/xFmCXzqse7KJ2NGW79foRu7dB/IF/sBaxC+YAscAXXCqP01Xmxat2ACGHiAvIqMIWdiT+2FPYdDs+i+HHE7hXeWsdLdHUT1vQrLVPyrtjJtXwim1SX4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740464908; c=relaxed/simple; bh=a3+CFjYIABbUrhwimUaUcdaaSZ1z4vwFkMCNIxb3Jy4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=rjP8ai97hJ2j6FinH+Lb7CCKpK05xrOXcQdzPbI+wE8LxwQzYALXoiQ857H7DRaikvRDYLRAhVluBSC9/+RQqWQYPdzmK13naNJShKMqMbGpHl9WZjEsFmJ7xnUIh4p3M/6Vsmoj3tSfiV6e58hLnrg6SdHfQ1tB5lIR9UPoIuw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=OgzCTUQF; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="OgzCTUQF" Received: (qmail 24560 invoked by uid 109); 25 Feb 2025 06:28:25 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=a3+CFjYIABbUrhwimUaUcdaaSZ1z4vwFkMCNIxb3Jy4=; b=OgzCTUQFe7Xhre+C9RIvBUsfrRolbRgl3i6K4Qx1iCi366DKMuCeQ8tgv6wOO3c2DRLGteK/ZH6Bb69JAqjnIgl92ybzLUhEn3PUy44mjUAuc6wqCMCfbkHQGrgpKV+Byy0loIRTCOSC5EuMbPTSCl4aNb9Jxo8fL7yME3k04acMhJQDbF3fB4n2dZmJmGhZOtFSMNzNzils/JSUGKIh169jHBkxdW2pIXpOJw6KU/Z02fWOQxPV82xfB4g0gtFVjzzq9gbHpao2Fb4/LJLidHOj+BjVWxo4bhcU2ebpVI5FHC32G3y/hxlqFCsLkbW/I+qsQs6bTxW6b/+w7hC/ew== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:28:25 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2672 invoked by uid 111); 25 Feb 2025 06:28:24 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:28:24 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:28:24 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 01/10] loose_object_info(): BUG() on inflating content with unknown type Message-ID: <20250225062824.GA1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> After unpack_loose_header() returns, it will have inflated not only the object header, but possibly some bytes of the object content. When we call unpack_loose_rest() to extract the actual content, it finds those extra bytes by skipping past the header's terminating NUL in the buffer. Like this: int bytes = strlen(buffer) + 1; n = stream->total_out - bytes; ... memcpy(buf, (char *) buffer + bytes, n); This won't work with the OBJECT_INFO_ALLOW_UNKNOWN_TYPE flag, as there we allow a header of arbitrary size. We put into a strbuf, but feed only the final 32-byte chunk we read to unpack_loose_rest(). In that case stream->total_out may unexpectedly large, and thus our "n" will be large, causing an out-of-bounds read (we do check it against our allocated buffer size, which prevents an out-of-bounds write). Probably this could be made to work by feeding the strbuf to unpack_loose_rest(), along with adjusting some types (e.g., "bytes" would need to be a size_t, since it is no longer operating on a 32-byte buffer). But I don't think it's possible to actually trigger this in practice. The only caller who passes ALLOW_UNKNOWN_TYPE is cat-file, which only allows it with the "-t" and "-s" options (neither of which access the content). There is one way you can _almost_ trigger it: the oid compat routines (i.e., accessing sha1 via sha256 names and vice versa) will convert objects on the fly (which requires access to the content) using the same flags that were passed in. So in theory this: t='some very large type field that causes an extra inflate call' sha1_oid=$(git hash-object -w -t "$t" file) sha256_oid=$(git rev-parse --output-object-format=sha256 $sha1_oid) git cat-file --allow-unknown-type -s $sha256_oid would try to access the content. But it doesn't work, because using compat objects requires an entry in the .git/objects/loose-object-idx file, and we don't generate such an entry for non-standard types (see the "compat" section of write_object_file_literally()). If we use "t=blob" instead, then it does access the compat object, but it doesn't trigger the problem (because "blob" is a standard short type name, and it fits in the initial 32-byte buffer). So given that this is almost a memory error bug, I think it's worth addressing. But because we can't actually trigger the situation, I'm hesitant to try a fix that we can't run. Instead let's document the restriction and protect ourselves from the out-of-bounds read by adding a BUG() check. Signed-off-by: Jeff King --- I found this because I was tracing the code path after unpack_loose_header() returns to verify some assumptions in the other patches. It really makes me wonder if this "unknown type" stuff has any value at all. You can create an object with any type using "hash-object --literally -t". And you can ask about its type and size. But you can never retrieve the object content! Nor can you pack it or transfer it, since packs use a numeric type field. This code was added ~2015, but I don't think anybody built more on top of it. I wonder if we should just consider it a failed experiment and rip out the support. object-file.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/object-file.c b/object-file.c index 00c3a4b910..45b251ba04 100644 --- a/object-file.c +++ b/object-file.c @@ -1580,6 +1580,8 @@ static int loose_object_info(struct repository *r, if (!oi->contentp) break; + if (hdrbuf.len) + BUG("unpacking content with unknown types not yet supported"); *oi->contentp = unpack_loose_rest(&stream, hdr, *oi->sizep, oid); if (*oi->contentp) goto cleanup; From patchwork Tue Feb 25 06:29:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989347 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 81E50257422 for ; Tue, 25 Feb 2025 06:29:02 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740464944; cv=none; b=g6Rdj100Mxci/xJyv4nZYUpkJpQNu+dgXcFM9Wa67HFrhhwHwe4XlA5yNnl9KMQHQRPQtc2fPaDYY+SdcsCdVl38f+mcNr6Pr2sfZLNNsWdyXKIbSkXPqKPvnZ9vDbsSLUObQFku1xGY0GuDaoxQqWMc8MDho3w+4y8pY/UP1a0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740464944; c=relaxed/simple; bh=s1akJXVmihyq7Ut+m85y2/lZ6g3Y0TiPxOG1hBp6UpU=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=gsgdK8+hGiEDU+yD8e259evUN51TX8l98GbSwhAI3gFuZT3xapPclBejKGbG0Nf7OEb/nwnVCgjQGYZjfBXvTIOnaQ4rGu7mIyS/JuDexX1Ko8qZk3txcX+8iqhu0UOfQ2fVog0ts9mB8t8L5bSuOzj3HylikUUvpnmCxWe0GG0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=TMI/moOJ; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="TMI/moOJ" Received: (qmail 24570 invoked by uid 109); 25 Feb 2025 06:29:02 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=s1akJXVmihyq7Ut+m85y2/lZ6g3Y0TiPxOG1hBp6UpU=; b=TMI/moOJYzxPe+LKBqYa1qhmv+fjAzJIQMUj9kRkoax8KiBiH2+uGwQ0isLr01qG1l9OjVJjdJ1BooirEEbqeXyIgb66j0f3H0KpC2FpazRsFHljkwXGDU6tUbj1dCYYgeUZAZTMy4wRxI++F2xCsLt/+LcU8lMUOecTRlQ9CWoLPbL7yJE+PE8ZgLONpQPdMA0H3lzGM8xx/K0z0AH0jX6D59m1s6zp+1tHpNrOnBC/hJIBqq3lIsJPH4KBlEdE3Q/M9XwxiVAenfL9LwXaAXjeXDhdlQ++Xgp+pLBBIUYT1TJXj/l3LgmvQAaWYM58SvBJ7T+B+QLlIMNahw21jA== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:29:02 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2692 invoked by uid 111); 25 Feb 2025 06:29:01 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:29:01 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:29:00 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 02/10] unpack_loose_header(): simplify next_out assignment Message-ID: <20250225062900.GB1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> When using OBJECT_INFO_ALLOW_UNKNOWN_TYPE to unpack a header that doesn't fit into our initial 32-byte buffer, we loop over calls git_inflate(), feeding it our buffer to the "next_out" pointer each time. As the code is written, we reset next_out after each inflate call (and after reading the output), ready for the next loop. This isn't wrong, but there are a few advantages to setting up "next_out" right before each inflate call, rather than after: 1. It drops a few duplicated lines of code. 2. It makes it obvious that we always feed a fresh buffer on each call (and thus can never see Z_BUF_ERROR due to due to a lack of output space). 3. After we exit the loop, we'll leave stream->next_out pointing to the end of the fetched data (this is how zlib callers find out how much data is in the buffer). This doesn't matter in practice, since nobody looks at it again. But it's probably the least-surprising thing to do, as it matches how next_out is left when the whole thing fits in the initial 32-byte buffer (and we don't enter the loop at all). Signed-off-by: Jeff King --- Not strictly necessary, but I think it makes some of the reasoning in patch 4 easier. object-file.c | 7 +++---- 1 file changed, 3 insertions(+), 4 deletions(-) diff --git a/object-file.c b/object-file.c index 45b251ba04..0fd42981fb 100644 --- a/object-file.c +++ b/object-file.c @@ -1385,18 +1385,17 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream, * reading the stream. */ strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer); - stream->next_out = buffer; - stream->avail_out = bufsiz; do { + stream->next_out = buffer; + stream->avail_out = bufsiz; + obj_read_unlock(); status = git_inflate(stream, 0); obj_read_lock(); strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer); if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer)) return 0; - stream->next_out = buffer; - stream->avail_out = bufsiz; } while (status != Z_STREAM_END); return ULHR_TOO_LONG; } From patchwork Tue Feb 25 06:29:40 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989348 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9F11025B678 for ; Tue, 25 Feb 2025 06:29:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740464983; cv=none; b=g8VKxm9CAyxd/nYRX9sRfSwBZ33AdQ67TFqJ9ronRx5F78zVKW1B4xmGWdbwsrLrhGlipWVpvVR0tRiRVi36k9Ti+olTxRnCIORuP391uJSnMTHYdL0ElrLXlDpvYZIYsoGoFKJnPGlnUtvi8GRS5uRaUz7DwLKidhJlWrslBQk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740464983; c=relaxed/simple; bh=DYIxHgMF1g1XR/FOe9ZxE+b+BndWt/AyeTx6octEjZk=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=H4T4NBT3Tzq1i73WJ9p9bY3omTiEgRRvmed3OrylWgmJSa0wn+o02T1WIQf+dbm9gJN+/arZrPxolMl1Y94fcHpkbj7fItB4rplvQUhXUNhYW/3ePZEi15cxBaGQiw2VrO3fb7cn5wq7TZcJuBQtYLRTeNrHn1ywlGDswLv/n/E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=NWQfotow; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="NWQfotow" Received: (qmail 24587 invoked by uid 109); 25 Feb 2025 06:29:41 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=DYIxHgMF1g1XR/FOe9ZxE+b+BndWt/AyeTx6octEjZk=; b=NWQfotowl8f/knyU3a33z0aWV32giE4FLopMRsdn+Ozi9CvsTAaU9okIxlm0x47j75sN4H5o+GHVW0pI2DsQ9GpcsChJOQCHno7Ud9Jcbpj114dwlVIbt+z1RjhbnP+rsBLuzxFLbA3rGcaMCyUNnrUmWBWivCJTsXbxfiB7/Pp2cnY1SXAWFefgzTqWuXclu021TVFj+DNvSXWma28C0gnRuYaKHACTjKhJ16MeH3gPvitvq2JVuZ8m6re8/E747frZ6xfQvkcTNB8tsLLv/kPvlla1p3qAicvEiHH+n2Ksj3wBzD1JoF4nAi8Ec1ASwktrdmq37KZBDIJjJD2Zcw== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:29:41 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2701 invoked by uid 111); 25 Feb 2025 06:29:40 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:29:40 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:29:40 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 03/10] unpack_loose_header(): report headers without NUL as "bad" Message-ID: <20250225062940.GC1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> If a caller asks us to read the whole loose object header value into a strbuf (e.g., via "cat-file --allow-unknown-type"), we'll keep reading until we see a NUL byte marking the end of the header. If we hit Z_STREAM_END before seeing the NUL, we obviously have to stop. But we return ULHR_TOO_LONG, which doesn't make any sense. The "too long" return code is used in the normal, 32-byte limited mode to indicate that we stopped looking. There is no such thing as "too long" here, as we'd keep reading forever until we see the end of stream or the NUL. Instead, we should return ULHR_BAD. The loose object has no NUL marking the end of header, so it is malformed. The behavior difference is slight; in either case we'd consider the object unreadable and refuse to go further. The only difference is the specific error message we produce. There's no test case here, as we'd need to generate a valid zlib stream without a NUL. That's not something Git will do without writing new custom code. And in the next patch we'll fix another bug in this area which will make this easier to do (and we will test it then). Signed-off-by: Jeff King --- You might be curious what happens if we see an error before the stream end. If so, read on to patch 4... object-file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/object-file.c b/object-file.c index 0fd42981fb..8c9295413b 100644 --- a/object-file.c +++ b/object-file.c @@ -1397,7 +1397,7 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream, if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer)) return 0; } while (status != Z_STREAM_END); - return ULHR_TOO_LONG; + return ULHR_BAD; } static void *unpack_loose_rest(git_zstream *stream, From patchwork Tue Feb 25 06:29:58 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989349 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 19F3D25A343 for ; Tue, 25 Feb 2025 06:29:59 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465002; cv=none; b=JEqtJe75uJMTQyNDLaE2L1MjPYr4e2DKVBV545/MCyFa/ZOYZPo9LW7T6i6G7DZtqXREPGpw2cxTAx1pzfCbGMm7Q7Esmub1SVJxmakEiM+angisGnztY5d3TSvrER9Qjh65KYCC4+jOg4IYu3fa/h5acuOl7CuXz3b8aBYgPz0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465002; c=relaxed/simple; bh=LzgM5QvFa5WiI2p/dd2I95OmHFyeMo/XNlq5gq6XH+s=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=fl5VFBgEo9pv/4/xfWJLtlDSFoQmfaRY3n/vMlr9CWivJ95lh5CA6KarVPHCvfx9oS+upF/NToaKWA/8xCO21D6TmpG0PR7Wx+hMlg3P05JqyLOuheebEPQt+t/zpd2ZOr0vMZu6VBm6BWppUyeHHGKfs0/jm+uPHjYjugbVJLU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=Q8whKnUL; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="Q8whKnUL" Received: (qmail 24596 invoked by uid 109); 25 Feb 2025 06:29:59 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=LzgM5QvFa5WiI2p/dd2I95OmHFyeMo/XNlq5gq6XH+s=; b=Q8whKnULIQxXDAXKkMZLkh+l+ozM0INXPTI5tHLcFPm7XYsXfSNVWo9WeJ2G5JQ+asMnaHhxHXHSkokQEsM0CqV7MD/5ntVoCcFDsMVMjmz9nkcsvqHpT2DzOYgZEfPjHe/kPLMrA7QYkKg6gyTD7kisgTOglfZKua5sbSeK+KXlZm0EAp2Dk92L84LPRtoTFlyy6oOtO62Flcbb756vhue/glROr/FqfOE6CQPcsz38/rKuMRh148Njqtm6TAjjmOrlRhOQeMmjkloj+MCeJgpmmixaoGby4xs5zqW+7a9rzGjIDe7xlVPrExm1RL4qAPU4hX7FyucGT7DpUIM8eg== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:29:59 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2705 invoked by uid 111); 25 Feb 2025 06:29:58 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:29:58 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:29:58 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 04/10] unpack_loose_header(): fix infinite loop on broken zlib input Message-ID: <20250225062958.GD1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> When reading a loose object, we first try to expand the first 32 bytes to read the type+size header. This is enough for any of the normal Git types. But since 46f034483e (sha1_file: support reading from a loose object of unknown type, 2015-05-03), the caller can also ask us to parse any unknown names, which can be much longer. In this case we keep inflating until we find the NUL at the end of the header, or hit Z_STREAM_END. But what if zlib can't make forward progress? For example, if the loose object file is truncated, we'll have no more data to feed it. It will return Z_BUF_ERROR, and we'll just loop infinitely, calling git_inflate() over and over but never seeing new bytes nor an end-of-stream marker. We can fix this by only looping when we think we can make forward progress. This will always be Z_OK in this case. In other code we might also be able to continue on Z_BUF_ERROR, but: - We will never see Z_BUF_ERROR because the output buffer is full; we always feed a fresh 32-byte buffer on each call to git_inflate(). - We may see Z_BUF_ERROR if we run out of input. But since we've fed the whole mmap'd buffer to zlib, if it runs out of input there is nothing more we can do. So if we don't see Z_OK (and didn't see the end-of-header NUL, otherwise we'd have broken out of the loop), then we should stop looping and return an error. The test case shows an example where the input is truncated (which gives us the input Z_BUF_ERROR case above). Although we do operate on objects we might get from an untrusted remote, I don't think the security implications of this bug are too great. It can only trigger if both of these are true: - You're reading a loose object whose on-disk representation was written by an attacker. So fetching an object (or receiving a push) are mostly OK, because even with unpack-objects it is our local, trusted code that writes out the object file. The exception may be fetching from an untrusted local repo, or using dumb-http, which copies objects verbatim. But... - The only code path which triggers the inflate loop is cat-file's --allow-unknown-type option. This is unlikely to be called at all outside of debugging. But I also suspect that objects with non-standard types (or that are truncated) would not survive the usual fetch/receive checks in the first place. So I think it would be quite hard to trick somebody into running the infinite loop, and we can just fix the bug. Co-authored-by: Taylor Blau Signed-off-by: Jeff King --- object-file.c | 2 +- t/t1006-cat-file.sh | 19 +++++++++++++++++++ 2 files changed, 20 insertions(+), 1 deletion(-) diff --git a/object-file.c b/object-file.c index 8c9295413b..5b2446bfc1 100644 --- a/object-file.c +++ b/object-file.c @@ -1396,7 +1396,7 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream, strbuf_add(header, buffer, stream->next_out - (unsigned char *)buffer); if (memchr(buffer, '\0', stream->next_out - (unsigned char *)buffer)) return 0; - } while (status != Z_STREAM_END); + } while (status == Z_OK); return ULHR_BAD; } diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index 398865d6eb..0b0d915773 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -903,6 +903,25 @@ test_expect_success 'cat-file -t and -s on corrupt loose object' ' ) ' +test_expect_success 'truncated object with --allow-unknown-type' - <<\EOT + objtype='a really long type name that exceeds the 32-byte limit' && + blob=$(git hash-object -w --literally -t "$objtype" /dev/null) && + objpath=.git/objects/$(test_oid_to_path "$blob") && + + # We want to truncate the object far enough in that we don't hit the + # end while inflating the first 32 bytes (since we want to have to dig + # for the trailing NUL of the header). But we don't want to go too far, + # since our header isn't very big. And of course we are counting + # deflated zlib bytes in the on-disk file, so it's a bit of a guess. + # Empirically 50 seems to work. + mv "$objpath" obj.bak && + test_when_finished 'mv obj.bak "$objpath"' && + test_copy_bytes 50 "$objpath" && + + test_must_fail git cat-file --allow-unknown-type -t $blob 2>err && + test_grep "unable to unpack $blob header" err +EOT + # Tests for git cat-file --follow-symlinks test_expect_success 'prep for symlink tests' ' echo_without_newline "$hello_content" >morx && From patchwork Tue Feb 25 06:30:26 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989350 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 07542257437 for ; Tue, 25 Feb 2025 06:30:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465029; cv=none; b=Tk72IM5L9lwbuvjTS7wOGxriZVdKTWbNAVPjncAtZIy9q0O9ilWVDIoReu+Z/m7WXw38K4vpIbPrjmENgcWV4XpPSal/HKYt6k9kJRbymrKmQwa5TwVBaQJqe9Fa7bhKhNSW/gzyAPGs4Md4v0HimAjsHoUKs8kirF7oYi2xxeI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465029; c=relaxed/simple; bh=6Yuc0V/zUvjf1rmB2CQrePXucNIgrCiWaFlQwZ5eRPo=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=k+x0/k/KFwG5ae/JV4AQ/4cwx9RwbCg5qPqTQhOB8N9PcXetZVeX6ps9TnJCLtqOTzlE235cEYiVemOdQJBgtujNEKAoU/hwuEvpUwPxaOMCMxKVeqMadR65Jszc7mn7ivE8Tchxqqqna6iBTq4vAxHRsRz0v8u06YuqRgV7FD8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=H0wzeQOq; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="H0wzeQOq" Received: (qmail 24607 invoked by uid 109); 25 Feb 2025 06:30:27 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=6Yuc0V/zUvjf1rmB2CQrePXucNIgrCiWaFlQwZ5eRPo=; b=H0wzeQOq5c28x2+IDVpyfEzTsUztlGNIVHH9XRT7Ods6H3AxzGDQyvGdVbz4mGoFjnCiBerwW3yiB0yzCv2LgUUEqhDM/LMxYoy3ogkoHJZydoPBT8bNRfeVwKZLdXW8ixznyHaNJfAiHHdHQvN086c4VWeN/A5lVm2wcEvU4nyVK20iAoEJaZumNITKpi3hgZ4zKqVUSEn4B7v3veLlZPYg66nXrgwq5uCg7P+0k4ABgFHyU0MpUS4bqrNOszzEGDkGVzMHajqT8kaT57tG0Kgo8WlTe5RiVzTZl6I+PMsSYM2A6fYPGcXmTOo1oG13NR7St7it0JN6QpqE4dwCUQ== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:30:27 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2751 invoked by uid 111); 25 Feb 2025 06:30:26 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:30:26 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:30:26 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 05/10] git_inflate(): skip zlib_post_call() sanity check on Z_NEED_DICT Message-ID: <20250225063026.GE1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> This fixes a case where malformed object input can cause us to hit a BUG() call in the git-zlib.c code. The zlib format allows the use of preset dictionaries to reduce the size of deflated data. The checksum of the dictionary is computed by the deflate code and goes into the stream. On the inflating side, zlib sees the dictionary checksum and returns Z_NEED_DICT, asking the caller to provide the dictionary data via inflateSetDictionary(). This should never happen in Git, because we never provide a dictionary for deflating (and if we get a stream that mentions a dictionary, we have no idea how to provide it). So normally Z_NEED_DICT is a hard error for us. But something interesting happens if we _do_ happen to see it (e.g., because of a corrupt or malicious input). In git_inflate() as we loop over calls to zlib's inflate(), we translate between our large-integer git_zstream values and zlib's native z_stream types, copying in and out with zlib_pre_call() and zlib_post_call(). In zlib_post_call() we have a few sanity checks, including one that checks that the number of bytes consumed by zlib (as measured by it moving the "next_in" pointer) is equal to the movement of its "total_in" count. But these do not correspond when we see Z_NEED_DICT! Zlib consumes the bytes from the input buffer but it does not increment total_in. And so we hit the BUG("total_in mismatch") call. There are a few options here: - We could ditch that BUG() check. It is making too many assumptions about how zlib updates these values. But it does have value in most cases as a sanity check on the values we're copying. - We could skip the zlib_post_call() entirely when we see Z_NEED_DICT. We know that it's hard error for us, so we should just send the status up the stack and let the caller bail. The downside is that if we ever did want to support dictionaries, we couldn't (the git_zstream will be out of sync, since we never copied its values back from the z_stream). - We could continue to call zlib_post_call(), but skip just that BUG() check if the status is Z_NEED_DICT. This keeps git_inflate() as a thin wrapper around inflate(), and would let us later support dictionaries for some calls if we wanted to. This patch uses the third approach. It seems like the least-surprising thing to keep git_inflate() a close to inflate() as possible. And while it makes the diff a bit larger (since we have to pass the status down to to the zlib_post_call() function), it's a static local function, and every caller by definition will have just made a zlib call (and so will have a status integer). Co-authored-by: Taylor Blau Signed-off-by: Jeff King --- git-zlib.c | 27 ++++++++++++++++----------- t/t1006-cat-file.sh | 32 ++++++++++++++++++++++++++++++++ 2 files changed, 48 insertions(+), 11 deletions(-) diff --git a/git-zlib.c b/git-zlib.c index 651dd9e07c..df9604910e 100644 --- a/git-zlib.c +++ b/git-zlib.c @@ -45,7 +45,7 @@ static void zlib_pre_call(git_zstream *s) s->z.avail_out = zlib_buf_cap(s->avail_out); } -static void zlib_post_call(git_zstream *s) +static void zlib_post_call(git_zstream *s, int status) { unsigned long bytes_consumed; unsigned long bytes_produced; @@ -54,7 +54,12 @@ static void zlib_post_call(git_zstream *s) bytes_produced = s->z.next_out - s->next_out; if (s->z.total_out != s->total_out + bytes_produced) BUG("total_out mismatch"); - if (s->z.total_in != s->total_in + bytes_consumed) + /* + * zlib does not update total_in when it returns Z_NEED_DICT, + * causing a mismatch here. Skip the sanity check in that case. + */ + if (status != Z_NEED_DICT && + s->z.total_in != s->total_in + bytes_consumed) BUG("total_in mismatch"); s->total_out = s->z.total_out; @@ -72,7 +77,7 @@ void git_inflate_init(git_zstream *strm) zlib_pre_call(strm); status = inflateInit(&strm->z); - zlib_post_call(strm); + zlib_post_call(strm, status); if (status == Z_OK) return; die("inflateInit: %s (%s)", zerr_to_string(status), @@ -90,7 +95,7 @@ void git_inflate_init_gzip_only(git_zstream *strm) zlib_pre_call(strm); status = inflateInit2(&strm->z, windowBits); - zlib_post_call(strm); + zlib_post_call(strm, status); if (status == Z_OK) return; die("inflateInit2: %s (%s)", zerr_to_string(status), @@ -103,7 +108,7 @@ void git_inflate_end(git_zstream *strm) zlib_pre_call(strm); status = inflateEnd(&strm->z); - zlib_post_call(strm); + zlib_post_call(strm, status); if (status == Z_OK) return; error("inflateEnd: %s (%s)", zerr_to_string(status), @@ -122,7 +127,7 @@ int git_inflate(git_zstream *strm, int flush) ? 0 : flush); if (status == Z_MEM_ERROR) die("inflate: out of memory"); - zlib_post_call(strm); + zlib_post_call(strm, status); /* * Let zlib work another round, while we can still @@ -160,7 +165,7 @@ void git_deflate_init(git_zstream *strm, int level) memset(strm, 0, sizeof(*strm)); zlib_pre_call(strm); status = deflateInit(&strm->z, level); - zlib_post_call(strm); + zlib_post_call(strm, status); if (status == Z_OK) return; die("deflateInit: %s (%s)", zerr_to_string(status), @@ -176,7 +181,7 @@ static void do_git_deflate_init(git_zstream *strm, int level, int windowBits) status = deflateInit2(&strm->z, level, Z_DEFLATED, windowBits, 8, Z_DEFAULT_STRATEGY); - zlib_post_call(strm); + zlib_post_call(strm, status); if (status == Z_OK) return; die("deflateInit2: %s (%s)", zerr_to_string(status), @@ -207,7 +212,7 @@ int git_deflate_abort(git_zstream *strm) zlib_pre_call(strm); status = deflateEnd(&strm->z); - zlib_post_call(strm); + zlib_post_call(strm, status); return status; } @@ -227,7 +232,7 @@ int git_deflate_end_gently(git_zstream *strm) zlib_pre_call(strm); status = deflateEnd(&strm->z); - zlib_post_call(strm); + zlib_post_call(strm, status); return status; } @@ -244,7 +249,7 @@ int git_deflate(git_zstream *strm, int flush) ? 0 : flush); if (status == Z_MEM_ERROR) die("deflate: out of memory"); - zlib_post_call(strm); + zlib_post_call(strm, status); /* * Let zlib work another round, while we can still diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index 0b0d915773..e493600aff 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -922,6 +922,38 @@ test_expect_success 'truncated object with --allow-unknown-type' - <<\EOT test_grep "unable to unpack $blob header" err EOT +test_expect_success 'object reading handles zlib dictionary' - <<\EOT + echo 'content that will be recompressed' >file && + blob=$(git hash-object -w file) && + objpath=.git/objects/$(test_oid_to_path "$blob") && + + # Recompress a loose object using a precomputed zlib dictionary. + # This was originally done with: + # + # perl -MCompress::Raw::Zlib -e ' + # binmode STDIN; + # binmode STDOUT; + # my $data = do { local $/; }; + # my $in = new Compress::Raw::Zlib::Inflate; + # my $de = new Compress::Raw::Zlib::Deflate( + # -Dictionary => "anything" + # ); + # $in->inflate($data, $raw); + # $de->deflate($raw, $out); + # print $out; + # ' $objpath + # + # but we do not want to require the perl module for all test runs (nor + # carry a custom t/helper program that uses zlib features we don't + # otherwise care about). + mv "$objpath" obj.bak && + test_when_finished 'mv obj.bak "$objpath"' && + printf '\170\273\017\112\003\143' >$objpath && + + test_must_fail git cat-file blob $blob 2>err && + test_grep 'error: inflate: needs dictionary' err +EOT + # Tests for git cat-file --follow-symlinks test_expect_success 'prep for symlink tests' ' echo_without_newline "$hello_content" >morx && From patchwork Tue Feb 25 06:30:56 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989362 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A857256C88 for ; Tue, 25 Feb 2025 06:30:58 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465060; cv=none; b=OnEmn1sAuf58JwxHMuwRVq20LsG7vHhCOv8uvP82+d9d3XbjsQl4RKGX13xZIsJeJ9xzYqoRtYqqvyC009I2uHTphB783F1MF/rT92wXqhPvlUzEJvM/GdtNrQL7hE+RBXMwZhKqt8wEJUaoOzrXasBMmBcGnxWY4pPvCqlXR+Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465060; c=relaxed/simple; bh=dIlvnyf7j8YaT1s4xEGjK41/PMSJ3UI9fWr2UOG1ckw=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=WJGf+s2Uw6U67hz7I/OH3/vEYZYT83bzrBzcEHZ9m9HHEGzF8SALzHLZFl+jT8vbvnHvItxrtQj3cw2Svg899I5ldVjVXDuYMXSGyUQwVzBjKQ8VhaQ5G5EtP34PjvP5xEbpB1YcqUyaa0tvsvSqMxESBUE9duylUxQg3o30I1M= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=aG9waDtz; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="aG9waDtz" Received: (qmail 24620 invoked by uid 109); 25 Feb 2025 06:30:57 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=dIlvnyf7j8YaT1s4xEGjK41/PMSJ3UI9fWr2UOG1ckw=; b=aG9waDtzxp2OtdPZAJ+W/jCGIFFQnRW3Dp3+galZloIUDE9ohJhz3X8H2gFrdHxmWM3CreV79LXVyu+dOIiO9dwAEwC+/VFb5Lx1/C3TbdlhNNYw705OqQvPzp0Dl/bNgdihVYqtb14WDWNhJxXIHlcDn/TTJLmIdTpNpgcOBe3R2/Oy4ra5HJRVfZcN3Vr59tCSkJVgCRD9FhG5AF0WDYL/QRmvhCWxEZiC012O9jo33mZSgD5QrTtJO4jqflpZuj6XQL/PJB+qJ/KCIzEOoUMaXalPAjbdPaDXsHGQBLRIe4lNIEfjZArPAygR3ERA7jjpOEhUZBr5aF4VYFUsCA== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:30:57 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2755 invoked by uid 111); 25 Feb 2025 06:30:57 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:30:57 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:30:56 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 06/10] unpack_loose_header(): avoid numeric comparison of zlib status Message-ID: <20250225063056.GF1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> When unpacking a loose header, we try to inflate the first 32 bytes. We'd expect either Z_OK (we filled up the output buffer, but there are more bytes in the object) or Z_STREAM_END (this is a tiny object whose header and content fit in the buffer). We check for that with "if (status < Z_OK)", making the assumption that all of the errors we'd see have negative values (as Z_OK itself is "0", and Z_STREAM_END is "1"). But there's at least one case this misses: Z_NEED_DICT is "2". This isn't something we'd ever expect to see, but if we do see it, we should consider it an error (since we have no dictionary to load). Instead, the current code interprets Z_NEED_DICT as success and looks for the object header's terminating NUL in the bytes we've read. This will generaly be zero bytes if the dictionary is mentioned at the start of the stream. So we'll fail to find it and complain "the header is too long" (ULHR_LONG). But really, the problem is that the object is malformed, and we should return ULHR_BAD. This is a minor bug, as we consider both cases to be an error. But it does mean we print the wrong error message. The test case added in the previous patch triggers this code, so we can just confirm the error message we see here. Signed-off-by: Jeff King --- object-file.c | 2 +- t/t1006-cat-file.sh | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/object-file.c b/object-file.c index 5b2446bfc1..5b347cfc2f 100644 --- a/object-file.c +++ b/object-file.c @@ -1362,7 +1362,7 @@ enum unpack_loose_header_result unpack_loose_header(git_zstream *stream, obj_read_unlock(); status = git_inflate(stream, 0); obj_read_lock(); - if (status < Z_OK) + if (status != Z_OK && status != Z_STREAM_END) return ULHR_BAD; /* diff --git a/t/t1006-cat-file.sh b/t/t1006-cat-file.sh index e493600aff..86a2825473 100755 --- a/t/t1006-cat-file.sh +++ b/t/t1006-cat-file.sh @@ -951,6 +951,8 @@ test_expect_success 'object reading handles zlib dictionary' - <<\EOT printf '\170\273\017\112\003\143' >$objpath && test_must_fail git cat-file blob $blob 2>err && + test_grep ! 'too long' err && + test_grep 'error: unable to unpack' err && test_grep 'error: inflate: needs dictionary' err EOT From patchwork Tue Feb 25 06:31:15 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989363 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EC05B255E4E for ; Tue, 25 Feb 2025 06:31:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465078; cv=none; b=g29xqXPuQvW9Ohf7ToMTNov9R5JT3OxLOBoaQsDl4l6rS+XBDFA4nR3YTZ9LxmOYPyhZrkXF+hmBO8gYp4na6BVPGXLyCrSSbcXjpvY8HyuzAlEOk53nZFC+YgDmpC9G4NhQIxLPChZDQIze+1JcXWZ33yzUombaWjxrcxkFi1c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465078; c=relaxed/simple; bh=WMcgWFp9+KLiZFsnz9L0+TrqgNFh6ReZZmkPm6fMBwM=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=d5h0Zt9ZwdA9ymHy4AlJrihHpMKwZlgXCDcFtmKRtlrBC1dTz1Nwn9KwK5vKRMfMCo5xhy+yeWc3dr0VVFBiVpn5vsZpyVNV41snZ6RKysUWMwKNJ5GkLCtRqs6esxD1iJL0Ia+O1NiXW8vYG38boyKOUDejx/P/RD8nSW0tw9w= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=id6bsllj; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="id6bsllj" Received: (qmail 24631 invoked by uid 109); 25 Feb 2025 06:31:16 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=WMcgWFp9+KLiZFsnz9L0+TrqgNFh6ReZZmkPm6fMBwM=; b=id6bslljgkeNA/+90eQKiAAMHSyNr8OkeOT8Ik21L3/cr5CHzyj5lt7aeGV678jae3peWUixumQ9p8TA6QNyBVnClhukQO5KykxViIwy4nvVeGr9eLqVpO5R79kcHgV3IPQV3+0LfB6s2jXgYD/kypD4ZzmTlKjAMVEu1tVtxyMFxJaPTGYJjZ8uiGnsF3hh0NMp+OYHmIa1lxJnQf+pOgUwgOARDjDxtzcxvffD6ijqO5zDWx8gzA4/OOHLQA8ei99vtg1gI0/eyQCjYz7K9mRfShzd78LjbUOnPvgbRZRStSQVYompJRPkRbRAlbsJulj9nRFIDC5UVuoSL2BTUA== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:31:16 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2759 invoked by uid 111); 25 Feb 2025 06:31:15 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:31:15 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:31:15 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 07/10] unpack_loose_rest(): avoid numeric comparison of zlib status Message-ID: <20250225063115.GG1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> When unpacking the actual content of a loose object file, we insist both that the status code we got is Z_STREAM_END, and that we consumed all bytes. If we didn't, we'll return an error, but the specific error message we produce depends on which of the two error conditions we saw. So we'll check both a second time to decide which error to produce. But this second time, our status code check is loose: it checks for a negative status value. This can get confused by zlib codes which are not negative, such as Z_NEED_DICT. In this case we'd erroneously print nothing at all, when we should say "corrupt loose object". Instead, this second check should check explicitly against Z_STREAM_END. Note that Z_OK is "0", so the existing code also produced no message for Z_OK. But it's impossible to see that status, since we only break out of the inflate loop when we stop seeing Z_OK (so a stream which has more bytes than its object header claims would eventually yield Z_BUF_ERROR). There's no test here, as it would require a loose object whose zlib stream returns Z_NEED_DICT in the middle of the object content. I think that is probably possible, but even our Z_NEED_DICT test in t1006 does not trigger this, because we hit that error while reading the header. I found this bug while reviewing all callers of git_inflate() for bugs similar to the one we saw in unpack_loose_header(). This was the only other case that did a numeric comparison rather than explicitly checking for Z_STREAM_END. Signed-off-by: Jeff King --- object-file.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/object-file.c b/object-file.c index 5b347cfc2f..859888eb2a 100644 --- a/object-file.c +++ b/object-file.c @@ -1441,7 +1441,7 @@ static void *unpack_loose_rest(git_zstream *stream, return buf; } - if (status < 0) + if (status != Z_STREAM_END) error(_("corrupt loose object '%s'"), oid_to_hex(oid)); else if (stream->avail_in) error(_("garbage at end of loose object '%s'"), From patchwork Tue Feb 25 06:33:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989364 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 33520136341 for ; Tue, 25 Feb 2025 06:33:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465195; cv=none; b=aKgmQ2KVpXkLyqD6TlZFkcbSECeQMBu4ejowypzkwkmMkfWeKGPuIyHoAojwTSpl6AE5uXWjpLVSHXJ2HsLDhoItTNynArP9rtUPjP0FsxvTg8Kbtos9R0cTkTmR6eJjFZY21EKTgljE5YbdHxUwBcVMeqSb7wDi/TP0WNihPGA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465195; c=relaxed/simple; bh=qclubNOT9fAUzvov1T9rC4L8ngwbdMiMBcBj9zFZNNw=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Q3lN9LzUPK4l3d8QQ30Pdh/4to0Jn/4eWjbESL2Z1Voj1dOXE+rrBH9Z12jnvUH1l5yknIxYF3zEElgsP2owKNtWrYINog4iayWY22PLVuIWFEzQHj0jKhlPpMOvAOCTOKb2CTbJWVp8BOiOPamhHF6LwJLYodCULDIa1jmj4GE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=gKWRRsje; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="gKWRRsje" Received: (qmail 24662 invoked by uid 109); 25 Feb 2025 06:33:13 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=qclubNOT9fAUzvov1T9rC4L8ngwbdMiMBcBj9zFZNNw=; b=gKWRRsjekpOAPcUk1qwehH2o1yNouOaLswPVLGny3K3NqhPx+eYIFMSLeuedh4MwQVoFLVgWkjXPKswtuGVRd5gRC4UoPvpK/GBmXRiS57PB5T9huCDlulXg0nQLhPDa5+1zqVShKKDkyRa+P3bF85NViNoOrvBeLPGAewai/vdTZ8+uGk9HRJY+HMmD5Lt75OhEUdjuX24Cob7qBtVsot3UHKYRj1hE/pWL2EbD7HB4gkeY5GbD0tD70gqcCbghWTKYqzkp5gEN88RBfkIyXIHu/sWrFFK9iMAG1Mn0UAVjOZemQHq2n02U6qRKcurwaNexPvLA6AqKI3/pNbsF2A== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:33:13 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2798 invoked by uid 111); 25 Feb 2025 06:33:12 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:33:12 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:33:12 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 08/10] unpack_loose_rest(): never clean up zstream Message-ID: <20250225063312.GH1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> The unpack_loose_rest() function has funny ownership semantics: we pass in a z_stream opened by the caller, but then only _sometimes_ close it. This oddity has developed over time. When the function was originally split out in 5180cacc20 (Split up unpack_sha1_file() some more, 2005-06-02), it always called inflateEnd() to clean up the stream (though nowadays it is a git_zstream and we call git_inflate_end()). But in 7efbff7531 (unpack_sha1_file(): detect corrupt loose object files., 2007-03-05) we added error code paths which don't close the stream. This makes some sense, as we'd still look at parts of the stream struct to decide which error to show (though I am not sure in practice if inflateEnd() even touches those fields). This subtlety makes it hard to know when the caller has to clean up the stream and when it does not. That led to the leak fixed by aa9ef614dc (object-file: fix memory leak when reading corrupted headers, 2024-08-14). Let's instead always leave the stream intact, forcing the caller to clean it up. You might think that would create more work for the callers, but it actually ends up simplifying them, since they can put the call to git_inflate_end() in the common cleanup code path. Two things to note, though: - The check_stream_oid() function is used as a replacement for unpack_loose_rest() in read_loose_object() to read blobs. It inherited the same funny semantics, and we should fix it here, too (to keep the cleanup in read_loose_object() consistent). - In read_loose_object() we need a second "out" label, as we can jump to the existing label before opening the stream at all (and since the struct is opaque, there is no way to if it was initialized or not, so we must not call git_inflate_end() in that case). Signed-off-by: Jeff King --- This patch and the two after are pure cleanups which should have no behavior effect. I think they make things better, but one could argue they are churn. I didn't reproduce it, but I think this is also fixing a leak when check_stream_oid() returned an error. object-file.c | 19 ++++++++----------- 1 file changed, 8 insertions(+), 11 deletions(-) diff --git a/object-file.c b/object-file.c index 859888eb2a..8cf87caef5 100644 --- a/object-file.c +++ b/object-file.c @@ -1437,7 +1437,6 @@ static void *unpack_loose_rest(git_zstream *stream, } } if (status == Z_STREAM_END && !stream->avail_in) { - git_inflate_end(stream); return buf; } @@ -1601,8 +1600,8 @@ static int loose_object_info(struct repository *r, die(_("loose object %s (stored in %s) is corrupt"), oid_to_hex(oid), path); - git_inflate_end(&stream); cleanup: + git_inflate_end(&stream); munmap(map, mapsize); if (oi->sizep == &size_scratch) oi->sizep = NULL; @@ -3081,7 +3080,6 @@ static int check_stream_oid(git_zstream *stream, git_hash_update(&c, buf, stream->next_out - buf); total_read += stream->next_out - buf; } - git_inflate_end(stream); if (status != Z_STREAM_END) { error(_("corrupt loose object '%s'"), oid_to_hex(expected_oid)); @@ -3128,35 +3126,34 @@ int read_loose_object(const char *path, if (unpack_loose_header(&stream, map, mapsize, hdr, sizeof(hdr), NULL) != ULHR_OK) { error(_("unable to unpack header of %s"), path); - git_inflate_end(&stream); - goto out; + goto out_inflate; } if (parse_loose_header(hdr, oi) < 0) { error(_("unable to parse header of %s"), path); - git_inflate_end(&stream); - goto out; + goto out_inflate; } if (*oi->typep == OBJ_BLOB && *size > big_file_threshold) { if (check_stream_oid(&stream, hdr, *size, path, expected_oid) < 0) - goto out; + goto out_inflate; } else { *contents = unpack_loose_rest(&stream, hdr, *size, expected_oid); if (!*contents) { error(_("unable to unpack contents of %s"), path); - git_inflate_end(&stream); - goto out; + goto out_inflate; } hash_object_file_literally(the_repository->hash_algo, *contents, *size, oi->type_name->buf, real_oid); if (!oideq(expected_oid, real_oid)) - goto out; + goto out_inflate; } ret = 0; /* everything checks out */ +out_inflate: + git_inflate_end(&stream); out: if (map) munmap(map, mapsize); From patchwork Tue Feb 25 06:33:51 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989365 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D8E36136341 for ; Tue, 25 Feb 2025 06:33:52 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465234; cv=none; b=R4cAQ5KOxH+H32NRLQ6/rVatkXavubqaaZjG9broSWIT3Pjozggr9HgZi04FB+mpAIIN8DD77hE5C67MeZx1j98D2k9Osl8XQfsknDQsiJ8BM48r2ovcUxV+GBoY6s4egC53roA04gau1DJeLcsx/Ywf4p+35ZRpaCgJO8u+v50= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465234; c=relaxed/simple; bh=so0BICattfUdKDHR91f0iBp6WPIo+zEcUrlX5wN6v0A=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=plHzqADdN7ZrwAk0DceFwfpomIHvs6I0Ou6T+IwkGCUmv/eGSQVh5LMZtP8ATBps8foEagmENwPHhkJz8GMrA07j0RX12vH+iQtV3RRUAEpJv0VzigD2jIjmXNd2Kz8TX0jpJ/UMIpZewTnArRocK8qbntMLTxc5QezmZjqAkP4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=JKGTtjE0; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="JKGTtjE0" Received: (qmail 24671 invoked by uid 109); 25 Feb 2025 06:33:52 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=so0BICattfUdKDHR91f0iBp6WPIo+zEcUrlX5wN6v0A=; b=JKGTtjE0oEhNk6Yr9Psx10L3+Tv2uBTylwcGMUneOatRfe6ZkXcWcMWhzlwrEm62Er5h8JIBKOaUKOC8CZWf4aINCbSsoB3IrfkOt+gQu9NdXuyzlUHJglTNI4HKkrv9LnH4AGrKOy7lE22IxkDoG/dMxS8laDBd8/xUgUMPp1veDHOO6O902aLLYt0rowGKqN+iXiPkWAxOOjchEpg4mcTO1V+1r6lZi37FS8rmnbUZH518pCmKORpnDWBF4LkJaB68DYM3ARiTWG+1F/G2802kd8Yn8MDcBt4RzyDD7vbDpKAWFXY8UVoHYEzccjzJeNrJAYw1Uw/MEAIUTXFP6g== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:33:52 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2803 invoked by uid 111); 25 Feb 2025 06:33:51 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:33:51 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:33:51 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 09/10] unpack_loose_rest(): simplify error handling Message-ID: <20250225063351.GI1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> Inflating a loose object is considered successful only if we got Z_STREAM_END and there were no more bytes. We check both of those conditions and return success, but then have to check them a second time to decide which error message to produce. I.e., we do something like this: if (!error_1 && !error_2) ...return success... if (error_1) ...handle error1... else if (error_2) ...handle error2... ...common error handling... This repetition was the source of a small bug fixed in an earlier commit (our Z_STREAM_END check was not the same in the two conditionals). Instead we can chain them all into a single if/else cascade, which avoids repeating ourselves: if (error_1) ...handle error1... else if (error_2) ...handle error2.... else ...return success... ...common error handling... Signed-off-by: Jeff King --- object-file.c | 6 +++--- 1 file changed, 3 insertions(+), 3 deletions(-) diff --git a/object-file.c b/object-file.c index 8cf87caef5..b7928fb74e 100644 --- a/object-file.c +++ b/object-file.c @@ -1436,15 +1436,15 @@ static void *unpack_loose_rest(git_zstream *stream, obj_read_lock(); } } - if (status == Z_STREAM_END && !stream->avail_in) { - return buf; - } if (status != Z_STREAM_END) error(_("corrupt loose object '%s'"), oid_to_hex(oid)); else if (stream->avail_in) error(_("garbage at end of loose object '%s'"), oid_to_hex(oid)); + else + return buf; + free(buf); return NULL; } From patchwork Tue Feb 25 06:34:21 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13989366 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AA1FB15442C for ; Tue, 25 Feb 2025 06:34:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465265; cv=none; b=bKSCTypQ0gbQF41hcFy64UjsjcHthQIyy/j71tugsWBvZfe40J5ostjhrA5ZoIAHmqYBIL6tOFLecgu97Z0JjMfDcoChTb2342xcajMz+uh67ZJ9g52/yTcRzY9g2VO2aksUNLL+vpeniJuYGks+cpvOvCSanMqEO4aSK+ZXxIQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1740465265; c=relaxed/simple; bh=ik2SPWNnMWrhYMZDC2w353MyLwsMuZ3iZ8SlNDXW0Ao=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=cdTmUOgCp900MNhL9V2vRBXWUryuRzWPLdFV/EvbuQ9UqVm+VcamLod08qFHCWkKMGLYdtkgLsqwMpZJro8uJdWtwEwiCd5gUtTXvwTZBVeqrFsVwDXBXkOj/KeQXZvHHBCpqHoj49lRNIdiAYyUkqzuV1GpeJPrXmnhsTM9IKs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=EnAklLml; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="EnAklLml" Received: (qmail 24682 invoked by uid 109); 25 Feb 2025 06:34:23 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=ik2SPWNnMWrhYMZDC2w353MyLwsMuZ3iZ8SlNDXW0Ao=; b=EnAklLmlfR6lJcKRHH8cNC/ZEOwPclhlidmWuFS4YBn6fgzeb69dNLfoDYcAnBUCEzy5CGT06DL/fhl4e+BY+wTX0HfjRiRBT/f+R54IGHmPtFNxjtNtAa08mmJ7jyV+m1FjkbLZ6WOmgY8bWEasm5nlZH0wU7KG7nu6q7Hx734bsw0avr7IwoOUBZEKB+vm8XNyhPUTERlPdi0aVErLDOoTTUDHgBeWcoGF+3YcGWnVj5NCWTI3nw9RrZWld8K3gZcpF185c1NrS3kMkCCJlNk2mlSewAzltpargi9dBHoXytjF4WfMdUe4KoDMq3oTBrw8zqZGZvNvOdKYGFsXdA== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Tue, 25 Feb 2025 06:34:23 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2809 invoked by uid 111); 25 Feb 2025 06:34:22 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Tue, 25 Feb 2025 01:34:22 -0500 Authentication-Results: peff.net; auth=none Date: Tue, 25 Feb 2025 01:34:21 -0500 From: Jeff King To: git@vger.kernel.org Cc: Taylor Blau Subject: [PATCH 10/10] unpack_loose_rest(): rewrite return handling for clarity Message-ID: <20250225063421.GJ1293961@coredump.intra.peff.net> References: <20250225062518.GA1293854@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20250225062518.GA1293854@coredump.intra.peff.net> We have a pattern like: if (error1) ...handle error 1... else if (error2) ...handle error 2... else ...return buf... ...free buf and return NULL... This is a little subtle because it is the return in the success block that lets us skip the common error handling. Rewrite this instead to free the buffer in each error path, marking it as NULL, and then all code paths can use the common return. This should make the logic a bit easier to follow. It does mean duplicating the buf cleanup for errors, but it's a single line. Signed-off-by: Jeff King --- Obviously could be squashed into the previous one, but I thought the sequence of diffs made it easier to understand what was being changed. object-file.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/object-file.c b/object-file.c index b7928fb74e..1df8870578 100644 --- a/object-file.c +++ b/object-file.c @@ -1437,16 +1437,16 @@ static void *unpack_loose_rest(git_zstream *stream, } } - if (status != Z_STREAM_END) + if (status != Z_STREAM_END) { error(_("corrupt loose object '%s'"), oid_to_hex(oid)); - else if (stream->avail_in) + FREE_AND_NULL(buf); + } else if (stream->avail_in) { error(_("garbage at end of loose object '%s'"), oid_to_hex(oid)); - else - return buf; + FREE_AND_NULL(buf); + } - free(buf); - return NULL; + return buf; } /*