From patchwork Sun Nov 17 09:08:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13877811 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6737F61FFE for ; Sun, 17 Nov 2024 09:08:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834498; cv=none; b=brtZKYL5FEkrG91o3e+UnU/0XdYBfHt5KCaxMApt+ONbVUNXlbijKwHSHtsSPoF//DJFwV4fHTroeZHsgBgMJbmMR7nJGFhEj0ztyUXhcDyyBDBTvM3s/S24/3eDz24ycrrx6pJXjAWRZjPYmb9rYIVdWCHoOXechMH6NVSVPD8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834498; c=relaxed/simple; bh=JQaSB3EnPPdC8RM0ykJU/TDtciVN/dWviX35opkFPLg=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=id8YHbLF4+W9lbr+oxk3C8pXcjwpdU2FF3lcs+5bhdT4twWFW17vJt+rJI2+xUcO1FtWibOk0xpqD/naHSh6jITrTKdPuTXqOJ9jT6urBZaq1DxQx70X/uG+zXfPKs2Ci+QUgy2aaAKz7/J0TYWVy01b/eoOn/sawUVUlvHb7Ws= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=gWG/Ovc/; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="gWG/Ovc/" Received: (qmail 9263 invoked by uid 109); 17 Nov 2024 09:08:15 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:content-transfer-encoding:in-reply-to; s=20240930; bh=JQaSB3EnPPdC8RM0ykJU/TDtciVN/dWviX35opkFPLg=; b=gWG/Ovc/5alUV/rxCodQ7pgrdpIYQ+PPEbuCyGWHoVq6Z3umdKglRnwvXh9ThnqonhwBMYb02dB2yRGD0G2/cQJDXpFIXVu79CX0bfNuFAjfSEePyPuh8XyLNi4ZR3gyVEE39EQ4IXhw5Ctp8A1ugfnTkb7aYrUuAsYv7Cm1mXER3+VIbrI3UxinM/z4xiXkocsOMo99+cwDnI4ETM6ECfQH2+OiewDGYIjwAkoLSLAxo738X+UMxhWGv/XUpjyZNiYfUll/0wKX3fp/7koCJtTCWdgu8oucI29hqDbLvwtesIPPTtyHMfsb769GYUSzF2FQokG6pzwMeeRW6wcstg== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Sun, 17 Nov 2024 09:08:15 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2216 invoked by uid 111); 17 Nov 2024 09:08:19 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 17 Nov 2024 04:08:19 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 17 Nov 2024 04:08:14 -0500 From: Jeff King To: Sam James Cc: "brian m. carlson" , git@vger.kernel.org Subject: [PATCH 1/5] object-file: prefer array-of-bytes initializer for hash literals Message-ID: <20241117090814.GA3409496@coredump.intra.peff.net> References: <20241117090329.GA2341486@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241117090329.GA2341486@coredump.intra.peff.net> We hard-code a few well-known hash values for empty trees and blobs in both sha1 and sha256 formats. We do so with string literals like this: #define EMPTY_TREE_SHA256_BIN_LITERAL \ "\x6e\xf1\x9b\x41\x22\x5c\x53\x69\xf1\xc1" \ "\x04\xd4\x5d\x8d\x85\xef\xa9\xb0\x57\xb5" \ "\x3b\x14\xb4\xb9\xb9\x39\xdd\x74\xde\xcc" \ "\x53\x21" and then use it to initialize the hash field of an object_id struct. That hash field is exactly 32 bytes long (the size we need for sha256). But the string literal above is actually 33 bytes long due to the NUL terminator. It's legal in C to initialize from a longer string literal; the extra bytes are just ignored. However, the upcoming gcc 15 will start warning about this: CC object-file.o object-file.c:52:9: warning: initializer-string for array of ‘unsigned char’ is too long [-Wunterminated-string-initialization] 52 | "\x6e\xf1\x9b\x41\x22\x5c\x53\x69\xf1\xc1" \ | ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ object-file.c:79:17: note: in expansion of macro ‘EMPTY_TREE_SHA256_BIN_LITERAL’ which is understandable. Even though this is not a bug for us, since we do not care about the NUL terminator (and are just using the literal as a convenient format), it would be easy to accidentally create an array that was mistakenly unterminated. We can avoid this warning by switching the initializer to an actual array of unsigned values. That arguably demonstrates our intent more clearly anyway. Reported-by: Sam James Signed-off-by: Jeff King --- I actually didn't find exact wording in the standard for using a longer literal. But C99 section 6.7.8 (Initialization), para 32 shows this exact case as "example 8". You can view the diff with "--color-words --word-diff-regex=." to more clearly see that the values themselves weren't changed. object-file.c | 38 +++++++++++++++++++++----------------- 1 file changed, 21 insertions(+), 17 deletions(-) diff --git a/object-file.c b/object-file.c index b1a3463852..25ba54594b 100644 --- a/object-file.c +++ b/object-file.c @@ -45,23 +45,27 @@ #define MAX_HEADER_LEN 32 -#define EMPTY_TREE_SHA1_BIN_LITERAL \ - "\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60" \ - "\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04" -#define EMPTY_TREE_SHA256_BIN_LITERAL \ - "\x6e\xf1\x9b\x41\x22\x5c\x53\x69\xf1\xc1" \ - "\x04\xd4\x5d\x8d\x85\xef\xa9\xb0\x57\xb5" \ - "\x3b\x14\xb4\xb9\xb9\x39\xdd\x74\xde\xcc" \ - "\x53\x21" - -#define EMPTY_BLOB_SHA1_BIN_LITERAL \ - "\xe6\x9d\xe2\x9b\xb2\xd1\xd6\x43\x4b\x8b" \ - "\x29\xae\x77\x5a\xd8\xc2\xe4\x8c\x53\x91" -#define EMPTY_BLOB_SHA256_BIN_LITERAL \ - "\x47\x3a\x0f\x4c\x3b\xe8\xa9\x36\x81\xa2" \ - "\x67\xe3\xb1\xe9\xa7\xdc\xda\x11\x85\x43" \ - "\x6f\xe1\x41\xf7\x74\x91\x20\xa3\x03\x72" \ - "\x18\x13" +#define EMPTY_TREE_SHA1_BIN_LITERAL { \ + 0x4b, 0x82, 0x5d, 0xc6, 0x42, 0xcb, 0x6e, 0xb9, 0xa0, 0x60, \ + 0xe5, 0x4b, 0xf8, 0xd6, 0x92, 0x88, 0xfb, 0xee, 0x49, 0x04 \ +} +#define EMPTY_TREE_SHA256_BIN_LITERAL { \ + 0x6e, 0xf1, 0x9b, 0x41, 0x22, 0x5c, 0x53, 0x69, 0xf1, 0xc1, \ + 0x04, 0xd4, 0x5d, 0x8d, 0x85, 0xef, 0xa9, 0xb0, 0x57, 0xb5, \ + 0x3b, 0x14, 0xb4, 0xb9, 0xb9, 0x39, 0xdd, 0x74, 0xde, 0xcc, \ + 0x53, 0x21 \ +} + +#define EMPTY_BLOB_SHA1_BIN_LITERAL { \ + 0xe6, 0x9d, 0xe2, 0x9b, 0xb2, 0xd1, 0xd6, 0x43, 0x4b, 0x8b, \ + 0x29, 0xae, 0x77, 0x5a, 0xd8, 0xc2, 0xe4, 0x8c, 0x53, 0x91 \ +} +#define EMPTY_BLOB_SHA256_BIN_LITERAL { \ + 0x47, 0x3a, 0x0f, 0x4c, 0x3b, 0xe8, 0xa9, 0x36, 0x81, 0xa2, \ + 0x67, 0xe3, 0xb1, 0xe9, 0xa7, 0xdc, 0xda, 0x11, 0x85, 0x43, \ + 0x6f, 0xe1, 0x41, 0xf7, 0x74, 0x91, 0x20, 0xa3, 0x03, 0x72, \ + 0x18, 0x13 \ +} static const struct object_id empty_tree_oid = { .hash = EMPTY_TREE_SHA1_BIN_LITERAL, From patchwork Sun Nov 17 09:08:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13877812 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id AA8FDA92D for ; Sun, 17 Nov 2024 09:08:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834515; cv=none; b=qFuNCm6j3qchhZ4PiGwQCLx0O9Z1ydIUNVI6McSWCP73szAD3jUuAHEyikJhtZT4AY30A9ySzQNiy/bWdGkgfp5Alc8YKOWKo1bIi7cCGWID/uzpqufg/DYdtER4Vx6XHelcbxZXWH76sAvoTBBVXCak9B9ALg/QP4qRb1k7naw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834515; c=relaxed/simple; bh=VVmx5Fn7fqyFV+yT8KIh9Bw2ISjDjc+l78zNFcEhFXw=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Oxeo5yE/9P4QgnbRUycw8dWcYD8CzUXyjd3r94AlWiAnz4YcuuCXllcXReOZ6ugexZI5r9lEHyYzOsS3zw5n6x5KtTGCDRqDfc0p4o6/ZMPIX02wYVX1Wn/KNMT4BgEyFqo18jVNAWAGk9p4707RLL5m/ODryeuO4m0du7vEksc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=TV1+OaQo; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="TV1+OaQo" Received: (qmail 9271 invoked by uid 109); 17 Nov 2024 09:08:32 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=VVmx5Fn7fqyFV+yT8KIh9Bw2ISjDjc+l78zNFcEhFXw=; b=TV1+OaQobgmJeHLm11x40Aom8RyC2PIsyuzul7ENB5nRVE/tleohVrXin3n4odrkfFNqO9neGimjqUdYM9RY6rzGlmcGvPq5aO3nsz3eYVV6wzVA7QQfyOyhqQVe2MMLDKseahUSwiL+m++iOsESrp7GTIYXEO0REAsbqhn1ELcwxUDOYJ3BvgMKWRbLnaBKYlbZ//SZJgyi/pD6FXqyKDdDr0eGa8ebq/+h5UlAmulAdBmjLIVh4TKlvoqdaLs/Y9cATmYdzOXyDlNmaIom75H6LguIsedTCk1GXUNHDJNuX6GZYPZ9JqpHZJK6d0vuIXxiqO5utTi0fh7ADPQRrg== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Sun, 17 Nov 2024 09:08:32 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2224 invoked by uid 111); 17 Nov 2024 09:08:36 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 17 Nov 2024 04:08:36 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 17 Nov 2024 04:08:31 -0500 From: Jeff King To: Sam James Cc: "brian m. carlson" , git@vger.kernel.org Subject: [PATCH 2/5] object-file: drop confusing oid initializer of empty_tree struct Message-ID: <20241117090831.GB3409496@coredump.intra.peff.net> References: <20241117090329.GA2341486@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241117090329.GA2341486@coredump.intra.peff.net> We treat the empty tree specially, providing an in-memory "cached" copy, which allows you to diff against it even if the object doesn't exist in the repository. This is implemented as part of the larger cached_object subsystem, but we use a stand-alone empty_tree struct. We initialize the oid of that struct using EMPTY_TREE_SHA1_BIN_LITERAL. At first glance, that seems like a bug; how could this ever work for sha256 repositories? The answer is that we never look at the oid field! The oid field is used to look up entries added by pretend_object_file() to the cached_objects array. But for our stand-alone entry, we look for it independently using the_hash_algo->empty_tree, which will point to the correct algo struct for the repository. This happened in 62ba93eaa9 (sha1_file: convert cached object code to struct object_id, 2018-05-02), which even mentions that this field is never used. Let's reduce confusion for anybody reading this code by replacing the sha1 initializer with a comment. The resulting field will be all-zeroes, so any violation of our assumption that the oid field is not used will break equally for sha1 and sha256. Signed-off-by: Jeff King --- object-file.c | 4 +--- 1 file changed, 1 insertion(+), 3 deletions(-) diff --git a/object-file.c b/object-file.c index 25ba54594b..b7c4fdcabd 100644 --- a/object-file.c +++ b/object-file.c @@ -326,9 +326,7 @@ static struct cached_object { static int cached_object_nr, cached_object_alloc; static struct cached_object empty_tree = { - .oid = { - .hash = EMPTY_TREE_SHA1_BIN_LITERAL, - }, + /* no oid needed; we'll look it up manually based on the_hash_algo */ .type = OBJ_TREE, .buf = "", }; From patchwork Sun Nov 17 09:08:42 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13877814 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0978CA92D for ; Sun, 17 Nov 2024 09:08:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834525; cv=none; b=mT428p1bLBvWBeJuHJ84sJ+vb1bq6jv+03NNACITFZAWiQM+0D8kbQnwVAD3ZDjmIJ+lOdXEEK1Zxe6S052pp/NnlXoU8eCiGd2+CyUT/9sDuByuLfUcRJtvM9/R850G7XRa9n4g8UWrl52yxwzofC2zZaAR/kJcv1F9r+qHciw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834525; c=relaxed/simple; bh=utrCCbGSaUzhvYnMmrB9/RgD71oaxADpQpGlW3VlP3E=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=ZGyNlNKHTQM7MXwMIknwHqX50df0IE8IqOaeB4M/w+33lLoykdvzg/VibOHJIKy8IggmLwhr8Q4WGkTrzB12t63sQh1+IrWopuGxznY1UXZ2Ydy2OsioYBaCBH4XjNlS7rLgdium6r2K+VrBDODmdkrEpp8vtF52Cy4x+RNgcqw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=OclhIsw9; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="OclhIsw9" Received: (qmail 9281 invoked by uid 109); 17 Nov 2024 09:08:43 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=utrCCbGSaUzhvYnMmrB9/RgD71oaxADpQpGlW3VlP3E=; b=OclhIsw9zqt0LcT1GE8n+ekuN1ThvIIwt03IXRo7rNoRetZ/s4DqCdrb6qkoCMZl4c1mcpeM0qGmDaXQpcbGyzRKzdsFMJdADf6pH/9vUobgbzA4fx5uXER9nw1vr/niuYcm70tpGtXRni9LOYalvCwKX/p9Tv1p7sUevShqkwJGQx9drqh92Xd8vSDtzD0xy2uSUSCx0sLgKqCDUacYBV8PsYz1Iw8MjYryrmi1Lx+6JnMTTM4F91OLiIBDMAtNVy231v9G9fphKGRETrcSX25Gwse5xH0lPsWyvpT82EeMIdrffqCNKkSsq/3u7uH8R88yI+J3j8OqLTNgBo2XOQ== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Sun, 17 Nov 2024 09:08:43 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2231 invoked by uid 111); 17 Nov 2024 09:08:47 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 17 Nov 2024 04:08:47 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 17 Nov 2024 04:08:42 -0500 From: Jeff King To: Sam James Cc: "brian m. carlson" , git@vger.kernel.org Subject: [PATCH 3/5] object-file: move empty_tree struct into find_cached_object() Message-ID: <20241117090842.GC3409496@coredump.intra.peff.net> References: <20241117090329.GA2341486@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241117090329.GA2341486@coredump.intra.peff.net> The fake empty_tree struct is a static global, but the only code that looks at it is find_cached_object(). The struct itself is a little odd, with an invalid "oid" field that is handled specially by that function. Since it's really just an implementation detail, let's move it to a static within the function. That future-proofs against other code trying to use it and seeing the weird oid value. Signed-off-by: Jeff King --- object-file.c | 11 +++++------ 1 file changed, 5 insertions(+), 6 deletions(-) diff --git a/object-file.c b/object-file.c index b7c4fdcabd..5fadd470c1 100644 --- a/object-file.c +++ b/object-file.c @@ -325,14 +325,13 @@ static struct cached_object { } *cached_objects; static int cached_object_nr, cached_object_alloc; -static struct cached_object empty_tree = { - /* no oid needed; we'll look it up manually based on the_hash_algo */ - .type = OBJ_TREE, - .buf = "", -}; - static struct cached_object *find_cached_object(const struct object_id *oid) { + static struct cached_object empty_tree = { + /* no oid needed; we'll look it up manually based on the_hash_algo */ + .type = OBJ_TREE, + .buf = "", + }; int i; struct cached_object *co = cached_objects; From patchwork Sun Nov 17 09:10:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13877815 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 499B861FFE for ; Sun, 17 Nov 2024 09:10:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834630; cv=none; b=kryqTnT8EkepFj/CAWzLPeizuQ4UUIRLBwQI8oOQu15x4V+zoJQsKKi89GM4RoS+sdtFZKMEfEyCMZn+xahuYgIQRW7b43ZSkPoXngPKiwyJjut261qwnacBbL+PrOk9m2MmUgKz3MNh8hn9BjcLy+ncKXoZl1yj4XbXB0XKULU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834630; c=relaxed/simple; bh=3uQ1jGO4XE8zvqME6/PceDXJw+pFeOZg8jLm3jGp89Q=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=YxIU4GO3jXr7Xlopr+nC4V2148oU7x5SFdInVTN0XJ0gfUlrfyro4bYPxCOeq/lL5TampOktUbJmJ69ib0KIsqcOosOOdo0RmoeRjdyAf8ckUEFGhou5iMzSToRgVKtglo87Cby7rGhDQ7p10OXwOhlXZdeKFOqgKp2bv4I6KW0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=DtEON139; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="DtEON139" Received: (qmail 9294 invoked by uid 109); 17 Nov 2024 09:10:25 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=3uQ1jGO4XE8zvqME6/PceDXJw+pFeOZg8jLm3jGp89Q=; b=DtEON139Bv+C+FM9+7yeNAG1qzr/NEnZiutdGRPjlLR0qoqjgDI8RJ9o/1xyFrAylmRLwi2bRdmLmjUY7IbxGnymFS7XkRrJGEVZHLO1iz2OjplzviW2DwBqBKU66pCidVeE++NCWJlhis2yv5rT3phCW9IcYd7JylHgkEyVEPTloqS+zDGEFYNxE4j+pZzAT/idaJKSJ8SJxqNn1mfCydMWkmjCaFfpiXtoLeD9YJNSbZewtlWENhXZp3V7XeH/oPc02r55KrhejK5i41LoHcvpEmbKCVrDXX3N84WJys/pZRb+3RVs4XmC/F1x2n+iIsAzaNn1vdYR/6CZBur1Jw== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Sun, 17 Nov 2024 09:10:25 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2273 invoked by uid 111); 17 Nov 2024 09:10:29 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 17 Nov 2024 04:10:29 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 17 Nov 2024 04:10:24 -0500 From: Jeff King To: Sam James Cc: "brian m. carlson" , git@vger.kernel.org Subject: [PATCH 4/5] object-file: drop oid field from find_cached_object() return value Message-ID: <20241117091024.GD3409496@coredump.intra.peff.net> References: <20241117090329.GA2341486@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241117090329.GA2341486@coredump.intra.peff.net> The pretend_object_file() function adds to an array mapping oids to object contents, which are later retrieved with find_cached_object(). We naturally need to store the oid for each entry, since it's the lookup key. But find_cached_object() also returns a hard-coded empty_tree object. There we don't care about its oid field and instead compare against the_hash_algo->empty_tree. The oid field is left as all-zeroes. This all works, but it means that the cached_object struct we return from find_cached_object() may or may not have a valid oid field, depend whether it is the hard-coded tree or came from pretend_object_file(). Nobody looks at the field, so there's no bug. But let's future-proof it by returning only the object contents themselves, not the oid. We'll continue to call this "struct cached_object", and the array entry mapping the key to those contents will be a "cached_object_entry". This would also let us swap out the array for a better data structure (like a hashmap) if we chose, but there's not much point. The only code that adds an entry is git-blame, which adds at most a single entry per process. Signed-off-by: Jeff King --- object-file.c | 23 ++++++++++++----------- 1 file changed, 12 insertions(+), 11 deletions(-) diff --git a/object-file.c b/object-file.c index 5fadd470c1..e461e351ca 100644 --- a/object-file.c +++ b/object-file.c @@ -317,27 +317,28 @@ int hash_algo_by_length(int len) * to write them into the object store (e.g. a browse-only * application). */ -static struct cached_object { +static struct cached_object_entry { struct object_id oid; - enum object_type type; - const void *buf; - unsigned long size; + struct cached_object { + enum object_type type; + const void *buf; + unsigned long size; + } value; } *cached_objects; static int cached_object_nr, cached_object_alloc; static struct cached_object *find_cached_object(const struct object_id *oid) { static struct cached_object empty_tree = { - /* no oid needed; we'll look it up manually based on the_hash_algo */ .type = OBJ_TREE, .buf = "", }; int i; - struct cached_object *co = cached_objects; + struct cached_object_entry *co = cached_objects; for (i = 0; i < cached_object_nr; i++, co++) { if (oideq(&co->oid, oid)) - return co; + return &co->value; } if (oideq(oid, the_hash_algo->empty_tree)) return &empty_tree; @@ -1850,7 +1851,7 @@ int oid_object_info(struct repository *r, int pretend_object_file(void *buf, unsigned long len, enum object_type type, struct object_id *oid) { - struct cached_object *co; + struct cached_object_entry *co; char *co_buf; hash_object_file(the_hash_algo, buf, len, type, oid); @@ -1859,11 +1860,11 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type, return 0; ALLOC_GROW(cached_objects, cached_object_nr + 1, cached_object_alloc); co = &cached_objects[cached_object_nr++]; - co->size = len; - co->type = type; + co->value.size = len; + co->value.type = type; co_buf = xmalloc(len); memcpy(co_buf, buf, len); - co->buf = co_buf; + co->value.buf = co_buf; oidcpy(&co->oid, oid); return 0; } From patchwork Sun Nov 17 09:14:23 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff King X-Patchwork-Id: 13877817 Received: from cloud.peff.net (cloud.peff.net [104.130.231.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 87E7EA92D for ; Sun, 17 Nov 2024 09:14:25 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=104.130.231.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834867; cv=none; b=qQCKYIAzZrohXfLpEsMXbP0m2M1j5lcTcj0vvSy4GqT7yck4iqsCPPrcAR/cwhaY7D2V/RuBoFrSTKDITIQlv2l9ZhztFbvL52lQRHTdR6/SIoLatn/NMEUWGe1dMMiFojetvZTMkGrnRCnY45J2rqqtPQMbMYhh7xgbF6QMgMA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1731834867; c=relaxed/simple; bh=4izKsYwXmx17YVg8Sxvo+UN/oUwBu7jF2GJVGhZyuqI=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=kP+Lu1n3OfVM8FhY3AlbyrdAshDdLU0aDRvZX3muwYcLetKb9pV34dzFIKxJ77eXxCny2QVtOInHAuOzppoQwkdBDJQqqWjIq6O7mvCFxTXCPaFkxKpWNatpdZsh7Gk63yOnllXsRv23/kevzKf5YDlw/FEBKQ7WPj2l3DWwxzE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net; spf=pass smtp.mailfrom=peff.net; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b=Fsri2Xap; arc=none smtp.client-ip=104.130.231.41 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=peff.net Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=peff.net Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=peff.net header.i=@peff.net header.b="Fsri2Xap" Received: (qmail 9309 invoked by uid 109); 17 Nov 2024 09:14:24 -0000 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed; d=peff.net; h=date:from:to:cc:subject:message-id:references:mime-version:content-type:in-reply-to; s=20240930; bh=4izKsYwXmx17YVg8Sxvo+UN/oUwBu7jF2GJVGhZyuqI=; b=Fsri2Xap0g6OFrx2f3ONsOMoqKFdO7B4omaMJgr+5Xvw84DKowv5/gPMdHiQ4OLHOywV/AFlR6VffbEFxdYBVY13zU5vdlyBPLSnBX9OFlnGkRv7NPbj6i6VHemKUFH3elERHrmbApqwFovwZkdOukOnakFQ1cjW8vGb+wlyQSoic+cUhgMQL9DojKfKGL8WS0TGSlOy37L5AYo44GtxUl8+PlhuDh65MNfaRC8zT2ok+aV6GzNlntLfIrfKKFHVDPNSjl7feoPScmAcFq2Hq4PQYYgYqMogCOiVWOdhZRrZjPAZv7MFYwqKVt2p2MEmIMF9JSSlh2gaSEI3yLaOKA== Received: from Unknown (HELO peff.net) (10.0.1.2) by cloud.peff.net (qpsmtpd/0.94) with ESMTP; Sun, 17 Nov 2024 09:14:24 +0000 Authentication-Results: cloud.peff.net; auth=none Received: (qmail 2330 invoked by uid 111); 17 Nov 2024 09:14:28 -0000 Received: from coredump.intra.peff.net (HELO coredump.intra.peff.net) (10.0.0.2) by peff.net (qpsmtpd/0.94) with (TLS_AES_256_GCM_SHA384 encrypted) ESMTPS; Sun, 17 Nov 2024 04:14:28 -0500 Authentication-Results: peff.net; auth=none Date: Sun, 17 Nov 2024 04:14:23 -0500 From: Jeff King To: Sam James Cc: "brian m. carlson" , git@vger.kernel.org Subject: [PATCH 5/5] object-file: inline empty tree and blob literals Message-ID: <20241117091423.GE3409496@coredump.intra.peff.net> References: <20241117090329.GA2341486@coredump.intra.peff.net> Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20241117090329.GA2341486@coredump.intra.peff.net> We define macros with the bytes of the empty trees and blobs for sha1 and sha256. But since e1ccd7e2b1 (sha1_file: only expose empty object constants through git_hash_algo, 2018-05-02), those are used only for initializing the git_hash_algo entries. Any other code using the macros directly would be suspicious, since a hash_algo pointer is the level of indirection we use to make everything work with both sha1 and sha256. So let's future proof against code doing the wrong thing by dropping the macros entirely and just initializing the structs directly. Signed-off-by: Jeff King --- Sadly you can't use --word-diff to make sense of this one, and --color-moved is foiled by dropping the trailing backslashes. Anybody is welcome to verify that the values did not change. :) We're still left with split-out structs for empty_tree_oid_sha256, etc (and confusingly the sha1 ones do not even say "sha1"). I think we could take this all one step further and inline those directly into the git_hash_algo. But that would have repercussions throughout the tree, since many spots would switch from: repo->hash_algo->empty_tree to: &repo->hash_algo->empty_tree Not sure if it's worth it (it's also one less pointer chase, but I kind of doubt that accessing the empty tree oid is ever a hot code path). object-file.c | 47 ++++++++++++++++++++--------------------------- 1 file changed, 20 insertions(+), 27 deletions(-) diff --git a/object-file.c b/object-file.c index e461e351ca..e325a52be5 100644 --- a/object-file.c +++ b/object-file.c @@ -44,47 +44,40 @@ /* The maximum size for an object header. */ #define MAX_HEADER_LEN 32 - -#define EMPTY_TREE_SHA1_BIN_LITERAL { \ - 0x4b, 0x82, 0x5d, 0xc6, 0x42, 0xcb, 0x6e, 0xb9, 0xa0, 0x60, \ - 0xe5, 0x4b, 0xf8, 0xd6, 0x92, 0x88, 0xfb, 0xee, 0x49, 0x04 \ -} -#define EMPTY_TREE_SHA256_BIN_LITERAL { \ - 0x6e, 0xf1, 0x9b, 0x41, 0x22, 0x5c, 0x53, 0x69, 0xf1, 0xc1, \ - 0x04, 0xd4, 0x5d, 0x8d, 0x85, 0xef, 0xa9, 0xb0, 0x57, 0xb5, \ - 0x3b, 0x14, 0xb4, 0xb9, 0xb9, 0x39, 0xdd, 0x74, 0xde, 0xcc, \ - 0x53, 0x21 \ -} - -#define EMPTY_BLOB_SHA1_BIN_LITERAL { \ - 0xe6, 0x9d, 0xe2, 0x9b, 0xb2, 0xd1, 0xd6, 0x43, 0x4b, 0x8b, \ - 0x29, 0xae, 0x77, 0x5a, 0xd8, 0xc2, 0xe4, 0x8c, 0x53, 0x91 \ -} -#define EMPTY_BLOB_SHA256_BIN_LITERAL { \ - 0x47, 0x3a, 0x0f, 0x4c, 0x3b, 0xe8, 0xa9, 0x36, 0x81, 0xa2, \ - 0x67, 0xe3, 0xb1, 0xe9, 0xa7, 0xdc, 0xda, 0x11, 0x85, 0x43, \ - 0x6f, 0xe1, 0x41, 0xf7, 0x74, 0x91, 0x20, 0xa3, 0x03, 0x72, \ - 0x18, 0x13 \ -} - static const struct object_id empty_tree_oid = { - .hash = EMPTY_TREE_SHA1_BIN_LITERAL, + .hash ={ + 0x4b, 0x82, 0x5d, 0xc6, 0x42, 0xcb, 0x6e, 0xb9, 0xa0, 0x60, + 0xe5, 0x4b, 0xf8, 0xd6, 0x92, 0x88, 0xfb, 0xee, 0x49, 0x04 + }, .algo = GIT_HASH_SHA1, }; static const struct object_id empty_blob_oid = { - .hash = EMPTY_BLOB_SHA1_BIN_LITERAL, + .hash = { + 0xe6, 0x9d, 0xe2, 0x9b, 0xb2, 0xd1, 0xd6, 0x43, 0x4b, 0x8b, + 0x29, 0xae, 0x77, 0x5a, 0xd8, 0xc2, 0xe4, 0x8c, 0x53, 0x91 + }, .algo = GIT_HASH_SHA1, }; static const struct object_id null_oid_sha1 = { .hash = {0}, .algo = GIT_HASH_SHA1, }; static const struct object_id empty_tree_oid_sha256 = { - .hash = EMPTY_TREE_SHA256_BIN_LITERAL, + .hash = { + 0x6e, 0xf1, 0x9b, 0x41, 0x22, 0x5c, 0x53, 0x69, 0xf1, 0xc1, + 0x04, 0xd4, 0x5d, 0x8d, 0x85, 0xef, 0xa9, 0xb0, 0x57, 0xb5, + 0x3b, 0x14, 0xb4, 0xb9, 0xb9, 0x39, 0xdd, 0x74, 0xde, 0xcc, + 0x53, 0x21 + }, .algo = GIT_HASH_SHA256, }; static const struct object_id empty_blob_oid_sha256 = { - .hash = EMPTY_BLOB_SHA256_BIN_LITERAL, + .hash = { + 0x47, 0x3a, 0x0f, 0x4c, 0x3b, 0xe8, 0xa9, 0x36, 0x81, 0xa2, + 0x67, 0xe3, 0xb1, 0xe9, 0xa7, 0xdc, 0xda, 0x11, 0x85, 0x43, + 0x6f, 0xe1, 0x41, 0xf7, 0x74, 0x91, 0x20, 0xa3, 0x03, 0x72, + 0x18, 0x13 + }, .algo = GIT_HASH_SHA256, }; static const struct object_id null_oid_sha256 = {