From patchwork Mon Nov 18 09:54:40 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
X-Patchwork-Submitter: Jeff King
X-Patchwork-Id: 13878332
Date: Mon, 18 Nov 2024 04:54:40 -0500
From: Jeff King
To: Sam James
Cc: René Scharfe, Patrick Steinhardt, Chris Torek, "brian m. carlson", git@vger.kernel.org
Subject: [PATCH 1/6] object-file: prefer array-of-bytes initializer for hash literals
Message-ID: <20241118095440.GA3992317@coredump.intra.peff.net>
References: <20241118095423.GA3990835@coredump.intra.peff.net>
X-Mailing-List: git@vger.kernel.org
In-Reply-To: <20241118095423.GA3990835@coredump.intra.peff.net>

We hard-code a few well-known hash values for empty trees and blobs in
both sha1 and sha256 formats. We do so with string literals like this:

	#define EMPTY_TREE_SHA256_BIN_LITERAL \
		"\x6e\xf1\x9b\x41\x22\x5c\x53\x69\xf1\xc1" \
		"\x04\xd4\x5d\x8d\x85\xef\xa9\xb0\x57\xb5" \
		"\x3b\x14\xb4\xb9\xb9\x39\xdd\x74\xde\xcc" \
		"\x53\x21"

and then use it to initialize the hash field of an object_id struct.
That hash field is exactly 32 bytes long (the size we need for sha256).
But the string literal above is actually 33 bytes long due to the NUL
terminator. This is legal in C, and the NUL is ignored.

  Side note on legality: in general excess initializer elements are
  forbidden, and gcc will warn on both of these:

	char foo[3] = { 'h', 'u', 'g', 'e' };
	char bar[3] = "VeryLongString";

  I couldn't find specific language in the standard allowing
  initialization from a string literal where _just_ the NUL is ignored,
  but C99 section 6.7.8 (Initialization), paragraph 32 shows this exact
  case as "example 8".

However, the upcoming gcc 15 will start warning for this case (when
compiled with -Wextra via DEVELOPER=1):

      CC object-file.o
  object-file.c:52:9: warning: initializer-string for array of ‘unsigned char’ is too long [-Wunterminated-string-initialization]
     52 |         "\x6e\xf1\x9b\x41\x22\x5c\x53\x69\xf1\xc1" \
        |         ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
  object-file.c:79:17: note: in expansion of macro ‘EMPTY_TREE_SHA256_BIN_LITERAL’

which is understandable. Even though this is not a bug for us, since we
do not care about the NUL terminator (and are just using the literal as
a convenient format), it would be easy to accidentally create an array
that was mistakenly unterminated.

We can avoid this warning by switching the initializer to an actual
array of unsigned values. That arguably demonstrates our intent more
clearly anyway.
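
To make the size difference concrete, here is a minimal standalone
sketch. It is not part of the patch: demo_oid, DEMO_HASH_LEN and the
four-byte buffer are hypothetical stand-ins for git's object_id and its
real hash sizes. It only shows that the string spelling carries one byte
more than the array spelling, while both hold the same leading bytes.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical 4-byte hash; git's real buffers hold 20 or 32 bytes. */
#define DEMO_HASH_LEN 4

/* Hypothetical stand-in for git's object_id. */
struct demo_oid {
	unsigned char hash[DEMO_HASH_LEN];
};

/* A string literal always carries a trailing NUL: 4 escapes, 5 bytes. */
static const char as_string[] = "\x4b\x82\x5d\xc6";

/* An array initializer holds exactly the bytes listed: 4 bytes. */
static const unsigned char as_bytes[] = { 0x4b, 0x82, 0x5d, 0xc6 };

/* The array form fills a fixed-size field with nothing left over. */
static const struct demo_oid demo = {
	.hash = { 0x4b, 0x82, 0x5d, 0xc6 },
};

/* Returns 1 if all spellings agree on the first DEMO_HASH_LEN bytes. */
static int demo_same_bytes(void)
{
	assert(sizeof(as_bytes) == DEMO_HASH_LEN);
	return !memcmp(as_string, as_bytes, DEMO_HASH_LEN) &&
	       !memcmp(demo.hash, as_bytes, DEMO_HASH_LEN);
}
```

The sketch deliberately avoids initializing the fixed-size field from
the string literal, since that is the construct the warning quoted above
is about.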

Reported-by: Sam James
Signed-off-by: Jeff King
---
 object-file.c | 38 +++++++++++++++++++++-----------------
 1 file changed, 21 insertions(+), 17 deletions(-)

diff --git a/object-file.c b/object-file.c
index b1a3463852..8101585616 100644
--- a/object-file.c
+++ b/object-file.c
@@ -45,23 +45,27 @@
 #define MAX_HEADER_LEN 32
 
 
-#define EMPTY_TREE_SHA1_BIN_LITERAL \
-	"\x4b\x82\x5d\xc6\x42\xcb\x6e\xb9\xa0\x60" \
-	"\xe5\x4b\xf8\xd6\x92\x88\xfb\xee\x49\x04"
-#define EMPTY_TREE_SHA256_BIN_LITERAL \
-	"\x6e\xf1\x9b\x41\x22\x5c\x53\x69\xf1\xc1" \
-	"\x04\xd4\x5d\x8d\x85\xef\xa9\xb0\x57\xb5" \
-	"\x3b\x14\xb4\xb9\xb9\x39\xdd\x74\xde\xcc" \
-	"\x53\x21"
-
-#define EMPTY_BLOB_SHA1_BIN_LITERAL \
-	"\xe6\x9d\xe2\x9b\xb2\xd1\xd6\x43\x4b\x8b" \
-	"\x29\xae\x77\x5a\xd8\xc2\xe4\x8c\x53\x91"
-#define EMPTY_BLOB_SHA256_BIN_LITERAL \
-	"\x47\x3a\x0f\x4c\x3b\xe8\xa9\x36\x81\xa2" \
-	"\x67\xe3\xb1\xe9\xa7\xdc\xda\x11\x85\x43" \
-	"\x6f\xe1\x41\xf7\x74\x91\x20\xa3\x03\x72" \
-	"\x18\x13"
+#define EMPTY_TREE_SHA1_BIN_LITERAL { \
+	0x4b, 0x82, 0x5d, 0xc6, 0x42, 0xcb, 0x6e, 0xb9, 0xa0, 0x60, \
+	0xe5, 0x4b, 0xf8, 0xd6, 0x92, 0x88, 0xfb, 0xee, 0x49, 0x04 \
+}
+#define EMPTY_TREE_SHA256_BIN_LITERAL { \
+	0x6e, 0xf1, 0x9b, 0x41, 0x22, 0x5c, 0x53, 0x69, 0xf1, 0xc1, \
+	0x04, 0xd4, 0x5d, 0x8d, 0x85, 0xef, 0xa9, 0xb0, 0x57, 0xb5, \
+	0x3b, 0x14, 0xb4, 0xb9, 0xb9, 0x39, 0xdd, 0x74, 0xde, 0xcc, \
+	0x53, 0x21 \
+}
+
+#define EMPTY_BLOB_SHA1_BIN_LITERAL { \
+	0xe6, 0x9d, 0xe2, 0x9b, 0xb2, 0xd1, 0xd6, 0x43, 0x4b, 0x8b, \
+	0x29, 0xae, 0x77, 0x5a, 0xd8, 0xc2, 0xe4, 0x8c, 0x53, 0x91 \
+}
+#define EMPTY_BLOB_SHA256_BIN_LITERAL { \
+	0x47, 0x3a, 0x0f, 0x4c, 0x3b, 0xe8, 0xa9, 0x36, 0x81, 0xa2, \
+	0x67, 0xe3, 0xb1, 0xe9, 0xa7, 0xdc, 0xda, 0x11, 0x85, 0x43, \
+	0x6f, 0xe1, 0x41, 0xf7, 0x74, 0x91, 0x20, 0xa3, 0x03, 0x72, \
+	0x18, 0x13 \
+}
 
 static const struct object_id empty_tree_oid = {
 	.hash = EMPTY_TREE_SHA1_BIN_LITERAL,

From patchwork Mon Nov 18 09:55:07 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff King
X-Patchwork-Id: 13878333
Date: Mon, 18 Nov 2024 04:55:07 -0500
From: Jeff King
To: Sam James
Cc: René Scharfe, Patrick Steinhardt, Chris Torek, "brian m. carlson", git@vger.kernel.org
Subject: [PATCH 2/6] object-file: drop confusing oid initializer of empty_tree struct
Message-ID: <20241118095507.GB3992317@coredump.intra.peff.net>
References: <20241118095423.GA3990835@coredump.intra.peff.net>
X-Mailing-List: git@vger.kernel.org
In-Reply-To: <20241118095423.GA3990835@coredump.intra.peff.net>

We treat the empty tree specially, providing an in-memory "cached" copy,
which allows you to diff against it even if the object doesn't exist in
the repository. This is implemented as part of the larger cached_object
subsystem, but we use a stand-alone empty_tree struct.

We initialize the oid of that struct using EMPTY_TREE_SHA1_BIN_LITERAL.
At first glance, that seems like a bug; how could this ever work for
sha256 repositories?

The answer is that we never look at the oid field!
The oid field is used to look up entries added by pretend_object_file()
to the cached_objects array. But for our stand-alone entry, we look for
it independently using the_hash_algo->empty_tree, which will point to
the correct algo struct for the repository.

This happened in 62ba93eaa9 (sha1_file: convert cached object code to
struct object_id, 2018-05-02), which even mentions that this field is
never used. Let's reduce confusion for anybody reading this code by
replacing the sha1 initializer with a comment. The resulting field will
be all-zeroes, so any violation of our assumption that the oid field is
not used will break equally for sha1 and sha256.

Signed-off-by: Jeff King
---
 object-file.c | 4 +---
 1 file changed, 1 insertion(+), 3 deletions(-)

diff --git a/object-file.c b/object-file.c
index 8101585616..19fc4afa43 100644
--- a/object-file.c
+++ b/object-file.c
@@ -326,9 +326,7 @@ static struct cached_object {
 static int cached_object_nr, cached_object_alloc;
 
 static struct cached_object empty_tree = {
-	.oid = {
-		.hash = EMPTY_TREE_SHA1_BIN_LITERAL,
-	},
+	/* no oid needed; we'll look it up manually based on the_hash_algo */
 	.type = OBJ_TREE,
 	.buf = "",
 };

From patchwork Mon Nov 18 09:55:11 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff King
X-Patchwork-Id: 13878334
Date: Mon, 18 Nov 2024 04:55:11 -0500
From: Jeff King
To: Sam James
Cc: René Scharfe, Patrick Steinhardt, Chris Torek, "brian m. carlson", git@vger.kernel.org
Subject: [PATCH 3/6] object-file: move empty_tree struct into find_cached_object()
Message-ID: <20241118095511.GC3992317@coredump.intra.peff.net>
References: <20241118095423.GA3990835@coredump.intra.peff.net>
X-Mailing-List: git@vger.kernel.org
In-Reply-To: <20241118095423.GA3990835@coredump.intra.peff.net>

The fake empty_tree struct is a static global, but the only code that
looks at it is find_cached_object(). The struct itself is a little odd,
with an invalid "oid" field that is handled specially by that function.

Since it's really just an implementation detail, let's move it to a
static within the function. That future-proofs against other code trying
to use it and seeing the weird oid value.
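
As a rough illustration of the scoping idea (a sketch only, not the code
from this patch), the following uses a hypothetical demo_object struct
and a string key in place of git's cached_object and object_id. The
special-case entry keeps static storage duration, so returning its
address is safe, but nothing outside the function can name it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for git's cached_object; not the real struct. */
struct demo_object {
	const char *type;
	const void *buf;
	size_t size;
};

/*
 * The special-case entry lives inside the only function that needs it.
 * Other code in the file cannot refer to empty_tree by name.
 */
static struct demo_object *demo_find(const char *key)
{
	static struct demo_object empty_tree = {
		.type = "tree",
		.buf = "",
	};

	assert(key != NULL);
	if (!strcmp(key, "empty-tree"))
		return &empty_tree;
	return NULL;
}
```

Every call with the matching key returns the same address, because the
object is initialized once and lives for the whole program.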

Signed-off-by: Jeff King
---
 object-file.c | 11 +++++------
 1 file changed, 5 insertions(+), 6 deletions(-)

diff --git a/object-file.c b/object-file.c
index 19fc4afa43..4d4280543e 100644
--- a/object-file.c
+++ b/object-file.c
@@ -325,14 +325,13 @@ static struct cached_object {
 } *cached_objects;
 static int cached_object_nr, cached_object_alloc;
 
-static struct cached_object empty_tree = {
-	/* no oid needed; we'll look it up manually based on the_hash_algo */
-	.type = OBJ_TREE,
-	.buf = "",
-};
-
 static struct cached_object *find_cached_object(const struct object_id *oid)
 {
+	static struct cached_object empty_tree = {
+		/* no oid needed; we'll look it up manually based on the_hash_algo */
+		.type = OBJ_TREE,
+		.buf = "",
+	};
 	int i;
 	struct cached_object *co = cached_objects;
 

From patchwork Mon Nov 18 09:55:15 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff King
X-Patchwork-Id: 13878335
Date: Mon, 18 Nov 2024 04:55:15 -0500
From: Jeff King
To: Sam James
Cc: René Scharfe, Patrick Steinhardt, Chris Torek, "brian m. carlson", git@vger.kernel.org
Subject: [PATCH 4/6] object-file: drop oid field from find_cached_object() return value
Message-ID: <20241118095515.GD3992317@coredump.intra.peff.net>
References: <20241118095423.GA3990835@coredump.intra.peff.net>
X-Mailing-List: git@vger.kernel.org
In-Reply-To: <20241118095423.GA3990835@coredump.intra.peff.net>

The pretend_object_file() function adds to an array mapping oids to
object contents, which are later retrieved with find_cached_object().
We naturally need to store the oid for each entry, since it's the lookup
key.

But find_cached_object() also returns a hard-coded empty_tree object.
There we don't care about its oid field and instead compare against
the_hash_algo->empty_tree. The oid field is left as all-zeroes.

This all works, but it means that the cached_object struct we return
from find_cached_object() may or may not have a valid oid field,
depending on whether it is the hard-coded tree or came from
pretend_object_file().

Nobody looks at the field, so there's no bug. But let's future-proof it
by returning only the object contents themselves, not the oid. We'll
continue to call this "struct cached_object", and the array entry
mapping the key to those contents will be a "cached_object_entry".

This would also let us swap out the array for a better data structure
(like a hashmap) if we chose, but there's not much point. The only code
that adds an entry is git-blame, which adds at most a single entry per
process.

Signed-off-by: Jeff King
---
 object-file.c | 23 ++++++++++++-----------
 1 file changed, 12 insertions(+), 11 deletions(-)

diff --git a/object-file.c b/object-file.c
index 4d4280543e..67a6731066 100644
--- a/object-file.c
+++ b/object-file.c
@@ -317,27 +317,28 @@ int hash_algo_by_length(int len)
  * to write them into the object store (e.g. a browse-only
  * application).
  */
-static struct cached_object {
+static struct cached_object_entry {
 	struct object_id oid;
-	enum object_type type;
-	const void *buf;
-	unsigned long size;
+	struct cached_object {
+		enum object_type type;
+		const void *buf;
+		unsigned long size;
+	} value;
 } *cached_objects;
 static int cached_object_nr, cached_object_alloc;
 
 static struct cached_object *find_cached_object(const struct object_id *oid)
 {
 	static struct cached_object empty_tree = {
-		/* no oid needed; we'll look it up manually based on the_hash_algo */
 		.type = OBJ_TREE,
 		.buf = "",
 	};
 	int i;
-	struct cached_object *co = cached_objects;
+	struct cached_object_entry *co = cached_objects;
 
 	for (i = 0; i < cached_object_nr; i++, co++) {
 		if (oideq(&co->oid, oid))
-			return co;
+			return &co->value;
 	}
 	if (oideq(oid, the_hash_algo->empty_tree))
 		return &empty_tree;
@@ -1850,7 +1851,7 @@ int oid_object_info(struct repository *r,
 int pretend_object_file(void *buf, unsigned long len, enum object_type type,
 			struct object_id *oid)
 {
-	struct cached_object *co;
+	struct cached_object_entry *co;
 	char *co_buf;
 
 	hash_object_file(the_hash_algo, buf, len, type, oid);
@@ -1859,11 +1860,11 @@ int pretend_object_file(void *buf, unsigned long len, enum object_type type,
 		return 0;
 	ALLOC_GROW(cached_objects, cached_object_nr + 1, cached_object_alloc);
 	co = &cached_objects[cached_object_nr++];
-	co->size = len;
-	co->type = type;
+	co->value.size = len;
+	co->value.type = type;
 	co_buf = xmalloc(len);
 	memcpy(co_buf, buf, len);
-	co->buf = co_buf;
+	co->value.buf = co_buf;
 	oidcpy(&co->oid, oid);
 	return 0;
 }

From patchwork Mon Nov 18 09:55:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff King
X-Patchwork-Id: 13878336
Date: Mon, 18 Nov 2024 04:55:19 -0500
From: Jeff King
To: Sam James
Cc: René Scharfe, Patrick Steinhardt, Chris Torek, "brian m. carlson", git@vger.kernel.org
Subject: [PATCH 5/6] object-file: treat cached_object values as const
Message-ID: <20241118095519.GE3992317@coredump.intra.peff.net>
References: <20241118095423.GA3990835@coredump.intra.peff.net>
X-Mailing-List: git@vger.kernel.org
In-Reply-To: <20241118095423.GA3990835@coredump.intra.peff.net>

The cached-object API maps oids to in-memory entries. Once inserted,
these entries should be immutable. Let's return them from the
find_cached_object() call with a const tag to make this clear.
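
For readers less used to this idiom, here is a small sketch of what the
const tag buys. It assumes a hypothetical demo_object table indexed by
position, which is not how git's lookup works; it only shows that the
storage can stay writable for the insert path while lookups hand out
read-only views.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for git's cached_object; not the real struct. */
struct demo_object {
	const char *type;
	size_t size;
};

/* The table itself stays writable so that an insert path could fill it. */
static struct demo_object demo_table[] = {
	{ .type = "blob", .size = 3 },
	{ .type = "tree", .size = 0 },
};

#define DEMO_TABLE_LEN (sizeof(demo_table) / sizeof(demo_table[0]))

/* Lookups hand out pointer-to-const: callers may read but not modify. */
static const struct demo_object *demo_lookup(size_t idx)
{
	assert(idx < DEMO_TABLE_LEN);
	return &demo_table[idx];
}

/* Sums the sizes of all entries using only read access. */
static size_t demo_total_size(void)
{
	size_t total = 0;
	size_t i;

	for (i = 0; i < DEMO_TABLE_LEN; i++) {
		const struct demo_object *obj = demo_lookup(i);
		total += obj->size;
		/* "obj->size = 0;" here would fail to compile. */
	}
	return total;
}
```

The caller-side variable has to be declared const as well, which is why
the patch also touches the declaration in the function that consumes the
lookup result.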

Suggested-by: Patrick Steinhardt
Signed-off-by: Jeff King
---
 object-file.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/object-file.c b/object-file.c
index 67a6731066..ec62e5fb3b 100644
--- a/object-file.c
+++ b/object-file.c
@@ -327,14 +327,14 @@ static struct cached_object_entry {
 } *cached_objects;
 static int cached_object_nr, cached_object_alloc;
 
-static struct cached_object *find_cached_object(const struct object_id *oid)
+static const struct cached_object *find_cached_object(const struct object_id *oid)
 {
-	static struct cached_object empty_tree = {
+	static const struct cached_object empty_tree = {
 		.type = OBJ_TREE,
 		.buf = "",
 	};
 	int i;
-	struct cached_object_entry *co = cached_objects;
+	const struct cached_object_entry *co = cached_objects;
 
 	for (i = 0; i < cached_object_nr; i++, co++) {
 		if (oideq(&co->oid, oid))
@@ -1629,7 +1629,7 @@ static int do_oid_object_info_extended(struct repository *r,
 				       struct object_info *oi, unsigned flags)
 {
 	static struct object_info blank_oi = OBJECT_INFO_INIT;
-	struct cached_object *co;
+	const struct cached_object *co;
 	struct pack_entry e;
 	int rtype;
 	const struct object_id *real = oid;

From patchwork Mon Nov 18 09:55:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff King
X-Patchwork-Id: 13878337
Date: Mon, 18 Nov 2024 04:55:22 -0500
From: Jeff King
To: Sam James
Cc: René Scharfe, Patrick Steinhardt, Chris Torek, "brian m. carlson", git@vger.kernel.org
Subject: [PATCH 6/6] object-file: inline empty tree and blob literals
Message-ID: <20241118095522.GF3992317@coredump.intra.peff.net>
References: <20241118095423.GA3990835@coredump.intra.peff.net>
X-Mailing-List: git@vger.kernel.org
In-Reply-To: <20241118095423.GA3990835@coredump.intra.peff.net>

We define macros with the bytes of the empty trees and blobs for sha1
and sha256. But since e1ccd7e2b1 (sha1_file: only expose empty object
constants through git_hash_algo, 2018-05-02), those are used only for
initializing the git_hash_algo entries. Any other code using the macros
directly would be suspicious, since a hash_algo pointer is the level of
indirection we use to make everything work with both sha1 and sha256.

So let's future-proof against code doing the wrong thing by dropping the
macros entirely and just initializing the structs directly.

Signed-off-by: Jeff King
---
 object-file.c | 47 ++++++++++++++++++++---------------------------
 1 file changed, 20 insertions(+), 27 deletions(-)

diff --git a/object-file.c b/object-file.c
index ec62e5fb3b..891eaa2b4b 100644
--- a/object-file.c
+++ b/object-file.c
@@ -44,47 +44,40 @@
 /* The maximum size for an object header. */
 #define MAX_HEADER_LEN 32
 
-
-#define EMPTY_TREE_SHA1_BIN_LITERAL { \
-	0x4b, 0x82, 0x5d, 0xc6, 0x42, 0xcb, 0x6e, 0xb9, 0xa0, 0x60, \
-	0xe5, 0x4b, 0xf8, 0xd6, 0x92, 0x88, 0xfb, 0xee, 0x49, 0x04 \
-}
-#define EMPTY_TREE_SHA256_BIN_LITERAL { \
-	0x6e, 0xf1, 0x9b, 0x41, 0x22, 0x5c, 0x53, 0x69, 0xf1, 0xc1, \
-	0x04, 0xd4, 0x5d, 0x8d, 0x85, 0xef, 0xa9, 0xb0, 0x57, 0xb5, \
-	0x3b, 0x14, 0xb4, 0xb9, 0xb9, 0x39, 0xdd, 0x74, 0xde, 0xcc, \
-	0x53, 0x21 \
-}
-
-#define EMPTY_BLOB_SHA1_BIN_LITERAL { \
-	0xe6, 0x9d, 0xe2, 0x9b, 0xb2, 0xd1, 0xd6, 0x43, 0x4b, 0x8b, \
-	0x29, 0xae, 0x77, 0x5a, 0xd8, 0xc2, 0xe4, 0x8c, 0x53, 0x91 \
-}
-#define EMPTY_BLOB_SHA256_BIN_LITERAL { \
-	0x47, 0x3a, 0x0f, 0x4c, 0x3b, 0xe8, 0xa9, 0x36, 0x81, 0xa2, \
-	0x67, 0xe3, 0xb1, 0xe9, 0xa7, 0xdc, 0xda, 0x11, 0x85, 0x43, \
-	0x6f, 0xe1, 0x41, 0xf7, 0x74, 0x91, 0x20, 0xa3, 0x03, 0x72, \
-	0x18, 0x13 \
-}
-
 static const struct object_id empty_tree_oid = {
-	.hash = EMPTY_TREE_SHA1_BIN_LITERAL,
+	.hash = {
+		0x4b, 0x82, 0x5d, 0xc6, 0x42, 0xcb, 0x6e, 0xb9, 0xa0, 0x60,
+		0xe5, 0x4b, 0xf8, 0xd6, 0x92, 0x88, 0xfb, 0xee, 0x49, 0x04
+	},
 	.algo = GIT_HASH_SHA1,
 };
 static const struct object_id empty_blob_oid = {
-	.hash = EMPTY_BLOB_SHA1_BIN_LITERAL,
+	.hash = {
+		0xe6, 0x9d, 0xe2, 0x9b, 0xb2, 0xd1, 0xd6, 0x43, 0x4b, 0x8b,
+		0x29, 0xae, 0x77, 0x5a, 0xd8, 0xc2, 0xe4, 0x8c, 0x53, 0x91
+	},
 	.algo = GIT_HASH_SHA1,
 };
 static const struct object_id null_oid_sha1 = {
 	.hash = {0},
 	.algo = GIT_HASH_SHA1,
 };
 static const struct object_id empty_tree_oid_sha256 = {
-	.hash = EMPTY_TREE_SHA256_BIN_LITERAL,
+	.hash = {
+		0x6e, 0xf1, 0x9b, 0x41, 0x22, 0x5c, 0x53, 0x69, 0xf1, 0xc1,
+		0x04, 0xd4, 0x5d, 0x8d, 0x85, 0xef, 0xa9, 0xb0, 0x57, 0xb5,
+		0x3b, 0x14, 0xb4, 0xb9, 0xb9, 0x39, 0xdd, 0x74, 0xde, 0xcc,
+		0x53, 0x21
+	},
 	.algo = GIT_HASH_SHA256,
 };
 static const struct object_id empty_blob_oid_sha256 = {
-	.hash = EMPTY_BLOB_SHA256_BIN_LITERAL,
+	.hash = {
+		0x47, 0x3a, 0x0f, 0x4c, 0x3b, 0xe8, 0xa9, 0x36, 0x81, 0xa2,
+		0x67, 0xe3, 0xb1, 0xe9, 0xa7, 0xdc, 0xda, 0x11, 0x85, 0x43,
+		0x6f, 0xe1, 0x41, 0xf7, 0x74, 0x91, 0x20, 0xa3, 0x03, 0x72,
+		0x18, 0x13
+	},
 	.algo = GIT_HASH_SHA256,
 };
 static const struct object_id null_oid_sha256 = {