From patchwork Thu Sep 26 15:22:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13813457 Received: from mail-yw1-f176.google.com (mail-yw1-f176.google.com [209.85.128.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 12A235672 for ; Thu, 26 Sep 2024 15:22:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364156; cv=none; b=Kvm0YvoLUa4WAc86EQOtzuoC9vASLDjWtByW8sUzckig5k60Avlk7AyGwIcdALxDqKKyKtqfFt2nw/5xWdfLeMRncFiUGxlT/+wPq1ovdYjtIbzNebjtl+mFAcKjpffwhQVk74JOYU4fwjgr/Eb3fCrSP09CqAkusi7wzLz57vU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364156; c=relaxed/simple; bh=gFldgANGjE9FE8m8avklbkSZQ+IfRpTbs4Pllt771x0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=YR/oHkY2yeBywp5PqetHXdtk4jFuBxcPNSH2ZUxSyaso/PrN0epKYi7GXGft++DMPGWH6k5VthjayDGGltKOh8K75KGapcFHkdw64RDxulw536FmgP/V//HiekCh+Mo2AtgdLCC4GnMB5BlIn4q3Jhb4iH4vrrDDrQnCTZXVji0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=CGeHL93w; arc=none smtp.client-ip=209.85.128.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="CGeHL93w" Received: by mail-yw1-f176.google.com with SMTP id 00721157ae682-6e22f10cc11so10105717b3.1 for ; Thu, 26 Sep 2024 08:22:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727364153; x=1727968953; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=6I9WriWDw0/ZYmtXlKlhRpyxeBFdZ0cIHHSDg2mVVHk=; b=CGeHL93wesk9K9+VwNvfRgoKj1qu0W+uu/N6RVWN2Kv9x9i3UHp4138tLUKIIOHNEM agdPRDtUQSQt04Z0LC5oC/CEIEiq3u4qMR/umOjpbHkoXYRWEciidRH2dJU6V1tJpIrL J9FQP5qo9I7vMxz3PrbWYWAW9k6UU37LCnEifYGmEEKen+A4yURPQNs80xazo3+b9xKj hd+0TsVOgk+NdgP8IzQa8gT9s4MGFalQ3RJ4Ohls9XQuW4aUUbCJsJn208n6d7gtzwcL SXz/iypXZ0AFIMe166iLGBesuxGtK7uzUFrrarTbItIFOm2qyMbcdrp+U6WJreXy3tvh Ru0Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727364153; x=1727968953; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=6I9WriWDw0/ZYmtXlKlhRpyxeBFdZ0cIHHSDg2mVVHk=; b=isZOyjtoYBVTtK6wNBJi1iQyG3kSDamQ3+CU9zHAuXAnCTMzwjzNBkPHHeB8cDikUm AdnTOKYkTwlNmeiC2EafFoldWSlPqLFLoBmocmpjFdH9V0iFwI65K86MxbiKKUo4tIU7 3Y+cFzG4QNi/7WQ5xiKeakUVbW6Jqt7w02EYFj8p3ZZz/goOhTpEWMk4FctCcFaD6NGs DwDNRCrztlrfOqmqE/8Oo68PSgHrdXqdSQDtBWJczIHnYthvbi39i0mzx/xwRFfj/4rO oQvlI+7co4xTOldM84cER0I74L6ewXWezNCBPnRIoY6h1+Y7cTzxy2Ut0Yy/PGIHbBQF K2yw== X-Gm-Message-State: AOJu0Yxi6VNrJld9X9A1JRXpwAoEyQ7jDEbJ/k8RRbjykjUaUQL7mjWn g3qwsmMjR9BbaVfd6x3ine6haZuzfCkR2IDwVBBzDe2/QtnP3Co09ORKAqtNg+jSjCOQrnlugYN XZHw= X-Google-Smtp-Source: AGHT+IFBj34vMJpZTQWM0W9Sk9hcyNdPBkvDz1/h8Z1pw9FNk4ghJd2hXVparpB8P/yQYF4B0tKoSA== X-Received: by 2002:a05:690c:1e:b0:6e2:d80:d844 with SMTP id 00721157ae682-6e21da60076mr64328297b3.33.1727364153526; Thu, 26 Sep 2024 08:22:33 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e2452f7eb5sm192707b3.18.2024.09.26.08.22.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Sep 2024 08:22:33 -0700 (PDT) Date: Thu, 26 Sep 2024 11:22:32 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v5 1/8] finalize_object_file(): check for name collision before renaming Message-ID: <6f1ee91fff315678fef39a54220eae91632d2df9.1727364141.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: We prefer link()/unlink() to rename() for object files, with the idea that we should prefer the data that is already on disk to what is incoming. But we may fall back to rename() if the user has configured us to do so, or if the filesystem seems not to support cross-directory links. This loses the "prefer what is on disk" property. We can mitigate this somewhat by trying to stat() the destination filename before doing the rename. This is racy, since the object could be created between the stat() and rename() calls. But in practice it is expanding the definition of "what is already on disk" to be the point that the function is called. That is enough to deal with any potential attacks where an attacker is trying to collide hashes with what's already in the repository. Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- object-file.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/object-file.c b/object-file.c index 968da27cd4..38407f468a 100644 --- a/object-file.c +++ b/object-file.c @@ -1937,6 +1937,7 @@ static void write_object_file_prepare_literally(const struct git_hash_algo *algo */ int finalize_object_file(const char *tmpfile, const char *filename) { + struct stat st; int ret = 0; if (object_creation_mode == OBJECT_CREATION_USES_RENAMES) @@ -1957,9 +1958,12 @@ int finalize_object_file(const char *tmpfile, const char *filename) */ if (ret && ret != EEXIST) { try_rename: - if (!rename(tmpfile, filename)) + if (!stat(filename, &st)) + ret = EEXIST; + else if (!rename(tmpfile, filename)) goto out; - ret = errno; + else + ret = errno; } unlink_or_warn(tmpfile); if (ret) { From patchwork Thu Sep 26 15:22:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13813458 Received: from mail-yb1-f179.google.com (mail-yb1-f179.google.com [209.85.219.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id ECEE31509BF for ; Thu, 26 Sep 2024 15:22:37 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364159; cv=none; b=Rg1ddr5RjLfhFEQvHJa/INO+jVr0rY4QlF2F8OTB1n3U6jXm8mzuqSKLnaZJ7NQdGVTiyjsL5AHgsgvqBxzNX+D5oaRCzWLUKouxQJieaFCipYGwgmi5gQ8qtZox4ODFG/V3TRruFfORKb2q9aD+ARwm8+ngNsDpeUeSoTt0qZc= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364159; c=relaxed/simple; bh=1oqJeg/ApvJGgWVk+L4bWE8NkNkI6oCBbDBrzS1BWOY=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=GRAeW0k1YDqS0J5wN2ZfQKhpUNn15MwTc1Abz3uzbRxSXdhJicDoeqrt06PfMrZ+s3jGkORar1g5iOmtvRoaLE8UP5kwTpfbBig+U1wp2F8GqDnpfZXpt2HysEVJ8KVOIG8qzujwON1D0lIst1QWgvR4CVKmcWwovBJBENjchQ4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=YM0tOhTL; arc=none smtp.client-ip=209.85.219.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="YM0tOhTL" Received: by mail-yb1-f179.google.com with SMTP id 3f1490d57ef6-e04196b7603so1071287276.0 for ; Thu, 26 Sep 2024 08:22:37 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727364156; x=1727968956; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=Z2Ef2vIQsuHHfBp7Av64sbPmnMivM7QcEaEOawEJggo=; b=YM0tOhTLPQRFQ77xCi6H3PPlgPySvYeqv9BN7iAjD8YNrRKbtINP/n7uCha/NiTcmB xfXarBvmzfLcApu5kPZNxRccEEkup4EXs8Qv8ZGhSfgIdMlKexPS031N/rmStjvKX9/3 1s4VMNiBzJpMI8Lbw+lzvTUPhdAziSObgFHTc+SSU5Gq5aaJfcZOACPiGQUkNFzuV75s KR6An8UTKls8kMyffOBTSNzSpueVeQXxtEkXEPEQBqr/kfShabw3l8hVS38IYZXKM/Hb UAF8o4sDd/zPC5CAKk/m06HvBvKMityLEIPUIV88pO7H0pbixvsGus3XiitutXvUalB5 IxNA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727364156; x=1727968956; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=Z2Ef2vIQsuHHfBp7Av64sbPmnMivM7QcEaEOawEJggo=; b=Uf1aLgzRcj7CPbuymuGh7Kv5Zr8w8L0T4/mEkjZG/EQIG/cXNoh5kH2V27ZAH7H6n3 96m0e1bb5Gfg7+ZtOPxQzfQRvOoy+Nizxz8T6oKAsa+KE5rjEoglJGMoWqvi85O/sJ8c g5Q2muOyrYJRWSc7B7lL/BEf4BWISbOsZ1EiPmzxsds0MfwkAokDj9RRThkdaJ3fZY24 3FXQjEjSYZfBPfy4nsi833zxOcX9k8M22oYiQJD6WyjBOO0W1ENLdVo4swgRUOxYgeVL Iqfm4b8GdXnK8KgLQvsCFYlE7BkWwyBkvL2L/QM4z58VpBxph+12kwJSMHBFguxSUZR5 KFsg== X-Gm-Message-State: AOJu0YzheTuws1UDO6j+qGAzJ4Bfzh8p7FJAQqTW+S471VB14U71S0Wg 4bu88EeUqOxd/j6K/zQryHaohEU5Eaj7u7BG7xW17U2K+1Rsh/A01l/ES42LbLerKgXZh2XbNb1 HQ1o= X-Google-Smtp-Source: AGHT+IFIPgSe1BnZt8s8vHdQu3MvfvLZmTJRiXyoUifp8/k626wNn74aJiPn4RYTnyP+vrbT1PxBeQ== X-Received: by 2002:a05:6902:dca:b0:e25:b49f:c8b7 with SMTP id 3f1490d57ef6-e25b49fcaf8mr4011369276.50.1727364156626; Thu, 26 Sep 2024 08:22:36 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e25e3ef9b70sm23616276.3.2024.09.26.08.22.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Sep 2024 08:22:36 -0700 (PDT) Date: Thu, 26 Sep 2024 11:22:35 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v5 2/8] finalize_object_file(): refactor unlink_or_warn() placement Message-ID: <133047ca8c9aed3e297483b60ca3fdb02eb532a5.1727364141.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: As soon as we've tried to link() a temporary object into place, we then unlink() the tempfile immediately, whether we were successful or not. For the success case, this is because we no longer need the old file (it's now linked into place). For the error case, there are two outcomes. Either we got EEXIST, in which case we consider the collision to be a noop. Or we got a system error, in which we case we are just cleaning up after ourselves. Using a single line for all of these cases has some problems: - in the error case, our unlink() may clobber errno, which we use in the error message - for the collision case, there's a FIXME that indicates we should do a collision check. In preparation for implementing that, we'll need to actually hold on to the file. Split these three cases into their own calls to unlink_or_warn(). This is more verbose, but lets us do the right thing in each case. Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- object-file.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/object-file.c b/object-file.c index 38407f468a..5a1da57711 100644 --- a/object-file.c +++ b/object-file.c @@ -1944,6 +1944,8 @@ int finalize_object_file(const char *tmpfile, const char *filename) goto try_rename; else if (link(tmpfile, filename)) ret = errno; + else + unlink_or_warn(tmpfile); /* * Coda hack - coda doesn't like cross-directory links, @@ -1965,12 +1967,15 @@ int finalize_object_file(const char *tmpfile, const char *filename) else ret = errno; } - unlink_or_warn(tmpfile); if (ret) { if (ret != EEXIST) { + int saved_errno = errno; + unlink_or_warn(tmpfile); + errno = saved_errno; return error_errno(_("unable to write file %s"), filename); } /* FIXME!!! Collision check here ? */ + unlink_or_warn(tmpfile); } out: From patchwork Thu Sep 26 15:22:38 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13813459 Received: from mail-yb1-f181.google.com (mail-yb1-f181.google.com [209.85.219.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A48A214B959 for ; Thu, 26 Sep 2024 15:22:41 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364163; cv=none; b=mQPwviK7aIrStaZXqVoLR6I2Tk4WHMK4SHNoUcrjTuE7wOIyLT3xBK1nAB2SKPfYZwejJ5pCoH8Jmp1EKTUrDMVdqU8omat2v8HLMAutdVqXeWLzNNu6ntTIpNV+qV345T3B3SE4btdfVrh4hTAaL/HZyn/juk7hipatTiWn72w= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364163; c=relaxed/simple; bh=AI6pY6VsJ3uCbkzNgtJ2GT0EEY7aEOoU0Dybp4mMkrA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=o5JVvPzqkwheREd6NsUkTPLDOs4PEDUlTPlGaQh22wyxZ9DhG9D78zr+f6ym7me3zn3JsH8BlNHtFPZlwIFyZoJy7SVxCnB1AN0zJ/ZVyebkzNjWOtYciZz1zEMv44SMQBO1uAPOZ0ggmaDRN9a+V01/dtx7H3H+4J51Ne9oqDQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=kArNJx2p; arc=none smtp.client-ip=209.85.219.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="kArNJx2p" Received: by mail-yb1-f181.google.com with SMTP id 3f1490d57ef6-e25cc9c93a9so963832276.2 for ; Thu, 26 Sep 2024 08:22:41 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727364160; x=1727968960; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=2BLYq9SIuzeL4a/pdDAaU0cM1edyzPuMBS8r4CoWOYE=; b=kArNJx2pKuf8P9onZmpp6IL81c7/rFudx+3LkNkY306uunQ3IJJwZoHW2Je1/kJZ5b Ek6F7r19XlIiUjPko6DZ/rrGy6qKQJMqNpgkghWfLIrwz2ZGHLGkRWBt1itI5ZJB+XRZ /tIs++doU20P7U/4luPE/EMDTyhtQRvfkOgqjcsroShXU/SzxmLW/XFGJ4IuRUwunIhd oVF0+7bUopulRFelfrzXEDQPayy+ptRF00OySFZGwATR/MLv0V5Lfcof2JGK0M7QOtFz PTX9O2++TjNgcUlrYsuiFcBGveJWOsZNyBDiQcXfjY4sZu4Ffz0SK1bH5WGwOezWr+IM S/EA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727364160; x=1727968960; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=2BLYq9SIuzeL4a/pdDAaU0cM1edyzPuMBS8r4CoWOYE=; b=xF1AM8slvEE5gBrd+Uk1B9m8uOz/Oeldb/CGt/H8VmGxGeHpxPUy3SpQNLi8MzNr28 MI6xMPB/G6/AePsy59QsPzM5wjaQuJa5WH6FP3S9XgvPt0DcQ2Qj/J37YjRkiMdiDC8E eKKaVHI1R9wLYalUzoQIYItF4fZTQ4zcygcrNhdBg1VitWafj2dNnI0EyDUGhLx0gR2C NQGY68dIs/qI6K1kNCFBlBZflttBHmxyDykaNqKs2q55wbWVYZIRI/7Vkai7STEExHGY ZypRyvWQfW0ieGBEZDchSqtPI6N4Jqx4yBkJiy7Ym9MjKGtuxrrsb7QFNDhvdPHPcztB MKIA== X-Gm-Message-State: AOJu0YzLAxJpdnmVVEKyunqoB7bmI48kKdvvkU3k3MvHSqrhlHXKV3PS S0m65LpykR2/Z4owLabuIVpL7KCVvM/i1CBg7auOngx8RVoZefUc7mDMH9XF6dEKsbnmSbxsmVB V3AE= X-Google-Smtp-Source: AGHT+IHTADL/agC2owMSFR/rTD4kjmdoIGI5gxWM5uvp2eqDpf91f42fmGyUBFezCPEiwa17sYxK+Q== X-Received: by 2002:a05:6902:906:b0:e25:d6e3:d129 with SMTP id 3f1490d57ef6-e25d6e3d223mr690367276.13.1727364159778; Thu, 26 Sep 2024 08:22:39 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e25e4015fe9sm22980276.26.2024.09.26.08.22.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Sep 2024 08:22:39 -0700 (PDT) Date: Thu, 26 Sep 2024 11:22:38 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v5 3/8] finalize_object_file(): implement collision check Message-ID: <41d38352a49f2955e21e993cd89524615d80a168.1727364141.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: We've had "FIXME!!! Collision check here ?" in finalize_object_file() since aac1794132 (Improve sha1 object file writing., 2005-05-03). That is, when we try to write a file with the same name, we assume the on-disk contents are the same and blindly throw away the new copy. One of the reasons we never implemented this is because the files it moves are all named after the cryptographic hash of their contents (either loose objects, or packs which have their hash in the name these days). So we are unlikely to see such a collision by accident. And even though there are weaknesses in sha1, we assume they are mitigated by our use of sha1dc. So while it's a theoretical concern now, it hasn't been a priority. However, if we start using weaker hashes for pack checksums and names, this will become a practical concern. So in preparation, let's actually implement a byte-for-byte collision check. The new check will cause the write of new differing content to be a failure, rather than a silent noop, and we'll retain the temporary file on disk. If there's no collision present, we'll clean up the temporary file as usual after either rename()-ing or link()-ing it into place. Note that this may cause some extra computation when the files are in fact identical, but this should happen rarely. Loose objects are exempt from this check, and the collision check may be skipped by calling the _flags variant of this function with the FOF_SKIP_COLLISION_CHECK bit set. This is done for a couple of reasons: - We don't treat the hash of the loose object file's contents as a checksum, since the same loose object can be stored using different bytes on disk (e.g., when adjusting core.compression, using a different version of zlib, etc.). This is fundamentally different from cases where finalize_object_file() is operating over a file which uses the hash value as a checksum of the contents. In other words, a pair of identical loose objects can be stored using different bytes on disk, and that should not be treated as a collision. - We already use the path of the loose object as its hash value / object name, so checking for collisions at the content level doesn't add anything. Adding a content-level collision check would have to happen at a higher level than in finalize_object_file(), since (avoiding race conditions) writing an object loose which already exists in the repository will prevent us from even reaching finalize_object_file() via the object freshening code. There is a collision check in index-pack via its `check_collision()` function, but there isn't an analogous function in unpack-objects, which just feeds the result to write_object_file(). So skipping the collision check here does not change for better or worse the hardness of loose object writes. As a small note related to the latter bullet point above, we must teach the tmp-objdir routines to similarly skip the content-level collision checks when calling migrate_one() on a loose object file, which we do by setting the FOF_SKIP_COLLISION_CHECK bit when we are inside of a loose object shard. Co-authored-by: Jeff King Signed-off-by: Jeff King Helped-by: Elijah Newren Signed-off-by: Taylor Blau --- object-file.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++--- object-file.h | 6 +++++ tmp-objdir.c | 26 ++++++++++++++------ 3 files changed, 89 insertions(+), 10 deletions(-) diff --git a/object-file.c b/object-file.c index 5a1da57711..b9a3a1f62d 100644 --- a/object-file.c +++ b/object-file.c @@ -1932,10 +1932,67 @@ static void write_object_file_prepare_literally(const struct git_hash_algo *algo hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen); } +static int check_collision(const char *filename_a, const char *filename_b) +{ + char buf_a[4096], buf_b[4096]; + int fd_a = -1, fd_b = -1; + int ret = 0; + + fd_a = open(filename_a, O_RDONLY); + if (fd_a < 0) { + ret = error_errno(_("unable to open %s"), filename_a); + goto out; + } + + fd_b = open(filename_b, O_RDONLY); + if (fd_b < 0) { + ret = error_errno(_("unable to open %s"), filename_b); + goto out; + } + + while (1) { + ssize_t sz_a, sz_b; + + sz_a = read_in_full(fd_a, buf_a, sizeof(buf_a)); + if (sz_a < 0) { + ret = error_errno(_("unable to read %s"), filename_a); + goto out; + } + + sz_b = read_in_full(fd_b, buf_b, sizeof(buf_b)); + if (sz_b < 0) { + ret = error_errno(_("unable to read %s"), filename_b); + goto out; + } + + if (sz_a != sz_b || memcmp(buf_a, buf_b, sz_a)) { + ret = error(_("files '%s' and '%s' differ in contents"), + filename_a, filename_b); + goto out; + } + + if (sz_a < sizeof(buf_a)) + break; + } + +out: + if (fd_a > -1) + close(fd_a); + if (fd_b > -1) + close(fd_b); + return ret; +} + /* * Move the just written object into its final resting place. */ int finalize_object_file(const char *tmpfile, const char *filename) +{ + return finalize_object_file_flags(tmpfile, filename, 0); +} + +int finalize_object_file_flags(const char *tmpfile, const char *filename, + enum finalize_object_file_flags flags) { struct stat st; int ret = 0; @@ -1974,7 +2031,9 @@ int finalize_object_file(const char *tmpfile, const char *filename) errno = saved_errno; return error_errno(_("unable to write file %s"), filename); } - /* FIXME!!! Collision check here ? */ + if (!(flags & FOF_SKIP_COLLISION_CHECK) && + check_collision(tmpfile, filename)) + return -1; unlink_or_warn(tmpfile); } @@ -2228,7 +2287,8 @@ static int write_loose_object(const struct object_id *oid, char *hdr, warning_errno(_("failed utime() on %s"), tmp_file.buf); } - return finalize_object_file(tmp_file.buf, filename.buf); + return finalize_object_file_flags(tmp_file.buf, filename.buf, + FOF_SKIP_COLLISION_CHECK); } static int freshen_loose_object(const struct object_id *oid) @@ -2350,7 +2410,8 @@ int stream_loose_object(struct input_stream *in_stream, size_t len, strbuf_release(&dir); } - err = finalize_object_file(tmp_file.buf, filename.buf); + err = finalize_object_file_flags(tmp_file.buf, filename.buf, + FOF_SKIP_COLLISION_CHECK); if (!err && compat) err = repo_add_loose_object_map(the_repository, oid, &compat_oid); cleanup: diff --git a/object-file.h b/object-file.h index d6414610f8..81b30d269c 100644 --- a/object-file.h +++ b/object-file.h @@ -117,7 +117,13 @@ int check_object_signature(struct repository *r, const struct object_id *oid, */ int stream_object_signature(struct repository *r, const struct object_id *oid); +enum finalize_object_file_flags { + FOF_SKIP_COLLISION_CHECK = 1, +}; + int finalize_object_file(const char *tmpfile, const char *filename); +int finalize_object_file_flags(const char *tmpfile, const char *filename, + enum finalize_object_file_flags flags); /* Helper to check and "touch" a file */ int check_and_freshen_file(const char *fn, int freshen); diff --git a/tmp-objdir.c b/tmp-objdir.c index c2fb9f9193..9da0071cba 100644 --- a/tmp-objdir.c +++ b/tmp-objdir.c @@ -206,9 +206,11 @@ static int read_dir_paths(struct string_list *out, const char *path) return 0; } -static int migrate_paths(struct strbuf *src, struct strbuf *dst); +static int migrate_paths(struct strbuf *src, struct strbuf *dst, + enum finalize_object_file_flags flags); -static int migrate_one(struct strbuf *src, struct strbuf *dst) +static int migrate_one(struct strbuf *src, struct strbuf *dst, + enum finalize_object_file_flags flags) { struct stat st; @@ -220,12 +222,18 @@ static int migrate_one(struct strbuf *src, struct strbuf *dst) return -1; } else if (errno != EEXIST) return -1; - return migrate_paths(src, dst); + return migrate_paths(src, dst, flags); } - return finalize_object_file(src->buf, dst->buf); + return finalize_object_file_flags(src->buf, dst->buf, flags); } -static int migrate_paths(struct strbuf *src, struct strbuf *dst) +static int is_loose_object_shard(const char *name) +{ + return strlen(name) == 2 && isxdigit(name[0]) && isxdigit(name[1]); +} + +static int migrate_paths(struct strbuf *src, struct strbuf *dst, + enum finalize_object_file_flags flags) { size_t src_len = src->len, dst_len = dst->len; struct string_list paths = STRING_LIST_INIT_DUP; @@ -239,11 +247,15 @@ static int migrate_paths(struct strbuf *src, struct strbuf *dst) for (i = 0; i < paths.nr; i++) { const char *name = paths.items[i].string; + enum finalize_object_file_flags flags_copy = flags; strbuf_addf(src, "/%s", name); strbuf_addf(dst, "/%s", name); - ret |= migrate_one(src, dst); + if (is_loose_object_shard(name)) + flags_copy |= FOF_SKIP_COLLISION_CHECK; + + ret |= migrate_one(src, dst, flags_copy); strbuf_setlen(src, src_len); strbuf_setlen(dst, dst_len); @@ -271,7 +283,7 @@ int tmp_objdir_migrate(struct tmp_objdir *t) strbuf_addbuf(&src, &t->path); strbuf_addstr(&dst, repo_get_object_directory(the_repository)); - ret = migrate_paths(&src, &dst); + ret = migrate_paths(&src, &dst, 0); strbuf_release(&src); strbuf_release(&dst); From patchwork Thu Sep 26 15:22:41 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13813460 Received: from mail-yw1-f182.google.com (mail-yw1-f182.google.com [209.85.128.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0A9E214BFBF for ; Thu, 26 Sep 2024 15:22:43 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364165; cv=none; b=XTfNaKZBs6OaqukQ9AN+uPKmoCt0CEYSftGFKHbOod4TqerhtYz3Wiax9YwRa40+Jukne1q4XkmvrmXNWMTB8uq8HgPSm7MutZmXrioxJ2nelPgT2tp3lEgFe6w/0Rgof9pfbFNt5Z7f3UBYQYTSxNZUqgWzidz+Bdo6/OAVzrI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364165; c=relaxed/simple; bh=38km5qfmVKzvuvNxohccD4IGEExqYO6kwI5Az3NQtxM=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=d2HWxCA9nJqVZDmjLzJzSQtpE4Zrtv1JYEv9PT8wVeborhA2RcoM21JFwXRweHKN4x7sCMhJwof+wrI6e0Tt8zd9XINAbNJtGt523AeAOvFQq6RtO30RnygXjgct4prB3P1VZGCn5W3KJuZudCHGGiSaUAotGp3i8X3KxHZsJBo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=GpjUfscA; arc=none smtp.client-ip=209.85.128.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="GpjUfscA" Received: by mail-yw1-f182.google.com with SMTP id 00721157ae682-6d6891012d5so10626267b3.2 for ; Thu, 26 Sep 2024 08:22:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727364163; x=1727968963; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=A/rI6LFPvKIn/KM6bPhkm4cF+Wnrb0QdVwN/LX0uztw=; b=GpjUfscA+/vqOWUlsss7wp3H8mM4o8ZFFqj7Zc6/jOFWM7q5YZB2zyHzPmBwtrvvbC NXukl5hjg9eTUj5gZHt1hEAXXZ79Om7xcogpAbYEcLsLoyr1nOols5QKHkn/SK8OvSdc sGd3nS5ETn/o6X2/3Nsd5TpbTmXH548w+EEe8di55ZLcU4Uv77KWb3SbbhS3DtaDg+Ez IFa78pIcYUpiyu2s1NrolvQ+d3Rq21U/LIzZNduG1VOxlgWoqP7o90gXEN6n5zJzRJ7o QedsubLU5EXKX3ntB3GwFxuocJwLNMnGCmfiVgeD6CLr6VitpwqT3J4neYF8b6KZjlfq oODA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727364163; x=1727968963; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=A/rI6LFPvKIn/KM6bPhkm4cF+Wnrb0QdVwN/LX0uztw=; b=JZS1KONwEcDiifybCgD5spMGC7RdFpRWGbXOwxxzgsTzzVtmoQRsuN98gVxFxJNYRu DQqjn1oc1rfCsVdobnrGFDoqCA2Y1zJvQSZS34PslHMsJrXETH/OAZbxyl9Cnsa4crpQ e5EKZfIg943fbNIPKr55CXGmDDgAEzoNE/zu+jHOtS7KYAak3vFG2b6xCYmq5jJy5fp4 ymEXQIjyxDecCm7Mcclk7/Qyy73dKRoUkUlhVqYRc4vdrSwiykGKvg2zYIJW1pQcOatX dXLoijLMsxdmexM/Gc5ORtpUAzWsX4zfR2DAkIUbAN0vipA+/HHT06A73dN5Ve3v0dtS 0/sg== X-Gm-Message-State: AOJu0YwyCab5nvDgE7sqM6TxgmUuNgUnU8oN08QYa+2tHZ0pg/HYX8ap qQRqpb8ImBVQUAP7DICySxaj/9KX28shsEMLNfydniM8brD7Y+tgZBViOJGLepTwB7IgIkAdVkN gmfw= X-Google-Smtp-Source: AGHT+IG+KgBP6ilsHuD2JDJ8UwHQ8ahEiEDQ1rIdc5WzQlmTxT5ZabfwFYGKLisC7LDs48xnR6W8nA== X-Received: by 2002:a05:690c:10d:b0:6de:b23:f2a1 with SMTP id 00721157ae682-6e21d70a70fmr57482417b3.15.1727364162807; Thu, 26 Sep 2024 08:22:42 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e24538aff0sm142317b3.126.2024.09.26.08.22.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Sep 2024 08:22:42 -0700 (PDT) Date: Thu, 26 Sep 2024 11:22:41 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v5 4/8] pack-objects: use finalize_object_file() to rename pack/idx/etc Message-ID: <611475d83e261c1b1ddd29d5e711a7e2d75e0341.1727364141.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In most places that write files to the object database (even packfiles via index-pack or fast-import), we use finalize_object_file(). This prefers link()/unlink() over rename(), because it means we will prefer data that is already in the repository to data that we are newly writing. We should do the same thing in pack-objects. Even though we don't think of it as accepting outside data (and thus not being susceptible to collision attacks), in theory a determined attacker could present just the right set of objects to cause an incremental repack to generate a pack with their desired hash. This has some test and real-world fallout, as seen in the adjustment to t5303 below. That test script assumes that we can "fix" corruption by repacking into a good state, including when the pack generated by that repack operation collides with a (corrupted) pack with the same hash. This violates our assumption from the previous adjustments to finalize_object_file() that if we're moving a new file over an existing one, that since their checksums match, so too must their contents. This makes "fixing" corruption like this a more explicit operation, since the test (and users, who may fix real-life corruption using a similar technique) must first move the broken contents out of the way. Note also that we now call adjust_shared_perm() twice. We already call adjust_shared_perm() in stage_tmp_packfiles(), and now call it again in finalize_object_file(). This is somewhat wasteful, but cleaning up the existing calls to adjust_shared_perm() is tricky (because sometimes we're writing to a tmpfile, and sometimes we're writing directly into the final destination), so let's tolerate some minor waste until we can more carefully clean up the now-redundant calls. Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- pack-write.c | 7 ++++--- t/t5303-pack-corruption-resilience.sh | 7 ++++++- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/pack-write.c b/pack-write.c index 27965672f1..f415604c15 100644 --- a/pack-write.c +++ b/pack-write.c @@ -8,6 +8,7 @@ #include "csum-file.h" #include "remote.h" #include "chunk-format.h" +#include "object-file.h" #include "pack-mtimes.h" #include "pack-objects.h" #include "pack-revindex.h" @@ -528,9 +529,9 @@ static void rename_tmp_packfile(struct strbuf *name_prefix, const char *source, size_t name_prefix_len = name_prefix->len; strbuf_addstr(name_prefix, ext); - if (rename(source, name_prefix->buf)) - die_errno("unable to rename temporary file to '%s'", - name_prefix->buf); + if (finalize_object_file(source, name_prefix->buf)) + die("unable to rename temporary file to '%s'", + name_prefix->buf); strbuf_setlen(name_prefix, name_prefix_len); } diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh index 61469ef4a6..e6a43ec9ae 100755 --- a/t/t5303-pack-corruption-resilience.sh +++ b/t/t5303-pack-corruption-resilience.sh @@ -44,9 +44,14 @@ create_new_pack() { } do_repack() { + for f in $pack.* + do + mv $f "$(echo $f | sed -e 's/pack-/pack-corrupt-/')" || return 1 + done && pack=$(printf "$blob_1\n$blob_2\n$blob_3\n" | git pack-objects $@ .git/objects/pack/pack) && - pack=".git/objects/pack/pack-${pack}" + pack=".git/objects/pack/pack-${pack}" && + rm -f .git/objects/pack/pack-corrupt-* } do_corrupt_object() { From patchwork Thu Sep 26 15:22:44 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13813461 Received: from mail-yb1-f182.google.com (mail-yb1-f182.google.com [209.85.219.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5746D15383A for ; Thu, 26 Sep 2024 15:22:47 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364168; cv=none; b=X9QHEaK7nDs/RlHAmouxeh+Tp/pvj0ALed4e8hIH7BGKWHjbvWXyr01fvnQ3Qo6Qm8rXsJyjbH8n5tuQVfSdzQMdgLtCTxnEjywkCkmTXFWwoLrCCTF3fPzlcWPm5oNTInPIQ/KH+5B2QR8SmaCwgE+d2GQN0wvz1Sk0yiBw0sA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364168; c=relaxed/simple; bh=5Ekba7RMtLZKry+MkFEbD8tTQwNJQyfPwsnOJ+IRVI8=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=p3+g6Ntvfifjk4IPBoUCG4nqg5SCCI62AXqsrPp9hz5BN3sKiLAyY8qc1gYlvMIN1NCw8+sG3A13KcRTH9bjJR1V5UQjHLImGc8vqZtutkP8GYwOIdfaxjOBTJaYt4EEYBxiFBOTLreeyA/MnUs0aItFSUCM5LkoGOxDiF7z9io= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=F9TI+taw; arc=none smtp.client-ip=209.85.219.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="F9TI+taw" Received: by mail-yb1-f182.google.com with SMTP id 3f1490d57ef6-e25cbc50d7fso955802276.1 for ; Thu, 26 Sep 2024 08:22:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727364166; x=1727968966; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=QMv3K2AHM700CA19tjbjvf8DVwj5IC/uRDnyMgTY88s=; b=F9TI+tawCQBbTypcRgtb71vwyiwYM+tCAAvbjrX5n+SaQpntwB26Nd98/sLXWAxuZ6 eNg2aPYRKgZddw0AyadZop2p1wP1rjuUQ7BJTuzXfEG5RSeqs7ua0la06Xw56PmyW6Hs yFi7uTE5KF4+33VkMBgQ7V8aSvnIjFsLwEn10osA+2kt9Znu6aiKLu5yaOQIK+jR+96k mucJysdZ3kmmPtjkdJczbT06WYvtmzz1SCL4sqWwMsOqKyE+Ftxby4ZQr5lyu+CYeDRU h+TOVd6kjjohdC1YSccn46StKJMFloSVaOeDoYkIiRpGXJjkgHrYqHIFmfpaDjDM+QZe qP1g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727364166; x=1727968966; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=QMv3K2AHM700CA19tjbjvf8DVwj5IC/uRDnyMgTY88s=; b=fCyvp6HQG5zL0/MZS2wKI69XQ6efVob4yf2fPjO5jscCEpEI9v4HoZXdAaQ4FKLcK5 lAwhAv799tE5kSzQZZEDQwagTrMW5/I/6CDunDYXygCoW4ATuSUtKRfNe4rWKB1pNoEt zvE5nTUT2cClDwQQdiWh8WCym00shIRuez/alRqTkaPtrj39+8y5gLUPJUm+LzPoDyWC GBUrJ1MgJBYCjUfcI5PXmDwxEem9Fy+6u6otRlEg2iUoLKSvBxL97XiT+xxNT1+Yq3HZ 6RDpWHTYCz7SfTwhtZuKI5ip9GYm0tGMEFt3EtU6Ts/lKJSO45UgDarETOsq7NbdYWm/ E1dA== X-Gm-Message-State: AOJu0YyUyaUEMXOEPf3Iqo+R7ScF5gK5qc8Nrl5cad4ZIm30mDjkuCfR 2E5aWIjfe2BhTmh2FRIGd606OYXG5HTbihPibqe3uDb16aX/a3IDO7ljbOBpjgvJLYfOWkAxuTR tskg= X-Google-Smtp-Source: AGHT+IGX/W+UsU7PWn6/Dte43NHjW16lvw/3hzlpDXCia50jajF990snbArflMd86MC+nZ8D1uecfQ== X-Received: by 2002:a05:6902:d41:b0:e25:c5ec:5797 with SMTP id 3f1490d57ef6-e25c5ec58bemr2904178276.18.1727364165951; Thu, 26 Sep 2024 08:22:45 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e25e3efab1dsm23678276.6.2024.09.26.08.22.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Sep 2024 08:22:45 -0700 (PDT) Date: Thu, 26 Sep 2024 11:22:44 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v5 5/8] sha1: do not redefine `platform_SHA_CTX` and friends Message-ID: <9913a5d971389e7e657ea151eaedc94bf025f3fc.1727364141.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Our in-tree SHA-1 wrappers all define platform_SHA_CTX and related macros to point at the opaque "context" type, init, update, and similar functions for each specific implementation. In hash.h, we use these platform_ variables to set up the function pointers for, e.g., the_hash_algo->init_fn(), etc. But while these header files have a header-specific macro that prevents them declaring their structs / functions multiple times, they unconditionally define the platform variables, making it impossible to load multiple SHA-1 implementations at once. As a prerequisite for loading a separate SHA-1 implementation for non-cryptographic uses, only define the platform_ variables if they have not already been defined. Signed-off-by: Taylor Blau --- block-sha1/sha1.h | 2 ++ sha1/openssl.h | 2 ++ sha1dc_git.h | 3 +++ 3 files changed, 7 insertions(+) diff --git a/block-sha1/sha1.h b/block-sha1/sha1.h index 9fb0441b98..47bb916636 100644 --- a/block-sha1/sha1.h +++ b/block-sha1/sha1.h @@ -16,7 +16,9 @@ void blk_SHA1_Init(blk_SHA_CTX *ctx); void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *dataIn, size_t len); void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx); +#ifndef platform_SHA_CTX #define platform_SHA_CTX blk_SHA_CTX #define platform_SHA1_Init blk_SHA1_Init #define platform_SHA1_Update blk_SHA1_Update #define platform_SHA1_Final blk_SHA1_Final +#endif diff --git a/sha1/openssl.h b/sha1/openssl.h index 006c1f4ba5..1038af47da 100644 --- a/sha1/openssl.h +++ b/sha1/openssl.h @@ -40,10 +40,12 @@ static inline void openssl_SHA1_Clone(struct openssl_SHA1_CTX *dst, EVP_MD_CTX_copy_ex(dst->ectx, src->ectx); } +#ifndef platform_SHA_CTX #define platform_SHA_CTX openssl_SHA1_CTX #define platform_SHA1_Init openssl_SHA1_Init #define platform_SHA1_Clone openssl_SHA1_Clone #define platform_SHA1_Update openssl_SHA1_Update #define platform_SHA1_Final openssl_SHA1_Final +#endif #endif /* SHA1_OPENSSL_H */ diff --git a/sha1dc_git.h b/sha1dc_git.h index 60e3ce8439..f6f880cabe 100644 --- a/sha1dc_git.h +++ b/sha1dc_git.h @@ -18,7 +18,10 @@ void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *); void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len); #define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */ + +#ifndef platform_SHA_CTX #define platform_SHA_CTX SHA1_CTX #define platform_SHA1_Init git_SHA1DCInit #define platform_SHA1_Update git_SHA1DCUpdate #define platform_SHA1_Final git_SHA1DCFinal +#endif From patchwork Thu Sep 26 15:22:47 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13813462 Received: from mail-yw1-f174.google.com (mail-yw1-f174.google.com [209.85.128.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 462AE155A3C for ; Thu, 26 Sep 2024 15:22:50 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364172; cv=none; b=Nu0tgO83utULPY1LTnCvvQd4KcnesKbTLLHV4GOciu9VcqXtt1WuZUcdsVsVBFIHb/6cPPK6KvgSfMWx5A71KmH8XqTLi17ufr69D1JRINBvEsjaVNSlIkuAi8m6sN19WdUR1PRihAO6/ZDk0eqzxT3hUwNVYpAm7nJeYuomWBs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364172; c=relaxed/simple; bh=EyPTsi82CiaV3UclfaKd1z4+2WdXKhmm/BLFlDoj5vc=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=BFnWpjHEfxPFk9Q0u/qUj6l3gjFe4ahO1RCbI/kw4Lz0AF2FNviNxX1JfWONJ6ZbBHU/yExbrI8ip9drtz9RxCexWi5/774l2Y5KDLRVre166JJxSc3pu43XCRvZJxItQ5Z0AZuNZHLo7QHDY30fC8ZKLHQ2uoFWu/VpuNSwxh0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=KrYjqsrn; arc=none smtp.client-ip=209.85.128.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="KrYjqsrn" Received: by mail-yw1-f174.google.com with SMTP id 00721157ae682-6e23b300bf9so5530727b3.3 for ; Thu, 26 Sep 2024 08:22:50 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727364169; x=1727968969; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=XPPI7wjVu8AX0Acif3L3I5PDWtypISmumVJ9KRiyCqs=; b=KrYjqsrnl5BvN0ydElyUjg3UZxKXdbK/mn+Fm9K3Pto289Bu5yBxGzeYuUrKXMryd/ a78n3tRc89oUBNbVaLsszFKbPLvTJHxZFZW+joQoDhy9iAstfQG0r20jNjxOJvnaY3Me TmOE9BFGGh0shpx6nXNpZ894T94mI5sWhckJAyDBoWfI9crwfU0OfxwWkckkAqUuHS5P WSupDLYLUiu0c9otQghswOIhU0fLimA6vwVvLF6N5ZDHsGop0Ol3u4QchekGqi0aUaJX np3Vj/O7Vj/Y1zu73I2E66KvopNJRr2PikXvIUR045aem5hE7Dk/v4IaWo+wNCTzwK3Z 5DWQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727364169; x=1727968969; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=XPPI7wjVu8AX0Acif3L3I5PDWtypISmumVJ9KRiyCqs=; b=QKQNelkkOafIm8vMi2nt25xSW/oQmRwFsAmUwR1b2x6uFYtNKvXoGFx2J28vsq2Iyy Y1oNdLjefOlmqQVaaEb5knEAtyjGSqurJj5wX8/7KeeLy76Sau/yHGhqI26jlJ3gTYlI fN9WxU06oDjEIT8OnptJpIwZU/tgpkqmK3gxSF8uEiQwmT8eyeoZ14p2FgorOw8aLZzR 34e39Se9IXzVTyk/RDHOS5vKelEPmNGQjwsIiLpbU9HrzZBiHer7YoYaE1t0028eqXn+ jSQTRxv9uDj8hMsTwSE9a6zE8uiuArTE82NHTeySFEa7m13UOVk3H45Ne6n+IUqdVFGO N8Ug== X-Gm-Message-State: AOJu0YzAbfDu8UhUsRypCgqrj6uGt4LddU/aiWMaFQsbnssYcceHyl54 6nYWJMj2Y0FGd4434aCdM3LMB6MdRXkQf9nPyjiLMHZVfxi2vnJcby1BfNn5EbrbRds9Yrdj5IS Uvus= X-Google-Smtp-Source: AGHT+IFS0seCY5LO7NeOqN+4g6Ut1vCRJ+ndMF2yLM4FpuBy8S4oy+C25Yimvcf1vLBxt3yZ6M9lfQ== X-Received: by 2002:a05:690c:7181:b0:6db:e368:3ff3 with SMTP id 00721157ae682-6e21d9f3b40mr49040827b3.40.1727364169018; Thu, 26 Sep 2024 08:22:49 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e245318153sm180387b3.60.2024.09.26.08.22.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Sep 2024 08:22:48 -0700 (PDT) Date: Thu, 26 Sep 2024 11:22:47 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v5 6/8] hash.h: scaffolding for _unsafe hashing variants Message-ID: <65de6d724d646082332e65799cf23a5a412ac68f.1727364141.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Git's default SHA-1 implementation is collision-detecting, which hardens us against known SHA-1 attacks against Git objects. This makes Git object writes safer at the expense of some speed when hashing through the collision-detecting implementation, which is slower than non-collision detecting alternatives. Prepare for loading a separate "unsafe" SHA-1 implementation that can be used for non-cryptographic purposes, like computing the checksum of files that use the hashwrite() API. This commit does not actually introduce any new compile-time knobs to control which implementation is used as the unsafe SHA-1 variant, but does add scaffolding so that the "git_hash_algo" structure has five new function pointers which are "unsafe" variants of the five existing hashing-related function pointers: - git_hash_init_fn unsafe_init_fn - git_hash_clone_fn unsafe_clone_fn - git_hash_update_fn unsafe_update_fn - git_hash_final_fn unsafe_final_fn - git_hash_final_oid_fn unsafe_final_oid_fn The following commit will introduce compile-time knobs to specify which SHA-1 implementation is used for non-cryptographic uses. Signed-off-by: Taylor Blau --- hash.h | 42 ++++++++++++++++++++++++++++++++++++++++++ object-file.c | 42 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 84 insertions(+) diff --git a/hash.h b/hash.h index 72ffbc862e..96458b129f 100644 --- a/hash.h +++ b/hash.h @@ -44,14 +44,32 @@ #define platform_SHA1_Final SHA1_Final #endif +#ifndef platform_SHA_CTX_unsafe +# define platform_SHA_CTX_unsafe platform_SHA_CTX +# define platform_SHA1_Init_unsafe platform_SHA1_Init +# define platform_SHA1_Update_unsafe platform_SHA1_Update +# define platform_SHA1_Final_unsafe platform_SHA1_Final +# ifdef platform_SHA1_Clone +# define platform_SHA1_Clone_unsafe platform_SHA1_Clone +# endif +#endif + #define git_SHA_CTX platform_SHA_CTX #define git_SHA1_Init platform_SHA1_Init #define git_SHA1_Update platform_SHA1_Update #define git_SHA1_Final platform_SHA1_Final +#define git_SHA_CTX_unsafe platform_SHA_CTX_unsafe +#define git_SHA1_Init_unsafe platform_SHA1_Init_unsafe +#define git_SHA1_Update_unsafe platform_SHA1_Update_unsafe +#define git_SHA1_Final_unsafe platform_SHA1_Final_unsafe + #ifdef platform_SHA1_Clone #define git_SHA1_Clone platform_SHA1_Clone #endif +#ifdef platform_SHA1_Clone_unsafe +# define git_SHA1_Clone_unsafe platform_SHA1_Clone_unsafe +#endif #ifndef platform_SHA256_CTX #define platform_SHA256_CTX SHA256_CTX @@ -81,6 +99,13 @@ static inline void git_SHA1_Clone(git_SHA_CTX *dst, const git_SHA_CTX *src) memcpy(dst, src, sizeof(*dst)); } #endif +#ifndef SHA1_NEEDS_CLONE_HELPER_UNSAFE +static inline void git_SHA1_Clone_unsafe(git_SHA_CTX_unsafe *dst, + const git_SHA_CTX_unsafe *src) +{ + memcpy(dst, src, sizeof(*dst)); +} +#endif #ifndef SHA256_NEEDS_CLONE_HELPER static inline void git_SHA256_Clone(git_SHA256_CTX *dst, const git_SHA256_CTX *src) @@ -178,6 +203,8 @@ enum get_oid_result { /* A suitably aligned type for stack allocations of hash contexts. */ union git_hash_ctx { git_SHA_CTX sha1; + git_SHA_CTX_unsafe sha1_unsafe; + git_SHA256_CTX sha256; }; typedef union git_hash_ctx git_hash_ctx; @@ -222,6 +249,21 @@ struct git_hash_algo { /* The hash finalization function for object IDs. */ git_hash_final_oid_fn final_oid_fn; + /* The non-cryptographic hash initialization function. */ + git_hash_init_fn unsafe_init_fn; + + /* The non-cryptographic hash context cloning function. */ + git_hash_clone_fn unsafe_clone_fn; + + /* The non-cryptographic hash update function. */ + git_hash_update_fn unsafe_update_fn; + + /* The non-cryptographic hash finalization function. */ + git_hash_final_fn unsafe_final_fn; + + /* The non-cryptographic hash finalization function. */ + git_hash_final_oid_fn unsafe_final_oid_fn; + /* The OID of the empty tree. */ const struct object_id *empty_tree; diff --git a/object-file.c b/object-file.c index b9a3a1f62d..196c9e2df8 100644 --- a/object-file.c +++ b/object-file.c @@ -115,6 +115,33 @@ static void git_hash_sha1_final_oid(struct object_id *oid, git_hash_ctx *ctx) oid->algo = GIT_HASH_SHA1; } +static void git_hash_sha1_init_unsafe(git_hash_ctx *ctx) +{ + git_SHA1_Init_unsafe(&ctx->sha1_unsafe); +} + +static void git_hash_sha1_clone_unsafe(git_hash_ctx *dst, const git_hash_ctx *src) +{ + git_SHA1_Clone_unsafe(&dst->sha1_unsafe, &src->sha1_unsafe); +} + +static void git_hash_sha1_update_unsafe(git_hash_ctx *ctx, const void *data, + size_t len) +{ + git_SHA1_Update_unsafe(&ctx->sha1_unsafe, data, len); +} + +static void git_hash_sha1_final_unsafe(unsigned char *hash, git_hash_ctx *ctx) +{ + git_SHA1_Final_unsafe(hash, &ctx->sha1_unsafe); +} + +static void git_hash_sha1_final_oid_unsafe(struct object_id *oid, git_hash_ctx *ctx) +{ + git_SHA1_Final_unsafe(oid->hash, &ctx->sha1_unsafe); + memset(oid->hash + GIT_SHA1_RAWSZ, 0, GIT_MAX_RAWSZ - GIT_SHA1_RAWSZ); + oid->algo = GIT_HASH_SHA1; +} static void git_hash_sha256_init(git_hash_ctx *ctx) { @@ -189,6 +216,11 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { .update_fn = git_hash_unknown_update, .final_fn = git_hash_unknown_final, .final_oid_fn = git_hash_unknown_final_oid, + .unsafe_init_fn = git_hash_unknown_init, + .unsafe_clone_fn = git_hash_unknown_clone, + .unsafe_update_fn = git_hash_unknown_update, + .unsafe_final_fn = git_hash_unknown_final, + .unsafe_final_oid_fn = git_hash_unknown_final_oid, .empty_tree = NULL, .empty_blob = NULL, .null_oid = NULL, @@ -204,6 +236,11 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { .update_fn = git_hash_sha1_update, .final_fn = git_hash_sha1_final, .final_oid_fn = git_hash_sha1_final_oid, + .unsafe_init_fn = git_hash_sha1_init_unsafe, + .unsafe_clone_fn = git_hash_sha1_clone_unsafe, + .unsafe_update_fn = git_hash_sha1_update_unsafe, + .unsafe_final_fn = git_hash_sha1_final_unsafe, + .unsafe_final_oid_fn = git_hash_sha1_final_oid_unsafe, .empty_tree = &empty_tree_oid, .empty_blob = &empty_blob_oid, .null_oid = &null_oid_sha1, @@ -219,6 +256,11 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { .update_fn = git_hash_sha256_update, .final_fn = git_hash_sha256_final, .final_oid_fn = git_hash_sha256_final_oid, + .unsafe_init_fn = git_hash_sha256_init, + .unsafe_clone_fn = git_hash_sha256_clone, + .unsafe_update_fn = git_hash_sha256_update, + .unsafe_final_fn = git_hash_sha256_final, + .unsafe_final_oid_fn = git_hash_sha256_final_oid, .empty_tree = &empty_tree_oid_sha256, .empty_blob = &empty_blob_oid_sha256, .null_oid = &null_oid_sha256, From patchwork Thu Sep 26 15:22:50 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13813463 Received: from mail-yw1-f170.google.com (mail-yw1-f170.google.com [209.85.128.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7CBCA13B5B4 for ; Thu, 26 Sep 2024 15:22:53 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364175; cv=none; b=i0L9TFhfTU0qgAH3YPqWPtMFZU/6jmJkyX63C3kQrMzgLgoTJrXgf5hHbPTJypAFC8br5swSnJlYB5edrUI4vg6gb77CZaPCc+1tdMK9EXqbzLBakkERXtXMP027hC1QCS3ARkysMtlpVOS59rAKNl5c7vbydPg0XtzFDnyj79c= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364175; c=relaxed/simple; bh=7vDKM4SUo2apFyTxPXWavVpiCUh4l4mtFKXnDw8lZIk=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=hkAqnKvDDKK23Bd2BeO3h8K6fh/mxNSepyUc4giAmC/D52ImSFFhKMa3O7a4jX7QOF9jCexN4SQI3JsS7lIbc6xlYS9SrjpCgCHKOn8/CdVg4jT5QjCozowt8PUSRT5gC6CZ9C16ZjRq539PLOfQkt32zgSsWnF/yHw+Yiymxzw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=N3SzpLrY; arc=none smtp.client-ip=209.85.128.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="N3SzpLrY" Received: by mail-yw1-f170.google.com with SMTP id 00721157ae682-6e21ac45b70so10221187b3.1 for ; Thu, 26 Sep 2024 08:22:53 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727364172; x=1727968972; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=aYehiyHbxf/iCSm1oMf9XWK5Z0q8gEvRMTROIU/d8ik=; b=N3SzpLrYni3C8EWBo1kIkVeahT7zq3s+XKR9YslnOgYhUziOQdkzfUt5fRIvghRZBj sj/QJ/VWoQQ+2Hxecbc3QmgUoDD5yppsfXkFQ+jfJTyFEHr4I7M1oz3KPBl9hoaCJui5 YRbuxaHStOCulMfUSGYwMP4vOSGSnDjbmGG3D6ADm3B/MB1vocOHZa+/0QXvL7PnjhnU x6amBzqgA21apitrzMSjBybIUQSfcEewoSth1T1Uuzu4Lasl8mhnE/btVWbxLX5mCfh8 QAWmCWn6SOxebpgGDY/TK+fWJ6U6v0z8RHRPdub2PshGqvgE4wdG9PaDuU+ojSDu0RqN Ak8g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727364172; x=1727968972; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=aYehiyHbxf/iCSm1oMf9XWK5Z0q8gEvRMTROIU/d8ik=; b=bJo1CNJDjwZgIjYTcuPT/61B8KBTGqL74Upuq7ha8Gc3nqVyFnk1Wxu7feNjN8k4+u MJJ/wHXlvOsp/+tsGfBVicbtqL47agL9vmYiG5h4gLa1oP1d/MCZ2RT8/Y/7v4vdpaJr OhQC02+acykNTcH1c3t6+diyHh32vj/ahMiIlDuARyxOvrKorrrvGsYVvNsa7OSeNR1K T0rveHYPLfoYTZKBM4u83zxJmK/v+J2w15irmzh/vZPL98Eueh/YWLHYwrgQbbgtTr2u TVHxyBZYHmmg222LxYeyLVFo4k/J5vnBAqWaOeqhXXy01jWRc5JweoCYMSWbw+LKLx7C AVOg== X-Gm-Message-State: AOJu0Yz913pSbgvc0UqOUb4HLlV+2qIAVIOFSj9c8x0KYrMxKDEbBhX0 mxvoyDeBzcDwM1dQV3ZlZoqT5OF2alf2o71qJJD3Ak5q2ce9YmzQPr473VqLdbGenXO1BcJE991 G13E= X-Google-Smtp-Source: AGHT+IHD0Bumji0CX7bjEaoAXUpHLBI5sZqJrkH22wyfoUncV0o7mWX60fn0Bm2ujixgaaz7YBwBfA== X-Received: by 2002:a05:690c:397:b0:6dd:bd27:8b07 with SMTP id 00721157ae682-6e21d9d61d1mr49001427b3.36.1727364172063; Thu, 26 Sep 2024 08:22:52 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e2452f9b39sm190967b3.25.2024.09.26.08.22.51 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Sep 2024 08:22:51 -0700 (PDT) Date: Thu, 26 Sep 2024 11:22:50 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v5 7/8] Makefile: allow specifying a SHA-1 for non-cryptographic uses Message-ID: <3884cd0e3add1b8dd2de01101dd11addc163d88e.1727364141.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Introduce _UNSAFE variants of the OPENSSL_SHA1, BLK_SHA1, and APPLE_COMMON_CRYPTO_SHA1 compile-time knobs which indicate which SHA-1 implementation is to be used for non-cryptographic uses. There are a couple of small implementation notes worth mentioning: - There is no way to select the collision detecting SHA-1 as the "fast" fallback, since the fast fallback is only for non-cryptographic uses, and is meant to be faster than our collision-detecting implementation. - There are no similar knobs for SHA-256, since no collision attacks are presently known and thus no collision-detecting implementations actually exist. Signed-off-by: Taylor Blau --- Makefile | 25 +++++++++++++++++++++++++ hash.h | 30 ++++++++++++++++++++++++++++++ 2 files changed, 55 insertions(+) diff --git a/Makefile b/Makefile index 9cf2be070f..fb84a87592 100644 --- a/Makefile +++ b/Makefile @@ -521,6 +521,10 @@ include shared.mak # Define APPLE_COMMON_CRYPTO_SHA1 to use Apple's CommonCrypto for # SHA-1. # +# Define the same Makefile knobs as above, but suffixed with _UNSAFE to +# use the corresponding implementations for unsafe SHA-1 hashing for +# non-cryptographic purposes. +# # If don't enable any of the *_SHA1 settings in this section, Git will # default to its built-in sha1collisiondetection library, which is a # collision-detecting sha1 This is slower, but may detect attempted @@ -1996,6 +2000,27 @@ endif endif endif +ifdef OPENSSL_SHA1_UNSAFE +ifndef OPENSSL_SHA1 + EXTLIBS += $(LIB_4_CRYPTO) + BASIC_CFLAGS += -DSHA1_OPENSSL_UNSAFE +endif +else +ifdef BLK_SHA1_UNSAFE +ifndef BLK_SHA1 + LIB_OBJS += block-sha1/sha1.o + BASIC_CFLAGS += -DSHA1_BLK_UNSAFE +endif +else +ifdef APPLE_COMMON_CRYPTO_SHA1_UNSAFE +ifndef APPLE_COMMON_CRYPTO_SHA1 + COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL + BASIC_CFLAGS += -DSHA1_APPLE_UNSAFE +endif +endif +endif +endif + ifdef OPENSSL_SHA256 EXTLIBS += $(LIB_4_CRYPTO) BASIC_CFLAGS += -DSHA256_OPENSSL diff --git a/hash.h b/hash.h index 96458b129f..f97f858307 100644 --- a/hash.h +++ b/hash.h @@ -15,6 +15,36 @@ #include "block-sha1/sha1.h" #endif +#if defined(SHA1_APPLE_UNSAFE) +# include +# define platform_SHA_CTX_unsafe CC_SHA1_CTX +# define platform_SHA1_Init_unsafe CC_SHA1_Init +# define platform_SHA1_Update_unsafe CC_SHA1_Update +# define platform_SHA1_Final_unsafe CC_SHA1_Final +#elif defined(SHA1_OPENSSL_UNSAFE) +# include +# if defined(OPENSSL_API_LEVEL) && OPENSSL_API_LEVEL >= 3 +# define SHA1_NEEDS_CLONE_HELPER_UNSAFE +# include "sha1/openssl.h" +# define platform_SHA_CTX_unsafe openssl_SHA1_CTX +# define platform_SHA1_Init_unsafe openssl_SHA1_Init +# define platform_SHA1_Clone_unsafe openssl_SHA1_Clone +# define platform_SHA1_Update_unsafe openssl_SHA1_Update +# define platform_SHA1_Final_unsafe openssl_SHA1_Final +# else +# define platform_SHA_CTX_unsafe SHA_CTX +# define platform_SHA1_Init_unsafe SHA1_Init +# define platform_SHA1_Update_unsafe SHA1_Update +# define platform_SHA1_Final_unsafe SHA1_Final +# endif +#elif defined(SHA1_BLK_UNSAFE) +# include "block-sha1/sha1.h" +# define platform_SHA_CTX_unsafe blk_SHA_CTX +# define platform_SHA1_Init_unsafe blk_SHA1_Init +# define platform_SHA1_Update_unsafe blk_SHA1_Update +# define platform_SHA1_Final_unsafe blk_SHA1_Final +#endif + #if defined(SHA256_NETTLE) #include "sha256/nettle.h" #elif defined(SHA256_GCRYPT) From patchwork Thu Sep 26 15:22:53 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13813464 Received: from mail-yw1-f172.google.com (mail-yw1-f172.google.com [209.85.128.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D860213B5B4 for ; Thu, 26 Sep 2024 15:22:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364178; cv=none; b=Q5wijtyIlMEgHZXleYq5YFaaLK6+NnMdhErxkgU52vE77XXNHHB9Maem5hp+3F8Fxy5snafcR5JiZtSD71imHKHqn+AEmnYF3U8QAAxZbaXqQ5qHzSQt/bEyk8csoPx1IvT+DCoFKYHcQCBZ6GO0g/iUKFwKrilHqa6l3yjNdEo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727364178; c=relaxed/simple; bh=LbV+n+sqRpOxiz6etEHTBA1JtLrES3qyG0fGbqQ74Ac=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=hvCItRfEs8hw/yp7NSpN0dZTnxZVXmKb2ca8KT9GIujGJ7CEDfVKo2+Pt1t4UcgphlkmR8aJ0WWU2csf+x2TJtiTsh4lhXRQAnLsyuyxzETZtUUykCfq+ArSJVNLexlVZ2arXQxLi8sMnDF+ZhrnP8P9Kfwi69agZo5ssriQnPU= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=qZGDFb1z; arc=none smtp.client-ip=209.85.128.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="qZGDFb1z" Received: by mail-yw1-f172.google.com with SMTP id 00721157ae682-6e129c01b04so9982697b3.1 for ; Thu, 26 Sep 2024 08:22:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727364175; x=1727968975; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=nHjGNvp/nuRbWxY+dkUxDKmQ7sY3N7norKsRQKZD2E8=; b=qZGDFb1zKPn0dMbYYRbNPsecO/vltg07ebWF6QsXWzxBU5EVd97NaA8MaVISLafOaV sZwGNDGdkSzWj4sUzKZponXH5ViRH4rKjCqC+5U6maWUDZ+tymwssgMjXrLjsg87tLWU rfhJeuDxJejcBdVQoz4Jpd//QvdVTYAXYR8LACe3lcFDuXDKFofMlwv5prsJaeC5Sk8K hI3QQG9Tm9YJ5NZV2+GWHF31t2lUHkra2jLEu2SiAeukhB93M0VRqxf9hIiLgqkIdu6V YOkSQQN+qYb5po7mKGZrkD8u8D+6MyacwjTLh8/uC/xCTaOj7nPOAFvKgOVoxi3GiUVr 7W5w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727364175; x=1727968975; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=nHjGNvp/nuRbWxY+dkUxDKmQ7sY3N7norKsRQKZD2E8=; b=Sq1N9zNObFJVj1Qo4uuRquFJ9W+KE1mSdxmF5hLTymJufl9K0p99aLJQm6eaj8o+ri 3cYSjY1zD+ljclC7MBIKkWSr3Ldz7G/BIFO9rxREsfuOb3vmKKfqqBksuSxhcr2v4780 TCsZSyfArVBAa45elviMZib+Si+lEDgOSELbF6FauFxSC2e0n8x51U/pnU/p/5+LPJId UtTPq+xIU4MEl1fAi+XGBU5OSAqzK9x8cuVGlnN5jbBNKMPVBXe1vro2qoKVzm9x9jSU ZZe9QDCjeLx0qTxUp530TWo755QQCqU4jrEbN6Or6KVORYHcz24YW5QPyflngVRIk4SO WVew== X-Gm-Message-State: AOJu0YyVxUwGhuRnFkn4hRrxyrX2ESdWwlLujfWunu9/yHoYQasROPXq +767V/WayKkLI5xedEswPimBPJIEO73dzN2CtnTlJ1gtfYSKuELIJx2VvT1A0SE2VhKET4oiKPd /tuk= X-Google-Smtp-Source: AGHT+IGGssoOqbynoVNB22nXNmJduyFGpDINfrddsBGUbYSM4gWgiKw7HWpaTm5WHDm7l72gy+xS9w== X-Received: by 2002:a05:690c:d89:b0:6dd:d2c5:b2c with SMTP id 00721157ae682-6e21d6de6b7mr66897037b3.4.1727364175534; Thu, 26 Sep 2024 08:22:55 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e245308eaesm182747b3.41.2024.09.26.08.22.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 26 Sep 2024 08:22:55 -0700 (PDT) Date: Thu, 26 Sep 2024 11:22:53 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v5 8/8] csum-file.c: use unsafe SHA-1 implementation when available Message-ID: <62abddf73dacc8593dde41ef37bab71215fdefa3.1727364141.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Update hashwrite() and friends to use the unsafe_-variants of hashing functions, calling for e.g., "the_hash_algo->unsafe_update_fn()" instead of "the_hash_algo->update_fn()". These callers only use the_hash_algo to produce a checksum, which we depend on for data integrity, but not for cryptographic purposes, so these callers are safe to use the unsafe (non-collision detecting) SHA-1 implementation. To time this, I took a freshly packed copy of linux.git, and ran the following with and without the OPENSSL_SHA1_UNSAFE=1 build-knob. Both versions were compiled with -O3: $ git for-each-ref --format='%(objectname)' refs/heads refs/tags >in $ valgrind --tool=callgrind ~/src/git/git-pack-objects \ --revs --stdout --all-progress --use-bitmap-index /dev/null Without OPENSSL_SHA1_UNSAFE=1 (that is, using the collision-detecting SHA-1 implementation for both cryptographic and non-cryptographic purposes), we spend a significant amount of our instruction count in hashwrite(): $ callgrind_annotate --inclusive=yes | grep hashwrite | head -n1 159,998,868,413 (79.42%) /home/ttaylorr/src/git/csum-file.c:hashwrite [/home/ttaylorr/src/git/git-pack-objects] , and the resulting "clone" takes 19.219 seconds of wall clock time, 18.94 seconds of user time and 0.28 seconds of system time. Compiling with OPENSSL_SHA1_UNSAFE=1, we spend ~60% fewer instructions in hashwrite(): $ callgrind_annotate --inclusive=yes | grep hashwrite | head -n1 59,164,001,176 (58.79%) /home/ttaylorr/src/git/csum-file.c:hashwrite [/home/ttaylorr/src/git/git-pack-objects] , and generate the resulting "clone" much faster, in only 11.597 seconds of wall time, 11.37 seconds of user time, and 0.23 seconds of system time, for a ~40% speed-up. Signed-off-by: Taylor Blau --- csum-file.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/csum-file.c b/csum-file.c index bf82ad8f9f..c203ebf11b 100644 --- a/csum-file.c +++ b/csum-file.c @@ -50,7 +50,7 @@ void hashflush(struct hashfile *f) if (offset) { if (!f->skip_hash) - the_hash_algo->update_fn(&f->ctx, f->buffer, offset); + the_hash_algo->unsafe_update_fn(&f->ctx, f->buffer, offset); flush(f, f->buffer, offset); f->offset = 0; } @@ -73,7 +73,7 @@ int finalize_hashfile(struct hashfile *f, unsigned char *result, if (f->skip_hash) hashclr(f->buffer, the_repository->hash_algo); else - the_hash_algo->final_fn(f->buffer, &f->ctx); + the_hash_algo->unsafe_final_fn(f->buffer, &f->ctx); if (result) hashcpy(result, f->buffer, the_repository->hash_algo); @@ -128,7 +128,7 @@ void hashwrite(struct hashfile *f, const void *buf, unsigned int count) * f->offset is necessarily zero. */ if (!f->skip_hash) - the_hash_algo->update_fn(&f->ctx, buf, nr); + the_hash_algo->unsafe_update_fn(&f->ctx, buf, nr); flush(f, buf, nr); } else { /* @@ -174,7 +174,7 @@ static struct hashfile *hashfd_internal(int fd, const char *name, f->name = name; f->do_crc = 0; f->skip_hash = 0; - the_hash_algo->init_fn(&f->ctx); + the_hash_algo->unsafe_init_fn(&f->ctx); f->buffer_len = buffer_len; f->buffer = xmalloc(buffer_len); @@ -208,7 +208,7 @@ void hashfile_checkpoint(struct hashfile *f, struct hashfile_checkpoint *checkpo { hashflush(f); checkpoint->offset = f->total; - the_hash_algo->clone_fn(&checkpoint->ctx, &f->ctx); + the_hash_algo->unsafe_clone_fn(&checkpoint->ctx, &f->ctx); } int hashfile_truncate(struct hashfile *f, struct hashfile_checkpoint *checkpoint) @@ -219,7 +219,7 @@ int hashfile_truncate(struct hashfile *f, struct hashfile_checkpoint *checkpoint lseek(f->fd, offset, SEEK_SET) != offset) return -1; f->total = offset; - the_hash_algo->clone_fn(&f->ctx, &checkpoint->ctx); + the_hash_algo->unsafe_clone_fn(&f->ctx, &checkpoint->ctx); f->offset = 0; /* hashflush() was called in checkpoint */ return 0; } @@ -245,9 +245,9 @@ int hashfile_checksum_valid(const unsigned char *data, size_t total_len) if (total_len < the_hash_algo->rawsz) return 0; /* say "too short"? */ - the_hash_algo->init_fn(&ctx); - the_hash_algo->update_fn(&ctx, data, data_len); - the_hash_algo->final_fn(got, &ctx); + the_hash_algo->unsafe_init_fn(&ctx); + the_hash_algo->unsafe_update_fn(&ctx, data, data_len); + the_hash_algo->unsafe_final_fn(got, &ctx); return hasheq(got, data + data_len, the_repository->hash_algo); }