From patchwork Tue Sep 24 17:32:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13811077 Received: from mail-yb1-f170.google.com (mail-yb1-f170.google.com [209.85.219.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1D1BD38DD1 for ; Tue, 24 Sep 2024 17:32:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199135; cv=none; b=Y7HVL8gL+6trC6gayd5wLVNGIda92qbDJSXULJR4rTu23EVPAUa/ss1/mP3d6ocb9NJq6LaLlzauPaLvVWg3hWDG3XQc8tn3B1xGwMkFlJ0V3pVU1uFxbbW6IDS1hZhC5YD+yaf4lCM9KtBNStZwh58i94PyXwNKpzTBMMnG8Hk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199135; c=relaxed/simple; bh=lyuGCn/y2ugP24lDv9u8fU/SQ4YYm60gWGmNJ3OBrsA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Zmp2mIQQ35TYoOVJhh9pJ50s4TVQQJxygNkHjKMSieLPDRdVt4PcM2gPDYmmde9UDVNiOlzhrXH5FP98sXtqIydlWBGysh2eASrR4KBdykoVcfrYKnMARjejV0hK0uEL2owfl60GfNYnckLI2vgfIYjcab2EaiYuSNS2jZSl3lk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=h0d9WREN; arc=none smtp.client-ip=209.85.219.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="h0d9WREN" Received: by mail-yb1-f170.google.com with SMTP id 3f1490d57ef6-e1a74ee4c0cso5147605276.2 for ; Tue, 24 Sep 2024 10:32:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727199133; x=1727803933; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=NLFXXCQeg2YweClnyjFgATxeHCWdKZtflN4MsinISps=; b=h0d9WRENlrNAR0lr+sIOE+D6GC/ThE7KdTC387K5WJedTflo86O/MlQIMkVBD+LA6t UXeEkKxhTs5vYHTZMy1uYmPELPX9ZorAGuiaVu3lT6jC8ncdgXe3kFtpp2ZjZOc/1uW/ tJpAJJOfmIhSKKoGPEX83WYyUYjoaSyJYHGsVLnpjXMHxsqNhLV3rnOP7JHhbX61S8Jy +DsuQdI0LdKN1iiQVh0lpRp1YKEnWb1J2RHQnPKHba5gJlDUcHuuud96inmGSnvJrQrk +/11vkzW7/tiS0fr8chK+dMlM9XKM9QnQ/8Q/p46JoCDD6+PtH3rT4jIs3ZtYB0gb+hb wSGg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727199133; x=1727803933; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=NLFXXCQeg2YweClnyjFgATxeHCWdKZtflN4MsinISps=; b=a8lBt2oSt3LxA9Ni30HtoGrCiu6DFehHuILxl0HvrL3yzazjLzbKsy29X4abA7UHM5 gEXOUZQ9NIuXExK/xW2dN065XCtePfy9R7sU4v8lLEmDuDaaRdIDtiBbBCKEyyRfZkq+ sBb8EwqJhvGZW8WEQ+a6YfkIUmRGbU79PyantD0fvK3izkSDnOZxNglOPqcmrItt6ReB Ep+u7A1oMu4PgFtI3IeIVkthENtunp4t979jQLFqaenlUw06fiJLzytsfnNkgeugYB/c XdXROJa7qsqY+WHNOsnD7E+Cw3XreZb/NZzptlHnLkSJWC+cn6CYnwCT96o49YdHd0s3 xKDA== X-Gm-Message-State: AOJu0Yy5YZBo8bEtXK2dr6gdJZdUoJ8mP8nFxxYuTlETaLOfIrzlOkLB BGvtv0sl5gbf/qmUy3PYt5F37VGKU555USiOfyebotjBR/8lL4r63ZlhbONrDxS+QociB9XOHAO snXU= X-Google-Smtp-Source: AGHT+IHw4DkIUM70qBsyTvgrPm4Z9JTaRt0yT9BlkXGmRQcARQZ9KzLMIntZ+g8/g5SkE2n2xUUGbQ== X-Received: by 2002:a05:690c:4246:b0:6af:fd49:67e0 with SMTP id 00721157ae682-6e21da1e9eamr1102417b3.46.1727199132820; Tue, 24 Sep 2024 10:32:12 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e20d04a335sm3103407b3.47.2024.09.24.10.32.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Sep 2024 10:32:12 -0700 (PDT) Date: Tue, 24 Sep 2024 13:32:10 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v4 1/8] finalize_object_file(): check for name collision before renaming Message-ID: <6f1ee91fff315678fef39a54220eae91632d2df9.1727199118.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: We prefer link()/unlink() to rename() for object files, with the idea that we should prefer the data that is already on disk to what is incoming. But we may fall back to rename() if the user has configured us to do so, or if the filesystem seems not to support cross-directory links. This loses the "prefer what is on disk" property. We can mitigate this somewhat by trying to stat() the destination filename before doing the rename. This is racy, since the object could be created between the stat() and rename() calls. But in practice it is expanding the definition of "what is already on disk" to be the point that the function is called. That is enough to deal with any potential attacks where an attacker is trying to collide hashes with what's already in the repository. Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- object-file.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/object-file.c b/object-file.c index 968da27cd41..38407f468a9 100644 --- a/object-file.c +++ b/object-file.c @@ -1937,6 +1937,7 @@ static void write_object_file_prepare_literally(const struct git_hash_algo *algo */ int finalize_object_file(const char *tmpfile, const char *filename) { + struct stat st; int ret = 0; if (object_creation_mode == OBJECT_CREATION_USES_RENAMES) @@ -1957,9 +1958,12 @@ int finalize_object_file(const char *tmpfile, const char *filename) */ if (ret && ret != EEXIST) { try_rename: - if (!rename(tmpfile, filename)) + if (!stat(filename, &st)) + ret = EEXIST; + else if (!rename(tmpfile, filename)) goto out; - ret = errno; + else + ret = errno; } unlink_or_warn(tmpfile); if (ret) { From patchwork Tue Sep 24 17:32:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13811078 Received: from mail-yw1-f169.google.com (mail-yw1-f169.google.com [209.85.128.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 74EFA38DD1 for ; Tue, 24 Sep 2024 17:32:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199139; cv=none; b=QNrxPVYKhGfYDY+pi+K4fu/rnrNs93AN6SJg0I/yETzufVggFYna9l01tzSkqFelZuiEfR610qP0lbXGvCR3LCJxHogYVwKbFM/TbN50JDNRXTLjlbpuucZRAZ24RF7tpFr9HnSm5gOP34giiYssHMdmTUDMp377to9hoQC78gg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199139; c=relaxed/simple; bh=HeBrCo+ZVTeY0GRFVCQNUBnHo9dG4ey5LSlkb7wKNQk=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=GntFs1dEWaIthDGBqg14i25oBOxvjxZj8iMarTFZR/w8wjvXgej2yyDFZnyb4lXlQTU7AjzN7Z+D9txxpI0+13O0Vfg8Kq17GTHMYovL/A8ywkw3A5Elv2LxA+rupj6sStevokOIQpiRCNeX2LI77Iw8dNwoTaW9636SUMnoOos= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=QjMtRqrC; arc=none smtp.client-ip=209.85.128.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="QjMtRqrC" Received: by mail-yw1-f169.google.com with SMTP id 00721157ae682-6e20a8141c7so10820927b3.0 for ; Tue, 24 Sep 2024 10:32:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727199136; x=1727803936; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=cX8phJkeB+hnyG0RojMlo+hq6/5rVRNGKooNkDOKJnQ=; b=QjMtRqrCcKsJk6+9gHlsWQIfobHK/sNFrlwktbNFwxdyJryDezvcDxjX7Ey4qCTyqk FGgRGkj/hGxsJgZgF/dH/lF/x0Opv9w9VkSOK8azUR8pAlpwlFVKIH39W/w7gAE8hbL5 IZ1m3ahOYJxjrfTTNvRJhvoIZvAjmd1cVtGL7eh0y2F2B58unXjRdhCyDTM7bOreG8c0 SSdJb53sPT70rSa1HvWptOw0Ia3ytqtYs7wMFwozTBke1UPlE+NfpGztq2e5MLa3/yc0 AQYMnwSOlDfWrf6jXssFxh/UaRU3xg7CUKE/y3/aXJPFg2ViBG7ajGsEKQ02m6hGgeQy CVsw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727199136; x=1727803936; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=cX8phJkeB+hnyG0RojMlo+hq6/5rVRNGKooNkDOKJnQ=; b=meZQ0sspzsZo+p97Luc4A+QHEo1sndaynPA3DwOzei9+JLHDWnGxQW63dC9lfCJdZ0 fhG9JM5PufPijqrmYvyDAaJW9Ci7THwCKaoaaKJ7g0yUQlTdMvw1zruNISEZByi/R2OQ KaCikPTLRSzcoC+Psbq+TvSxxa4D6/bFg72VKfX6b/oENhUtF9yn8vagYd+BOoCTq1wK b1ExoUQSnF8YquQYO5OPYg5sv3xxisah5gZtRt4C6f68YOlr21/MGjHgngnUcydVA4TC 0bDN3/uDTARgQzXp83jfxiMo8zi5nXGFaFVvtiFUlMr+mLHNV73drfymyVTRVGp33neq p08w== X-Gm-Message-State: AOJu0Ywnw3slU3jozfz5+5rnDeKdf4ntar5kqSAHGV7QKdnCBzWIYqeA dp7xTdgEcIgw0zyIIdzS7oB5725nQ7x+MzdBITk2Sr+hQy38/HlcKlUSHkCNJcFutnznD5pG8rT N+0U= X-Google-Smtp-Source: AGHT+IFFOufsCpakZz3jIf+0CKMH1HXxQvgkpw/pKNdWbaR3jLSwqZMtaqlkC9+oADF6/m1dSQVdWA== X-Received: by 2002:a05:690c:660f:b0:6de:b23:f2c3 with SMTP id 00721157ae682-6e21d6ec709mr2186657b3.7.1727199136170; Tue, 24 Sep 2024 10:32:16 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e20d06ea39sm3101547b3.68.2024.09.24.10.32.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Sep 2024 10:32:15 -0700 (PDT) Date: Tue, 24 Sep 2024 13:32:14 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v4 2/8] finalize_object_file(): refactor unlink_or_warn() placement Message-ID: <133047ca8c9aed3e297483b60ca3fdb02eb532a5.1727199118.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: As soon as we've tried to link() a temporary object into place, we then unlink() the tempfile immediately, whether we were successful or not. For the success case, this is because we no longer need the old file (it's now linked into place). For the error case, there are two outcomes. Either we got EEXIST, in which case we consider the collision to be a noop. Or we got a system error, in which we case we are just cleaning up after ourselves. Using a single line for all of these cases has some problems: - in the error case, our unlink() may clobber errno, which we use in the error message - for the collision case, there's a FIXME that indicates we should do a collision check. In preparation for implementing that, we'll need to actually hold on to the file. Split these three cases into their own calls to unlink_or_warn(). This is more verbose, but lets us do the right thing in each case. Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- object-file.c | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-) diff --git a/object-file.c b/object-file.c index 38407f468a9..5a1da577115 100644 --- a/object-file.c +++ b/object-file.c @@ -1944,6 +1944,8 @@ int finalize_object_file(const char *tmpfile, const char *filename) goto try_rename; else if (link(tmpfile, filename)) ret = errno; + else + unlink_or_warn(tmpfile); /* * Coda hack - coda doesn't like cross-directory links, @@ -1965,12 +1967,15 @@ int finalize_object_file(const char *tmpfile, const char *filename) else ret = errno; } - unlink_or_warn(tmpfile); if (ret) { if (ret != EEXIST) { + int saved_errno = errno; + unlink_or_warn(tmpfile); + errno = saved_errno; return error_errno(_("unable to write file %s"), filename); } /* FIXME!!! Collision check here ? */ + unlink_or_warn(tmpfile); } out: From patchwork Tue Sep 24 17:32:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13811079 Received: from mail-yb1-f171.google.com (mail-yb1-f171.google.com [209.85.219.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7A94038DD1 for ; Tue, 24 Sep 2024 17:32:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199143; cv=none; b=bx47apn5lY2Z9EQelPd/ZDvmsBX10TWZvWEw2OlP6bFk6hyxaeONZU3BAKwkoeazYLFi4Cts2ZSlP+VONRtmDEArCulXGQH0QCHcE7nZ6svbtqiBI99WUG1FMMc9gXrKTJkOXGjtfaU6jJMyAofAhbAdYEjqEekZl1YydUOHnPE= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199143; c=relaxed/simple; bh=419rShZIYyT04h4XXuVTt2qvKMf2faSu+iWL+KMM8+o=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=o8RTNlVKPxYnnz51SGcfNzHSIPDZCttPjXPpnogP4qI1okYpfNRgm2RrKW5nOOvk3T16u4N7qRlaI3uvbJqywChqShbEh174CIhQSA1PjWZEKgYlRyUPAdHFvDjHmFxmidCVDskacAvBwT0W0a7aw3EnQVnXYv07zGjl1bIcoKE= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=DgyFuA76; arc=none smtp.client-ip=209.85.219.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="DgyFuA76" Received: by mail-yb1-f171.google.com with SMTP id 3f1490d57ef6-e1a9e4fa5aaso5186646276.2 for ; Tue, 24 Sep 2024 10:32:21 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727199140; x=1727803940; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=4pkiQpGV8RxKoYbN/qj8tXlwNVqbWlJfbGiu/CPrKak=; b=DgyFuA76oMHp4PcfEG9hC7lEhvwKBJyAFbXk8dhbAXY4lIcGr25ZXUi4BRoWk8yE1O xxiQtVtfRF/hU33n+8fg54IKG8ltMmtNR7hSJu1W0aEWGdvg9gHv6XExbA5hUUL5l9DZ 9A0ksOmNEre1vUkys7SuMi3b8rJeVIcwg+O29UQ82DWBSncSWbk/DaV1lTXP9+KFalXJ kSjAao1l5j3CfSx9/k3aoz0/SleoY1aEhIY0Exv6+abmgYCer5LO+d9lt7snxcwalxUl PISkgczTDtHxM4aEOZIcd58vDTPQA2pciQ8iF9Do04x0Y2T+dMgzg4saIIVM3ebqKG2Z PlFg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727199140; x=1727803940; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=4pkiQpGV8RxKoYbN/qj8tXlwNVqbWlJfbGiu/CPrKak=; b=HU/yl89x0BbzKZ0SSwtg5rXpwd0KXxKDbgfb5/rgHPHt75BI1KfrGQTbPI0HXTjEEn m9l7IPB1TTZFBRl8flpCLiV4vAN0NIJASrego3xzX9mq1CtgqPRPkfWKpCX0Sc/mXLvc D6FBtf1g73KTWVQwGSnVar/P3PeVd7H2ve4Y/ZiUeHnrgtR9fixKYZESaviB6DVBDr2e gH3Uk7cQwd6LqgKuQRjKUfb2CRkkZX6+e32BNcH3jrlSYtXUMry7hDJog+/nGFEL5xsK WFD0BAUfMKWzk0+TNj9GKdUur3p08S2LbuHNNCWbjfIRwIPbKn+wGKOaDE+lT2/kzLbd J9dw== X-Gm-Message-State: AOJu0Yzq5FBly+DmL2EZdgnWEvkRTTjcLV2n5zEVK6xlqhDXU6zcooLP csX+Xd9GGSVVW0K2l8dpuoyxpl69Mp13UzwdN9ksrPOyL/1SYeTxbGVMJUZjgAGyFO1KuNlfYgB fFpk= X-Google-Smtp-Source: AGHT+IEeyPQyIKWy2SThKYdw5FBcoofZSFw+wuiSFK5Jd0j1kqF2ybXiH4HNPOEL++Ily/WBKy5L6g== X-Received: by 2002:a05:6902:1249:b0:e24:a37b:c6a0 with SMTP id 3f1490d57ef6-e24a37bc787mr556689276.56.1727199140168; Tue, 24 Sep 2024 10:32:20 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e2499b18171sm301710276.27.2024.09.24.10.32.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Sep 2024 10:32:19 -0700 (PDT) Date: Tue, 24 Sep 2024 13:32:17 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v4 3/8] finalize_object_file(): implement collision check Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: We've had "FIXME!!! Collision check here ?" in finalize_object_file() since aac1794132 (Improve sha1 object file writing., 2005-05-03). That is, when we try to write a file with the same name, we assume the on-disk contents are the same and blindly throw away the new copy. One of the reasons we never implemented this is because the files it moves are all named after the cryptographic hash of their contents (either loose objects, or packs which have their hash in the name these days). So we are unlikely to see such a collision by accident. And even though there are weaknesses in sha1, we assume they are mitigated by our use of sha1dc. So while it's a theoretical concern now, it hasn't been a priority. However, if we start using weaker hashes for pack checksums and names, this will become a practical concern. So in preparation, let's actually implement a byte-for-byte collision check. The new check will cause the write of new differing content to be a failure, rather than a silent noop, and we'll retain the temporary file on disk. If there's no collision present, we'll clean up the temporary file as usual after either rename()-ing or link()-ing it into place. Note that this may cause some extra computation when the files are in fact identical, but this should happen rarely. Loose objects are exempt from this check, and the collision check may be skipped by calling the _flags variant of this function with the FOF_SKIP_COLLISION_CHECK bit set. This is done for a couple of reasons: - We don't treat the hash of the loose object file's contents as a checksum, since the same loose object can be stored using different bytes on disk (e.g., when adjusting core.compression, using a different version of zlib, etc.). This is fundamentally different from cases where finalize_object_file() is operating over a file which uses the hash value as a checksum of the contents. In other words, a pair of identical loose objects can be stored using different bytes on disk, and that should not be treated as a collision. - We already use the path of the loose object as its hash value / object name, so checking for collisions at the content level doesn't add anything. This is why we do not bother to check the inflated object contents for collisions either, since either (a) the object contents have the fingerprint of a SHA-1 collision, in which case the collision detecting SHA-1 implementation used to hash the contents to give us a path would have already rejected it, or (b) the contents are part of a colliding pair which does not bear the same fingerprints of known collision attacks, in which case we would not have caught it anyway. So skipping the collision check here does not change for better or worse the hardness of loose object writes. As a small note related to the latter bullet point above, we must teach the tmp-objdir routines to similarly skip the content-level collision checks when calling migrate_one() on a loose object file, which we do by setting the FOF_SKIP_COLLISION_CHECK bit when we are inside of a loose object shard. Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- object-file.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++--- object-file.h | 6 +++++ tmp-objdir.c | 26 ++++++++++++++------ 3 files changed, 89 insertions(+), 10 deletions(-) diff --git a/object-file.c b/object-file.c index 5a1da577115..b9a3a1f62db 100644 --- a/object-file.c +++ b/object-file.c @@ -1932,10 +1932,67 @@ static void write_object_file_prepare_literally(const struct git_hash_algo *algo hash_object_body(algo, &c, buf, len, oid, hdr, hdrlen); } +static int check_collision(const char *filename_a, const char *filename_b) +{ + char buf_a[4096], buf_b[4096]; + int fd_a = -1, fd_b = -1; + int ret = 0; + + fd_a = open(filename_a, O_RDONLY); + if (fd_a < 0) { + ret = error_errno(_("unable to open %s"), filename_a); + goto out; + } + + fd_b = open(filename_b, O_RDONLY); + if (fd_b < 0) { + ret = error_errno(_("unable to open %s"), filename_b); + goto out; + } + + while (1) { + ssize_t sz_a, sz_b; + + sz_a = read_in_full(fd_a, buf_a, sizeof(buf_a)); + if (sz_a < 0) { + ret = error_errno(_("unable to read %s"), filename_a); + goto out; + } + + sz_b = read_in_full(fd_b, buf_b, sizeof(buf_b)); + if (sz_b < 0) { + ret = error_errno(_("unable to read %s"), filename_b); + goto out; + } + + if (sz_a != sz_b || memcmp(buf_a, buf_b, sz_a)) { + ret = error(_("files '%s' and '%s' differ in contents"), + filename_a, filename_b); + goto out; + } + + if (sz_a < sizeof(buf_a)) + break; + } + +out: + if (fd_a > -1) + close(fd_a); + if (fd_b > -1) + close(fd_b); + return ret; +} + /* * Move the just written object into its final resting place. */ int finalize_object_file(const char *tmpfile, const char *filename) +{ + return finalize_object_file_flags(tmpfile, filename, 0); +} + +int finalize_object_file_flags(const char *tmpfile, const char *filename, + enum finalize_object_file_flags flags) { struct stat st; int ret = 0; @@ -1974,7 +2031,9 @@ int finalize_object_file(const char *tmpfile, const char *filename) errno = saved_errno; return error_errno(_("unable to write file %s"), filename); } - /* FIXME!!! Collision check here ? */ + if (!(flags & FOF_SKIP_COLLISION_CHECK) && + check_collision(tmpfile, filename)) + return -1; unlink_or_warn(tmpfile); } @@ -2228,7 +2287,8 @@ static int write_loose_object(const struct object_id *oid, char *hdr, warning_errno(_("failed utime() on %s"), tmp_file.buf); } - return finalize_object_file(tmp_file.buf, filename.buf); + return finalize_object_file_flags(tmp_file.buf, filename.buf, + FOF_SKIP_COLLISION_CHECK); } static int freshen_loose_object(const struct object_id *oid) @@ -2350,7 +2410,8 @@ int stream_loose_object(struct input_stream *in_stream, size_t len, strbuf_release(&dir); } - err = finalize_object_file(tmp_file.buf, filename.buf); + err = finalize_object_file_flags(tmp_file.buf, filename.buf, + FOF_SKIP_COLLISION_CHECK); if (!err && compat) err = repo_add_loose_object_map(the_repository, oid, &compat_oid); cleanup: diff --git a/object-file.h b/object-file.h index d6414610f80..81b30d269c8 100644 --- a/object-file.h +++ b/object-file.h @@ -117,7 +117,13 @@ int check_object_signature(struct repository *r, const struct object_id *oid, */ int stream_object_signature(struct repository *r, const struct object_id *oid); +enum finalize_object_file_flags { + FOF_SKIP_COLLISION_CHECK = 1, +}; + int finalize_object_file(const char *tmpfile, const char *filename); +int finalize_object_file_flags(const char *tmpfile, const char *filename, + enum finalize_object_file_flags flags); /* Helper to check and "touch" a file */ int check_and_freshen_file(const char *fn, int freshen); diff --git a/tmp-objdir.c b/tmp-objdir.c index c2fb9f91930..9da0071cba8 100644 --- a/tmp-objdir.c +++ b/tmp-objdir.c @@ -206,9 +206,11 @@ static int read_dir_paths(struct string_list *out, const char *path) return 0; } -static int migrate_paths(struct strbuf *src, struct strbuf *dst); +static int migrate_paths(struct strbuf *src, struct strbuf *dst, + enum finalize_object_file_flags flags); -static int migrate_one(struct strbuf *src, struct strbuf *dst) +static int migrate_one(struct strbuf *src, struct strbuf *dst, + enum finalize_object_file_flags flags) { struct stat st; @@ -220,12 +222,18 @@ static int migrate_one(struct strbuf *src, struct strbuf *dst) return -1; } else if (errno != EEXIST) return -1; - return migrate_paths(src, dst); + return migrate_paths(src, dst, flags); } - return finalize_object_file(src->buf, dst->buf); + return finalize_object_file_flags(src->buf, dst->buf, flags); } -static int migrate_paths(struct strbuf *src, struct strbuf *dst) +static int is_loose_object_shard(const char *name) +{ + return strlen(name) == 2 && isxdigit(name[0]) && isxdigit(name[1]); +} + +static int migrate_paths(struct strbuf *src, struct strbuf *dst, + enum finalize_object_file_flags flags) { size_t src_len = src->len, dst_len = dst->len; struct string_list paths = STRING_LIST_INIT_DUP; @@ -239,11 +247,15 @@ static int migrate_paths(struct strbuf *src, struct strbuf *dst) for (i = 0; i < paths.nr; i++) { const char *name = paths.items[i].string; + enum finalize_object_file_flags flags_copy = flags; strbuf_addf(src, "/%s", name); strbuf_addf(dst, "/%s", name); - ret |= migrate_one(src, dst); + if (is_loose_object_shard(name)) + flags_copy |= FOF_SKIP_COLLISION_CHECK; + + ret |= migrate_one(src, dst, flags_copy); strbuf_setlen(src, src_len); strbuf_setlen(dst, dst_len); @@ -271,7 +283,7 @@ int tmp_objdir_migrate(struct tmp_objdir *t) strbuf_addbuf(&src, &t->path); strbuf_addstr(&dst, repo_get_object_directory(the_repository)); - ret = migrate_paths(&src, &dst); + ret = migrate_paths(&src, &dst, 0); strbuf_release(&src); strbuf_release(&dst); From patchwork Tue Sep 24 17:32:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13811080 Received: from mail-yw1-f171.google.com (mail-yw1-f171.google.com [209.85.128.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B52681AC88D for ; Tue, 24 Sep 2024 17:32:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199146; cv=none; b=NSBpAK0Q9loCV8VvEyB9GekySwR0Oq2d8z69BJOgWA1oLzX2QPKWh3i9SfQ5z4fsIJRAUSdSCmp9yCurADU2WB66VBoQKqEyQQvKz3EVKl+qX8y0sx5j3TVxbPmEXgOofLYmC1P3b9F6F617L9JVUe15yKPxu0+5R5+XV7jOxEQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199146; c=relaxed/simple; bh=47viyOdQ+rZFEkFy2ZAwA7b7p3pfn58pQhJKlAzk95k=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Kw+7ezOjUn5jNUJKNiwIS4U8ZZCxEzUjrCnYZlsafDR2smzpPtSubCm0jEATvmbpSwImSP4sE/bw2jy+Jn/Jq6kCL7aHSkk0NGF6wmKOCTLTn9EbVaxix8VPn3rmduMnttbl9o6OLX7xG0mCh1C8ZnYGXx1dZe2FNwYH4E/L5yk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=C8HwDS8S; arc=none smtp.client-ip=209.85.128.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="C8HwDS8S" Received: by mail-yw1-f171.google.com with SMTP id 00721157ae682-6e129c01b04so21034507b3.1 for ; Tue, 24 Sep 2024 10:32:24 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727199143; x=1727803943; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=n6DPp10fMWHJ8BaJ9Ge3DySgWYiOYFrdeB2UwMuQsKo=; b=C8HwDS8SETgyGteKO4sOgdSwSdNAQrKqDzcyPNeR3sV0Ir7Mt9JUqIrexhe84XDOe2 9aSWhW5rCf9rUGWlr35lKKkDu0ILFkcgSjVxLxHCGXbsEAmi/pybufZLUP4fOjwuVvgY BKNybqCJZHh/0OaxuXZCnVlni9qy8//EttaTMbdcLRRy3ecT3KHRhlUH99CaqrLilnbJ ThpObF1rOhKoiWMA+et7wb+UuwK+yS0+BUYJjSW6KN0ydvwgUoMODR+mbhCkb6YHz/e0 0dlh7l6BWcxeI/Als8mW1zPcc5Kb0NOZWsAmkUCTz2aTNobOOZKV4wdiu/35lHb1bNPr HHIw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727199143; x=1727803943; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=n6DPp10fMWHJ8BaJ9Ge3DySgWYiOYFrdeB2UwMuQsKo=; b=LqBFbVDpQJQ0xtZMnATjiDcEVCuTj72nxHLCyBVMZwlVEhbRl27U5kQ5BwmKp/w6Ea bFDBTQr3InsbUF1nWzDf7IB9prebQtFc1xCb4EK6J+4LD0QPZKBhh6iNV6l8QW+9zO2C 1CHgxTeN/hYQbzOW9+lvDFXHKkpMFL7+G2bHxyql0jY+vvlBWUjt9FS4KKRlFFGvyF1c Ey21bcWQT1Y6UqxUunNpon60rJDCMycMHoFTTB2xMMxWY09690wsYhre34U0wL+KFiFl WtDAlDtxQHXg3PkPC/hVFqqkbgyPi6/7FBhh6b4ao1vCobN2cDyQausTNq6st+1YDvd/ kqnA== X-Gm-Message-State: AOJu0Yzkd9LQkyvTQhl3vjreK2UI7OANBtMcgiK9+I7HWlykVWr2f7WA DtM5IcpMyl/maAzYDmw/LL0+CvSLxhlIn/xLp2xAFmenPNbn70h4udTknarw4rRdw+CZmtY4QYy Eg0c= X-Google-Smtp-Source: AGHT+IEMHjKqLBPETsTYKdED3wIciEZEa0C0zyo+of7Yho5InJGjmHbA9gvij7Drujmbt81xgvRivQ== X-Received: by 2002:a05:690c:dcf:b0:631:78a1:baf with SMTP id 00721157ae682-6e21d6e1c69mr2073357b3.6.1727199143330; Tue, 24 Sep 2024 10:32:23 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e20d15ef56sm3138737b3.76.2024.09.24.10.32.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Sep 2024 10:32:22 -0700 (PDT) Date: Tue, 24 Sep 2024 13:32:21 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v4 4/8] pack-objects: use finalize_object_file() to rename pack/idx/etc Message-ID: <3cc7f7b1f67fe823834c36f3be20be8ee56e16a4.1727199118.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In most places that write files to the object database (even packfiles via index-pack or fast-import), we use finalize_object_file(). This prefers link()/unlink() over rename(), because it means we will prefer data that is already in the repository to data that we are newly writing. We should do the same thing in pack-objects. Even though we don't think of it as accepting outside data (and thus not being susceptible to collision attacks), in theory a determined attacker could present just the right set of objects to cause an incremental repack to generate a pack with their desired hash. This has some test and real-world fallout, as seen in the adjustment to t5303 below. That test script assumes that we can "fix" corruption by repacking into a good state, including when the pack generated by that repack operation collides with a (corrupted) pack with the same hash. This violates our assumption from the previous adjustments to finalize_object_file() that if we're moving a new file over an existing one, that since their checksums match, so too must their contents. This makes "fixing" corruption like this a more explicit operation, since the test (and users, who may fix real-life corruption using a similar technique) must first move the broken contents out of the way. Note also that we now call adjust_shared_perm() twice. We already call adjust_shared_perm() in stage_tmp_packfiles(), and now call it again in finalize_object_file(). This is somewhat wasteful, but cleaning up the existing calls to adjust_shared_perm() is tricky (because sometimes we're writing to a tmpfile, and sometimes we're writing directly into the final destination), so let's tolerate some minor waste until we can more carefully clean up the now-redundant calls. Co-authored-by: Jeff King Signed-off-by: Jeff King Signed-off-by: Taylor Blau --- pack-write.c | 7 ++++--- t/t5303-pack-corruption-resilience.sh | 7 ++++++- 2 files changed, 10 insertions(+), 4 deletions(-) diff --git a/pack-write.c b/pack-write.c index 27965672f17..f415604c159 100644 --- a/pack-write.c +++ b/pack-write.c @@ -8,6 +8,7 @@ #include "csum-file.h" #include "remote.h" #include "chunk-format.h" +#include "object-file.h" #include "pack-mtimes.h" #include "pack-objects.h" #include "pack-revindex.h" @@ -528,9 +529,9 @@ static void rename_tmp_packfile(struct strbuf *name_prefix, const char *source, size_t name_prefix_len = name_prefix->len; strbuf_addstr(name_prefix, ext); - if (rename(source, name_prefix->buf)) - die_errno("unable to rename temporary file to '%s'", - name_prefix->buf); + if (finalize_object_file(source, name_prefix->buf)) + die("unable to rename temporary file to '%s'", + name_prefix->buf); strbuf_setlen(name_prefix, name_prefix_len); } diff --git a/t/t5303-pack-corruption-resilience.sh b/t/t5303-pack-corruption-resilience.sh index 61469ef4a68..e6a43ec9ae3 100755 --- a/t/t5303-pack-corruption-resilience.sh +++ b/t/t5303-pack-corruption-resilience.sh @@ -44,9 +44,14 @@ create_new_pack() { } do_repack() { + for f in $pack.* + do + mv $f "$(echo $f | sed -e 's/pack-/pack-corrupt-/')" || return 1 + done && pack=$(printf "$blob_1\n$blob_2\n$blob_3\n" | git pack-objects $@ .git/objects/pack/pack) && - pack=".git/objects/pack/pack-${pack}" + pack=".git/objects/pack/pack-${pack}" && + rm -f .git/objects/pack/pack-corrupt-* } do_corrupt_object() { From patchwork Tue Sep 24 17:32:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13811081 Received: from mail-yb1-f179.google.com (mail-yb1-f179.google.com [209.85.219.179]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B75DE1AC89B for ; Tue, 24 Sep 2024 17:32:27 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.179 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199149; cv=none; b=e7EtjCOsIwFLRtzsBescQO1zF4KHG9hHs0dElEP1RdqD3Ob6ZyOWawUHCVY7oSNLHKTVyS0Bc8bAiJHo1yL1CAmKtsd2NH57ncTxcWMpk9jhsozPh/n/69T7YPRYoO0HAeGRB5INTm1dhnLtxQcM6gXCuqb0E1W+UAaYGb8jAS0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199149; c=relaxed/simple; bh=PPR0KUGWoQXvQ9N5oCgYRCD4y+oNPgM+laGsMSi7wHU=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=jlp2GGqmNgvOG5dgG6D689fDXUexwGUK7e0KvW2y2akDcY++Ltq4FuAMjPbXtpCKOQR88tbY8hc4qtmMWUvZlnsTh04Lwg5Thd8Cj3LG0W1fgSK3z00O6lEkAgl6AvbY8j5syHR2QDwHrKk5JZyJjwZmNc4aKD0eT157SW7BOC0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=szwEM6bj; arc=none smtp.client-ip=209.85.219.179 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="szwEM6bj" Received: by mail-yb1-f179.google.com with SMTP id 3f1490d57ef6-e1a9dc3efc1so5800706276.2 for ; Tue, 24 Sep 2024 10:32:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727199146; x=1727803946; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=BJjrakAcqMqCV9bvTcFxUPZIYP5wM6235OvJkaiITCQ=; b=szwEM6bjOQH7sLTNnZboplVZzXTgbQ7MzH4SiCS43FYjkC+Qbaj6HN3hu8uHEuL9Tq py9opeA1N9ZMoO9MnhmxSDvAO7isU848YiIrQCUNREnt4G0UZ3expIF/krBDhwOueAKe lZoC2x1LNae3krXI2MIaLwXEs1eS04jdB1mzL+5zE4e00Kfp45rOBGH3NZhoTEKZRrV4 1nsnFCQ37/QsbTY/30frW401YthNUGGduBbsPjcOnsnvuqqqXyn6j4VDW4MbtiV4xQ/b tBCeqOprp/r1r++TlvhJgrwuDCIh4HVFzRJ2OO+JwMUE2b/S3P/8BngLxq53NjpJ1QMO MN2A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727199146; x=1727803946; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=BJjrakAcqMqCV9bvTcFxUPZIYP5wM6235OvJkaiITCQ=; b=eEYonFzBF1Q+em45iIWz6ysE9kN4+o8brNmmPufaoV8dIZ8+GpQTC6A+DrgoBQgTkv q3k2fbfbTK0hqmqkFCqMy2vsSVSUzQGnLq1BuEIQsa9liYT57mJNNJLJX19IGW5pSZ5u NyoOjpyXLKJsxtLU7ssrh4YaZ9uF4quFa+efNypa4AYSorrnx/dq0XY8Xo2K+ek9X+Y4 yZTXpKNYyEyEXTjNZJ1L12ExFLjgOxC0C1IbJuqll89747YzxyyDCM9BhVCE2t8oJXMN tRkSmY7pCakfXGyZ4ZYqUJitYDql37F8Dfn/MMyAf7UiI1iBL/EDTFe8vNYFDiikbR+U wK8A== X-Gm-Message-State: AOJu0YyL+oB5ocSTLtZpKxD67kwSonVxzXEIJR8MSDyR5B1DEjib9j9b N9Qyx/t8VoxGX5L1a0OPF2qvNy0HCR4hYDLw0FIsJr4IBJiUts15H/v7/UClIB5p/zCwvbxqQaf LtgE= X-Google-Smtp-Source: AGHT+IFp+hEFl10gUG9His5SMUdhGmNdKva2fU7PZsSoeICPVCU6+lSdH13KJKRdtKl+W2t+okGWcA== X-Received: by 2002:a05:690c:d83:b0:6e2:11e4:2f58 with SMTP id 00721157ae682-6e21d6ec5e2mr2138627b3.7.1727199146487; Tue, 24 Sep 2024 10:32:26 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e20cffe0f6sm3142427b3.25.2024.09.24.10.32.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Sep 2024 10:32:26 -0700 (PDT) Date: Tue, 24 Sep 2024 13:32:24 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v4 5/8] sha1: do not redefine `platform_SHA_CTX` and friends Message-ID: <8f8ac0f5b0e6ddc774b9108049d7f9c3310735f3.1727199118.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Our in-tree SHA-1 wrappers all define platform_SHA_CTX and related macros to point at the opaque "context" type, init, update, and similar functions for each specific implementation. In hash.h, we use these platform_ variables to set up the function pointers for, e.g., the_hash_algo->init_fn(), etc. But while these header files have a header-specific macro that prevents them declaring their structs / functions multiple times, they unconditionally define the platform variables, making it impossible to load multiple SHA-1 implementations at once. As a prerequisite for loading a separate SHA-1 implementation for non-cryptographic uses, only define the platform_ variables if they have not already been defined. Signed-off-by: Taylor Blau --- block-sha1/sha1.h | 2 ++ sha1/openssl.h | 2 ++ sha1dc_git.h | 3 +++ 3 files changed, 7 insertions(+) diff --git a/block-sha1/sha1.h b/block-sha1/sha1.h index 9fb0441b988..47bb9166368 100644 --- a/block-sha1/sha1.h +++ b/block-sha1/sha1.h @@ -16,7 +16,9 @@ void blk_SHA1_Init(blk_SHA_CTX *ctx); void blk_SHA1_Update(blk_SHA_CTX *ctx, const void *dataIn, size_t len); void blk_SHA1_Final(unsigned char hashout[20], blk_SHA_CTX *ctx); +#ifndef platform_SHA_CTX #define platform_SHA_CTX blk_SHA_CTX #define platform_SHA1_Init blk_SHA1_Init #define platform_SHA1_Update blk_SHA1_Update #define platform_SHA1_Final blk_SHA1_Final +#endif diff --git a/sha1/openssl.h b/sha1/openssl.h index 006c1f4ba54..1038af47daf 100644 --- a/sha1/openssl.h +++ b/sha1/openssl.h @@ -40,10 +40,12 @@ static inline void openssl_SHA1_Clone(struct openssl_SHA1_CTX *dst, EVP_MD_CTX_copy_ex(dst->ectx, src->ectx); } +#ifndef platform_SHA_CTX #define platform_SHA_CTX openssl_SHA1_CTX #define platform_SHA1_Init openssl_SHA1_Init #define platform_SHA1_Clone openssl_SHA1_Clone #define platform_SHA1_Update openssl_SHA1_Update #define platform_SHA1_Final openssl_SHA1_Final +#endif #endif /* SHA1_OPENSSL_H */ diff --git a/sha1dc_git.h b/sha1dc_git.h index 60e3ce84395..f6f880cabea 100644 --- a/sha1dc_git.h +++ b/sha1dc_git.h @@ -18,7 +18,10 @@ void git_SHA1DCFinal(unsigned char [20], SHA1_CTX *); void git_SHA1DCUpdate(SHA1_CTX *ctx, const void *data, unsigned long len); #define platform_SHA_IS_SHA1DC /* used by "test-tool sha1-is-sha1dc" */ + +#ifndef platform_SHA_CTX #define platform_SHA_CTX SHA1_CTX #define platform_SHA1_Init git_SHA1DCInit #define platform_SHA1_Update git_SHA1DCUpdate #define platform_SHA1_Final git_SHA1DCFinal +#endif From patchwork Tue Sep 24 17:32:28 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13811082 Received: from mail-yb1-f169.google.com (mail-yb1-f169.google.com [209.85.219.169]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1D6141AC424 for ; Tue, 24 Sep 2024 17:32:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.169 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199152; cv=none; b=LZSZ5pkBSA2Ijb4AG3dOLN4IXtQc8S+jpW+VTo5Zs+eogKUqzadCOhZPAcITjKyfNwMXYNoiqY9clH5ImcY2PBivO9rgBlGkSjSX02LMUF4nDOucEnW2jgP43d++rOBkY7zpsRUElXSpkubz1873Zo43a/mHXbhm/ccqsZq9ucg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199152; c=relaxed/simple; bh=b63aMYZVWbypZvi/m4XzJBZmH5tJUPBB/lQITUpb0yE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=poQhmTq/RZ/TF2mhDpiQekD60Fx+VJuRU72AHsgdNG6MfZkqw7lAdWKCvs056UKiv6ols/wxv10xPPZZ/kOZT7ekdOjXRbirXzfs84fXl9c65a2DKLhjcnYvqTCkIlkthwGxRkJlK2shPAjLTd333xHOORBgRjLoPsybUuA15LQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=ChX7hMWW; arc=none smtp.client-ip=209.85.219.169 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="ChX7hMWW" Received: by mail-yb1-f169.google.com with SMTP id 3f1490d57ef6-e1a989bd17aso5045324276.1 for ; Tue, 24 Sep 2024 10:32:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727199149; x=1727803949; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=OAJw+96cLZL7uSh3byPCWym4U1G0sQPREeuewVhiFZQ=; b=ChX7hMWW4Xr3RjjAoNmaRN6KWlaunORs/d8m7oq4VH34zM+Z949r9p/BN+1OKo2xjL yV+G+dT1rqQwO+/TM/obKU3lCzUt9W1UU5lSF0g2rzcMEfyt/XJpOxCw2nLlpbD8sJlj zy2/xQwXctrDKk7n0WbGOEHaNJaoSVnSq9QkT0A8Gy9utLPKFSVh37NEWHTMpEYWgnKS Mw8yypWK5z7wRX+6eThp7p++uOzvlc3fuCip5wjnwFBNl+FhXf3j8riCyIIaR3W7jxjL T7rDeaHnIWkF1XXhcJBqxrhjASzzaCnMbpJoSxpk97sHj7dFKo66PfDiElcPP6pO95Ud L0kA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727199149; x=1727803949; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=OAJw+96cLZL7uSh3byPCWym4U1G0sQPREeuewVhiFZQ=; b=i7Z6hLcA1ivOfJJR3aX1Aeuvc9wq6g0MBwqFrOCH/uEgGsaKcnA0Oz+VQXyR0b111S 1aws89sFdOOG7vZcPsMKVz7YrKC46YdsbUcCeJMbR6UcS4BKxeFiKuZsocALTL2TeHGf Exg26pXQuFMI+Z58jBgp4btx8MpWnPFBzYCCEK8M0+z1cgEh60FXgozVnYp8p8tZlWHv T8fOYKskOoiY16C5wV7V53e9WkHgPgtJCgjoDmO0+hfIzaj5NH87k8BJccPtp8BYtMOZ SX3QO/+xzKNidEVCIs9RSOYcF45quPHv1m1KcEZztRbrG9NWZEvNr0K1xvHUzmfiSlrJ X4gQ== X-Gm-Message-State: AOJu0YzhRbe8PE7yzIXqMoyS/NeoktCbnLAi6HT53rw+6S8MVA+k8BXj FooZqkB6O9k7kfaRdOsdD2//+2JbL1GBUd4STfYhhfxAx32iOxuS20TUEZcg60IOQmyGn4cpjXR P5MA= X-Google-Smtp-Source: AGHT+IG8fhsGrrswbdUdYosu/Ge5BNoDAuEQmWrlKgAVJdK5096wL/JJ5z7jOg3c8MQEhyOVr5Ui4g== X-Received: by 2002:a05:6902:250a:b0:e1f:eb2b:5502 with SMTP id 3f1490d57ef6-e2250c3b434mr11405535276.13.1727199149664; Tue, 24 Sep 2024 10:32:29 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e2499c63d11sm317292276.42.2024.09.24.10.32.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Sep 2024 10:32:29 -0700 (PDT) Date: Tue, 24 Sep 2024 13:32:28 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v4 6/8] hash.h: scaffolding for _unsafe hashing variants Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Git's default SHA-1 implementation is collision-detecting, which hardens us against known SHA-1 attacks against Git objects. This makes Git object writes safer at the expense of some speed when hashing through the collision-detecting implementation, which is slower than non-collision detecting alternatives. Prepare for loading a separate "unsafe" SHA-1 implementation that can be used for non-cryptographic purposes, like computing the checksum of files that use the hashwrite() API. This commit does not actually introduce any new compile-time knobs to control which implementation is used as the unsafe SHA-1 variant, but does add scaffolding so that the "git_hash_algo" structure has five new function pointers which are "unsafe" variants of the five existing hashing-related function pointers: - git_hash_init_fn unsafe_init_fn - git_hash_clone_fn unsafe_clone_fn - git_hash_update_fn unsafe_update_fn - git_hash_final_fn unsafe_final_fn - git_hash_final_oid_fn unsafe_final_oid_fn The following commit will introduce compile-time knobs to specify which SHA-1 implementation is used for non-cryptographic uses. Signed-off-by: Taylor Blau --- hash.h | 42 ++++++++++++++++++++++++++++++++++++++++++ object-file.c | 42 ++++++++++++++++++++++++++++++++++++++++++ 2 files changed, 84 insertions(+) diff --git a/hash.h b/hash.h index 72ffbc862e5..96458b129f9 100644 --- a/hash.h +++ b/hash.h @@ -44,14 +44,32 @@ #define platform_SHA1_Final SHA1_Final #endif +#ifndef platform_SHA_CTX_unsafe +# define platform_SHA_CTX_unsafe platform_SHA_CTX +# define platform_SHA1_Init_unsafe platform_SHA1_Init +# define platform_SHA1_Update_unsafe platform_SHA1_Update +# define platform_SHA1_Final_unsafe platform_SHA1_Final +# ifdef platform_SHA1_Clone +# define platform_SHA1_Clone_unsafe platform_SHA1_Clone +# endif +#endif + #define git_SHA_CTX platform_SHA_CTX #define git_SHA1_Init platform_SHA1_Init #define git_SHA1_Update platform_SHA1_Update #define git_SHA1_Final platform_SHA1_Final +#define git_SHA_CTX_unsafe platform_SHA_CTX_unsafe +#define git_SHA1_Init_unsafe platform_SHA1_Init_unsafe +#define git_SHA1_Update_unsafe platform_SHA1_Update_unsafe +#define git_SHA1_Final_unsafe platform_SHA1_Final_unsafe + #ifdef platform_SHA1_Clone #define git_SHA1_Clone platform_SHA1_Clone #endif +#ifdef platform_SHA1_Clone_unsafe +# define git_SHA1_Clone_unsafe platform_SHA1_Clone_unsafe +#endif #ifndef platform_SHA256_CTX #define platform_SHA256_CTX SHA256_CTX @@ -81,6 +99,13 @@ static inline void git_SHA1_Clone(git_SHA_CTX *dst, const git_SHA_CTX *src) memcpy(dst, src, sizeof(*dst)); } #endif +#ifndef SHA1_NEEDS_CLONE_HELPER_UNSAFE +static inline void git_SHA1_Clone_unsafe(git_SHA_CTX_unsafe *dst, + const git_SHA_CTX_unsafe *src) +{ + memcpy(dst, src, sizeof(*dst)); +} +#endif #ifndef SHA256_NEEDS_CLONE_HELPER static inline void git_SHA256_Clone(git_SHA256_CTX *dst, const git_SHA256_CTX *src) @@ -178,6 +203,8 @@ enum get_oid_result { /* A suitably aligned type for stack allocations of hash contexts. */ union git_hash_ctx { git_SHA_CTX sha1; + git_SHA_CTX_unsafe sha1_unsafe; + git_SHA256_CTX sha256; }; typedef union git_hash_ctx git_hash_ctx; @@ -222,6 +249,21 @@ struct git_hash_algo { /* The hash finalization function for object IDs. */ git_hash_final_oid_fn final_oid_fn; + /* The non-cryptographic hash initialization function. */ + git_hash_init_fn unsafe_init_fn; + + /* The non-cryptographic hash context cloning function. */ + git_hash_clone_fn unsafe_clone_fn; + + /* The non-cryptographic hash update function. */ + git_hash_update_fn unsafe_update_fn; + + /* The non-cryptographic hash finalization function. */ + git_hash_final_fn unsafe_final_fn; + + /* The non-cryptographic hash finalization function. */ + git_hash_final_oid_fn unsafe_final_oid_fn; + /* The OID of the empty tree. */ const struct object_id *empty_tree; diff --git a/object-file.c b/object-file.c index b9a3a1f62db..196c9e2df8b 100644 --- a/object-file.c +++ b/object-file.c @@ -115,6 +115,33 @@ static void git_hash_sha1_final_oid(struct object_id *oid, git_hash_ctx *ctx) oid->algo = GIT_HASH_SHA1; } +static void git_hash_sha1_init_unsafe(git_hash_ctx *ctx) +{ + git_SHA1_Init_unsafe(&ctx->sha1_unsafe); +} + +static void git_hash_sha1_clone_unsafe(git_hash_ctx *dst, const git_hash_ctx *src) +{ + git_SHA1_Clone_unsafe(&dst->sha1_unsafe, &src->sha1_unsafe); +} + +static void git_hash_sha1_update_unsafe(git_hash_ctx *ctx, const void *data, + size_t len) +{ + git_SHA1_Update_unsafe(&ctx->sha1_unsafe, data, len); +} + +static void git_hash_sha1_final_unsafe(unsigned char *hash, git_hash_ctx *ctx) +{ + git_SHA1_Final_unsafe(hash, &ctx->sha1_unsafe); +} + +static void git_hash_sha1_final_oid_unsafe(struct object_id *oid, git_hash_ctx *ctx) +{ + git_SHA1_Final_unsafe(oid->hash, &ctx->sha1_unsafe); + memset(oid->hash + GIT_SHA1_RAWSZ, 0, GIT_MAX_RAWSZ - GIT_SHA1_RAWSZ); + oid->algo = GIT_HASH_SHA1; +} static void git_hash_sha256_init(git_hash_ctx *ctx) { @@ -189,6 +216,11 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { .update_fn = git_hash_unknown_update, .final_fn = git_hash_unknown_final, .final_oid_fn = git_hash_unknown_final_oid, + .unsafe_init_fn = git_hash_unknown_init, + .unsafe_clone_fn = git_hash_unknown_clone, + .unsafe_update_fn = git_hash_unknown_update, + .unsafe_final_fn = git_hash_unknown_final, + .unsafe_final_oid_fn = git_hash_unknown_final_oid, .empty_tree = NULL, .empty_blob = NULL, .null_oid = NULL, @@ -204,6 +236,11 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { .update_fn = git_hash_sha1_update, .final_fn = git_hash_sha1_final, .final_oid_fn = git_hash_sha1_final_oid, + .unsafe_init_fn = git_hash_sha1_init_unsafe, + .unsafe_clone_fn = git_hash_sha1_clone_unsafe, + .unsafe_update_fn = git_hash_sha1_update_unsafe, + .unsafe_final_fn = git_hash_sha1_final_unsafe, + .unsafe_final_oid_fn = git_hash_sha1_final_oid_unsafe, .empty_tree = &empty_tree_oid, .empty_blob = &empty_blob_oid, .null_oid = &null_oid_sha1, @@ -219,6 +256,11 @@ const struct git_hash_algo hash_algos[GIT_HASH_NALGOS] = { .update_fn = git_hash_sha256_update, .final_fn = git_hash_sha256_final, .final_oid_fn = git_hash_sha256_final_oid, + .unsafe_init_fn = git_hash_sha256_init, + .unsafe_clone_fn = git_hash_sha256_clone, + .unsafe_update_fn = git_hash_sha256_update, + .unsafe_final_fn = git_hash_sha256_final, + .unsafe_final_oid_fn = git_hash_sha256_final_oid, .empty_tree = &empty_tree_oid_sha256, .empty_blob = &empty_blob_oid_sha256, .null_oid = &null_oid_sha256, From patchwork Tue Sep 24 17:32:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13811083 Received: from mail-yb1-f171.google.com (mail-yb1-f171.google.com [209.85.219.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 02AC21AC424 for ; Tue, 24 Sep 2024 17:32:34 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199156; cv=none; b=A0Nrso7j4cE6c9NJ8oHDvKY0zEBocuDxGCCagaph/tLIMRDfSJ5vXePngY69DApK05D8V4nD5r6L7gQlwEiqedyGjm26qnxzWNJGMrM0Fl/ViO5V4Mfp1mtlJvrmKyh8wjFKYDtkELXPIc6lAkmoeKIzvK1ymlmSSr4I8FnAsMM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199156; c=relaxed/simple; bh=+WM9XWHSEJ3ncDjE3zebOmDr8bXcBSX6wjlrZDms4Q8=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=iaXv2PtgLwzG9+Oe/LS5oLQpX+/G/YK30SNPsFqJxuK2bedO9lqMBj9MLbBISq1pkfLGdfUki+C9BCEmS/GvDmSG2Eiu+1HhpwBLztoKiAKk6y8s8w94rAIr4K0QVxGqSJzrBUYedpYGdbGnrrp+oNwpZWuMM6tXAdYWegSQhQw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=nUjifs+X; arc=none smtp.client-ip=209.85.219.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="nUjifs+X" Received: by mail-yb1-f171.google.com with SMTP id 3f1490d57ef6-e22531db3baso3792130276.3 for ; Tue, 24 Sep 2024 10:32:34 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727199153; x=1727803953; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=g4Q6POxtDbrGJ5dnpl5kdnB8d0HovRP9OhWdbVm6isk=; b=nUjifs+X/1ON43waEDJ3QNaZLsKIQ3R/UVMApnhqnHf4v0YBaTWsuj5GA+18ZP7NQa 8tevyjOqerwjBvij7MGTxa8nDz847O9NZAULipDpNuWDZZXFnDYGbjWMU9HBa3NHnpJE wQoSitw3XxtlfSiMrqdVY96YdSswr50epesTuI1FFxOt7G0CU1MDBCqnL6v5/Z32hbyf 3zkSzDZZRrcy3lvMpdDpct7jUYmnslbsmCBdjVmZt3UCDu/mam/5+O9uBTIeewWgwEnW oXz8SY8zcmssoIae18aTSLYtE286AlqyfiPNzFO1R1Is21D+FPYE63BJhHHs4DKF/zE6 iW0A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727199153; x=1727803953; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=g4Q6POxtDbrGJ5dnpl5kdnB8d0HovRP9OhWdbVm6isk=; b=EXd1HnbSpaJ1reHy+wsKIDhGSM+0Uc8KDnkwffpW9KJ7nPnecRmUzAJOaC8jxJSSQF VhxUJYaPnSWmwaY1n8c3Nt+1CaIxc6o+J0xEIC4gLGqDloMkGhibjKEyoBxa+F7WEU4a i4UkUuQPqP9FY9Bm6tBOkjwsXrc3jHRYCD6Gx4Hdp6pAqA0DIiljAWcvDcdZRIkqDCmX EzeyZAcraHtIydx6ux++pDiSqrfrALHt7hSjY7Om5RUqHkaAANOQOe/aQdKdAtOLDaK3 ZBxBSEpbagOARs8zGh9x2FRrznJN6oVSNsIfhAfL+7YcRpiRRy6ipbQum+7A2/50XDbw 6cvw== X-Gm-Message-State: AOJu0YzYEaz6dYDnnoIIYDVgBi7tOOhsfzXO+lwzzr4D/p4D1vPvfl+A jc809aq6Jyh0qoOhprd8AAvXJSI1nG5uxPhhBSlRgetA9n7Ka72VNcE8QwO1xRP15oJIyyGffjn sGXg= X-Google-Smtp-Source: AGHT+IFQb0KgEff82Qd2b13z5Wr3qXshW/NfXn5zDGCBofkazDjKXgjmZsTeRL0JnFqehrDqAvP2zw== X-Received: by 2002:a05:6902:f82:b0:e20:1165:8da2 with SMTP id 3f1490d57ef6-e2250c47748mr12184023276.29.1727199153593; Tue, 24 Sep 2024 10:32:33 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e2499ae6a78sm319402276.6.2024.09.24.10.32.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Sep 2024 10:32:32 -0700 (PDT) Date: Tue, 24 Sep 2024 13:32:31 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v4 7/8] Makefile: allow specifying a SHA-1 for non-cryptographic uses Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Introduce _UNSAFE variants of the OPENSSL_SHA1, BLK_SHA1, and APPLE_COMMON_CRYPTO_SHA1 compile-time knobs which indicate which SHA-1 implementation is to be used for non-cryptographic uses. There are a couple of small implementation notes worth mentioning: - There is no way to select the collision detecting SHA-1 as the "fast" fallback, since the fast fallback is only for non-cryptographic uses, and is meant to be faster than our collision-detecting implementation. - There are no similar knobs for SHA-256, since no collision attacks are presently known and thus no collision-detecting implementations actually exist. Signed-off-by: Taylor Blau --- Makefile | 25 +++++++++++++++++++++++++ hash.h | 30 ++++++++++++++++++++++++++++++ 2 files changed, 55 insertions(+) diff --git a/Makefile b/Makefile index 9cf2be070fa..fb84a87592d 100644 --- a/Makefile +++ b/Makefile @@ -521,6 +521,10 @@ include shared.mak # Define APPLE_COMMON_CRYPTO_SHA1 to use Apple's CommonCrypto for # SHA-1. # +# Define the same Makefile knobs as above, but suffixed with _UNSAFE to +# use the corresponding implementations for unsafe SHA-1 hashing for +# non-cryptographic purposes. +# # If don't enable any of the *_SHA1 settings in this section, Git will # default to its built-in sha1collisiondetection library, which is a # collision-detecting sha1 This is slower, but may detect attempted @@ -1996,6 +2000,27 @@ endif endif endif +ifdef OPENSSL_SHA1_UNSAFE +ifndef OPENSSL_SHA1 + EXTLIBS += $(LIB_4_CRYPTO) + BASIC_CFLAGS += -DSHA1_OPENSSL_UNSAFE +endif +else +ifdef BLK_SHA1_UNSAFE +ifndef BLK_SHA1 + LIB_OBJS += block-sha1/sha1.o + BASIC_CFLAGS += -DSHA1_BLK_UNSAFE +endif +else +ifdef APPLE_COMMON_CRYPTO_SHA1_UNSAFE +ifndef APPLE_COMMON_CRYPTO_SHA1 + COMPAT_CFLAGS += -DCOMMON_DIGEST_FOR_OPENSSL + BASIC_CFLAGS += -DSHA1_APPLE_UNSAFE +endif +endif +endif +endif + ifdef OPENSSL_SHA256 EXTLIBS += $(LIB_4_CRYPTO) BASIC_CFLAGS += -DSHA256_OPENSSL diff --git a/hash.h b/hash.h index 96458b129f9..f97f858307b 100644 --- a/hash.h +++ b/hash.h @@ -15,6 +15,36 @@ #include "block-sha1/sha1.h" #endif +#if defined(SHA1_APPLE_UNSAFE) +# include +# define platform_SHA_CTX_unsafe CC_SHA1_CTX +# define platform_SHA1_Init_unsafe CC_SHA1_Init +# define platform_SHA1_Update_unsafe CC_SHA1_Update +# define platform_SHA1_Final_unsafe CC_SHA1_Final +#elif defined(SHA1_OPENSSL_UNSAFE) +# include +# if defined(OPENSSL_API_LEVEL) && OPENSSL_API_LEVEL >= 3 +# define SHA1_NEEDS_CLONE_HELPER_UNSAFE +# include "sha1/openssl.h" +# define platform_SHA_CTX_unsafe openssl_SHA1_CTX +# define platform_SHA1_Init_unsafe openssl_SHA1_Init +# define platform_SHA1_Clone_unsafe openssl_SHA1_Clone +# define platform_SHA1_Update_unsafe openssl_SHA1_Update +# define platform_SHA1_Final_unsafe openssl_SHA1_Final +# else +# define platform_SHA_CTX_unsafe SHA_CTX +# define platform_SHA1_Init_unsafe SHA1_Init +# define platform_SHA1_Update_unsafe SHA1_Update +# define platform_SHA1_Final_unsafe SHA1_Final +# endif +#elif defined(SHA1_BLK_UNSAFE) +# include "block-sha1/sha1.h" +# define platform_SHA_CTX_unsafe blk_SHA_CTX +# define platform_SHA1_Init_unsafe blk_SHA1_Init +# define platform_SHA1_Update_unsafe blk_SHA1_Update +# define platform_SHA1_Final_unsafe blk_SHA1_Final +#endif + #if defined(SHA256_NETTLE) #include "sha256/nettle.h" #elif defined(SHA256_GCRYPT) From patchwork Tue Sep 24 17:32:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13811084 Received: from mail-yw1-f176.google.com (mail-yw1-f176.google.com [209.85.128.176]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 614841AC424 for ; Tue, 24 Sep 2024 17:32:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.176 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199160; cv=none; b=f1p6JZiq+xAGU5ay6Fi8QQc2ZyNugdO4tgwFGJbjBj6N9AbUqL8ut0GSmMZKwBiYrcrDmhrkRo6EUeNGfA9IZML7i2NdT4wt8GN92UgBx45MQuUoB/Eah+uKjyHtuTw+RCql1jVsKt07289r2xQMYsHyb+zBRXTmAh0/c0sA5Mw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1727199160; c=relaxed/simple; bh=NTLuwDjSEVo2enM6qrGpPVedmwmZAHc9WQo6iK9j2mc=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Yxf3E1AJpb3Ax/Z5mf3XgecJHcyjrrTJdlfKbyyWKaq5kTVJVkNHdclx2HPRAjX/Yfz0cGV50WfUEakS3xlVnkjVIMVqvVNkMiZ39PwhoTm9jSlPhnLdF2jdjSUXG4yJgx8d3igHSQBGzyZ4vPVrgTYs0eSPhV5n2ee8xfh2Nbk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=MVB9r6VL; arc=none smtp.client-ip=209.85.128.176 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="MVB9r6VL" Received: by mail-yw1-f176.google.com with SMTP id 00721157ae682-6e1f48e7c18so21152747b3.3 for ; Tue, 24 Sep 2024 10:32:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1727199157; x=1727803957; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=c0lIMfZ9NfUkQbUEE/VkzFEp0wafK+H/olc9cy1kxFY=; b=MVB9r6VLOtgvC/sayM9Gi9G2q3/yQh8yujCYXqDevPg7DdIA7f6tHtMJCxrhUBdSYY SSsNKhRgD0sNqqqIjRBNQIc52gpps4muCyjlA14rND52rMgR671/zyn185iIIF1LUU7o yhh9KSPxuFFtmH5tfzMM8lIRbK2sw2SeCsUt2Jg6sifA7OIFJrAAmRvqd0HLc4A2kxHX jZrEkucn9pbGzhN5uew7fsNRLJddbqgrK3BcHSQWGaNIF0Ayq96vlU2eGKbf7uY3oKoM IHauvYAl+VaTiXlhD3JacbMlNgV1tjF9oIOzeC+/zjNcxsVHZHue2OO1kSdKEZ2VuYtG /wsQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1727199157; x=1727803957; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=c0lIMfZ9NfUkQbUEE/VkzFEp0wafK+H/olc9cy1kxFY=; b=O4HW6kftZz1noOV4+9pV+tHn9nlu1gY5eDDEHi/jZ3fK8LOKMuGdaRTQycDAgFW9b1 xPgwd9rmvy8W4kkqbInhquyr/BQcXNX7jUB7wfMVBzKF46JcHWfwSEhJ3rODM+FdLAXA wuIY8GpGZHrLYH1ZFqzdJbajVR47n8Hz85KxC/48KYXirUU21wFclFopwCS0mu260D7C M6Uf2zGfihlrziiPENbHXRLS5ZK0c1P8RpkZDFyaN+/YfZrXck1e3I3xdwJ63TxPEZE1 1kOsNHZg5CVb0q2adR746FAAKNfYIYS3ETQgR1ObQyWaFQHDU6csTsoMNRPl9dnsAIZq SVWQ== X-Gm-Message-State: AOJu0Yy3k+Pgn5HD9pRWT6rrBEfbOIlR00FJPaTS4EsVc0WyvJtKlsd+ 2Dv8GnB1V1wrPf3cPYBj4mxn4wD+ZzjZ1Xxij1UfrHnCRH6hGkkc37sARCzWJe0hKAVQEt29X8a Qqc4= X-Google-Smtp-Source: AGHT+IGDuvAhgzuMc+kXitqrQUQTd7YYGD1xLPy8155O8D++1cmVh8TJiTF4JfJt0a0LGmERdLdbOg== X-Received: by 2002:a05:690c:4a90:b0:6db:e2a3:4158 with SMTP id 00721157ae682-6e21db97daamr1007027b3.46.1727199156849; Tue, 24 Sep 2024 10:32:36 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e20d03d4e6sm3160707b3.31.2024.09.24.10.32.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 24 Sep 2024 10:32:36 -0700 (PDT) Date: Tue, 24 Sep 2024 13:32:35 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , "brian m. carlson" , Elijah Newren , Patrick Steinhardt , Junio C Hamano Subject: [PATCH v4 8/8] csum-file.c: use unsafe SHA-1 implementation when available Message-ID: <4b83dd05e9fe55e1ebd07f4b60831e1ddd404c0a.1727199118.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Update hashwrite() and friends to use the unsafe_-variants of hashing functions, calling for e.g., "the_hash_algo->unsafe_update_fn()" instead of "the_hash_algo->update_fn()". These callers only use the_hash_algo to produce a checksum, which we depend on for data integrity, but not for cryptographic purposes, so these callers are safe to use the unsafe (and potentially non-collision detecting) SHA-1 implementation. To time this, I took a freshly packed copy of linux.git, and ran the following with and without the OPENSSL_SHA1_UNSAFE=1 build-knob. Both versions were compiled with -O3: $ git for-each-ref --format='%(objectname)' refs/heads refs/tags >in $ valgrind --tool=callgrind ~/src/git/git-pack-objects \ --revs --stdout --all-progress --use-bitmap-index /dev/null Without OPENSSL_SHA1_UNSAFE=1 (that is, using the collision-detecting SHA-1 implementation for both cryptographic and non-cryptographic purposes), we spend a significant amount of our instruction count in hashwrite(): $ callgrind_annotate --inclusive=yes | grep hashwrite | head -n1 159,998,868,413 (79.42%) /home/ttaylorr/src/git/csum-file.c:hashwrite [/home/ttaylorr/src/git/git-pack-objects] , and the resulting "clone" takes 19.219 seconds of wall clock time, 18.94 seconds of user time and 0.28 seconds of system time. Compiling with OPENSSL_SHA1_UNSAFE=1, we spend ~60% fewer instructions in hashwrite(): $ callgrind_annotate --inclusive=yes | grep hashwrite | head -n1 59,164,001,176 (58.79%) /home/ttaylorr/src/git/csum-file.c:hashwrite [/home/ttaylorr/src/git/git-pack-objects] , and generate the resulting "clone" much unsafeer, in only 11.597 seconds of wall time, 11.37 seconds of user time, and 0.23 seconds of system time, for a ~40% speed-up. Signed-off-by: Taylor Blau --- csum-file.c | 18 +++++++++--------- 1 file changed, 9 insertions(+), 9 deletions(-) diff --git a/csum-file.c b/csum-file.c index bf82ad8f9f5..c203ebf11b3 100644 --- a/csum-file.c +++ b/csum-file.c @@ -50,7 +50,7 @@ void hashflush(struct hashfile *f) if (offset) { if (!f->skip_hash) - the_hash_algo->update_fn(&f->ctx, f->buffer, offset); + the_hash_algo->unsafe_update_fn(&f->ctx, f->buffer, offset); flush(f, f->buffer, offset); f->offset = 0; } @@ -73,7 +73,7 @@ int finalize_hashfile(struct hashfile *f, unsigned char *result, if (f->skip_hash) hashclr(f->buffer, the_repository->hash_algo); else - the_hash_algo->final_fn(f->buffer, &f->ctx); + the_hash_algo->unsafe_final_fn(f->buffer, &f->ctx); if (result) hashcpy(result, f->buffer, the_repository->hash_algo); @@ -128,7 +128,7 @@ void hashwrite(struct hashfile *f, const void *buf, unsigned int count) * f->offset is necessarily zero. */ if (!f->skip_hash) - the_hash_algo->update_fn(&f->ctx, buf, nr); + the_hash_algo->unsafe_update_fn(&f->ctx, buf, nr); flush(f, buf, nr); } else { /* @@ -174,7 +174,7 @@ static struct hashfile *hashfd_internal(int fd, const char *name, f->name = name; f->do_crc = 0; f->skip_hash = 0; - the_hash_algo->init_fn(&f->ctx); + the_hash_algo->unsafe_init_fn(&f->ctx); f->buffer_len = buffer_len; f->buffer = xmalloc(buffer_len); @@ -208,7 +208,7 @@ void hashfile_checkpoint(struct hashfile *f, struct hashfile_checkpoint *checkpo { hashflush(f); checkpoint->offset = f->total; - the_hash_algo->clone_fn(&checkpoint->ctx, &f->ctx); + the_hash_algo->unsafe_clone_fn(&checkpoint->ctx, &f->ctx); } int hashfile_truncate(struct hashfile *f, struct hashfile_checkpoint *checkpoint) @@ -219,7 +219,7 @@ int hashfile_truncate(struct hashfile *f, struct hashfile_checkpoint *checkpoint lseek(f->fd, offset, SEEK_SET) != offset) return -1; f->total = offset; - the_hash_algo->clone_fn(&f->ctx, &checkpoint->ctx); + the_hash_algo->unsafe_clone_fn(&f->ctx, &checkpoint->ctx); f->offset = 0; /* hashflush() was called in checkpoint */ return 0; } @@ -245,9 +245,9 @@ int hashfile_checksum_valid(const unsigned char *data, size_t total_len) if (total_len < the_hash_algo->rawsz) return 0; /* say "too short"? */ - the_hash_algo->init_fn(&ctx); - the_hash_algo->update_fn(&ctx, data, data_len); - the_hash_algo->final_fn(got, &ctx); + the_hash_algo->unsafe_init_fn(&ctx); + the_hash_algo->unsafe_update_fn(&ctx, data, data_len); + the_hash_algo->unsafe_final_fn(got, &ctx); return hasheq(got, data + data_len, the_repository->hash_algo); }