From patchwork Wed Oct 9 20:31:01 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829206 Received: from mail-yw1-f180.google.com (mail-yw1-f180.google.com [209.85.128.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 22D471DFE2B for ; Wed, 9 Oct 2024 20:31:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505865; cv=none; b=j50RqB1DGjA6tJQTIqLk4LP2cVTcCtu6S2n/WNE3YKVAGYd3h2uoOkZEghZ/Hz5thnTt7d0uT1y1wBTDSY2rGmYZgB0v60yIyulhxVA19hX4110r2GalZCOizq/xvlCxnPSQ/ZuGB3nzuMyoFkQpl6CN1ajgwB3JwMc1wxfOpGA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505865; c=relaxed/simple; bh=vKIPL9Fr/i9KtxXEyK7ZSY6cNCKrYkZAIgUa4+Z6/KA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Hpjj0wq3KqMNntEGzmFT41dmG3bPcEY4w1GxACvaCMNP3IgXTuioxdMJgpTvoKBfRgMAJpS3EpygKzE8lvcabznuRXmVm5lJhV9ViRtdCENu40JdKRnxxU/fbNp0pK13agHfXq1HHdEwXt1Txt8FCoZ+dksc8S9f+DS1WUXrvTM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=BPHWTig3; arc=none smtp.client-ip=209.85.128.180 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="BPHWTig3" Received: by mail-yw1-f180.google.com with SMTP id 00721157ae682-6e2772f7df9so2377397b3.2 for ; Wed, 09 Oct 2024 13:31:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505863; x=1729110663; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=oFyixOplwrZsxrWxVQzNGlIn5uVn9Hg+dQ+hQe1FUK8=; b=BPHWTig3uyHzOxZdD13m1UFWwGp2EYmsN7x5foY6x0yhkVGN48mwbvqqSqiVot4Jib +mKd3En8vIm6+BayuB5Zx4h3lPsq/oUkN4rw0WHQAYOoCcM0UJQGl5ehOxMW6tzQsqD9 3nGtYTor8yFj7/KFHGy5lmwvcsvrHJ2oS3jgXuTCfEZ9qWAFtzMnEeotKpLgbqOSJyHa n3vFPHuXERU1wsPezjDdvbsq5oFxPWsMyKcQRfPBT4TKfyHTK+nrq6TSUCy59PolhmwV E1xCY16dsGeKv8pBvB+Yn0E9ILsxh4RUrJo0oeKtTBc73iL2+f1OMF0sHBUKD66Y5zcP op5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505863; x=1729110663; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=oFyixOplwrZsxrWxVQzNGlIn5uVn9Hg+dQ+hQe1FUK8=; b=dmHwvbmpRK2cNp6Ds3WG+p+LR5xpEzXlb/yCXK22jCiF6RCU0y2aQbtKKRdfsl2hLE FLqPWCORb8PIIPV8q4jaxSVMvg7xrI1HteYr+rY19aglcxcMbxweg7N7L6vZK3vdbIhI iK9mM3lU0k/uD0uQkbPKwnfIvBUQupnq/K3GXRUR4/lXnNq7B15IntXp38t/NxZOQjVP FSn3PpkXRk8euF8qASxX+5VnGAzOxTUpJFh4kJSI/3dT22oPOBBr3QUHrt4IjiyMJ0dj 1/acM9IAtqvRl42clCiHcp1WBhdTWAtU+wPDdapalA1mZHG6C+jjivQn6hLQaxWrbzfU oaeg== X-Gm-Message-State: AOJu0YysTGPY38lMZaRDpYM3N58L59oL7vR3AAhC5fTrbspKptnu0FAZ M5fyd9D0r+dBt38JKqYRbhRp554JocV3J61TJThVTDPZi1ifiVP8cFnv8AcSFLZFaiVTCPiW8H3 XiVc= X-Google-Smtp-Source: AGHT+IG+aEDbUa079rXBa/2VlHEGq1I5y29U9tEwF4DYXPb0NySbe96LYNKUZfGT99dxvRANtRKGZg== X-Received: by 2002:a05:690c:62c2:b0:6b5:409e:654 with SMTP id 00721157ae682-6e32215c7f1mr47872947b3.24.1728505862846; Wed, 09 Oct 2024 13:31:02 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e2d937c4d9sm19881827b3.56.2024.10.09.13.31.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:02 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:01 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 01/11] pack-bitmap.c: do not pass `pack_pos` to `try_partial_reuse()` Message-ID: <80f96385a7724fd2f8bf10c42ccce5b0b454a107.1728505840.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The function `try_partial_reuse()` uses its `pack_pos` parameter to determine if: - the object specified by that position is in bounds of the pack it's operating on, and - in the case that the specified object is an OFS_DELTA (and we're operating on a single-pack reachability bitmap) to ensure that the position of the base object occurs earlier than the delta But neither of these checks are necessary. The bounds-check is redundant because we are operating on bit positions which we know correspond to objects in some pack, and the caller in reuse_partial_packfile_from_bitmap_1() is smart enough to know to stop processing bits when it is at the end of some pack. Likewise, the delta dependency check is also unnecessary, because we can use the object offsets within a single pack as a proxy for bit positions in that pack's bitmap. So let's avoid passing in this redundant piece of information, and save one call to offset_to_pack_pos(), which is O(log N) in the number of objects in the pack we're currently processing. (This all comes from a discussion on a semi-related series from [1]). [1]: https://lore.kernel.org/git/Zti0xrzCtpWScPjz@nand.local/ Suggested-by: Junio C Hamano Signed-off-by: Taylor Blau --- pack-bitmap.c | 21 +++++++++------------ 1 file changed, 9 insertions(+), 12 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 9d9b8c4bfbc..706ff26a7b1 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2054,7 +2054,6 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, static int try_partial_reuse(struct bitmap_index *bitmap_git, struct bitmapped_pack *pack, size_t bitmap_pos, - uint32_t pack_pos, off_t offset, struct bitmap *reuse, struct pack_window **w_curs) @@ -2063,9 +2062,6 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, enum object_type type; unsigned long size; - if (pack_pos >= pack->p->num_objects) - return -1; /* not actually in the pack */ - delta_obj_offset = offset; type = unpack_object_header(pack->p, w_curs, &offset, &size); if (type < 0) @@ -2121,8 +2117,13 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, * OFS_DELTA is the default). But let's double check to * make sure the pack wasn't written with odd * parameters. + * + * Since we're working on a single-pack bitmap, we can + * use the object offset as a proxy for the bit + * position, since the bits are ordered by their + * positions within the pack. */ - if (base_pos >= pack_pos) + if (base_offset >= offset) return 0; base_bitmap_pos = pack->bitmap_pos + base_pos; } @@ -2184,7 +2185,6 @@ static void reuse_partial_packfile_from_bitmap_1(struct bitmap_index *bitmap_git for (offset = 0; offset < BITS_IN_EWORD; offset++) { size_t bit_pos; - uint32_t pack_pos; off_t ofs; if (word >> offset == 0) @@ -2203,12 +2203,9 @@ static void reuse_partial_packfile_from_bitmap_1(struct bitmap_index *bitmap_git midx_pos = pack_pos_to_midx(bitmap_git->midx, bit_pos); ofs = nth_midxed_offset(bitmap_git->midx, midx_pos); - - if (offset_to_pack_pos(pack->p, ofs, &pack_pos) < 0) - BUG("could not find object in pack %s " - "at offset %"PRIuMAX" in MIDX", - pack_basename(pack->p), (uintmax_t)ofs); } else { + uint32_t pack_pos; + pack_pos = cast_size_t_to_uint32_t(st_sub(bit_pos, pack->bitmap_pos)); if (pack_pos >= pack->p->num_objects) BUG("advanced beyond the end of pack %s (%"PRIuMAX" > %"PRIu32")", @@ -2219,7 +2216,7 @@ static void reuse_partial_packfile_from_bitmap_1(struct bitmap_index *bitmap_git } if (try_partial_reuse(bitmap_git, pack, bit_pos, - pack_pos, ofs, reuse, &w_curs) < 0) { + ofs, reuse, &w_curs) < 0) { /* * try_partial_reuse indicated we couldn't reuse * any bits, so there is no point in trying more From patchwork Wed Oct 9 20:31:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829207 Received: from mail-yw1-f182.google.com (mail-yw1-f182.google.com [209.85.128.182]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 791751DFE2B for ; Wed, 9 Oct 2024 20:31:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.182 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505869; cv=none; b=HSVTrxajdSEIj6jnaNAq7pfb5z5ryYTgRIIElCC7FYe33QUM8PTFc8CepmLY6VO+fFQZkaEPkLtJ7t+c1XHDpFpy9Ssb8I2KYkUGtu35KRjcTy40r1ZEE6KEcjwowkYvOa4JujkZuLjMViOGigEBQuoiSuMgoz2/CHuK/5OpHsk= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505869; c=relaxed/simple; bh=DVx7x+Zdb3aUIQzTYhR9tZXixJHgHLyflV7H8HMldoA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=rvwBTPi99hf+UnH/ppNrmRA9FKMnoz4RiRfRMp7b1IH2XeXqHFBTSwWBpibpokmZdov3zmbNpo8bsEiCUwHqGszbor6+Z5xYgjJGCqnV8eLBdjot3KMjZAya5urHGCMdw2tjj4Xc1Yh14eXbr/SI1h2QmPo/E0E785Ni6RcoksM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=2UJPWpUE; arc=none smtp.client-ip=209.85.128.182 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="2UJPWpUE" Received: by mail-yw1-f182.google.com with SMTP id 00721157ae682-6e2d2447181so2661517b3.1 for ; Wed, 09 Oct 2024 13:31:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505866; x=1729110666; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=ItJ59+OwnJ7j6SYldlZucWMJGrgH1cP8nZP9IDzUP7A=; b=2UJPWpUErKucaMHs4H7ReBrDaiSBG0XMVS1F91SX1B412Oa4vi03fnna+dNaHDcF5U pvzXub3otMT9ujVrealgxtxnaDc29t8n7o5ULHvznuQ1Sekg/OU/65bsBe6O35LKIe8L E9EQUy6XW2Mi3Rr4pzd69EnUizRWOH8BpgjDMboVeWbjeeRby1P8yGHKhnMFcMgXQkon sYItQxxhbgfkf+HEWCcIhwqvzBtO9mShZPrWsDg/uOXwpAW0I+Bx8sKt1v/uyCKe1eVU o+IjhZw8Q22M1r2ZjJcZQkumoHBJWDZzU5NsIWh+ajQ6nP/JubhGp1Hku6RYT0L/EkvU Fu1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505866; x=1729110666; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=ItJ59+OwnJ7j6SYldlZucWMJGrgH1cP8nZP9IDzUP7A=; b=uSiUkMXq8pcpK7jFHVQYzhMmE91e5iJ8tH987FEydavQiK0cRb4hFbjM6TE9SLK0Vm L7yriFbRZs6M/fuQxPBg37ZTmkbFCXzwWGS56FaDi7Exvq9y9yokvCJ72sTADvtced5C knFs5QaBlC6dbDlZxuvquHvgkph8HUuC3QDy1B073i0nuRhuLC/yT7AgOqUYlyCQHFDl cRbxW6nD+sQZ2AL+BYLCPVkd6QSEI19EF/plrW90Wt8cSqCme9tDanYhG1ngViwDHA/0 BbaYNK8war1Z77+lqP9EVmJkpGvjsmCokRist40vqJprp3qkCMasWVnXhXpKZHtv7HiD fUWw== X-Gm-Message-State: AOJu0Yx6TavsAJpcIaly5EzMaofzR1WFlphkNFwT4CzVkqfiDrxdpenu a3Q8JNc+UbZSJyOBZHqImdvLv7KPNkSObcVzev3E0Rf9EFzsrFsBi/CJGsq1ifk2bqfKhhn9qv9 HzvU= X-Google-Smtp-Source: AGHT+IHYOIUkT/jVpy2orFmeLQWLdMBZfcLwOUx6mwnID17+MyONq8Ez+u7GKwu+Z/2D6OuT35MrYg== X-Received: by 2002:a05:690c:89:b0:6db:4536:8591 with SMTP id 00721157ae682-6e322491490mr41404597b3.28.1728505866135; Wed, 09 Oct 2024 13:31:06 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e2d926dd83sm19693347b3.20.2024.10.09.13.31.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:05 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:04 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 02/11] pack-bitmap.c: avoid unnecessary `offset_to_pack_pos()` Message-ID: <010316f1eb49b46db1ed0b8df10402d3a49013f8.1728505840.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In `try_partial_reuse()`, for any object which is either an OFS_DELTA or REF_DELTA, we need to determine the bitmap position for its base. We handle the case of single- and multi-pack bitmaps separately: - For multi-pack bitmaps, we look to see if the MIDX selected the base we're looking at to represent that object. If the base object was selected from a different pack in the MIDX, we won't reuse the delta. - For single-pack bitmaps, we convert the base object's offset into a pack-relative position (identified here as 'base_pos') by calling `offset_to_pack_pos()`. But we also call `offset_to_pack_pos()` before handling each case above separately. For the MIDX case, this is unnecessary, since we don't need to lookup the base object's pack-relative position; instead, we just care whether the selected copy of that base is from the same pack we're currently operating on. For the non-MIDX case, we end up calling offset_to_pack_pos() again, yielding the same result, making the earlier call wasteful. This behavior dates back to 519e17ff75 (pack-bitmap: prepare to mark objects from multiple packs for reuse, 2023-12-14). Let's correct that by avoiding the unconditional call to 'offset_to_pack_pos()'. Signed-off-by: Taylor Blau --- pack-bitmap.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 706ff26a7b1..5c5c26efe0d 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2069,7 +2069,6 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, if (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA) { off_t base_offset; - uint32_t base_pos; uint32_t base_bitmap_pos; /* @@ -2085,8 +2084,6 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, if (!base_offset) return 0; - offset_to_pack_pos(pack->p, base_offset, &base_pos); - if (bitmap_is_midx(bitmap_git)) { /* * Cross-pack deltas are rejected for now, but could @@ -2105,6 +2102,8 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, return 0; } } else { + uint32_t base_pos; + if (offset_to_pack_pos(pack->p, base_offset, &base_pos) < 0) return 0; From patchwork Wed Oct 9 20:31:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829208 Received: from mail-yw1-f178.google.com (mail-yw1-f178.google.com [209.85.128.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C93A11DFE2B for ; Wed, 9 Oct 2024 20:31:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505872; cv=none; b=ceR+RD6qekPn+XfbOL69+Cm8z/Z4KBGuyM6HLgq2IhDS777ShnLH+qSvvfWGxZ99q2ePGvz89eknTTEJdebZeiagA5sGdgEiFgsY1Z6jZ2MbrUwYAirgNbgK9bEnt1SiW3y39njxp17rKgtRJ5xckvjdxo1dvKmoIQLVS9OCyvU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505872; c=relaxed/simple; bh=4Yw2pzGR0wOeq8aVIy6qUhcZZRmfYt79+Ng9IcpXeTM=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=OOzrrmk6PUwnF1bbiVm4Nh5kPbTdzVVHRMlCK1TCrmEUooGYVRhF9rYkmT9lOt2BkswK8yO1NGd2627xvzZ8M3go0htHsizC3C5x3DH25pI6Q/EhfgeXdFOTTuMwj9lcoQXlx1ZCHWZEOOWIgIkXWKEYjLOZmM8O16SIt+ZJCpk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=ZoZ5WvtI; arc=none smtp.client-ip=209.85.128.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="ZoZ5WvtI" Received: by mail-yw1-f178.google.com with SMTP id 00721157ae682-6dbb24ee34dso2225017b3.2 for ; Wed, 09 Oct 2024 13:31:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505869; x=1729110669; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=gZ4CWK6MEGYelnnHf5BZR9q0ny8i9QOTMIHGQ3r9LF8=; b=ZoZ5WvtIJ7R0EUtwg3BJ1RMOSGlFC9CHy6/+R6jiKidpFxy8vJ5qOYMiTBqg+2uCRJ aL7MtoAvE/YGoouxnR+pmNGnH2HJSBJbreCzI6xlstrq6JD06k4/DRwnyNqVecQsA5f2 2y+2DrdXDrmGL9DGBUWPxyYVaQKdhpTADgtuRleM7uSN84kabfdJ71eWQW9IG/LBeB94 FuIAQdHkkHjygFDExn659htLucBFTUNu/ag0w3Egrvksi9E/42MEAu4brb1ybcsL/6++ kc2QmeD0bmKtmNinH96ILqpdO0PnM4dn4balVmd5D2Lt+GUFa6FP6YDyxCcUkUNuXN9I ov+Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505869; x=1729110669; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=gZ4CWK6MEGYelnnHf5BZR9q0ny8i9QOTMIHGQ3r9LF8=; b=fywnL3U9aB6T+1uQlRESR5KLfBdR7AQ/XaT0Vur2ZIBdz/IY3fIdZzBT7JnyzQ9D57 OQNjPY6cP4F7sHNVekkYlQZI74c28Y1VQCswryoK4VnBfZXFTQse1OdTukgNDnRJVOyl TVKP2KOjBt4DSC+Y3Zns6Axjt21o6EPs0VPwRwKEdE02sgxvKj3PJYkRzNpDbj5RwaQV 847+vZwpCc/y4YtJE7hZmfWrpJlmwycVmPxLy0qfIjEvdxG+F0ovXqtpGSLVpQwXIum0 refQ/osAqOToOJRpSuxdrpCdGJiKpza0mlzw47EZaCe03DiJHnlBTxpathqpKnv6itmX ofEw== X-Gm-Message-State: AOJu0YyY4fMG+ta7AjJmzOgLRhx3JL+9Q4rsw05P5rYKfePuegvkBI83 2tLDhbdv6M+nJTBxIwvkw8+BTjGsV9Zhjdw+8Ss3Pmnz+mRLKksyI5gMzRozf2iOtqgvh7oOKnI bdgE= X-Google-Smtp-Source: AGHT+IF5WwZj2e6CmXZc6hvjSYKjx34/pGzy7pRHLXUjytW8UaNMIaQ5Y9cz3S7YSlQ5cMzuF8K03w== X-Received: by 2002:a05:690c:f03:b0:6e2:1626:fc24 with SMTP id 00721157ae682-6e32e1d7532mr17341687b3.7.1728505869453; Wed, 09 Oct 2024 13:31:09 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e32d5ebeacsm2030567b3.6.2024.10.09.13.31.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:08 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:07 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 03/11] pack-bitmap.c: delay calling 'offset_to_pack_pos()' Message-ID: <6118055b55c801f964b5a4b5d001196c00b757ab.1728505840.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When handling single object reuse from a single-pack bitmap, we first convert the base object's offset into a pack-relative position. Then, we check if the base object's offset occurs after the delta object's offset. If that's the case, then we kick the delta'd object back to the slow path. But there are a couple of oddities here: - However unlikely it is that we'll find a delta/base pair with base_offset >= offset, it doesn't make sense to convert the base's offset to a pack-relative position if we're going to throw out the reuse opportunity anyway. - We "convert" the base object's position into bitmap order by offsetting it by 'pack->bitmap_pos'. But this makes no sense, since single-pack bitmaps have only one pack (by definition), and that pack's first object occurs at bit position 0. Let's clean up this part of 'try_partial_reuse()' by (a) first seeing if we can reuse the delta before converting its base object's offset into a pack position, and (b) avoid an unnecessary conversion with 'pack->bitmap_pos'. (b) allows us to avoid the intermediate 'base_pos' variable, and instead write directly into 'base_bitmap_pos', which we also change here for further clarity. Signed-off-by: Taylor Blau --- pack-bitmap.c | 9 +++------ 1 file changed, 3 insertions(+), 6 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 5c5c26efe0d..faabc0ba0e9 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2102,11 +2102,6 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, return 0; } } else { - uint32_t base_pos; - - if (offset_to_pack_pos(pack->p, base_offset, - &base_pos) < 0) - return 0; /* * We assume delta dependencies always point backwards. * This lets us do a single pass, and is basically @@ -2124,7 +2119,9 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, */ if (base_offset >= offset) return 0; - base_bitmap_pos = pack->bitmap_pos + base_pos; + if (offset_to_pack_pos(pack->p, base_offset, + &base_bitmap_pos) < 0) + return 0; } /* From patchwork Wed Oct 9 20:31:11 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829209 Received: from mail-yw1-f170.google.com (mail-yw1-f170.google.com [209.85.128.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 32DD41E25ED for ; Wed, 9 Oct 2024 20:31:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505875; cv=none; b=tHiR+ULAN4x5qg+koT9kr3c0YF1a5FDo7IwjrrggOU3T3RSunmjE10nXRPG/PPz+MOHQ7aRsqUmuQu/fFTrrJrK6fTAVZdAcBXd0vCHGSDZnRNIokFr7i2iFXCbBTw8brWhLCnGZ48GlXfBJPtEQ/YqaKBmvQMQgo81qRusYfSg= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505875; c=relaxed/simple; bh=VHXqtIrTQ8EoL49wvasLQy0bQ90hM+ngote5lOyTFs4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=VXs9bp3yS5kIh6YF3lWAr6TX2FfJk7alrPs/C8vlp0WdzSFexF5pquwtk4/T0/ZAZreZgt3VAEi/pxHyFd1wZVqcvy4T0G7mli9WU9/EhGRNqC0ii2mmoFu5OeQZCSHtr9Z1nsFq8dklP6iDD3LsYoZS/CuTx/Stl5h7UV5mJdY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=1NEkCfK0; arc=none smtp.client-ip=209.85.128.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="1NEkCfK0" Received: by mail-yw1-f170.google.com with SMTP id 00721157ae682-6dbc9a60480so2802037b3.0 for ; Wed, 09 Oct 2024 13:31:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505873; x=1729110673; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=3DmxUW9VfrcaxEOmru85wG8XZ8VpKgOyjgrDK9i/sLc=; b=1NEkCfK0z12L0kXte+VXzm/8+qEW5gaxBQD4HZw7kJNUTFolD7BGlJQcohyAAeAgv1 aAg88LbI1iX7lqBli3sSzFuQOT4/C3NVEismZX/9UYHgX24ZOAYzhfEwtBrUQckudIMl +uUggrbRQOU9nU9/Y+V3XzV1NPKygbnRZcn5pXA/T370kfu8PlgrwTIsGsJVjPG02gQ4 5Ravqq/GvevAmGLBxOebTzhKvGHLVNBL/bkX26ZEEaaDeCwp+VYnzlViw2N+dYr+zQpd WW7pwtTNMsBczOslcLzhk8dKTKh49Wsn/VslxzvGV9Q8MpPZtwAoY/tt/Fn1qQDHs4yF MjLA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505873; x=1729110673; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=3DmxUW9VfrcaxEOmru85wG8XZ8VpKgOyjgrDK9i/sLc=; b=qhfonmE+i+lWie6NV46JV/uLUzieo4N8aNG0ZJMBtG7tL+aeVmDLww1aM/62m3qbfb i21kiSLrT4Q8HaN4MiSCJCG7NXxk/pFrNfyoeckfDTWl57iX85yMS2d3MJez98jfqWbg 5K2hk+q1AqXEQERVnilkDoIcTY6mVNINBbJEkib/jxmwvRuo/HAxx7GqJfzLp0eV5NKG oEXHTZZ43A3Bpzb52NXPkPO/5Vro+rtWzzFY8diKtRRCzqQVdT2hBycHgUUnyECFwXEW ySOE4MsqNqv7BX/eF7NPEUVgkbJe445YSgb+DzOARjeKEKXRKSPHAJXmCz9dJJrknnIb 2Ulg== X-Gm-Message-State: AOJu0YzkSixqAs87ZyflSYOxgcYn36HZCzCILwZmkqf99UYkEttGqh20 /xNjiqkqgfNMEdGABAyjPRTUif94UOb1ljasH/2r27qATqEJ4bEc2az4bdwkqaqu9AwfOkCjd9S bu/o= X-Google-Smtp-Source: AGHT+IFXrXhIyjKXfwuHcnpecIIWIXyda3nc72rNMpnN9+/kvWFfZHUeWpIL1qkUdAlj4qHrq/4fYg== X-Received: by 2002:a05:690c:f0f:b0:6de:351:3e with SMTP id 00721157ae682-6e3221867e6mr43772907b3.8.1728505872947; Wed, 09 Oct 2024 13:31:12 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e318698e66sm6554917b3.32.2024.10.09.13.31.12 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:12 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:11 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 04/11] pack-bitmap.c: compare `base_offset` to `delta_obj_offset` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When comparing the offsets of either half of a delta/base-pair, we compare `base_offset` to `offset`. There is nothing functionally wrong with that comparison, but it is slightly confusing, since `offset` points to just after the delta object's type header in the pack, whereas `base_offset` points to the beginning of the header. In practice, that distinction does not matter, and it is perfectly fine to compare base_offset to offset. But we already make a copy of `offset` before it is moved forward by calling the function `unpack_object_header()`. So let's use that copy (which points at the beginning of the delta object's header) in the comparison so that we are comparing like quantities. Signed-off-by: Taylor Blau --- pack-bitmap.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index faabc0ba0e9..3e1034cabf3 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2117,7 +2117,7 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, * position, since the bits are ordered by their * positions within the pack. */ - if (base_offset >= offset) + if (base_offset >= delta_obj_offset) return 0; if (offset_to_pack_pos(pack->p, base_offset, &base_bitmap_pos) < 0) From patchwork Wed Oct 9 20:31:14 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829210 Received: from mail-yw1-f171.google.com (mail-yw1-f171.google.com [209.85.128.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B874D1DFE2B for ; Wed, 9 Oct 2024 20:31:17 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505879; cv=none; b=rakpaXyn7O6u6F28lVmvrXD/IRBuX8vBkYjqxRy9o/1o4Aja5hTBpF4VatbRyTiXLIFkIU/elsGYTWD6/Tg+3DLW4MbNBVwcp5hCrvboflOZUeM1mocf6ZmxXu3+G7UUvpfiRPxZfAjDSoWe7YpKql397GLuS73gALcj27IPHss= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505879; c=relaxed/simple; bh=4q5yxhJJXoLLr3+Ye/IHlYpmxzjCEFmb3HTTNRpNOsE=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=Ej2CQI8cau2iaPoW8qCp2ocjIWnUh/b3hzukgx65+eXoa2d0guPasl819ETn7yGKGJJbXa/PRn1oaJJ/wZG37pRb5xkwQuTsKp9iDmJSKpU8QVncDaJc/Q/iYrdnNXwnaQBZjfxCJIerMjvXx5uamRvBsQC+DoSFEyb3s9vbyjw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=R3govh6L; arc=none smtp.client-ip=209.85.128.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="R3govh6L" Received: by mail-yw1-f171.google.com with SMTP id 00721157ae682-6e129c01b04so2585067b3.1 for ; Wed, 09 Oct 2024 13:31:17 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505876; x=1729110676; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=hTsER+cHeQLEW3HZD4aUsowq/3kZuml36OPqteXqdBM=; b=R3govh6LCqgLhKKtDbQNciPK1tp9mBpyfcEOm9Xq/A8o2UJM34KAkzf9vYUmIe3RyN Wbq7XElnVIF5ji5lZZYy7nTvpasIc85naicMus8U26ubYkBE/3KYkT347iKrxgDRuleb ttZ5arVyFUY1XCeiwRQozah84kOGJ6ivMXKra0V97MLvX+brydrwvJLA4lYQSdyD5PTD iQbUl96y8ajZJHiemCfxTAnoyep6kcNs/6eOAxtOE4WgPSRFAXNxigXcHZqFwJ6PdmIa P0D/t0/AcowuMkmBel6e4+kz9kfvyJ1frpCjH5/SGzZtiZH8/iBwSQxtlDVnPumkNe9J f+FQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505876; x=1729110676; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=hTsER+cHeQLEW3HZD4aUsowq/3kZuml36OPqteXqdBM=; b=Jb+b20+LzA6IIjRcffcRkilHmX5zO0gl0vayMU2uKYnRDZhLC5Y8DZH0e7nH1URCKD ng07vw25tOMFRtJUkbjt82mnhFHfR0liNU8N7iHrCSpYwclhDZwEh9kF/EV4737bo+Fy zF10j5klh58dA7lN6sL6bDkKwITeS57tXSzaCScuDyZ0EjHjy1dkFxODdGf9I765NCEf 3vFdCO9vx0J9nHkvNZhmrsT6yggqsmHeECKJxfZ3O9ADyD4sVv4nysNQwJGSd5Md/DiT 39iMSyMvpqA+2CkfqMaLPt4vII56r9UKuborBip+M8hiSPQTiQAYjt8oHzGDdlZF9eZj 2J8w== X-Gm-Message-State: AOJu0Yyif05A5ev09tA/uB7vrjQtatXRmjADD89mj+YSAZ0kDhT50tGM 5NQzon3CsYZos6xNKPfM13yiPcsFpgX94AD+IlTmHExsy41WjeLO7eOBdFDquL3l8qpJ/GC5lb9 skMc= X-Google-Smtp-Source: AGHT+IGX2j7yG/fGRZmwEI0qRzH4e35MannYAAwaZedz/tgEVAspk/EKryd0ybpIk3D/aqn8ITqiWQ== X-Received: by 2002:a05:690c:89:b0:6e3:2b5e:918f with SMTP id 00721157ae682-6e32b5e93camr24725807b3.44.1728505876077; Wed, 09 Oct 2024 13:31:16 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e2d93e2cdasm19860487b3.114.2024.10.09.13.31.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:15 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:14 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 05/11] pack-bitmap.c: extract `find_base_bitmap_pos()` Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Inside of `try_partial_reuse()` is a block of code that handles specially the case where the object we are attempting reuse on is either an OFS_DELTA or a REF_DELTA. Extract that part of the routine out to a separate function, named `find_base_bitmap_pos()`. This will allow us to make that routine slightly more complex in order to implement cross-pack and thin-delta conversion (which we'll describe in detail in subsequent patches) without compromising the readability of its caller, `try_partial_reuse()`. Signed-off-by: Taylor Blau --- pack-bitmap.c | 90 ++++++++++++++++++++++++++++----------------------- 1 file changed, 49 insertions(+), 41 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 3e1034cabf3..6dbe6a2c5bc 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2047,6 +2047,52 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, return NULL; } +static int find_base_bitmap_pos(struct bitmap_index *bitmap_git, + struct bitmapped_pack *pack, + off_t base_offset, + off_t delta_obj_offset, + uint32_t *base_bitmap_pos) +{ + if (bitmap_is_midx(bitmap_git)) { + /* + * Cross-pack deltas are rejected for now, but could + * theoretically be supported in the future. + * + * We would need to ensure that we're sending both + * halves of the delta/base pair, regardless of whether + * or not the two cross a pack boundary. If they do, + * then we must convert the delta to an REF_DELTA to + * refer back to the base in the other pack. + * */ + if (midx_pair_to_pack_pos(bitmap_git->midx, pack->pack_int_id, + base_offset, base_bitmap_pos) < 0) + return -1; + } else { + /* + * We assume delta dependencies always point backwards. + * This lets us do a single pass, and is basically + * always true due to the way OFS_DELTAs work. You would + * not typically find REF_DELTA in a bitmapped pack, + * since we only bitmap packs we write fresh, and + * OFS_DELTA is the default). But let's double check to + * make sure the pack wasn't written with odd + * parameters. + * + * Since we're working on a single-pack bitmap, we can + * use the object offset as a proxy for the bit + * position, since the bits are ordered by their + * positions within the pack. + */ + if (base_offset >= delta_obj_offset) + return -1; + if (offset_to_pack_pos(pack->p, base_offset, + base_bitmap_pos) < 0) + return -1; + } + + return 0; +} + /* * -1 means "stop trying further objects"; 0 means we may or may not have * reused, but you can keep feeding bits. @@ -2081,49 +2127,11 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, */ base_offset = get_delta_base(pack->p, w_curs, &offset, type, delta_obj_offset); - if (!base_offset) + if (!base_offset || + find_base_bitmap_pos(bitmap_git, pack, base_offset, + delta_obj_offset, &base_bitmap_pos) < 0) return 0; - if (bitmap_is_midx(bitmap_git)) { - /* - * Cross-pack deltas are rejected for now, but could - * theoretically be supported in the future. - * - * We would need to ensure that we're sending both - * halves of the delta/base pair, regardless of whether - * or not the two cross a pack boundary. If they do, - * then we must convert the delta to an REF_DELTA to - * refer back to the base in the other pack. - * */ - if (midx_pair_to_pack_pos(bitmap_git->midx, - pack->pack_int_id, - base_offset, - &base_bitmap_pos) < 0) { - return 0; - } - } else { - /* - * We assume delta dependencies always point backwards. - * This lets us do a single pass, and is basically - * always true due to the way OFS_DELTAs work. You would - * not typically find REF_DELTA in a bitmapped pack, - * since we only bitmap packs we write fresh, and - * OFS_DELTA is the default). But let's double check to - * make sure the pack wasn't written with odd - * parameters. - * - * Since we're working on a single-pack bitmap, we can - * use the object offset as a proxy for the bit - * position, since the bits are ordered by their - * positions within the pack. - */ - if (base_offset >= delta_obj_offset) - return 0; - if (offset_to_pack_pos(pack->p, base_offset, - &base_bitmap_pos) < 0) - return 0; - } - /* * And finally, if we're not sending the base as part of our * reuse chunk, then don't send this object either. The base From patchwork Wed Oct 9 20:31:17 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829211 Received: from mail-yb1-f172.google.com (mail-yb1-f172.google.com [209.85.219.172]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 157B91E2831 for ; Wed, 9 Oct 2024 20:31:20 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.172 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505882; cv=none; b=QIGhOfGRr4RPUj/ej+7vj2hRxAyZombjw91fVMYE7YZ0hJWfatxLZl0JkOfPngs/fQQrgUN0zkHcxEkByc+tQpY9YVpXb+FQ8lq7cNr20PkgytE8PwibPfRFIgf6TW23LDPrAi9Oi6jAPztw1w/Gw2FUCcdRuCGeBSAj0qCa5bQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505882; c=relaxed/simple; bh=mMAgZa07M/tSaFvM4gDdX6SIaFdy73SCo6kbM0z+P+Y=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=eP1eDiOTh29WccleGiX2SzMgk01PibqVfi36nuDng8cjQppJQWmm2cvvYx9Q5NfTKIzOjk9prrqZqd6sRJXbv1yaa9CZDTtm2iisXRDwZh0VYFI235S/fBf/2vjk/c4z7WA+o+POnPPwTnzdNiOB7+Dp7fW0Pl9Fug2NCFD1mqk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=LCaD9ErN; arc=none smtp.client-ip=209.85.219.172 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="LCaD9ErN" Received: by mail-yb1-f172.google.com with SMTP id 3f1490d57ef6-e28fa28de37so152771276.3 for ; Wed, 09 Oct 2024 13:31:20 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505879; x=1729110679; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=a8fC45fk3kkuGgz5bp0aKHUHsvCc3Jgn8s/zOrWkLl4=; b=LCaD9ErNZfXdfVQPvz2eDLrhWI1QwkkJZRvCqt+PkyqKe9McQch7tIBmQt+5iZSH2V hcDLpBnR72rcvwBSi++OPvR/wGOxeF7VRpNw9Q/FujQddh842+/rrkXoT8fw/q7M1J+I CyKj6oYyIqriU8Tj0jLG5QNcNjW7CnhFaH4VpYjMCU0NOPPUmqN/JG7caRjKDfPQr8cs 48NIy5702wCrdHDGPCJWmAyp1I6N5AvUHrS9TfYGVT2c/Xea8maHKy4sq2ze/OE89gsI /fKeMJFEFgKbfVrwbxeb8TdY1z+xoYvhuPKtJ5UsyqHI77vvkIkEPRuxqKEzIU7mof5W kQVg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505879; x=1729110679; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=a8fC45fk3kkuGgz5bp0aKHUHsvCc3Jgn8s/zOrWkLl4=; b=K+xxJbIgqgGzY++nJYUiOw3KGW6wwWUPuuZsSSu6Wqj48s2rvPb8ydA5T5zievRYJm 0t3ThsQ8NelrDdoT5X2JrIheao361BqfNY511npbHQ8wlbszSo/w7N3yPbmsuxe54Njg wGilb7iTasL1U2zl4tQ67w0zvxf1dOUQ4FfOxi2fATFSDLmC6s69r2RIqqGjXR4BiTOs K/zoS360sBRL4xGiM0XhGQMpLePFH8QOumBtQKa4G/u87tw8jL8Bxr6rPwoLDQPbp206 SbUe6lzKL/hVoJ/9u3hj3grUH3AcuCosQVggCKgAfTtdlOLAbnGbBrtQstTdRgXgoE4i 1QmQ== X-Gm-Message-State: AOJu0YyJgHYiwm7rb+jpadVJ5yQHAOtYpt7qvYnxcyQ/TA/LnLTZp7AS IFq7Ed51+GX/oSSDWoQpx34EIhR7jomvNwnnorszY1y2Pzbu9CXr5/072+4TDrk+MIFHX7fLmZ0 KZbM= X-Google-Smtp-Source: AGHT+IGf/IEk5AtA0WDnEKW35I0p0zu/B1H9Z/T5/ACsPwuH316Qgsk8ueHTiu0mMSx64m2lV3uEWg== X-Received: by 2002:a05:6902:218f:b0:e28:f176:105 with SMTP id 3f1490d57ef6-e28fe4dbdc4mr3789125276.36.1728505879397; Wed, 09 Oct 2024 13:31:19 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e28f0e31127sm729836276.58.2024.10.09.13.31.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:19 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:17 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 06/11] pack-bitmap: drop `from_midx` field from `bitmapped_pack` Message-ID: <278beed9cfbca0ec00a113e36d615d1d47a5223f.1728505840.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: This field was added in 41cd4b478f7 (pack-bitmap: tag bitmapped packs with their corresponding MIDX, 2024-08-27) in order to expose the bitmap's MIDX in order to translate bit positions correctly from within 'pack-objects' during pack-reuse (c.f., 125c32605ab (builtin/pack-objects.c: translate bit positions during pack-reuse, 2024-08-27) for more details). But another approach would have been to use the `->midx` field of the `struct bitmap_index *` directly, which feels clearer and avoids duplicating information. Unfortunately, we can't access that field directly since it is part of the `bitmap_index` structure which is static within the pack-bitmap.c compilation unit. So let's instead introduce a new function which returns that pointer to us, and replace the `from_midx` field with uses of that new function (which we call `bitmap_midx()` here). Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 4 ++-- midx.c | 1 - pack-bitmap.c | 6 +++++- pack-bitmap.h | 2 +- 4 files changed, 8 insertions(+), 5 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 0fc0680b402..097bb5ac2ca 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1201,7 +1201,7 @@ static void write_reused_pack(struct bitmapped_pack *reuse_packfile, if (pos + offset >= reuse_packfile->bitmap_pos + reuse_packfile->bitmap_nr) goto done; - if (reuse_packfile->bitmap_pos) { + if (bitmap_is_midx(bitmap_git)) { /* * When doing multi-pack reuse on a * non-preferred pack, translate bit positions @@ -1209,7 +1209,7 @@ static void write_reused_pack(struct bitmapped_pack *reuse_packfile, * pack-relative positions before attempting * reuse. */ - struct multi_pack_index *m = reuse_packfile->from_midx; + struct multi_pack_index *m = bitmap_midx(bitmap_git); uint32_t midx_pos; off_t pack_ofs; diff --git a/midx.c b/midx.c index 67e0d640046..ca98bfd7c64 100644 --- a/midx.c +++ b/midx.c @@ -496,7 +496,6 @@ int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m, MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * local_pack_int_id + sizeof(uint32_t)); bp->pack_int_id = pack_int_id; - bp->from_midx = m; return 0; } diff --git a/pack-bitmap.c b/pack-bitmap.c index 6dbe6a2c5bc..b9ea1fab397 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2326,7 +2326,6 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, packs[packs_nr].pack_int_id = pack_int_id; packs[packs_nr].bitmap_nr = pack->num_objects; packs[packs_nr].bitmap_pos = 0; - packs[packs_nr].from_midx = bitmap_git->midx; objects_nr = packs[packs_nr++].bitmap_nr; } @@ -2981,6 +2980,11 @@ int bitmap_is_midx(struct bitmap_index *bitmap_git) return !!bitmap_git->midx; } +struct multi_pack_index *bitmap_midx(struct bitmap_index *bitmap_git) +{ + return bitmap_git->midx; +} + const struct string_list *bitmap_preferred_tips(struct repository *r) { const struct string_list *dest; diff --git a/pack-bitmap.h b/pack-bitmap.h index d7f4b8b8e95..a1e8c8936c9 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -60,7 +60,6 @@ struct bitmapped_pack { uint32_t bitmap_pos; uint32_t bitmap_nr; - struct multi_pack_index *from_midx; /* MIDX only */ uint32_t pack_int_id; /* MIDX only */ }; @@ -157,6 +156,7 @@ char *midx_bitmap_filename(struct multi_pack_index *midx); char *pack_bitmap_filename(struct packed_git *p); int bitmap_is_midx(struct bitmap_index *bitmap_git); +struct multi_pack_index *bitmap_midx(struct bitmap_index *bitmap_git); const struct string_list *bitmap_preferred_tips(struct repository *r); int bitmap_is_preferred_refname(struct repository *r, const char *refname); From patchwork Wed Oct 9 20:31:21 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829212 Received: from mail-yw1-f181.google.com (mail-yw1-f181.google.com [209.85.128.181]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id DC8821E284B for ; Wed, 9 Oct 2024 20:31:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.181 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505885; cv=none; b=ugwVJEGxxd8zfrLjWaAV1Mzq+jlcTroRrpjhCkkkAm681N3NrwOzY+orWBTgrpLolIgtNXH4DXxHcgrYmlwUmeBaAxIJzHt+wrSNysdMhNOZf3FIXpM3zyuYSa4jaf6h2zT+XS0ZwXSoRSwnuZalgQXfYm3zMSr0iy56pFCnTxQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505885; c=relaxed/simple; bh=8+RhcNlNymeg5o2rLVPpOBBAFzTvS9OJEsV8+beC2hA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=HbVFnylJbOTFqW9ILcmfM3W4pMdfY6kpYPFs6q1nrn2ZTDKjmxWzkhUK9EbIsEoCaWK4xhd/KUuflNnr3e+df2iRDZViVTRpevYqGTRKDDSQULWIUmub31HZz03OIrpLCp6FkiUvrWRCBq7Q7SSDJS0uNIXNeX9YaGjy5XBYpQM= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=N5TIShPD; arc=none smtp.client-ip=209.85.128.181 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="N5TIShPD" Received: by mail-yw1-f181.google.com with SMTP id 00721157ae682-6e2346f164cso2603157b3.3 for ; Wed, 09 Oct 2024 13:31:23 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505882; x=1729110682; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=Mqe0GPnuXb9j/DsS9BJF8/QGW/5Dw4Nes3IsxXwm4R4=; b=N5TIShPD2jQX08DvrSyu5s1gxGdwRJfXT1+ZkP1bHzf6GzWYTht0JDlcOJdJ1xz1IQ bC3NbCObBgkH4m7eY3SIeq02DPRmKruVk/CuNCcXRIiTZCou5ihW1K766+HsCJeAKat3 10GozdOqapbeyZ97MR/3NK9zjAe+dGoPTsjwEAP71//cgFqYEXGyeW1joLiouNVfADDP lqQQeGdOpbxL8eN5TkNNLrM2jroKaWg8PMkz7zBqsBTJxzfePQJHvSO4n8lFGpHFtPEW LepZ2hFKpqn20pghc2LfqNH3VycP0cy9VfRCEOde507tcJDntaAtRZjlhnPv0TWiz4XQ Bxtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505882; x=1729110682; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=Mqe0GPnuXb9j/DsS9BJF8/QGW/5Dw4Nes3IsxXwm4R4=; b=sgEuaJ4DiwVOzI8kKz/MOnZsz5d5y9ONNp3AEzxcfEuugfocYW3fnmibadJEE7KuCY Db2uPomaV7zRhQoTFWEThR2k2Mlpx88fIkAaopGI1uCEncCivIbIT7P+YwjVNyQYh6jF lBh/Qh1p/K0oDT8UqIyHcID/vOXMH3mYPEPY2Xe/jpuN90kaoadcLWkSr7q4zNkz4nfP +F+xteWYL88HfmQmighnmq8qZ+fnkdjXXxh9sPuQgyDis4hnE1NCIoTAa0/yG85B2628 CIPMZcql2qmOrbRog77Kb9gyxVbSWPbxZdbFJjE2RTuom9Cav8+svrIhCnTW1BXEawLS Oxyw== X-Gm-Message-State: AOJu0YxKLcQdl8EsN3lt99zwforsnF3ZCZIjARXdmNI1EvcqoqRde7XT 0fKevwxOdNKXgoH2VinA0nHKUT9m6AnMDnQ/6nTW82XmWOgDT0f8hRUfBA6+R4877TctdvhsEH9 nAgE= X-Google-Smtp-Source: AGHT+IHdLnL1jg4F4w9r+rw56f0uO5RMMGyDeG4knzXHNPck/OgxH7kh8aHoyvZXZmzDMsACsgFS/w== X-Received: by 2002:a05:690c:3502:b0:6e3:21fa:e50f with SMTP id 00721157ae682-6e322197244mr45605567b3.13.1728505882551; Wed, 09 Oct 2024 13:31:22 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e2d93d75a8sm20404927b3.96.2024.10.09.13.31.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:22 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:21 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 07/11] write_reused_pack_one(): translate bit positions directly Message-ID: <94e5c96f6859479e0206d6d775eacf54b3639ee5.1728505840.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: A future commit will want to deal with bit positions instead of pack positions from within builtin/pack-objects.c::write_reused_pack_one(). That function at present takes a pack position, so one approach to accommodating the new functionality would be to add a secondary bit position parameter, making the function's declaration look something like: static void write_reused_pack_one(struct packed_git *reuse_packfile, size_t pack_pos, size_t bitmap_pos, struct hashfile *out, off_t pack_start, struct pack_window **w_curs); But because the pack-relative position can be easily derived from the bit position, it makes senes to just pass the latter and let the function itself translate it into a pack-relative position. To do this, extract a new function `bitmap_to_pack_pos()` from the existing `write_reused_pack()` function. This new routine is responsible for performing the conversion from bitmap- to pack-relative positions. Instead of performing that translation in `write_reused_pack()`, instead call the new function from within `write_reused_pack_one()` so that we can just pass a single bit position to it. Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 78 ++++++++++++++++++++++-------------------- 1 file changed, 41 insertions(+), 37 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 097bb5ac2ca..7f50d58a235 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1017,6 +1017,42 @@ static off_t find_reused_offset(off_t where) return reused_chunks[lo-1].difference; } +static uint32_t bitmap_to_pack_pos(struct packed_git *reuse_packfile, + size_t pos) +{ + if (bitmap_is_midx(bitmap_git)) { + /* + * When doing multi-pack reuse on a + * non-preferred pack, translate bit positions + * from the MIDX pseudo-pack order back to their + * pack-relative positions before attempting + * reuse. + */ + struct multi_pack_index *m = bitmap_midx(bitmap_git); + uint32_t midx_pos, pack_pos; + off_t pack_ofs; + + if (!m) + BUG("non-zero bitmap position without MIDX"); + + midx_pos = pack_pos_to_midx(m, pos); + pack_ofs = nth_midxed_offset(m, midx_pos); + + if (offset_to_pack_pos(reuse_packfile, pack_ofs, &pack_pos) < 0) + BUG("could not find expected object at offset %"PRIuMAX" in pack %s", + (uintmax_t)pack_ofs, pack_basename(reuse_packfile)); + + return pack_pos; + } else { + /* + * Can use bit positions directly, even for MIDX + * bitmaps. See comment in try_partial_reuse() + * for why. + */ + return pos; + } +} + static void write_reused_pack_one(struct packed_git *reuse_packfile, size_t pos, struct hashfile *out, off_t pack_start, @@ -1025,9 +1061,10 @@ static void write_reused_pack_one(struct packed_git *reuse_packfile, off_t offset, next, cur; enum object_type type; unsigned long size; + uint32_t pack_pos = bitmap_to_pack_pos(reuse_packfile, pos); - offset = pack_pos_to_offset(reuse_packfile, pos); - next = pack_pos_to_offset(reuse_packfile, pos + 1); + offset = pack_pos_to_offset(reuse_packfile, pack_pos); + next = pack_pos_to_offset(reuse_packfile, pack_pos + 1); record_reused_object(offset, offset - (hashfile_total(out) - pack_start)); @@ -1191,7 +1228,6 @@ static void write_reused_pack(struct bitmapped_pack *reuse_packfile, size_t pos = (i * BITS_IN_EWORD); for (offset = 0; offset < BITS_IN_EWORD; ++offset) { - uint32_t pack_pos; if ((word >> offset) == 0) break; @@ -1201,40 +1237,8 @@ static void write_reused_pack(struct bitmapped_pack *reuse_packfile, if (pos + offset >= reuse_packfile->bitmap_pos + reuse_packfile->bitmap_nr) goto done; - if (bitmap_is_midx(bitmap_git)) { - /* - * When doing multi-pack reuse on a - * non-preferred pack, translate bit positions - * from the MIDX pseudo-pack order back to their - * pack-relative positions before attempting - * reuse. - */ - struct multi_pack_index *m = bitmap_midx(bitmap_git); - uint32_t midx_pos; - off_t pack_ofs; - - if (!m) - BUG("non-zero bitmap position without MIDX"); - - midx_pos = pack_pos_to_midx(m, pos + offset); - pack_ofs = nth_midxed_offset(m, midx_pos); - - if (offset_to_pack_pos(reuse_packfile->p, - pack_ofs, &pack_pos) < 0) - BUG("could not find expected object at offset %"PRIuMAX" in pack %s", - (uintmax_t)pack_ofs, - pack_basename(reuse_packfile->p)); - } else { - /* - * Can use bit positions directly, even for MIDX - * bitmaps. See comment in try_partial_reuse() - * for why. - */ - pack_pos = pos + offset; - } - - write_reused_pack_one(reuse_packfile->p, pack_pos, f, - pack_start, &w_curs); + write_reused_pack_one(reuse_packfile->p, pos + offset, + f, pack_start, &w_curs); display_progress(progress_state, ++written); } } From patchwork Wed Oct 9 20:31:24 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829213 Received: from mail-yw1-f174.google.com (mail-yw1-f174.google.com [209.85.128.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EAE5A1E2855 for ; Wed, 9 Oct 2024 20:31:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505890; cv=none; b=UTY6duOYeejMWMosGj2HpeQMTx93JjUTpD7SlR+P5DRbuujmGj8XD+zyIThwZ8A4lN9BeUYB1bF3VWU27YQOh4jnjDAxInFtqUwlxK21mM5BTIgfUyphmLXVp8R/R8hzt7OxKvggXEXCsDX2p++N4X6jinm0DJ7HbK+NVgndbb0= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505890; c=relaxed/simple; bh=k+N1XOmHuUK+j7SeX37AmbB3XcpZkf3DyULU74FkWEA=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=EI3C2FHl7zJ3KH/nruQeQASCdLEIgiT5C42Wrf3QwjfOW9bos4fO1Oo6qebqKwiRz1ff2TCXo+LmUbByXptH7hRDalWBeeQckQzgNujO+2qL1IR5GSqj6CZ3TfFR7O/Yp9L07KVnGsFTF4e7BTMJ5e6CGCIgGVbJWl5XOicxTr8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=YxamRsLe; arc=none smtp.client-ip=209.85.128.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="YxamRsLe" Received: by mail-yw1-f174.google.com with SMTP id 00721157ae682-6db836c6bd7so2534487b3.3 for ; Wed, 09 Oct 2024 13:31:26 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505885; x=1729110685; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=vQ+WiuPlVa7Y3KzKAJXuhMYUT3b1gkJjevd9YJHSVYE=; b=YxamRsLec2dWdKdKs1bdgbeXRcWbOPmvH7Bse7I7c0hVHPx5b1cCCPyMWQkPXB9aSU ABJ3BuPo2lOgaBW9x4eFifuXbY+/ZomddXjDDczBVVtRptdV7qI/X4wNpEDg/T4BnCyO vVLH4M5oQMvGoHJm/nv+95xpx1PPDE7/zu7zcmYRftXYbNCpZAd4rf6pizAdnTI25N0s daqZ3mnTME+e8C5IWev/OWD1mrbYCdjJ++kXCYbeT+6HuhTV/z0RaijVWJSEJxTZdIGc CmkVXKUDaFLmGeELkDQy//ZVEs1pAGUPZsjs3/x7QUQsWKD8h0y5bS4PlPuCnDKiB8Le 2EBA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505885; x=1729110685; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=vQ+WiuPlVa7Y3KzKAJXuhMYUT3b1gkJjevd9YJHSVYE=; b=FDOJqOT6VyB8X7n6qIsuEHPE/ssz0YV46HlAKPDfehhSosDbvUb9754EBHbhiEQpfN XNJRMySLauYWmMIrFLUhAsZWXVlnhKcE0BuR2VOQRzDSSHy5VZiY5Sas4SdFCG2V6ZLI cFC035Cztfo/dQDeWJsS/x0hQCFoiEKzsndJvxkI4GAcnFTPk4FEsn8BDNTfA4mEOwtr yFgoLqSPaRozjZcReMRXJPPRy/zfOE2QC+NK1A/HHaxTsBozsJcbGvKcV37F4yEGPrfY /n2SqoBxEt9hIPiYF4ow5VX4ipYMuMElr7xvUY7ttWUJ/SErk5qhcXkn3JgdceKMRLjO yn9Q== X-Gm-Message-State: AOJu0Yxz6gKWWBAenaNpg1LDP7FTPdBtyJ/69VGSJxQfjpgN6SgwshvX /o9a4U9NyCb98HZauCRN41jgVwmuzyGcdVlb32xqcePLf2LAMQY+oOVTazg4SRrLB4arzij8Ih7 kZKc= X-Google-Smtp-Source: AGHT+IEeaOwRLYFjEEl2s/Zx5fBNMadN5BRO57RcIyXYV7kP6bpnwTEMalhTBxYmkUbfHDDns9TGoA== X-Received: by 2002:a05:690c:6604:b0:6b0:70fc:f6e4 with SMTP id 00721157ae682-6e32e1fe41emr21686007b3.15.1728505885667; Wed, 09 Oct 2024 13:31:25 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e2d9387dabsm20219867b3.63.2024.10.09.13.31.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:25 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:24 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 08/11] t5332: enable OFS_DELTAs via test_pack_objects_reused Message-ID: <9d81d890402f94f4126aedef0845d615d10455bc.1728505840.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Back when test_pack_objects_reused was introduced via commit 7c01878eeb (t5332-multi-pack-reuse.sh: extract pack-objects helper functions, 2024-02-05), we converted bare pack-objects invocations into one of two wrapped variants, either test_pack_objects_reused or test_pack_objects_reused_all. The latter passes `--delta-base-offset`, allowing pack-objects to generate OFS_DELTAs in its output pack. But the former does not, for no good reason. As we do not want to convert OFS_DELTAs to REF_DELTAs unnecessarily, let's unify these two and pass `--delta-base-offset` to both. Instrumenting the codepath where we convert OFS_DELTAs to REF_DELTAs with a BUG() like so: diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 0fc0680b402..0f1b22b8674 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -1051,6 +1051,8 @@ static void write_reused_pack_one(struct packed_git *reuse_packfile, uint32_t base_pos; struct object_id base_oid; + BUG("tested"); + if (offset_to_pack_pos(reuse_packfile, base_offset, &base_pos) < 0) die(_("expected object at offset %"PRIuMAX" " "in pack %s"), , and seeing what test(s) fail yields the following. Test Summary Report ------------------- t5326-multi-pack-bitmaps.sh (Wstat: 256 (exited 1) Tests: 357 Failed: 6) Failed tests: 46, 91, 167, 220, 265, 341 Non-zero exit status: 1 t5310-pack-bitmaps.sh (Wstat: 256 (exited 1) Tests: 227 Failed: 6) Failed tests: 46-47, 120-121, 197-198 Non-zero exit status: 1 t5327-multi-pack-bitmaps-rev.sh (Wstat: 256 (exited 1) Tests: 314 Failed: 6) Failed tests: 46, 91, 157, 203, 248, 314 Non-zero exit status: 1 So the OFS_DELTA to REF_DELTA conversion is still tested thoroughly in t5310, t5326, and t5327. Signed-off-by: Taylor Blau --- t/t5332-multi-pack-reuse.sh | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/t/t5332-multi-pack-reuse.sh b/t/t5332-multi-pack-reuse.sh index 955ea42769b..8bcb736c75a 100755 --- a/t/t5332-multi-pack-reuse.sh +++ b/t/t5332-multi-pack-reuse.sh @@ -43,7 +43,7 @@ test_pack_objects_reused_all () { test_pack_objects_reused () { : >trace2.txt && GIT_TRACE2_EVENT="$PWD/trace2.txt" \ - git pack-objects --stdout --revs >got.pack && + git pack-objects --delta-base-offset --stdout --revs >got.pack && test_pack_reused "$1" X-Patchwork-Id: 13829214 Received: from mail-yb1-f174.google.com (mail-yb1-f174.google.com [209.85.219.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id EFAB11DFE2B for ; Wed, 9 Oct 2024 20:31:30 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505892; cv=none; b=apbJfovsvEywHz/mQ0YvNQhAj6Bkry3qyt0qNCm4mdoKphsks+q9srZO4qUpB28JxJyt6yCt0WARvDa9KMekiAC7gN4Sx8yW2h3aYFfY2RH1AFNED9Xitkv1M0Fm+Cvc1XeThdbeyLN/UUjOHZqC7lVL7LSrApxuDqnyY7cJmb8= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505892; c=relaxed/simple; bh=qsTt1okpCH0XjdUNRLiLnPjQP/8j00Bp1b8A6dK5Nbw=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=iVGoqS+9OwopTdbOZOoI4cJzLqp7Avq/tSRpHN05jFVUvwrGyOLlmjx46A23TWUmp8keOn1pVt7uEWQ+V54uCziMNHhGRcmT25aLkgGdmHZMuu6XCQEqtQ0Hd3d3AF5IK3vU/Zf+fAp8koTjNDx567RF9dLS+8OwWcN3sctafv4= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=YO6q8CfY; arc=none smtp.client-ip=209.85.219.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="YO6q8CfY" Received: by mail-yb1-f174.google.com with SMTP id 3f1490d57ef6-e2904ce7e14so196624276.0 for ; Wed, 09 Oct 2024 13:31:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505890; x=1729110690; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=+ZpQe7A0bytBUiilfEqxsKxr7BVGKQcnyeeyz9Mf1Lc=; b=YO6q8CfYD2jGyHGbK09Yzrh/kVFnv591kaKUUSDNcm+klF+ll5JKzI+aUU5S7gBaeu cKMQioN3aUw4qzmqJ2KLnGDiFUlNm3f/ahKay+ajXhyMmr8QrVeQRCQv/K6KN/jqMW2W nzjS9ok9UrgLi59DrgtoRIZPG6MimLOEqn30sZo7d2H/PiyvHR3hK0bxzlHUsvCxJcK6 Z85Du/98gfoTDpoUuV50ncYAMJgswJfzurpgYG7WwmUoGM3TT1rt14NEDZpMj0O9nl8Q QBChfQFlXnB8qtuS9t50tvThL8WFM0+PQ7ZQOOY1bhmKvZdECNPvvu+LrcFcq7e/CV+A WTtA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505890; x=1729110690; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=+ZpQe7A0bytBUiilfEqxsKxr7BVGKQcnyeeyz9Mf1Lc=; b=LQbgb3xVZFBUkYQkzrwMkhK+kthTyCmX93SF5g0BoprqT2x3cZpj9ahzZvErvyo0gA +/uJh33RPsztmi7A+X4x9gzWhJt0ON9dVYnMZndgAL8nV8CQh0DHPKd+3n533S2G5weG K3DSuct+104q8TLkJqUnDhvcm37whXh4hJTrVL+kwUvEOhNfFVTMOTwafGf6WaC27Zzo QtzsBcM6e39YScjktEuSLNogwAiKKD1Bfn5Mj4x/WU4Vdmdm4/Nvzinvp5t+lFdj9AV5 bT0KPvUk54KlkoesEY6/s9Dr8TcT8z5QItRk2L0cMp++0ivGHJfZEjrVso6LkWhmUdjJ CzgA== X-Gm-Message-State: AOJu0YxIdSovITs5QbK0vyPdZ3r4W5QoLpuHn9flXMSGn+VwgnMY5S6A 4OtPt2X6HQi2kn8qjJj/wov3EAA9KErE8Do4mWcBW8KapDqJe6S3Pm6HndEwpimqBrmVSRb3nY+ r8Ho= X-Google-Smtp-Source: AGHT+IE1mNAS9dsrg5j5wAOgghTLlatiKagERB24LzKHFwSFqLehWUhw+UVQqTlJEmY378NAPmFDqw== X-Received: by 2002:a05:6902:f82:b0:e26:8e8:b29d with SMTP id 3f1490d57ef6-e28fe50e8f0mr3949031276.53.1728505889621; Wed, 09 Oct 2024 13:31:29 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e28ee8c438csm789516276.47.2024.10.09.13.31.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:29 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:28 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 09/11] pack-bitmap: enable cross-pack delta reuse Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Back when multi-pack bitmaps were first introduced, we performed verbatim pack reuse over the MIDX's preferred pack only. Since all duplicate objects were resolved in favor of that pack, any object from the preferred pack that was stored as an OFS_DELTA is directly reusable. That's not the case when we are reusing objects from multiple packs, like was introduced via af626ac0e0 (pack-bitmap: enable reuse from all bitmapped packs, 2023-12-14). When reusing an object stored as a delta from some pack other than the preferred one, it's possible that the base was selected from a different pack. When that's the case, we could have converted those OFS_DELTAs to REF_DELTAs on the fly (like we do when running 'pack-objects' without the `--delta-base-offset` option). But af626ac0e0 decided instead to kick those objects back into the non-verbatim reuse path, meaning that reusing them was slightly slower. This patch removes that limitation, and teaches pack-objects to convert deltas (whose base was selected from a different pack in the MIDX) from OFS_DELTAs to REF_DELTAs. In order to do this, the pack-reuse code within pack-bitmap.c marks bits in a separate bitmap called 'reuse_as_ref_delta'. Objects whose bits are marked in that bitmap must be converted from OFS_DELTAs to REF_DELTAs. To mark bits in that bitmap, we adjust find_base_bitmap_pos() to return the bitmap position of any delta object's base regardless of whether or not it came from the same pack. This is done by: 1. First converting the base object's into a pack position (via `offset_to_pack_pos()`). 2. Then converting from pack position into into lexical order (via `pack_pos_to_index()`). 3. Then into an object ID (via `nth_packed_object_id()`). 4. Then into a position in the MIDX's lexical order of object IDs (via `bsearch_midx()`). 5. And finally into a position in the MIDX's pseudo-pack order (via `midx_pair_to_pack_pos()`). If we can find that base object in the MIDX, then we use its position in the MIDX's pseudo-pack order to determine whether or not it was selected from the same pack. If it is, no adjustment is necessary. Otherwise, we mark the delta object's position in the new `reuse_as_ref_delta` bitmap, and convert accordingly from within `write_reused_pack_one()`. The only tweak we need to make outside of the above is to teach `write_reused_pack_verbatim()` to only consider words of the `reuse_packfile_bitmap` as reusable verbatim if the word does not have any bits which are marked in the reuse_as_ref_delta bitmap, in which case they must be converted and cannot be reused verbatim. Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 29 ++++++++++++++--- pack-bitmap.c | 63 +++++++++++++++++++++++++++---------- pack-bitmap.h | 1 + t/t5332-multi-pack-reuse.sh | 4 +-- 4 files changed, 75 insertions(+), 22 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index 7f50d58a235..dbcd632483e 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -223,6 +223,7 @@ static size_t reuse_packfiles_nr; static size_t reuse_packfiles_used_nr; static uint32_t reuse_packfile_objects; static struct bitmap *reuse_packfile_bitmap; +static struct bitmap *reuse_as_ref_delta_packfile_bitmap; static int use_bitmap_index_default = 1; static int use_bitmap_index = -1; @@ -1084,7 +1085,8 @@ static void write_reused_pack_one(struct packed_git *reuse_packfile, assert(base_offset != 0); /* Convert to REF_DELTA if we must... */ - if (!allow_ofs_delta) { + if (!allow_ofs_delta || + bitmap_get(reuse_as_ref_delta_packfile_bitmap, pos)) { uint32_t base_pos; struct object_id base_oid; @@ -1160,6 +1162,10 @@ static size_t write_reused_pack_verbatim(struct bitmapped_pack *reuse_packfile, if (!bitmap_get(reuse_packfile_bitmap, word_pos * BITS_IN_EWORD + offset)) return word_pos; + if (reuse_as_ref_delta_packfile_bitmap && + bitmap_get(reuse_as_ref_delta_packfile_bitmap, + word_pos * BITS_IN_EWORD + offset)) + return word_pos; } pos += BITS_IN_EWORD - (pos % BITS_IN_EWORD); @@ -1182,10 +1188,24 @@ static size_t write_reused_pack_verbatim(struct bitmapped_pack *reuse_packfile, if (pos >= end) return reuse_packfile->bitmap_pos / BITS_IN_EWORD; - while (pos < end && - reuse_packfile_bitmap->words[pos / BITS_IN_EWORD] == (eword_t)~0) + while (pos < end) { + size_t wpos = pos / BITS_IN_EWORD; + eword_t reuse; + + reuse = reuse_packfile_bitmap->words[wpos]; + if (reuse_as_ref_delta_packfile_bitmap) { + /* + * Can't reuse verbatim any objects which need + * to be first rewritten as REF_DELTAs. + */ + reuse &= ~reuse_as_ref_delta_packfile_bitmap->words[wpos]; + } + + if (reuse != (eword_t)~0) + break; + pos += BITS_IN_EWORD; - + } if (pos > end) pos = end; @@ -4078,6 +4098,7 @@ static int get_object_list_from_bitmap(struct rev_info *revs) &reuse_packfiles, &reuse_packfiles_nr, &reuse_packfile_bitmap, + &reuse_as_ref_delta_packfile_bitmap, allow_pack_reuse == MULTI_PACK_REUSE); if (reuse_packfiles) { diff --git a/pack-bitmap.c b/pack-bitmap.c index b9ea1fab397..8326a20979e 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2054,18 +2054,34 @@ static int find_base_bitmap_pos(struct bitmap_index *bitmap_git, uint32_t *base_bitmap_pos) { if (bitmap_is_midx(bitmap_git)) { + struct object_id base_oid; + uint32_t base_pack_pos; + uint32_t base_idx_pos; + uint32_t base_midx_pos; + /* - * Cross-pack deltas are rejected for now, but could - * theoretically be supported in the future. - * - * We would need to ensure that we're sending both - * halves of the delta/base pair, regardless of whether - * or not the two cross a pack boundary. If they do, - * then we must convert the delta to an REF_DELTA to - * refer back to the base in the other pack. - * */ - if (midx_pair_to_pack_pos(bitmap_git->midx, pack->pack_int_id, - base_offset, base_bitmap_pos) < 0) + * First convert from the base object's offset + * to its pack position in pack order relative + * to its source pack. Use that information to + * find the base object's OID. + */ + if (offset_to_pack_pos(pack->p, base_offset, + &base_pack_pos) < 0) + return -1; + base_idx_pos = pack_pos_to_index(pack->p, base_pack_pos); + if (nth_packed_object_id(&base_oid, pack->p, base_idx_pos) < 0) + return -1; + + /* + * Now find the base object's lexical position + * in the MIDX, and convert that to a bitmap + * position in the MIDX's pseudo-pack order. + */ + if (!bsearch_midx(&base_oid, bitmap_git->midx, + &base_midx_pos)) + return -1; + if (midx_to_pack_pos(bitmap_git->midx, base_midx_pos, + base_bitmap_pos) < 0) return -1; } else { /* @@ -2102,6 +2118,7 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, size_t bitmap_pos, off_t offset, struct bitmap *reuse, + struct bitmap *reuse_as_ref_delta, struct pack_window **w_curs) { off_t delta_obj_offset; @@ -2116,6 +2133,7 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, if (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA) { off_t base_offset; uint32_t base_bitmap_pos; + int cross_pack; /* * Find the position of the base object so we can look it up @@ -2129,7 +2147,8 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, delta_obj_offset); if (!base_offset || find_base_bitmap_pos(bitmap_git, pack, base_offset, - delta_obj_offset, &base_bitmap_pos) < 0) + delta_obj_offset, + &base_bitmap_pos) < 0) return 0; /* @@ -2142,8 +2161,11 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, */ if (!bitmap_get(reuse, base_bitmap_pos)) return 0; - } + cross_pack = base_bitmap_pos < pack->bitmap_pos; + if (cross_pack) + bitmap_set(reuse_as_ref_delta, bitmap_pos); + } /* * If we got here, then the object is OK to reuse. Mark it. */ @@ -2153,7 +2175,8 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, static void reuse_partial_packfile_from_bitmap_1(struct bitmap_index *bitmap_git, struct bitmapped_pack *pack, - struct bitmap *reuse) + struct bitmap *reuse, + struct bitmap *reuse_as_ref_delta) { struct bitmap *result = bitmap_git->result; struct pack_window *w_curs = NULL; @@ -2220,7 +2243,8 @@ static void reuse_partial_packfile_from_bitmap_1(struct bitmap_index *bitmap_git } if (try_partial_reuse(bitmap_git, pack, bit_pos, - ofs, reuse, &w_curs) < 0) { + ofs, reuse, reuse_as_ref_delta, + &w_curs) < 0) { /* * try_partial_reuse indicated we couldn't reuse * any bits, so there is no point in trying more @@ -2255,12 +2279,14 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, struct bitmapped_pack **packs_out, size_t *packs_nr_out, struct bitmap **reuse_out, + struct bitmap **reuse_as_ref_delta_out, int multi_pack_reuse) { struct repository *r = the_repository; struct bitmapped_pack *packs = NULL; struct bitmap *result = bitmap_git->result; struct bitmap *reuse; + struct bitmap *reuse_as_ref_delta = NULL; size_t i; size_t packs_nr = 0, packs_alloc = 0; size_t word_alloc; @@ -2333,14 +2359,18 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, word_alloc = objects_nr / BITS_IN_EWORD; if (objects_nr % BITS_IN_EWORD) word_alloc++; + reuse = bitmap_word_alloc(word_alloc); + reuse_as_ref_delta = bitmap_word_alloc(word_alloc); for (i = 0; i < packs_nr; i++) - reuse_partial_packfile_from_bitmap_1(bitmap_git, &packs[i], reuse); + reuse_partial_packfile_from_bitmap_1(bitmap_git, &packs[i], + reuse, reuse_as_ref_delta); if (bitmap_is_empty(reuse)) { free(packs); bitmap_free(reuse); + bitmap_free(reuse_as_ref_delta); return; } @@ -2352,6 +2382,7 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, *packs_out = packs; *packs_nr_out = packs_nr; *reuse_out = reuse; + *reuse_as_ref_delta_out = reuse_as_ref_delta; } int bitmap_walk_contains(struct bitmap_index *bitmap_git, diff --git a/pack-bitmap.h b/pack-bitmap.h index a1e8c8936c9..e7962ac90e8 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -86,6 +86,7 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, struct bitmapped_pack **packs_out, size_t *packs_nr_out, struct bitmap **reuse_out, + struct bitmap **reuse_as_ref_delta_out, int multi_pack_reuse); int rebuild_existing_bitmaps(struct bitmap_index *, struct packing_data *mapping, kh_oid_map_t *reused_bitmaps, int show_progress); diff --git a/t/t5332-multi-pack-reuse.sh b/t/t5332-multi-pack-reuse.sh index 8bcb736c75a..6edff02d124 100755 --- a/t/t5332-multi-pack-reuse.sh +++ b/t/t5332-multi-pack-reuse.sh @@ -194,7 +194,7 @@ test_expect_success 'omit delta with uninteresting base (same pack)' ' test_pack_objects_reused 3 1 in <<-EOF && $(git rev-parse $base) ^$(git rev-parse $delta) @@ -207,7 +207,7 @@ test_expect_success 'omit delta from uninteresting base (cross pack)' ' packs_nr="$(find $packdir -type f -name "pack-*.pack" | wc -l)" && objects_nr="$(git rev-list --count --all --objects)" && - test_pack_objects_reused_all $(($objects_nr - 1)) $packs_nr + test_pack_objects_reused_all $objects_nr $packs_nr ' test_expect_success 'non-omitted delta in MIDX preferred pack' ' From patchwork Wed Oct 9 20:31:31 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829215 Received: from mail-yb1-f178.google.com (mail-yb1-f178.google.com [209.85.219.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id E53A91E1A30 for ; Wed, 9 Oct 2024 20:31:33 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505895; cv=none; b=MtlP3FwuUmRGou2/jE+Vvz1ZR55321hCY7WGuyeEjEB5qlW9I0zZuMkL2nBHydMlujTiomTve1YRVxTnQq8RB+swyEg9EK1AUrSYVuypkK4wdu/1SIhY8tz2NOAjFTZ3ta5AVQSZGidsx83SMUEH+mQvV4Ray8JJEKtdOrmBLcY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505895; c=relaxed/simple; bh=bC9eK4v84n4FRZ8oq5ek4gaR9/Fgd1GU+eV8sKM+XWk=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=OBAwSRcRZP2Pni3upgBZa8KxOBq1/uXrL4GnQkS5koPYa/WL8unlk8Jda8ABkIufeBhn3KZnLTq439gzWuU/cwvxiH9JWKqDFmleWfKEPSlOPIOqLBD9/jSrWWeShqDTGLRaXZvYHq4jWf2WwsOg8RDnYgCzNI9zXZTCOfpuhM8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=cGkkvn7v; arc=none smtp.client-ip=209.85.219.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="cGkkvn7v" Received: by mail-yb1-f178.google.com with SMTP id 3f1490d57ef6-e28fa2807eeso189540276.1 for ; Wed, 09 Oct 2024 13:31:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505893; x=1729110693; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=l+RPYImUZjOCWOEE4ttOaMWNBXBFypcKjzPRW7st1KQ=; b=cGkkvn7v98OhSH0tDwCyJvpUMmTYqFGx+cfykPWZkXA5B5bbfiBc63yrsuUQCXRkhH B7VJO8IXnMF8kYYOin2bSpWgB+ZF5wO1hFuclXZ4Rm/9rRH/7jiYcbt+mxYYiNn2eK5n +K+Gtjqlq3OOj9rZHkZ3rzvMX6+TGqxPRr2hWxoYG6yfybNtxrIBe5Xn9oQUMarPXTR2 qLvLUm0lemkcXz3xpWw6LFg3SH7f2SImlE6n0Ius4BuacfKoA9Q7GTcSqIaLetBS5O2u zTWfCVqkW5KysI664TmFSxqZUWXYgxn3P3nvjWPjAZAK1snVsac6JBmdE0Mc+0UsKyd1 xWUQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505893; x=1729110693; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=l+RPYImUZjOCWOEE4ttOaMWNBXBFypcKjzPRW7st1KQ=; b=Jg3Zq3E8gI9lI8/kAJC6CvJXUEmfaC+EZ2bxl9iW2AUNpU/dpxQ1J+EoNRB9mDXnrt fatl8XMaRMVxo01cOoLUBdRo1BoGe9K38c+CFHvGTzB9NcKrCay+UUJzw5oPnEEDiNph 3VLEtf4kxSSPcT6i/bvYiWFCCVglCq9OJkIyZpimLIOKuTQRauHdL+N848F80KYXfRUE 7MFwAhDszb2eigJGts59YcXbmkXP02dI6rdfMNrPssOzVJ7/gp0W5W6jw0b8zICVKNPq /FgPw/NECq4TN8D66Ta9eRyoER78J9ZEqA6zR6u+9pjRIraDNUqehZJ5Q8nWgFjlVXX8 SdTw== X-Gm-Message-State: AOJu0Yy6d4e2N5g9xQJyLRj64O7ZOco1l5lTU9BFgQZFBleCZTXioeU+ l2SNHArgAqkW3kvNMpQU9tTUBocIjRVeznHBR9cW2Ap+oJXFVWTp0cfOHoYh46EWP8XAELSuOk6 KI5U= X-Google-Smtp-Source: AGHT+IF/DjSZ9TlU8RzYaTs2xfc7AG/OlnD/o/uzXBDGmH5h/z3sctTHZrs3g0EwH4piuEiOTYBzVA== X-Received: by 2002:a05:6902:138f:b0:e28:e746:1f07 with SMTP id 3f1490d57ef6-e28fe44a3b4mr3939626276.16.1728505892716; Wed, 09 Oct 2024 13:31:32 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e2904e29b3csm314355276.56.2024.10.09.13.31.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:32 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:31 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 10/11] pack-bitmap.c: record whether the result was filtered Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In a following commit, we will prepare to mark objects for reuse which are stored as deltas, but whose base object wasn't included in the output pack (e.g., because the client already has it). We can't, however, take advantage of that optimization when the traversal removed objects from the result due to either (a) object filtering, or (b) --unpacked. That's because in either case, we can't rely on the principle that "if an object appears in the 'haves' bitmap, that the client already has that object", since they may not have it as a result of filtering. Keep track of whether or not we performed any object filtering during our traversal in preparation to actually implement thin delta conversion. Signed-off-by: Taylor Blau --- pack-bitmap.c | 12 ++++++++++++ 1 file changed, 12 insertions(+) diff --git a/pack-bitmap.c b/pack-bitmap.c index 8326a20979e..67ea267ed2a 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -112,6 +112,12 @@ struct bitmap_index { /* "have" bitmap from the last performed walk */ struct bitmap *haves; + /* + * Whether the last performed walk had objects removed from + * 'result' due to object filtering. + */ + int filtered; + /* Version of the bitmap index */ unsigned int version; }; @@ -1684,6 +1690,8 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, bitmap_unset(to_filter, pos); } + bitmap_git->filtered = 1; + bitmap_free(tips); } @@ -1776,6 +1784,8 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, bitmap_unset(to_filter, pos); } + bitmap_git->filtered = 1; + bitmap_free(tips); } @@ -1892,6 +1902,8 @@ static void filter_packed_objects_from_bitmap(struct bitmap_index *bitmap_git, if (has_object_pack(&eindex->objects[i]->oid)) bitmap_unset(result, objects_nr + i); } + + bitmap_git->filtered = 1; } struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs, From patchwork Wed Oct 9 20:31:34 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13829216 Received: from mail-yb1-f178.google.com (mail-yb1-f178.google.com [209.85.219.178]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1DC6F1E1A30 for ; Wed, 9 Oct 2024 20:31:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.178 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505898; cv=none; b=QEd+x3WKhTLaFX3UBIVGG8sHjjNkTrMEHGGhU2OuE7+EAXAFeHmquRV8R6azGzNCYJ9rNtJv6VHEFSL1FO+fr5XnC9F6leIxfWWkgHRfKWVrJKaiFWdzf0QAiT0k//vGw0cZTgFdHKXGgwogpBlq06xQWs+kKYyQ6QslXKxTxu4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1728505898; c=relaxed/simple; bh=BNl7ZVvmZtGlpb2WVtte+Daz769a56Jp6Pj6rbe1IM8=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=rHRng0j/BjHTlPqSfVXcCeP2ueQ8XmnpUAPZ07n0B88LRWMG6Ltl9YXqz8r/4n23BQPoGdOwe6Vt24m+P7zJ4eXIwsiSbD4oKvqWWl3DsTxBT2QvbcHUhnF08+W0lbexx2voTsM+YVvYnWAxnhMNf2WjPcf1f2fQ/2c0EcaR9Dc= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=c837JVy9; arc=none smtp.client-ip=209.85.219.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="c837JVy9" Received: by mail-yb1-f178.google.com with SMTP id 3f1490d57ef6-e28fe3b02ffso153858276.3 for ; Wed, 09 Oct 2024 13:31:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1728505896; x=1729110696; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=RinAmfBl92v9ExenHxp8lXmwNKwUnMvFVISnlM0937w=; b=c837JVy9VbeH+zqftjB9WE6g2d4ZPZzIN0zYVT9ZRvzDA5JlT8QVEmD/gEhWrZ+5zF nWfVLBwoIUUnGklMKyxGr38gYqULJlBFa8GCR16quHzchTHNNSWArr3U4OiO4USHagsx AQSpMAylchBkEF0ft5Nsbs/kh+4QW7kF2HrdaM+E6rWColwOvirFBy3smrCRMNCiLwIL PfZpI3IExleR9xN2pridQxmbOFsPQMkg20ZOXTo/cUELR7dW4tBFM/APUgeKD70bDRQB 2MJs8fhHzhcP/weK4/QRW26iv/xM7JR3GU33xc8AmgmVUkoyWA1NHtuK2B3h6asgnbEr yl6Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728505896; x=1729110696; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=RinAmfBl92v9ExenHxp8lXmwNKwUnMvFVISnlM0937w=; b=j1xlYfru1fHvn3oaO8ScRiYqk+ztzT6bfHGSLZJDhhdoTpX+2dPh2vEaxWXtfVbMfM +GlA5o4pWd1xTMvcjq9aCLYZUKukHN1OmtSKrJGliPDj/GxZunjJxsyULmIi4EOCfKGp z5ymvtRzzAKfAnaOYinVq3Hfwbm+Pb1a+yyxWXR+XL5D6DbqtUniq1ZfAgRK3kopXwzx h4vfPEowKhgBnOjP/pF9XUJPvqPSsa/zSPnXkZXvTruDfvO5RctqtX5aMVQCQEH0SBbO 6+M6Zt5syLh/Q6EXYBeGvNl4bbtlIgPn4NxiZhl7KXxgIQG95LW+SbNzqSdIpUSeYCzA 4aAg== X-Gm-Message-State: AOJu0YyJHqONcpulka4ZoPKHmJWJ/rvxbvQdHd0hMJuBa7mYwIP5rhqY /QYGL3qVduT3DIy+NLoJC2Rbb3St58bkmr43zwVBqpVRhM61kAHF2LQ8POluK2oIKST+B30C0CW mNf8= X-Google-Smtp-Source: AGHT+IHgSA+l33FOTkc5f3wT8JY5YCAp0e3cgIHhl63radMsGIvatS0hFPm2JbzBZpMRnZXa/XiQGA== X-Received: by 2002:a05:690c:4703:b0:6e2:43ea:55f with SMTP id 00721157ae682-6e3221f7bcamr31019847b3.38.1728505895860; Wed, 09 Oct 2024 13:31:35 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6e31ce7ba14sm5856547b3.42.2024.10.09.13.31.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 09 Oct 2024 13:31:35 -0700 (PDT) Date: Wed, 9 Oct 2024 16:31:34 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano , Patrick Steinhardt Subject: [PATCH 11/11] pack-bitmap: enable reusing deltas with base objects in 'haves' bitmap Message-ID: <487258bca343078bcb59f22ac0a9a69d69f3c284.1728505840.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Ever since the current verbatim pack-reuse strategy was devised in bb514de356 (pack-objects: improve partial packfile reuse, 2019-12-18), we have skipped sending delta objects whenever we're not sending that object's base. But it's fine to send such an object as a REF_DELTA against the base object, provided the following are true: - We know that the client has the object already, i.e. it appears in the 'haves' bitmap. - The client supports thin packs, i.e. 'pack-objects' was invoked with the '--thin' option. - The client did not request object filtering, in which case we cannot trust that they actually do have all of the objects in the 'haves' bitmap, since we only filter the 'result' bitmap. When all of those criteria are met, we can send the delta object verbatim, meaning that we can reuse more objects from existing packs via the verbatim reuse mechanism, which should be faster than kicking those objects back to the slow path. Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 3 +- pack-bitmap.c | 52 ++++++++++++++++++++--------- pack-bitmap.h | 3 +- t/t5332-multi-pack-reuse.sh | 66 +++++++++++++++++++++++++++++++++++++ 4 files changed, 107 insertions(+), 17 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index dbcd632483e..027c64f931f 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -4099,7 +4099,8 @@ static int get_object_list_from_bitmap(struct rev_info *revs) &reuse_packfiles_nr, &reuse_packfile_bitmap, &reuse_as_ref_delta_packfile_bitmap, - allow_pack_reuse == MULTI_PACK_REUSE); + allow_pack_reuse == MULTI_PACK_REUSE, + thin); if (reuse_packfiles) { reuse_packfile_objects = bitmap_popcount(reuse_packfile_bitmap); diff --git a/pack-bitmap.c b/pack-bitmap.c index 67ea267ed2a..e8094453ca3 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -2131,6 +2131,7 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, off_t offset, struct bitmap *reuse, struct bitmap *reuse_as_ref_delta, + int thin_deltas, struct pack_window **w_curs) { off_t delta_obj_offset; @@ -2145,7 +2146,7 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, if (type == OBJ_REF_DELTA || type == OBJ_OFS_DELTA) { off_t base_offset; uint32_t base_bitmap_pos; - int cross_pack; + int wants_base, cross_pack; /* * Find the position of the base object so we can look it up @@ -2164,19 +2165,25 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, return 0; /* - * And finally, if we're not sending the base as part of our - * reuse chunk, then don't send this object either. The base - * would come after us, along with other objects not - * necessarily in the pack, which means we'd need to convert - * to REF_DELTA on the fly. Better to just let the normal - * object_entry code path handle it. + * And finally, if we're not sending the base as part of + * our reuse chunk, then either convert the delta to a + * REF_DELTA if the client supports thin deltas, or + * don't send this object either. */ - if (!bitmap_get(reuse, base_bitmap_pos)) - return 0; - + wants_base = bitmap_get(reuse, base_bitmap_pos); cross_pack = base_bitmap_pos < pack->bitmap_pos; - if (cross_pack) + + if (!wants_base) { + if (!thin_deltas) + return 0; + if (!bitmap_git->haves || + !bitmap_get(bitmap_git->haves, base_bitmap_pos)) + return 0; + bitmap_set(reuse_as_ref_delta, bitmap_pos); + } else if (cross_pack) { + bitmap_set(reuse_as_ref_delta, bitmap_pos); + } } /* * If we got here, then the object is OK to reuse. Mark it. @@ -2188,7 +2195,8 @@ static int try_partial_reuse(struct bitmap_index *bitmap_git, static void reuse_partial_packfile_from_bitmap_1(struct bitmap_index *bitmap_git, struct bitmapped_pack *pack, struct bitmap *reuse, - struct bitmap *reuse_as_ref_delta) + struct bitmap *reuse_as_ref_delta, + int thin_deltas) { struct bitmap *result = bitmap_git->result; struct pack_window *w_curs = NULL; @@ -2256,7 +2264,7 @@ static void reuse_partial_packfile_from_bitmap_1(struct bitmap_index *bitmap_git if (try_partial_reuse(bitmap_git, pack, bit_pos, ofs, reuse, reuse_as_ref_delta, - &w_curs) < 0) { + thin_deltas, &w_curs) < 0) { /* * try_partial_reuse indicated we couldn't reuse * any bits, so there is no point in trying more @@ -2292,7 +2300,8 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, size_t *packs_nr_out, struct bitmap **reuse_out, struct bitmap **reuse_as_ref_delta_out, - int multi_pack_reuse) + int multi_pack_reuse, + int thin_deltas) { struct repository *r = the_repository; struct bitmapped_pack *packs = NULL; @@ -2375,9 +2384,22 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, reuse = bitmap_word_alloc(word_alloc); reuse_as_ref_delta = bitmap_word_alloc(word_alloc); + if (bitmap_git->filtered) { + /* + * If the bitmap traversal filtered objects, then we + * can't trust the client actually has all of the + * objects that appear in the 'haves' bitmap. In that + * case, pretend like we don't support thin-deltas, + * since we can't guarantee that the client has all of + * the objects we think it has. + */ + thin_deltas = 0; + } + for (i = 0; i < packs_nr; i++) reuse_partial_packfile_from_bitmap_1(bitmap_git, &packs[i], - reuse, reuse_as_ref_delta); + reuse, reuse_as_ref_delta, + thin_deltas); if (bitmap_is_empty(reuse)) { free(packs); diff --git a/pack-bitmap.h b/pack-bitmap.h index e7962ac90e8..1a204fec31e 100644 --- a/pack-bitmap.h +++ b/pack-bitmap.h @@ -87,7 +87,8 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git, size_t *packs_nr_out, struct bitmap **reuse_out, struct bitmap **reuse_as_ref_delta_out, - int multi_pack_reuse); + int multi_pack_reuse, + int thin_deltas); int rebuild_existing_bitmaps(struct bitmap_index *, struct packing_data *mapping, kh_oid_map_t *reused_bitmaps, int show_progress); void free_bitmap_index(struct bitmap_index *); diff --git a/t/t5332-multi-pack-reuse.sh b/t/t5332-multi-pack-reuse.sh index 6edff02d124..6237c680ae9 100755 --- a/t/t5332-multi-pack-reuse.sh +++ b/t/t5332-multi-pack-reuse.sh @@ -51,6 +51,33 @@ test_pack_objects_reused () { git index-pack --strict -o got.idx got.pack } +# test_pack_objects_reused_thin +test_pack_objects_reused_thin () { + : >trace2.txt && + GIT_TRACE2_EVENT="$PWD/trace2.txt" \ + git pack-objects --thin --delta-base-offset --stdout --revs \ + >got.pack && + + test_pack_reused "$1" +test_pack_objects_reused_filter () { + : >trace2.txt && + GIT_TRACE2_EVENT="$PWD/trace2.txt" \ + git pack-objects --thin --delta-base-offset --stdout --revs \ + --thin --filter="$1" \ + >got.pack && + + test_pack_reused "$2" f && + git add f && + test_tick && + git commit -m "base" && + base="$(git rev-parse HEAD)" && + + test_seq 32 >f && + test_tick && + git commit -a -m "delta" && + delta="$(git rev-parse HEAD)" && + + git repack -ad && + + test_commit other && + + pack=$(git pack-objects --all --unpacked $packdir/pack) && + git multi-pack-index write --bitmap \ + --preferred-pack=pack-$pack.pack && + + have_delta "$(git rev-parse $delta:f)" "$(git rev-parse $base:f)" && + + cat >in <<-EOF && + $delta + ^$base + EOF + + test_pack_objects_reused_thin 3 1