From patchwork Thu Jun 6 23:04:25 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689023 Received: from mail-qt1-f175.google.com (mail-qt1-f175.google.com [209.85.160.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C0BE513A3F1 for ; Thu, 6 Jun 2024 23:04:28 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715070; cv=none; b=N8pTwCINqZ0PWXWpwin/jDm2lDfE0z6/d/qXAWvd0v3SnaiF8cVc9UrOEy5SqggKe9HEzJNHSCpinH6gTVkwI0BskkQ/XQT+YDcVelWCRxvkoE8DsRdcmnOY3Ug3TqvFI/PANviB3AzlVbS3RVcVPzV4LAxtriqoh+MvS/56N2Q= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715070; c=relaxed/simple; bh=wmc1CedVA/kuRiNysYPzAKgToMkFKojZvSYewPyUlyg=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=OxKuY8buocaotqtNEQCfnAo+QJ1gUrfyymY/latN2BhJ4DrDdQQPaRXU6n2h80JKVuh0BDos+qxg7hEKii1ubmDXWDOfhOzn/tWBYCCZroC7cwanOSR1ISKvDitidY15l1mZ58V5nf4rrXEpEU9g9r0JEZD9omLqFYMPAqC22Bg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=eBDWaMiS; arc=none smtp.client-ip=209.85.160.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="eBDWaMiS" Received: 
by mail-qt1-f175.google.com with SMTP id d75a77b69052e-44026036ea8so7443571cf.0 for ; Thu, 06 Jun 2024 16:04:28 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715067; x=1718319867; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=ms/sbqYvdaFtTktSi0nsCL9HM3ughK0GKElkwM5WjN0=; b=eBDWaMiSfFtGzIGCAuoql6+JE/3P4LPUn9VDvkx0//ytDLLtS6UCRLsLZ6bAlvQ2Gf Ig2q0q5hJT1JlYiQ0vQr/MbjiDAuDapQIaqYhG5aGZAPmG2Ss9ItJX7qa+FvXrJKyR1F K6eR9LVfp8YkuvG+fbE4+IRqrzSpr4KpuOsg2X/StVoDe4nd4Ui7kwc0hPpGRRbmyt7o H6tTPW1YnQPDj3kdeAvx8l0iTHd46w1fzEpsKyw02y5B7qCJmHXIRhIr4nxW3VrF/Jrm +gv89oQdF4ZPml7nWlT94G0RZW1C+VERh4GVDh9loetxoBAtAuv5+TbdHolP33CwsCP5 lxhQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715067; x=1718319867; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=ms/sbqYvdaFtTktSi0nsCL9HM3ughK0GKElkwM5WjN0=; b=DgTfUugkasppKL5PsysLO/xW9/xyxSJ4R5JVr904HVVbc/hDwoFRJyrqano8+ZoO7/ TkGcfPPei1WaCgjY8xMZ11t5/aqx1gI3CNa/wjXBVYF2m+HOuFCaxh1HTTOdfZ15gpPJ epQQjDyDqukGHQ3Jq6zAVKmriMsZpO0mjXRIm7+fveQPMRKiFghjIMyUk9Qgpltz5vI+ 4TnWw2UuErGt9fOw1J8+034E36eFHqmchXmE1jCpBQfxKUvlE9muS9opSSyGrTG/o1UZ MuXHQDzwYe6FfYcLiXFimATHXkii44HBx8Qxa74zn0EVz7ToJSO6qwZhK1c2ZvEgY8yM kNYg== X-Gm-Message-State: AOJu0Yxng4m5yoLAGoDRUdfde21xyZx+t46OG8j/x0B56cG7ZRBD+ZPh 4hEsiX888tzmxuWWCSXnsMO2REHJ3cvnMNL2cUP8s58g3fgWt6f/gh9tH4cxvr5buavTclfvKP0 MFVA= X-Google-Smtp-Source: AGHT+IFrTNCktzRabGmGJXqnWilfVMehkKHxcEJvyo52wxTzlkKu0sc/VluAv07ItWFgYVFnGirCLQ== X-Received: by 2002:a05:622a:107:b0:43e:391a:1a20 with SMTP id d75a77b69052e-44041c544b1mr13684361cf.15.1717715067067; Thu, 06 Jun 2024 16:04:27 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-7953287548esm102847485a.64.2024.06.06.16.04.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:04:26 -0700 (PDT) Date: Thu, 6 Jun 2024 19:04:25 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 01/19] Documentation: describe incremental MIDX format Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Prepare to implement incremental multi-pack indexes (MIDXs) over the next several commits by first describing the relevant prerequisites (like a new chunk in the MIDX format, the directory structure for incremental MIDXs, etc.) The format is described in detail in the patch contents below, but the high-level description is as follows. Incremental MIDXs live in $GIT_DIR/objects/pack/multi-pack-index.d, and each `*.midx` within that directory has a single "parent" MIDX, which is the MIDX layer immediately before it in the MIDX chain. The chain order resides in a file 'multi-pack-index-chain' in the same directory. Signed-off-by: Taylor Blau --- Documentation/technical/multi-pack-index.txt | 100 +++++++++++++++++++ 1 file changed, 100 insertions(+) diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt index f2221d2b44..d05e3d6dd9 100644 --- a/Documentation/technical/multi-pack-index.txt +++ b/Documentation/technical/multi-pack-index.txt @@ -61,6 +61,106 @@ Design Details - The MIDX file format uses a chunk-based approach (similar to the commit-graph file) that allows optional data to be added. +Incremental multi-pack indexes +------------------------------ + +As repositories grow in size, it becomes more expensive to write a +multi-pack index (MIDX) that includes all packfiles. 
To accommodate +this, the "incremental multi-pack indexes" feature allows for combining +a "chain" of multi-pack indexes. + +Each individual component of the chain need only contain a small number +of packfiles. Appending to the chain does not invalidate earlier parts +of the chain, so repositories can control how much time is spent +updating the MIDX chain by determining the number of packs in each layer +of the MIDX chain. + +=== Design state + +At present, the incremental multi-pack indexes feature is missing two +important components: + + - The ability to rewrite earlier portions of the MIDX chain (i.e., to + "compact" some collection of adjacent MIDX layers into a single + MIDX). At present the only supported way of shrinking a MIDX chain + is to rewrite the entire chain from scratch without the `--split` + flag. ++ +There are no fundamental limitations that stand in the way of being able +to implement this feature. It is omitted from the initial implementation +in order to reduce the complexity, but will be added later. + + - Support for reachability bitmaps. The classic single MIDX + implementation does support reachability bitmaps (see the section + titled "multi-pack-index reverse indexes" in + linkgit:gitformat-pack[5] for more details). ++ +As above, there are no fundamental limitations that stand in the way of +extending the incremental MIDX format to support reachability bitmaps. +The design below specifically takes this into account, and support for +reachability bitmaps will be added in a future patch series. It is +omitted from this series for the same reason as above. ++ +In brief, to support reachability bitmaps with the incremental MIDX +feature, the concept of the pseudo-pack order is extended across each +layer of the incremental MIDX chain to form a concatenated pseudo-pack +order. 
This concatenation takes place in the same order as the chain +itself (in other words, the concatenated pseudo-pack order for a chain +`{$H1, $H2, $H3}` would be the pseudo-pack order for `$H1`, followed by +the pseudo-pack order for `$H2`, followed by the pseudo-pack order for +`$H3`). ++ +The layout will then be extended so that each layer of the incremental +MIDX chain can write a `*.bitmap`. The objects in each layer's bitmap +are offset by the number of objects in the previous layers of the chain. + +=== File layout + +Instead of storing a single `multi-pack-index` file (with an optional +`.rev` and `.bitmap` extension) in `$GIT_DIR/objects/pack`, incremental +MIDXs are stored in the following layout: + +---- +$GIT_DIR/objects/pack/multi-pack-index.d/ +$GIT_DIR/objects/pack/multi-pack-index.d/multi-pack-index-chain +$GIT_DIR/objects/pack/multi-pack-index.d/multi-pack-index-$H1.midx +$GIT_DIR/objects/pack/multi-pack-index.d/multi-pack-index-$H2.midx +$GIT_DIR/objects/pack/multi-pack-index.d/multi-pack-index-$H3.midx +---- + +The `multi-pack-index-chain` file contains a list of the incremental +MIDX files in the chain, in order. The above example shows a chain whose +`multi-pack-index-chain` file would contain the following lines: + +---- +$H1 +$H2 +$H3 +---- + +The `multi-pack-index-$H1.midx` file contains the first layer of the +multi-pack-index chain. The `multi-pack-index-$H2.midx` file contains +the second layer of the chain, and so on. + +=== Object positions for incremental MIDXs + +In the original multi-pack-index design, we refer to objects via their +lexicographic position (by object IDs) within the repository's singular +multi-pack-index. In the incremental multi-pack-index design, we refer +to objects via their index into a concatenated lexicographic ordering +among each component in the MIDX chain. 
+ +If `objects_nr()` is a function that returns the number of objects in a +given MIDX layer, then the index of an object at lexicographic position +`i` within, say, $H3 is defined as: + +---- +objects_nr($H2) + objects_nr($H1) + i +---- + +(in the C implementation, this is often computed as `i + +m->num_objects_in_base`). + Future Work ----------- From patchwork Thu Jun 6 23:04:29 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689025 Received: from mail-oa1-f41.google.com (mail-oa1-f41.google.com [209.85.160.41]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1EC61535C8 for ; Thu, 6 Jun 2024 23:04:31 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.41 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715073; cv=none; b=n1lc0sk7KlE2FCwr/ISkmuhkND1MlvJbIboMW4ajYNDRPOvwx+jZQq/x1Lr/SqnCD68TFRIwyaCTDg4bea70j4p29ZWXwSeLp12wwM0wF6aGWQ5hMPUK0gNdtCA15h+kgNzoT+tq2HQfZ3MzfAOeyMe+jfCoCMnS3EwEpSTy0n4= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715073; c=relaxed/simple; bh=N3Z1PZ72V5odW9/MQjFz+L5vJqRbXcSiFIR91zHCaL4=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=XcOkzdysi11TLlu8+YVcO6IHZgrRBw2R7ddgWf1v6hD+ZSuTKB3q+nn7XX+RFRT1EIUic4y1L+nxI6aUcBu04KBFtng1LeNx20M8DngIjDIurcO+OIRR6qXPl3e9eYtjCliNPCiwoF9rFWeuTpoWjA7fzkKdBbrvELcnxZ21Bmo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=qwSLVh9f; arc=none smtp.client-ip=209.85.160.41 Authentication-Results: 
smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="qwSLVh9f" Received: by mail-oa1-f41.google.com with SMTP id 586e51a60fabf-24c9f2b7b19so800890fac.0 for ; Thu, 06 Jun 2024 16:04:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715071; x=1718319871; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=+NUBPmlrNk7t035RPVDJ8n/QQM107S7lc+8eGlqfjLY=; b=qwSLVh9fbD51Dco013aqlKsY4KG378E6j0AM3lsB7CDhFBZZjFh6YjTpbUppPA2fAY XfnCHs77QejJvqoHZU9hBP2YWiAwXs0Kp5ztu+ZO0ds9sjHmjBvVH8/bctc/hK/UpOSL zdK+c6gHWNyS3gpXrFX5gUI8Mk5p4O+7CfqlPt9tYSBHMtzI6ne48K1Bm3mH0kK21+/E NQ7nzdcgsr9GQXvH1n4fRBzBMa8WOpTO32mpkRmLkgWVnJiwiOSaHQnFM9UWwb+eoFDt soaI7PEeR2Esv6wmYTQZDeyyMjhxDiWDdCi1LpSx+XbFvdztCHJbEc7FhJT2eeGjiJ3R Sb1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715071; x=1718319871; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=+NUBPmlrNk7t035RPVDJ8n/QQM107S7lc+8eGlqfjLY=; b=kjXcMVPGt+aJexwrA3rXdR23vbNWTqEEoHhrogP3px5xh7BSTx40FtiIqyH0ZbqjxB PKCgoA42hn2W6W/0dhjt8ttZSUYfdVBHIPvvfT4sZwgVcjRn3tt6/yLWP+YFNWEYF7cg dnPs0HErsX/RSkcDEFXFsEeFdl+76GI1QTXfp7EAbCeHcrL7Xu9mkt752ECFJlulaaEV fXvYMHKzLV+O+XvlV8x+5ry3LS2m4G5wvDsDR0E8uXeYeuXkKbfNKtJYTMUXlH1V7t6L p5Jiknx7F7CCRVEE84locjsnDuCgMsbDTb7lZXTKsAEoP/1v3H84kkDIEpaDcFYfvzt/ VTmQ== X-Gm-Message-State: AOJu0Yw7sqB9swW+FXWrrF9Dw8pRLUAwhhoBbmsWKZWrmJ2jCrpanqr0 
iwjdL+5bBj0zGBV3+Bwem9NIYsPaeFwzE8U5YHKYFbIqvsh7RK3KL1sUffAvZCOrfH9D+ODibSw 3j+c= X-Google-Smtp-Source: AGHT+IHExkUtA8VVFPRnrpJadtRnAXhUXNhLv3VUml4j4IXJURtVRV4tbNxLH5NZrGsbz6QWtoMlDQ== X-Received: by 2002:a05:6870:c1d4:b0:250:129a:d0ef with SMTP id 586e51a60fabf-25464e9e9a6mr937817fac.36.1717715070846; Thu, 06 Jun 2024 16:04:30 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-795394389e5sm52735185a.15.2024.06.06.16.04.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:04:30 -0700 (PDT) Date: Thu, 6 Jun 2024 19:04:29 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 02/19] midx: add new fields for incremental MIDX chains Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The incremental MIDX chain feature is designed around the idea of indexing into a concatenated lexicographic ordering of object IDs present in the MIDX. When given an object position, the MIDX machinery needs to be able to locate both (a) which MIDX layer contains the given object, and (b) at what position *within that MIDX layer* that object appears. To do this, three new fields are added to the `struct multi_pack_index`: - struct multi_pack_index *base_midx; - uint32_t num_objects_in_base; - uint32_t num_packs_in_base; These three fields store the pieces of information suggested by their respective field names. In turn, the `num_objects_in_base` and `num_packs_in_base` fields are used to crawl backwards along the `base_midx` pointer to locate the appropriate position for a given object within the MIDX that contains it. 
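
As a rough illustration of how the three fields relate across a chain, consider the sketch below. It is not code from this series: `struct midx_layer` and `midx_layer_set_base()` are hypothetical names, and the assumption that each "in base" count is the running total over all earlier layers is taken from the `i + m->num_objects_in_base` formula in the documentation patch.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for `struct multi_pack_index`. */
struct midx_layer {
	uint32_t num_objects;         /* objects stored in this layer only */
	uint32_t num_packs;           /* packs stored in this layer only */

	struct midx_layer *base_midx; /* previous layer, NULL for the first */
	uint32_t num_objects_in_base; /* objects in all earlier layers */
	uint32_t num_packs_in_base;   /* packs in all earlier layers */
};

/*
 * Stack `layer` on top of `base` (which may be NULL) and derive the
 * three base-related fields from the layer directly below it.
 */
void midx_layer_set_base(struct midx_layer *layer, struct midx_layer *base)
{
	layer->base_midx = base;
	if (!base) {
		layer->num_objects_in_base = 0;
		layer->num_packs_in_base = 0;
		return;
	}
	layer->num_objects_in_base = base->num_objects +
				     base->num_objects_in_base;
	layer->num_packs_in_base = base->num_packs +
				   base->num_packs_in_base;
}
```

Because each layer carries the running totals, a lookup never has to sum over the chain; it only compares a position against the totals of the layer it is currently looking at.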
The following commits will update various parts of the MIDX machinery (as well as their callers from outside of midx.c and midx-write.c) to be aware and make use of these fields when performing object lookups. Signed-off-by: Taylor Blau --- midx.h | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/midx.h b/midx.h index 8554f2d616..020e49f77c 100644 --- a/midx.h +++ b/midx.h @@ -63,6 +63,10 @@ struct multi_pack_index { const unsigned char *chunk_revindex; size_t chunk_revindex_len; + struct multi_pack_index *base_midx; + uint32_t num_objects_in_base; + uint32_t num_packs_in_base; + const char **pack_names; struct packed_git **packs; char object_dir[FLEX_ARRAY]; From patchwork Thu Jun 6 23:04:32 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689026 Received: from mail-qk1-f175.google.com (mail-qk1-f175.google.com [209.85.222.175]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 85980535C8 for ; Thu, 6 Jun 2024 23:04:35 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.175 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715077; cv=none; b=akyUu6StdB66CMN17gN+WoHlgL7qx1Y/BUb8qtk8tzk/tEbMqfCavRTA6gdogOIamLTI9FbzAoh/vzvw+sbxY1oNMzlRN9YtDysr7KnoXVwQoyO7RHj7Ohz1f5mCk11rE0InkcW+czM1p2KUvgJTbqGb17i/IrQEDHVZ4fQIzmU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715077; c=relaxed/simple; bh=aKz6PjgmFYmhXt9wGM4ig7A8ibAxI5QSOQ2/L7icZqM=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=gINIvVbSJVjSGsYSULSCcTZIP+eW5gmXnb5OhUwagqQfn7nd+L8gCeLA5W8NCrgvAneBOI1l2TakrfsIMTOlcVD7HvnRTe+iSmm3nhD90J+XnkM8ZSP02KibN5kfWGesWbRe+EdsQDz/TeKq2/ZFkdz3PVYRga9zGQENmS60IfI= ARC-Authentication-Results: i=1; 
smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=aZrWi9LN; arc=none smtp.client-ip=209.85.222.175 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="aZrWi9LN" Received: by mail-qk1-f175.google.com with SMTP id af79cd13be357-79535efc9d8so68528885a.0 for ; Thu, 06 Jun 2024 16:04:35 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715074; x=1718319874; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=lvfhUyiIBUHEVPo4UsdIp09q3jyUZir7shtDZpw/IOQ=; b=aZrWi9LN8Iz3JeOB8a+qwV+S1LbAnKkYhCz7WKMrAk92nsWR5b0Cvz9p0kyy0djEIm dWHKR/QsCTwQb07hH8u75eh7GpHzyzB/KOT46A77IXZlqfQmI5y+ICvQNe574KqQ/5QB TnYmM0jm5/sHh4A37+snkqeBb6TmRbhLtQHBgskUB9BN64B0Lq8f88sRV9X+tE6YZQHO odH4gYTfgDCAMw/MoeeKQ3FUOKBkBs2gLDFxmWLaiadshbwblHG6xsc1mvtXePY62V50 O5lRuh3okpNP6pl1lwh4BVyGeGOPrIb1DGn01i8E81MtRbr7SrWpStvr+9D7ZXuAHQCU NIOg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715074; x=1718319874; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=lvfhUyiIBUHEVPo4UsdIp09q3jyUZir7shtDZpw/IOQ=; b=HUMrytOG2kSCe5Cbbr12aTfFYOD2TeRWV7A/H+YfYoBQsS1hp642D6Vv8WgABMhPwo rvxIGKhqkn9KD50ra+kCS7hOGBJkfouUI31oJLgX/muztw+rSOC1/4jT12oLux6HXAx5 
/uNSKgNHuePXiASEeE2fhYjU3GrnXuSllTd5XA+sxF9uSXl+7Z6Se3Azq7kNr7CxL69C hRmaj54ACQaonmURGCcBtm+MRT5/XwoNijgSHdRQx5e5c2MzxDRYDaxuu3qfmhE2RdHA p1UsTx/nxCWoHyFjBP72p5AnYvOgSJHb5fpctE/kvZ42quDkpPk78WlyWELqdgMO+lxI do1A== X-Gm-Message-State: AOJu0Yz2ZqPs+cAKGjvPjzJl8DPOJKiQ9Gw/ENNLrAN0JUUob3Ggu/jo p+HerpdZYEz2holJYyrUzJ7zqvAAvY6K+0AzlGj5BAWG+lfLl/Gt2DwLgoCpNQIPgXT6BJqg1xP 3UXk= X-Google-Smtp-Source: AGHT+IFT1a/RrG3pyov6/aMKwV1eSlVDUvQAArF8IcM9Y6xcO0isGfF6q4wYfWWQvwiVFhKPhAVpig== X-Received: by 2002:a05:620a:204d:b0:792:b938:90ef with SMTP id af79cd13be357-7953c6475d7mr68463185a.35.1717715073935; Thu, 06 Jun 2024 16:04:33 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-795331c9320sm103069685a.120.2024.06.06.16.04.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:04:33 -0700 (PDT) Date: Thu, 6 Jun 2024 19:04:32 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 03/19] midx: teach `nth_midxed_pack_int_id()` about incremental MIDXs Message-ID: <9f2cec7aa27b806bedf62826be2a58639717b289.1717715060.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The function `nth_midxed_pack_int_id()` takes in an object position in MIDX lexicographic order and returns an identifier of the pack from which that object was selected in the MIDX. Currently, the given object position is an index into the lexicographic order of objects in a single MIDX. Change this position to instead refer into the concatenated lexicographic order of all MIDXs in a MIDX chain. 
This has two visible effects within the implementation of
`nth_midxed_pack_int_id()`:

  - First, the given position is now an index into the concatenated
    lexicographic order of all MIDXs, in the order in which they appear
    in the MIDX chain.

  - Second, the pack ID returned from this function is now also in the
    concatenated order of packs among all layers of the MIDX chain, in
    the same order that they appear in the MIDX chain.

To do this, introduce the first of two general-purpose helpers,
`midx_for_object()`. It takes a double pointer to a
`struct multi_pack_index` as well as an object `pos` in terms of the
entire MIDX chain[^1].

The function chases down the '->base_midx' field until it finds the MIDX
layer within the chain that contains the given object. It then:

  - modifies the double pointer to point to the containing MIDX, instead
    of the tip of the chain, and

  - returns the MIDX-local position[^2] at which the given object can be
    found.

Use this function within `nth_midxed_pack_int_id()` so that the `pos` it
expects is now relative to the entire MIDX chain, and so that it returns
the appropriate pack position for that object.

[^1]: As a reminder, this means that the object is identified among the
  objects contained in all layers of the incremental MIDX chain, not any
  particular layer. For example, consider a MIDX chain with two
  individual MIDXs, one with 4 objects and another with 3 objects. If
  the MIDX with 4 objects appears earlier in the chain, then asking for
  the sixth object (zero-based position 5) would return the second
  object (MIDX-local position 1) in the MIDX with 3 objects.

[^2]: Building on the previous example, asking for the sixth object in a
  MIDX chain with (4, 3) objects, respectively, would set the double
  pointer to point at the MIDX containing three objects, and would
  return the position of the second object within that MIDX.
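
The (4, 3) example from the footnotes can be traced with a small standalone sketch. It mirrors the loop of `midx_for_object()` from the diff below, with two deliberate differences: the layer type is a pared-down hypothetical `struct midx_layer`, and the hypothetical `layer_for_object()` returns -1 where the real helper calls `BUG()` or `die()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, pared-down layer; only the fields the lookup needs. */
struct midx_layer {
	uint32_t num_objects;
	uint32_t num_objects_in_base;
	struct midx_layer *base_midx;
};

/*
 * Walk from the tip of the chain towards its base until reaching the
 * layer that contains chain-wide position `pos`.
 *
 * On success, return 0, point *layer at the containing layer, and store
 * the layer-local position in *local_pos. On failure, return -1 and
 * leave both outputs untouched.
 */
int layer_for_object(struct midx_layer **layer, uint32_t pos,
		     uint32_t *local_pos)
{
	struct midx_layer *m = *layer;

	/* Positions below `num_objects_in_base` live in an earlier layer. */
	while (m && pos < m->num_objects_in_base)
		m = m->base_midx;

	if (!m)
		return -1; /* inconsistent chain */
	if (pos >= m->num_objects + m->num_objects_in_base)
		return -1; /* position past the end of the chain */

	*layer = m;
	*local_pos = pos - m->num_objects_in_base;
	return 0;
}
```

With a tip layer of 3 objects stacked on a base layer of 4 objects, position 5 (the sixth object, counting from one) stays in the tip layer and maps to local position 1, while position 2 falls through to the base layer and keeps local position 2.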
Signed-off-by: Taylor Blau --- midx.c | 23 +++++++++++++++++++++-- 1 file changed, 21 insertions(+), 2 deletions(-) diff --git a/midx.c b/midx.c index bc4797196f..d5828b48fd 100644 --- a/midx.c +++ b/midx.c @@ -240,6 +240,23 @@ void close_midx(struct multi_pack_index *m) free(m); } +static uint32_t midx_for_object(struct multi_pack_index **_m, uint32_t pos) +{ + struct multi_pack_index *m = *_m; + while (m && pos < m->num_objects_in_base) + m = m->base_midx; + + if (!m) + BUG("NULL multi-pack-index for object position: %"PRIu32, pos); + + if (pos >= m->num_objects + m->num_objects_in_base) + die(_("invalid MIDX object position, MIDX is likely corrupt")); + + *_m = m; + + return pos - m->num_objects_in_base; +} + int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id) { struct strbuf pack_name = STRBUF_INIT; @@ -331,8 +348,10 @@ off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos) { - return get_be32(m->chunk_object_offsets + - (off_t)pos * MIDX_CHUNK_OFFSET_WIDTH); + pos = midx_for_object(&m, pos); + + return m->num_packs_in_base + get_be32(m->chunk_object_offsets + + (off_t)pos * MIDX_CHUNK_OFFSET_WIDTH); } int fill_midx_entry(struct repository *r, From patchwork Thu Jun 6 23:04:35 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689027 Received: from mail-qv1-f52.google.com (mail-qv1-f52.google.com [209.85.219.52]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8FF2A535C8 for ; Thu, 6 Jun 2024 23:04:38 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.52 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715080; cv=none; 
b=um9HEJi56kbO2GOKIvq1tjuW3AGQOfTcS9U+/YJGjvjyUgAFPkV7qtqx7CV+kccWGDqEW6kq/S3ZCm9Nk5iUWNboIHy/iO3oy6Z7wNp7PZxG8bUMjlZwONAOUfn0mD1EvxXh8oP0oLXDM214zVlhJGmIX9oP0QyZwakmS68Bebo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715080; c=relaxed/simple; bh=RTtUHAXSC8mQ4jyGxASvFVIMVDdYfXnvRpldziGkEyQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=umqUXrl3EnEV9yh46ST6mQu5ONza9x87DnX0AcGJlNE+edDttCHMvnFRdoZw0yCHxxUoWx4rAFWZiI9rQTOEs45nOWFDUhl6hR2cL+7XI2P8+7PEbJHJ+OYcJf3WuK7WnFG+O2TvnksrvSsC1y+AadsIT6/WFES6xjQfiWTm688= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=TVvNlCuh; arc=none smtp.client-ip=209.85.219.52 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="TVvNlCuh" Received: by mail-qv1-f52.google.com with SMTP id 6a1803df08f44-6ae1059a62fso9007936d6.1 for ; Thu, 06 Jun 2024 16:04:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715077; x=1718319877; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=jM7q6LAkUvM6xIF32lQYIHwQPR8+fYtMEGh/2AUB5Vg=; b=TVvNlCuhB+zimLfuQblgdl8ZnJD9CYFzphDGmskNi6HH6HGCnk/6edv7OXfM53g/eO vuPaNqXPVv+NLq6IC+sW+KDbROtLnqnNr6pPCXeqd/bi5zXoUNAurERi94tS+KtGZo3i 
XfwPTO7NZ5DvLgBLm+XLqhH8CDadzqhUFVbMInBF2Ml9yc150zrm31fSMCZhJiNxfXB2 4ApVY0PQ/AEOxC5b96ehkYu/LjH7bMfRlxM50qAT+uC62A2rNOvVIFPO+3UrBkANM+IW E+sbjDBaKFL/dTLtktujbmk8dMe4ZwK4W00xB4tpcpe0AjlQv4lykuEhJyq5kbLdzXEz PHmQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715077; x=1718319877; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=jM7q6LAkUvM6xIF32lQYIHwQPR8+fYtMEGh/2AUB5Vg=; b=L3aWY0nIlQnlPGZjmtDAMvzBI3BxlSaBdc4r2/bzxBgyBcXVct79765s344lSftuu7 ph09YaoF/Q7r6jued9btRP5eCO1JxauQcEXMPcfLJxY0FFgBgjf7PGXSaGrlXljSJiVI Hs5nCoKfiqgKoqBt9Uqrv+T7vBrpRibU2c9SGugitYu0tBLrca2H+42Yr2MaPMf6pxMK M39LBDq+5eiEPMO6qDS1gnxyE2tgJm3kEIXfh1R9sRuzqUaD8cgPT0FiaO+eX3QhT5iC o6niUuE2KY7WiFqA91BxmnYcRB3FwvGnzJamNzP2rAZklXDkDyXO2a3SO14tfl0kU/sB GBLA== X-Gm-Message-State: AOJu0YwRXb/rONaxoTU+SYzXJ/UslSKl3mmMEs5PX/HGnp5/JmUtxFMK fCnlju4KD0aX/aFJ5QtfgJGvfxzY8GE+sSDUR+KlfscDzcA5ruEeF6VcChpBuJLIEoJvKSNMCIt 4v0o= X-Google-Smtp-Source: AGHT+IEVvRN2jbFqLqg76rpn9ak97gAA7g/GHcSXhoEh9jq+G+p2f8vbaHe5FCDpPFwdBFobru618A== X-Received: by 2002:a05:6214:5984:b0:6af:bffd:d1d9 with SMTP id 6a1803df08f44-6b059b3e4d0mr11811356d6.3.1717715077079; Thu, 06 Jun 2024 16:04:37 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6b04f98499fsm10533556d6.80.2024.06.06.16.04.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:04:36 -0700 (PDT) Date: Thu, 6 Jun 2024 19:04:35 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 04/19] midx: teach `prepare_midx_pack()` about incremental MIDXs Message-ID: <97661bb0de966354cc19d37c70d18f324d82f693.1717715060.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The function `prepare_midx_pack()` is part of the midx.h API and loads the pack identified by the MIDX-local 'pack_int_id'. This patch prepares that function to be aware of an incremental MIDX world. To do this, introduce the second of the two general purpose helpers mentioned in the previous commit. This commit introduces `midx_for_pack()`, which is the pack-specific analog of `midx_for_object()`, and works in the same fashion. Like `midx_for_object()`, this function chases down the '->base_midx' field until it finds the MIDX layer within the chain that contains the given pack. Use this function within `prepare_midx_pack()` so that the `pack_int_id` it expects is now relative to the entire MIDX chain, and that it prepares the given pack in the appropriate MIDX. 
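
To see how chain-wide pack IDs round-trip, here is a minimal sketch under simplifying assumptions. `struct midx_layer`, `global_pack_id()` and `layer_for_pack()` are hypothetical names; the first mirrors the `m->num_packs_in_base + ...` addition made to `nth_midxed_pack_int_id()` in the previous patch, and the second mirrors the loop of `midx_for_pack()` but returns -1 instead of calling `BUG()` or `die()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, pared-down layer; only the pack-related fields. */
struct midx_layer {
	uint32_t num_packs;
	uint32_t num_packs_in_base;
	struct midx_layer *base_midx;
};

/* Layer-local pack ID -> chain-wide pack ID. */
uint32_t global_pack_id(const struct midx_layer *m, uint32_t local_id)
{
	return m->num_packs_in_base + local_id;
}

/*
 * Chain-wide pack ID -> containing layer plus layer-local pack ID.
 * Returns 0 on success and -1 if the ID is out of range or the chain
 * is inconsistent; the outputs are only written on success.
 */
int layer_for_pack(struct midx_layer **layer, uint32_t pack_int_id,
		   uint32_t *local_id)
{
	struct midx_layer *m = *layer;

	while (m && pack_int_id < m->num_packs_in_base)
		m = m->base_midx;

	if (!m)
		return -1;
	if (pack_int_id >= m->num_packs + m->num_packs_in_base)
		return -1;

	*layer = m;
	*local_id = pack_int_id - m->num_packs_in_base;
	return 0;
}
```

The local ID is what indexes per-layer arrays such as `m->packs[]` and `m->pack_names[]`, which is why `prepare_midx_pack()` switches to `local_pack_int_id` for those accesses.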
Signed-off-by: Taylor Blau
---
 midx.c | 33 +++++++++++++++++++++++++--------
 1 file changed, 25 insertions(+), 8 deletions(-)

diff --git a/midx.c b/midx.c
index d5828b48fd..7fa3a1a7f8 100644
--- a/midx.c
+++ b/midx.c
@@ -257,20 +257,37 @@ static uint32_t midx_for_object(struct multi_pack_index **_m, uint32_t pos)
 	return pos - m->num_objects_in_base;
 }
 
-int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id)
+static uint32_t midx_for_pack(struct multi_pack_index **_m,
+			      uint32_t pack_int_id)
 {
-	struct strbuf pack_name = STRBUF_INIT;
-	struct packed_git *p;
+	struct multi_pack_index *m = *_m;
+	while (m && pack_int_id < m->num_packs_in_base)
+		m = m->base_midx;
 
-	if (pack_int_id >= m->num_packs)
+	if (!m)
+		BUG("NULL multi-pack-index for pack ID: %"PRIu32, pack_int_id);
+
+	if (pack_int_id >= m->num_packs + m->num_packs_in_base)
 		die(_("bad pack-int-id: %u (%u total packs)"),
-		    pack_int_id, m->num_packs);
+		    pack_int_id, m->num_packs + m->num_packs_in_base);
 
-	if (m->packs[pack_int_id])
+	*_m = m;
+
+	return pack_int_id - m->num_packs_in_base;
+}
+
+int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
+		      uint32_t pack_int_id)
+{
+	struct strbuf pack_name = STRBUF_INIT;
+	struct packed_git *p;
+	uint32_t local_pack_int_id = midx_for_pack(&m, pack_int_id);
+
+	if (m->packs[local_pack_int_id])
 		return 0;
 
 	strbuf_addf(&pack_name, "%s/pack/%s", m->object_dir,
-		    m->pack_names[pack_int_id]);
+		    m->pack_names[local_pack_int_id]);
 
 	p = add_packed_git(pack_name.buf, pack_name.len, m->local);
 	strbuf_release(&pack_name);
@@ -279,7 +296,7 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t
 		return 1;
 
 	p->multi_pack_index = 1;
-	m->packs[pack_int_id] = p;
+	m->packs[local_pack_int_id] = p;
 	install_packed_git(r, p);
 	list_add_tail(&p->mru, &r->objects->packed_git_mru);
 

From patchwork Thu Jun 6 23:04:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13689028
Date: Thu, 6 Jun 2024 19:04:38 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 05/19] midx: teach `nth_midxed_object_oid()` about incremental MIDXs
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

The function `nth_midxed_object_oid()` returns the object ID for a given
object position in the MIDX lexicographic order.

Teach this function to instead operate over the concatenated
lexicographic order defined in an earlier step so that it is able to be
used with incremental MIDXs.

To do this, we need to both (a) adjust the bounds check for the given
'n', and (b) record the MIDX-local position after chasing the
`->base_midx` pointer to find the MIDX which contains that object.
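As an illustration only, the two steps described here (a chain-wide bounds check, then translation to a layer-local position) can be sketched outside of git as follows. `struct layer`, `layer_for_object()` and `locate_object()` are hypothetical stand-ins for `struct multi_pack_index` and `midx_for_object()`; this is a sketch assuming each layer records how many objects live in the layers beneath it, not the code from this series.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, pared-down stand-in for git's `struct multi_pack_index`. */
struct layer {
	uint32_t num_objects;         /* objects stored in this layer only */
	uint32_t num_objects_in_base; /* objects in all layers below this one */
	struct layer *base;           /* next-older layer, or NULL */
};

/*
 * Map a position in the concatenated order to a layer-local position.
 * On return, *lp points at the layer that contains the object.
 */
static uint32_t layer_for_object(struct layer **lp, uint32_t pos)
{
	struct layer *l = *lp;

	/* Positions below num_objects_in_base belong to an older layer. */
	while (l && pos < l->num_objects_in_base)
		l = l->base;

	assert(l != NULL);
	assert(pos < l->num_objects + l->num_objects_in_base);

	*lp = l;
	return pos - l->num_objects_in_base;
}

/* Returns 1 and fills the outputs if `n` is in range, 0 otherwise. */
static int locate_object(struct layer *tip, uint32_t n,
			 struct layer **layer_out, uint32_t *local_out)
{
	/* Bounds check against the whole chain, not just the tip layer. */
	if (n >= tip->num_objects + tip->num_objects_in_base)
		return 0;

	*local_out = layer_for_object(&tip, n);
	*layer_out = tip;
	return 1;
}
```

With a three-object base layer and a two-object tip, position 1 resolves to the base layer at local position 1, and position 4 resolves to the tip layer at local position 1.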
Signed-off-by: Taylor Blau
---
 midx.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index 7fa3a1a7f8..cfab7f8113 100644
--- a/midx.c
+++ b/midx.c
@@ -335,9 +335,11 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid,
 					struct multi_pack_index *m,
 					uint32_t n)
 {
-	if (n >= m->num_objects)
+	if (n >= m->num_objects + m->num_objects_in_base)
 		return NULL;
 
+	n = midx_for_object(&m, n);
+
 	oidread(oid, m->chunk_oid_lookup + st_mult(m->hash_len, n));
 	return oid;
 }

From patchwork Thu Jun 6 23:04:41 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13689029
Date: Thu, 6 Jun 2024 19:04:41 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 06/19] midx: teach `nth_bitmapped_pack()` about incremental MIDXs
Message-ID: <4e960edf8a1018b8425d4d6ae5bb3553bfd38023.1717715060.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

In a similar fashion as in previous commits, teach the function
`nth_bitmapped_pack()` about incremental MIDXs by translating the given
`pack_int_id` from the concatenated lexical order to a MIDX-local
lexical position.

When accessing the containing MIDX's array of packs, use the local pack
ID. Likewise, when reading the 'BTMP' chunk, use the MIDX-local offset
when accessing the data within that chunk.
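To make the "MIDX-local offset" point concrete, here is a minimal standalone sketch of reading one entry from a layer's 'BTMP' data. It assumes each entry is two 32-bit big-endian integers (the bitmap position followed by the bit count), which is what the reads in the diff suggest; `BTMP_ENTRY_WIDTH`, `read_be32()`, `struct bitmap_range` and `read_btmp_entry()` are hypothetical names, not git's.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed entry width: two 32-bit big-endian fields per pack. */
#define BTMP_ENTRY_WIDTH (2 * sizeof(uint32_t))

struct bitmap_range {
	uint32_t bitmap_pos; /* first bit belonging to the pack */
	uint32_t bitmap_nr;  /* number of bits belonging to the pack */
};

/* Read a 32-bit big-endian integer from an unaligned buffer. */
static uint32_t read_be32(const unsigned char *p)
{
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/*
 * `chunk` is one layer's BTMP data; `local_pack_id` must already be
 * layer-local (global ID minus the number of packs in older layers).
 */
static struct bitmap_range read_btmp_entry(const unsigned char *chunk,
					   uint32_t local_pack_id)
{
	const unsigned char *entry = chunk + BTMP_ENTRY_WIDTH * local_pack_id;
	struct bitmap_range r;

	r.bitmap_pos = read_be32(entry);
	r.bitmap_nr = read_be32(entry + sizeof(uint32_t));
	return r;
}
```

Indexing with the global pack ID instead would step past the end of a layer's chunk whenever older layers contain packs, which is why the translation has to happen before the read.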
Signed-off-by: Taylor Blau
---
 midx.c | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/midx.c b/midx.c
index cfab7f8113..cdc754af97 100644
--- a/midx.c
+++ b/midx.c
@@ -308,17 +308,19 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m,
 int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m,
 		       struct bitmapped_pack *bp, uint32_t pack_int_id)
 {
+	uint32_t local_pack_int_id = midx_for_pack(&m, pack_int_id);
+
 	if (!m->chunk_bitmapped_packs)
 		return error(_("MIDX does not contain the BTMP chunk"));
 
 	if (prepare_midx_pack(r, m, pack_int_id))
 		return error(_("could not load bitmapped pack %"PRIu32), pack_int_id);
 
-	bp->p = m->packs[pack_int_id];
+	bp->p = m->packs[local_pack_int_id];
 	bp->bitmap_pos = get_be32((char *)m->chunk_bitmapped_packs +
-				  MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * pack_int_id);
+				  MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * local_pack_int_id);
 	bp->bitmap_nr = get_be32((char *)m->chunk_bitmapped_packs +
-				 MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * pack_int_id +
+				 MIDX_CHUNK_BITMAPPED_PACKS_WIDTH * local_pack_int_id +
 				 sizeof(uint32_t));
 	bp->pack_int_id = pack_int_id;
 

From patchwork Thu Jun 6 23:04:45 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13689030
Date: Thu, 6 Jun 2024 19:04:45 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 07/19] midx: introduce `bsearch_one_midx()`
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

The `bsearch_midx()` function will be extended in a following commit to
search for the location of a given object ID across all MIDXs in a chain
(or the single non-chain MIDX if no chain is available).
While most callers will naturally want to use the updated
`bsearch_midx()` function, there are a handful of special cases that
will want finer control and will only want to search through a single
MIDX.

One such case is the object abbreviation code, which cares about object
IDs near where we'd expect to find a match in a MIDX. In that case, we
want to look at the nearby matches in each layer of the MIDX chain, not
just a single one.

Split the more fine-grained control out into a separate function called
`bsearch_one_midx()` which searches only a single MIDX.

At present both `bsearch_midx()` and `bsearch_one_midx()` have identical
behavior, but the following commit will rewrite the former to be aware
of incremental MIDXs for the remaining non-special case callers.

Signed-off-by: Taylor Blau
---
 midx.c        | 17 +++++++--
 midx.h        |  5 ++-
 object-name.c | 99 +++++++++++++++++++++++++++------------------------
 3 files changed, 71 insertions(+), 50 deletions(-)

diff --git a/midx.c b/midx.c
index cdc754af97..1b4a9d5d00 100644
--- a/midx.c
+++ b/midx.c
@@ -327,10 +327,21 @@ int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m,
 	return 0;
 }
 
-int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result)
+int bsearch_one_midx(const struct object_id *oid, struct multi_pack_index *m,
+		     uint32_t *result)
 {
-	return bsearch_hash(oid->hash, m->chunk_oid_fanout, m->chunk_oid_lookup,
-			    the_hash_algo->rawsz, result);
+	int ret = bsearch_hash(oid->hash, m->chunk_oid_fanout,
+			       m->chunk_oid_lookup, the_hash_algo->rawsz,
+			       result);
+	if (result)
+		*result += m->num_objects_in_base;
+	return ret;
+}
+
+int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m,
+		 uint32_t *result)
+{
+	return bsearch_one_midx(oid, m, result);
 }
 
 struct object_id *nth_midxed_object_oid(struct object_id *oid,
diff --git a/midx.h b/midx.h
index 020e49f77c..46c53d69ff 100644
--- a/midx.h
+++ b/midx.h
@@ -90,7 +90,10 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local
 int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id);
 int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m,
 		       struct bitmapped_pack *bp, uint32_t pack_int_id);
-int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result);
+int bsearch_one_midx(const struct object_id *oid, struct multi_pack_index *m,
+		     uint32_t *result);
+int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m,
+		 uint32_t *result);
 off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos);
 uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos);
 struct object_id *nth_midxed_object_oid(struct object_id *oid,
diff --git a/object-name.c b/object-name.c
index 523af6f64f..3307d5200c 100644
--- a/object-name.c
+++ b/object-name.c
@@ -132,28 +132,32 @@ static int match_hash(unsigned len, const unsigned char *a, const unsigned char
 static void unique_in_midx(struct multi_pack_index *m,
 			   struct disambiguate_state *ds)
 {
-	uint32_t num, i, first = 0;
-	const struct object_id *current = NULL;
-	int len = ds->len > ds->repo->hash_algo->hexsz ?
-		ds->repo->hash_algo->hexsz : ds->len;
-	num = m->num_objects;
+	for (; m; m = m->base_midx) {
+		uint32_t num, i, first = 0;
+		const struct object_id *current = NULL;
+		int len = ds->len > ds->repo->hash_algo->hexsz ?
+			ds->repo->hash_algo->hexsz : ds->len;
 
-	if (!num)
-		return;
+		num = m->num_objects + m->num_objects_in_base;
 
-	bsearch_midx(&ds->bin_pfx, m, &first);
+		if (!num)
+			continue;
 
-	/*
-	 * At this point, "first" is the location of the lowest object
-	 * with an object name that could match "bin_pfx". See if we have
-	 * 0, 1 or more objects that actually match(es).
-	 */
-	for (i = first; i < num && !ds->ambiguous; i++) {
-		struct object_id oid;
-		current = nth_midxed_object_oid(&oid, m, i);
-		if (!match_hash(len, ds->bin_pfx.hash, current->hash))
-			break;
-		update_candidates(ds, current);
+		bsearch_one_midx(&ds->bin_pfx, m, &first);
+
+		/*
+		 * At this point, "first" is the location of the lowest
+		 * object with an object name that could match
+		 * "bin_pfx". See if we have 0, 1 or more objects that
+		 * actually match(es).
+		 */
+		for (i = first; i < num && !ds->ambiguous; i++) {
+			struct object_id oid;
+			current = nth_midxed_object_oid(&oid, m, i);
+			if (!match_hash(len, ds->bin_pfx.hash, current->hash))
+				break;
+			update_candidates(ds, current);
+		}
 	}
 }
 
@@ -706,37 +710,40 @@ static int repo_extend_abbrev_len(struct repository *r UNUSED,
 static void find_abbrev_len_for_midx(struct multi_pack_index *m,
 				     struct min_abbrev_data *mad)
 {
-	int match = 0;
-	uint32_t num, first = 0;
-	struct object_id oid;
-	const struct object_id *mad_oid;
+	for (; m; m = m->base_midx) {
+		int match = 0;
+		uint32_t num, first = 0;
+		struct object_id oid;
+		const struct object_id *mad_oid;
 
-	if (!m->num_objects)
-		return;
+		if (!m->num_objects)
+			continue;
 
-	num = m->num_objects;
-	mad_oid = mad->oid;
-	match = bsearch_midx(mad_oid, m, &first);
+		num = m->num_objects + m->num_objects_in_base;
+		mad_oid = mad->oid;
+		match = bsearch_one_midx(mad_oid, m, &first);
 
-	/*
-	 * first is now the position in the packfile where we would insert
-	 * mad->hash if it does not exist (or the position of mad->hash if
-	 * it does exist). Hence, we consider a maximum of two objects
-	 * nearby for the abbreviation length.
-	 */
-	mad->init_len = 0;
-	if (!match) {
-		if (nth_midxed_object_oid(&oid, m, first))
-			extend_abbrev_len(&oid, mad);
-	} else if (first < num - 1) {
-		if (nth_midxed_object_oid(&oid, m, first + 1))
-			extend_abbrev_len(&oid, mad);
+		/*
+		 * first is now the position in the packfile where we
+		 * would insert mad->hash if it does not exist (or the
+		 * position of mad->hash if it does exist). Hence, we
+		 * consider a maximum of two objects nearby for the
+		 * abbreviation length.
+		 */
+		mad->init_len = 0;
+		if (!match) {
+			if (nth_midxed_object_oid(&oid, m, first))
+				extend_abbrev_len(&oid, mad);
+		} else if (first < num - 1) {
+			if (nth_midxed_object_oid(&oid, m, first + 1))
+				extend_abbrev_len(&oid, mad);
+		}
+		if (first > 0) {
+			if (nth_midxed_object_oid(&oid, m, first - 1))
+				extend_abbrev_len(&oid, mad);
+		}
+		mad->init_len = mad->cur_len;
 	}
-	if (first > 0) {
-		if (nth_midxed_object_oid(&oid, m, first - 1))
-			extend_abbrev_len(&oid, mad);
-	}
-	mad->init_len = mad->cur_len;
 }
 
 static void find_abbrev_len_for_pack(struct packed_git *p,

From patchwork Thu Jun 6 23:04:48 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13689031
Date: Thu, 6 Jun 2024 19:04:48 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 08/19] midx: teach `bsearch_midx()` about incremental MIDXs
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

Now that the special-case callers of `bsearch_midx()` have been dealt
with, teach `bsearch_midx()` to handle incremental MIDX chains.
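For readers who want to experiment with the idea outside of git, a chain-aware search can be sketched roughly as follows. This is an assumption-laden toy: sorted `uint32_t` keys stand in for object IDs, and `struct layer`, `search_one_layer()` and `search_chain()` are hypothetical names playing the roles of `struct multi_pack_index`, `bsearch_one_midx()` and `bsearch_midx()`.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical layer: a sorted array of keys standing in for object IDs. */
struct layer {
	const uint32_t *keys;         /* sorted keys stored in this layer */
	uint32_t num_objects;         /* length of `keys` */
	uint32_t num_objects_in_base; /* keys in all older layers */
	struct layer *base;           /* next-older layer, or NULL */
};

/*
 * Binary-search one layer. Returns 1 on an exact match. *result receives
 * the match (or insertion) position, offset into the concatenated order.
 */
static int search_one_layer(uint32_t key, const struct layer *l,
			    uint32_t *result)
{
	uint32_t lo = 0, hi = l->num_objects;
	int found = 0;

	while (lo < hi) {
		uint32_t mid = lo + (hi - lo) / 2;
		if (l->keys[mid] == key) {
			lo = mid;
			found = 1;
			break;
		}
		if (l->keys[mid] < key)
			lo = mid + 1;
		else
			hi = mid;
	}
	if (result)
		*result = lo + l->num_objects_in_base;
	return found;
}

/* Search from the tip toward the base, stopping at the first match. */
static int search_chain(uint32_t key, const struct layer *tip,
			uint32_t *result)
{
	const struct layer *l;

	for (l = tip; l; l = l->base)
		if (search_one_layer(key, l, result))
			return 1;
	return 0;
}
```

In this sketch, when no layer contains the key, `*result` is left holding the insertion position from the last layer searched, so callers should only trust it when the return value is 1.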
The incremental MIDX-aware version of `bsearch_midx()` works by
repeatedly searching for a given OID in each layer along the
`->base_midx` pointer, stopping either when an exact match is found, or
the end of the chain is reached.

Signed-off-by: Taylor Blau
---
 midx.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/midx.c b/midx.c
index 1b4a9d5d00..7c4f58f7f1 100644
--- a/midx.c
+++ b/midx.c
@@ -341,7 +341,10 @@ int bsearch_one_midx(const struct object_id *oid, struct multi_pack_index *m,
 int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m,
 		 uint32_t *result)
 {
-	return bsearch_one_midx(oid, m, result);
+	for (; m; m = m->base_midx)
+		if (bsearch_one_midx(oid, m, result))
+			return 1;
+	return 0;
 }
 
 struct object_id *nth_midxed_object_oid(struct object_id *oid,

From patchwork Thu Jun 6 23:04:51 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13689032
Date: Thu, 6 Jun 2024 19:04:51 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 09/19] midx: teach `nth_midxed_offset()` about incremental MIDXs
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

In a similar fashion as in previous commits, teach the function
`nth_midxed_offset()` about incremental MIDXs.

The given object `pos` is used to find the containing MIDX, and
translated back into a MIDX-local position by assigning the return value
of `midx_for_object()` to it.
Signed-off-by: Taylor Blau --- midx.c | 2 ++ 1 file changed, 2 insertions(+) diff --git a/midx.c b/midx.c index 7c4f58f7f1..d351dbb7e0 100644 --- a/midx.c +++ b/midx.c @@ -365,6 +365,8 @@ off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos) const unsigned char *offset_data; uint32_t offset32; + pos = midx_for_object(&m, pos); + offset_data = m->chunk_object_offsets + (off_t)pos * MIDX_CHUNK_OFFSET_WIDTH; offset32 = get_be32(offset_data + sizeof(uint32_t)); From patchwork Thu Jun 6 23:04:54 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689033 Received: from mail-qk1-f170.google.com (mail-qk1-f170.google.com [209.85.222.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3214E13C69B for ; Thu, 6 Jun 2024 23:04:56 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715098; cv=none; b=JPLvSr9FAnPFUyWpYa9ZgCS06q3HbXMrvJT69POMtkS0ocOvrx2MJDGwXTm2+7KoljSG8EwKZI2CtV4/8PgyWX+HiiBb5tcyl8f6mDKPUaTiiCT+jMEmRJZJYPRRCW1xdA5fy842ROVD5PRSc981h6BVIXsB+1d09KTAbNdSM1E= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715098; c=relaxed/simple; bh=GaWcuw/GtMJiNkkzeiniXl46oLK22OBHh5f0vhTNJ2c=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=KHajEdAoPnv7aaZOlahd180RYX1NIU+gvrO9R0zxRT4CQX1R/5oAgpVGUqXMZKM58ItvCIEaHBcaHftksKVnrh94qEBPzr8/PpMynYStQr0OKil7vLXAueTcmrGuMuV4XeAkcotGZgwUKamWv4zJZxrzPeWrsq2L/VQTMv5CZnk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com 
header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=jGKpPd9V; arc=none smtp.client-ip=209.85.222.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="jGKpPd9V" Received: by mail-qk1-f170.google.com with SMTP id af79cd13be357-7951e24db3bso84908285a.3 for ; Thu, 06 Jun 2024 16:04:56 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715096; x=1718319896; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=7kTlWimwQGMqq0Er0wME7lSQg8iz6k/KH6Kxn+SJg/Y=; b=jGKpPd9VzA11/QrsV7nZxNJyIG7IIh4BCvPsAAAyQ/6jdY0T8JxW3kP8Qlxn/TvDgV pK1FTOM2ZS9Ree1FWhyatY614bv3usg3lPaQEkqKukuFMyW0XncmWzFBRyN0n2rdmwTM poWRYN7h1uMqI0GyR949HmlOqb7xQ0AC2NCCaXk0LsenxEvEYxhCeldKJeDeiaRsmbja RHyImwQnOf6JDcJOp3glQVz59YEnoeh0sJL9VF6RnyRtaMCnMPoXy8c6um9l+8brTGo0 c6yJC59VxkQHiV+Bt0QP2KsCuC0SpQ4GTHRvnnqX5h/6KBdLm8HDjdHCkp111pC9SgK7 sLRg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715096; x=1718319896; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=7kTlWimwQGMqq0Er0wME7lSQg8iz6k/KH6Kxn+SJg/Y=; b=mmGc/kJxBlpCEMnvQLCPlApX82sTNxYjltuKVmJhiJiIARcbi+voI4DTJr06jg/tkD H51Wz5DVMFfHMYAupM5y0dCnWy0bqtqTtNwCyHv7VsuKQ8JxletuBtK/MtZKDdDG9Qjb B2HxQzCRsPD/6mtdBGdZ2LKQ9TInIyFT6F9U5StDJM/HNnNTqtyxMnQAkq8AenpugCLi tLSlLPP2xHOPcG+PSDZdNJANxenN3kSs6s+9oXDw36AbwAIHLFzgLhNMD9AwsR8IacJQ CojgicTk8/m2V2z2aA707hrk6b5ofiBXUmXUAV+rKltogUJS8u3Pa+H1Uzb9sI7TztJe Ss4Q== 
X-Gm-Message-State: AOJu0YziB+YlTTuDFi5lrLVPyPt2dgz5Hka54acgQxGjMoVo3abFfc19 v5iATgnDE/AxDdfH3pqE7x3VXbF/5yVTIbcqoI+a1wQ68jrHRhIe6ANvaGLZF+05MosGMFmCiem M8ZE= X-Google-Smtp-Source: AGHT+IFDQLKO/1qN8RKxhPU4QS+JQst6FFi8bakNMOhRy6NpUmCt6OzAYeKkgQx8H3kvW8aZu7kemg== X-Received: by 2002:a05:622a:1792:b0:43a:4b3c:3a85 with SMTP id d75a77b69052e-44041a80fa6mr11920981cf.0.1717715095738; Thu, 06 Jun 2024 16:04:55 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-44042fe10aasm1303441cf.15.2024.06.06.16.04.55 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:04:55 -0700 (PDT) Date: Thu, 6 Jun 2024 19:04:54 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 10/19] midx: teach `fill_midx_entry()` about incremental MIDXs Message-ID: <984ca9dc2dbc1777586424243b8a6c1aeb565eec.1717715060.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: In a similar fashion as in previous commits, teach the `fill_midx_entry()` function to work in an incremental MIDX-aware fashion. This function, unlike others which accept an index into either the lexical order of objects or packs, takes in an object_id, and attempts to fill a caller-provided 'struct pack_entry' with the remaining pieces of information about that object from the MIDX. The function uses `bsearch_midx()` which fills out the frame-local 'pos' variable, recording the given object_id's lexical position within the MIDX chain, if found (if no matching object ID was found, we'll return immediately without filling out the `pack_entry` structure). 
Once given that position, we jump back through the `->base_midx` pointer to ensure that our `m` points at the MIDX layer which contains the given object_id (and not an ancestor or descendant of it in the chain). Note that we can drop the bounds check "if (pos >= m->num_objects)" because `midx_for_object()` performs this check for us. After that point, we only need to make two special considerations within this function: - First, the pack_int_id returned to us by `nth_midxed_pack_int_id()` is a position in the concatenated lexical order of packs, so we must ensure that we subtract `m->num_packs_in_base` before accessing the MIDX-local `packs` array. - Second, we must avoid translating the `pos` back to a MIDX-local index, since we use it as an argument to `nth_midxed_offset()` which expects a position relative to the concatenated lexical order of objects. Signed-off-by: Taylor Blau --- midx.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-) diff --git a/midx.c b/midx.c index d351dbb7e0..c3802354e3 100644 --- a/midx.c +++ b/midx.c @@ -403,14 +403,12 @@ int fill_midx_entry(struct repository *r, if (!bsearch_midx(oid, m, &pos)) return 0; - if (pos >= m->num_objects) - return 0; - + midx_for_object(&m, pos); pack_int_id = nth_midxed_pack_int_id(m, pos); if (prepare_midx_pack(r, m, pack_int_id)) return 0; - p = m->packs[pack_int_id]; + p = m->packs[pack_int_id - m->num_packs_in_base]; /* * We are about to tell the caller where they can locate the From patchwork Thu Jun 6 23:04:57 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689034 Received: from mail-qv1-f51.google.com (mail-qv1-f51.google.com [209.85.219.51]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 8F17D13C69B for ; Thu, 6 Jun 2024 23:05:01 +0000 (UTC) Authentication-Results: 
smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.51 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715103; cv=none; b=a1JXykSPH038avCuEeWyBqVGsqfBDBQupU2LX7iNQkexvP0L+nCsiT/oFxpzOLqFX9bQYM7aQA6oNMoTzZgKd4u3UYjZgZQEZ9ijIrANT21+IVNmS8SURXw7UTg+kuoW3Av8WWpDmJ15red04jGOJZhdNi03WZbyktLHG2UUhVw= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715103; c=relaxed/simple; bh=Jx+EKax+f7OjPev5cLBiFJyyZy6Esium833KHPLgd44=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=bSYRrPIP9cfyBaC/7ZuOdNMWtLXvRcqrgmIt9tpZ3lOXmH+SMNw7Djq6WtOKysWxvzWRO2TzpQt1WAHvxKZRetW175ssXffB5vCIHK025upnuWlR+I+G0uQKdhG61uIrpskS9KRvkgp3kNWxN202/eHklKrAwxxYuZl9TR5ois8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=Z0MPs3WW; arc=none smtp.client-ip=209.85.219.51 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="Z0MPs3WW" Received: by mail-qv1-f51.google.com with SMTP id 6a1803df08f44-6afc61f9a2eso12149646d6.0 for ; Thu, 06 Jun 2024 16:05:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715100; x=1718319900; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=lkCtS+Wk30MAcBbc1irqppF/G8S5cfZAS08t28BM9eY=; 
b=Z0MPs3WWG+lon20HV70ysuM0GEpuMaAr4GcwMlBXYHPCBSLTSkkabqwGg905J3cwAy Kge4pjepFterkjT4pqcWrfDP7lJJPrjJvoN+tLnFrbjfkTdflG2/GBEp16qMzoNDoOpc L4CVgC9xr+OoZfT/M4q4d8MgR49jbmQygwukDJ8aB0y1GLIS9x/KWN/tKfN9wBLVkKbe el3U731fxP3YCcGMELLJGdahh3um250SV9pxWid8QgpDCHQa0KIiY/8ry8SiuX5TR79G WF6YrAhkwjinSjuO7wDxC3LLRIqQYBIvTqRPYNThCdIyeuYHAMoZrD14nfF/4xBJlyps zPaQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715100; x=1718319900; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=lkCtS+Wk30MAcBbc1irqppF/G8S5cfZAS08t28BM9eY=; b=PKJ4qhyXrFN8poVD4uLEcyOJXpouf17oE2oKoDVJ8tTqPbSNnqtLnDurdfxMwWuraf CCaxx/vbEEObmsRi0NDecvQf92plpiv+ueijIdF7t/oY3stcgxUtljIc6GHMdXlDpN4e +pE92Opu4TJslAwJfxgUUU97ZrZoCqZPSTgejJ5p3hXGR0n1lFWDUuBPLWVdwFacnZxF UE2IFXTKS6s2I/2n3/6Qix8SD2HMosSx+x/n/foLi/rozaLcSAUOX6bulNAKbc5qw3hB apnxEPLwnjBTZ1zeLvUUJ47o/uk63tPXQure78WnOPWHXx5S0NcwUw6YQBC1d8cyCk8O bFKg== X-Gm-Message-State: AOJu0YwQHn2Wx81dZAObzX9uGh3aOxcohD6ymIXGTOfBpLUbPHcb4+LD 60IFvO4gz3AIkjlmGu3UJomR1W6tjyfeabzK9ucM1miYvnZ1xlpEpTIuFwMRzCWKEwnp/lSKHIT FnMQ= X-Google-Smtp-Source: AGHT+IGpk6a6r0GZ1sKdY6nLj0Y2bBg83rGG4AHLc9mE+T2p7Iq2UEQ2Er6dbjiYR7MA6esWuH2jOQ== X-Received: by 2002:a05:6214:3217:b0:6af:c64c:d197 with SMTP id 6a1803df08f44-6b05952a874mr18647676d6.8.1717715099795; Thu, 06 Jun 2024 16:04:59 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6b04f621640sm10554406d6.19.2024.06.06.16.04.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:04:58 -0700 (PDT) Date: Thu, 6 Jun 2024 19:04:57 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 11/19] midx: remove unused `midx_locate_pack()` Message-ID: <02b1d39fd2aa0da5c2c2591aae1268498e70aa05.1717715060.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Commit 307d75bbe6 (midx: implement `midx_locate_pack()`, 2023-12-14) introduced `midx_locate_pack()`, which was described at the time as a complement to the function `midx_contains_pack()` which allowed callers to determine where in the MIDX lexical order a pack appeared, as opposed to whether or not it was simply contained. 307d75bbe6 suggests that future patches would be added which would introduce callers for this new function, but none ever were, meaning the function has gone unused since its introduction. Clean this up by in effect reverting 307d75bbe6, which removes the unused function and inlines its definition back into `midx_contains_pack()`. (Looking back through the list archives from when 307d75bbe6 was written, this was in preparation for this[1] patch from back when we had the concept of "disjoint" packs while developing multi-pack verbatim reuse. That concept was abandoned before the series was merged, but I never dropped what would become 307d75bbe6 from the series, leading to the state prior to this commit). 
[1]: https://lore.kernel.org/git/3019738b52ba8cd78ea696a3b800fa91e722eb66.1701198172.git.me@ttaylorr.com/ Signed-off-by: Taylor Blau --- midx.c | 13 ++----------- midx.h | 2 -- 2 files changed, 2 insertions(+), 13 deletions(-) diff --git a/midx.c b/midx.c index c3802354e3..186d8344dc 100644 --- a/midx.c +++ b/midx.c @@ -462,8 +462,7 @@ int cmp_idx_or_pack_name(const char *idx_or_pack_name, return strcmp(idx_or_pack_name, idx_name); } -int midx_locate_pack(struct multi_pack_index *m, const char *idx_or_pack_name, - uint32_t *pos) +int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name) { uint32_t first = 0, last = m->num_packs; @@ -474,11 +473,8 @@ int midx_locate_pack(struct multi_pack_index *m, const char *idx_or_pack_name, current = m->pack_names[mid]; cmp = cmp_idx_or_pack_name(idx_or_pack_name, current); - if (!cmp) { - if (pos) - *pos = mid; + if (!cmp) return 1; - } if (cmp > 0) { first = mid + 1; continue; @@ -489,11 +485,6 @@ int midx_locate_pack(struct multi_pack_index *m, const char *idx_or_pack_name, return 0; } -int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name) -{ - return midx_locate_pack(m, idx_or_pack_name, NULL); -} - int midx_preferred_pack(struct multi_pack_index *m, uint32_t *pack_int_id) { if (m->preferred_pack_idx == -1) { diff --git a/midx.h b/midx.h index 46c53d69ff..86af7dfc5e 100644 --- a/midx.h +++ b/midx.h @@ -102,8 +102,6 @@ struct object_id *nth_midxed_object_oid(struct object_id *oid, int fill_midx_entry(struct repository *r, const struct object_id *oid, struct pack_entry *e, struct multi_pack_index *m); int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name); -int midx_locate_pack(struct multi_pack_index *m, const char *idx_or_pack_name, - uint32_t *pos); int midx_preferred_pack(struct multi_pack_index *m, uint32_t *pack_int_id); int prepare_multi_pack_index_one(struct repository *r, const char *object_dir, int local); From patchwork Thu Jun 6 23:05:01 
2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689035 Received: from mail-qt1-f173.google.com (mail-qt1-f173.google.com [209.85.160.173]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 1ED05145340 for ; Thu, 6 Jun 2024 23:05:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.173 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715105; cv=none; b=AaboT0R6g9HRTmo2LzV9eLkNFdb/9+tS03LbK4yNQVN2/ijAmrAfYiYy7BGfO1EbrHlQnl1CLNp0Olv6Ma96NETk95cVb5Z7Cil4HaHBJKyi2NVhDJn247n3fdjd7gLsl9VQO5xNZ+TAWpYVKsBcdFun/YHEfkXa+L5adz7ZS40= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715105; c=relaxed/simple; bh=wZkMPJXgxN+jJ/mpkGELC6VTlJjyUb6fGqGFJZJh4N0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=eaD7w6yCS4U2AOzNWfZpvA+EEiTr9VBsyp3Sy5/P7nZKmN+tZukBAoDiVjinDi9R9BB+ij/Avqls+EuasYN497WOYFJngwx+SX0StSVDKS3pPL7MJL+PHZq5kGo0ggBA8EXNigT7wFtxvksJFJnjIYWGVgCsA4J9wQ+ZrV/RmJY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=J5DV97pN; arc=none smtp.client-ip=209.85.160.173 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="J5DV97pN" Received: by mail-qt1-f173.google.com with 
SMTP id d75a77b69052e-44028fc3d22so5987671cf.2 for ; Thu, 06 Jun 2024 16:05:03 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715103; x=1718319903; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=1IsGWTT09U0jo7NzjPDH0aF6BbG115ADPoCDm5mLfg4=; b=J5DV97pN1zCLDdLSiA1VqshBt5T5kV8Dxlh+vltIXECDjrGqtl2ieH4wlcULcbxp7x AdnCvsoDD0nN2sIedINCUeCNYNUKIxq6IFrHC0uJ9Ma0GHjZPfcN5P/+Ji2CEQJ2PViX +jeO5h9UScSq2MMEjVYxGMMUJPsVlocxlH/WrRNijIdmze01Bjd03Nyc9ZuDK5wJXySu CGb1LHqMtaVLfOtqfZj6T5wt32RnS301wE2lAUpylb/bU2pyPimqvjlpCYkiLE0DW+wP Ga94p3pDq4wjZd0HAP+8AddjebWX0QSGFphY/0nigKmrkZmeaHkHyObDEBeX+1L0XExf J4og== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715103; x=1718319903; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=1IsGWTT09U0jo7NzjPDH0aF6BbG115ADPoCDm5mLfg4=; b=sdzDG/yiYdLpAoXs8WJuD5OCeeWGLtNwbQexLFE9DEdTc0gbnoJJ+YSSq8v6WulhzA Cs5T0N9nOC+S4ziFgTzGR89UnIYQwT/U20gyo1PcxMc3bqwgJJbnCLUtw/HQY4NKWVDr JWtcAn/+LLk597U6XOtz7XrTRuJneUqF0OJv3FJzFaiL9O2VinuY8YtxVn/iEl+Uz08M jIqeoqCNvkvTNGSAyexpu+8Iz8peckE3rFfDH5wIMd7PSN6okWsNLPbnRBruCeVUubfW GhELbt7jFQWzMWyrCSxoJ9TIv+/rk/K0H6jD6dMP8YaHqQobkv49Y4JfC9E46quMIFKm OGoQ== X-Gm-Message-State: AOJu0YzAbVhdl8ZDIfPVisYcB3IqsULOcMYVqaRLsMPfCj09h0nROQLZ vOWRY2dy7UPvqkMm1ADRFW+La7JqRtmK5p3TLjItHsmxFvkYzQC66pH/pbd0WZ3TD9gTU+UwFig UzUI= X-Google-Smtp-Source: AGHT+IHnGvKgW8pDFjWJa3AmLYRtcdD20SiJ1k0vGcwQSX0sAiRSEfTMum+yxT4KD74Jo9Pq1zpOkA== X-Received: by 2002:a05:622a:1a09:b0:43e:40bb:a0db with SMTP id d75a77b69052e-44041c12b17mr11429281cf.54.1717715102754; Thu, 06 Jun 2024 16:05:02 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id d75a77b69052e-44038b5e617sm7543131cf.93.2024.06.06.16.05.02 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:05:02 -0700 (PDT) Date: Thu, 6 Jun 2024 19:05:01 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 12/19] midx: teach `midx_contains_pack()` about incremental MIDXs Message-ID: <2288683674ba6858d1ea217dc7f84bff269024b1.1717715060.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Now that the `midx_contains_pack()` versus `midx_locate_pack()` debacle has been cleaned up, teach the former about how to operate in an incremental MIDX-aware world in a similar fashion as in previous commits. Instead of using either of the two `midx_for_object()` or `midx_for_pack()` helpers, this function is split into two: one that determines whether a pack is contained in a single MIDX, and another which calls the former in a loop over all MIDXs. This approach does not require that we change any of the implementation in what is now `midx_contains_pack_1()` as it still operates over a single MIDX. 
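The split described above can be sketched in isolation as follows. This is a simplified illustration, not the code from the patch: `struct toy_layer` and both `toy_*` functions are hypothetical names, and the comparison uses plain `strcmp()` where Git uses `cmp_idx_or_pack_name()` to match either an `.idx` or a `.pack` name.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Toy stand-in for one MIDX layer; NOT Git's `struct multi_pack_index`. */
struct toy_layer {
	struct toy_layer *base;  /* previous (older) layer, or NULL */
	const char **pack_names; /* sorted lexically within this layer */
	uint32_t num_packs;
};

/* Binary search over the pack names of a single layer. */
static int toy_contains_pack_1(const struct toy_layer *m, const char *name)
{
	uint32_t first = 0, last = m->num_packs;

	assert(name != NULL);

	while (first < last) {
		uint32_t mid = first + (last - first) / 2;
		int cmp = strcmp(name, m->pack_names[mid]);

		if (!cmp)
			return 1;
		if (cmp > 0)
			first = mid + 1;
		else
			last = mid;
	}
	return 0;
}

/* Ask every layer in turn, from the tip back to the base. */
static int toy_contains_pack(const struct toy_layer *m, const char *name)
{
	for (; m; m = m->base)
		if (toy_contains_pack_1(m, name))
			return 1;
	return 0;
}
```

Because pack names are only sorted within a layer, a single binary search over the whole chain would not be valid; searching each layer separately keeps the per-layer routine unchanged.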
Signed-off-by: Taylor Blau --- midx.c | 11 ++++++++++- 1 file changed, 10 insertions(+), 1 deletion(-) diff --git a/midx.c b/midx.c index 186d8344dc..564e922533 100644 --- a/midx.c +++ b/midx.c @@ -462,7 +462,8 @@ int cmp_idx_or_pack_name(const char *idx_or_pack_name, return strcmp(idx_or_pack_name, idx_name); } -int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name) +static int midx_contains_pack_1(struct multi_pack_index *m, + const char *idx_or_pack_name) { uint32_t first = 0, last = m->num_packs; @@ -485,6 +486,14 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name) return 0; } +int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name) +{ + for (; m; m = m->base_midx) + if (midx_contains_pack_1(m, idx_or_pack_name)) + return 1; + return 0; +} + int midx_preferred_pack(struct multi_pack_index *m, uint32_t *pack_int_id) { if (m->preferred_pack_idx == -1) { From patchwork Thu Jun 6 23:05:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689036 Received: from mail-qk1-f174.google.com (mail-qk1-f174.google.com [209.85.222.174]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5BD93145340 for ; Thu, 6 Jun 2024 23:05:07 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.174 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715108; cv=none; b=ldkn1dxGFTiQvPYhG9mzT+jevroQSs8Oh7MGiqfrSgKzPBFaEZanTUz6Cc2UpYZqfUQzI8qzWLd3IkNYRLwxjpBTb5mj5S5NjPW3U3ebbFJPR+LNJnt8MqXOlwFDOlxF87KsFdMUnMKGdsc0FPNlPIr+VqeBsWCyH45yEsQFU6Y= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715108; c=relaxed/simple; bh=d7yb1MkZ1pzoP33tv+pwTcQt5kBPD+zsloB2eZbZX3s=; 
h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=oegw6fyLkBj5cghs8kTQuaN3mxD82qO4xp+hHvnvRs0qZe8IuSon59loWiWcbiwrtUD2czLtiIylErNhAGc6zdXpGpg6MCKgSzA4SGrjAI34rL4JH73phZozTbBQbr4P5L8W/UY0mUy/natZi9DLeuBG+q/IenrQk2XFwX2oxy8= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=1RHwaJAc; arc=none smtp.client-ip=209.85.222.174 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="1RHwaJAc" Received: by mail-qk1-f174.google.com with SMTP id af79cd13be357-795004bd75dso92000385a.2 for ; Thu, 06 Jun 2024 16:05:07 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715106; x=1718319906; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=TXcxJyt9JgMl7vfM8SQYzGKD/c7BlhJ4KGIF9sFn1v0=; b=1RHwaJAcTKxfPveVuM0O91oLqSRHyEyg2hH5Slug7Q6NNR5s/NyPbiPkKCawtzaTsF P++/Wxr+GkrK/dtj1Fx86WM0N4JdJ5+8j3kG7UQ4PP5nrrpncMLRzCLhHBxPhCNcJODH UPf0gXHmSv/O7SL3PrTn01kjDMmo3tT/HWdPbbldig6KVZYB8R+lf++jL4W7MRHeHj20 onJe8s9j01iH/pWCjUHplr/KDOWWUtFviKc+3JCTrT5c6HDkVLyqOfATdAdIffAX6bVW dCIUw9UjKeD3V9BpPRaxBSN2Jfuzynek+/PsqD4xVsVOiNpXfGjeqFpUJCTKMyLyh2rZ JDjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715106; x=1718319906; h=in-reply-to:content-disposition:mime-version:references:message-id 
:subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=TXcxJyt9JgMl7vfM8SQYzGKD/c7BlhJ4KGIF9sFn1v0=; b=PHaMBh3aI1JYmAsbQaF40mCiafag5EIr2XFW/FL7/lmrpMP7EgT/EXku/0kEbKGzPO mWu6MG0Oa/FCkxJ+AEUC7h1/XP3FhBGteb81/BD8sr/lNJR31nVeaxnDfiJ95HKkZjTS M+BDN7k3oqNjUhRpAsXcTnHmpvPmqgVcH8UcpSz4H3k3WvpVVacCnUBc3R0cwNTy6u+D 07sFY/Mme41wOghu7f4x6oIq3VnkdLtuXIJxZvm+IuAr5HwuYs/KcV69wM934Jod1cB9 dQNw347gcVwVAfVT41PsVk+0nB+MV0FbVNFx1FiAwfXijcc3+AhhJLPFdYFxqFaK+ngZ 7CoQ== X-Gm-Message-State: AOJu0YwAMXMyKp1U0xKTr78WfFJ33npeOFoIVWUpjJhdokz2lXUGPSqW UWKT7BoJzdKpaU86j/wE+W24SF+0JRfaWMQvQSwLy/wXViUxqj4Gw449PwqNVBoDEa/cQo5ObO1 LyxY= X-Google-Smtp-Source: AGHT+IEIvruCNP2aweSWIkVPz6Kfw6eLPBPzn1yOpCaQBs2MxQZ3IgycMCMNNIpJ1ezL7rrdN0CyEw== X-Received: by 2002:a05:620a:458a:b0:792:91bb:9ca3 with SMTP id af79cd13be357-7953c45fe7fmr118576785a.37.1717715105844; Thu, 06 Jun 2024 16:05:05 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-79532813981sm103624185a.25.2024.06.06.16.05.05 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:05:05 -0700 (PDT) Date: Thu, 6 Jun 2024 19:05:04 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 13/19] midx: teach `midx_preferred_pack()` about incremental MIDXs Message-ID: <53b71c6514dac9abe7472d95842b7be5675b8a4e.1717715060.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The function `midx_preferred_pack()` is used to determine the identity of the preferred pack, which is the identity of a unique pack within the MIDX which is used as a tie-breaker when selecting from which pack to represent an object that appears in multiple packs within the MIDX. 
Historically we have said that the MIDX's preferred pack has the unique property that all objects from that pack are represented in the MIDX. But that isn't quite true: a more precise statement would be that all objects from that pack *which appear in the MIDX* are selected from that pack. This helps us extend the concept of preferred packs across a MIDX chain, where some object(s) in the preferred pack may appear in other packs in an earlier MIDX layer, in which case those object(s) will not appear in a subsequent MIDX layer from either the preferred pack or any other pack. Extend the concept of preferred packs by using the pack which represents the object at the first position in MIDX pseudo-pack order belonging to the current MIDX layer (i.e., at position 'm->num_objects_in_base'). Signed-off-by: Taylor Blau --- midx.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-) diff --git a/midx.c b/midx.c index 564e922533..cb7b623b5d 100644 --- a/midx.c +++ b/midx.c @@ -497,13 +497,16 @@ int midx_contains_pack(struct multi_pack_index *m, const char *idx_or_pack_name) int midx_preferred_pack(struct multi_pack_index *m, uint32_t *pack_int_id) { if (m->preferred_pack_idx == -1) { + uint32_t midx_pos; if (load_midx_revindex(m) < 0) { m->preferred_pack_idx = -2; return -1; } - m->preferred_pack_idx = - nth_midxed_pack_int_id(m, pack_pos_to_midx(m, 0)); + midx_pos = pack_pos_to_midx(m, m->num_objects_in_base); + + m->preferred_pack_idx = nth_midxed_pack_int_id(m, midx_pos); + } else if (m->preferred_pack_idx == -2) return -1; /* no revindex */ From patchwork Thu Jun 6 23:05:07 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689037 Received: from mail-qk1-f170.google.com (mail-qk1-f170.google.com [209.85.222.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS 
id 9B04813DDA0 for ; Thu, 6 Jun 2024 23:05:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.222.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715112; cv=none; b=fW9XvVT+Vb2Oi8Y8MhiJCy4UMNU51/jb8IfI0AxVislcLpyWU24FJGtR7tHpPUh5J9T1GdZAndxXSK2EDmAZviReac6AHLA5oT7R1cZVtbQFa5mIgfDBT9h89ScvIN8NHHKHjuztA3uhuNCvPTg9NLX4Yk7AWdQWXbZdBHMJqCo= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715112; c=relaxed/simple; bh=UO1scOQICroab+Xp3blihB9i14dE8VsRbdZM6aY0DLs=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=rdr4RDur0IbSACE8jlCKg6nhYFKLwOCWxE06HhquOvXgNAgPhspk0+XnL+RpbpjDTnt+SHOUuIEU69LzUfxTu3SgsME/Isn+IYWzVFt4LBBU8blXXz4Zd6UjmMQXewsljUVSCWquBoglthNmjJ5zOm+9CiSgdLs++TcrqnmxtGg= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=pPDWtkWf; arc=none smtp.client-ip=209.85.222.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="pPDWtkWf" Received: by mail-qk1-f170.google.com with SMTP id af79cd13be357-7951713ba08so126263885a.1 for ; Thu, 06 Jun 2024 16:05:10 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715109; x=1718319909; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; 
bh=UG2JDNDcuB3SCaR3reBRe3n+CL4I4BcUSmoDVUVCNGc=; b=pPDWtkWffTFUNboNHD5I9mvMMfcXkzGjyc5Nq51CgUllgKlHrzMm3me74ztA/uiFia Nxx9VYdi1fPvPQyt20uJzqntt5aN/r7AKbSmJ5zRgISyyYNYHcdjXF3ulXtAZffF6aoS uOB4OXNKt7Rgf7om7g9Fc2OZFiWws2fU0Dqf72fxticA8qRwj5wTScVvnRYS8rz8hA79 3uIHPrlgD+n9JFKXKXLPQuyb8jQKTI2tpRPzzaCO/L2TfZynC78vw8pqGIavFsYqCHsp KBMPJymmwg1ojArqd87vs72wI3Ie4JNWaij47GaR9dhZhJPs2fcmoYiITQch87/jUnKq 463g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715109; x=1718319909; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=UG2JDNDcuB3SCaR3reBRe3n+CL4I4BcUSmoDVUVCNGc=; b=jkJL9g04i3xCybLXuWlo+lqa7blQBYJfMs7OTQ0ei4ufWGOR80lMrS4ik0BGY0jIXn dzXbLa4iytVGAy09mz0q+yo1YiP3rT69xhbCBf/Q8VX8h95MAD/qB6tchl6U/4OkL/Pb kkCZd82b3hpIG0UiQKyUTKV3/MoTuvmLTsG8dmPPCSjo1LytOdHxIbyXuo+iNERgv0/v e82xmEt6tqWRgAa5UldKuTm5vwY8kRpKne67nmYOSBIOtk10QkcZJG9aLGj+BqALYdDy NZCuIfHq+J83koKLgsAe6F+6tEvgryGBempX+FpMaiJqGzhlacL0LiKTDHPxmvsvkMHd qaoQ== X-Gm-Message-State: AOJu0Yz6xv3SL4fJFmTVjmOyl+5eL6nYTQ7jiNpNcaWCPWaa6tqCB3qg /Vf7TQfVtrs5P9YOLrwBTqpNoS0tqplMGhwiCeCGGRDbjoPXWA4SAlJlTMJO8wa6iF57mMeLmaG 5TKA= X-Google-Smtp-Source: AGHT+IFwb3hBLEPQnDwNeZGaH2WAyu6s8tPkZ0FNXxx3CYYVwQ1vGoxsLkQustjxNZYsXZklro7ixw== X-Received: by 2002:a05:620a:235:b0:790:9817:309d with SMTP id af79cd13be357-7953af0d49fmr198551685a.9.1717715108873; Thu, 06 Jun 2024 16:05:08 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
[104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-795330b7607sm103079585a.85.2024.06.06.16.05.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:05:08 -0700 (PDT) Date: Thu, 6 Jun 2024 19:05:07 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 14/19] midx: teach `midx_fanout_add_midx_fanout()` about incremental MIDXs Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: The function `midx_fanout_add_midx_fanout()` is used to help construct the fanout table when generating a MIDX by reusing data from an existing MIDX. Prepare this function to work with incremental MIDXs by making a few changes: - The bounds checks need to be adjusted to start object lookups taking into account the number of objects in the previous MIDX layer (i.e., by starting the lookups at position `m->num_objects_in_base` instead of position 0). - Likewise, the bounds checks need to end at `m->num_objects_in_base` objects after `m->num_objects`. - Finally, `midx_fanout_add_midx_fanout()` needs to recur on earlier MIDX layers when dealing with an incremental MIDX chain by calling itself when given a MIDX with a non-NULL `base_midx`. 
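The adjusted bounds in the list above can be worked through with a small sketch. It makes several simplifying assumptions: `struct toy_midx`, `toy_fanout_range()` and `toy_count_fanout()` are hypothetical names, the fanout table is held in host byte order (so no `ntohl()` is needed), and the sketch only counts objects per fanout bucket rather than building `pack_midx_entry` records.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for one MIDX layer; NOT Git's `struct multi_pack_index`. */
struct toy_midx {
	struct toy_midx *base;        /* previous (older) layer, or NULL */
	uint32_t num_objects_in_base; /* objects in all earlier layers */
	uint32_t fanout[256];         /* cumulative counts, host byte order */
};

/*
 * Compute the half-open range [start, end) of chain-wide positions
 * in layer `m` whose object IDs begin with byte `cur_fanout`.
 */
static void toy_fanout_range(const struct toy_midx *m, uint32_t cur_fanout,
			     uint32_t *start, uint32_t *end)
{
	assert(cur_fanout < 256);

	*start = m->num_objects_in_base;
	if (cur_fanout)
		*start += m->fanout[cur_fanout - 1];
	*end = m->num_objects_in_base + m->fanout[cur_fanout];
}

/* Count objects in one fanout bucket across the chain, base layers first. */
static uint32_t toy_count_fanout(const struct toy_midx *m, uint32_t cur_fanout)
{
	uint32_t start, end, n = 0;

	if (m->base)
		n += toy_count_fanout(m->base, cur_fanout);

	toy_fanout_range(m, cur_fanout, &start, &end);
	return n + (end - start);
}
```

For example, with three objects in the base layer and a tip layer whose fanout reports one object under first byte 0x01, the tip's range for that bucket is [3, 4): both ends are shifted by `num_objects_in_base`.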
Signed-off-by: Taylor Blau --- midx-write.c | 12 ++++++++---- 1 file changed, 8 insertions(+), 4 deletions(-) diff --git a/midx-write.c b/midx-write.c index 55a6b63bac..b148ee443a 100644 --- a/midx-write.c +++ b/midx-write.c @@ -180,7 +180,7 @@ static int nth_midxed_pack_midx_entry(struct multi_pack_index *m, struct pack_midx_entry *e, uint32_t pos) { - if (pos >= m->num_objects) + if (pos >= m->num_objects + m->num_objects_in_base) return 1; nth_midxed_object_oid(&e->oid, m, pos); @@ -231,12 +231,16 @@ static void midx_fanout_add_midx_fanout(struct midx_fanout *fanout, uint32_t cur_fanout, int preferred_pack) { - uint32_t start = 0, end; + uint32_t start = m->num_objects_in_base, end; uint32_t cur_object; + if (m->base_midx) + midx_fanout_add_midx_fanout(fanout, m->base_midx, cur_fanout, + preferred_pack); + if (cur_fanout) - start = ntohl(m->chunk_oid_fanout[cur_fanout - 1]); - end = ntohl(m->chunk_oid_fanout[cur_fanout]); + start += ntohl(m->chunk_oid_fanout[cur_fanout - 1]); + end = m->num_objects_in_base + ntohl(m->chunk_oid_fanout[cur_fanout]); for (cur_object = start; cur_object < end; cur_object++) { if ((preferred_pack > -1) && From patchwork Thu Jun 6 23:05:10 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689038 Received: from mail-qv1-f45.google.com (mail-qv1-f45.google.com [209.85.219.45]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 67A6213DDA0 for ; Thu, 6 Jun 2024 23:05:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.45 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715115; cv=none; b=E4pNk4epMhOCze7WTI6im8a2WvloRKEbL584FSk6IcQRORCUobj9oBqttP+rHSp451TuSk86I2U40tfbmtP/a7+qvgWDz2ZivCi2WqjSauA2oS4dqqddgWWgHAGGOEuvRvdIzkx5J0rYyFLsIlXZEPiQUgPBmcX59x2QDdndrLs= 
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715115; c=relaxed/simple; bh=JNOhloDyWt/qlRsBrqDVt8pYwr49wjr1RN5gg0EfUis=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=PPoJE7LCQiMRQplY22M9K/tGlOP++T9lYP8uWYetaJ2kzR1SHIsWbvVgqdhgX6C0dPWaR982DGCK/vB0q70OOyWmWKntogQRujJ2cmapirn0+XwE5AuuVtDv2YLzUDfXrqosgwqbJP0n4gl4peYM7ZnRSpsjNdhrAU45ifZvYLk= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=eRL05wiT; arc=none smtp.client-ip=209.85.219.45 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="eRL05wiT" Received: by mail-qv1-f45.google.com with SMTP id 6a1803df08f44-6b04bacd1e1so8627296d6.0 for ; Thu, 06 Jun 2024 16:05:13 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715112; x=1718319912; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=QUBLukhQJGmKY3hKB7d8XLffYxaP3qSjJueq/wpsCXY=; b=eRL05wiTqCoH1aQygMmXr207UTE5lXxkKKmZzsBxqr24S8tl7qBxYnaib2W2ZG9HR9 fiz+Lnm8eHNc3bgZoELVg2BBDXBbqcd90TkImaSp6W65DdKgaqxaxsDyR57XnCOcDJAj GcOoAcSVUI64D1Kp+VDirsXuwId34PgtfedOwPmLKmtzao2mpJugZgiDyiHdQzUOv4iZ +AUIq9cnx2YsJTozlt2yH/5Wmw7I5LkEKhVVQtaC9wuQXSMb99jbhfo+nXXzF7x0ztuo wSzUJezsgReYu1Avd3dU82Q0dXfV+GP4vOmVVCzIdTLpCh7/Z0pTkBSeyBgqLckeUl67 RUCA== X-Google-DKIM-Signature: 
v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715112; x=1718319912; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=QUBLukhQJGmKY3hKB7d8XLffYxaP3qSjJueq/wpsCXY=; b=bLVuWse6hLTUOVm7lDyZtNPcMIPKjPs+6JGm6QBirp6cLMtzB+P4zf9jy//b3N0ZK0 mYBZdnPYMAs0jZ4zc3EcceY5V4V1bCydokdgv4vjXYDL695l+uJyLUMzJP40DvF9TOIT 1q03ZeRdjDiga8kOMqFaFvqYc/0NOlHKzuMHOd454eg16PdcNgUG/IXM5Gi4by1866GN koGMOJL4SeGOFj/ZQF2LOI/qV1B4CWXM5Gp7UZQMCs+qo89q9AwIuXnmIuS2UtAaH/9o CBturF6ZwRKr8pEqxco4W5ruVjVv4d7bnkCOO740L+9kwC+ZF4eUDDmAUW4hAzpuuiB9 0Iaw== X-Gm-Message-State: AOJu0YwXIbLzCXMt1fCLTFUE0ZImbIM/0rYJnog5CuA2O7xhOrpQ8hyA bjdTTBv18vPCXdhLOcudmXrrWSRA7Q1HVL4fjmuOzDuJ07lZWNPrFJk2gHnjsel7m/z6XUTNLpD o0g8= X-Google-Smtp-Source: AGHT+IFCd3aoBImaceulvWDcL+Jin6H5yNv1HsFEZYiuWiBlEmjATe5rdOXz3fvIrCSyOPZ5bXJa+w== X-Received: by 2002:a05:6214:424d:b0:6b0:5aa6:9996 with SMTP id 6a1803df08f44-6b05aa6999amr6519076d6.24.1717715111899; Thu, 06 Jun 2024 16:05:11 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6b04fa3ad64sm10266736d6.142.2024.06.06.16.05.11 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:05:11 -0700 (PDT) Date: Thu, 6 Jun 2024 19:05:10 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 15/19] midx: support reading incremental MIDX chains Message-ID: <28579fa29266b65c7c6b915678dff35f50cc051d.1717715060.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Now that the MIDX machinery's internals have been taught to understand incremental MIDXs over the previous handful of commits, the MIDX machinery itself can begin reading incremental MIDXs. 

(Note that while the on-disk format for incremental MIDXs has been
defined, the writing end has not been implemented. This will take place
in the commit after next.)

The core of this change involves following the order specified in the
MIDX chain and opening up MIDXs in the chain one-by-one, pointing each
newly opened layer's `->base_midx` at the layers read before it.

In order to implement this, the `load_multi_pack_index()` function is
taught to call a new `load_multi_pack_index_chain()` function if
loading a non-incremental MIDX via `load_multi_pack_index_one()`
failed.

When loading a MIDX chain, `load_midx_chain_fd_st()` reads each line in
the file one-by-one and dispatches calls to
`load_multi_pack_index_one()` to read each layer of the MIDX chain. When
a layer is successfully read, it is added to the MIDX chain by calling
`add_midx_to_chain()`, which validates the contents of the `BASE` chunk,
performs some bounds checks on the number of combined packs and objects,
and attaches the new MIDX by assigning its `base_midx` pointer to the
existing part of the chain.

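The linking step described above can be sketched without any file I/O.
This is a rough illustration under assumptions, not the patch's
implementation: `struct layer`, `add_layer_to_chain()` and
`build_chain()` are hypothetical names, the layers are handed in as an
already loaded array in chain order, and only the overflow checks and
the `base` pointer assignment are modelled.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified stand-in for struct multi_pack_index. */
struct layer {
	uint32_t num_packs, num_packs_in_base;
	uint32_t num_objects, num_objects_in_base;
	struct layer *base; /* earlier part of the chain, or NULL */
};

/* Nonzero if a + b does not fit in a uint32_t. */
static int add_overflows(uint32_t a, uint32_t b)
{
	return a > UINT32_MAX - b;
}

/*
 * Attach `midx` on top of `chain`. The new layer's *_in_base counters
 * become the combined totals of everything below it. Returns 1 on
 * success, 0 if a combined count would overflow.
 */
static int add_layer_to_chain(struct layer *midx, struct layer *chain)
{
	if (chain) {
		if (add_overflows(chain->num_packs, chain->num_packs_in_base) ||
		    add_overflows(chain->num_objects, chain->num_objects_in_base))
			return 0;
		midx->num_packs_in_base = chain->num_packs +
					  chain->num_packs_in_base;
		midx->num_objects_in_base = chain->num_objects +
					    chain->num_objects_in_base;
	}
	midx->base = chain;
	return 1;
}

/*
 * Link `n` layers in order, oldest first. Stops at the first layer
 * that cannot be attached, sets *incomplete, and returns whatever part
 * of the chain was built so far (NULL if none).
 */
static struct layer *build_chain(struct layer *layers, size_t n,
				 int *incomplete)
{
	struct layer *chain = NULL;
	size_t i;

	*incomplete = 0;
	for (i = 0; i < n; i++) {
		if (!add_layer_to_chain(&layers[i], chain)) {
			*incomplete = 1;
			break;
		}
		chain = &layers[i];
	}
	return chain;
}
```

Returning the partially built chain together with an "incomplete" flag
is a design choice made for this sketch; it lets a caller keep using the
layers that did load while still learning that the chain was cut short.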
As a supplement to this, introduce a new mode in the test-read-midx test-tool which allows us to read the information for a specific MIDX in the chain by specifying its trailing checksum via the command-line arguments like so: $ test-tool read-midx .git/objects [checksum] Signed-off-by: Taylor Blau --- midx.c | 184 +++++++++++++++++++++++++++++++++++--- midx.h | 7 ++ packfile.c | 5 +- t/helper/test-read-midx.c | 24 +++-- 4 files changed, 201 insertions(+), 19 deletions(-) diff --git a/midx.c b/midx.c index cb7b623b5d..ac44fcefc2 100644 --- a/midx.c +++ b/midx.c @@ -89,7 +89,9 @@ static int midx_read_object_offsets(const unsigned char *chunk_start, #define MIDX_MIN_SIZE (MIDX_HEADER_SIZE + the_hash_algo->rawsz) -struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local) +static struct multi_pack_index *load_multi_pack_index_one(const char *object_dir, + const char *midx_name, + int local) { struct multi_pack_index *m = NULL; int fd; @@ -97,31 +99,26 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local size_t midx_size; void *midx_map = NULL; uint32_t hash_version; - struct strbuf midx_name = STRBUF_INIT; uint32_t i; const char *cur_pack_name; struct chunkfile *cf = NULL; - get_midx_filename(&midx_name, object_dir); - - fd = git_open(midx_name.buf); + fd = git_open(midx_name); if (fd < 0) goto cleanup_fail; if (fstat(fd, &st)) { - error_errno(_("failed to read %s"), midx_name.buf); + error_errno(_("failed to read %s"), midx_name); goto cleanup_fail; } midx_size = xsize_t(st.st_size); if (midx_size < MIDX_MIN_SIZE) { - error(_("multi-pack-index file %s is too small"), midx_name.buf); + error(_("multi-pack-index file %s is too small"), midx_name); goto cleanup_fail; } - strbuf_release(&midx_name); - midx_map = xmmap(NULL, midx_size, PROT_READ, MAP_PRIVATE, fd, 0); close(fd); @@ -211,7 +208,6 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local cleanup_fail: free(m); - 
strbuf_release(&midx_name); free_chunkfile(cf); if (midx_map) munmap(midx_map, midx_size); @@ -220,6 +216,173 @@ struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local return NULL; } +void get_midx_chain_dirname(struct strbuf *buf, const char *object_dir) +{ + strbuf_addf(buf, "%s/pack/multi-pack-index.d", object_dir); +} + +void get_midx_chain_filename(struct strbuf *buf, const char *object_dir) +{ + get_midx_chain_dirname(buf, object_dir); + strbuf_addstr(buf, "/multi-pack-index-chain"); +} + +void get_split_midx_filename_ext(struct strbuf *buf, const char *object_dir, + const unsigned char *hash, const char *ext) +{ + get_midx_chain_dirname(buf, object_dir); + strbuf_addf(buf, "/multi-pack-index-%s.%s", hash_to_hex(hash), ext); +} + +static int open_multi_pack_index_chain(const char *chain_file, + int *fd, struct stat *st) +{ + *fd = git_open(chain_file); + if (*fd < 0) + return 0; + if (fstat(*fd, st)) { + close(*fd); + return 0; + } + if (st->st_size < the_hash_algo->hexsz) { + close(*fd); + if (!st->st_size) { + /* treat empty files the same as missing */ + errno = ENOENT; + } else { + warning(_("multi-pack-index chain file too small")); + errno = EINVAL; + } + return 0; + } + return 1; +} + +static int add_midx_to_chain(struct multi_pack_index *midx, + struct multi_pack_index *midx_chain, + struct object_id *oids, + int n) +{ + if (midx_chain) { + if (unsigned_add_overflows(midx_chain->num_packs, + midx_chain->num_packs_in_base)) { + warning(_("pack count in base MIDX too high: %"PRIuMAX), + (uintmax_t)midx_chain->num_packs_in_base); + return 0; + } + if (unsigned_add_overflows(midx_chain->num_objects, + midx_chain->num_objects_in_base)) { + warning(_("object count in base MIDX too high: %"PRIuMAX), + (uintmax_t)midx_chain->num_objects_in_base); + return 0; + } + midx->num_packs_in_base = midx_chain->num_packs + + midx_chain->num_packs_in_base; + midx->num_objects_in_base = midx_chain->num_objects + + 
midx_chain->num_objects_in_base; + } + + midx->base_midx = midx_chain; + midx->has_chain = 1; + + return 1; +} + +static struct multi_pack_index *load_midx_chain_fd_st(const char *object_dir, + int local, + int fd, struct stat *st, + int *incomplete_chain) +{ + struct multi_pack_index *midx_chain = NULL; + struct strbuf buf = STRBUF_INIT; + struct object_id *layers = NULL; + int valid = 1; + uint32_t i, count; + FILE *fp = xfdopen(fd, "r"); + + count = st->st_size / (the_hash_algo->hexsz + 1); + CALLOC_ARRAY(layers, count); + + for (i = 0; i < count; i++) { + struct multi_pack_index *m; + + if (strbuf_getline_lf(&buf, fp) == EOF) + break; + + if (get_oid_hex(buf.buf, &layers[i])) { + warning(_("invalid multi-pack-index chain: line '%s' " + "not a hash"), + buf.buf); + valid = 0; + break; + } + + valid = 0; + + strbuf_reset(&buf); + get_split_midx_filename_ext(&buf, object_dir, layers[i].hash, + MIDX_EXT_MIDX); + m = load_multi_pack_index_one(object_dir, buf.buf, local); + + if (m) { + if (add_midx_to_chain(m, midx_chain, layers, i)) { + midx_chain = m; + valid = 1; + } else { + close_midx(m); + } + } + if (!valid) { + warning(_("unable to find all multi-pack index files")); + break; + } + } + + free(layers); + fclose(fp); + strbuf_release(&buf); + + *incomplete_chain = !valid; + return midx_chain; +} + +static struct multi_pack_index *load_multi_pack_index_chain(const char *object_dir, + int local) +{ + struct strbuf chain_file = STRBUF_INIT; + struct stat st; + int fd; + struct multi_pack_index *m = NULL; + + get_midx_chain_filename(&chain_file, object_dir); + if (open_multi_pack_index_chain(chain_file.buf, &fd, &st)) { + int incomplete; + /* ownership of fd is taken over by load function */ + m = load_midx_chain_fd_st(object_dir, local, fd, &st, + &incomplete); + } + + strbuf_release(&chain_file); + return m; +} + +struct multi_pack_index *load_multi_pack_index(const char *object_dir, + int local) +{ + struct strbuf midx_name = STRBUF_INIT; + struct 
multi_pack_index *m; + + get_midx_filename(&midx_name, object_dir); + + m = load_multi_pack_index_one(object_dir, midx_name.buf, local); + if (!m) + m = load_multi_pack_index_chain(object_dir, local); + + strbuf_release(&midx_name); + + return m; +} + void close_midx(struct multi_pack_index *m) { uint32_t i; @@ -228,6 +391,7 @@ void close_midx(struct multi_pack_index *m) return; close_midx(m->next); + close_midx(m->base_midx); munmap((unsigned char *)m->data, m->data_len); diff --git a/midx.h b/midx.h index 86af7dfc5e..94de16a8c4 100644 --- a/midx.h +++ b/midx.h @@ -24,6 +24,7 @@ struct bitmapped_pack; #define MIDX_CHUNKID_OBJECTOFFSETS 0x4f4f4646 /* "OOFF" */ #define MIDX_CHUNKID_LARGEOFFSETS 0x4c4f4646 /* "LOFF" */ #define MIDX_CHUNKID_REVINDEX 0x52494458 /* "RIDX" */ +#define MIDX_CHUNKID_BASE 0x42415345 /* "BASE" */ #define MIDX_CHUNK_OFFSET_WIDTH (2 * sizeof(uint32_t)) #define MIDX_LARGE_OFFSET_NEEDED 0x80000000 @@ -50,6 +51,7 @@ struct multi_pack_index { int preferred_pack_idx; int local; + int has_chain; const unsigned char *chunk_pack_names; size_t chunk_pack_names_len; @@ -80,11 +82,16 @@ struct multi_pack_index { #define MIDX_EXT_REV "rev" #define MIDX_EXT_BITMAP "bitmap" +#define MIDX_EXT_MIDX "midx" const unsigned char *get_midx_checksum(struct multi_pack_index *m); void get_midx_filename(struct strbuf *out, const char *object_dir); void get_midx_filename_ext(struct strbuf *out, const char *object_dir, const unsigned char *hash, const char *ext); +void get_midx_chain_dirname(struct strbuf *buf, const char *object_dir); +void get_midx_chain_filename(struct strbuf *buf, const char *object_dir); +void get_split_midx_filename_ext(struct strbuf *buf, const char *object_dir, + const unsigned char *hash, const char *ext); struct multi_pack_index *load_multi_pack_index(const char *object_dir, int local); int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id); diff --git a/packfile.c b/packfile.c index 
d4df7fdeea..85f0345435 100644 --- a/packfile.c +++ b/packfile.c @@ -878,7 +878,8 @@ static void prepare_pack(const char *full_name, size_t full_name_len, if (!report_garbage) return; - if (!strcmp(file_name, "multi-pack-index")) + if (!strcmp(file_name, "multi-pack-index") || + !strcmp(file_name, "multi-pack-index.d")) return; if (starts_with(file_name, "multi-pack-index") && (ends_with(file_name, ".bitmap") || ends_with(file_name, ".rev"))) @@ -1062,7 +1063,7 @@ struct packed_git *get_all_packs(struct repository *r) prepare_packed_git(r); for (m = r->objects->multi_pack_index; m; m = m->next) { uint32_t i; - for (i = 0; i < m->num_packs; i++) + for (i = 0; i < m->num_packs + m->num_packs_in_base; i++) prepare_midx_pack(r, m, i); } diff --git a/t/helper/test-read-midx.c b/t/helper/test-read-midx.c index 4acae41bb9..f9148328e3 100644 --- a/t/helper/test-read-midx.c +++ b/t/helper/test-read-midx.c @@ -7,8 +7,10 @@ #include "packfile.h" #include "setup.h" #include "gettext.h" +#include "pack-revindex.h" -static int read_midx_file(const char *object_dir, int show_objects) +static int read_midx_file(const char *object_dir, const char *checksum, + int show_objects) { uint32_t i; struct multi_pack_index *m; @@ -19,6 +21,13 @@ static int read_midx_file(const char *object_dir, int show_objects) if (!m) return 1; + if (checksum) { + while (m && strcmp(hash_to_hex(get_midx_checksum(m)), checksum)) + m = m->base_midx; + if (!m) + return 1; + } + printf("header: %08x %d %d %d %d\n", m->signature, m->version, @@ -52,7 +61,8 @@ static int read_midx_file(const char *object_dir, int show_objects) struct pack_entry e; for (i = 0; i < m->num_objects; i++) { - nth_midxed_object_oid(&oid, m, i); + nth_midxed_object_oid(&oid, m, + i + m->num_objects_in_base); fill_midx_entry(the_repository, &oid, &e, m); printf("%s %"PRIu64"\t%s\n", @@ -109,7 +119,7 @@ static int read_midx_bitmapped_packs(const char *object_dir) if (!midx) return 1; - for (i = 0; i < midx->num_packs; i++) { + for (i = 
0; i < midx->num_packs + midx->num_packs_in_base; i++) { if (nth_bitmapped_pack(the_repository, midx, &pack, i) < 0) return 1; @@ -125,16 +135,16 @@ static int read_midx_bitmapped_packs(const char *object_dir) int cmd__read_midx(int argc, const char **argv) { - if (!(argc == 2 || argc == 3)) - usage("read-midx [--show-objects|--checksum|--preferred-pack|--bitmap] "); + if (!(argc == 2 || argc == 3 || argc == 4)) + usage("read-midx [--show-objects|--checksum|--preferred-pack|--bitmap] "); if (!strcmp(argv[1], "--show-objects")) - return read_midx_file(argv[2], 1); + return read_midx_file(argv[2], argv[3], 1); else if (!strcmp(argv[1], "--checksum")) return read_midx_checksum(argv[2]); else if (!strcmp(argv[1], "--preferred-pack")) return read_midx_preferred_pack(argv[2]); else if (!strcmp(argv[1], "--bitmap")) return read_midx_bitmapped_packs(argv[2]); - return read_midx_file(argv[1], 0); + return read_midx_file(argv[1], argv[2], 0); } From patchwork Thu Jun 6 23:05:13 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689039 Received: from mail-oa1-f48.google.com (mail-oa1-f48.google.com [209.85.160.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 340381474A9 for ; Thu, 6 Jun 2024 23:05:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715117; cv=none; b=k615+5yGF4If+7lR2uMMWGa29vgGwpL7jH55OyuXVoyW7z0HXfCQNuwE0fvWU8fua0Wm3fpBkuf6irBbDVUWNg05UiJDthZ/KUca696EUWmywX2U6jswTK5yjXw+UqjqLTP4Nm1LlBU7EkfY6LxeiAgEjai1lkhNcsP25nRsT8o= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715117; c=relaxed/simple; bh=gwos665eZ6F5tukUUE5Av3afHvvz6uK5XBXr8k7y5TM=; 
h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=KocTTnn28OrTXXC6SZtzJ4fN2yPo+rncYKt2rHex9Q5LcHr9H5KTHZkzsPb7bD1Lor+6D4bFvdixavCO26H9jVeBmz3sJ7IWzJi+p+vm/MOpE9dx2zhdVxmiaosmJvP9YU5pkQM8NeiuChIige67IABRlLNL9usdBbJ75QidCYs= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=PSIe7Tte; arc=none smtp.client-ip=209.85.160.48 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="PSIe7Tte" Received: by mail-oa1-f48.google.com with SMTP id 586e51a60fabf-24c9f6338a4so789310fac.1 for ; Thu, 06 Jun 2024 16:05:16 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715115; x=1718319915; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=cDY2Yi6BRuPZQBff8tNJLIy3fifVqigeLpZR8oqZerA=; b=PSIe7Tte5H057C0Dtu271V1+veX4tyswUOGNKHSUkr/gnS2Be96qP2kotwhN8WFOGK 9T7iIbGcV9N1WvR6ny+I6pefNJulQ5NChaD1WjZl1iwtAwUGvKEpndzv3Sy8OvhmD8dc 6X9hEj3tbGaFZQUBY62jEP/itxU3z9YRy59ArBn4fot94lsuJvMnAxkLQ14l9bNK19j6 nckVa5ssnNEKE20ORnc9803Ep9+NP5mNyA8ZwIB1CvBuGKLqU1stA97v2LUv1IlkL+9w L+SJAjWDK5ePBjjPNvRJfU5qbHXxCLUUNo6IR9YU+WZ5RkE1NdO+1Vcpxyph/HhK+HzX njrw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715115; x=1718319915; h=in-reply-to:content-disposition:mime-version:references:message-id 
:subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=cDY2Yi6BRuPZQBff8tNJLIy3fifVqigeLpZR8oqZerA=; b=KNn+pqYksGfbDX8K+nv2Oh/MCA7/QgeOVN2rS+FG/VsbQONX9ZZqWlc8xTooZX/2EA UVnPjL2RLB7AEdmoBFNRwBtwz2ESC1Kf/Ab4i539FXnvqveWdqikxcABoCThbCLu5mvp 4slahz2XcdZVZKTTXm6muAJ9O9YRmzH3M9Ym1cK2I2v+BFq5LI5oiuCQ08P3U0Qimqlu LF1tTabqUvxzohsY0fXeeDfptgbNhv1SgdVFyrKNA4vBC1RE/ZMwg+7liC5ty5AFITjT 7QoKdh1wCbxWke6XES5vgGjO1QU+s5j9rsT8P0MoBwwTHj3UzYTX60XjzVEnVbVVsyeU 14UA== X-Gm-Message-State: AOJu0Yw8RfZHGsY1wIfeFF82HU6A++5TNQpXqbg8TS86s+A5jny825Fv zUdCDbB9fxjktvaw351txrExhA/RVjDSKZpdvLSvVvPv6jS6qKXdcwnR3pgq8tkdeZAQcCXTkyB zAd0= X-Google-Smtp-Source: AGHT+IGJ+jiKuAl3/juE5QA+r+Ar2Q+q2hWp9vKgMxAyQ63Uizia1BfSdeUQiR18mCsskTnw8sY7Iw== X-Received: by 2002:a05:6870:a194:b0:250:7913:1712 with SMTP id 586e51a60fabf-254645901c8mr1023820fac.35.1717715114904; Thu, 06 Jun 2024 16:05:14 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id af79cd13be357-795332df4c7sm102553485a.127.2024.06.06.16.05.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:05:14 -0700 (PDT) Date: Thu, 6 Jun 2024 19:05:13 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 16/19] midx: implement verification support for incremental MIDXs Message-ID: References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Teach the verification implementation used by `git multi-pack-index verify` to perform verification for incremental MIDX chains by independently validating each layer within the chain. 
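Validating each layer independently can be pictured with the sketch
below. It is an illustration under assumptions rather than the code in
this patch: `struct layer` and both functions are hypothetical, plain
`uint32_t` keys stand in for object IDs, and each layer is read through
the loop variable `curr` using layer-local indices.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in: each layer stores only its own sorted keys. */
struct layer {
	const uint32_t *keys;     /* stand-in for this layer's OIDs */
	uint32_t num_objects;     /* number of keys in this layer */
	const struct layer *base; /* previous layer, or NULL */
};

/*
 * Count adjacent pairs within one layer that are not strictly
 * ascending. Writing the bound as `i + 1 < num_objects` avoids
 * unsigned wrap-around when a layer is empty.
 */
static uint32_t count_order_errors_one(const struct layer *curr)
{
	uint32_t i, errors = 0;

	for (i = 0; i + 1 < curr->num_objects; i++)
		if (curr->keys[i] >= curr->keys[i + 1])
			errors++;
	return errors;
}

/* Walk from the tip down to the oldest layer, checking each on its own. */
static uint32_t count_order_errors_chain(const struct layer *tip)
{
	const struct layer *curr;
	uint32_t errors = 0;

	for (curr = tip; curr; curr = curr->base)
		errors += count_order_errors_one(curr);
	return errors;
}
```

Ordering is only required within a layer in this sketch, so a newer
layer may hold keys that sort between those of an older layer without
being reported.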
Signed-off-by: Taylor Blau --- midx.c | 47 ++++++++++++++++++++++++++++++----------------- midx.h | 2 ++ 2 files changed, 32 insertions(+), 17 deletions(-) diff --git a/midx.c b/midx.c index ac44fcefc2..ae3e30a062 100644 --- a/midx.c +++ b/midx.c @@ -467,6 +467,13 @@ int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, return 0; } +struct packed_git *nth_midxed_pack(struct multi_pack_index *m, + uint32_t pack_int_id) +{ + uint32_t local_pack_int_id = midx_for_pack(&m, pack_int_id); + return m->packs[local_pack_int_id]; +} + #define MIDX_CHUNK_BITMAPPED_PACKS_WIDTH (2 * sizeof(uint32_t)) int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m, @@ -814,6 +821,7 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag uint32_t i; struct progress *progress = NULL; struct multi_pack_index *m = load_multi_pack_index(object_dir, 1); + struct multi_pack_index *curr; verify_midx_error = 0; if (!m) { @@ -836,8 +844,8 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag if (flags & MIDX_PROGRESS) progress = start_delayed_progress(_("Looking for referenced packfiles"), - m->num_packs); - for (i = 0; i < m->num_packs; i++) { + m->num_packs + m->num_packs_in_base); + for (i = 0; i < m->num_packs + m->num_packs_in_base; i++) { if (prepare_midx_pack(r, m, i)) midx_report("failed to load pack in position %d", i); @@ -857,17 +865,20 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag if (flags & MIDX_PROGRESS) progress = start_sparse_progress(_("Verifying OID order in multi-pack-index"), m->num_objects - 1); - for (i = 0; i < m->num_objects - 1; i++) { - struct object_id oid1, oid2; - nth_midxed_object_oid(&oid1, m, i); - nth_midxed_object_oid(&oid2, m, i + 1); + for (curr = m; curr; curr = curr->base_midx) { + for (i = 0; i < m->num_objects - 1; i++) { + struct object_id oid1, oid2; - if (oidcmp(&oid1, &oid2) >= 0) - midx_report(_("oid lookup out of order: 
oid[%d] = %s >= %s = oid[%d]"), - i, oid_to_hex(&oid1), oid_to_hex(&oid2), i + 1); + nth_midxed_object_oid(&oid1, m, m->num_objects_in_base + i); + nth_midxed_object_oid(&oid2, m, m->num_objects_in_base + i + 1); - midx_display_sparse_progress(progress, i + 1); + if (oidcmp(&oid1, &oid2) >= 0) + midx_report(_("oid lookup out of order: oid[%d] = %s >= %s = oid[%d]"), + i, oid_to_hex(&oid1), oid_to_hex(&oid2), i + 1); + + midx_display_sparse_progress(progress, i + 1); + } } stop_progress(&progress); @@ -877,8 +888,8 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag * each of the objects and only require 1 packfile to be open at a * time. */ - ALLOC_ARRAY(pairs, m->num_objects); - for (i = 0; i < m->num_objects; i++) { + ALLOC_ARRAY(pairs, m->num_objects + m->num_objects_in_base); + for (i = 0; i < m->num_objects + m->num_objects_in_base; i++) { pairs[i].pos = i; pairs[i].pack_int_id = nth_midxed_pack_int_id(m, i); } @@ -892,16 +903,18 @@ int verify_midx_file(struct repository *r, const char *object_dir, unsigned flag if (flags & MIDX_PROGRESS) progress = start_sparse_progress(_("Verifying object offsets"), m->num_objects); - for (i = 0; i < m->num_objects; i++) { + for (i = 0; i < m->num_objects + m->num_objects_in_base; i++) { struct object_id oid; struct pack_entry e; off_t m_offset, p_offset; if (i > 0 && pairs[i-1].pack_int_id != pairs[i].pack_int_id && - m->packs[pairs[i-1].pack_int_id]) - { - close_pack_fd(m->packs[pairs[i-1].pack_int_id]); - close_pack_index(m->packs[pairs[i-1].pack_int_id]); + m->packs[pairs[i-1].pack_int_id]) { + uint32_t pack_int_id = pairs[i-1].pack_int_id; + struct packed_git *p = nth_midxed_pack(m, pack_int_id); + + close_pack_fd(p); + close_pack_index(p); } nth_midxed_object_oid(&oid, m, pairs[i].pos); diff --git a/midx.h b/midx.h index 94de16a8c4..9d30935589 100644 --- a/midx.h +++ b/midx.h @@ -95,6 +95,8 @@ void get_split_midx_filename_ext(struct strbuf *buf, const char *object_dir, struct 
multi_pack_index *load_multi_pack_index(const char *object_dir, int local); int prepare_midx_pack(struct repository *r, struct multi_pack_index *m, uint32_t pack_int_id); +struct packed_git *nth_midxed_pack(struct multi_pack_index *m, + uint32_t pack_int_id); int nth_bitmapped_pack(struct repository *r, struct multi_pack_index *m, struct bitmapped_pack *bp, uint32_t pack_int_id); int bsearch_one_midx(const struct object_id *oid, struct multi_pack_index *m, From patchwork Thu Jun 6 23:05:16 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 13689040 Received: from mail-qv1-f46.google.com (mail-qv1-f46.google.com [209.85.219.46]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B1C3B13E035 for ; Thu, 6 Jun 2024 23:05:19 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.46 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715121; cv=none; b=VOyoGZwkOjoJTrT7mqbf7qhBRGbpxRD83dK2t+9x82IkbwgFz0arPDt5scukvv8HGdw1AukOezgTnZ595e84/gVEItdoSPWZcU2he0+TOfOA0RCERYbaXGkyldGYChJHXH1DFNzkTN3ea5bcdtQXP4tOKznU6XvZcglMx8PTNIM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1717715121; c=relaxed/simple; bh=NI/xEUUOL62RFy8bz7XaVCNuLm2CygRanWRC/eCTaco=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=UwCSamrdVRZSCfAcc7ahgJnYPBysc96iseA/gq+KGgy33J5URnAV9v8WfqcVjeqzeViqqnOCVkSCet0SxzeTI1AuFZ2REzDl4X9EzYeWXXDKNgP12ZwLNU2P+ZgOU9/vImTiyHQ/C3RrHPA5bX2Ny0IaGR/gVHTc0sOtKdGktoY= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=none smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com 
header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=aIdEAzjV; arc=none smtp.client-ip=209.85.219.46 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="aIdEAzjV" Received: by mail-qv1-f46.google.com with SMTP id 6a1803df08f44-6ae259b1c87so21788396d6.1 for ; Thu, 06 Jun 2024 16:05:19 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1717715118; x=1718319918; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=u40bSVseg4foN3wedwgvqR+JxOSfSOvzvICwCpPVeH8=; b=aIdEAzjVQ57okpm+C/w81tKnzPUWXKhguzc0/uOoXC53gPCuneqko8tzBHgmiBQI1d XGVGXc+2n4DBgV4y3hBfIWFaa0CncbQig6W2zbW/n2mP2xUJWU/4RYD/o615SL1Udc7D ez7CP0sJydRQuhdMK1Wy4bPXBmNmOWV36FDnZA/ih4b6Tm8oiCrxH0kKL0DobPQyenIt GA5K94orsYu0KA32jjjoNPiukPXCkaineYTJ65J3uSdf35mIfkDG2ZMHP/SNDit4qwVn ZKkMPFeN7jfgCQYdDmdpjEdWN+j7tlP4QqFWJ25muxvlbRUm4LjZBPJ4Y9YdVdRbb5pD PY5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1717715118; x=1718319918; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=u40bSVseg4foN3wedwgvqR+JxOSfSOvzvICwCpPVeH8=; b=iKjMHB/VOly12U0Z8K7PDAYncMKCK5yDQThU87TUE4iI6UVlddMY3muJM9xjbsDN3d 3kFvhFWYYwlcKnh3+xuNnrO8TRyFjZkyC9G+kXB/U7fDenqqKl6avneDOCz7jm0DgvGJ 6H3OI1UPwv4UZsjC3paCnu/xYThoAO04sS5jT0VDgvfMmUnbDFAHrix5U8c6dPrIDyR9 xxtNLfwHCXM6ncnYokuov+P+k6XI78ft4FoDL4CXuH9CJlrz5tZV54Eu8Arhi0c0twwW 7VDEo32/rHFTdJrDIwc8APVfYC89DAQdabCjJabxXB+p9HWDAgUJgLQ3ingCzAKptpLq 9Gcg== 
X-Gm-Message-State: AOJu0YyZkZ1mfuLtzQ2+ao5JbhqCUDifWEAS0iZNjCokqkfoRzFfJMfx 3GbOhuIyB5JW/VTY5EKENEdlKcmLISfAdP94u2/WNe03AqqLzhi9CtbdyUk9oBEd9/HFjeRkYtG AsXw= X-Google-Smtp-Source: AGHT+IE/1n6GpdoH5pPqLXLqODqNq72EUEyuWiZciz2fd+H/tZ8itwDKKZCeTnkYY5Iy169/LNx1GQ== X-Received: by 2002:ad4:5c8c:0:b0:6ad:7573:acb9 with SMTP id 6a1803df08f44-6b04bc0c899mr74031106d6.0.1717715117946; Thu, 06 Jun 2024 16:05:17 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 6a1803df08f44-6b04f987424sm10513956d6.87.2024.06.06.16.05.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 06 Jun 2024 16:05:17 -0700 (PDT) Date: Thu, 6 Jun 2024 19:05:16 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Jeff King , Elijah Newren , Junio C Hamano Subject: [PATCH 17/19] t: retire 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' Message-ID: <1609b8611f073e2954ec5fca875388cbeaecccdb.1717715060.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Two years ago, commit ff1e653c8e2 (midx: respect 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP', 2021-08-31) introduced a new environment variable which caused the test suite to write MIDX bitmaps after any 'git repack' invocation. At the time, this was done to help flush out any bugs with MIDX bitmaps that weren't explicitly covered in the t5326-multi-pack-bitmap.sh script. Two years later, that flag has served us well and is no longer providing meaningful coverage, as the script in t5326 has matured substantially and covers many more interesting cases than it did back when ff1e653c8e2 was originally written. Remove the 'GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP' environment variable as it is no longer serving a useful purpose. 
More importantly, removing this variable clears the way for us to
introduce a new one to help similarly flush out bugs related to
incremental MIDX chains. Because these incremental MIDX chains are (for
now) incompatible with MIDX bitmaps, we cannot have both.

Signed-off-by: Taylor Blau
---
 builtin/repack.c                  | 12 ++----------
 ci/run-build-and-tests.sh         |  1 -
 midx.h                            |  2 --
 t/README                          |  4 ----
 t/t0410-partial-clone.sh          |  2 --
 t/t5310-pack-bitmaps.sh           |  4 ----
 t/t5319-multi-pack-index.sh       |  3 +--
 t/t5326-multi-pack-bitmaps.sh     |  3 +--
 t/t5327-multi-pack-bitmaps-rev.sh |  5 ++---
 t/t7700-repack.sh                 | 21 +++++++--------------
 10 files changed, 13 insertions(+), 44 deletions(-)

diff --git a/builtin/repack.c b/builtin/repack.c
index 58ad82dd97..e2fec16389 100644
--- a/builtin/repack.c
+++ b/builtin/repack.c
@@ -1217,10 +1217,6 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 		if (!write_midx &&
 		    (!(pack_everything & ALL_INTO_ONE) || !is_bare_repository()))
 			write_bitmaps = 0;
-	} else if (write_bitmaps &&
-		   git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0) &&
-		   git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0)) {
-		write_bitmaps = 0;
 	}
 	if (pack_kept_objects < 0)
 		pack_kept_objects = write_bitmaps > 0 && !write_midx;
@@ -1518,12 +1514,8 @@ int cmd_repack(int argc, const char **argv, const char *prefix)
 	if (run_update_server_info)
 		update_server_info(0);
 
-	if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) {
-		unsigned flags = 0;
-		if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP, 0))
-			flags |= MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX;
-		write_midx_file(get_object_directory(), NULL, NULL, flags);
-	}
+	if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0))
+		write_midx_file(get_object_directory(), NULL, NULL, 0);
 
 cleanup:
 	string_list_clear(&names, 1);
diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh
index 98dda42045..e6fd68630c 100755
--- a/ci/run-build-and-tests.sh
+++ b/ci/run-build-and-tests.sh
@@ -25,7 +25,6 @@ linux-TEST-vars)
 	export GIT_TEST_COMMIT_GRAPH=1
 	export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1
 	export GIT_TEST_MULTI_PACK_INDEX=1
-	export GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=1
 	export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master
 	export GIT_TEST_NO_WRITE_REV_INDEX=1
 	export GIT_TEST_CHECKOUT_WORKERS=2
diff --git a/midx.h b/midx.h
index 9d30935589..3714cad2cc 100644
--- a/midx.h
+++ b/midx.h
@@ -29,8 +29,6 @@ struct bitmapped_pack;
 #define MIDX_LARGE_OFFSET_NEEDED 0x80000000
 
 #define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX"
-#define GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP \
-	"GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP"
 
 struct multi_pack_index {
 	struct multi_pack_index *next;
diff --git a/t/README b/t/README
index d9e0e07506..e8a11926e4 100644
--- a/t/README
+++ b/t/README
@@ -469,10 +469,6 @@ GIT_TEST_MULTI_PACK_INDEX=, when true, forces the multi-pack-
 index to be written after every 'git repack' command, and overrides the
 'core.multiPackIndex' setting to true.
 
-GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=, when true, sets the
-'--bitmap' option on all invocations of 'git multi-pack-index write',
-and ignores pack-objects' '--write-bitmap-index'.
-
 GIT_TEST_SIDEBAND_ALL=, when true, overrides the
 'uploadpack.allowSidebandAll' setting to true, and when false, forces
 fetch-pack to not request sideband-all (even if the server advertises
diff --git a/t/t0410-partial-clone.sh b/t/t0410-partial-clone.sh
index 7797391c03..f6c58d80dd 100755
--- a/t/t0410-partial-clone.sh
+++ b/t/t0410-partial-clone.sh
@@ -4,8 +4,6 @@ test_description='partial clone'
 
 . ./test-lib.sh
 
-# missing promisor objects cause repacks which write bitmaps to fail
-GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
 # When enabled, some commands will write commit-graphs. This causes fsck
 # to fail when delete_object() is called because fsck will attempt to
 # verify the out-of-sync commit graph.
diff --git a/t/t5310-pack-bitmaps.sh b/t/t5310-pack-bitmaps.sh
index d7fd71360e..a6de7c5764 100755
--- a/t/t5310-pack-bitmaps.sh
+++ b/t/t5310-pack-bitmaps.sh
@@ -5,10 +5,6 @@ test_description='exercise basic bitmap functionality'
 . ./test-lib.sh
 . "$TEST_DIRECTORY"/lib-bitmap.sh
 
-# t5310 deals only with single-pack bitmaps, so don't write MIDX bitmaps in
-# their place.
-GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
-
 # Likewise, allow individual tests to control whether or not they use
 # the boundary-based traversal.
 sane_unset GIT_TEST_PACK_USE_BITMAP_BOUNDARY_TRAVERSAL
diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh
index 10d2a6bf92..6e9ee23398 100755
--- a/t/t5319-multi-pack-index.sh
+++ b/t/t5319-multi-pack-index.sh
@@ -600,8 +600,7 @@ test_expect_success 'repack preserves multi-pack-index when creating packs' '
 compare_results_with_midx "after repack"
 
 test_expect_success 'multi-pack-index and pack-bitmap' '
-	GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
-		git -c repack.writeBitmaps=true repack -ad &&
+	git -c repack.writeBitmaps=true repack -ad &&
 	git multi-pack-index write &&
 	git rev-list --test-bitmap HEAD
 '
diff --git a/t/t5326-multi-pack-bitmaps.sh b/t/t5326-multi-pack-bitmaps.sh
index cc7220b6c0..dff3b26849 100755
--- a/t/t5326-multi-pack-bitmaps.sh
+++ b/t/t5326-multi-pack-bitmaps.sh
@@ -4,10 +4,9 @@ test_description='exercise basic multi-pack bitmap functionality'
 . ./test-lib.sh
 . "${TEST_DIRECTORY}/lib-bitmap.sh"
 
-# We'll be writing our own midx and bitmaps, so avoid getting confused by the
+# We'll be writing our own MIDX, so avoid getting confused by the
 # automatic ones.
 GIT_TEST_MULTI_PACK_INDEX=0
-GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
 
 # This test exercise multi-pack bitmap functionality where the object order is
 # stored and read from a special chunk within the MIDX, so use the default
diff --git a/t/t5327-multi-pack-bitmaps-rev.sh b/t/t5327-multi-pack-bitmaps-rev.sh
index e65e311cd7..23db949c20 100755
--- a/t/t5327-multi-pack-bitmaps-rev.sh
+++ b/t/t5327-multi-pack-bitmaps-rev.sh
@@ -5,10 +5,9 @@ test_description='exercise basic multi-pack bitmap functionality (.rev files)'
 . ./test-lib.sh
 . "${TEST_DIRECTORY}/lib-bitmap.sh"
 
-# We'll be writing our own midx and bitmaps, so avoid getting confused by the
-# automatic ones.
+# We'll be writing our own MIDX, so avoid getting confused by the automatic
+# ones.
 GIT_TEST_MULTI_PACK_INDEX=0
-GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0
 
 # Unlike t5326, this test exercise multi-pack bitmap functionality where the
 # object order is stored in a separate .rev file.
diff --git a/t/t7700-repack.sh b/t/t7700-repack.sh
index 127efe99f8..8f34f05087 100755
--- a/t/t7700-repack.sh
+++ b/t/t7700-repack.sh
@@ -70,14 +70,13 @@ test_expect_success 'objects in packs marked .keep are not repacked' '
 
 test_expect_success 'writing bitmaps via command-line can duplicate .keep objects' '
 	# build on $oid, $packid, and .keep state from previous
-	GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 git repack -Adbl &&
+	git repack -Adbl &&
 	test_has_duplicate_object true
 '
 
 test_expect_success 'writing bitmaps via config can duplicate .keep objects' '
 	# build on $oid, $packid, and .keep state from previous
-	GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
-		git -c repack.writebitmaps=true repack -Adl &&
+	git -c repack.writebitmaps=true repack -Adl &&
 	test_has_duplicate_object true
 '
 
@@ -284,8 +283,7 @@ test_expect_success 'repacking fails when missing .pack actually means missing o
 test_expect_success 'bitmaps are created by default in bare repos' '
 	git clone --bare .git bare.git &&
 	rm -f bare.git/objects/pack/*.bitmap &&
-	GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
-		git -C bare.git repack -ad &&
+	git -C bare.git repack -ad &&
 	bitmap=$(ls bare.git/objects/pack/*.bitmap) &&
 	test_path_is_file "$bitmap"
 '
@@ -296,8 +294,7 @@ test_expect_success 'incremental repack does not complain' '
 '
 
 test_expect_success 'bitmaps can be disabled on bare repos' '
-	GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
-		git -c repack.writeBitmaps=false -C bare.git repack -ad &&
+	git -c repack.writeBitmaps=false -C bare.git repack -ad &&
 	bitmap=$(ls bare.git/objects/pack/*.bitmap || :) &&
 	test -z "$bitmap"
 '
@@ -308,8 +305,7 @@ test_expect_success 'no bitmaps created if .keep files present' '
 	keep=${pack%.pack}.keep &&
 	test_when_finished "rm -f \"\$keep\"" &&
 	>"$keep" &&
-	GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
-		git -C bare.git repack -ad 2>stderr &&
+	git -C bare.git repack -ad 2>stderr &&
 	test_must_be_empty stderr &&
 	find bare.git/objects/pack/ -type f -name "*.bitmap" >actual &&
 	test_must_be_empty actual
@@ -320,8 +316,7 @@ test_expect_success 'auto-bitmaps do not complain if unavailable' '
 	blob=$(test-tool genrandom big $((1024*1024)) |
 	       git -C bare.git hash-object -w --stdin) &&
 	git -C bare.git update-ref refs/tags/big $blob &&
-	GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
-		git -C bare.git repack -ad 2>stderr &&
+	git -C bare.git repack -ad 2>stderr &&
 	test_must_be_empty stderr &&
 	find bare.git/objects/pack -type f -name "*.bitmap" >actual &&
 	test_must_be_empty actual
@@ -342,9 +337,7 @@ test_expect_success 'repacking with a filter works' '
 '
 
 test_expect_success '--filter fails with --write-bitmap-index' '
-	test_must_fail \
-		env GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 \
-		git -C bare.git repack -a -d --write-bitmap-index --filter=blob:none
+	test_must_fail git -C bare.git repack -a -d --write-bitmap-index --filter=blob:none
 '
 
 test_expect_success 'repacking with two filters works' '

From patchwork Thu Jun 6 23:05:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13689041
Date: Thu, 6 Jun 2024 19:05:19 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 18/19] t/t5313-pack-bounds-checks.sh: prepare for sub-directories
Message-ID: <76154308d1d2ea2439b25f4efc359f0820afa655.1717715060.git.me@ttaylorr.com>

Prepare for sub-directories to appear in $GIT_DIR/objects/pack by
adjusting the copy, remove, and chmod invocations to operate
recursively.

This prepares us for the new $GIT_DIR/objects/pack/multi-pack-index.d
directory which will be added in a following commit.

Signed-off-by: Taylor Blau
---
 t/t5313-pack-bounds-checks.sh | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/t/t5313-pack-bounds-checks.sh b/t/t5313-pack-bounds-checks.sh
index ceaa6700a2..86fc73f9fb 100755
--- a/t/t5313-pack-bounds-checks.sh
+++ b/t/t5313-pack-bounds-checks.sh
@@ -7,11 +7,11 @@ TEST_PASSES_SANITIZE_LEAK=true
 
 clear_base () {
 	test_when_finished 'restore_base' &&
-	rm -f $base
+	rm -r -f $base
 }
 
 restore_base () {
-	cp base-backup/* .git/objects/pack/
+	cp -r base-backup/* .git/objects/pack/
 }
 
 do_pack () {
@@ -64,9 +64,9 @@ test_expect_success 'set up base packfile and variables' '
 	git commit -m base &&
 	git repack -ad &&
 	base=$(echo .git/objects/pack/*) &&
-	chmod +w $base &&
+	chmod -R +w $base &&
 	mkdir base-backup &&
-	cp $base base-backup/ &&
+	cp -r $base base-backup/ &&
 	object=$(git rev-parse HEAD:file)
 '
 

From patchwork Thu Jun 6 23:05:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id:
13689042
Date: Thu, 6 Jun 2024 19:05:22 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Jeff King, Elijah Newren, Junio C Hamano
Subject: [PATCH 19/19] midx: implement support for writing incremental MIDX chains
Message-ID: <97b3ea84b92c543c7bd15a2f054d0d3af1b34c67.1717715060.git.me@ttaylorr.com>

Now that the rest of the MIDX subsystem and relevant callers have been
updated to learn about how to read and process incremental MIDX chains,
let's finally update the implementation in `write_midx_internal()` to be
able to write incremental MIDX chains.

This new feature is available behind the `--incremental` option for the
`multi-pack-index` builtin, like so:

    $ git multi-pack-index write --incremental

The implementation for doing so is relatively straightforward, and boils
down to a handful of different kinds of changes implemented in this
patch:

  - The `compute_sorted_entries()` function is taught to reject objects
    which appear in any existing MIDX layer.

  - Functions like `write_midx_revindex()` are adjusted to write
    pack_order values which are offset by the number of objects in the
    base MIDX layer.

  - The end of `write_midx_internal()` is adjusted to move
    non-incremental MIDX files when necessary (i.e. when creating an
    incremental chain with an existing non-incremental MIDX in the
    repository).

There are a handful of other changes that are introduced, like new
functions to clear incremental MIDX files that are unrelated to the
current chain (using the same "keep_hash" mechanism as in the
non-incremental case).

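As a rough illustration of the first two changes listed above (skipping
objects that an existing layer already holds, and offsetting positions
by the number of objects in the layers below), here is a minimal
standalone sketch. It is not the code from this patch: `toy_layer`,
`toy_layer_has_oid()`, `toy_filter_new_objects()` and
`toy_chain_position()` are hypothetical names, and object IDs are
simplified to plain integers held in one sorted array.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the existing MIDX chain: a sorted array of
 * toy object IDs. Real MIDX layers store full object IDs and are
 * chained together; this sketch flattens all of that into one array.
 */
struct toy_layer {
	const uint32_t *oids;	/* sorted ascending */
	size_t nr;		/* objects in this layer and all layers below */
};

/* Binary search; returns 1 if `oid` is already present in `base`. */
static int toy_layer_has_oid(const struct toy_layer *base, uint32_t oid)
{
	size_t lo = 0, hi = base ? base->nr : 0;

	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;

		if (base->oids[mid] == oid)
			return 1;
		if (base->oids[mid] < oid)
			lo = mid + 1;
		else
			hi = mid;
	}
	return 0;
}

/*
 * Copy into `out` only the candidates that are missing from `base`,
 * mirroring the "reject objects which appear in any existing MIDX
 * layer" step. Returns the number of entries kept.
 */
static size_t toy_filter_new_objects(const struct toy_layer *base,
				     const uint32_t *candidates, size_t nr,
				     uint32_t *out)
{
	size_t i, kept = 0;

	for (i = 0; i < nr; i++) {
		if (toy_layer_has_oid(base, candidates[i]))
			continue;
		out[kept++] = candidates[i];
	}
	return kept;
}

/*
 * Translate a position local to the new layer into a position within
 * the whole chain by adding the number of objects in the layers below.
 */
static uint32_t toy_chain_position(const struct toy_layer *base,
				   uint32_t local_pos)
{
	return local_pos + (uint32_t)(base ? base->nr : 0);
}
```

With a base layer holding the toy IDs 10, 20 and 30 and the candidates
5, 20 and 35, the filter keeps 5 and 35, and their chain-wide positions
become 3 and 4. Without a base layer (a NULL `base`), nothing is
filtered and positions are unchanged, which corresponds to the
non-incremental case.
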
The tests explicitly exercising the new incremental MIDX feature are
relatively limited for two reasons:

  1. Most of the "interesting" behavior is already thoroughly covered in
     t5319-multi-pack-index.sh, which handles the core logic of reading
     objects through a MIDX.

     The new tests in t5334-incremental-multi-pack-index.sh are mostly
     focused on creating and destroying incremental MIDXs, as well as
     stitching their results together across layers.

  2. A new GIT_TEST environment variable is added called
     "GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL", which modifies the
     entire test suite to write incremental MIDXs after repacking when
     combined with the "GIT_TEST_MULTI_PACK_INDEX" variable.

     This exercises the long tail of other interesting behavior that is
     defined implicitly throughout the rest of the CI suite. It is
     likewise added to the linux-TEST-vars job.

Signed-off-by: Taylor Blau
---
 Documentation/git-multi-pack-index.txt  |  11 +-
 builtin/multi-pack-index.c              |   2 +
 builtin/repack.c                        |   8 +-
 ci/run-build-and-tests.sh               |   1 +
 midx-write.c                            | 281 ++++++++++++++++++++----
 midx.c                                  |  62 +++++-
 midx.h                                  |   4 +
 packfile.c                              |  16 +-
 packfile.h                              |   4 +
 t/README                                |   4 +
 t/lib-bitmap.sh                         |   6 +-
 t/lib-midx.sh                           |  28 +++
 t/t5319-multi-pack-index.sh             |  27 +--
 t/t5326-multi-pack-bitmaps.sh           |   1 +
 t/t5327-multi-pack-bitmaps-rev.sh       |   1 +
 t/t5332-multi-pack-reuse.sh             |   2 +
 t/t5334-incremental-multi-pack-index.sh |  46 ++++
 t/t7700-repack.sh                       |  27 +--
 18 files changed, 436 insertions(+), 95 deletions(-)
 create mode 100755 t/t5334-incremental-multi-pack-index.sh

diff --git a/Documentation/git-multi-pack-index.txt b/Documentation/git-multi-pack-index.txt
index 3696506eb3..631d5c7d15 100644
--- a/Documentation/git-multi-pack-index.txt
+++ b/Documentation/git-multi-pack-index.txt
@@ -64,6 +64,12 @@ The file given at `` is expected to be readable, and can contain
 duplicates. (If a given OID is given more than once, it is marked as
 preferred if at least one instance of it begins with the special `+`
 marker).
+ + --incremental:: + Write an incremental MIDX file containing only objects + and packs not present in an existing MIDX layer. + Migrates non-incremental MIDXs to incremental ones when + necessary. Incompatible with `--bitmap`. -- verify:: @@ -74,6 +80,8 @@ expire:: have no objects referenced by the MIDX (with the exception of `.keep` packs and cruft packs). Rewrite the MIDX file afterward to remove all references to these pack-files. ++ +NOTE: this mode is incompatible with incremental MIDX files. repack:: Create a new pack-file containing objects in small pack-files @@ -95,7 +103,8 @@ repack:: + If `repack.packKeptObjects` is `false`, then any pack-files with an associated `.keep` file will not be selected for the batch to repack. - ++ +NOTE: this mode is incompatible with incremental MIDX files. EXAMPLES -------- diff --git a/builtin/multi-pack-index.c b/builtin/multi-pack-index.c index 8360932d2e..92b86153ba 100644 --- a/builtin/multi-pack-index.c +++ b/builtin/multi-pack-index.c @@ -129,6 +129,8 @@ static int cmd_multi_pack_index_write(int argc, const char **argv, MIDX_WRITE_BITMAP | MIDX_WRITE_REV_INDEX), OPT_BIT(0, "progress", &opts.flags, N_("force progress reporting"), MIDX_PROGRESS), + OPT_BIT(0, "incremental", &opts.flags, + N_("write a new incremental MIDX"), MIDX_WRITE_INCREMENTAL), OPT_BOOL(0, "stdin-packs", &opts.stdin_packs, N_("write multi-pack index containing only given indexes")), OPT_FILENAME(0, "refs-snapshot", &opts.refs_snapshot, diff --git a/builtin/repack.c b/builtin/repack.c index e2fec16389..e1fab4d809 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -1514,8 +1514,12 @@ int cmd_repack(int argc, const char **argv, const char *prefix) if (run_update_server_info) update_server_info(0); - if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) - write_midx_file(get_object_directory(), NULL, NULL, 0); + if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX, 0)) { + unsigned flags = 0; + if (git_env_bool(GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL, 0)) + 
flags |= MIDX_WRITE_INCREMENTAL; + write_midx_file(get_object_directory(), NULL, NULL, flags); + } cleanup: string_list_clear(&names, 1); diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index e6fd68630c..2e28d02b20 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -25,6 +25,7 @@ linux-TEST-vars) export GIT_TEST_COMMIT_GRAPH=1 export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1 export GIT_TEST_MULTI_PACK_INDEX=1 + export GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=1 export GIT_TEST_DEFAULT_INITIAL_BRANCH_NAME=master export GIT_TEST_NO_WRITE_REV_INDEX=1 export GIT_TEST_CHECKOUT_WORKERS=2 diff --git a/midx-write.c b/midx-write.c index b148ee443a..241557d03e 100644 --- a/midx-write.c +++ b/midx-write.c @@ -15,6 +15,8 @@ #include "refs.h" #include "revision.h" #include "list-objects.h" +#include "path.h" +#include "pack-revindex.h" #define PACK_EXPIRED UINT_MAX #define BITMAP_POS_UNKNOWN (~((uint32_t)0)) @@ -23,7 +25,11 @@ extern int midx_checksum_valid(struct multi_pack_index *m); extern void clear_midx_files_ext(const char *object_dir, const char *ext, - unsigned char *keep_hash); + const char *keep_hash); +extern void clear_incremental_midx_files_ext(const char *object_dir, + const char *ext, + const char **keep_hashes, + uint32_t hashes_nr); extern int cmp_idx_or_pack_name(const char *idx_or_pack_name, const char *idx_name); @@ -97,6 +103,9 @@ struct write_midx_context { int preferred_pack_idx; + int incremental; + uint32_t num_multi_pack_indexes_before; + struct string_list *to_include; }; @@ -322,7 +331,7 @@ static void compute_sorted_entries(struct write_midx_context *ctx, for (cur_fanout = 0; cur_fanout < 256; cur_fanout++) { fanout.nr = 0; - if (ctx->m) + if (ctx->m && !ctx->incremental) midx_fanout_add_midx_fanout(&fanout, ctx->m, cur_fanout, ctx->preferred_pack_idx); @@ -348,6 +357,9 @@ static void compute_sorted_entries(struct write_midx_context *ctx, if (cur_object && oideq(&fanout.entries[cur_object - 1].oid, 
&fanout.entries[cur_object].oid)) continue; + if (ctx->incremental && ctx->m && + midx_has_oid(ctx->m, &fanout.entries[cur_object].oid)) + continue; ALLOC_GROW(ctx->entries, st_add(ctx->entries_nr, 1), alloc_objects); @@ -531,10 +543,15 @@ static int write_midx_revindex(struct hashfile *f, void *data) { struct write_midx_context *ctx = data; - uint32_t i; + uint32_t i, nr_base; + + if (ctx->m && ctx->incremental) + nr_base = ctx->m->num_objects + ctx->m->num_objects_in_base; + else + nr_base = 0; for (i = 0; i < ctx->entries_nr; i++) - hashwrite_be32(f, ctx->pack_order[i]); + hashwrite_be32(f, ctx->pack_order[i] + nr_base); return 0; } @@ -563,12 +580,17 @@ static int midx_pack_order_cmp(const void *va, const void *vb) static uint32_t *midx_pack_order(struct write_midx_context *ctx) { struct midx_pack_order_data *data; - uint32_t *pack_order; + uint32_t *pack_order, base_objects = 0; uint32_t i; trace2_region_enter("midx", "midx_pack_order", the_repository); + if (ctx->incremental && ctx->m) + base_objects = ctx->m->num_objects + ctx->m->num_objects_in_base; + + ALLOC_ARRAY(pack_order, ctx->entries_nr); ALLOC_ARRAY(data, ctx->entries_nr); + for (i = 0; i < ctx->entries_nr; i++) { struct pack_midx_entry *e = &ctx->entries[i]; data[i].nr = i; @@ -580,12 +602,11 @@ static uint32_t *midx_pack_order(struct write_midx_context *ctx) QSORT(data, ctx->entries_nr, midx_pack_order_cmp); - ALLOC_ARRAY(pack_order, ctx->entries_nr); for (i = 0; i < ctx->entries_nr; i++) { struct pack_midx_entry *e = &ctx->entries[data[i].nr]; struct pack_info *pack = &ctx->info[ctx->pack_perm[e->pack_int_id]]; if (pack->bitmap_pos == BITMAP_POS_UNKNOWN) - pack->bitmap_pos = i; + pack->bitmap_pos = i + base_objects; pack->bitmap_nr++; pack_order[i] = data[i].nr; } @@ -633,7 +654,8 @@ static void prepare_midx_packing_data(struct packing_data *pdata, prepare_packing_data(the_repository, pdata); for (i = 0; i < ctx->entries_nr; i++) { - struct pack_midx_entry *from = 
&ctx->entries[ctx->pack_order[i]]; + uint32_t pos = ctx->pack_order[i]; + struct pack_midx_entry *from = &ctx->entries[pos]; struct object_entry *to = packlist_alloc(pdata, &from->oid); oe_set_in_pack(pdata, to, @@ -881,40 +903,133 @@ static struct multi_pack_index *lookup_multi_pack_index(struct repository *r, static int fill_packs_from_midx(struct write_midx_context *ctx, const char *preferred_pack_name, uint32_t flags) { - uint32_t i; + struct multi_pack_index *m; - for (i = 0; i < ctx->m->num_packs; i++) { - if (!should_include_pack(ctx, ctx->m->pack_names[i], 0)) - continue; + for (m = ctx->m; m; m = m->base_midx) { + uint32_t i; - ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc); - - if (flags & MIDX_WRITE_REV_INDEX || preferred_pack_name) { + for (i = 0; i < m->num_packs; i++) { /* * If generating a reverse index, need to have * packed_git's loaded to compare their * mtimes and object count. * - * * If a preferred pack is specified, need to * have packed_git's loaded to ensure the chosen * preferred pack has a non-zero object count. 
*/ - if (prepare_midx_pack(the_repository, ctx->m, i)) - return error(_("could not load pack")); + if (!should_include_pack(ctx, m->pack_names[i], 0)) + continue; - if (open_pack_index(ctx->m->packs[i])) - die(_("could not open index for %s"), - ctx->m->packs[i]->pack_name); + ALLOC_GROW(ctx->info, ctx->nr + 1, ctx->alloc); + + if (flags & MIDX_WRITE_REV_INDEX || + preferred_pack_name) { + if (prepare_midx_pack(the_repository, m, + m->num_packs_in_base + i)) { + error(_("could not load pack")); + return 1; + } + + if (open_pack_index(m->packs[i])) + die(_("could not open index for %s"), + m->packs[i]->pack_name); + } + + fill_pack_info(&ctx->info[ctx->nr++], m->packs[i], + m->pack_names[i], + m->num_packs_in_base + i); } - - fill_pack_info(&ctx->info[ctx->nr++], ctx->m->packs[i], - ctx->m->pack_names[i], i); } - return 0; } +static struct { + const char *non_split; + const char *split; +} midx_exts[] = { + {NULL, MIDX_EXT_MIDX}, + {MIDX_EXT_BITMAP, MIDX_EXT_BITMAP}, + {MIDX_EXT_REV, MIDX_EXT_REV}, +}; + +static int link_midx_to_chain(struct multi_pack_index *m) +{ + struct strbuf from = STRBUF_INIT; + struct strbuf to = STRBUF_INIT; + int ret = 0; + size_t i; + + if (!m || m->has_chain) { + /* + * Either no MIDX previously existed, or it was already + * part of a MIDX chain. In both cases, we have nothing + * to link, so return early. 
+ */ + goto done; + } + + for (i = 0; i < ARRAY_SIZE(midx_exts); i++) { + const unsigned char *hash = get_midx_checksum(m); + + get_midx_filename_ext(&from, m->object_dir, hash, + midx_exts[i].non_split); + get_split_midx_filename_ext(&to, m->object_dir, hash, + midx_exts[i].split); + + if (link(from.buf, to.buf) < 0 && errno != ENOENT) { + ret = error_errno(_("unable to link '%s' to '%s'"), + from.buf, to.buf); + goto done; + } + + strbuf_reset(&from); + strbuf_reset(&to); + } + +done: + strbuf_release(&from); + strbuf_release(&to); + return ret; +} + +static void clear_midx_files(const char *object_dir, + const char **hashes, + uint32_t hashes_nr, + unsigned incremental) +{ + /* + * if incremental: + * - remove all non-incremental MIDX files + * - remove any incremental MIDX files not in the current one + * + * if non-incremental: + * - remove all incremental MIDX files + * - remove any non-incremental MIDX files not matching the current + * hash + */ + struct strbuf buf = STRBUF_INIT; + const char *exts[] = { MIDX_EXT_BITMAP, MIDX_EXT_REV, MIDX_EXT_MIDX }; + uint32_t i, j; + + for (i = 0; i < ARRAY_SIZE(exts); i++) { + clear_incremental_midx_files_ext(object_dir, exts[i], + hashes, hashes_nr); + for (j = 0; j < hashes_nr; j++) + clear_midx_files_ext(object_dir, exts[i], hashes[j]); + } + + if (incremental) + get_midx_filename(&buf, object_dir); + else + get_midx_chain_filename(&buf, object_dir); + + if (unlink(buf.buf) && errno != ENOENT) + die_errno(_("failed to clear multi-pack-index at %s"), buf.buf); + + strbuf_release(&buf); +} + static int write_midx_internal(const char *object_dir, struct string_list *packs_to_include, struct string_list *packs_to_drop, @@ -927,16 +1042,27 @@ static int write_midx_internal(const char *object_dir, uint32_t i, start_pack; struct hashfile *f = NULL; struct lock_file lk; + struct tempfile *incr; struct write_midx_context ctx = { 0 }; int bitmapped_packs_concat_len = 0; int pack_name_concat_len = 0; int dropped_packs = 0; int 
result = 0; + const char **keep_hashes = NULL; struct chunkfile *cf; trace2_region_enter("midx", "write_midx_internal", the_repository); - get_midx_filename(&midx_name, object_dir); + ctx.incremental = !!(flags & MIDX_WRITE_INCREMENTAL); + if (ctx.incremental && (flags & MIDX_WRITE_BITMAP)) + die(_("cannot write incremental MIDX with bitmap")); + + if (ctx.incremental) + strbuf_addf(&midx_name, + "%s/pack/multi-pack-index.d/tmp_midx_XXXXXX", + object_dir); + else + get_midx_filename(&midx_name, object_dir); if (safe_create_leading_directories(midx_name.buf)) die_errno(_("unable to create leading directories of %s"), midx_name.buf); @@ -948,14 +1074,19 @@ static int write_midx_internal(const char *object_dir, } ctx.nr = 0; - ctx.alloc = ctx.m ? ctx.m->num_packs : 16; + ctx.alloc = ctx.m ? ctx.m->num_packs + ctx.m->num_packs_in_base : 16; ctx.info = NULL; ctx.to_include = packs_to_include; ALLOC_ARRAY(ctx.info, ctx.alloc); - if (ctx.m && fill_packs_from_midx(&ctx, preferred_pack_name, - flags) < 0) { - result = 1; + if (ctx.incremental) { + struct multi_pack_index *m = ctx.m; + while (m) { + ctx.num_multi_pack_indexes_before++; + m = m->base_midx; + } + } else if (ctx.m && fill_packs_from_midx(&ctx, preferred_pack_name, + flags) < 0) { goto cleanup; } @@ -970,7 +1101,8 @@ static int write_midx_internal(const char *object_dir, for_each_file_in_pack_dir(object_dir, add_pack_to_midx, &ctx); stop_progress(&ctx.progress); - if ((ctx.m && ctx.nr == ctx.m->num_packs) && + if ((ctx.m && ctx.nr == ctx.m->num_packs + ctx.m->num_packs_in_base) && + !ctx.incremental && !(packs_to_include || packs_to_drop)) { struct bitmap_index *bitmap_git; int bitmap_exists; @@ -986,12 +1118,14 @@ static int write_midx_internal(const char *object_dir, * corresponding bitmap (or one wasn't requested). 
*/ if (!want_bitmap) - clear_midx_files_ext(object_dir, ".bitmap", - NULL); + clear_midx_files_ext(object_dir, "bitmap", NULL); goto cleanup; } } + if (ctx.incremental && !ctx.nr) + goto cleanup; /* nothing to do */ + if (preferred_pack_name) { ctx.preferred_pack_idx = -1; @@ -1137,8 +1271,30 @@ static int write_midx_internal(const char *object_dir, pack_name_concat_len += MIDX_CHUNK_ALIGNMENT - (pack_name_concat_len % MIDX_CHUNK_ALIGNMENT); - hold_lock_file_for_update(&lk, midx_name.buf, LOCK_DIE_ON_ERROR); - f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); + if (ctx.incremental) { + struct strbuf lock_name = STRBUF_INIT; + + get_midx_chain_filename(&lock_name, object_dir); + hold_lock_file_for_update(&lk, lock_name.buf, LOCK_DIE_ON_ERROR); + strbuf_release(&lock_name); + + incr = mks_tempfile_m(midx_name.buf, 0444); + if (!incr) { + error(_("unable to create temporary MIDX layer")); + return -1; + } + + if (adjust_shared_perm(get_tempfile_path(incr))) { + error(_("unable to adjust shared permissions for '%s'"), + get_tempfile_path(incr)); + return -1; + } + + f = hashfd(get_tempfile_fd(incr), get_tempfile_path(incr)); + } else { + hold_lock_file_for_update(&lk, midx_name.buf, LOCK_DIE_ON_ERROR); + f = hashfd(get_lock_file_fd(&lk), get_lock_file_path(&lk)); + } if (ctx.nr - dropped_packs == 0) { error(_("no pack files to index.")); @@ -1231,14 +1387,55 @@ static int write_midx_internal(const char *object_dir, * have been freed in the previous if block. 
*/ + CALLOC_ARRAY(keep_hashes, ctx.num_multi_pack_indexes_before + 1); + + if (ctx.incremental) { + FILE *chainf = fdopen_lock_file(&lk, "w"); + struct strbuf final_midx_name = STRBUF_INIT; + struct multi_pack_index *m = ctx.m; + + if (!chainf) { + error_errno(_("unable to open multi-pack-index chain file")); + return -1; + } + + if (link_midx_to_chain(ctx.m) < 0) + return -1; + + get_split_midx_filename_ext(&final_midx_name, object_dir, + midx_hash, MIDX_EXT_MIDX); + + if (rename_tempfile(&incr, final_midx_name.buf) < 0) { + error_errno(_("unable to rename new multi-pack-index layer")); + return -1; + } + + keep_hashes[ctx.num_multi_pack_indexes_before] = + xstrdup(hash_to_hex(midx_hash)); + + for (i = 0; i < ctx.num_multi_pack_indexes_before; i++) { + uint32_t j = ctx.num_multi_pack_indexes_before - i - 1; + + keep_hashes[j] = xstrdup(hash_to_hex(get_midx_checksum(m))); + m = m->base_midx; + } + + for (i = 0; i < ctx.num_multi_pack_indexes_before + 1; i++) + fprintf(get_lock_file_fp(&lk), "%s\n", keep_hashes[i]); + } else { + keep_hashes[ctx.num_multi_pack_indexes_before] = + xstrdup(hash_to_hex(midx_hash)); + } + if (ctx.m) close_object_store(the_repository->objects); if (commit_lock_file(&lk) < 0) die_errno(_("could not write multi-pack-index")); - clear_midx_files_ext(object_dir, ".bitmap", midx_hash); - clear_midx_files_ext(object_dir, ".rev", midx_hash); + clear_midx_files(object_dir, keep_hashes, + ctx.num_multi_pack_indexes_before + 1, + ctx.incremental); cleanup: for (i = 0; i < ctx.nr; i++) { @@ -1253,6 +1450,11 @@ static int write_midx_internal(const char *object_dir, free(ctx.entries); free(ctx.pack_perm); free(ctx.pack_order); + if (keep_hashes) { + for (i = 0; i < ctx.num_multi_pack_indexes_before + 1; i++) + free((char *)keep_hashes[i]); + free(keep_hashes); + } strbuf_release(&midx_name); trace2_region_leave("midx", "write_midx_internal", the_repository); @@ -1289,6 +1491,9 @@ int expire_midx_packs(struct repository *r, const char *object_dir, 
unsigned fla if (!m) return 0; + if (m->base_midx) + die(_("cannot expire packs from an incremental multi-pack-index")); + CALLOC_ARRAY(count, m->num_packs); if (flags & MIDX_PROGRESS) @@ -1463,6 +1668,8 @@ int midx_repack(struct repository *r, const char *object_dir, size_t batch_size, if (!m) return 0; + if (m->base_midx) + die(_("cannot repack an incremental multi-pack-index")); CALLOC_ARRAY(include_pack, m->num_packs); diff --git a/midx.c b/midx.c index ae3e30a062..5aa7e2a6e6 100644 --- a/midx.c +++ b/midx.c @@ -14,7 +14,10 @@ int midx_checksum_valid(struct multi_pack_index *m); void clear_midx_files_ext(const char *object_dir, const char *ext, - unsigned char *keep_hash); + const char *keep_hash); +void clear_incremental_midx_files_ext(const char *object_dir, const char *ext, + char **keep_hashes, + uint32_t hashes_nr); int cmp_idx_or_pack_name(const char *idx_or_pack_name, const char *idx_name); @@ -518,6 +521,11 @@ int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, return 0; } +int midx_has_oid(struct multi_pack_index *m, const struct object_id *oid) +{ + return bsearch_midx(oid, m, NULL); +} + struct object_id *nth_midxed_object_oid(struct object_id *oid, struct multi_pack_index *m, uint32_t n) @@ -719,7 +727,8 @@ int midx_checksum_valid(struct multi_pack_index *m) } struct clear_midx_data { - char *keep; + char **keep; + uint32_t keep_nr; const char *ext; }; @@ -727,32 +736,63 @@ static void clear_midx_file_ext(const char *full_path, size_t full_path_len UNUS const char *file_name, void *_data) { struct clear_midx_data *data = _data; + uint32_t i; if (!(starts_with(file_name, "multi-pack-index-") && ends_with(file_name, data->ext))) return; - if (data->keep && !strcmp(data->keep, file_name)) - return; - + for (i = 0; i < data->keep_nr; i++) { + if (!strcmp(data->keep[i], file_name)) + return; + } if (unlink(full_path)) die_errno(_("failed to remove %s"), full_path); } void clear_midx_files_ext(const char *object_dir, const char *ext, 
- unsigned char *keep_hash) + const char *keep_hash) { struct clear_midx_data data; memset(&data, 0, sizeof(struct clear_midx_data)); - if (keep_hash) - data.keep = xstrfmt("multi-pack-index-%s%s", - hash_to_hex(keep_hash), ext); + if (keep_hash) { + ALLOC_ARRAY(data.keep, 1); + + data.keep[0] = xstrfmt("multi-pack-index-%s.%s", keep_hash, ext); + data.keep_nr = 1; + } data.ext = ext; for_each_file_in_pack_dir(object_dir, clear_midx_file_ext, &data); + if (keep_hash) + free(data.keep[0]); + free(data.keep); +} + +void clear_incremental_midx_files_ext(const char *object_dir, const char *ext, + char **keep_hashes, + uint32_t hashes_nr) +{ + struct clear_midx_data data; + uint32_t i; + + memset(&data, 0, sizeof(struct clear_midx_data)); + + ALLOC_ARRAY(data.keep, hashes_nr); + for (i = 0; i < hashes_nr; i++) + data.keep[i] = xstrfmt("multi-pack-index-%s.%s", keep_hashes[i], + ext); + data.keep_nr = hashes_nr; + data.ext = ext; + + for_each_file_in_pack_subdir(object_dir, "multi-pack-index.d", + clear_midx_file_ext, &data); + + for (i = 0; i < hashes_nr; i++) + free(data.keep[i]); free(data.keep); } @@ -770,8 +810,8 @@ void clear_midx_file(struct repository *r) if (remove_path(midx.buf)) die(_("failed to clear multi-pack-index at %s"), midx.buf); - clear_midx_files_ext(r->objects->odb->path, ".bitmap", NULL); - clear_midx_files_ext(r->objects->odb->path, ".rev", NULL); + clear_midx_files_ext(r->objects->odb->path, MIDX_EXT_BITMAP, NULL); + clear_midx_files_ext(r->objects->odb->path, MIDX_EXT_REV, NULL); strbuf_release(&midx); } diff --git a/midx.h b/midx.h index 3714cad2cc..42d4f8d149 100644 --- a/midx.h +++ b/midx.h @@ -29,6 +29,8 @@ struct bitmapped_pack; #define MIDX_LARGE_OFFSET_NEEDED 0x80000000 #define GIT_TEST_MULTI_PACK_INDEX "GIT_TEST_MULTI_PACK_INDEX" +#define GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL \ + "GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL" struct multi_pack_index { struct multi_pack_index *next; @@ -77,6 +79,7 @@ struct multi_pack_index { #define 
MIDX_WRITE_BITMAP (1 << 2) #define MIDX_WRITE_BITMAP_HASH_CACHE (1 << 3) #define MIDX_WRITE_BITMAP_LOOKUP_TABLE (1 << 4) +#define MIDX_WRITE_INCREMENTAL (1 << 5) #define MIDX_EXT_REV "rev" #define MIDX_EXT_BITMAP "bitmap" @@ -101,6 +104,7 @@ int bsearch_one_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); int bsearch_midx(const struct object_id *oid, struct multi_pack_index *m, uint32_t *result); +int midx_has_oid(struct multi_pack_index *m, const struct object_id *oid); off_t nth_midxed_offset(struct multi_pack_index *m, uint32_t pos); uint32_t nth_midxed_pack_int_id(struct multi_pack_index *m, uint32_t pos); struct object_id *nth_midxed_object_oid(struct object_id *oid, diff --git a/packfile.c b/packfile.c index 85f0345435..2c335f4c4d 100644 --- a/packfile.c +++ b/packfile.c @@ -813,9 +813,10 @@ static void report_pack_garbage(struct string_list *list) report_helper(list, seen_bits, first, list->nr); } -void for_each_file_in_pack_dir(const char *objdir, - each_file_in_pack_dir_fn fn, - void *data) +void for_each_file_in_pack_subdir(const char *objdir, + const char *subdir, + each_file_in_pack_dir_fn fn, + void *data) { struct strbuf path = STRBUF_INIT; size_t dirnamelen; @@ -824,6 +825,8 @@ void for_each_file_in_pack_dir(const char *objdir, strbuf_addstr(&path, objdir); strbuf_addstr(&path, "/pack"); + if (subdir) + strbuf_addf(&path, "/%s", subdir); dir = opendir(path.buf); if (!dir) { if (errno != ENOENT) @@ -845,6 +848,13 @@ void for_each_file_in_pack_dir(const char *objdir, strbuf_release(&path); } +void for_each_file_in_pack_dir(const char *objdir, + each_file_in_pack_dir_fn fn, + void *data) +{ + for_each_file_in_pack_subdir(objdir, NULL, fn, data); +} + struct prepare_pack_data { struct repository *r; struct string_list *garbage; diff --git a/packfile.h b/packfile.h index 28c8fd3e39..07ba2c0be0 100644 --- a/packfile.h +++ b/packfile.h @@ -55,6 +55,10 @@ struct packed_git *parse_pack_index(unsigned char *sha1, const char 
*idx_path); typedef void each_file_in_pack_dir_fn(const char *full_path, size_t full_path_len, const char *file_name, void *data); +void for_each_file_in_pack_subdir(const char *objdir, + const char *subdir, + each_file_in_pack_dir_fn fn, + void *data); void for_each_file_in_pack_dir(const char *objdir, each_file_in_pack_dir_fn fn, void *data); diff --git a/t/README b/t/README index e8a11926e4..e93a29de1b 100644 --- a/t/README +++ b/t/README @@ -469,6 +469,10 @@ GIT_TEST_MULTI_PACK_INDEX=<boolean>, when true, forces the multi-pack- index to be written after every 'git repack' command, and overrides the 'core.multiPackIndex' setting to true. +GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=<boolean>, when true, sets +the '--incremental' option on all invocations of 'git multi-pack-index +write'. + GIT_TEST_SIDEBAND_ALL=<boolean>, when true, overrides the 'uploadpack.allowSidebandAll' setting to true, and when false, forces fetch-pack to not request sideband-all (even if the server advertises diff --git a/t/lib-bitmap.sh b/t/lib-bitmap.sh index f595937094..62aa6744a6 100644 --- a/t/lib-bitmap.sh +++ b/t/lib-bitmap.sh @@ -1,6 +1,8 @@ # Helpers for scripts testing bitmap functionality; see t5310 for # example usage. +.
"$TEST_DIRECTORY"/lib-midx.sh + objdir=.git/objects midx=$objdir/pack/multi-pack-index @@ -264,10 +266,6 @@ have_delta () { test_cmp expect actual } -midx_checksum () { - test-tool read-midx --checksum "$1" -} - # midx_pack_source midx_pack_source () { test-tool read-midx --show-objects .git/objects | grep "^$1 " | cut -f2 diff --git a/t/lib-midx.sh b/t/lib-midx.sh index 1261994744..e38c609604 100644 --- a/t/lib-midx.sh +++ b/t/lib-midx.sh @@ -6,3 +6,31 @@ test_midx_consistent () { test_cmp expect actual && git multi-pack-index --object-dir=$1 verify } + +midx_checksum () { + test-tool read-midx --checksum "$1" +} + +midx_git_two_modes () { + git -c core.multiPackIndex=false $1 >expect && + git -c core.multiPackIndex=true $1 >actual && + if [ "$2" = "sorted" ] + then + sort <expect >expect.sorted && + mv expect.sorted expect && + sort <actual >actual.sorted && + mv actual.sorted actual + fi && + test_cmp expect actual +} + +compare_results_with_midx () { + MSG=$1 + test_expect_success "check normal git operations: $MSG" ' + midx_git_two_modes "rev-list --objects --all" && + midx_git_two_modes "log --raw" && + midx_git_two_modes "count-objects --verbose" && + midx_git_two_modes "cat-file --batch-all-objects --batch-check" && + midx_git_two_modes "cat-file --batch-all-objects --batch-check --unordered" sorted + ' +} diff --git a/t/t5319-multi-pack-index.sh b/t/t5319-multi-pack-index.sh index 6e9ee23398..4b0b5a5c9f 100755 --- a/t/t5319-multi-pack-index.sh +++ b/t/t5319-multi-pack-index.sh @@ -3,8 +3,11 @@ test_description='multi-pack-indexes' . ./test-lib.sh . "$TEST_DIRECTORY"/lib-chunk.sh +.
"$TEST_DIRECTORY"/lib-midx.sh GIT_TEST_MULTI_PACK_INDEX=0 +GIT_TEST_MULTI_PACK_INDEX_WRITE_BITMAP=0 +GIT_TEST_MULTI_PACK_INDEX_WRITE_INCREMENTAL=0 objdir=.git/objects HASH_LEN=$(test_oid rawsz) @@ -107,30 +110,6 @@ test_expect_success 'write midx with one v1 pack' ' midx_read_expect 1 18 4 $objdir ' -midx_git_two_modes () { - git -c core.multiPackIndex=false $1 >expect && - git -c core.multiPackIndex=true $1 >actual && - if [ "$2" = "sorted" ] - then - sort <expect >expect.sorted && - mv expect.sorted expect && - sort <actual >actual.sorted && - mv actual.sorted actual - fi && - test_cmp expect actual -} - -compare_results_with_midx () { - MSG=$1 - test_expect_success "check normal git operations: $MSG" ' - midx_git_two_modes "rev-list --objects --all" && - midx_git_two_modes "log --raw" && - midx_git_two_modes "count-objects --verbose" && - midx_git_two_modes "cat-file --batch-all-objects --batch-check" && - midx_git_two_modes "cat-file --batch-all-objects --batch-check --unordered" sorted - ' -} - test_expect_success 'write midx with one v2 pack' ' git pack-objects --index-version=2,0x40 $objdir/pack/test &2 && incrpackid=$(git pack-objects --all --unpacked --incremental .git/objects/pack/pack err && + git repack -Adl --write-bitmap-index 2>err && cat >expect <<-EOF && warning: disabling bitmap writing, as some objects are not being packed EOF @@ -533,11 +536,11 @@ test_expect_success 'setup for --write-midx tests' ' test_expect_success '--write-midx unchanged' ' ( cd midx && - GIT_TEST_MULTI_PACK_INDEX=0 git repack && + git repack && test_path_is_missing $midx && test_path_is_missing $midx-*.bitmap && - GIT_TEST_MULTI_PACK_INDEX=0 git repack --write-midx && + git repack --write-midx && test_path_is_file $midx && test_path_is_missing $midx-*.bitmap && @@ -550,7 +553,7 @@ test_expect_success '--write-midx with a new pack' ' cd midx && test_commit loose && - GIT_TEST_MULTI_PACK_INDEX=0 git repack --write-midx && + git repack --write-midx && test_path_is_file $midx &&
test_path_is_missing $midx-*.bitmap && @@ -561,7 +564,7 @@ test_expect_success '--write-midx with a new pack' ' test_expect_success '--write-midx with -b' ' ( cd midx && - GIT_TEST_MULTI_PACK_INDEX=0 git repack -mb && + git repack -mb && test_path_is_file $midx && test_path_is_file $midx-*.bitmap && @@ -574,7 +577,7 @@ test_expect_success '--write-midx with -d' ' cd midx && test_commit repack && - GIT_TEST_MULTI_PACK_INDEX=0 git repack -Ad --write-midx && + git repack -Ad --write-midx && test_path_is_file $midx && test_path_is_missing $midx-*.bitmap && @@ -587,21 +590,21 @@ test_expect_success 'cleans up MIDX when appropriate' ' cd midx && test_commit repack-2 && - GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb --write-midx && + git repack -Adb --write-midx && checksum=$(midx_checksum $objdir) && test_path_is_file $midx && test_path_is_file $midx-$checksum.bitmap && test_commit repack-3 && - GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb --write-midx && + git repack -Adb --write-midx && test_path_is_file $midx && test_path_is_missing $midx-$checksum.bitmap && test_path_is_file $midx-$(midx_checksum $objdir).bitmap && test_commit repack-4 && - GIT_TEST_MULTI_PACK_INDEX=0 git repack -Adb && + git repack -Adb && find $objdir/pack -type f -name "multi-pack-index*" >files && test_must_be_empty files @@ -622,7 +625,6 @@ test_expect_success '--write-midx with preferred bitmap tips' ' git log --format="create refs/tags/%s/%s %H" HEAD >refs && git update-ref --stdin