From patchwork Tue Nov 19 22:07:19 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880566
Date: Tue, 19 Nov 2024 17:07:19 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 01/13] Documentation: describe incremental MIDX bitmaps
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

Prepare to implement support for reachability bitmaps for the new
incremental multi-pack index (MIDX) feature over the following commits.

This commit begins by describing the relevant format and usage details
for incremental MIDX bitmaps.

Signed-off-by: Taylor Blau
---
 Documentation/technical/multi-pack-index.txt | 64 ++++++++++++++++++++
 1 file changed, 64 insertions(+)

diff --git a/Documentation/technical/multi-pack-index.txt b/Documentation/technical/multi-pack-index.txt
index cc063b30bea..a063262c360 100644
--- a/Documentation/technical/multi-pack-index.txt
+++ b/Documentation/technical/multi-pack-index.txt
@@ -164,6 +164,70 @@ objects_nr($H2) + objects_nr($H1) + i
 (in the C implementation, this is often computed as `i +
 m->num_objects_in_base`).
 
+=== Pseudo-pack order for incremental MIDXs
+
+The original implementation of multi-pack reachability bitmaps defined
+the pseudo-pack order in linkgit:gitformat-pack[5] (see the section
+titled "multi-pack-index reverse indexes") roughly as follows:
+
+____
+In short, a MIDX's pseudo-pack is the de-duplicated concatenation of
+objects in packs stored by the MIDX, laid out in pack order, and the
+packs arranged in MIDX order (with the preferred pack coming first).
+____
+
+In the incremental MIDX design, we extend this definition to include
+objects from multiple layers of the MIDX chain. The pseudo-pack order
+for incremental MIDXs is determined by concatenating the pseudo-pack
+ordering for each layer of the MIDX chain in order. Formally, two objects
+`o1` and `o2` are compared as follows:
+
+1. If `o1` appears in an earlier layer of the MIDX chain than `o2`, then
+   `o1` is considered less than `o2`.
+2. Otherwise, if `o1` and `o2` appear in the same MIDX layer, and that
+   MIDX layer has no base, then if one of `pack(o1)` and `pack(o2)` is
+   preferred and the other is not, then the preferred one sorts first. If
+   there is a base layer (i.e. the MIDX layer is not the first layer in
+   the chain), then if `pack(o1)` appears earlier in that MIDX layer's
+   pack order, then `o1` is less than `o2`. Likewise if `pack(o2)`
+   appears earlier, then the opposite is true.
+3. Otherwise, `o1` and `o2` appear in the same pack, and thus in the
+   same MIDX layer. Sort `o1` and `o2` by their offset within their
+   containing packfile.
+
+=== Reachability bitmaps and incremental MIDXs
+
+Each layer of an incremental MIDX chain may have its objects (and the
+objects from any previous layer in the same MIDX chain) represented in
+its own `*.bitmap` file.
+
+The structure of a `*.bitmap` file belonging to an incremental MIDX
+chain is identical to that of a non-incremental MIDX bitmap, or a
+classic single-pack bitmap. Since objects are added to the end of the
+incremental MIDX's pseudo-pack order (see: above), it is possible to
+extend a bitmap when appending to the end of a MIDX chain.
+
+(Note: it is possible likewise to compress a contiguous sequence of MIDX
+incremental layers, and their `*.bitmap`(s) into a single layer and
+`*.bitmap`, but this is not yet implemented.)
+
+The object positions used are global within the pseudo-pack order, so
+subsequent layers will have, for example, `m->num_objects_in_base`
+number of `0` bits in each of their four type bitmaps. This follows from
+the fact that we only write type bitmap entries for objects present in
+the layer immediately corresponding to the bitmap.
+
+Note also that only the bitmap pertaining to the most recent layer in an
+incremental MIDX chain is used to store reachability information about
+the interesting and uninteresting objects in a reachability query.
+Earlier bitmap layers are only used to look up commit and pseudo-merge
+bitmaps from that layer, as well as the type-level bitmaps for objects
+in that layer.
+
+To simplify the implementation, type-level bitmaps are iterated
+simultaneously, and their results are OR'd together to avoid recursively
+calling internal bitmap functions.
+
 Future Work
 -----------
 

From patchwork Tue Nov 19 22:07:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880567
Date: Tue, 19 Nov 2024 17:07:22 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 02/13] pack-revindex: prepare for incremental MIDX bitmaps
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

Prepare the reverse index machinery to handle object lookups in an
incremental MIDX bitmap. These changes are broken out across a few
functions:

  - load_midx_revindex() learns to use the appropriate MIDX filename
    depending on whether the given 'struct multi_pack_index *' is
    incremental or not.

  - pack_pos_to_midx() and midx_to_pack_pos() now both take in a global
    object position in the MIDX pseudo-pack order, and find the
    earliest containing MIDX (similar to midx.c::midx_for_object()).

  - midx_pack_order_cmp() adjusts its call to pack_pos_to_midx() by the
    number of objects in the base (since 'vb - midx->revindex_data' is
    relative to the containing MIDX, and pack_pos_to_midx() expects a
    global position).

    Likewise, this function adjusts its output by adding
    m->num_objects_in_base to return a global position out through the
    `*pos` pointer.

Together, these changes are sufficient to use the multi-pack index's
reverse index format for incremental multi-pack reachability bitmaps.

Signed-off-by: Taylor Blau
---
 pack-bitmap.c   | 30 ++++++++++++++++++++----------
 pack-revindex.c | 32 +++++++++++++++++++++++---------
 2 files changed, 43 insertions(+), 19 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 4fa9dfc771a..bba9c6a905a 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -170,6 +170,15 @@ static struct ewah_bitmap *read_bitmap_1(struct bitmap_index *index)
 	return read_bitmap(index->map, index->map_size, &index->map_pos);
 }
 
+static uint32_t bitmap_non_extended_bits(struct bitmap_index *index)
+{
+	if (index->midx) {
+		struct multi_pack_index *m = index->midx;
+		return m->num_objects + m->num_objects_in_base;
+	}
+	return index->pack->num_objects;
+}
+
 static uint32_t bitmap_num_objects(struct bitmap_index *index)
 {
 	if (index->midx)
@@ -925,7 +934,7 @@ static inline int bitmap_position_extended(struct bitmap_index *bitmap_git,
 
 	if (pos < kh_end(positions)) {
 		int bitmap_pos = kh_value(positions, pos);
-		return bitmap_pos + bitmap_num_objects(bitmap_git);
+		return bitmap_pos + bitmap_non_extended_bits(bitmap_git);
 	}
 
 	return -1;
@@ -993,7 +1002,7 @@ static int ext_index_add_object(struct bitmap_index *bitmap_git,
 		bitmap_pos = kh_value(eindex->positions, hash_pos);
 	}
 
-	return bitmap_pos + bitmap_num_objects(bitmap_git);
+	return bitmap_pos + bitmap_non_extended_bits(bitmap_git);
 }
 
 struct bitmap_show_data {
@@ -1498,7 +1507,8 @@ static void show_extended_objects(struct bitmap_index *bitmap_git,
 	for (i = 0; i < eindex->count; ++i) {
 		struct object *obj;

-		if (!bitmap_get(objects, st_add(bitmap_num_objects(bitmap_git), i)))
+		if (!bitmap_get(objects,
+				st_add(bitmap_non_extended_bits(bitmap_git), i)))
 			continue;
 
 		obj = eindex->objects[i];
@@ -1677,7 +1687,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git,
 	 * them individually.
 	 */
 	for (i = 0; i < eindex->count; i++) {
-		size_t pos = st_add(i, bitmap_num_objects(bitmap_git));
+		size_t pos = st_add(i, bitmap_non_extended_bits(bitmap_git));
 		if (eindex->objects[i]->type == type &&
 		    bitmap_get(to_filter, pos) &&
 		    !bitmap_get(tips, pos))
@@ -1703,7 +1713,7 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git,
 
 	oi.sizep = &size;
 
-	if (pos < bitmap_num_objects(bitmap_git)) {
+	if (pos < bitmap_non_extended_bits(bitmap_git)) {
 		struct packed_git *pack;
 		off_t ofs;
 
@@ -1726,7 +1736,7 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git,
 		}
 	} else {
 		struct eindex *eindex = &bitmap_git->ext_index;
-		struct object *obj = eindex->objects[pos - bitmap_num_objects(bitmap_git)];
+		struct object *obj = eindex->objects[pos - bitmap_non_extended_bits(bitmap_git)];
 		if (oid_object_info_extended(the_repository, &obj->oid, &oi, 0) < 0)
 			die(_("unable to get size of %s"), oid_to_hex(&obj->oid));
 	}
@@ -1878,7 +1888,7 @@ static void filter_packed_objects_from_bitmap(struct bitmap_index *bitmap_git,
 	uint32_t objects_nr;
 	size_t i, pos;
 
-	objects_nr = bitmap_num_objects(bitmap_git);
+	objects_nr = bitmap_non_extended_bits(bitmap_git);
 	pos = objects_nr / BITS_IN_EWORD;
 
 	if (pos > result->word_alloc)
@@ -2403,7 +2413,7 @@ static uint32_t count_object_type(struct bitmap_index *bitmap_git,
 	for (i = 0; i < eindex->count; ++i) {
 		if (eindex->objects[i]->type == type &&
 		    bitmap_get(objects,
-			       st_add(bitmap_num_objects(bitmap_git), i)))
+			       st_add(bitmap_non_extended_bits(bitmap_git), i)))
 			count++;
 	}
 
@@ -2802,7 +2812,7 @@ uint32_t *create_bitmap_mapping(struct bitmap_index *bitmap_git,
 		BUG("rebuild_existing_bitmaps: missing required rev-cache "
 		    "extension");
 
-	num_objects = bitmap_num_objects(bitmap_git);
+	num_objects = bitmap_non_extended_bits(bitmap_git);
 	CALLOC_ARRAY(reposition, num_objects);
 
 	for (i = 0; i < num_objects; ++i) {
@@ -2945,7 +2955,7 @@ static off_t get_disk_usage_for_extended(struct bitmap_index *bitmap_git)
 		struct object *obj = eindex->objects[i];
 
 		if (!bitmap_get(result,
-				st_add(bitmap_num_objects(bitmap_git), i)))
+				st_add(bitmap_non_extended_bits(bitmap_git), i)))
 			continue;
 
 		if (oid_object_info_extended(the_repository, &obj->oid, &oi, 0) < 0)
diff --git a/pack-revindex.c b/pack-revindex.c
index 22d3c234648..ce3f7ae2149 100644
--- a/pack-revindex.c
+++ b/pack-revindex.c
@@ -383,8 +383,12 @@ int load_midx_revindex(struct multi_pack_index *m)
 	trace2_data_string("load_midx_revindex", the_repository,
 			   "source", "rev");
 
-	get_midx_filename_ext(&revindex_name, m->object_dir,
-			      get_midx_checksum(m), MIDX_EXT_REV);
+	if (m->has_chain)
+		get_split_midx_filename_ext(&revindex_name, m->object_dir,
+					    get_midx_checksum(m), MIDX_EXT_REV);
+	else
+		get_midx_filename_ext(&revindex_name, m->object_dir,
+				      get_midx_checksum(m), MIDX_EXT_REV);
 
 	ret = load_revindex_from_disk(revindex_name.buf,
 				      m->num_objects,
@@ -471,11 +475,15 @@ off_t pack_pos_to_offset(struct packed_git *p, uint32_t pos)
 
 uint32_t pack_pos_to_midx(struct multi_pack_index *m, uint32_t pos)
 {
+	while (m && pos < m->num_objects_in_base)
+		m = m->base_midx;
+	if (!m)
+		BUG("NULL multi-pack-index for object position: %"PRIu32, pos);
 	if (!m->revindex_data)
 		BUG("pack_pos_to_midx: reverse index not yet loaded");
-	if (m->num_objects <= pos)
+	if (m->num_objects + m->num_objects_in_base <= pos)
 		BUG("pack_pos_to_midx: out-of-bounds object at %"PRIu32, pos);
-	return get_be32(m->revindex_data + pos);
+	return get_be32(m->revindex_data + pos - m->num_objects_in_base);
 }
 
 struct midx_pack_key {
@@ -491,7 +499,8 @@ static int midx_pack_order_cmp(const void *va, const void *vb)
 	const struct midx_pack_key *key = va;
 	struct multi_pack_index *midx = key->midx;
 
-	uint32_t versus = pack_pos_to_midx(midx, (uint32_t*)vb - (const uint32_t *)midx->revindex_data);
+	size_t pos = (uint32_t*)vb - (const uint32_t *)midx->revindex_data;
+	uint32_t versus = pack_pos_to_midx(midx, pos + midx->num_objects_in_base);
 	uint32_t versus_pack = nth_midxed_pack_int_id(midx, versus);
 	off_t versus_offset;
 
@@ -529,9 +538,9 @@ static int midx_key_to_pack_pos(struct multi_pack_index *m,
 {
 	uint32_t *found;
 
-	if (key->pack >= m->num_packs)
+	if (key->pack >= m->num_packs + m->num_packs_in_base)
 		BUG("MIDX pack lookup out of bounds (%"PRIu32" >= %"PRIu32")",
-		    key->pack, m->num_packs);
+		    key->pack, m->num_packs + m->num_packs_in_base);
 	/*
 	 * The preferred pack sorts first, so determine its identifier by
 	 * looking at the first object in pseudo-pack order.
@@ -551,7 +560,8 @@ static int midx_key_to_pack_pos(struct multi_pack_index *m,
 	if (!found)
 		return -1;
 
-	*pos = found - m->revindex_data;
+	*pos = (found - m->revindex_data) + m->num_objects_in_base;
+
 	return 0;
 }
 
@@ -559,9 +569,13 @@ int midx_to_pack_pos(struct multi_pack_index *m, uint32_t at, uint32_t *pos)
 {
 	struct midx_pack_key key;
 
+	while (m && at < m->num_objects_in_base)
+		m = m->base_midx;
+	if (!m)
+		BUG("NULL multi-pack-index for object position: %"PRIu32, at);
 	if (!m->revindex_data)
 		BUG("midx_to_pack_pos: reverse index not yet loaded");
-	if (m->num_objects <= at)
+	if (m->num_objects + m->num_objects_in_base <= at)
 		BUG("midx_to_pack_pos: out-of-bounds object at %"PRIu32, at);
 
 	key.pack = nth_midxed_pack_int_id(m, at);

From patchwork Tue Nov 19 22:07:26 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880568
Date: Tue, 19 Nov 2024 17:07:26 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 03/13] pack-bitmap.c: open and store incremental bitmap layers
Message-ID: <5b5d625cbe02560a20c12b7dd20aeda4979017bb.1732054032.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

Prepare the pack-bitmap machinery to work with incremental MIDXs by
adding a new "base" field to keep track of the bitmap index associated
with the previous MIDX layer.

The changes in this commit are mostly boilerplate to open the correct
bitmap(s), add them to the chain of bitmap layers along the "base"
pointer, ensure that the correct packs and their reverse indexes are
loaded across MIDX layers, etc.

While we're at it, keep track of a base_nr field to indicate how many
bitmap layers (including the current bitmap) exist. This will be used in
a future commit to allocate an array of 'struct ewah_bitmap' pointers to
collect all of the respective type bitmaps among all layers to
initialize a multi-EWAH iterator.

Subsequent commits will teach the functions within the pack-bitmap
machinery how to interact with these new fields.
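
As a rough illustration of the bookkeeping described above, here is a
minimal standalone sketch. It is not code from this series: 'struct
layer', layer_link() and layer_collect() are hypothetical names, and
only the "base" pointer and base_nr counter are modeled. It shows how a
count maintained as "base->base_nr + 1" lets a caller size an array up
front and then fill it by walking the "base" pointers.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for 'struct bitmap_index'; only the two
 * chain-related fields described in this commit are modeled.
 */
struct layer {
	struct layer *base;	/* bitmap for the previous MIDX layer, or NULL */
	uint32_t base_nr;	/* layers in the chain, including this one */
};

/* Stack 'l' on top of 'base' (pass NULL for the first layer in a chain). */
static void layer_link(struct layer *l, struct layer *base)
{
	l->base = base;
	l->base_nr = base ? base->base_nr + 1 : 1;
}

/*
 * Fill 'out' (which must have room for top->base_nr entries) by
 * walking the "base" pointers. Oldest-first order is an arbitrary
 * choice for this sketch. Returns the number of layers in the chain.
 */
static uint32_t layer_collect(struct layer *top, struct layer **out)
{
	uint32_t i = top->base_nr;
	struct layer *l;

	for (l = top; l && i > 0; l = l->base)
		out[--i] = l;
	return top->base_nr;
}
```

With three layers linked in order, the topmost layer reports a base_nr
of 3, which is the array size a caller would need before collecting one
entry per layer.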

Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 64 ++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 51 insertions(+), 13 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index bba9c6a905a..41675a69f68 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -54,6 +54,13 @@ struct bitmap_index {
 	struct packed_git *pack;
 	struct multi_pack_index *midx;
 
+	/*
+	 * If using a multi-pack index chain, 'base' points to the
+	 * bitmap index corresponding to this bitmap's midx->base_midx.
+	 */
+	struct bitmap_index *base;
+	uint32_t base_nr;
+
 	/* mmapped buffer of the whole bitmap index */
 	unsigned char *map;
 	size_t map_size; /* size of the mmaped buffer */
@@ -377,8 +384,13 @@ static int load_bitmap_entries_v1(struct bitmap_index *index)
 char *midx_bitmap_filename(struct multi_pack_index *midx)
 {
 	struct strbuf buf = STRBUF_INIT;
-	get_midx_filename_ext(&buf, midx->object_dir, get_midx_checksum(midx),
-			      MIDX_EXT_BITMAP);
+	if (midx->has_chain)
+		get_split_midx_filename_ext(&buf, midx->object_dir,
+					    get_midx_checksum(midx),
+					    MIDX_EXT_BITMAP);
+	else
+		get_midx_filename_ext(&buf, midx->object_dir,
+				      get_midx_checksum(midx), MIDX_EXT_BITMAP);
 
 	return strbuf_detach(&buf, NULL);
 }
@@ -397,10 +409,17 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
 {
 	struct stat st;
 	char *bitmap_name = midx_bitmap_filename(midx);
-	int fd = git_open(bitmap_name);
+	int fd;
 	uint32_t i, preferred_pack;
 	struct packed_git *preferred;
 
+	fd = git_open(bitmap_name);
+	if (fd < 0 && errno == ENOENT) {
+		FREE_AND_NULL(bitmap_name);
+		bitmap_name = midx_bitmap_filename(midx);
+		fd = git_open(bitmap_name);
+	}
+
 	if (fd < 0) {
 		if (errno != ENOENT)
 			warning_errno("cannot open '%s'", bitmap_name);
@@ -446,7 +465,7 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
 		goto cleanup;
 	}
 
-	for (i = 0; i < bitmap_git->midx->num_packs; i++) {
+	for (i = 0; i < bitmap_git->midx->num_packs + bitmap_git->midx->num_packs_in_base; i++) {
 		if (prepare_midx_pack(the_repository, bitmap_git->midx, i)) {
 			warning(_("could not open pack %s"),
 				bitmap_git->midx->pack_names[i]);
@@ -459,13 +478,20 @@ static int open_midx_bitmap_1(struct bitmap_index *bitmap_git,
 		goto cleanup;
 	}
 
-	preferred = bitmap_git->midx->packs[preferred_pack];
+	preferred = nth_midxed_pack(bitmap_git->midx, preferred_pack);
 	if (!is_pack_valid(preferred)) {
 		warning(_("preferred pack (%s) is invalid"),
 			preferred->pack_name);
 		goto cleanup;
 	}
 
+	if (midx->base_midx) {
+		bitmap_git->base = prepare_midx_bitmap_git(midx->base_midx);
+		bitmap_git->base_nr = bitmap_git->base->base_nr + 1;
+	} else {
+		bitmap_git->base_nr = 1;
+	}
+
 	return 0;
 
 cleanup:
@@ -516,6 +542,7 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git
 	bitmap_git->map_size = xsize_t(st.st_size);
 	bitmap_git->map = xmmap(NULL, bitmap_git->map_size, PROT_READ, MAP_PRIVATE, fd, 0);
 	bitmap_git->map_pos = 0;
+	bitmap_git->base_nr = 1;
 	close(fd);
 
 	if (load_bitmap_header(bitmap_git) < 0) {
@@ -535,8 +562,7 @@ static int open_pack_bitmap_1(struct bitmap_index *bitmap_git, struct packed_git
 static int load_reverse_index(struct repository *r, struct bitmap_index *bitmap_git)
 {
 	if (bitmap_is_midx(bitmap_git)) {
-		uint32_t i;
-		int ret;
+		struct multi_pack_index *m;
 
 		/*
 		 * The multi-pack-index's .rev file is already loaded via
@@ -545,10 +571,15 @@ static int load_reverse_index(struct repository *r, struct bitmap_index *bitmap_
 		 * But we still need to open the individual pack .rev files,
 		 * since we will need to make use of them in pack-objects.
 		 */
-		for (i = 0; i < bitmap_git->midx->num_packs; i++) {
-			ret = load_pack_revindex(r, bitmap_git->midx->packs[i]);
-			if (ret)
-				return ret;
+		for (m = bitmap_git->midx; m; m = m->base_midx) {
+			uint32_t i;
+			int ret;
+
+			for (i = 0; i < m->num_packs; i++) {
+				ret = load_pack_revindex(r, m->packs[i]);
+				if (ret)
+					return ret;
+			}
 		}
 		return 0;
 	}
@@ -574,6 +605,13 @@ static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git)
 	if (!bitmap_git->table_lookup && load_bitmap_entries_v1(bitmap_git) < 0)
 		goto failed;
 
+	if (bitmap_git->base) {
+		if (!bitmap_is_midx(bitmap_git))
+			BUG("non-MIDX bitmap has non-NULL base bitmap index");
+		if (load_bitmap(r, bitmap_git->base) < 0)
+			goto failed;
+	}
+
 	return 0;
 
 failed:
@@ -658,10 +696,9 @@ struct bitmap_index *prepare_bitmap_git(struct repository *r)
 
 struct bitmap_index *prepare_midx_bitmap_git(struct multi_pack_index *midx)
 {
-	struct repository *r = the_repository;
 	struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git));
 
-	if (!open_midx_bitmap_1(bitmap_git, midx) && !load_bitmap(r, bitmap_git))
+	if (!open_midx_bitmap_1(bitmap_git, midx))
 		return bitmap_git;
 
 	free_bitmap_index(bitmap_git);
@@ -2875,6 +2912,7 @@ void free_bitmap_index(struct bitmap_index *b)
 		close_midx_revindex(b->midx);
 	}
 	free_pseudo_merge_map(&b->pseudo_merges);
+	free_bitmap_index(b->base);
 	free(b);
 }
 

From patchwork Tue Nov 19 22:07:29 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880569
 [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6ee713422f4sm19383017b3.78.2024.11.19.14.07.30
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 19 Nov 2024 14:07:30 -0800 (PST)
Date: Tue, 19 Nov 2024 17:07:29 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren , Jeff King , Junio C Hamano
Subject: [PATCH v3 04/13] pack-bitmap.c: teach `bitmap_for_commit()` about incremental MIDXs
Message-ID: <16259667fb4d7534458bb458afd6cefe032c3b6f.1732054032.git.me@ttaylorr.com>
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

The pack-bitmap machinery uses `bitmap_for_commit()` to locate the
EWAH-compressed bitmap corresponding to some given commit object.

Teach this function about incremental MIDX bitmaps by teaching it to
recur on earlier bitmap layers when it fails to find a given commit in
the current layer.

The changes to do so are as follows:

  - Avoid initializing hash_pos at its declaration, since
    bitmap_for_commit() is now a recursive function and may receive a
    NULL bitmap_index pointer as its first argument.

  - In cases where we would previously return NULL (to indicate that a
    lookup failed and the given bitmap_index does not contain an entry
    corresponding to the given commit), recursively call the function on
    the previous bitmap layer.
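
As a rough illustration of the two points above, here is a toy sketch of
a layered lookup that recurs on the base layer after a miss. It is not
the patch itself: struct toy_layer, toy_lookup_one() and toy_lookup()
are hypothetical names invented for this example, and a linear scan
stands in for Git's hash map and lazy bitmap loading.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one bitmap layer; not Git's struct bitmap_index. */
struct toy_layer {
	const char **commit_ids;	/* commits with a stored bitmap in this layer */
	const int *bitmaps;		/* toy payload, parallel to commit_ids */
	size_t nr;
	struct toy_layer *base;		/* previous (older) layer, or NULL */
};

/* Search a single layer only. */
const int *toy_lookup_one(const struct toy_layer *layer, const char *id)
{
	size_t i;

	assert(layer && id);
	for (i = 0; i < layer->nr; i++)
		if (!strcmp(layer->commit_ids[i], id))
			return &layer->bitmaps[i];
	return NULL;
}

/*
 * Search the newest layer first; on a miss, recur on the base layer.
 * The NULL check has to come before any dereference of `layer`,
 * because the recursion eventually passes the NULL base of the
 * oldest layer.
 */
const int *toy_lookup(const struct toy_layer *layer, const char *id)
{
	const int *hit;

	if (!layer)
		return NULL;

	hit = toy_lookup_one(layer, id);
	if (!hit)
		return toy_lookup(layer->base, id);
	return hit;
}
```

The early NULL return is what lets every "not found in this layer" path
be written as a plain recursive call without checking ->base first.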
Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 11 +++++++----
 1 file changed, 7 insertions(+), 4 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 41675a69f68..e3fdcf8a01a 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -946,18 +946,21 @@ static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_
 struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 				      struct commit *commit)
 {
-	khiter_t hash_pos = kh_get_oid_map(bitmap_git->bitmaps,
-					   commit->object.oid);
+	khiter_t hash_pos;
+	if (!bitmap_git)
+		return NULL;
+
+	hash_pos = kh_get_oid_map(bitmap_git->bitmaps, commit->object.oid);
 	if (hash_pos >= kh_end(bitmap_git->bitmaps)) {
 		struct stored_bitmap *bitmap = NULL;
 		if (!bitmap_git->table_lookup)
-			return NULL;
+			return bitmap_for_commit(bitmap_git->base, commit);

 		/* this is a fairly hot codepath - no trace2_region please */
 		/* NEEDSWORK: cache misses aren't recorded */
 		bitmap = lazy_bitmap_for_commit(bitmap_git, commit);
 		if (!bitmap)
-			return NULL;
+			return bitmap_for_commit(bitmap_git->base, commit);
 		return lookup_stored_bitmap(bitmap);
 	}
 	return lookup_stored_bitmap(kh_value(bitmap_git->bitmaps, hash_pos));

From patchwork Tue Nov 19 22:07:32 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880570
Received: from mail-yb1-f170.google.com (mail-yb1-f170.google.com [209.85.219.170])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 83B0F1D47DC
 for ; Tue, 19 Nov 2024 22:07:34 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.170
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732054056; cv=none;
b=ji7fIr9x7eZIQT7fqen8UdG6KmR1jUmqp5oH0LBZshdGfxdXbLOZo0c9Nnfdh1Pk2PBSsgg5GeJog7f9QeKF/EfgLmItJ4oVj0n6mSZmcdBbpfdQGFZxyqxzqedKsuX07w6y9G8k1KqV1Ar/citm3q2vUY8/gCqG/ic1c/d4zQM= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732054056; c=relaxed/simple; bh=1W7lAVYmqwyCX7xRvlJoz0Y9t763XBgmhRBNmHvqaYI=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=rnybw3jWi95QevJXK1hveJtD2Xkmf6g6BHTxGb6km1c0v4JsGfN2KTSWfl4hxiGiZ15yiVvX5E08ElcQ4l8Wg461zYirTx0YB8TayRqffUuATl7rV3FfI4TSw++tOWm7lOhNzVCwwb8LvYipSiCwkci9YmvmUcHtdsPGGqmybPw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=XpdezrMK; arc=none smtp.client-ip=209.85.219.170 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="XpdezrMK" Received: by mail-yb1-f170.google.com with SMTP id 3f1490d57ef6-e387ad7abdaso4185407276.0 for ; Tue, 19 Nov 2024 14:07:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1732054053; x=1732658853; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=5ZmRmRyoHiab2M6+39qY3gDsE1lORmI0KrWIdrhjJfM=; b=XpdezrMK+53UeFjKFCDmwwhiliZzPSbvLPcLvvhM9KcTy+K+hJekVNfswBrhz6PEFY wClCctsX7dVzkCKkPatAkvZ1U/g91bMJwMkVJG8SSXTukG7Z63tgJkQHUMd57/sH6MD6 
cpuv23ZPwo0i8CGvz8X8CsG5S6QsIM1UwXvGfLAEQUnvPQaXgXSayzSVHxNXsisYbEDF Tu+yZW4aXKHf3EFzEgz+Cy+m68YUDIcO4Vhqyap81JF1k4G5acTk5upDPWV1ev05qSRg A3xM9vb5MDDJngyiLfAf+Mvht6sh3tZ71hiI3sr6UDd8E8qsLfbZq30955YPjFXSCFls cNcA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1732054053; x=1732658853; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=5ZmRmRyoHiab2M6+39qY3gDsE1lORmI0KrWIdrhjJfM=; b=FnUPiDatwGmIhEJlWaTCiJzd3d3vbYOpVD0RIzU2Hl8je5pfLgzTq5WG/4wcwl5U51 mzEiOttpCfeb8rVykqSbbyeugApbgxxpdXOwvD/xYuHlTNWVXoJKUGq+twztXmH7wTsJ ejH7J154AUOZ+h2akT2LM3Ft+VbYmqCL+B/CsOsTXnz4GIb9QvthS52HnaZ4jeJZ+T3p N2LEeyr4Tnhyp9LHFRWk63xKUjp4PhmmKIpCSzwOGOIfUjnxokZ5eIwYjUcjcvJs2eH5 /f3vxTIRd7c+3Qjum45vQUIoSSi2cHBLA2eMU+9gjHRHzKz4OHBt2pY47q4S+G9WVOND 1Hag== X-Gm-Message-State: AOJu0YzyaSqcoKp1aPfH7ep6mo12an8SwNO/eqqPQReE9oDjMNE15wpV 0HS2t7zfFJby30GrylakSq8K5+zqTy1fnRWEd4qQ/RsYl/zTMjXSl9nh4cAPpjgIqoi6W8D37fU wciY= X-Google-Smtp-Source: AGHT+IGiBZxkzINSRYlLuIpyX7lRRV5r7ZYD6plyXV5/SQsYWoQbWhL30isBgqDU44Ut6UPFqjQ9hg== X-Received: by 2002:a05:6902:2387:b0:e38:b44d:a959 with SMTP id 3f1490d57ef6-e38cb5ec155mr342831276.42.1732054053552; Tue, 19 Nov 2024 14:07:33 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
 [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e38b27a45edsm1069050276.48.2024.11.19.14.07.33
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 19 Nov 2024 14:07:33 -0800 (PST)
Date: Tue, 19 Nov 2024 17:07:32 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren , Jeff King , Junio C Hamano
Subject: [PATCH v3 05/13] pack-bitmap.c: teach `show_objects_for_type()` about incremental MIDXs
Message-ID:
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

Since we may ask for a pack_id that is in an earlier MIDX layer relative
to the one corresponding to our bitmap, use nth_midxed_pack() instead of
accessing the ->packs array directly.

Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index e3fdcf8a01a..c2c824347a6 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1631,7 +1631,7 @@ static void show_objects_for_type(
 				nth_midxed_object_oid(&oid, m, index_pos);

 				pack_id = nth_midxed_pack_int_id(m, index_pos);
-				pack = bitmap_git->midx->packs[pack_id];
+				pack = nth_midxed_pack(bitmap_git->midx, pack_id);
 			} else {
 				index_pos = pack_pos_to_index(bitmap_git->pack, pos + offset);
 				ofs = pack_pos_to_offset(bitmap_git->pack, pos + offset);

From patchwork Tue Nov 19 22:07:35 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880571
Received: from mail-yw1-f178.google.com (mail-yw1-f178.google.com [209.85.128.178])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id E1C781D47DC
 for ; Tue, 19 Nov 2024 22:07:37 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.128.178
ARC-Seal: i=1; a=rsa-sha256;
d=subspace.kernel.org; s=arc-20240116; t=1732054059; cv=none; b=VuoY5/byTsRvQmGLNyQZ/agZbo77fgvMp6F+FuIIksRww8/bjC/2JIkrRO5WIXWRI/kgVt8I6AvdbkL5R8Iy0NscU51UD+KXKrEcnCzaU9llVvgbB4Ia2oqX0cP4Fv2uUgA3thttfHLGrf1N6g4q7P8yTGt5OMu1X14SDtO8oaA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732054059; c=relaxed/simple; bh=7mzbZ3j8yAoOh6Ty5Gk63/0Y0S6UaFxrI2I7wJU5HZQ=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=s5ueV/Cn84iIUk3iLvtGtIN5vvEASRzmigkQcFmYff12lHNw3oOhYE38n8ka0th6CTwtIxMGqw71uw66aTRK2nMbEdx1HwgS4QFL2GTKWHk5jnLgskUIoehQNjl7auYWCk4W0dgm+X2z0epAMo1yx82sKv/mE5GRmEO2Uv6OO8E= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=QQacbOOx; arc=none smtp.client-ip=209.85.128.178 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="QQacbOOx" Received: by mail-yw1-f178.google.com with SMTP id 00721157ae682-6eeb2680092so11902697b3.1 for ; Tue, 19 Nov 2024 14:07:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1732054056; x=1732658856; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=NkTTCTVp9ddpJzKX3JdGkLamJsl5sQkwOUoLGaitR2E=; b=QQacbOOxLFyo+MFvXRINzvzYjACp2hEzBTqeH0VfR0b7pm9j5F79VpiuEuSrIM6bpH qw56Fab5lvcT6tg6oC/tEG27ed62MUDkksr+qNqTAKNQXDBnAE4BTg2EHWpGnaz1cvqs 
jSuQMQ7oO6Wuavs9HLFZYEKuILg/LpPFutdk+qWFD1dXzDg/IezgM+rm04t0kW4EUQME kfYFkduCk/N0194ciAglhdKVKnt+dio/11OcsrRskVMCLHSJUvtfUJHrcTLD9XVzQRWs ETd3ActgTu1uB+f5XEflI3X5pyIAqxVML+bwYllPhP+OCcOP69ispXzq8gLG3MhCJXYQ yTMg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1732054056; x=1732658856; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=NkTTCTVp9ddpJzKX3JdGkLamJsl5sQkwOUoLGaitR2E=; b=qBv7dP+L6KdbM/2g/4KXPVmIk0EE8lWSgknvb7v+7qiYcvnrWfsYez4+or/QaHrt6q /ObWhGsIO773OySbszMFWc8sCrCda4SpLuCFzdPCVWgHwoNikpjtUitNm44S184LoZvb 4frFz9Pj8EuzpL4OYawnBu1EsDpfp4RICdUJatRVNNAQbB58eHqba908G01bSmKBqDmo DB7bmKZj3eNcLKgwc8h6POp2Tz4ckeIOBG9ywX9JsL0to+DLhe71z2A3iIGhG0cnbkRd 8YIYLT3qb1QPftwllpwtSrd6vGjCGdqS1UIdpPbtDugEvfwQtFyqr1v4bEhwKBpXa6aB 0FSw== X-Gm-Message-State: AOJu0YyfVOiCYgYo5/uNLN2yrYft66/71VfxovMPYiFue/iNA9lz1YUH I/gLBdGcW2Y5+xgaB970mitaDiqhVA/EHqsaqgaTLp2YL+z3Sfx0OBBF5tFpip25qfbEVqeB0FO z X-Google-Smtp-Source: AGHT+IH342u94Rou5qO3L3j8oItEDMvLm+p+2A7/7gZ/kKFL5QY1LLlpiFq+Qv2IF252DBgV/cgbYA== X-Received: by 2002:a05:690c:688c:b0:6ee:988b:16d4 with SMTP id 00721157ae682-6eebd2a5e8emr7104137b3.29.1732054056663; Tue, 19 Nov 2024 14:07:36 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. 
 [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6ee71379f3dsm19239417b3.120.2024.11.19.14.07.36
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 19 Nov 2024 14:07:36 -0800 (PST)
Date: Tue, 19 Nov 2024 17:07:35 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren , Jeff King , Junio C Hamano
Subject: [PATCH v3 06/13] pack-bitmap.c: support bitmap pack-reuse with incremental MIDXs
Message-ID:
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

In a similar fashion to previous commits in the first phase of
incremental MIDXs, enumerate not just the packs in the current
incremental MIDX layer, but those in previous layers as well.

Likewise, in reuse_partial_packfile_from_bitmap(), when reusing only a
single pack from a MIDX, use the oldest layer's preferred pack as it is
likely to contain the largest number of reusable sections.

Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index c2c824347a6..1dddb242434 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -2323,7 +2323,8 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
 		multi_pack_reuse = 0;

 	if (multi_pack_reuse) {
-		for (i = 0; i < bitmap_git->midx->num_packs; i++) {
+		struct multi_pack_index *m = bitmap_git->midx;
+		for (i = 0; i < m->num_packs + m->num_packs_in_base; i++) {
 			struct bitmapped_pack pack;
 			if (nth_bitmapped_pack(r, bitmap_git->midx, &pack, i) < 0) {
 				warning(_("unable to load pack: '%s', disabling pack-reuse"),
@@ -2347,14 +2348,18 @@ void reuse_partial_packfile_from_bitmap(struct bitmap_index *bitmap_git,
 		uint32_t pack_int_id;

 		if (bitmap_is_midx(bitmap_git)) {
+			struct multi_pack_index *m = bitmap_git->midx;
 			uint32_t preferred_pack_pos;

-			if (midx_preferred_pack(bitmap_git->midx, &preferred_pack_pos) < 0) {
+			while (m->base_midx)
+				m = m->base_midx;
+
+			if (midx_preferred_pack(m, &preferred_pack_pos) < 0) {
 				warning(_("unable to compute preferred pack, disabling pack-reuse"));
 				return;
 			}

-			pack = bitmap_git->midx->packs[preferred_pack_pos];
+			pack = nth_midxed_pack(m, preferred_pack_pos);
 			pack_int_id = preferred_pack_pos;
 		} else {
 			pack = bitmap_git->pack;

From patchwork Tue Nov 19 22:07:38 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880572
Received: from mail-yb1-f180.google.com (mail-yb1-f180.google.com [209.85.219.180])
 (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
 (No client certificate requested)
 by smtp.subspace.kernel.org (Postfix) with ESMTPS id 6A6431D363A
 for ; Tue, 19 Nov 2024 22:07:41 +0000 (UTC)
Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.180
ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732054064; cv=none;
 b=qy8CVQhq3LgLUDEaqUDMXJvbO1sCV23muUz+ovND5hXumK4glNEJp/QsN/HMOPyn78jYRyeWBfY2NtpGktvDWhHkWJhReaSHd3c5LTV50DTPgei27w6G5YTBdFL0yPcTx1FT3WaF0rc0swNSLg1ckE2MSk9mdS8mU7127tzW0/U=
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732054064; c=relaxed/simple;
 bh=0sInAvI+HQAx2r9lL2BTrfzFkY1zxC0zMCovkm1+4Do=;
 h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To;
 b=pBF1ekWSK+SApzFWueReMamJMY1yajd4Zbv4JSZ4M+7gobgIEyCKyNAsSi05yya2xFBuNmjtVmYIQIlTHpU7t6p7wqnSti2k8ANKFbYkIxo06s558lusP5YkT7Ni3954p2BDiy9MitK30gQGWBAHa9BZD8DQTgm9I+GM2gX3JpQ=
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com;
 spf=pass smtp.mailfrom=ttaylorr.com;
 dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=GNgRQ2mO;
 arc=none smtp.client-ip=209.85.219.180
Authentication-Results: smtp.subspace.kernel.org;
dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="GNgRQ2mO" Received: by mail-yb1-f180.google.com with SMTP id 3f1490d57ef6-e38232fc4d8so2950826276.2 for ; Tue, 19 Nov 2024 14:07:41 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1732054060; x=1732658860; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=f22BfR2pfhbu10D3BRua0+75M+He42oQQUe+qb51iIQ=; b=GNgRQ2mOvMViDwO+XL+BeBJtDow11YU4X1DFJmu+O1rLcoFkbFrPgjQoVIdsqvwbBo 05OsL50G5MwnF4eoRxNytnrxJXDJzJwmP14/AyCrixqxk49Oy2OSAlayfyh0QZeaMsGA GD8nupxj4a52OiKXEbjSwRD+I2iXcml+Jfn5EqhFh4oM05eKFEz8whOXJqxXqOjZi/lb rGZONanY9BG5w+tQ8FYOAmeDRR+sVu2tVoiZWPo1YH1ZbjNI3seO0y5MaM0z5AFS/Fjb UIbPj9XjYqcrc8OrHTAYwKQMzYB4Npqzywk6bixEZg0FgcQRjDujD5GiHB4xz/k3oOLG lpVw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1732054060; x=1732658860; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=f22BfR2pfhbu10D3BRua0+75M+He42oQQUe+qb51iIQ=; b=KnT/anrTZbDTZ3ouWBHu9b7xPrEYVRybd4RwSCEe3CvhAdR9fDMWfenfFZv8hT/Tdt Wcv296wkT3YVElMDUrVF0bTK9I701rqSSykQIG4EGP51kjamRUMbMbVwtdYYhdiujMop Opjd366x/aVV0XrD6wMFJb9pG44E/teVPyUolyqSNMrXNJaymg1c20fwrvT/kDgfOWc0 gKSRoCjCoSGf+sGf5g1Th8kTlV4HHGbvWLtCptLwT4XCti7LTypp+fwgRg97e3Keh0Qs A1QTn411CNNnadYoxPe/DFk+z6vQznPyxQrdyBLfI6ROQWQlFggAdcIAC5OV5ImbBc9Q of8g== X-Gm-Message-State: AOJu0YxxXvwCizQpvRpay5UvAuZrrfJFjbYWW/tceEIDjYyyYbloj6bn 4eUtIBWVS4G1W8wzYOKzDFL/jLlPYNP2Gj9EEA+CkCa0aLXORiyQZR6ZvEcThPcNqQfp+Fwq+iJ y84Y= 
X-Google-Smtp-Source: AGHT+IGrLTAA5JA51PEgCoq+Xv+Mf7K1g1Cl7QLSpTzVVWBi6J7Ttor8/BvBHGmU607C5YsIY0GxWA==
X-Received: by 2002:a05:6902:2509:b0:e20:2245:6f9c with SMTP id 3f1490d57ef6-e38cb5a0011mr329026276.26.1732054059799;
 Tue, 19 Nov 2024 14:07:39 -0800 (PST)
Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189])
 by smtp.gmail.com with ESMTPSA id 3f1490d57ef6-e387e806ae0sm2733447276.57.2024.11.19.14.07.39
 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256);
 Tue, 19 Nov 2024 14:07:39 -0800 (PST)
Date: Tue, 19 Nov 2024 17:07:38 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren , Jeff King , Junio C Hamano
Subject: [PATCH v3 07/13] pack-bitmap.c: teach `rev-list --test-bitmap` about incremental MIDXs
Message-ID: <17ab23dd76dce076275873e96991acd2f2b2a994.1732054032.git.me@ttaylorr.com>
References:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
List-Id:
List-Subscribe:
List-Unsubscribe:
MIME-Version: 1.0
Content-Disposition: inline
In-Reply-To:

Implement support for the special `--test-bitmap` mode of `git rev-list`
when using incremental MIDXs.

The bitmap_test_data structure is extended to contain a "base" pointer
that mirrors the structure of the bitmap chain that it is being used to
test.

When we find a commit to test, we first chase down the ->base pointer to
find the appropriate bitmap_test_data for the bitmap layer that the
given commit is contained within, and then perform the test on that
bitmap.

In order to implement this, light modifications are made to
bitmap_for_commit() to reimplement it in terms of a new function,
find_bitmap_for_commit(), which fills out a pointer indicating the
bitmap layer that contains the given commit.
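
To make the shape of this concrete, the following is a simplified toy
sketch, not the code in the diff. All toy_* names are hypothetical. It
assumes each layer holds a single integer key, and it selects the
per-layer test data by comparing layer pointers, whereas the patch
chases ->base_tdata by comparing an object position against
num_objects_in_base.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical layer type for illustration; not Git's struct bitmap_index. */
struct toy_index {
	int key;			/* the single commit this toy layer covers */
	int value;			/* toy stand-in for its bitmap */
	struct toy_index *base;		/* older layer, or NULL */
};

/*
 * Walk from the newest layer toward the oldest. On success, return the
 * stored value and, if `found` is non-NULL, record which layer held it.
 * On a miss, `*found` is left untouched.
 */
const int *toy_find(struct toy_index *idx, int key, struct toy_index **found)
{
	if (!idx)
		return NULL;
	if (idx->key != key)
		return toy_find(idx->base, key, found);
	if (found)
		*found = idx;
	return &idx->value;
}

/* Callers that do not care about the layer simply pass NULL. */
const int *toy_get(struct toy_index *idx, int key)
{
	return toy_find(idx, key, NULL);
}

/* Per-layer test state, chained in the same shape as the index chain. */
struct toy_tdata {
	struct toy_index *idx;
	size_t seen;
	struct toy_tdata *base_tdata;
};

void toy_tdata_init(struct toy_tdata *t, struct toy_index *idx)
{
	memset(t, 0, sizeof(*t));
	t->idx = idx;
	if (idx->base) {
		t->base_tdata = calloc(1, sizeof(*t->base_tdata));
		assert(t->base_tdata);	/* sketch only; real code needs error handling */
		toy_tdata_init(t->base_tdata, idx->base);
	}
}

/* Frees the nested entries; the outermost struct belongs to the caller. */
void toy_tdata_release(struct toy_tdata *t)
{
	if (!t)
		return;
	toy_tdata_release(t->base_tdata);
	free(t->base_tdata);
	t->base_tdata = NULL;
}

/* Chase base_tdata until reaching the entry that mirrors `layer`. */
struct toy_tdata *toy_tdata_for_layer(struct toy_tdata *t,
				      const struct toy_index *layer)
{
	while (t && t->idx != layer)
		t = t->base_tdata;
	return t;
}
```

Keeping the out-parameter optional means the public lookup can stay a
one-line wrapper, which is the same arrangement the commit message
describes for bitmap_for_commit().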
Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 105 ++++++++++++++++++++++++++++++++++++++++----------
 1 file changed, 84 insertions(+), 21 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 1dddb242434..02864a0e1f7 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -943,8 +943,9 @@ static struct stored_bitmap *lazy_bitmap_for_commit(struct bitmap_index *bitmap_
 	return NULL;
 }

-struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
-				      struct commit *commit)
+static struct ewah_bitmap *find_bitmap_for_commit(struct bitmap_index *bitmap_git,
+						  struct commit *commit,
+						  struct bitmap_index **found)
 {
 	khiter_t hash_pos;
 	if (!bitmap_git)
@@ -954,18 +955,30 @@ struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
 	if (hash_pos >= kh_end(bitmap_git->bitmaps)) {
 		struct stored_bitmap *bitmap = NULL;
 		if (!bitmap_git->table_lookup)
-			return bitmap_for_commit(bitmap_git->base, commit);
+			return find_bitmap_for_commit(bitmap_git->base, commit,
+						      found);

 		/* this is a fairly hot codepath - no trace2_region please */
 		/* NEEDSWORK: cache misses aren't recorded */
 		bitmap = lazy_bitmap_for_commit(bitmap_git, commit);
 		if (!bitmap)
-			return bitmap_for_commit(bitmap_git->base, commit);
+			return find_bitmap_for_commit(bitmap_git->base, commit,
+						      found);
+		if (found)
+			*found = bitmap_git;
 		return lookup_stored_bitmap(bitmap);
 	}
+	if (found)
+		*found = bitmap_git;
 	return lookup_stored_bitmap(kh_value(bitmap_git->bitmaps, hash_pos));
 }

+struct ewah_bitmap *bitmap_for_commit(struct bitmap_index *bitmap_git,
+				      struct commit *commit)
+{
+	return find_bitmap_for_commit(bitmap_git, commit, NULL);
+}
+
 static inline int bitmap_position_extended(struct bitmap_index *bitmap_git,
 					   const struct object_id *oid)
 {
@@ -2493,6 +2506,8 @@ struct bitmap_test_data {
 	struct bitmap *tags;
 	struct progress *prg;
 	size_t seen;
+
+	struct bitmap_test_data *base_tdata;
 };

 static void test_bitmap_type(struct bitmap_test_data *tdata,
@@ -2501,6 +2516,11 @@ static void test_bitmap_type(struct bitmap_test_data *tdata,
 	enum object_type bitmap_type = OBJ_NONE;
 	int bitmaps_nr = 0;

+	if (bitmap_is_midx(tdata->bitmap_git)) {
+		while (pos < tdata->bitmap_git->midx->num_objects_in_base)
+			tdata = tdata->base_tdata;
+	}
+
 	if (bitmap_get(tdata->commits, pos)) {
 		bitmap_type = OBJ_COMMIT;
 		bitmaps_nr++;
@@ -2564,13 +2584,57 @@ static void test_show_commit(struct commit *commit, void *data)
 	display_progress(tdata->prg, ++tdata->seen);
 }

+static uint32_t bitmap_total_entry_count(struct bitmap_index *bitmap_git)
+{
+	uint32_t total = 0;
+	do {
+		total = st_add(total, bitmap_git->entry_count);
+		bitmap_git = bitmap_git->base;
+	} while (bitmap_git);
+
+	return total;
+}
+
+static void prepare_bitmap_test_data(struct bitmap_test_data *tdata,
+				     struct bitmap_index *bitmap_git)
+{
+	memset(tdata, 0, sizeof(struct bitmap_test_data));
+
+	tdata->bitmap_git = bitmap_git;
+	tdata->base = bitmap_new();
+	tdata->commits = ewah_to_bitmap(bitmap_git->commits);
+	tdata->trees = ewah_to_bitmap(bitmap_git->trees);
+	tdata->blobs = ewah_to_bitmap(bitmap_git->blobs);
+	tdata->tags = ewah_to_bitmap(bitmap_git->tags);
+
+	if (bitmap_git->base) {
+		CALLOC_ARRAY(tdata->base_tdata, 1);
+		prepare_bitmap_test_data(tdata->base_tdata, bitmap_git->base);
+	}
+}
+
+static void free_bitmap_test_data(struct bitmap_test_data *tdata)
+{
+	if (!tdata)
+		return;
+
+	free_bitmap_test_data(tdata->base_tdata);
+	free(tdata->base_tdata);
+
+	bitmap_free(tdata->base);
+	bitmap_free(tdata->commits);
+	bitmap_free(tdata->trees);
+	bitmap_free(tdata->blobs);
+	bitmap_free(tdata->tags);
+}
+
 void test_bitmap_walk(struct rev_info *revs)
 {
 	struct object *root;
 	struct bitmap *result = NULL;
 	size_t result_popcnt;
 	struct bitmap_test_data tdata;
-	struct bitmap_index *bitmap_git;
+	struct bitmap_index *bitmap_git, *found;
 	struct ewah_bitmap *bm;

 	if (!(bitmap_git = prepare_bitmap_git(revs->repo)))
@@ -2579,17 +2643,26 @@ void test_bitmap_walk(struct rev_info *revs)
 	if (revs->pending.nr != 1)
 		die(_("you must specify exactly one commit to test"));

-	fprintf_ln(stderr, "Bitmap v%d test (%d entries%s)",
+	fprintf_ln(stderr, "Bitmap v%d test (%d entries%s, %d total)",
 		   bitmap_git->version,
 		   bitmap_git->entry_count,
-		   bitmap_git->table_lookup ? "" : " loaded");
+		   bitmap_git->table_lookup ? "" : " loaded",
+		   bitmap_total_entry_count(bitmap_git));

 	root = revs->pending.objects[0].item;
-	bm = bitmap_for_commit(bitmap_git, (struct commit *)root);
+	bm = find_bitmap_for_commit(bitmap_git, (struct commit *)root, &found);

 	if (bm) {
 		fprintf_ln(stderr, "Found bitmap for '%s'. %d bits / %08x checksum",
-			oid_to_hex(&root->oid), (int)bm->bit_size, ewah_checksum(bm));
+			oid_to_hex(&root->oid),
+			(int)bm->bit_size, ewah_checksum(bm));
+
+		if (bitmap_is_midx(found))
+			fprintf_ln(stderr, "Located via MIDX '%s'.",
+				   hash_to_hex(get_midx_checksum(found->midx)));
+		else
+			fprintf_ln(stderr, "Located via pack '%s'.",
+				   hash_to_hex(found->pack->hash));

 		result = ewah_to_bitmap(bm);
 	}
@@ -2606,14 +2679,8 @@ void test_bitmap_walk(struct rev_info *revs)
 	if (prepare_revision_walk(revs))
 		die(_("revision walk setup failed"));

-	tdata.bitmap_git = bitmap_git;
-	tdata.base = bitmap_new();
-	tdata.commits = ewah_to_bitmap(bitmap_git->commits);
-	tdata.trees = ewah_to_bitmap(bitmap_git->trees);
-	tdata.blobs = ewah_to_bitmap(bitmap_git->blobs);
-	tdata.tags = ewah_to_bitmap(bitmap_git->tags);
+	prepare_bitmap_test_data(&tdata, bitmap_git);
 	tdata.prg = start_progress("Verifying bitmap entries", result_popcnt);
-	tdata.seen = 0;

 	traverse_commit_list(revs, &test_show_commit, &test_show_object, &tdata);

@@ -2625,11 +2692,7 @@ void test_bitmap_walk(struct rev_info *revs)
 		die(_("mismatch in bitmap results"));

 	bitmap_free(result);
-	bitmap_free(tdata.base);
-	bitmap_free(tdata.commits);
-	bitmap_free(tdata.trees);
-	bitmap_free(tdata.blobs);
-	bitmap_free(tdata.tags);
+	free_bitmap_test_data(&tdata);
 	free_bitmap_index(bitmap_git);
 }

From patchwork Tue Nov 19 22:07:41 2024
Content-Type: text/plain;
charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880573
Received: from mail-yw1-f179.google.com (mail-yw1-f179.google.com [209.85.128.179])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id F1B3C1D364C
	for ; Tue, 19 Nov 2024 22:07:43 +0000 (UTC)
Date: Tue, 19 Nov 2024 17:07:41 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 08/13] pack-bitmap.c: compute disk-usage with incremental MIDXs
Message-ID: <75d170ce07832a7f31b85f293044b4b32257162a.1732054032.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
MIME-Version: 1.0
Content-Disposition: inline

In a similar fashion as previous commits, use nth_midxed_pack() instead
of accessing the MIDX's ->packs array directly to support incremental
MIDXs.

Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 02864a0e1f7..b48d6b144d8 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1774,7 +1774,7 @@ static unsigned long get_size_by_pos(struct bitmap_index *bitmap_git,
 		uint32_t midx_pos = pack_pos_to_midx(bitmap_git->midx, pos);
 		uint32_t pack_id = nth_midxed_pack_int_id(bitmap_git->midx, midx_pos);

-		pack = bitmap_git->midx->packs[pack_id];
+		pack = nth_midxed_pack(bitmap_git->midx, pack_id);
 		ofs = nth_midxed_offset(bitmap_git->midx, midx_pos);
 	} else {
 		pack = bitmap_git->pack;
@@ -3025,7 +3025,7 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git,
 				off_t offset = nth_midxed_offset(bitmap_git->midx, midx_pos);

 				uint32_t pack_id = nth_midxed_pack_int_id(bitmap_git->midx, midx_pos);
-				struct packed_git *pack = bitmap_git->midx->packs[pack_id];
+				struct packed_git *pack = nth_midxed_pack(bitmap_git->midx, pack_id);

 				if (offset_to_pack_pos(pack, offset, &pack_pos) < 0) {
 					struct object_id oid;

From patchwork Tue Nov 19 22:07:44 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880574
Received: from mail-yw1-f169.google.com (mail-yw1-f169.google.com [209.85.128.169])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id EDB691D364C
	for ; Tue, 19 Nov 2024 22:07:46 +0000 (UTC)
Date: Tue, 19 Nov 2024 17:07:44 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 09/13] pack-bitmap.c: apply pseudo-merge commits with incremental MIDXs
Message-ID: <0b4fcfcecb6043534424c3c7ffc80a63dfe63f3c.1732054032.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
MIME-Version: 1.0
Content-Disposition: inline

Prepare for using pseudo-merges with incremental MIDX bitmaps by
attempting to apply pseudo-merges from each layer when encountering a
given commit during a walk.

Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index b48d6b144d8..570f6dbdad6 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -1087,10 +1087,15 @@ static unsigned apply_pseudo_merges_for_commit_1(struct bitmap_index *bitmap_git
 						 struct commit *commit,
 						 uint32_t commit_pos)
 {
-	int ret;
+	struct bitmap_index *curr = bitmap_git;
+	int ret = 0;

-	ret = apply_pseudo_merges_for_commit(&bitmap_git->pseudo_merges,
-					     result, commit, commit_pos);
+	while (curr) {
+		ret += apply_pseudo_merges_for_commit(&curr->pseudo_merges,
+						      result, commit,
+						      commit_pos);
+		curr = curr->base;
+	}

 	if (ret)
 		pseudo_merges_satisfied_nr += ret;

From patchwork Tue Nov 19 22:07:47 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880575
Received: from mail-yw1-f175.google.com (mail-yw1-f175.google.com [209.85.128.175])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with
	ESMTPS id 1E7EA1D364C for ; Tue, 19 Nov 2024 22:07:49 +0000 (UTC)
Date: Tue, 19 Nov 2024 17:07:47 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 10/13] ewah: implement `struct ewah_or_iterator`
Message-ID:
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
MIME-Version: 1.0
Content-Disposition: inline

While individual bitmap layers store different commit, type-level, and
pseudo-merge bitmaps, only the top-most layer is used to compute
reachability traversals.

Many functions which implement the aforementioned traversal rely on
enumerating the results according to the type-level bitmaps, and so
would benefit from a conceptual type-level bitmap that spans multiple
layers.

Implement `struct ewah_or_iterator` which is capable of enumerating
multiple EWAH bitmaps at once, and OR-ing the results together. When
initialized with, for example, all of the commit type bitmaps from each
layer, callers can pretend as if they are enumerating a large type-level
bitmap which contains the commits from *all* bitmap layers.

There are a couple of alternative approaches which were considered:

  - Decompress each EWAH bitmap and OR them together, enumerating a
    single (non-EWAH) bitmap. This would work, but has the disadvantage
    of decompressing a potentially large bitmap, which may not be
    necessary if the caller does not wish to read all of it.

  - Recursively call bitmap internal functions, reusing the "result" and
    "haves" bitmap from the top-most layer. This approach resembles the
    original implementation of this feature, but is inefficient in that
    it both (a) requires significant refactoring to implement, and (b)
    enumerates large sections of later bitmaps which are all zeros (as
    they pertain to objects in earlier layers).

    (b) is not so bad in and of itself, but can cause significant
    slow-downs when combined with expensive loop bodies.

This approach (enumerating an OR'd together version of all of the
type-level bitmaps from each layer) produces a significantly more
straightforward implementation with significantly less refactoring
required in order to make it work.

Signed-off-by: Taylor Blau
---
 ewah/ewah_bitmap.c | 33 +++++++++++++++++++++++++++++++++
 ewah/ewok.h        | 12 ++++++++++++
 2 files changed, 45 insertions(+)

diff --git a/ewah/ewah_bitmap.c b/ewah/ewah_bitmap.c
index 8785cbc54a8..b3a7ada0714 100644
--- a/ewah/ewah_bitmap.c
+++ b/ewah/ewah_bitmap.c
@@ -372,6 +372,39 @@ void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent)
 	read_new_rlw(it);
 }

+void ewah_or_iterator_init(struct ewah_or_iterator *it,
+			   struct ewah_bitmap **parents, size_t nr)
+{
+	size_t i;
+
+	memset(it, 0, sizeof(*it));
+
+	ALLOC_ARRAY(it->its, nr);
+	for (i = 0; i < nr; i++)
+		ewah_iterator_init(&it->its[it->nr++], parents[i]);
+}
+
+int ewah_or_iterator_next(eword_t *next, struct ewah_or_iterator *it)
+{
+	eword_t buf, out = 0;
+	size_t i;
+	int ret = 0;
+
+	for (i = 0; i < it->nr; i++)
+		if (ewah_iterator_next(&buf, &it->its[i])) {
+			out |= buf;
+			ret = 1;
+		}
+
+	*next = out;
+	return ret;
+}
+
+void ewah_or_iterator_free(struct ewah_or_iterator *it)
+{
+	free(it->its);
+}
+
 void ewah_xor(
 	struct ewah_bitmap *ewah_i,
 	struct ewah_bitmap *ewah_j,
diff --git a/ewah/ewok.h b/ewah/ewok.h
index 5e357e24933..4b70641045e 100644
--- a/ewah/ewok.h
+++ b/ewah/ewok.h
@@ -148,6 +148,18 @@ void ewah_iterator_init(struct ewah_iterator *it, struct ewah_bitmap *parent);
  */
 int ewah_iterator_next(eword_t *next, struct ewah_iterator *it);

+struct ewah_or_iterator {
+	struct ewah_iterator *its;
+	size_t nr;
+};
+
+void ewah_or_iterator_init(struct ewah_or_iterator *it,
+			   struct ewah_bitmap **parents, size_t nr);
+
+int ewah_or_iterator_next(eword_t *next, struct ewah_or_iterator *it);
+
+void ewah_or_iterator_free(struct ewah_or_iterator *it);
+
 void ewah_xor(
 	struct ewah_bitmap *ewah_i,
 	struct ewah_bitmap *ewah_j,

From patchwork Tue Nov 19 22:07:50 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880576
Received: from mail-yw1-f171.google.com (mail-yw1-f171.google.com [209.85.128.171])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 3D11A1D364C
	for ; Tue, 19 Nov 2024 22:07:53 +0000 (UTC)
ARC-Authentication-Results: i=1; smtp.subspace.kernel.org;
	dmarc=none (p=none dis=none) header.from=ttaylorr.com;
	spf=pass smtp.mailfrom=ttaylorr.com;
	dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=y7ZafyHd;
	arc=none
	smtp.client-ip=209.85.128.171
Date: Tue, 19 Nov 2024 17:07:50 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 11/13] pack-bitmap.c: keep track of each layer's type bitmaps
Message-ID: <9ab8fb472f48f42f7e0eebc6f0f986c6c74970e9.1732054032.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
MIME-Version: 1.0
Content-Disposition: inline

Prepare for reading the type-level bitmaps from previous bitmap layers
by maintaining an array for each type, where each element in that type's
array corresponds to one layer's bitmap for that type.

These fields will be used in a later commit to instantiate the 'struct
ewah_or_iterator' for each type.

Signed-off-by: Taylor Blau
---
 pack-bitmap.c | 55 +++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 51 insertions(+), 4 deletions(-)

diff --git a/pack-bitmap.c b/pack-bitmap.c
index 570f6dbdad6..348488e2d9e 100644
--- a/pack-bitmap.c
+++ b/pack-bitmap.c
@@ -78,6 +78,24 @@ struct bitmap_index {
 	struct ewah_bitmap *blobs;
 	struct ewah_bitmap *tags;

+	/*
+	 * Type index arrays when this bitmap is associated with an
+	 * incremental multi-pack index chain.
+	 *
+	 * If n is the number of unique layers in the MIDX chain, then
+	 * commits_all[n-1] is this struct's 'commits' field,
+	 * commits_all[n-2] is the commits field of this bitmap's
+	 * 'base', and so on.
+	 *
+	 * When associated either with a non-incremental MIDX or a
+	 * single packfile, these arrays each contain a single
+	 * element.
+	 */
+	struct ewah_bitmap **commits_all;
+	struct ewah_bitmap **trees_all;
+	struct ewah_bitmap **blobs_all;
+	struct ewah_bitmap **tags_all;
+
 	/* Map from object ID -> `stored_bitmap` for all the bitmapped commits */
 	kh_oid_map_t *bitmaps;

@@ -586,7 +604,29 @@ static int load_reverse_index(struct repository *r, struct bitmap_index *bitmap_
 	return load_pack_revindex(r, bitmap_git->pack);
 }

-static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git)
+static void load_all_type_bitmaps(struct bitmap_index *bitmap_git)
+{
+	struct bitmap_index *curr = bitmap_git;
+	size_t i = bitmap_git->base_nr - 1;
+
+	ALLOC_ARRAY(bitmap_git->commits_all, bitmap_git->base_nr);
+	ALLOC_ARRAY(bitmap_git->trees_all, bitmap_git->base_nr);
+	ALLOC_ARRAY(bitmap_git->blobs_all, bitmap_git->base_nr);
+	ALLOC_ARRAY(bitmap_git->tags_all, bitmap_git->base_nr);
+
+	while (curr) {
+		bitmap_git->commits_all[i] = curr->commits;
+		bitmap_git->trees_all[i] = curr->trees;
+		bitmap_git->blobs_all[i] = curr->blobs;
+		bitmap_git->tags_all[i] = curr->tags;
+
+		curr = curr->base;
+		i -= 1;
+	}
+}
+
+static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git,
+		       int recursing)
 {
 	assert(bitmap_git->map);

@@ -608,10 +648,13 @@ static int load_bitmap(struct repository *r, struct bitmap_index *bitmap_git)
 	if (bitmap_git->base) {
 		if (!bitmap_is_midx(bitmap_git))
 			BUG("non-MIDX bitmap has non-NULL base bitmap index");
-		if (load_bitmap(r, bitmap_git->base) < 0)
+		if (load_bitmap(r, bitmap_git->base, 1) < 0)
 			goto failed;
 	}

+	if (!recursing)
+		load_all_type_bitmaps(bitmap_git);
+
 	return 0;

 failed:
@@ -687,7 +730,7 @@ struct bitmap_index *prepare_bitmap_git(struct repository *r)
 {
 	struct bitmap_index *bitmap_git = xcalloc(1, sizeof(*bitmap_git));

-	if (!open_bitmap(r, bitmap_git) && !load_bitmap(r, bitmap_git))
+	if (!open_bitmap(r, bitmap_git) && !load_bitmap(r, bitmap_git, 0))
 		return bitmap_git;

 	free_bitmap_index(bitmap_git);
@@ -2042,7 +2085,7 @@ struct bitmap_index *prepare_bitmap_walk(struct rev_info *revs,
 	 * from disk. this is the point of no return; after this the rev_list
 	 * becomes invalidated and we must perform the revwalk through bitmaps
 	 */
-	if (load_bitmap(revs->repo, bitmap_git) < 0)
+	if (load_bitmap(revs->repo, bitmap_git, 0) < 0)
 		goto cleanup;

 	if (!use_boundary_traversal)
@@ -2961,6 +3004,10 @@ void free_bitmap_index(struct bitmap_index *b)
 	ewah_pool_free(b->trees);
 	ewah_pool_free(b->blobs);
 	ewah_pool_free(b->tags);
+	free(b->commits_all);
+	free(b->trees_all);
+	free(b->blobs_all);
+	free(b->tags_all);
 	if (b->bitmaps) {
 		struct stored_bitmap *sb;
 		kh_foreach_value(b->bitmaps, sb, {

From patchwork Tue Nov 19 22:07:53 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880577
Received: from mail-yw1-f171.google.com (mail-yw1-f171.google.com [209.85.128.171])
	(using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits))
	(No client certificate requested)
	by smtp.subspace.kernel.org (Postfix) with ESMTPS id 4CEDA1D6DA8
	for ; Tue, 19 Nov 2024 22:07:56 +0000 (UTC)
ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1732054077; c=relaxed/simple; bh=Phc5hS9Ti2m+Ynwhi1XGduuU6vd3ydpUjmvLAmIgTRk=;
h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=E0y+lgkEiSnlAld11BzGFsYTjn4DTJeOeY1H5XIyKq+GLtGickBQyifoNuHzeQQqqg6KMR9ZOdYwLfYKOyFPaaftKdhP2dvrqyka0a3ufvDkiaBkhXk6wtDe2CDMKwnfGNQIcWXAdfTThcdirCiQ7Hb77G3z6BNofHZbaLlZDtQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=K3iZGXyn; arc=none smtp.client-ip=209.85.128.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b="K3iZGXyn" Received: by mail-yw1-f171.google.com with SMTP id 00721157ae682-6eea47d51aeso21981417b3.2 for ; Tue, 19 Nov 2024 14:07:56 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=ttaylorr-com.20230601.gappssmtp.com; s=20230601; t=1732054075; x=1732658875; darn=vger.kernel.org; h=in-reply-to:content-disposition:mime-version:references:message-id :subject:cc:to:from:date:from:to:cc:subject:date:message-id:reply-to; bh=6Ec9dSvR1nUEtHxP8lHIiu4AkBAEkg4ijb3RcNk5UjQ=; b=K3iZGXynRXZf+tjRRyxPpnFihpdsgvFRbUgws8QzXg3lkCKvdj/e1ZcFqRynXqBU// h/Pf9f9t5BEelzBO0sZXQiSRpAsuzDd6Ryaw65dq0VFoFnUh/wMRXx7fUQQgw516TrWD 1m+Hb8TNtATrFbTOKQU4EynLeg10aPuCOJOlFocZS2SB8na2Ofxprg2263IYolcJ6JUp i5VVq6Dbb006M5UNOz4AWdrBy+Ej0Hen1jYsd11F0JLzEb6n9cudp4O79wtxdjVUHhFD 35KWCSS5rqposS2OUVpKM0U66a62EFbxxQvlBb7Tu3V7De6M3JyEUeb0LmnBqDr1TCk8 Unpg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1732054075; x=1732658875; h=in-reply-to:content-disposition:mime-version:references:message-id 
:subject:cc:to:from:date:x-gm-message-state:from:to:cc:subject:date :message-id:reply-to; bh=6Ec9dSvR1nUEtHxP8lHIiu4AkBAEkg4ijb3RcNk5UjQ=; b=R5c+S8dMrp0HF6a5SdBWyJbiMG9TubmatrfVH1ntcWK4DapGY/ccI4lET/RgxR5mTI O7AZ0S2NazGUenzWDqNaNY/mQxGcS9kT+UcyTfpGriuZOuEK3XNiZp5A48XsxD+VM+i8 rgJtX48pFTG3y3nns37YxVv1BpZ/9H9jGzk1mPenHjDCPDRXpWWQ0cA3oqHgwSlhXxTv GMockF5bE2aaQZfvifUk42m7OBHSA0aLz2nSKp5XbIZ7RE0JbFHFQRoMJqZW26Ohby/L LZab5sL0ahwOT6TlPlBETlyQsGkRnNQe0xVpWQRcHQ4kPkNaszaYSrVnh/if3LZY1DD9 85yg== X-Gm-Message-State: AOJu0YxdD8zBHxMYt4iK/gJZpkV/vhkNtdLNSAWKUDcD0L4vXdRPD+Ju YBMzLzIOy4Q9rya3wzgS81KeG/Fem+0eC/4ubo6P3VNUsAEqT4BrQLYP8/0f/3RF2z17I0B9PnI Y X-Google-Smtp-Source: AGHT+IHoDMadRrkO+C1/ba+Od+w4HP1767p3lBDX1ZyZwbhH/nYoIaVAyWz0/+0EYNwOeF47jx9mDA== X-Received: by 2002:a05:690c:9987:b0:6e5:bf26:578 with SMTP id 00721157ae682-6eebd121400mr7604837b3.17.1732054075169; Tue, 19 Nov 2024 14:07:55 -0800 (PST) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with ESMTPSA id 00721157ae682-6ee71381201sm19420027b3.128.2024.11.19.14.07.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 19 Nov 2024 14:07:54 -0800 (PST) Date: Tue, 19 Nov 2024 17:07:53 -0500 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano Subject: [PATCH v3 12/13] pack-bitmap.c: use `ewah_or_iterator` for type bitmap iterators Message-ID: <87cb011e7fc283ef34f4554122fb901c1cd87294.1732054032.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Now that we have initialized arrays for each bitmap layer's type bitmaps in the previous commit, adjust existing callers to use them in preparation for multi-layered bitmaps. 
Signed-off-by: Taylor Blau --- pack-bitmap.c | 42 +++++++++++++++++++++++++++--------------- 1 file changed, 27 insertions(+), 15 deletions(-) diff --git a/pack-bitmap.c b/pack-bitmap.c index 348488e2d9e..83696d834f6 100644 --- a/pack-bitmap.c +++ b/pack-bitmap.c @@ -1622,25 +1622,29 @@ static void show_extended_objects(struct bitmap_index *bitmap_git, } } -static void init_type_iterator(struct ewah_iterator *it, +static void init_type_iterator(struct ewah_or_iterator *it, struct bitmap_index *bitmap_git, enum object_type type) { switch (type) { case OBJ_COMMIT: - ewah_iterator_init(it, bitmap_git->commits); + ewah_or_iterator_init(it, bitmap_git->commits_all, + bitmap_git->base_nr); break; case OBJ_TREE: - ewah_iterator_init(it, bitmap_git->trees); + ewah_or_iterator_init(it, bitmap_git->trees_all, + bitmap_git->base_nr); break; case OBJ_BLOB: - ewah_iterator_init(it, bitmap_git->blobs); + ewah_or_iterator_init(it, bitmap_git->blobs_all, + bitmap_git->base_nr); break; case OBJ_TAG: - ewah_iterator_init(it, bitmap_git->tags); + ewah_or_iterator_init(it, bitmap_git->tags_all, + bitmap_git->base_nr); break; default: @@ -1657,7 +1661,7 @@ static void show_objects_for_type( size_t i = 0; uint32_t offset; - struct ewah_iterator it; + struct ewah_or_iterator it; eword_t filter; struct bitmap *objects = bitmap_git->result; @@ -1665,7 +1669,7 @@ static void show_objects_for_type( init_type_iterator(&it, bitmap_git, object_type); for (i = 0; i < objects->word_alloc && - ewah_iterator_next(&filter, &it); i++) { + ewah_or_iterator_next(&filter, &it); i++) { eword_t word = objects->words[i] & filter; size_t pos = (i * BITS_IN_EWORD); @@ -1707,6 +1711,8 @@ static void show_objects_for_type( show_reach(&oid, object_type, 0, hash, pack, ofs); } } + + ewah_or_iterator_free(&it); } static int in_bitmapped_pack(struct bitmap_index *bitmap_git, @@ -1758,7 +1764,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, { struct eindex *eindex = 
&bitmap_git->ext_index; struct bitmap *tips; - struct ewah_iterator it; + struct ewah_or_iterator it; eword_t mask; uint32_t i; @@ -1775,7 +1781,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, * packfile. */ for (i = 0, init_type_iterator(&it, bitmap_git, type); - i < to_filter->word_alloc && ewah_iterator_next(&mask, &it); + i < to_filter->word_alloc && ewah_or_iterator_next(&mask, &it); i++) { if (i < tips->word_alloc) mask &= ~tips->words[i]; @@ -1795,6 +1801,7 @@ static void filter_bitmap_exclude_type(struct bitmap_index *bitmap_git, bitmap_unset(to_filter, pos); } + ewah_or_iterator_free(&it); bitmap_free(tips); } @@ -1852,14 +1859,14 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, { struct eindex *eindex = &bitmap_git->ext_index; struct bitmap *tips; - struct ewah_iterator it; + struct ewah_or_iterator it; eword_t mask; uint32_t i; tips = find_tip_objects(bitmap_git, tip_objects, OBJ_BLOB); for (i = 0, init_type_iterator(&it, bitmap_git, OBJ_BLOB); - i < to_filter->word_alloc && ewah_iterator_next(&mask, &it); + i < to_filter->word_alloc && ewah_or_iterator_next(&mask, &it); i++) { eword_t word = to_filter->words[i] & mask; unsigned offset; @@ -1887,6 +1894,7 @@ static void filter_bitmap_blob_limit(struct bitmap_index *bitmap_git, bitmap_unset(to_filter, pos); } + ewah_or_iterator_free(&it); bitmap_free(tips); } @@ -2506,12 +2514,12 @@ static uint32_t count_object_type(struct bitmap_index *bitmap_git, struct eindex *eindex = &bitmap_git->ext_index; uint32_t i = 0, count = 0; - struct ewah_iterator it; + struct ewah_or_iterator it; eword_t filter; init_type_iterator(&it, bitmap_git, type); - while (i < objects->word_alloc && ewah_iterator_next(&filter, &it)) { + while (i < objects->word_alloc && ewah_or_iterator_next(&filter, &it)) { eword_t word = objects->words[i++] & filter; count += ewah_bit_popcount64(word); } @@ -2523,6 +2531,8 @@ static uint32_t count_object_type(struct bitmap_index *bitmap_git, 
 			count++;
 	}
 
+	ewah_or_iterator_free(&it);
+
 	return count;
 }
 
@@ -3051,13 +3061,13 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git,
 {
 	struct bitmap *result = bitmap_git->result;
 	off_t total = 0;
-	struct ewah_iterator it;
+	struct ewah_or_iterator it;
 	eword_t filter;
 	size_t i;
 
 	init_type_iterator(&it, bitmap_git, object_type);
 	for (i = 0; i < result->word_alloc &&
-			ewah_iterator_next(&filter, &it); i++) {
+			ewah_or_iterator_next(&filter, &it); i++) {
 		eword_t word = result->words[i] & filter;
 		size_t base = (i * BITS_IN_EWORD);
 		unsigned offset;
@@ -3098,6 +3108,8 @@ static off_t get_disk_usage_for_type(struct bitmap_index *bitmap_git,
 		}
 	}
 
+	ewah_or_iterator_free(&it);
+
 	return total;
 }
 

From patchwork Tue Nov 19 22:07:56 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 13880578
Date: Tue, 19 Nov 2024 17:07:56 -0500
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 13/13] midx: implement writing incremental MIDX bitmaps
Message-ID: <77ddd1170f9178849b5dbfd9cd16a14ae96cfa87.1732054032.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

Now that the pack-bitmap machinery has learned how to read and interact
with an incremental MIDX bitmap, teach the pack-bitmap-write.c machinery
(and relevant callers from within the MIDX machinery) to write such
bitmaps.

The details for doing so are mostly straightforward.
The main changes are as follows:

  - find_object_pos() now makes use of an extra MIDX parameter which is
    used to locate the bit positions of objects which are from previous
    layers (and thus do not exist in the current layer's pack_order
    field).

    (Note also that the pack_order field is moved into struct
    write_midx_context to further simplify the callers for
    write_midx_bitmap()).

  - bitmap_writer_build_type_index() first determines how many objects
    precede the current bitmap layer and offsets the bits it sets in
    each respective type-level bitmap by that amount so they can be OR'd
    together.

Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c                  |  3 +-
 midx-write.c                            | 49 ++++++++++-----
 pack-bitmap-write.c                     | 65 ++++++++++++++-----
 pack-bitmap.h                           |  4 +-
 t/t5334-incremental-multi-pack-index.sh | 84 +++++++++++++++++++++++++
 5 files changed, 171 insertions(+), 34 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 08007142671..09d9ef62055 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -1370,7 +1370,8 @@ static void write_pack_file(void)
 
 			if (write_bitmap_index) {
 				bitmap_writer_init(&bitmap_writer,
-						   the_repository, &to_pack);
+						   the_repository, &to_pack,
+						   NULL);
 				bitmap_writer_set_checksum(&bitmap_writer, hash);
 				bitmap_writer_build_type_index(&bitmap_writer,
 							       written_list);
diff --git a/midx-write.c b/midx-write.c
index b3a5f6c5166..6f7a8e045fd 100644
--- a/midx-write.c
+++ b/midx-write.c
@@ -645,15 +645,21 @@ static uint32_t *midx_pack_order(struct write_midx_context *ctx)
 	return pack_order;
 }
 
-static void write_midx_reverse_index(char *midx_name, unsigned char *midx_hash,
-				     struct write_midx_context *ctx)
+static void write_midx_reverse_index(struct write_midx_context *ctx,
+				     const char *object_dir,
+				     unsigned char *midx_hash)
 {
 	struct strbuf buf = STRBUF_INIT;
 	char *tmp_file;
 
 	trace2_region_enter("midx", "write_midx_reverse_index", the_repository);
 
-	strbuf_addf(&buf, "%s-%s.rev", midx_name, hash_to_hex(midx_hash));
+	if (ctx->incremental)
+		get_split_midx_filename_ext(&buf, object_dir, midx_hash,
+					    MIDX_EXT_REV);
+	else
+		get_midx_filename_ext(&buf, object_dir, midx_hash,
+				      MIDX_EXT_REV);
 
 	tmp_file = write_rev_file_order(NULL, ctx->pack_order, ctx->entries_nr,
 					midx_hash, WRITE_REV);
@@ -827,20 +833,26 @@ static struct commit **find_commits_for_midx_bitmap(uint32_t *indexed_commits_nr
 	return cb.commits;
 }
 
-static int write_midx_bitmap(const char *midx_name,
+static int write_midx_bitmap(struct write_midx_context *ctx,
+			     const char *object_dir,
 			     const unsigned char *midx_hash,
 			     struct packing_data *pdata,
 			     struct commit **commits,
 			     uint32_t commits_nr,
-			     uint32_t *pack_order,
 			     unsigned flags)
 {
 	int ret, i;
 	uint16_t options = 0;
 	struct bitmap_writer writer;
 	struct pack_idx_entry **index;
-	char *bitmap_name = xstrfmt("%s-%s.bitmap", midx_name,
-				    hash_to_hex(midx_hash));
+	struct strbuf bitmap_name = STRBUF_INIT;
+
+	if (ctx->incremental)
+		get_split_midx_filename_ext(&bitmap_name, object_dir, midx_hash,
+					    MIDX_EXT_BITMAP);
+	else
+		get_midx_filename_ext(&bitmap_name, object_dir, midx_hash,
+				      MIDX_EXT_BITMAP);
 
 	trace2_region_enter("midx", "write_midx_bitmap", the_repository);
 
@@ -859,7 +871,8 @@ static int write_midx_bitmap(const char *midx_name,
 	for (i = 0; i < pdata->nr_objects; i++)
 		index[i] = &pdata->objects[i].idx;
 
-	bitmap_writer_init(&writer, the_repository, pdata);
+	bitmap_writer_init(&writer, the_repository, pdata,
+			   ctx->incremental ? ctx->base_midx : NULL);
 	bitmap_writer_show_progress(&writer, flags & MIDX_PROGRESS);
 	bitmap_writer_build_type_index(&writer, index);
 
@@ -877,7 +890,7 @@ static int write_midx_bitmap(const char *midx_name,
 	 * bitmap_writer_finish().
 	 */
 	for (i = 0; i < pdata->nr_objects; i++)
-		index[pack_order[i]] = &pdata->objects[i].idx;
+		index[ctx->pack_order[i]] = &pdata->objects[i].idx;
 
 	bitmap_writer_select_commits(&writer, commits, commits_nr);
 	ret = bitmap_writer_build(&writer);
@@ -885,11 +898,11 @@ static int write_midx_bitmap(const char *midx_name,
 		goto cleanup;
 
 	bitmap_writer_set_checksum(&writer, midx_hash);
-	bitmap_writer_finish(&writer, index, bitmap_name, options);
+	bitmap_writer_finish(&writer, index, bitmap_name.buf, options);
 
 cleanup:
 	free(index);
-	free(bitmap_name);
+	strbuf_release(&bitmap_name);
 	bitmap_writer_free(&writer);
 
 	trace2_region_leave("midx", "write_midx_bitmap", the_repository);
@@ -1073,8 +1086,6 @@ static int write_midx_internal(const char *object_dir,
 	trace2_region_enter("midx", "write_midx_internal", the_repository);
 
 	ctx.incremental = !!(flags & MIDX_WRITE_INCREMENTAL);
-	if (ctx.incremental && (flags & MIDX_WRITE_BITMAP))
-		die(_("cannot write incremental MIDX with bitmap"));
 
 	if (ctx.incremental)
 		strbuf_addf(&midx_name,
@@ -1116,6 +1127,12 @@ static int write_midx_internal(const char *object_dir,
 	if (ctx.incremental) {
 		struct multi_pack_index *m = ctx.base_midx;
 		while (m) {
+			if (flags & MIDX_WRITE_BITMAP && load_midx_revindex(m)) {
+				error(_("could not load reverse index for MIDX %s"),
+				      hash_to_hex(get_midx_checksum(m)));
+				result = 1;
+				goto cleanup;
+			}
 			ctx.num_multi_pack_indexes_before++;
 			m = m->base_midx;
 		}
@@ -1382,7 +1399,7 @@ static int write_midx_internal(const char *object_dir,
 
 	if (flags & MIDX_WRITE_REV_INDEX &&
 	    git_env_bool("GIT_TEST_MIDX_WRITE_REV", 0))
-		write_midx_reverse_index(midx_name.buf, midx_hash, &ctx);
+		write_midx_reverse_index(&ctx, object_dir, midx_hash);
 
 	if (flags & MIDX_WRITE_BITMAP) {
 		struct packing_data pdata;
@@ -1405,8 +1422,8 @@ static int write_midx_internal(const char *object_dir,
 		FREE_AND_NULL(ctx.entries);
 		ctx.entries_nr = 0;
 
-		if (write_midx_bitmap(midx_name.buf, midx_hash, &pdata,
-				      commits, commits_nr, ctx.pack_order,
+		if (write_midx_bitmap(&ctx, object_dir,
+				      midx_hash, &pdata, commits, commits_nr,
 				      flags) < 0) {
 			error(_("could not write multi-pack bitmap"));
 			result = 1;
diff --git a/pack-bitmap-write.c b/pack-bitmap-write.c
index 49758e2525f..1fbebe84479 100644
--- a/pack-bitmap-write.c
+++ b/pack-bitmap-write.c
@@ -25,6 +25,8 @@
 #include "alloc.h"
 #include "refs.h"
 #include "strmap.h"
+#include "midx.h"
+#include "pack-revindex.h"
 
 struct bitmapped_commit {
 	struct commit *commit;
@@ -42,7 +44,8 @@ static inline int bitmap_writer_nr_selected_commits(struct bitmap_writer *writer
 }
 
 void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r,
-			struct packing_data *pdata)
+			struct packing_data *pdata,
+			struct multi_pack_index *midx)
 {
 	memset(writer, 0, sizeof(struct bitmap_writer));
 	if (writer->bitmaps)
@@ -50,6 +53,7 @@ void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r,
 	writer->bitmaps = kh_init_oid_map();
 	writer->pseudo_merge_commits = kh_init_oid_map();
 	writer->to_pack = pdata;
+	writer->midx = midx;
 
 	string_list_init_dup(&writer->pseudo_merge_groups);
 
@@ -112,6 +116,11 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer,
 				    struct pack_idx_entry **index)
 {
 	uint32_t i;
+	uint32_t base_objects = 0;
+
+	if (writer->midx)
+		base_objects = writer->midx->num_objects +
+			writer->midx->num_objects_in_base;
 
 	writer->commits = ewah_new();
 	writer->trees = ewah_new();
@@ -141,19 +150,19 @@ void bitmap_writer_build_type_index(struct bitmap_writer *writer,
 
 		switch (real_type) {
 		case OBJ_COMMIT:
-			ewah_set(writer->commits, i);
+			ewah_set(writer->commits, i + base_objects);
 			break;
 
 		case OBJ_TREE:
-			ewah_set(writer->trees, i);
+			ewah_set(writer->trees, i + base_objects);
 			break;
 
 		case OBJ_BLOB:
-			ewah_set(writer->blobs, i);
+			ewah_set(writer->blobs, i + base_objects);
 			break;
 
 		case OBJ_TAG:
-			ewah_set(writer->tags, i);
+			ewah_set(writer->tags, i + base_objects);
 			break;
 
 		default:
@@ -206,19 +215,37 @@ void bitmap_writer_push_commit(struct bitmap_writer *writer,
 static uint32_t find_object_pos(struct bitmap_writer *writer,
 				const struct object_id *oid, int *found)
 {
-	struct object_entry *entry = packlist_find(writer->to_pack, oid);
+	struct object_entry *entry;
+
+	entry = packlist_find(writer->to_pack, oid);
+	if (entry) {
+		uint32_t base_objects = 0;
+		if (writer->midx)
+			base_objects = writer->midx->num_objects +
+				writer->midx->num_objects_in_base;
+
+		if (found)
+			*found = 1;
+		return oe_in_pack_pos(writer->to_pack, entry) + base_objects;
+	} else if (writer->midx) {
+		uint32_t at, pos;
+
+		if (!bsearch_midx(oid, writer->midx, &at))
+			goto missing;
+		if (midx_to_pack_pos(writer->midx, at, &pos) < 0)
+			goto missing;
 
-	if (!entry) {
 		if (found)
-			*found = 0;
-		warning("Failed to write bitmap index. Packfile doesn't have full closure "
-			"(object %s is missing)", oid_to_hex(oid));
-		return 0;
+			*found = 1;
+		return pos;
 	}
 
+missing:
 	if (found)
-		*found = 1;
-	return oe_in_pack_pos(writer->to_pack, entry);
+		*found = 0;
+	warning("Failed to write bitmap index. Packfile doesn't have full closure "
+		"(object %s is missing)", oid_to_hex(oid));
+	return 0;
 }
 
 static void compute_xor_offsets(struct bitmap_writer *writer)
@@ -585,7 +612,7 @@ int bitmap_writer_build(struct bitmap_writer *writer)
 	struct prio_queue queue = { compare_commits_by_gen_then_commit_date };
 	struct prio_queue tree_queue = { NULL };
 	struct bitmap_index *old_bitmap;
-	uint32_t *mapping;
+	uint32_t *mapping = NULL;
 	int closed = 1; /* until proven otherwise */
 
 	if (writer->show_progress)
@@ -1018,7 +1045,7 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
 	struct strbuf tmp_file = STRBUF_INIT;
 	struct hashfile *f;
 	off_t *offsets = NULL;
-	uint32_t i;
+	uint32_t i, base_objects;
 
 	struct bitmap_disk_header header;
 
@@ -1044,6 +1071,12 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
 	if (options & BITMAP_OPT_LOOKUP_TABLE)
 		CALLOC_ARRAY(offsets, writer->to_pack->nr_objects);
 
+	if (writer->midx)
+		base_objects = writer->midx->num_objects +
+			writer->midx->num_objects_in_base;
+	else
+		base_objects = 0;
+
 	for (i = 0; i < bitmap_writer_nr_selected_commits(writer); i++) {
 		struct bitmapped_commit *stored = &writer->selected[i];
 		int commit_pos = oid_pos(&stored->commit->object.oid, index,
@@ -1052,7 +1085,7 @@ void bitmap_writer_finish(struct bitmap_writer *writer,
 
 		if (commit_pos < 0)
 			BUG(_("trying to write commit not in index"));
-		stored->commit_pos = commit_pos;
+		stored->commit_pos = commit_pos + base_objects;
 	}
 
 	write_selected_commits_v1(writer, f, offsets);
diff --git a/pack-bitmap.h b/pack-bitmap.h
index d7f4b8b8e95..dd0951088f6 100644
--- a/pack-bitmap.h
+++ b/pack-bitmap.h
@@ -111,6 +111,7 @@ struct bitmap_writer {
 
 	kh_oid_map_t *bitmaps;
 	struct packing_data *to_pack;
+	struct multi_pack_index *midx; /* if appending to a MIDX chain */
 
 	struct bitmapped_commit *selected;
 	unsigned int selected_nr, selected_alloc;
@@ -125,7 +126,8 @@ struct bitmap_writer {
 };
 
 void bitmap_writer_init(struct bitmap_writer *writer, struct repository *r,
-			struct packing_data *pdata);
+			struct packing_data *pdata,
+			struct multi_pack_index *midx);
 void bitmap_writer_show_progress(struct bitmap_writer *writer, int show);
 void bitmap_writer_set_checksum(struct bitmap_writer *writer,
 				const unsigned char *sha1);
diff --git a/t/t5334-incremental-multi-pack-index.sh b/t/t5334-incremental-multi-pack-index.sh
index 471994c4bc8..3aac7ccdfe2 100755
--- a/t/t5334-incremental-multi-pack-index.sh
+++ b/t/t5334-incremental-multi-pack-index.sh
@@ -45,4 +45,88 @@ test_expect_success 'convert incremental to non-incremental' '
 
 compare_results_with_midx 'non-incremental MIDX conversion'
 
+write_midx_layer () {
+	n=1
+	if test -f $midx_chain
+	then
+		n="$(($(wc -l <$midx_chain) + 1))"
+	fi
+
+	for i in 1 2
+	do
+		test_commit $n.$i &&
+		git repack -d || return 1
+	done &&
+	git multi-pack-index write --bitmap --incremental
+}
+
+test_expect_success 'write initial MIDX layer' '
+	git repack -ad &&
+	write_midx_layer
+'
+
+test_expect_success 'read bitmap from first MIDX layer' '
+	git rev-list --test-bitmap 1.2
+'
+
+test_expect_success 'write another MIDX layer' '
+	write_midx_layer
+'
+
+test_expect_success 'midx verify with multiple layers' '
+	git multi-pack-index verify
+'
+
+test_expect_success 'read bitmap from second MIDX layer' '
+	git rev-list --test-bitmap 2.2
+'
+
+test_expect_success 'read earlier bitmap from second MIDX layer' '
+	git rev-list --test-bitmap 1.2
+'
+
+test_expect_success 'show object from first pack' '
+	git cat-file -p 1.1
+'
+
+test_expect_success 'show object from second pack' '
+	git cat-file -p 2.2
+'
+
+for reuse in false single multi
+do
+	test_expect_success "full clone (pack.allowPackReuse=$reuse)" '
+		rm -fr clone.git &&
+
+		git config pack.allowPackReuse $reuse &&
+		git clone --no-local --bare . clone.git
+	'
+done
+
+test_expect_success 'relink existing MIDX layer' '
+	rm -fr "$midxdir" &&
+
+	GIT_TEST_MIDX_WRITE_REV=1 git multi-pack-index write --bitmap &&
+
+	midx_hash="$(test-tool read-midx --checksum $objdir)" &&
+
+	test_path_is_file "$packdir/multi-pack-index" &&
+	test_path_is_file "$packdir/multi-pack-index-$midx_hash.bitmap" &&
+	test_path_is_file "$packdir/multi-pack-index-$midx_hash.rev" &&
+
+	test_commit another &&
+	git repack -d &&
+	git multi-pack-index write --bitmap --incremental &&
+
+	test_path_is_missing "$packdir/multi-pack-index" &&
+	test_path_is_missing "$packdir/multi-pack-index-$midx_hash.bitmap" &&
+	test_path_is_missing "$packdir/multi-pack-index-$midx_hash.rev" &&
+
+	test_path_is_file "$midxdir/multi-pack-index-$midx_hash.midx" &&
+	test_path_is_file "$midxdir/multi-pack-index-$midx_hash.bitmap" &&
+	test_path_is_file "$midxdir/multi-pack-index-$midx_hash.rev" &&
+	test_line_count = 2 "$midx_chain"
+
+'
+
 test_done
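
An illustrative aside on the base_objects offset described in the commit
message for patch 13/13: the sketch below is not code from the series. It
uses a plain fixed-size word array in place of EWAH bitmaps, and every
name prefixed with "sketch_" is hypothetical. It only aims to show why
shifting each layer's bits by the number of objects in earlier layers
lets the per-layer type bitmaps be OR'd together without collisions.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SKETCH_WORDS 4				/* hypothetical capacity: 256 bits */
#define SKETCH_BITS (SKETCH_WORDS * 64)

/* Uncompressed stand-in for one type-level bitmap (e.g. "commits"). */
struct sketch_bitmap {
	uint64_t words[SKETCH_WORDS];
};

static void sketch_set(struct sketch_bitmap *b, uint32_t pos)
{
	assert(pos < SKETCH_BITS);
	b->words[pos / 64] |= (uint64_t)1 << (pos % 64);
}

static int sketch_get(const struct sketch_bitmap *b, uint32_t pos)
{
	assert(pos < SKETCH_BITS);
	return (int)((b->words[pos / 64] >> (pos % 64)) & 1);
}

/*
 * Record layer-local object `i` at its chain-wide bit position, in the
 * spirit of the `i + base_objects` change in the patch.
 */
static void sketch_set_in_layer(struct sketch_bitmap *b,
				uint32_t base_objects, uint32_t i)
{
	sketch_set(b, i + base_objects);
}

/* OR two layers' bitmaps word by word into `out`. */
static void sketch_or(struct sketch_bitmap *out,
		      const struct sketch_bitmap *a,
		      const struct sketch_bitmap *b)
{
	for (size_t w = 0; w < SKETCH_WORDS; w++)
		out->words[w] = a->words[w] | b->words[w];
}

/*
 * Two made-up layers: the first holds 3 objects (commits at local
 * positions 0 and 2), the second holds 2 objects (a commit at local
 * position 1) and is offset by the 3 objects that precede it.
 */
static void sketch_demo(struct sketch_bitmap *all)
{
	struct sketch_bitmap layer1, layer2;

	memset(&layer1, 0, sizeof(layer1));
	memset(&layer2, 0, sizeof(layer2));

	sketch_set_in_layer(&layer1, 0, 0);
	sketch_set_in_layer(&layer1, 0, 2);
	sketch_set_in_layer(&layer2, 3, 1);	/* lands on chain-wide bit 4 */

	sketch_or(all, &layer1, &layer2);
}
```

Calling sketch_demo() on a struct sketch_bitmap leaves bits 0, 2 and 4
set. Without the offset, the second layer's commit would have landed on
bit 1, which belongs to a non-commit object of the first layer.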