From patchwork Tue Apr 15 22:46:48 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 14052863
Date: Tue, 15 Apr 2025 18:46:48 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 1/9] pack-objects: use standard option incompatibility functions
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

pack-objects has a handful of explicit checks for pairs of command-line
options which are mutually incompatible. Many of these pre-date
a699367bb8 (i18n: factorize more 'incompatible options' messages,
2022-01-31).

Convert the explicit checks into die_for_incompatible_opt2() calls,
which simplifies the implementation and standardizes pack-objects'
output when given incompatible options (e.g., --stdin-packs with
--filter gives different output than --keep-unreachable with
--unpack-unreachable).

There is one minor piece of test fallout in t5331 that expects the old
format, which has been corrected.
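
For readers unfamiliar with the helper, the check it performs can be
sketched in isolation. The snippet below is an illustrative stand-in,
not Git's implementation: incompatible_opt2_message() is a hypothetical
name, and it formats the message into a caller-supplied buffer instead
of calling die(). The message text follows the format that the updated
t5331 test greps for.

```c
#include <stddef.h>
#include <stdio.h>

/*
 * Hypothetical stand-in for die_for_incompatible_opt2(): if both
 * options are set, write the standardized message into 'buf' and
 * return 1; otherwise leave 'buf' empty and return 0. The real
 * helper dies instead of returning.
 */
static int incompatible_opt2_message(int opt1, const char *opt1_name,
				     int opt2, const char *opt2_name,
				     char *buf, size_t len)
{
	if (len)
		buf[0] = '\0';
	if (!opt1 || !opt2)
		return 0;
	snprintf(buf, len, "options '%s' and '%s' cannot be used together",
		 opt1_name, opt2_name);
	return 1;
}
```

A caller passes each flag's value next to its name, mirroring the shape
of the converted call sites (for instance, stdin_packs with
"--stdin-packs" and filter_options.choice with "--filter").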

Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c        | 20 +++++++++++---------
 t/t5331-pack-objects-stdin.sh |  2 +-
 2 files changed, 12 insertions(+), 10 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 6b06d159d2..20dd870bbf 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -4651,9 +4651,10 @@ int cmd_pack_objects(int argc,
 		strvec_push(&rp, "--unpacked");
 	}
 
-	if (exclude_promisor_objects && exclude_promisor_objects_best_effort)
-		die(_("options '%s' and '%s' cannot be used together"),
-		    "--exclude-promisor-objects", "--exclude-promisor-objects-best-effort");
+	die_for_incompatible_opt2(exclude_promisor_objects,
+				  "--exclude-promisor-objects",
+				  exclude_promisor_objects_best_effort,
+				  "--exclude-promisor-objects-best-effort");
 	if (exclude_promisor_objects) {
 		use_internal_rev_list = 1;
 		fetch_if_missing = 0;
@@ -4691,13 +4692,14 @@ int cmd_pack_objects(int argc,
 	if (!pack_to_stdout && thin)
 		die(_("--thin cannot be used to build an indexable pack"));
 
-	if (keep_unreachable && unpack_unreachable)
-		die(_("options '%s' and '%s' cannot be used together"), "--keep-unreachable", "--unpack-unreachable");
+	die_for_incompatible_opt2(keep_unreachable, "--keep-unreachable",
+				  unpack_unreachable, "--unpack-unreachable");
 	if (!rev_list_all || !rev_list_reflog || !rev_list_index)
 		unpack_unreachable_expiration = 0;
 
-	if (stdin_packs && filter_options.choice)
-		die(_("cannot use --filter with --stdin-packs"));
+	die_for_incompatible_opt2(stdin_packs, "--stdin-packs",
+				  filter_options.choice, "--filter");
+
 
 	if (stdin_packs && use_internal_rev_list)
 		die(_("cannot use internal rev list with --stdin-packs"));
@@ -4705,8 +4707,8 @@ int cmd_pack_objects(int argc,
 	if (cruft) {
 		if (use_internal_rev_list)
 			die(_("cannot use internal rev list with --cruft"));
-		if (stdin_packs)
-			die(_("cannot use --stdin-packs with --cruft"));
+		die_for_incompatible_opt2(stdin_packs, "--stdin-packs",
+					  cruft, "--cruft");
 	}
 
 	/*
diff --git a/t/t5331-pack-objects-stdin.sh b/t/t5331-pack-objects-stdin.sh
index b48c0cbe8f..8fd07deb8d 100755
--- a/t/t5331-pack-objects-stdin.sh
+++ b/t/t5331-pack-objects-stdin.sh
@@ -64,7 +64,7 @@ test_expect_success '--stdin-packs is incompatible with --filter' '
 		cd stdin-packs &&
 		test_must_fail git pack-objects --stdin-packs --stdout \
 			--filter=blob:none </dev/null 2>err &&
-		test_grep "cannot use --filter with --stdin-packs" err
+		test_grep "options .--stdin-packs. and .--filter. cannot be used together" err
 	)
 '
 

From patchwork Tue Apr 15 22:46:51 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 14052864
Date: Tue, 15 Apr 2025 18:46:51 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 2/9] pack-objects: limit scope in 'add_object_entry_from_pack()'
Message-ID: <986bef29b5f33d32fd366aa9370d439175a9b605.1744757204.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

In add_object_entry_from_pack() we declare 'revs' (given to us through
the miscellaneous context argument) earlier in the "if (p)" conditional
than is necessary. Move it down as far as it can go to reduce its scope.

Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 20dd870bbf..4ab695a3aa 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3490,14 +3490,14 @@ static int add_object_entry_from_pack(const struct object_id *oid,
 		return 0;
 
 	if (p) {
-		struct rev_info *revs = _data;
 		struct object_info oi = OBJECT_INFO_INIT;
-
 		oi.typep = &type;
+
 		if (packed_object_info(the_repository, p, ofs, &oi) < 0) {
 			die(_("could not get type of object %s in pack %s"),
 			    oid_to_hex(oid), p->pack_name);
 		} else if (type == OBJ_COMMIT) {
+			struct rev_info *revs = _data;
 			/*
 			 * commits in included packs are used as starting points for the
 			 * subsequent revision walk

From patchwork Tue Apr 15 22:46:54 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 14052865
Date: Tue, 15 Apr 2025 18:46:54 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 3/9] pack-objects: factor out handling '--stdin-packs'
Message-ID: <6f8fe8a4e10198b0339337376279cff4ac654879.1744757204.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

At the bottom of cmd_pack_objects() we check which mode the command is
running in (e.g., generating a cruft pack, handling '--stdin-packs',
using the internal rev-list, etc.) and handle the mode appropriately.

The '--stdin-packs' case is handled inline (dating back to its
introduction in 339bce27f4 (builtin/pack-objects.c: add '--stdin-packs'
option, 2021-02-22)) since it is relatively short. Extract the body of
"if (stdin_packs)" into its own function to prepare for the
implementation to become lengthier in a following commit.

Signed-off-by: Taylor Blau
---
 builtin/pack-objects.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c
index 4ab695a3aa..a293267074 100644
--- a/builtin/pack-objects.c
+++ b/builtin/pack-objects.c
@@ -3674,6 +3674,17 @@ static void read_packs_list_from_stdin(void)
 	string_list_clear(&exclude_packs, 0);
 }
 
+static void add_unreachable_loose_objects(void);
+
+static void read_stdin_packs(int rev_list_unpacked)
+{
+	/* avoids adding objects in excluded packs */
+	ignore_packed_keep_in_core = 1;
+	read_packs_list_from_stdin();
+	if (rev_list_unpacked)
+		add_unreachable_loose_objects();
+}
+
 static void add_cruft_object_entry(const struct object_id *oid, enum object_type type,
 				   struct packed_git *pack, off_t offset,
 				   const char *name, uint32_t mtime)
@@ -3769,7 +3780,6 @@ static void mark_pack_kept_in_core(struct string_list *packs, unsigned keep)
 	}
 }
 
-static void add_unreachable_loose_objects(void);
 static void add_objects_in_unpacked_packs(void);
 
 static void enumerate_cruft_objects(void)
@@ -4776,11 +4786,7 @@ int cmd_pack_objects(int argc,
 		progress_state = start_progress(the_repository,
 						_("Enumerating objects"), 0);
 	if (stdin_packs) {
-		/* avoids adding objects in excluded packs */
-		ignore_packed_keep_in_core = 1;
-		read_packs_list_from_stdin();
-		if (rev_list_unpacked)
-			add_unreachable_loose_objects();
+		read_stdin_packs(rev_list_unpacked);
 	} else if (cruft) {
 		read_cruft_objects();
 	} else if (!use_internal_rev_list) {

From patchwork Tue Apr 15 22:46:58 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 14052866
Date: Tue, 15 Apr 2025 18:46:58 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 4/9] pack-objects: declare 'rev_info' for '--stdin-packs' earlier
Message-ID: <2a235461a611d7abd90311c51174e2ed85eafa1b.1744757204.git.me@ttaylorr.com>
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
Content-Disposition: inline

Once 'read_packs_list_from_stdin()' has called for_each_object_in_pack()
on each of the input packs, we do a reachability traversal to discover
names for any objects we picked up so we can generate name hash values
and hopefully get higher quality deltas as a result.

A future commit will change the purpose of this reachability traversal
to find and pack objects which are reachable from commits in the input
packs, but are packed in an unknown (not included nor excluded) pack.

Extract the code which initializes and performs the reachability
traversal to take place in the caller, not the callee, which prepares us
to share this code for the '--unpacked' case (see the function
add_unreachable_loose_objects() for more details).
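
To see why discovering object names helps delta selection, consider a
name hash that weights the trailing characters of a path most heavily,
so that paths with similar endings tend to sort near one another. The
sketch below is written from memory in the spirit of Git's
pack_name_hash(); sketch_name_hash() is a hypothetical name, and the
details may differ from what Git actually does.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical name hash: each new character is added into the top
 * byte while the accumulated value is shifted down by two bits, so
 * the last characters of the name carry the most weight.
 */
static uint32_t sketch_name_hash(const char *name)
{
	uint32_t hash = 0;

	if (!name)
		return 0; /* no name known: no hint for delta selection */

	for (; *name; name++) {
		unsigned char c = (unsigned char)*name;

		if (isspace(c))
			continue; /* whitespace does not contribute */
		hash = (hash >> 2) + ((uint32_t)c << 24);
	}
	return hash;
}
```

An object that is never reached by the traversal has no name, which in
this sketch yields a hash of zero and therefore no grouping hint.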
Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 71 +++++++++++++++++++++--------------------- 1 file changed, 36 insertions(+), 35 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index a293267074..d60cb042c9 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3558,7 +3558,7 @@ static int pack_mtime_cmp(const void *_a, const void *_b) return 0; } -static void read_packs_list_from_stdin(void) +static void read_packs_list_from_stdin(struct rev_info *revs) { struct strbuf buf = STRBUF_INIT; struct string_list include_packs = STRING_LIST_INIT_DUP; @@ -3566,24 +3566,6 @@ static void read_packs_list_from_stdin(void) struct string_list_item *item = NULL; struct packed_git *p; - struct rev_info revs; - - repo_init_revisions(the_repository, &revs, NULL); - /* - * Use a revision walk to fill in the namehash of objects in the include - * packs. To save time, we'll avoid traversing through objects that are - * in excluded packs. - * - * That may cause us to avoid populating all of the namehash fields of - * all included objects, but our goal is best-effort, since this is only - * an optimization during delta selection. 
- */ - revs.no_kept_objects = 1; - revs.keep_pack_cache_flags |= IN_CORE_KEEP_PACKS; - revs.blob_objects = 1; - revs.tree_objects = 1; - revs.tag_objects = 1; - revs.ignore_missing_links = 1; while (strbuf_getline(&buf, stdin) != EOF) { if (!buf.len) @@ -3653,10 +3635,44 @@ static void read_packs_list_from_stdin(void) struct packed_git *p = item->util; for_each_object_in_pack(p, add_object_entry_from_pack, - &revs, + revs, FOR_EACH_OBJECT_PACK_ORDER); } + strbuf_release(&buf); + string_list_clear(&include_packs, 0); + string_list_clear(&exclude_packs, 0); +} + +static void add_unreachable_loose_objects(void); + +static void read_stdin_packs(int rev_list_unpacked) +{ + struct rev_info revs; + + repo_init_revisions(the_repository, &revs, NULL); + /* + * Use a revision walk to fill in the namehash of objects in the include + * packs. To save time, we'll avoid traversing through objects that are + * in excluded packs. + * + * That may cause us to avoid populating all of the namehash fields of + * all included objects, but our goal is best-effort, since this is only + * an optimization during delta selection. 
+ */ + revs.no_kept_objects = 1; + revs.keep_pack_cache_flags |= IN_CORE_KEEP_PACKS; + revs.blob_objects = 1; + revs.tree_objects = 1; + revs.tag_objects = 1; + revs.ignore_missing_links = 1; + + /* avoids adding objects in excluded packs */ + ignore_packed_keep_in_core = 1; + read_packs_list_from_stdin(&revs); + if (rev_list_unpacked) + add_unreachable_loose_objects(); + if (prepare_revision_walk(&revs)) die(_("revision walk setup failed")); traverse_commit_list(&revs, @@ -3668,21 +3684,6 @@ static void read_packs_list_from_stdin(void) stdin_packs_found_nr); trace2_data_intmax("pack-objects", the_repository, "stdin_packs_hints", stdin_packs_hints_nr); - - strbuf_release(&buf); - string_list_clear(&include_packs, 0); - string_list_clear(&exclude_packs, 0); -} - -static void add_unreachable_loose_objects(void); - -static void read_stdin_packs(int rev_list_unpacked) -{ - /* avoids adding objects in excluded packs */ - ignore_packed_keep_in_core = 1; - read_packs_list_from_stdin(); - if (rev_list_unpacked) - add_unreachable_loose_objects(); } static void add_cruft_object_entry(const struct object_id *oid, enum object_type type, From patchwork Tue Apr 15 22:47:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 14052867 Received: from mail-qt1-f170.google.com (mail-qt1-f170.google.com [209.85.160.170]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 867882BE100 for ; Tue, 15 Apr 2025 22:47:03 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.170 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744757225; cv=none; b=aMvHgqBlW+r76ju6ffSVWC748KpwEWWMdmt2lJYMYLFgB+tjiwYFc2e0ifzgkY1m+9vNV1n4uYmKqqzc+640/N0j8GaDz1Zrn8z+ZDeYs6FTQkClXTxDBCGMRKLFpVFot/QcqRtggGbNIzorqtHdh6CgvxBghxYg+ZMWRrk4PRo= 
[104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id d75a77b69052e-4796ed9cc97sm99163011cf.61.2025.04.15.15.47.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 15 Apr 2025 15:47:02 -0700 (PDT) Date: Tue, 15 Apr 2025 18:47:01 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano Subject: [PATCH v3 5/9] pack-objects: perform name-hash traversal for unpacked objects Message-ID: <240e90b68d18b3231826d2a68e4e251c893e645a.1744757204.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: With '--unpacked', pack-objects adds loose objects (which don't appear in any of the excluded packs from '--stdin-packs') to the output pack without considering them as reachability tips for the name-hash traversal. This was an oversight in the original implementation of '--stdin-packs', since the code which enumerates and adds loose objects to the output pack (`add_unreachable_loose_objects()`) did not have access to the 'rev_info' struct found in `read_packs_list_from_stdin()`. Excluding unpacked objects from that traversal doesn't affect the correctness of the resulting pack, but it does make it harder to discover good deltas for loose objects. Now that the 'rev_info' struct is declared outside of `read_packs_list_from_stdin()`, we can pass it to `add_unreachable_loose_objects()` and add any loose commits as tips to the above-mentioned traversal, in theory producing slightly tighter packs as a result. 
Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 20 ++++++++++++-------- 1 file changed, 12 insertions(+), 8 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index d60cb042c9..eb2a4099cc 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3644,7 +3644,7 @@ static void read_packs_list_from_stdin(struct rev_info *revs) string_list_clear(&exclude_packs, 0); } -static void add_unreachable_loose_objects(void); +static void add_unreachable_loose_objects(struct rev_info *revs); static void read_stdin_packs(int rev_list_unpacked) { @@ -3671,7 +3671,7 @@ static void read_stdin_packs(int rev_list_unpacked) ignore_packed_keep_in_core = 1; read_packs_list_from_stdin(&revs); if (rev_list_unpacked) - add_unreachable_loose_objects(); + add_unreachable_loose_objects(&revs); if (prepare_revision_walk(&revs)) die(_("revision walk setup failed")); @@ -3790,7 +3790,7 @@ static void enumerate_cruft_objects(void) _("Enumerating cruft objects"), 0); add_objects_in_unpacked_packs(); - add_unreachable_loose_objects(); + add_unreachable_loose_objects(NULL); stop_progress(&progress_state); } @@ -4068,8 +4068,9 @@ static void add_objects_in_unpacked_packs(void) } static int add_loose_object(const struct object_id *oid, const char *path, - void *data UNUSED) + void *data) { + struct rev_info *revs = data; enum object_type type = oid_object_info(the_repository, oid, NULL); if (type < 0) { @@ -4090,6 +4091,10 @@ static int add_loose_object(const struct object_id *oid, const char *path, } else { add_object_entry(oid, type, "", 0); } + + if (revs && type == OBJ_COMMIT) + add_pending_oid(revs, NULL, oid, 0); + return 0; } @@ -4098,11 +4103,10 @@ static int add_loose_object(const struct object_id *oid, const char *path, * add_object_entry will weed out duplicates, so we just add every * loose object we find. 
*/ -static void add_unreachable_loose_objects(void) +static void add_unreachable_loose_objects(struct rev_info *revs) { for_each_loose_file_in_objdir(repo_get_object_directory(the_repository), - add_loose_object, - NULL, NULL, NULL); + add_loose_object, NULL, NULL, revs); } static int has_sha1_pack_kept_or_nonlocal(const struct object_id *oid) @@ -4358,7 +4362,7 @@ static void get_object_list(struct rev_info *revs, int ac, const char **av) if (keep_unreachable) add_objects_in_unpacked_packs(); if (pack_loose_unreachable) - add_unreachable_loose_objects(); + add_unreachable_loose_objects(NULL); if (unpack_unreachable) loosen_unused_packed_objects(); From patchwork Tue Apr 15 22:47:05 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 14052868 Received: from mail-qv1-f48.google.com (mail-qv1-f48.google.com [209.85.219.48]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id B073C2063D2 for ; Tue, 15 Apr 2025 22:47:08 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.219.48 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744757230; cv=none; b=TURZJSKKv9QhhgidBViZQUA9pbTh2sIfUgKOwqesfIrXds4q595gqUe23t4Ybm3eAqwEsk135PktrWvmF/362mGK7Q3GnEGHXTG0XLHyQgngdUkSiWDOeAFSi+1PE16lv4qfKKJoTkGHPDWTGtcDCbe4urpnUWZMB6x1hb9zfwA= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744757230; c=relaxed/simple; bh=hP1POH9Oxs+UqPZGTFHXuUYUS4SR81sHGlZ6cD/MC6E=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=j/Nqb0Loj37/8luACXcZeihGQT/tatImYYmQ4sqkBeVyS5Qe8pioSgUL3ssyrL5dssypxLTXQl3/pZuSL48huWTI29O+pQZdg2EtC58mkW7IKz+SBmgNbwkM+fyDo4UybTNqHdj8hs6u/CupBmFQ+rlcacbMyoOq9JvQz1OIU50= ARC-Authentication-Results: i=1; 
[104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id 6a1803df08f44-6f0dea107a3sm106399306d6.114.2025.04.15.15.47.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 15 Apr 2025 15:47:06 -0700 (PDT) Date: Tue, 15 Apr 2025 18:47:05 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano Subject: [PATCH v3 6/9] pack-objects: fix typo in 'show_object_pack_hint()' Message-ID: <9a18fa2e52bfe1bd98ea2d50b8e91509dcf67102.1744757204.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: Noticed-by: Elijah Newren Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index eb2a4099cc..f06b359150 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3532,7 +3532,7 @@ static void show_object_pack_hint(struct object *object, const char *name, * would typically pick up during a reachability traversal. * * Make a best-effort attempt to fill in the ->hash and ->no_try_delta - * here using a now in order to perhaps improve the delta selection + * fields here in order to perhaps improve the delta selection * process. 
*/ oe->hash = pack_name_hash_fn(name); From patchwork Tue Apr 15 22:47:09 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 14052869 Received: from mail-qt1-f171.google.com (mail-qt1-f171.google.com [209.85.160.171]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9F7052BF3FE for ; Tue, 15 Apr 2025 22:47:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.171 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744757233; cv=none; b=sbGlCI/qVddMNCCPR+fF1giNQkWQczxO8nGG15HM2WcLYFo2rxKYpKTm/k4L3mW1jyBgpZVVUSr0Fcj+6e2ho8gkSRZk4fr7KnYNkrjmSc01BgGDf1s9EdEQoK7IWv5UKg30zbLxI1bJaqrGeucY+0suALLW/fycAFhTmCD7N5s= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744757233; c=relaxed/simple; bh=sXRPwxDHs9pln2H0TSPY5feRTVEgeDGurA4Q+i3aYr0=; h=Date:From:To:Cc:Subject:Message-ID:References:MIME-Version: Content-Type:Content-Disposition:In-Reply-To; b=GBUOq4LkJr4bUbDhxZDDuRO0xvL8cz/4na4KnWSTrPg9VY6jRS0uDqFVmJc8XAuwokPdvjmkNiaH2cDgjVTJwrpJ6xneLE6X6XVmX49NHyHsS2M0A2BE5UeTVEUNFmnnChSgHB4DzVY4+oeY67VHqzeYT0Rt43QkkZ5q5NBaPyw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com; spf=pass smtp.mailfrom=ttaylorr.com; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com header.i=@ttaylorr-com.20230601.gappssmtp.com header.b=Q4SGl5nE; arc=none smtp.client-ip=209.85.160.171 Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=ttaylorr.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=ttaylorr-com.20230601.gappssmtp.com 
FyLGNYwag1++RRDDm2ITnqkBu1nq7uK2/XSdQ6ddY3I8uFk0TarvkvCTZ5PwiAj9+86M0ZQuGr0 q66jGT+66tJ7Yxo0nzKeK4Jnu+6/kg5YyOjfCmupUhrUI9LLmJJoJoSkMoh3XXa7ach9w38Amqb 64PSuXtKgArqa X-Google-Smtp-Source: AGHT+IEtMLHlKdqah86LnB9EYaAFPoML3gF+n+6jD0iWhXGiBUOjtu2RVTHktKLcj2XJRRPLhZtJ5w== X-Received: by 2002:a05:622a:1820:b0:476:7cc2:3f57 with SMTP id d75a77b69052e-47ad39f6a05mr16939691cf.10.1744757230356; Tue, 15 Apr 2025 15:47:10 -0700 (PDT) Received: from localhost (104-178-186-189.lightspeed.milwwi.sbcglobal.net. [104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id d75a77b69052e-4796eb163e4sm99170781cf.31.2025.04.15.15.47.09 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 15 Apr 2025 15:47:10 -0700 (PDT) Date: Tue, 15 Apr 2025 18:47:09 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano Subject: [PATCH v3 7/9] pack-objects: swap 'show_{object,commit}_pack_hint' Message-ID: <6c997853f15deb03a7b55577316720c40573be9b.1744757204.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: show_commit_pack_hint() has heretofore been a noop, so its position within its compilation unit only needs to appear before its first use. But the following commit will sometimes have `show_commit_pack_hint()` call `show_object_pack_hint()`, so reorder the former to appear after the latter to minimize the code movement in that patch. 
Suggested-by: Elijah Newren Signed-off-by: Taylor Blau --- builtin/pack-objects.c | 12 ++++++------ 1 file changed, 6 insertions(+), 6 deletions(-) diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index f06b359150..f4009cd391 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -3513,12 +3513,6 @@ static int add_object_entry_from_pack(const struct object_id *oid, return 0; } -static void show_commit_pack_hint(struct commit *commit UNUSED, - void *data UNUSED) -{ - /* nothing to do; commits don't have a namehash */ -} - static void show_object_pack_hint(struct object *object, const char *name, void *data UNUSED) { @@ -3541,6 +3535,12 @@ static void show_object_pack_hint(struct object *object, const char *name, stdin_packs_hints_nr++; } +static void show_commit_pack_hint(struct commit *commit UNUSED, + void *data UNUSED) +{ + /* nothing to do; commits don't have a namehash */ +} + static int pack_mtime_cmp(const void *_a, const void *_b) { struct packed_git *a = ((const struct string_list_item*)_a)->util; From patchwork Tue Apr 15 22:47:12 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Taylor Blau X-Patchwork-Id: 14052870 Received: from mail-qt1-f180.google.com (mail-qt1-f180.google.com [209.85.160.180]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id C19662BF3CB for ; Tue, 15 Apr 2025 22:47:14 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.160.180 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744757237; cv=none; b=GZXVUHnSOGZzZIkp6UKId4PRE/bQHhVlFjfDwPiHkr2fh9ej/+Ka+eWCiizF8efSl2GqT34Q1D/BSp6rzNfL49N8PNwiu7jMiSOUT/aTO2EBAkdMYGE/Riuw9XYMhArv7c/0/E/VF0lVKytiWouupDSTIz4EUcW1ygNMc/gjIVU= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744757237; 
[104.178.186.189]) by smtp.gmail.com with UTF8SMTPSA id af79cd13be357-7c7a8951562sm964588685a.35.2025.04.15.15.47.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Tue, 15 Apr 2025 15:47:13 -0700 (PDT) Date: Tue, 15 Apr 2025 18:47:12 -0400 From: Taylor Blau To: git@vger.kernel.org Cc: Elijah Newren , Jeff King , Junio C Hamano Subject: [PATCH v3 8/9] pack-objects: introduce '--stdin-packs=follow' Message-ID: <0ff699f05636ec3373bbcd16cd082a402cda4c25.1744757204.git.me@ttaylorr.com> References: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: When invoked with '--stdin-packs', pack-objects will generate a pack which contains the objects found in the "included" packs, less any objects from "excluded" packs. Packs that exist in the repository but weren't specified as either included or excluded are in practice treated like the latter, at least in the sense that pack-objects won't include objects from those packs. This behavior forces us to include any cruft pack(s) in a repository's multi-pack index for the reasons described in ddee3703b3 (builtin/repack.c: add cruft packs to MIDX during geometric repack, 2022-05-20). The full details are in ddee3703b3, but the gist is if you have a once-unreachable object in a cruft pack which later becomes reachable via one or more commits in a pack generated with '--stdin-packs', you *have* to include that object in the MIDX via the copy in the cruft pack, otherwise we cannot generate reachability bitmaps for any commits which reach that object. This prepares us for new repacking behavior which will "resurrect" objects found in cruft or otherwise unspecified packs when generating new packs. In the context of geometric repacking, this may be used to maintain a sequence of geometrically-repacked packs, the union of which is closed under reachability, even in the case described earlier. 
Signed-off-by: Taylor Blau --- Documentation/git-pack-objects.adoc | 10 +++- builtin/pack-objects.c | 83 +++++++++++++++++++++-------- t/t5331-pack-objects-stdin.sh | 82 ++++++++++++++++++++++++++++ 3 files changed, 152 insertions(+), 23 deletions(-) diff --git a/Documentation/git-pack-objects.adoc b/Documentation/git-pack-objects.adoc index 7f69ae4855..8f0cecaec9 100644 --- a/Documentation/git-pack-objects.adoc +++ b/Documentation/git-pack-objects.adoc @@ -87,13 +87,21 @@ base-name:: reference was included in the resulting packfile. This can be useful to send new tags to native Git clients. ---stdin-packs:: +--stdin-packs[=<mode>]:: Read the basenames of packfiles (e.g., `pack-1234abcd.pack`) from the standard input, instead of object names or revision arguments. The resulting pack contains all objects listed in the included packs (those not beginning with `^`), excluding any objects listed in the excluded packs (beginning with `^`). + +When `mode` is "follow", objects from packs not listed on stdin receive +special treatment. Objects within unlisted packs will be included if +those objects are (1) reachable from the included packs, and (2) not +found in any excluded packs. This mode is useful, for example, to +resurrect once-unreachable objects found in cruft packs to generate +packs which are closed under reachability up to the boundary set by the +excluded packs. ++ Incompatible with `--revs`, or options that imply `--revs` (such as `--all`), with the exception of `--unpacked`, which is compatible. 
diff --git a/builtin/pack-objects.c b/builtin/pack-objects.c index f4009cd391..67a22b2dc4 100644 --- a/builtin/pack-objects.c +++ b/builtin/pack-objects.c @@ -272,6 +272,12 @@ static struct oidmap configured_exclusions; static struct oidset excluded_by_config; static int name_hash_version = -1; +enum stdin_packs_mode { + STDIN_PACKS_MODE_NONE, + STDIN_PACKS_MODE_STANDARD, + STDIN_PACKS_MODE_FOLLOW, +}; + /** * Check whether the name_hash_version chosen by user input is appropriate, * and also validate whether it is compatible with other features. @@ -3514,31 +3520,44 @@ static int add_object_entry_from_pack(const struct object_id *oid, } static void show_object_pack_hint(struct object *object, const char *name, - void *data UNUSED) + void *data) { - struct object_entry *oe = packlist_find(&to_pack, &object->oid); - if (!oe) - return; + enum stdin_packs_mode mode = *(enum stdin_packs_mode *)data; + if (mode == STDIN_PACKS_MODE_FOLLOW) { + add_object_entry(&object->oid, object->type, name, 0); + } else { + struct object_entry *oe = packlist_find(&to_pack, &object->oid); + if (!oe) + return; - /* - * Our 'to_pack' list was constructed by iterating all objects packed in - * included packs, and so doesn't have a non-zero hash field that you - * would typically pick up during a reachability traversal. - * - * Make a best-effort attempt to fill in the ->hash and ->no_try_delta - * fields here in order to perhaps improve the delta selection - * process. - */ - oe->hash = pack_name_hash_fn(name); - oe->no_try_delta = name && no_try_delta(name); + /* + * Our 'to_pack' list was constructed by iterating all + * objects packed in included packs, and so doesn't have + * a non-zero hash field that you would typically pick + * up during a reachability traversal. + * + * Make a best-effort attempt to fill in the ->hash and + * ->no_try_delta fields here in order to perhaps + * improve the delta selection process. 
+ */ + oe->hash = pack_name_hash_fn(name); + oe->no_try_delta = name && no_try_delta(name); - stdin_packs_hints_nr++; + stdin_packs_hints_nr++; + } } -static void show_commit_pack_hint(struct commit *commit UNUSED, - void *data UNUSED) +static void show_commit_pack_hint(struct commit *commit, void *data) { + enum stdin_packs_mode mode = *(enum stdin_packs_mode *)data; + + if (mode == STDIN_PACKS_MODE_FOLLOW) { + show_object_pack_hint((struct object *)commit, "", data); + return; + } + /* nothing to do; commits don't have a namehash */ + } static int pack_mtime_cmp(const void *_a, const void *_b) @@ -3646,7 +3665,7 @@ static void read_packs_list_from_stdin(struct rev_info *revs) static void add_unreachable_loose_objects(struct rev_info *revs); -static void read_stdin_packs(int rev_list_unpacked) +static void read_stdin_packs(enum stdin_packs_mode mode, int rev_list_unpacked) { struct rev_info revs; @@ -3678,7 +3697,7 @@ static void read_stdin_packs(int rev_list_unpacked) traverse_commit_list(&revs, show_commit_pack_hint, show_object_pack_hint, - NULL); + &mode); trace2_data_intmax("pack-objects", the_repository, "stdin_packs_found", stdin_packs_found_nr); @@ -4469,6 +4488,23 @@ static int is_not_in_promisor_pack(struct commit *commit, void *data) { return is_not_in_promisor_pack_obj((struct object *) commit, data); } +static int parse_stdin_packs_mode(const struct option *opt, const char *arg, + int unset) +{ + enum stdin_packs_mode *mode = opt->value; + + if (unset) + *mode = STDIN_PACKS_MODE_NONE; + else if (!arg || !*arg) + *mode = STDIN_PACKS_MODE_STANDARD; + else if (!strcmp(arg, "follow")) + *mode = STDIN_PACKS_MODE_FOLLOW; + else + die(_("invalid value for '%s': '%s'"), opt->long_name, arg); + + return 0; +} + int cmd_pack_objects(int argc, const char **argv, const char *prefix, @@ -4480,7 +4516,7 @@ int cmd_pack_objects(int argc, struct strvec rp = STRVEC_INIT; int rev_list_unpacked = 0, rev_list_all = 0, rev_list_reflog = 0; int rev_list_index = 0; - int 
stdin_packs = 0; + enum stdin_packs_mode stdin_packs = STDIN_PACKS_MODE_NONE; struct string_list keep_pack_list = STRING_LIST_INIT_NODUP; struct list_objects_filter_options filter_options = LIST_OBJECTS_FILTER_INIT; @@ -4535,6 +4571,9 @@ int cmd_pack_objects(int argc, OPT_SET_INT_F(0, "indexed-objects", &rev_list_index, N_("include objects referred to by the index"), 1, PARSE_OPT_NONEG), + OPT_CALLBACK_F(0, "stdin-packs", &stdin_packs, N_("mode"), + N_("read packs from stdin"), + PARSE_OPT_OPTARG, parse_stdin_packs_mode), OPT_BOOL(0, "stdin-packs", &stdin_packs, N_("read packs from stdin")), OPT_BOOL(0, "stdout", &pack_to_stdout, @@ -4791,7 +4830,7 @@ int cmd_pack_objects(int argc, progress_state = start_progress(the_repository, _("Enumerating objects"), 0); if (stdin_packs) { - read_stdin_packs(rev_list_unpacked); + read_stdin_packs(stdin_packs, rev_list_unpacked); } else if (cruft) { read_cruft_objects(); } else if (!use_internal_rev_list) { diff --git a/t/t5331-pack-objects-stdin.sh b/t/t5331-pack-objects-stdin.sh index 8fd07deb8d..60a2b4bc07 100755 --- a/t/t5331-pack-objects-stdin.sh +++ b/t/t5331-pack-objects-stdin.sh @@ -236,4 +236,86 @@ test_expect_success 'pack-objects --stdin with packfiles from main and alternate test_cmp expected-objects actual-objects ' +packdir=.git/objects/pack + +objects_in_packs () { + for p in "$@" + do + git show-index <"$packdir/pack-$p.idx" || return 1 + done >objects.raw && + + cut -d' ' -f2 objects.raw | sort && + rm -f objects.raw +} + +test_expect_success '--stdin-packs=follow walks into unknown packs' ' + test_when_finished "rm -fr repo" && + + git init repo && + ( + cd repo && + + for c in A B C D + do + test_commit "$c" || return 1 + done && + + A="$(echo A | git pack-objects --revs $packdir/pack)" && + B="$(echo A..B | git pack-objects --revs $packdir/pack)" && + C="$(echo B..C | git pack-objects --revs $packdir/pack)" && + D="$(echo C..D | git pack-objects --revs $packdir/pack)" && + test_commit E && + + git 
prune-packed && + + cat >in <<-EOF && + pack-$B.pack + ^pack-$C.pack + pack-$D.pack + EOF + + # With just --stdin-packs, pack "A" is unknown to us, so + # only objects from packs "B" and "D" are included in + # the output pack. + P=$(git pack-objects --stdin-packs $packdir/pack <in) && + objects_in_packs $B $D >expect && + objects_in_packs $P >actual && + test_cmp expect actual && + + # But with --stdin-packs=follow, objects from both + # included packs reach objects from the unknown pack, so + # objects from pack "A" are included in the output pack + # in addition to the above. + P=$(git pack-objects --stdin-packs=follow $packdir/pack <in) && + objects_in_packs $A $B $D >expect && + objects_in_packs $P >actual && + test_cmp expect actual && + + # And with --unpacked, we will pick up objects from unknown + # packs that are reachable from loose objects. Loose object E + # reaches objects in pack A, but there are three excluded packs + # in between. + # + # The resulting pack should include objects reachable from E + # that are not present in packs B, C, or D, along with those + # present in pack A. 
+ cat >in <<-EOF && + ^pack-$B.pack + ^pack-$C.pack + ^pack-$D.pack + EOF + + P=$(git pack-objects --stdin-packs=follow --unpacked \ + $packdir/pack expect.raw && + sort expect.raw >expect && + objects_in_packs $P >actual && + test_cmp expect actual + ) +' + test_done

From patchwork Tue Apr 15 22:47:15 2025
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Taylor Blau
X-Patchwork-Id: 14052871
Date: Tue, 15 Apr 2025 18:47:15 -0400
From: Taylor Blau
To: git@vger.kernel.org
Cc: Elijah Newren, Jeff King, Junio C Hamano
Subject: [PATCH v3 9/9] repack: exclude cruft pack(s) from the MIDX where possible
Message-ID: <58891101f377267df120dc4a9edea2997296dbec.1744757204.git.me@ttaylorr.com>
X-Mailing-List: git@vger.kernel.org

In ddee3703b3 (builtin/repack.c: add cruft packs to MIDX during
geometric repack, 2022-05-20), repack began adding cruft pack(s) to the
MIDX with '--write-midx' to ensure that the resulting MIDX was always
closed under reachability in order to generate reachability bitmaps.

Suppose (prior to this patch) you have a once-unreachable object packed
in a cruft pack, which later on becomes reachable from one or more
objects in a geometrically repacked pack.
That once-unreachable object *won't* appear in the new pack, since the
cruft pack was specified as neither included nor excluded to
'pack-objects --stdin-packs'. If the new pack is included in a MIDX
without the cruft pack, then trying to generate bitmaps for that MIDX
may fail. This happens when the bitmap selection process picks one or
more commits which reach the once-unreachable objects.

Commit ddee3703b3 ensures that the MIDX will be closed under
reachability by including the cruft pack(s). Without it, we would fail
to generate a MIDX bitmap.

ddee3703b3 alludes to the fact that this is sub-optimal by saying

    [...] it's desirable to avoid including cruft packs in the MIDX
    because it causes the MIDX to store a bunch of objects which are
    likely to get thrown away.

which is true, but hides an even larger problem. If repositories rarely
prune their unreachable objects and/or have many of them, the MIDX must
keep track of a large number of objects, which bloats the MIDX and
slows down object lookup.

This is doubly unfortunate because the vast majority of objects in
cruft pack(s) are unlikely to be read. But any object lookups that go
through the MIDX must binary search over them anyway, slowing down
object lookups using the MIDX.

This patch causes geometrically-repacked packs to contain a copy of any
once-unreachable object(s) with 'git pack-objects --stdin-packs=follow',
allowing us to avoid including any cruft packs in the MIDX. This is
because the union of a sequence of geometrically-repacked packs that
were all generated with '--stdin-packs=follow' is guaranteed to be
closed under reachability.

Note that you cannot guarantee that a collection of packs is closed
under reachability if not all of them were generated with "following"
as above.
One tell-tale sign that not all geometrically-repacked packs in the
MIDX were generated with "following" is a pack in the existing MIDX
that is not going to be somehow represented (either verbatim or as part
of a geometric rollup) in the new MIDX.

If there is such a pack, then starting to generate packs with
"following" during geometric repacking won't work, since it's open to
the same race as described above.

But if you're starting from scratch (e.g., building the first MIDX
after an all-into-one '--cruft' repack), then you can guarantee that
the union of subsequently generated packs from geometric repacking *is*
closed under reachability.

Detect when this is the case and avoid including cruft packs in the
MIDX where possible. The existing behavior remains the default, and the
new behavior is available with the config 'repack.midxMustContainCruft'
set to 'false'.

Signed-off-by: Taylor Blau
---
 Documentation/config/repack.adoc |   7 ++
 builtin/repack.c                 | 163 +++++++++++++++++++++++++++----
 t/t7704-repack-cruft.sh          |  90 +++++++++++++++++
 3 files changed, 242 insertions(+), 18 deletions(-)

diff --git a/Documentation/config/repack.adoc b/Documentation/config/repack.adoc
index c79af6d7b8..e9e78dcb19 100644
--- a/Documentation/config/repack.adoc
+++ b/Documentation/config/repack.adoc
@@ -39,3 +39,10 @@ repack.cruftThreads::
 	a cruft pack and the respective parameters are not given over
 	the command line. See similarly named `pack.*` configuration
 	variables for defaults and meaning.
+
+repack.midxMustContainCruft::
+	When set to true, linkgit:git-repack[1] will unconditionally include
+	cruft pack(s), if any, in the multi-pack index when invoked with
+	`--write-midx`. When false, cruft packs are only included in the MIDX
+	when necessary (e.g., because they might be required to form a
+	reachability closure with MIDX bitmaps). Defaults to true.
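
The "tell-tale sign" check from the commit message can be sketched
outside of Git as a set difference over pack names. This is a
hypothetical shell helper, not part of the patch. It assumes three
plain-text files (invented for illustration) that each list one pack
index name per line, and it folds the geometric and non-geometric cases
that midx_has_unknown_packs() distinguishes into a single "known" list.

```shell
# Hypothetical helper, not part of the patch.
#
# $1: pack names in the existing MIDX
# $2: pack names to be included in the new MIDX
# $3: pack names whose contents are rolled up by the repack
#
# Succeeds (exit 0) if the existing MIDX names a pack that is neither
# included nor rolled up, i.e. an "unknown" pack.
midx_has_unknown_packs () {
	old_midx=$1
	include=$2
	rolled_up=$3
	tmp=$(mktemp -d) || return 2

	LC_ALL=C sort -u "$old_midx" >"$tmp/old"
	LC_ALL=C sort -u "$include" "$rolled_up" >"$tmp/known"

	# comm -23 prints lines found only in the first file: packs in
	# the old MIDX that are not accounted for anywhere.
	unknown=$(LC_ALL=C comm -23 "$tmp/old" "$tmp/known")
	rm -rf "$tmp"

	test -n "$unknown"
}
```

Under these assumptions, a zero exit status would correspond to the
case where repack keeps the cruft pack(s) in the MIDX even with
repack.midxMustContainCruft set to false.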
diff --git a/builtin/repack.c b/builtin/repack.c index f3330ade7b..c9e2e3d04d 100644 --- a/builtin/repack.c +++ b/builtin/repack.c @@ -39,6 +39,7 @@ static int write_bitmaps = -1; static int use_delta_islands; static int run_update_server_info = 1; static char *packdir, *packtmp_name, *packtmp; +static int midx_must_contain_cruft = 1; static const char *const git_repack_usage[] = { N_("git repack [-a] [-A] [-d] [-f] [-F] [-l] [-n] [-q] [-b] [-m]\n" @@ -107,6 +108,10 @@ static int repack_config(const char *var, const char *value, free(cruft_po_args->threads); return git_config_string(&cruft_po_args->threads, var, value); } + if (!strcmp(var, "repack.midxmustcontaincruft")) { + midx_must_contain_cruft = git_config_bool(var, value); + return 0; + } return git_default_config(var, value, ctx, cb); } @@ -687,6 +692,77 @@ static void free_pack_geometry(struct pack_geometry *geometry) free(geometry->pack); } +static int midx_has_unknown_packs(char **midx_pack_names, + size_t midx_pack_names_nr, + struct string_list *include, + struct pack_geometry *geometry, + struct existing_packs *existing) +{ + size_t i; + + string_list_sort(include); + + for (i = 0; i < midx_pack_names_nr; i++) { + const char *pack_name = midx_pack_names[i]; + + /* + * Determine whether or not each MIDX'd pack from the existing + * MIDX (if any) is represented in the new MIDX. For each pack + * in the MIDX, it must either be: + * + * - In the "include" list of packs to be included in the new + * MIDX. Note this function is called before the include + * list is populated with any cruft pack(s). + * + * - Below the geometric split line (if using pack geometry), + * indicating that the pack won't be included in the new + * MIDX, but its contents were rolled up as part of the + * geometric repack. + * + * - In the existing non-kept packs list (if not using pack + * geometry), and marked as non-deleted. 
+ */ + if (string_list_has_string(include, pack_name)) { + continue; + } else if (geometry) { + struct strbuf buf = STRBUF_INIT; + uint32_t j; + + for (j = 0; j < geometry->split; j++) { + strbuf_reset(&buf); + strbuf_addstr(&buf, pack_basename(geometry->pack[j])); + strbuf_strip_suffix(&buf, ".pack"); + strbuf_addstr(&buf, ".idx"); + + if (!strcmp(pack_name, buf.buf)) { + strbuf_release(&buf); + break; + } + } + + strbuf_release(&buf); + + if (j < geometry->split) + continue; + } else { + struct string_list_item *item; + + item = string_list_lookup(&existing->non_kept_packs, + pack_name); + if (item && !pack_is_marked_for_deletion(item)) + continue; + } + + /* + * If we got to this point, the MIDX includes some pack that we + * don't know about. + */ + return 1; + } + + return 0; +} + struct midx_snapshot_ref_data { struct tempfile *f; struct oidset seen; @@ -755,6 +831,8 @@ static void midx_snapshot_refs(struct tempfile *f) static void midx_included_packs(struct string_list *include, struct existing_packs *existing, + char **midx_pack_names, + size_t midx_pack_names_nr, struct string_list *names, struct pack_geometry *geometry) { @@ -808,26 +886,56 @@ static void midx_included_packs(struct string_list *include, } } - for_each_string_list_item(item, &existing->cruft_packs) { + if (midx_must_contain_cruft || + midx_has_unknown_packs(midx_pack_names, midx_pack_names_nr, + include, geometry, existing)) { /* - * When doing a --geometric repack, there is no need to check - * for deleted packs, since we're by definition not doing an - * ALL_INTO_ONE repack (hence no packs will be deleted). - * Otherwise we must check for and exclude any packs which are - * enqueued for deletion. + * If there are one or more unknown pack(s) present (see + * midx_has_unknown_packs() for what makes a pack + * "unknown") in the MIDX before the repack, keep them + * as they may be required to form a reachability + * closure if the MIDX is bitmapped. 
* - * So we could omit the conditional below in the --geometric - * case, but doing so is unnecessary since no packs are marked - * as pending deletion (since we only call - * `mark_packs_for_deletion()` when doing an all-into-one - * repack). + * For example, a cruft pack can be required to form a + * reachability closure if the MIDX is bitmapped and one + * or more of the bitmap's selected commits reaches a + * once-cruft object that was later made reachable. */ - if (pack_is_marked_for_deletion(item)) - continue; + for_each_string_list_item(item, &existing->cruft_packs) { + /* + * When doing a --geometric repack, there is no + * need to check for deleted packs, since we're + * by definition not doing an ALL_INTO_ONE + * repack (hence no packs will be deleted). + * Otherwise we must check for and exclude any + * packs which are enqueued for deletion. + * + * So we could omit the conditional below in the + * --geometric case, but doing so is unnecessary + * since no packs are marked as pending + * deletion (since we only call + * `mark_packs_for_deletion()` when doing an + * all-into-one repack). + */ + if (pack_is_marked_for_deletion(item)) + continue; - strbuf_reset(&buf); - strbuf_addf(&buf, "%s.idx", item->string); - string_list_insert(include, buf.buf); + strbuf_reset(&buf); + strbuf_addf(&buf, "%s.idx", item->string); + string_list_insert(include, buf.buf); + } + } else { + /* + * Modern versions of Git (with the appropriate + * configuration setting) will write new copies of + * once-cruft objects when doing a --geometric repack. + * + * If the MIDX has no cruft pack, new packs written + * during a --geometric repack will not rely on the + * cruft pack to form a reachability closure, so we can + * avoid including them in the MIDX in that case. 
+ */ + ; } strbuf_release(&buf); @@ -1142,6 +1250,8 @@ int cmd_repack(int argc, struct tempfile *refs_snapshot = NULL; int i, ext, ret; int show_progress; + char **midx_pack_names = NULL; + size_t midx_pack_names_nr = 0; /* variables to be filled by option parsing */ int delete_redundant = 0; @@ -1356,7 +1466,10 @@ int cmd_repack(int argc, !(pack_everything & PACK_CRUFT)) strvec_push(&cmd.args, "--pack-loose-unreachable"); } else if (geometry.split_factor) { - strvec_push(&cmd.args, "--stdin-packs"); + if (midx_must_contain_cruft) + strvec_push(&cmd.args, "--stdin-packs"); + else + strvec_push(&cmd.args, "--stdin-packs=follow"); strvec_push(&cmd.args, "--unpacked"); } else { strvec_push(&cmd.args, "--unpacked"); @@ -1478,6 +1591,16 @@ int cmd_repack(int argc, string_list_sort(&names); + if (get_local_multi_pack_index(the_repository)) { + uint32_t i; + struct multi_pack_index *m = + get_local_multi_pack_index(the_repository); + + ALLOC_ARRAY(midx_pack_names, m->num_packs); + for (i = 0; i < m->num_packs; i++) + midx_pack_names[midx_pack_names_nr++] = xstrdup(m->pack_names[i]); + } + close_object_store(the_repository->objects); /* @@ -1519,7 +1642,8 @@ int cmd_repack(int argc, if (write_midx) { struct string_list include = STRING_LIST_INIT_DUP; - midx_included_packs(&include, &existing, &names, &geometry); + midx_included_packs(&include, &existing, midx_pack_names, + midx_pack_names_nr, &names, &geometry); ret = write_midx_included_packs(&include, &geometry, &names, refs_snapshot ? 
get_tempfile_path(refs_snapshot) : NULL, @@ -1570,6 +1694,9 @@ int cmd_repack(int argc, string_list_clear(&names, 1); existing_packs_release(&existing); free_pack_geometry(&geometry); + for (size_t i = 0; i < midx_pack_names_nr; i++) + free(midx_pack_names[i]); + free(midx_pack_names); pack_objects_args_release(&po_args); pack_objects_args_release(&cruft_po_args); diff --git a/t/t7704-repack-cruft.sh b/t/t7704-repack-cruft.sh index 8aebfb45f5..2b0a55f8fd 100755 --- a/t/t7704-repack-cruft.sh +++ b/t/t7704-repack-cruft.sh @@ -724,4 +724,94 @@ test_expect_success 'cruft repack respects --quiet' ' ) ' +setup_cruft_exclude_tests() { + git init "$1" && + ( + cd "$1" && + + git config repack.midxMustContainCruft false && + + test_commit one && + + test_commit --no-tag two && + two="$(git rev-parse HEAD)" && + test_commit --no-tag three && + three="$(git rev-parse HEAD)" && + git reset --hard one && + git reflog expire --all --expire=all && + + GIT_TEST_MULTI_PACK_INDEX=0 git repack --cruft -d && + + git merge $two && + test_commit four + ) +} + +test_expect_success 'repack --write-midx excludes cruft where possible' ' + setup_cruft_exclude_tests exclude-cruft-when-possible && + ( + cd exclude-cruft-when-possible && + + GIT_TEST_MULTI_PACK_INDEX=0 \ + git repack -d --geometric=2 --write-midx --write-bitmap-index && + + test-tool read-midx --show-objects $objdir >midx && + cruft="$(ls $packdir/*.mtimes)" && + test_grep ! 
"$(basename "$cruft" .mtimes).idx" midx && + + git rev-list --all --objects --no-object-names >reachable.raw && + sort reachable.raw >reachable.objects && + awk "/\.pack$/ { print \$1 }" midx.objects && + + test_cmp reachable.objects midx.objects + ) +' + +test_expect_success 'repack --write-midx includes cruft when instructed' ' + setup_cruft_exclude_tests exclude-cruft-when-instructed && + ( + cd exclude-cruft-when-instructed && + + GIT_TEST_MULTI_PACK_INDEX=0 \ + git -c repack.midxMustContainCruft=true repack \ + -d --geometric=2 --write-midx --write-bitmap-index && + + test-tool read-midx --show-objects $objdir >midx && + cruft="$(ls $packdir/*.mtimes)" && + test_grep "$(basename "$cruft" .mtimes).idx" midx && + + git cat-file --batch-check="%(objectname)" --batch-all-objects \ + >all.objects && + awk "/\.pack$/ { print \$1 }" midx.objects && + + test_cmp all.objects midx.objects + ) +' + +test_expect_success 'repack --write-midx includes cruft when necessary' ' + setup_cruft_exclude_tests exclude-cruft-when-necessary && + ( + cd exclude-cruft-when-necessary && + + test_path_is_file $(ls $packdir/pack-*.mtimes) && + ls $packdir/pack-*.idx | sort >packs.all && + grep -o "pack-.*\.idx$" packs.all >in && + + git multi-pack-index write --stdin-packs --bitmap midx && + awk "/\.pack$/ { print \$1 }" midx.objects && + git cat-file --batch-all-objects --batch-check="%(objectname)" \ + >expect.objects && + test_cmp expect.objects midx.objects && + + grep "^pack-" midx >midx.packs && + test_line_count = "$(($(wc -l