From patchwork Tue Dec 3 21:43:02 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 13892996 Received: from mail-pg1-f202.google.com (mail-pg1-f202.google.com [209.85.215.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 63B1C208981 for ; Tue, 3 Dec 2024 21:43:10 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262192; cv=none; b=RGIOAAQx1hifdGtI481ZYDiip4QRYq8JZGCtmvdnA54JY1Hm404gb8nR7WVATvA0B6b+z3Jo1+Z67or7x4A9SIfBzAIA+RMHrm98C4pmlMtfkZNayGgTCk1ET7LQMKSxj0A7FQEkCvesADEK131T7cwsPVylaVUSUp6zdZTfdSI= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262192; c=relaxed/simple; bh=CUEIWvURSubmvumRn/4Dk/TphslCRo+Q/kDcOlNpgw4=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=X4lK0W5zFaAVLmhxRVYL8DLGJiXBe9x8/OAJ3MGwyHpfWc6BncmeEm5s8+HdFJUTHkF2bO4R1NRltzKB0ml6lXNg9htX95aGKDIhlJZhFPeKObnisdKiCmB3fn14g/zCJjtNXIcK8Ra8MLGj4rfvpY6bB1LsxwjM3eSL7swKdko= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=AAuaRoRZ; arc=none smtp.client-ip=209.85.215.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="AAuaRoRZ" Received: by mail-pg1-f202.google.com with SMTP id 41be03b00d2f7-7fcd0eff40cso2126948a12.0 for ; Tue, 03 Dec 2024 13:43:10 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1733262189; x=1733866989; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=o2G4SSla57aF4GIAeWoBSIxKru0BAHIYgXUN7YmxNNU=; b=AAuaRoRZzA4jgCjHH4/cD3kdCI3YtlMNH7x0xkTTm9gzO3JWhsma/Ic/XrvmrSE0Iu Z5BSy2aVVQrD2kXGgllwQpqtR5AYad7thsNKMT5JDifl1geXK+Nvew1LmDk81n3fVyw2 rLe3fFAZXivw7FrhfbxB87175fwbzY+fOI1+6r93sQHjDPvZ9A/ZXRMioKq0KUfN9dCS E7VPQz9YbwU+h8+FG2t5ElcWORGiA1yv0/Vi9HCZEPzZMKfux5bmfcX+losxhquz7kUG 8P7x/gTc1Kos2poZ9/CWiFhuIwaHeVjHzDoRtZGyld/o4yACSp3C4sZDyIZFj260RDv7 zmlA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733262189; x=1733866989; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=o2G4SSla57aF4GIAeWoBSIxKru0BAHIYgXUN7YmxNNU=; b=nRNAbQhQeIQPSc3luoKcpUP1SnhutYxCsqPdoEeD8+7dKKiZbZLO33qm7tfrW+SEg5 9ai62dZqlLK7342jZjieDertSXcM+C32PZJbdd/9JLHoOfOooUYeaI+oWgK27aor+ges /J7l1+tBTLPxWyKJJ6MeqNk2iH2bIHrwWU9+OT79K3Ger7yPm7ygay9U5CvjmgWA5tw2 8+dP9TLJwbXz1ljp1MgP56Do2MB26Mk0PVLwT7NDmYwaSBl1+bXtsugM2b9t4bY/QNQ2 nmdI7IPl5y1e7D2RjB6JbHDlGSU2Sx9Piaf4jTkH5FlF4s7reziexg7Sogr5qHlI+fS3 Ve6A== X-Gm-Message-State: AOJu0YwlnxSSSphVscTjYYFE/TBKNeCxCLLcYKLP0CtcOSBhZkQw6WVP Mp/H793S3X19VpnBKKiqAo7bgnC01rQ8KkxJIzQJxKrDER0Bw0hS4w7F4jMZ2uwYkv2EJp/w3gX zHtSY14p9btYA/nnVjNYhnMSfT03E99uT9Q2yBXWT9UX96aV1XPR0we/Dcfp2qBFKhTOje13z1c AaeFQXBKOLk2axolNgDutbS1dMGGWtjuxpB0JUYA28ydIcJIJKWkEQxmTCkiM33mMBEg== X-Google-Smtp-Source: AGHT+IEs+JcJ1ftfBdJ0NPLXotjDT61Zq0FctBsXGp7XzJ1evITq77UlQKeLVpViiQbLBNjVLleSUDnZ3ae42ztORK26 X-Received: from pgbdk14.prod.google.com ([2002:a05:6a02:c8e:b0:7fb:db54:f065]) (user=jonathantanmy job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:9f4b:b0:1e0:bb0d:b1f4 with SMTP id adf61e73a8af0-1e16bdfb82fmr2916538637.11.1733262189503; Tue, 03 Dec 2024 13:43:09 -0800 (PST) Date: Tue, 3 Dec 2024 13:43:02 -0800 In-Reply-To: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog Message-ID: <7ae21c921fe367d4b15cd4a299196009c15205d9.1733259949.git.jonathantanmy@google.com> Subject: [PATCH v2 1/3] index-pack --promisor: dedup before checking links From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , ps@pks.im, gitster@pobox.com Commit c08589efdc (index-pack: repack local links into promisor packs, 2024-11-01) fixed a bug with what was believed to be a negligible decrease in performance [1] [2]. But at $DAYJOB, with at least one repo, it was found that the decrease in performance was very significant. Looking at the patch, whenever we parse an object in the packfile to be indexed, we check the targets of all its outgoing links for its existence. However, this could be optimized by first collecting all such targets into an oidset (thus deduplicating them) before checking. Teach Git to do that. On a certain fetch from the aforementioned repo, this improved performance from approximately 7 hours to 24m47.815s. This number will be further reduced in a subsequent patch. [1] https://lore.kernel.org/git/CAG1j3zGiNMbri8rZNaF0w+yP+6OdMz0T8+8_Wgd1R_p1HzVasg@mail.gmail.com/ [2] https://lore.kernel.org/git/20241105212849.3759572-1-jonathantanmy@google.com/ Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 73 +++++++++++++++++++++++++------------------- 1 file changed, 41 insertions(+), 32 deletions(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 95babdc5ea..d1c777a6af 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -155,11 +155,11 @@ static int input_fd, output_fd; static const char *curr_pack; /* - * local_links is guarded by read_mutex, and record_local_links is read-only in - * a thread. + * outgoing_links is guarded by read_mutex, and record_outgoing_links is + * read-only in a thread. */ -static struct oidset local_links = OIDSET_INIT; -static int record_local_links; +static struct oidset outgoing_links = OIDSET_INIT; +static int record_outgoing_links; static struct thread_local_data *thread_data; static int nr_dispatched; @@ -812,18 +812,12 @@ static int check_collison(struct object_entry *entry) return 0; } -static void record_if_local_object(const struct object_id *oid) +static void record_outgoing_link(const struct object_id *oid) { - struct object_info info = OBJECT_INFO_INIT; - if (oid_object_info_extended(the_repository, oid, &info, 0)) - /* Missing; assume it is a promisor object */ - return; - if (info.whence == OI_PACKED && info.u.packed.pack->pack_promisor) - return; - oidset_insert(&local_links, oid); + oidset_insert(&outgoing_links, oid); } -static void do_record_local_links(struct object *obj) +static void do_record_outgoing_links(struct object *obj) { if (obj->type == OBJ_TREE) { struct tree *tree = (struct tree *)obj; @@ -837,16 +831,16 @@ static void do_record_local_links(struct object *obj) */ return; while (tree_entry_gently(&desc, &entry)) - record_if_local_object(&entry.oid); + record_outgoing_link(&entry.oid); } else if (obj->type == OBJ_COMMIT) { struct commit *commit = (struct commit *) obj; struct commit_list *parents = commit->parents; for (; parents; parents = parents->next) - record_if_local_object(&parents->item->object.oid); + record_outgoing_link(&parents->item->object.oid); } else if (obj->type == OBJ_TAG) { struct tag *tag = (struct tag *) obj; - record_if_local_object(get_tagged_oid(tag)); + record_outgoing_link(get_tagged_oid(tag)); } } @@ -896,7 +890,7 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, free(has_data); } - if (strict || do_fsck_object || record_local_links) { + if (strict || do_fsck_object || record_outgoing_links) { read_lock(); if (type == OBJ_BLOB) { struct blob *blob = lookup_blob(the_repository, oid); @@ -928,8 +922,8 @@ static void sha1_object(const void *data, struct object_entry *obj_entry, die(_("fsck error in packed object")); if (strict && fsck_walk(obj, NULL, &fsck_options)) die(_("Not all child objects of %s are reachable"), oid_to_hex(&obj->oid)); - if (record_local_links) - do_record_local_links(obj); + if (record_outgoing_links) + do_record_outgoing_links(obj); if (obj->type == OBJ_TREE) { struct tree *item = (struct tree *) obj; @@ -1781,26 +1775,41 @@ static void repack_local_links(void) struct object_id *oid; char *base_name; - if (!oidset_size(&local_links)) + if (!oidset_size(&outgoing_links)) return; - base_name = mkpathdup("%s/pack/pack", repo_get_object_directory(the_repository)); + oidset_iter_init(&outgoing_links, &iter); + while ((oid = oidset_iter_next(&iter))) { + struct object_info info = OBJECT_INFO_INIT; + if (oid_object_info_extended(the_repository, oid, &info, 0)) + /* Missing; assume it is a promisor object */ + continue; + if (info.whence == OI_PACKED && info.u.packed.pack->pack_promisor) + continue; - strvec_push(&cmd.args, "pack-objects"); - strvec_push(&cmd.args, "--exclude-promisor-objects-best-effort"); - strvec_push(&cmd.args, base_name); - cmd.git_cmd = 1; - cmd.in = -1; - cmd.out = -1; - if (start_command(&cmd)) - die(_("could not start pack-objects to repack local links")); + if (!cmd.args.nr) { + base_name = mkpathdup( + "%s/pack/pack", + repo_get_object_directory(the_repository)); + strvec_push(&cmd.args, "pack-objects"); + strvec_push(&cmd.args, + "--exclude-promisor-objects-best-effort"); + strvec_push(&cmd.args, base_name); + cmd.git_cmd = 1; + cmd.in = -1; + cmd.out = -1; + if (start_command(&cmd)) + die(_("could not start pack-objects to repack local links")); + } - oidset_iter_init(&local_links, &iter); - while ((oid = oidset_iter_next(&iter))) { if (write_in_full(cmd.in, oid_to_hex(oid), the_hash_algo->hexsz) < 0 || write_in_full(cmd.in, "\n", 1) < 0) die(_("failed to feed local object to pack-objects")); } + + if (!cmd.args.nr) + return; + close(cmd.in); out = xfdopen(cmd.out, "r"); @@ -1899,7 +1908,7 @@ int cmd_index_pack(int argc, } else if (skip_to_optional_arg(arg, "--keep", &keep_msg)) { ; /* nothing to do */ } else if (skip_to_optional_arg(arg, "--promisor", &promisor_msg)) { - record_local_links = 1; + record_outgoing_links = 1; } else if (starts_with(arg, "--threads=")) { char *end; nr_threads = strtoul(arg+10, &end, 0); From patchwork Tue Dec 3 21:43:03 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 13892997 Received: from mail-pj1-f73.google.com (mail-pj1-f73.google.com [209.85.216.73]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D5419204F77 for ; Tue, 3 Dec 2024 21:43:11 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.216.73 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262193; cv=none; b=PvhzHmlOnsK/fZiN4ygIo7YbnfwI2BaVPi2T6tqEQHooZsRc/Emx8rojhEFmnOfsYQYFMBv9mSal3Z/FpHHFKZnTUnVuZRCfafyOeiw4BLgv1scCw2s/bFf5sL56T4U/aVqItNU2BRn/7VA27FxbwjxVXWcUGYjaLPDC0h0+w8k= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262193; c=relaxed/simple; bh=YSPDyeo4ZMIygmSkvwxktap1OJ8GvR/O2BT+VDOn3mA=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=kvbdEivsuB/oa2WMZRbg8I2tDjZM6cUgAOW7ekkwAs2FV5vmOkrd7/S8AYu5PwkDDTJRNyhf5UM2o4bFX9rgiFcaiudltWSFysO3e8Fhnli9M4BqmPxZhkDAQZ69tYzD0/zdr4keHjQWPdsIjn31xp0QVUa2TL8ZXK986+MfjZw= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=iu1FAOnn; arc=none smtp.client-ip=209.85.216.73 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="iu1FAOnn" Received: by mail-pj1-f73.google.com with SMTP id 98e67ed59e1d1-2ee6eb39edaso5288967a91.0 for ; Tue, 03 Dec 2024 13:43:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1733262191; x=1733866991; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=WwWez5ol7FeBhH6sA5fPARd0zXOEP0zk86+9NV7+9aU=; b=iu1FAOnnNwbEMC4SGL5Q50WrVE9EeeYt/d7p7U2dyPCuIotUcm2Ab6sgSTjOwHGiLx u75gN3xqqTn9yhRzVe437FfVg5AYVIIOmDMXkQt5BwBfZ+dt/bk7a9VTYDAbvYeCFOVx 5mtQntAHkQ2T243u4cDlWaeYH+Si+cT09yJLKoJkpw/x0AYssDrHLcWJw+Tj8ywnF/lA j1xPddDpyvEooYNpD370jkYBWs3hpS34XwogrBHK9xAwz/1gQkNzro6o6qfJqfkxs3jY JAP3j08jEehjZJZzNia8wQlcJuL5J4dgQkRfLbpQJCfpwzxs00TMZFgRWKpegNoa6K8O czSA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733262191; x=1733866991; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=WwWez5ol7FeBhH6sA5fPARd0zXOEP0zk86+9NV7+9aU=; b=REG+tvW/B6mD3peoP+rgQPg/YF3BCBySFKJ50o0QyCk/dIw7M8X46BFhkuyqgA/dn0 C/nMuQFE05zb+8lJFZnxkzoM34AqEHTPer/E8+QywyN9BOxz3/v2eclw0X6iBMOJV4yL ++uN2l1QBJdlevD9YPMEHL8GLX8138iaEsP4NuA3B93bnmV+xJEN2tK0xYNQz+hQPLCu DaVSFLzddk75aMMMg8aKhtVRbuwx3iLu13UycEPfAWWYMQIst1mtw+OUAvXwSGdT6oqX ZPynICcg2PGE6nkvhX0KNtFlFREH3cGTMTgLkpuc32BKJG7DU4vrbKsZJtvNDs2Xqoj0 lQRQ== X-Gm-Message-State: AOJu0YzFZhSnBOI8FXTJOGazZqECbB0EWf8DJKNCy+GVVbdGgl31OCpb 5okcxsU3/UceX/vbYdlyoASL5DaXtlsLP54kwoGxlqM1KEUN+ilbiXONtizBvBRr6L2X8GHwBoI dOS/2ne3PnMFOBrvAkvDM9WfBRHZAlOuP2fKauUdYbA8ShTakkQmfeOlEUFpUitmWFgIBLhYNpE nMohX7rKyG/jKHs0Lgr0d6hahh94SmZ1yLClGv7fpo/BycbA6M5E9nduKQKgrqLFULLw== X-Google-Smtp-Source: AGHT+IF+H78IibFS/IwwKrmeuLxix5R3V6H9HpgyxJtSoyhXhyasDXFhyalfJYlWVc7DD9k+QqBw/kRpPXo1Zeois1WG X-Received: from pjyp7.prod.google.com ([2002:a17:90a:e707:b0:2ea:499c:f031]) (user=jonathantanmy job=prod-delivery.src-stubby-dispatcher) by 2002:a17:90b:3847:b0:2ee:a4f2:b307 with SMTP id 98e67ed59e1d1-2ef011daaf6mr5210062a91.4.1733262191355; Tue, 03 Dec 2024 13:43:11 -0800 (PST) Date: Tue, 3 Dec 2024 13:43:03 -0800 In-Reply-To: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog Message-ID: <5a63c9a5cac8088730cc536f33b0af052c90aca1.1733259949.git.jonathantanmy@google.com> Subject: [PATCH v2 2/3] index-pack --promisor: don't check blobs From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , ps@pks.im, gitster@pobox.com As a follow-up to the parent of this commit, it was found that not checking for the existence of blobs linked from trees sped up the fetch from 24m47.815s to 2m2.127s. Teach Git to do that. The tradeoff of not checking blobs is documented in a code comment. (Blobs may also be linked from tag objects, but it is impossible to know the type of an object linked from a tag object without looking it up in the object database, so the code for that is untouched.) Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 29 ++++++++++++++++++++++++++++- 1 file changed, 28 insertions(+), 1 deletion(-) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index d1c777a6af..57b7888c42 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -817,6 +817,33 @@ static void record_outgoing_link(const struct object_id *oid) oidset_insert(&outgoing_links, oid); } +static void maybe_record_name_entry(const struct name_entry *entry) +{ + /* + * The benefit of doing this is as above (fetch speedup), but the drawback +is that if the packfile to be indexed references a local blob directly +(that is, not through a local tree), that local blob is in danger of +being garbage collected. Such a situation may arise if we push local +commits, including one with a change to a blob in the root tree, +and then the server incorporates them into its main branch through a +"rebase" or "squash" merge strategy, and then we fetch the new main +branch from the server. + +This situation has not been observed yet - we have only noticed missing +commits, not missing trees or blobs. (In fact, if it were believed that +only missing commits are problematic, one could argue that we should +also exclude trees during the outgoing link check; but it is safer to +include them.) + +Due to the rarity of the situation (it has not been observed to happen +in real life), and because the "penalty" in such a situation is merely +to refetch the missing blob when it's needed, the tradeoff seems +worth it. + */ + if (S_ISDIR(entry->mode)) + record_outgoing_link(&entry->oid); +} + static void do_record_outgoing_links(struct object *obj) { if (obj->type == OBJ_TREE) { @@ -831,7 +858,7 @@ static void do_record_outgoing_links(struct object *obj) */ return; while (tree_entry_gently(&desc, &entry)) - record_outgoing_link(&entry.oid); + maybe_record_name_entry(&entry); } else if (obj->type == OBJ_COMMIT) { struct commit *commit = (struct commit *) obj; struct commit_list *parents = commit->parents; From patchwork Tue Dec 3 21:43:04 2024 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jonathan Tan X-Patchwork-Id: 13892998 Received: from mail-pg1-f202.google.com (mail-pg1-f202.google.com [209.85.215.202]) (using TLSv1.2 with cipher ECDHE-RSA-AES128-GCM-SHA256 (128/128 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id D3C64204F77 for ; Tue, 3 Dec 2024 21:43:13 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=209.85.215.202 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262196; cv=none; b=dMlQL89xckXoHl1ddShwBMcZpAlXtraNd6X1D9RWubYz9PAo3/AsOcF7NiS8D0TMUDSn0hA8w0xMFWnBTTas/xs4OdB8RQIUnWhdfsH1UpokAgWj6Ym4IgyCTKZ+IeCz9Q2A08e1/kYdtw6XCq88/VFursW5DyzAEVbI3XnI3ms= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1733262196; c=relaxed/simple; bh=BroE40RtbDZavwTRswNl3gbidmM1sFCYBZQhrLgpXCw=; h=Date:In-Reply-To:Mime-Version:References:Message-ID:Subject:From: To:Cc:Content-Type; b=b019a8sCHZG/R/Hgzq1auXSYSQ1xY6V5TcLocxgbFCXTMDuy/hHZsK2nczIZIcJBJHZFnwKtkHRvW7iI7H+tgBaUGIBnwKWguuOAEP9kzgrbv+lLfXnRTimalJfTL3HFq/ZqpjUFeJTvjMrVS0t0f91MewnqVyNsZGvhPfHscz0= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b=YWxHAJy6; arc=none smtp.client-ip=209.85.215.202 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=reject dis=none) header.from=google.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=flex--jonathantanmy.bounces.google.com Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=google.com header.i=@google.com header.b="YWxHAJy6" Received: by mail-pg1-f202.google.com with SMTP id 41be03b00d2f7-7fc4206332bso4975789a12.1 for ; Tue, 03 Dec 2024 13:43:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20230601; t=1733262193; x=1733866993; darn=vger.kernel.org; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:from:to:cc:subject:date:message-id:reply-to; bh=gEyL3KSohgJ7u310j8UKmLEJ85gu8sPt/z2vK4dEjgI=; b=YWxHAJy6yX67NLaSRbRfGOyxRh/P4HuSnaz9jGn4iL5GRwJooWEyciY8EBo70sLtvY M9491kGf2JGJxyw3tG8szlSt2lmiNCBRHUTV31hPEiOLVRXgdRsp4IpTYXxZsvP0bUtE WQAaMAN5EaaJJg8MwCH7x/7vqK79Ur4uVftFcil04vEzUGXqIGRh+ZCzxKyGzGBK0wGe MoOyO+yvrCNHJI3wTpAGrS1FXz5CvJclESifx8/symwnPTVomUPZBLbgJKoTEcp83S3X te741tAJb32ZwzRN6TewwQuQqmS2gKb0Q/U1HbgC/BZAOszW2aDB6idjTu4WPYS9wzOA 2ByQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1733262193; x=1733866993; h=cc:to:from:subject:message-id:references:mime-version:in-reply-to :date:x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=gEyL3KSohgJ7u310j8UKmLEJ85gu8sPt/z2vK4dEjgI=; b=K59aIn8CEiDUOTY7Ab2bl777vOO8PXFhBX3A5Zm8NzoRp8IpRp0nvz8efVIbr5RKSJ +FWYuB1vZy8wEl8C487WKm7SbT8PfDcod6hbnnvwE5TxqJR4k1Kbdk4XsY9+5yyfbp9u a3VAnUnthMBjkU3IB+UhBqhXiVGqFYYzqDL+r6MIXb8URXATeYi/i8Ptypyw4IC1XrzQ Lb9ox9VtmwvEGu+q0Fi9DYnmMUmZogH69m45IaPWUqZRvgati+M0At7LPQgiz9ArTOEQ 7hVRzIEXoiniF7U2fs6DmOPZMw2aaB9xi7JCL8VmR8J8ND0cLTNWRYTyvvNHYPtir1jU QRLA== X-Gm-Message-State: AOJu0YyOlW+7VzbVsOkNCs8EuLpYsmT6jqs1nLAIj4qaP5VPzH5x28aM g5UXheQpQ2Dw+5t79F2P0nzsO9wUTsa70GGhbL0KS9cWvOM+L6xFLlaLBNqfsDqiMeFdo61Vet4 Y+P9ZhoULeP0F6/mfVOEeY62Bp1dzeSOBeI0/kGfwyzwfVgLBml56zU3WVO9Qg+TmYSBBi8RR0L rjG55F6xj93/uNr0P8NgsPpv/pM/OwThprMX/QmKwKe+LrDR8KfCch4pTdDBlV+YzDzw== X-Google-Smtp-Source: AGHT+IEV56MxTYepgI6oXDcS7Oz4zPmagD01iehHm/feR9kEmPsrI5LNmHVoO4n3QE97lKUx8eSNa1tVyQsehqOCrDZG X-Received: from pjbqd13.prod.google.com ([2002:a17:90b:3ccd:b0:2da:ac73:93dd]) (user=jonathantanmy job=prod-delivery.src-stubby-dispatcher) by 2002:a05:6a20:748f:b0:1e0:ed9b:14f6 with SMTP id adf61e73a8af0-1e1653f3ebamr5854408637.32.1733262193135; Tue, 03 Dec 2024 13:43:13 -0800 (PST) Date: Tue, 3 Dec 2024 13:43:04 -0800 In-Reply-To: Precedence: bulk X-Mailing-List: git@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: Mime-Version: 1.0 References: X-Mailer: git-send-email 2.47.0.338.g60cca15819-goog Message-ID: <8139325bf221685793ec40487c201be7103daad6.1733259949.git.jonathantanmy@google.com> Subject: [PATCH v2 3/3] index-pack --promisor: also check commits' trees From: Jonathan Tan To: git@vger.kernel.org Cc: Jonathan Tan , ps@pks.im, gitster@pobox.com Commit c08589efdc (index-pack: repack local links into promisor packs, 2024-11-01) seems to contain an oversight in that the tree of a commit is not checked. Teach git to check these trees. The fix slows down a fetch from a certain repo at $DAYJOB from 2m2.127s to 2m45.052s, but in order to make the fetch correct, it seems worth it. In order to test this, we could create server and client repos as follows... C S \ / O (O and C are commits both on the client and server. S is a commit only on the server. C and S have the same tree but different commit messages. The diff between O and C is non-zero.) ...and then, from the client, fetch S from the server. In theory, the client declares "have C" and the server can use this information to exclude S's tree (since it knows that the client has C's tree, which is the same as S's tree). However, it is also possible for the server to compute that it needs to send S and not O, and proceed from there; therefore the objects of C are not considered at all when determining what to send in the packfile. In order to prevent a test of client functionality from having such a dependence on server behavior, I have not included such a test. Signed-off-by: Jonathan Tan --- builtin/index-pack.c | 1 + 1 file changed, 1 insertion(+) diff --git a/builtin/index-pack.c b/builtin/index-pack.c index 57b7888c42..2250a410e2 100644 --- a/builtin/index-pack.c +++ b/builtin/index-pack.c @@ -863,6 +863,7 @@ static void do_record_outgoing_links(struct object *obj) struct commit *commit = (struct commit *) obj; struct commit_list *parents = commit->parents; + record_outgoing_link(get_commit_tree_oid(commit)); for (; parents; parents = parents->next) record_outgoing_link(&parents->item->object.oid); } else if (obj->type == OBJ_TAG) {