From patchwork Fri Mar 29 21:39:27 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 10877731
Date: Fri, 29 Mar 2019 14:39:27 -0700
Message-Id: <068861632b85179d2a5a5ceb966e951a78b27141.1553895166.git.jonathantanmy@google.com>
References: <20190326220906.111879-1-jonathantanmy@google.com>
X-Mailer: git-send-email 2.21.0.197.gd478713db0
Subject: [PATCH v2 1/2] sha1-file: support OBJECT_INFO_FOR_PREFETCH
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, Johannes.Schindelin@gmx.de, szeder.dev@gmail.com

Teach oid_object_info_extended() to support a new flag that inhibits
fetching of missing objects.
This is equivalent to setting fetch_if_missing to 0, calling
oid_object_info_extended(), then setting fetch_if_missing to whatever it
was before. Update unpack-trees.c to use this new flag instead of
repeatedly setting fetch_if_missing.

This new flag complicates things slightly in that there are now 2 ways
to do the same thing. But this eliminates the need to repeatedly set a
global variable, and more importantly, allows prefetching to be done in
parallel (in the future); hence, this patch.

Signed-off-by: Jonathan Tan
---
 object-store.h |  6 ++++++
 sha1-file.c    |  3 ++-
 unpack-trees.c | 17 +++++++++--------
 3 files changed, 17 insertions(+), 9 deletions(-)

diff --git a/object-store.h b/object-store.h
index 14fc935bd1..dd3f9b75f0 100644
--- a/object-store.h
+++ b/object-store.h
@@ -280,6 +280,12 @@ struct object_info {
 #define OBJECT_INFO_QUICK 8
 /* Do not check loose object */
 #define OBJECT_INFO_IGNORE_LOOSE 16
+/*
+ * Do not attempt to fetch the object if missing (even if fetch_if_missing is
+ * nonzero). This is meant for bulk prefetching of missing blobs in a partial
+ * clone. Implies OBJECT_INFO_QUICK.
+ */
+#define OBJECT_INFO_FOR_PREFETCH (32 + OBJECT_INFO_QUICK)
 
 int oid_object_info_extended(struct repository *r,
 			     const struct object_id *,
diff --git a/sha1-file.c b/sha1-file.c
index 494606f771..ad02649124 100644
--- a/sha1-file.c
+++ b/sha1-file.c
@@ -1370,7 +1370,8 @@ int oid_object_info_extended(struct repository *r, const struct object_id *oid,
 
 		/* Check if it is a missing object */
 		if (fetch_if_missing && repository_format_partial_clone &&
-		    !already_retried && r == the_repository) {
+		    !already_retried && r == the_repository &&
+		    !(flags & OBJECT_INFO_FOR_PREFETCH)) {
 			/*
 			 * TODO Investigate having fetch_object() return
 			 * TODO error/success and stopping the music here.
diff --git a/unpack-trees.c b/unpack-trees.c
index 22c41a3ba8..381b0cd65e 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -404,20 +404,21 @@ static int check_updates(struct unpack_trees_options *o)
 		 * below.
 		 */
 		struct oid_array to_fetch = OID_ARRAY_INIT;
-		int fetch_if_missing_store = fetch_if_missing;
-		fetch_if_missing = 0;
 		for (i = 0; i < index->cache_nr; i++) {
 			struct cache_entry *ce = index->cache[i];
-			if ((ce->ce_flags & CE_UPDATE) &&
-			    !S_ISGITLINK(ce->ce_mode)) {
-				if (!has_object_file(&ce->oid))
-					oid_array_append(&to_fetch, &ce->oid);
-			}
+
+			if (!(ce->ce_flags & CE_UPDATE) ||
+			    S_ISGITLINK(ce->ce_mode))
+				continue;
+			if (!oid_object_info_extended(the_repository, &ce->oid,
+						      NULL,
+						      OBJECT_INFO_FOR_PREFETCH))
+				continue;
+			oid_array_append(&to_fetch, &ce->oid);
 		}
 		if (to_fetch.nr)
 			fetch_objects(repository_format_partial_clone,
 				      to_fetch.oid, to_fetch.nr);
-		fetch_if_missing = fetch_if_missing_store;
 		oid_array_clear(&to_fetch);
 	}
 	for (i = 0; i < index->cache_nr; i++) {

From patchwork Fri Mar 29 21:39:28 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jonathan Tan
X-Patchwork-Id: 10877733
Date: Fri, 29 Mar 2019 14:39:28 -0700
Message-Id: <44de02e584f449481e6fb00cf35d74adf0192e9d.1553895166.git.jonathantanmy@google.com>
References: <20190326220906.111879-1-jonathantanmy@google.com>
X-Mailer: git-send-email 2.21.0.197.gd478713db0
Subject: [PATCH v2 2/2] diff: batch fetching of missing blobs
From: Jonathan Tan
To: git@vger.kernel.org
Cc: Jonathan Tan, peff@peff.net, Johannes.Schindelin@gmx.de, szeder.dev@gmail.com

When running a command like "git show" or "git diff" in a partial clone,
batch all missing blobs to be fetched as one request.

This is similar to c0c578b33c ("unpack-trees: batch fetching of missing
blobs", 2017-12-08), but for another command.

Signed-off-by: Jonathan Tan
Signed-off-by: Johannes Schindelin
---
 diff.c                        |  32 +++++++++++
 t/t4067-diff-partial-clone.sh | 103 ++++++++++++++++++++++++++++++++++
 2 files changed, 135 insertions(+)
 create mode 100755 t/t4067-diff-partial-clone.sh

diff --git a/diff.c b/diff.c
index ec5c095199..1eccefb4ef 100644
--- a/diff.c
+++ b/diff.c
@@ -25,6 +25,7 @@
 #include "packfile.h"
 #include "parse-options.h"
 #include "help.h"
+#include "fetch-object.h"
 
 #ifdef NO_FAST_WORKING_DIRECTORY
 #define FAST_WORKING_DIRECTORY 0
@@ -6366,8 +6367,39 @@ void diffcore_fix_diff_index(void)
 	QSORT(q->queue, q->nr, diffnamecmp);
 }
 
+static void add_if_missing(struct oid_array *to_fetch,
+			   const struct diff_filespec *filespec)
+{
+	if (filespec && filespec->oid_valid &&
+	    oid_object_info_extended(the_repository, &filespec->oid, NULL,
+				     OBJECT_INFO_FOR_PREFETCH))
+		oid_array_append(to_fetch, &filespec->oid);
+}
+
 void diffcore_std(struct diff_options *options)
 {
+	if (repository_format_partial_clone) {
+		/*
+		 * Prefetch the diff pairs that are about to be flushed.
+		 */
+		int i;
+		struct diff_queue_struct *q = &diff_queued_diff;
+		struct oid_array to_fetch = OID_ARRAY_INIT;
+
+		for (i = 0; i < q->nr; i++) {
+			struct diff_filepair *p = q->queue[i];
+			add_if_missing(&to_fetch, p->one);
+			add_if_missing(&to_fetch, p->two);
+		}
+		if (to_fetch.nr)
+			/*
+			 * NEEDSWORK: Consider deduplicating the OIDs sent.
+			 */
+			fetch_objects(repository_format_partial_clone,
+				      to_fetch.oid, to_fetch.nr);
+		oid_array_clear(&to_fetch);
+	}
+
 	/* NOTE please keep the following in sync with diff_tree_combined() */
 	if (options->skip_stat_unmatch)
 		diffcore_skip_stat_unmatch(options);
diff --git a/t/t4067-diff-partial-clone.sh b/t/t4067-diff-partial-clone.sh
new file mode 100755
index 0000000000..349851be7d
--- /dev/null
+++ b/t/t4067-diff-partial-clone.sh
@@ -0,0 +1,103 @@
+#!/bin/sh
+
+test_description='behavior of diff when reading objects in a partial clone'
+
+. ./test-lib.sh
+
+test_expect_success 'git show batches blobs' '
+	test_when_finished "rm -rf server client trace" &&
+
+	test_create_repo server &&
+	echo a >server/a &&
+	echo b >server/b &&
+	git -C server add a b &&
+	git -C server commit -m x &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	git clone --bare --filter=blob:limit=0 "file://$(pwd)/server" client &&
+
+	# Ensure that there is exactly 1 negotiation by checking that there is
+	# only 1 "done" line sent. ("done" marks the end of negotiation.)
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client show HEAD &&
+	grep "git> done" trace >done_lines &&
+	test_line_count = 1 done_lines
+'
+
+test_expect_success 'diff batches blobs' '
+	test_when_finished "rm -rf server client trace" &&
+
+	test_create_repo server &&
+	echo a >server/a &&
+	echo b >server/b &&
+	git -C server add a b &&
+	git -C server commit -m x &&
+	echo c >server/c &&
+	echo d >server/d &&
+	git -C server add c d &&
+	git -C server commit -m x &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	git clone --bare --filter=blob:limit=0 "file://$(pwd)/server" client &&
+
+	# Ensure that there is exactly 1 negotiation by checking that there is
+	# only 1 "done" line sent. ("done" marks the end of negotiation.)
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff HEAD^ HEAD &&
+	grep "git> done" trace >done_lines &&
+	test_line_count = 1 done_lines
+'
+
+test_expect_success 'diff skips same-OID blobs' '
+	test_when_finished "rm -rf server client trace" &&
+
+	test_create_repo server &&
+	echo a >server/a &&
+	echo b >server/b &&
+	git -C server add a b &&
+	git -C server commit -m x &&
+	echo another-a >server/a &&
+	git -C server add a &&
+	git -C server commit -m x &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	git clone --bare --filter=blob:limit=0 "file://$(pwd)/server" client &&
+
+	echo a | git hash-object --stdin >hash-old-a &&
+	echo another-a | git hash-object --stdin >hash-new-a &&
+	echo b | git hash-object --stdin >hash-b &&
+
+	# Ensure that only a and another-a are fetched.
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff HEAD^ HEAD &&
+	grep "want $(cat hash-old-a)" trace &&
+	grep "want $(cat hash-new-a)" trace &&
+	! grep "want $(cat hash-b)" trace
+'
+
+test_expect_success 'diff with rename detection batches blobs' '
+	test_when_finished "rm -rf server client trace" &&
+
+	test_create_repo server &&
+	echo a >server/a &&
+	printf "b\nb\nb\nb\nb\n" >server/b &&
+	git -C server add a b &&
+	git -C server commit -m x &&
+	rm server/b &&
+	printf "b\nb\nb\nb\nbX\n" >server/c &&
+	git -C server add c &&
+	git -C server commit -a -m x &&
+
+	test_config -C server uploadpack.allowfilter 1 &&
+	test_config -C server uploadpack.allowanysha1inwant 1 &&
+	git clone --bare --filter=blob:limit=0 "file://$(pwd)/server" client &&
+
+	# Ensure that there is exactly 1 negotiation by checking that there is
+	# only 1 "done" line sent. ("done" marks the end of negotiation.)
+	GIT_TRACE_PACKET="$(pwd)/trace" git -C client diff -M HEAD^ HEAD >out &&
+	grep "similarity index" out &&
+	grep "git> done" trace >done_lines &&
+	test_line_count = 1 done_lines
+'
+
+test_done
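
A note on the flag value in patch 1/2. OBJECT_INFO_FOR_PREFETCH is defined
as (32 + OBJECT_INFO_QUICK), so it has two bits set, and the
!(flags & OBJECT_INFO_FOR_PREFETCH) test in sha1-file.c looks at both of
them. The standalone sketch below is not Git code: the constants are copied
from the object-store.h hunk, while the two helper names are hypothetical
and exist only to show what an any-bit test and an all-bits test evaluate
to. Whether the difference matters depends on the callers, which this
series does not show.

```c
#include <assert.h>
#include <stdbool.h>

/* Values copied from the object-store.h hunk in patch 1/2. */
#define OBJECT_INFO_QUICK 8
#define OBJECT_INFO_IGNORE_LOOSE 16
#define OBJECT_INFO_FOR_PREFETCH (32 + OBJECT_INFO_QUICK)

/* Same shape as the sha1-file.c test: true if ANY bit of the mask is set. */
static bool any_prefetch_bit(unsigned flags)
{
	return (flags & OBJECT_INFO_FOR_PREFETCH) != 0;
}

/* Hypothetical stricter variant (not in the patch): ALL bits must be set. */
static bool all_prefetch_bits(unsigned flags)
{
	/* The mask is expected to cover bit 32 plus the QUICK bit. */
	assert(OBJECT_INFO_FOR_PREFETCH == (32 | OBJECT_INFO_QUICK));
	return (flags & OBJECT_INFO_FOR_PREFETCH) == OBJECT_INFO_FOR_PREFETCH;
}
```

With these definitions, any_prefetch_bit(OBJECT_INFO_QUICK) is true while
all_prefetch_bits(OBJECT_INFO_QUICK) is false; both are true for
OBJECT_INFO_FOR_PREFETCH.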
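
On the NEEDSWORK in diffcore_std(): the same blob may appear in more than
one filepair, so to_fetch can list an OID several times. One possible way
to deduplicate is to sort the list and drop adjacent duplicates before
fetching. The sketch below is standalone and makes no claim about how Git
would do it: struct sketch_oid and the sketch_* functions are hypothetical
stand-ins for illustration, not Git's oid_array API.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define SKETCH_RAWSZ 20 /* SHA-1 sized; stand-in for Git's struct object_id */

struct sketch_oid {
	unsigned char hash[SKETCH_RAWSZ];
};

/* qsort() comparator: order object IDs by their raw bytes. */
static int sketch_oid_cmp(const void *a, const void *b)
{
	return memcmp(((const struct sketch_oid *)a)->hash,
		      ((const struct sketch_oid *)b)->hash, SKETCH_RAWSZ);
}

/*
 * Sort oids[0..nr) and drop adjacent duplicates in place.
 * Returns the number of unique entries left at the front of the array.
 */
static size_t sketch_dedup_oids(struct sketch_oid *oids, size_t nr)
{
	size_t in, out;

	assert(oids || !nr);
	if (nr < 2)
		return nr;
	qsort(oids, nr, sizeof(*oids), sketch_oid_cmp);
	for (in = 1, out = 1; in < nr; in++)
		/* Keep an entry only if it differs from the last one kept. */
		if (sketch_oid_cmp(&oids[in], &oids[out - 1]))
			oids[out++] = oids[in];
	return out;
}
```

A caller would pass the collected array and its length, then send only the
first sketch_dedup_oids() entries in the fetch request. Sorting changes the
order of the list, which is fine as long as nothing depends on that order.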