From patchwork Thu Oct 29 02:14:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865049 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5A061C55179 for ; Thu, 29 Oct 2020 02:17:06 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 01DA020738 for ; Thu, 29 Oct 2020 02:17:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp.br header.i=@usp.br header.b="LM5ncCbB" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403943AbgJ2CRE (ORCPT ); Wed, 28 Oct 2020 22:17:04 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729196AbgJ2CPW (ORCPT ); Wed, 28 Oct 2020 22:15:22 -0400 Received: from mail-qk1-x744.google.com (mail-qk1-x744.google.com [IPv6:2607:f8b0:4864:20::744]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 1E19BC0613CF for ; Wed, 28 Oct 2020 19:15:22 -0700 (PDT) Received: by mail-qk1-x744.google.com with SMTP id b18so865645qkc.9 for ; Wed, 28 Oct 2020 19:15:22 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp.br; s=usp-google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=/Lv3aK8stRDSfcPzJc3HwDuGlPwpS3bnVg+tImaWaOg=; 
b=LM5ncCbBTbPFdnuif1NQzuQycQJY2+eiCnHYo9Qiu4xB7d8lk69ZZCCaiMpf+mQBga SVrPQc0VHQoqVDWm/IewnBDpRT9SwQf9YIGW075SaDeWMGU9TmkLgoxd4VbULa67KMC4 X4XH9avOjWXHrMjVJnJL+NqO32Vqx61gW339wA2WA/ERNupIYzxgLBqURcdqA4jju2Cn A9PCZPYoa3pqevOv3zd0PaPZMyOxyxp++unDvZkF4SnBAjopPxbpKAYJzbRi39K/6Q0s Q4FHWPEG3rogi/gVwVqZRd71v99xf0Jz9MIs6R1V3I96Mu2bGyq0LByyYutUYe0FKByj L/Uw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=/Lv3aK8stRDSfcPzJc3HwDuGlPwpS3bnVg+tImaWaOg=; b=Yslq/aJBfCH/nhqYxYQCussh9t1Xkb7TIMiORDLebzI2wOJt5p4j/iXFv+QJixl0bM wMJofHYbHtDQhJqmv03uGCzWbyyo7mbhRT3dgGAI/85hXHD29TTldDp/qsLF9PT5Q/sz DxKcECaRt19han8OdT3oI6R+sxXTb72LVss3RB5c0uZmRoSXKNz5OCPF5KptGffBqWh1 LUsdwbeDjRBf7BEU3lQT58FeRsI3Kh8SyQf1PIknhzHFrq0G7BdxxG8bhRNJ7t8n2ucY sn9KK3vGxqH4C+nxqxyaysWYlIY9AitdlhgDuQWbHbfZANkaEToavutYlNDbvUXZrwvj PyXw== X-Gm-Message-State: AOAM530XYCiJt6d24h39sv0yVN/JWhagnOxw2AkC7T7KMAQR8Yaqy6D1 qLs4rFF5S9tL6tVpbM6TREiDfvMJ2vP9+Q== X-Google-Smtp-Source: ABdhPJwat8hwfPQv/K25Zpud9NNyTT9AmiI34BxzV6yHnq/ItKHyG1GNLrtkB2mNkbe43zHnTDjTyw== X-Received: by 2002:a05:620a:c0f:: with SMTP id l15mr1712880qki.494.1603937720977; Wed, 28 Oct 2020 19:15:20 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.118.149]) by smtp.gmail.com with ESMTPSA id n201sm608371qka.32.2020.10.28.19.15.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Oct 2020 19:15:20 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com, Jeff Hostetler Subject: [PATCH v3 01/19] convert: make convert_attrs() and convert structs public Date: Wed, 28 Oct 2020 23:14:38 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: 
git@vger.kernel.org From: Jeff Hostetler Move convert_attrs() declaration from convert.c to convert.h, together with the conv_attrs struct and the crlf_action enum. This function and the data structures will be used outside convert.c in the upcoming parallel checkout implementation. Signed-off-by: Jeff Hostetler [matheus.bernardino: squash and reword msg] Signed-off-by: Matheus Tavares --- convert.c | 23 ++--------------------- convert.h | 24 ++++++++++++++++++++++++ 2 files changed, 26 insertions(+), 21 deletions(-) diff --git a/convert.c b/convert.c index ee360c2f07..eb14714979 100644 --- a/convert.c +++ b/convert.c @@ -24,17 +24,6 @@ #define CONVERT_STAT_BITS_TXT_CRLF 0x2 #define CONVERT_STAT_BITS_BIN 0x4 -enum crlf_action { - CRLF_UNDEFINED, - CRLF_BINARY, - CRLF_TEXT, - CRLF_TEXT_INPUT, - CRLF_TEXT_CRLF, - CRLF_AUTO, - CRLF_AUTO_INPUT, - CRLF_AUTO_CRLF -}; - struct text_stat { /* NUL, CR, LF and CRLF counts */ unsigned nul, lonecr, lonelf, crlf; @@ -1297,18 +1286,10 @@ static int git_path_check_ident(struct attr_check_item *check) return !!ATTR_TRUE(value); } -struct conv_attrs { - struct convert_driver *drv; - enum crlf_action attr_action; /* What attr says */ - enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */ - int ident; - const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */ -}; - static struct attr_check *check; -static void convert_attrs(const struct index_state *istate, - struct conv_attrs *ca, const char *path) +void convert_attrs(const struct index_state *istate, + struct conv_attrs *ca, const char *path) { struct attr_check_item *ccheck = NULL; diff --git a/convert.h b/convert.h index e29d1026a6..aeb4a1be9a 100644 --- a/convert.h +++ b/convert.h @@ -37,6 +37,27 @@ enum eol { #endif }; +enum crlf_action { + CRLF_UNDEFINED, + CRLF_BINARY, + CRLF_TEXT, + CRLF_TEXT_INPUT, + CRLF_TEXT_CRLF, + CRLF_AUTO, + CRLF_AUTO_INPUT, + CRLF_AUTO_CRLF +}; + +struct convert_driver; + +struct conv_attrs { + 
struct convert_driver *drv; + enum crlf_action attr_action; /* What attr says */ + enum crlf_action crlf_action; /* When no attr is set, use core.autocrlf */ + int ident; + const char *working_tree_encoding; /* Supported encoding or default encoding if NULL */ +}; + enum ce_delay_state { CE_NO_DELAY = 0, CE_CAN_DELAY = 1, @@ -102,6 +123,9 @@ void convert_to_git_filter_fd(const struct index_state *istate, int would_convert_to_git_filter_fd(const struct index_state *istate, const char *path); +void convert_attrs(const struct index_state *istate, + struct conv_attrs *ca, const char *path); + /* * Initialize the checkout metadata with the given values. Any argument may be * NULL if it is not applicable. The treeish should be a commit if that is From patchwork Thu Oct 29 02:14:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865039 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A71FC4363A for ; Thu, 29 Oct 2020 02:17:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CBD5220738 for ; Thu, 29 Oct 2020 02:17:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp.br header.i=@usp.br header.b="Q9IpFO9p" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728645AbgJ2CRD (ORCPT ); Wed, 28 Oct 2020 22:17:03 -0400 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:35462 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728305AbgJ2CP1 (ORCPT ); Wed, 28 Oct 2020 22:15:27 -0400
From: Matheus Tavares
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com, Jeff Hostetler
Subject: [PATCH v3 02/19] convert: add [async_]convert_to_working_tree_ca() variants
Date: Wed, 28 Oct 2020 23:14:39 -0300
X-Mailer: git-send-email 2.28.0
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

From: Jeff Hostetler

Separate the attribute gathering from the actual conversion by adding
_ca() variants of the conversion functions. These variants receive a
precomputed 'struct conv_attrs' and thus do not rely on an index state.
They will be used in a future patch adding parallel checkout support,
for two reasons:

- We will already load the conversion attributes in checkout_entry(),
  before conversion, to decide whether a path is eligible for parallel
  checkout. Therefore, it would be wasteful to load them again later,
  for the actual conversion.

- The parallel workers will be responsible for reading, converting and
  writing blobs to the working tree. They won't have access to the main
  process' index state, so they cannot load the attributes. Instead,
  they will receive the preloaded ones and call the _ca() variant of
  the conversion functions. Furthermore, the attributes machinery is
  optimized to handle paths in sequential order, so it's better to
  leave it for the main process, anyway.
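The split described above can be sketched in isolation. What follows is
a minimal, self-contained illustration and not the code from this
patch: `toy_index`, `toy_conv_attrs`, `toy_convert_attrs()`,
`toy_convert_ca()` and `toy_convert()` are hypothetical stand-ins, and
the "conversion" is a naive LF to CRLF expansion chosen only to keep the
example short. It shows the shape of the refactoring: attribute
gathering is the only step that touches the index, and the index-based
entry point becomes a thin wrapper around the _ca() variant.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for git's struct conv_attrs: a single knob. */
struct toy_conv_attrs {
	int want_crlf; /* nonzero: write LF as CRLF */
};

/* Hypothetical stand-in for the index state that attribute lookup needs. */
struct toy_index {
	int autocrlf;
};

/* Attribute gathering: the only step that needs the index. */
static void toy_convert_attrs(const struct toy_index *istate,
			      struct toy_conv_attrs *ca, const char *path)
{
	size_t n = strlen(path);
	int is_bin = n >= 4 && !strcmp(path + n - 4, ".bin");

	ca->want_crlf = istate->autocrlf && !is_bin;
}

/*
 * _ca() variant: converts using precomputed attributes only, so a
 * worker without access to the index could call it. Returns the number
 * of bytes written, or (size_t)-1 if dst is too small.
 */
static size_t toy_convert_ca(const struct toy_conv_attrs *ca,
			     const char *src, size_t len,
			     char *dst, size_t cap)
{
	size_t i, out = 0;

	assert(ca && src && dst);
	for (i = 0; i < len; i++) {
		if (ca->want_crlf && src[i] == '\n') {
			if (out >= cap)
				return (size_t)-1;
			dst[out++] = '\r';
		}
		if (out >= cap)
			return (size_t)-1;
		dst[out++] = src[i];
	}
	return out;
}

/* Index-based entry point: gather attributes, then delegate. */
static size_t toy_convert(const struct toy_index *istate, const char *path,
			  const char *src, size_t len, char *dst, size_t cap)
{
	struct toy_conv_attrs ca;

	toy_convert_attrs(istate, &ca, path);
	return toy_convert_ca(&ca, src, len, dst, cap);
}
```

A caller that already holds the attributes (for example, because it
needed them to decide how to schedule a path) can pass them straight to
the _ca() variant and get the same result as the index-based wrapper,
without a second attribute lookup.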
Signed-off-by: Jeff Hostetler [matheus.bernardino: squash, remove one function definition and reword] Signed-off-by: Matheus Tavares --- convert.c | 50 ++++++++++++++++++++++++++++++++++++-------------- convert.h | 9 +++++++++ 2 files changed, 45 insertions(+), 14 deletions(-) diff --git a/convert.c b/convert.c index eb14714979..191a42a0ae 100644 --- a/convert.c +++ b/convert.c @@ -1447,7 +1447,7 @@ void convert_to_git_filter_fd(const struct index_state *istate, ident_to_git(dst->buf, dst->len, dst, ca.ident); } -static int convert_to_working_tree_internal(const struct index_state *istate, +static int convert_to_working_tree_internal(const struct conv_attrs *ca, const char *path, const char *src, size_t len, struct strbuf *dst, int normalizing, @@ -1455,11 +1455,8 @@ static int convert_to_working_tree_internal(const struct index_state *istate, struct delayed_checkout *dco) { int ret = 0, ret_filter = 0; - struct conv_attrs ca; - - convert_attrs(istate, &ca, path); - ret |= ident_to_worktree(src, len, dst, ca.ident); + ret |= ident_to_worktree(src, len, dst, ca->ident); if (ret) { src = dst->buf; len = dst->len; @@ -1469,24 +1466,24 @@ static int convert_to_working_tree_internal(const struct index_state *istate, * is a smudge or process filter (even if the process filter doesn't * support smudge). The filters might expect CRLFs. 
*/ - if ((ca.drv && (ca.drv->smudge || ca.drv->process)) || !normalizing) { - ret |= crlf_to_worktree(src, len, dst, ca.crlf_action); + if ((ca->drv && (ca->drv->smudge || ca->drv->process)) || !normalizing) { + ret |= crlf_to_worktree(src, len, dst, ca->crlf_action); if (ret) { src = dst->buf; len = dst->len; } } - ret |= encode_to_worktree(path, src, len, dst, ca.working_tree_encoding); + ret |= encode_to_worktree(path, src, len, dst, ca->working_tree_encoding); if (ret) { src = dst->buf; len = dst->len; } ret_filter = apply_filter( - path, src, len, -1, dst, ca.drv, CAP_SMUDGE, meta, dco); - if (!ret_filter && ca.drv && ca.drv->required) - die(_("%s: smudge filter %s failed"), path, ca.drv->name); + path, src, len, -1, dst, ca->drv, CAP_SMUDGE, meta, dco); + if (!ret_filter && ca->drv && ca->drv->required) + die(_("%s: smudge filter %s failed"), path, ca->drv->name); return ret | ret_filter; } @@ -1497,7 +1494,9 @@ int async_convert_to_working_tree(const struct index_state *istate, const struct checkout_metadata *meta, void *dco) { - return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, dco); + struct conv_attrs ca; + convert_attrs(istate, &ca, path); + return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, dco); } int convert_to_working_tree(const struct index_state *istate, @@ -1505,13 +1504,36 @@ int convert_to_working_tree(const struct index_state *istate, size_t len, struct strbuf *dst, const struct checkout_metadata *meta) { - return convert_to_working_tree_internal(istate, path, src, len, dst, 0, meta, NULL); + struct conv_attrs ca; + convert_attrs(istate, &ca, path); + return convert_to_working_tree_internal(&ca, path, src, len, dst, 0, meta, NULL); +} + +int async_convert_to_working_tree_ca(const struct conv_attrs *ca, + const char *path, const char *src, + size_t len, struct strbuf *dst, + const struct checkout_metadata *meta, + void *dco) +{ + return convert_to_working_tree_internal(ca, path, src, len, dst, 
0, meta, dco); +} + +int convert_to_working_tree_ca(const struct conv_attrs *ca, + const char *path, const char *src, + size_t len, struct strbuf *dst, + const struct checkout_metadata *meta) +{ + return convert_to_working_tree_internal(ca, path, src, len, dst, 0, meta, NULL); } int renormalize_buffer(const struct index_state *istate, const char *path, const char *src, size_t len, struct strbuf *dst) { - int ret = convert_to_working_tree_internal(istate, path, src, len, dst, 1, NULL, NULL); + struct conv_attrs ca; + int ret; + + convert_attrs(istate, &ca, path); + ret = convert_to_working_tree_internal(&ca, path, src, len, dst, 1, NULL, NULL); if (ret) { src = dst->buf; len = dst->len; diff --git a/convert.h b/convert.h index aeb4a1be9a..46d537d1ae 100644 --- a/convert.h +++ b/convert.h @@ -100,11 +100,20 @@ int convert_to_working_tree(const struct index_state *istate, const char *path, const char *src, size_t len, struct strbuf *dst, const struct checkout_metadata *meta); +int convert_to_working_tree_ca(const struct conv_attrs *ca, + const char *path, const char *src, + size_t len, struct strbuf *dst, + const struct checkout_metadata *meta); int async_convert_to_working_tree(const struct index_state *istate, const char *path, const char *src, size_t len, struct strbuf *dst, const struct checkout_metadata *meta, void *dco); +int async_convert_to_working_tree_ca(const struct conv_attrs *ca, + const char *path, const char *src, + size_t len, struct strbuf *dst, + const struct checkout_metadata *meta, + void *dco); int async_query_available_blobs(const char *cmd, struct string_list *available_paths); int renormalize_buffer(const struct index_state *istate, From patchwork Thu Oct 29 02:14:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865043 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org 
From: Matheus Tavares
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com, Jeff Hostetler
Subject: [PATCH v3 03/19] convert: add get_stream_filter_ca() variant
Date: Wed, 28 Oct 2020 23:14:40 -0300
X-Mailer: git-send-email 2.28.0
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

From: Jeff Hostetler

As in the previous patch, we will also need to call get_stream_filter()
with a precomputed `struct conv_attrs` when we add support for parallel
checkout workers. So add a _ca() variant, which takes the conversion
attributes struct as a parameter.
Signed-off-by: Jeff Hostetler [matheus.bernardino: move header comment to ca() variant and reword msg] Signed-off-by: Matheus Tavares --- convert.c | 28 +++++++++++++++++----------- convert.h | 2 ++ 2 files changed, 19 insertions(+), 11 deletions(-) diff --git a/convert.c b/convert.c index 191a42a0ae..bd4d3f01cd 100644 --- a/convert.c +++ b/convert.c @@ -1960,34 +1960,31 @@ static struct stream_filter *ident_filter(const struct object_id *oid) } /* - * Return an appropriately constructed filter for the path, or NULL if + * Return an appropriately constructed filter for the given ca, or NULL if * the contents cannot be filtered without reading the whole thing * in-core. * * Note that you would be crazy to set CRLF, smudge/clean or ident to a * large binary blob you would want us not to slurp into the memory! */ -struct stream_filter *get_stream_filter(const struct index_state *istate, - const char *path, - const struct object_id *oid) +struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca, + const struct object_id *oid) { - struct conv_attrs ca; struct stream_filter *filter = NULL; - convert_attrs(istate, &ca, path); - if (ca.drv && (ca.drv->process || ca.drv->smudge || ca.drv->clean)) + if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean)) return NULL; - if (ca.working_tree_encoding) + if (ca->working_tree_encoding) return NULL; - if (ca.crlf_action == CRLF_AUTO || ca.crlf_action == CRLF_AUTO_CRLF) + if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF) return NULL; - if (ca.ident) + if (ca->ident) filter = ident_filter(oid); - if (output_eol(ca.crlf_action) == EOL_CRLF) + if (output_eol(ca->crlf_action) == EOL_CRLF) filter = cascade_filter(filter, lf_to_crlf_filter()); else filter = cascade_filter(filter, &null_filter_singleton); @@ -1995,6 +1992,15 @@ struct stream_filter *get_stream_filter(const struct index_state *istate, return filter; } +struct stream_filter *get_stream_filter(const struct index_state 
*istate, + const char *path, + const struct object_id *oid) +{ + struct conv_attrs ca; + convert_attrs(istate, &ca, path); + return get_stream_filter_ca(&ca, oid); +} + void free_stream_filter(struct stream_filter *filter) { filter->vtbl->free(filter); diff --git a/convert.h b/convert.h index 46d537d1ae..262c1a1d46 100644 --- a/convert.h +++ b/convert.h @@ -169,6 +169,8 @@ struct stream_filter; /* opaque */ struct stream_filter *get_stream_filter(const struct index_state *istate, const char *path, const struct object_id *); +struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca, + const struct object_id *oid); void free_stream_filter(struct stream_filter *); int is_null_stream_filter(struct stream_filter *); From patchwork Thu Oct 29 02:14:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865047 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 39ADBC5517A for ; Thu, 29 Oct 2020 02:17:02 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DB9B42076B for ; Thu, 29 Oct 2020 02:17:01 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp.br header.i=@usp.br header.b="xy5yM/2M" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403954AbgJ2CQ6 (ORCPT ); Wed, 28 Oct 2020 22:16:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35490 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729539AbgJ2CPf (ORCPT ); Wed, 28 Oct 2020 22:15:35 -0400
From: Matheus Tavares
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com, Jeff Hostetler
Subject: [PATCH v3 04/19] convert: add conv_attrs classification
Date: Wed, 28 Oct 2020 23:14:41 -0300
Message-Id: <18c3f4247e717a7766f13b4b33a0bbe31aee6b69.1603937110.git.matheus.bernardino@usp.br>
X-Mailer: git-send-email 2.28.0
MIME-Version: 1.0
Precedence: bulk
X-Mailing-List: git@vger.kernel.org

From: Jeff Hostetler

Create `enum conv_attrs_classification` to express the different ways
that attributes are handled for a blob during checkout.

This will be used in a later commit when deciding whether to add a file
to the parallel or delayed queue during checkout. For now, we can also
use it in get_stream_filter_ca() to simplify the function (as the
classifying logic is the same).
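To make the classes concrete, here is a small standalone sketch that
applies the same decision order as the classify_conv_attrs() added by
this patch to a few sample attribute sets. The struct definitions are
trimmed stand-ins, an assumption made so that the example compiles on
its own: only the fields the classifier reads (plus `ident`) are kept,
and the driver command strings used with it are made up.

```c
#include <assert.h>
#include <stddef.h>

/* Same enumerators as the enum this series moves to convert.h. */
enum crlf_action {
	CRLF_UNDEFINED,
	CRLF_BINARY,
	CRLF_TEXT,
	CRLF_TEXT_INPUT,
	CRLF_TEXT_CRLF,
	CRLF_AUTO,
	CRLF_AUTO_INPUT,
	CRLF_AUTO_CRLF
};

/* Trimmed stand-in: only the members the classifier looks at. */
struct convert_driver {
	const char *smudge;
	const char *clean;
	const char *process;
};

/* Trimmed stand-in for struct conv_attrs. */
struct conv_attrs {
	struct convert_driver *drv;
	enum crlf_action crlf_action;
	int ident;
	const char *working_tree_encoding;
};

enum conv_attrs_classification {
	CA_CLASS_INCORE,         /* buffer needed, smudged in-process */
	CA_CLASS_INCORE_FILTER,  /* buffer needed, single-file filter */
	CA_CLASS_INCORE_PROCESS, /* buffer needed, long-running process */
	CA_CLASS_STREAMABLE      /* can be streamed without a full buffer */
};

/* Same decision order as in the patch: driver, encoding, CRLF mode. */
static enum conv_attrs_classification
classify_conv_attrs(const struct conv_attrs *ca)
{
	assert(ca);
	if (ca->drv) {
		if (ca->drv->process)
			return CA_CLASS_INCORE_PROCESS;
		if (ca->drv->smudge || ca->drv->clean)
			return CA_CLASS_INCORE_FILTER;
	}
	if (ca->working_tree_encoding)
		return CA_CLASS_INCORE;
	if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF)
		return CA_CLASS_INCORE;
	return CA_CLASS_STREAMABLE;
}
```

Note that `ident` does not influence the class: in
get_stream_filter_ca() it is handled by cascading an ident stream
filter, so a path with only `ident` set remains streamable. A driver
takes precedence over the encoding and CRLF checks, so a path with both
a smudge filter and a working-tree encoding is classified as
CA_CLASS_INCORE_FILTER.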
Signed-off-by: Jeff Hostetler [matheus.bernardino: use classification in get_stream_filter_ca()] Signed-off-by: Matheus Tavares --- convert.c | 26 +++++++++++++++++++------- convert.h | 33 +++++++++++++++++++++++++++++++++ 2 files changed, 52 insertions(+), 7 deletions(-) diff --git a/convert.c b/convert.c index bd4d3f01cd..c0b45149b5 100644 --- a/convert.c +++ b/convert.c @@ -1972,13 +1972,7 @@ struct stream_filter *get_stream_filter_ca(const struct conv_attrs *ca, { struct stream_filter *filter = NULL; - if (ca->drv && (ca->drv->process || ca->drv->smudge || ca->drv->clean)) - return NULL; - - if (ca->working_tree_encoding) - return NULL; - - if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF) + if (classify_conv_attrs(ca) != CA_CLASS_STREAMABLE) return NULL; if (ca->ident) @@ -2034,3 +2028,21 @@ void clone_checkout_metadata(struct checkout_metadata *dst, if (blob) oidcpy(&dst->blob, blob); } + +enum conv_attrs_classification classify_conv_attrs(const struct conv_attrs *ca) +{ + if (ca->drv) { + if (ca->drv->process) + return CA_CLASS_INCORE_PROCESS; + if (ca->drv->smudge || ca->drv->clean) + return CA_CLASS_INCORE_FILTER; + } + + if (ca->working_tree_encoding) + return CA_CLASS_INCORE; + + if (ca->crlf_action == CRLF_AUTO || ca->crlf_action == CRLF_AUTO_CRLF) + return CA_CLASS_INCORE; + + return CA_CLASS_STREAMABLE; +} diff --git a/convert.h b/convert.h index 262c1a1d46..523ba9b140 100644 --- a/convert.h +++ b/convert.h @@ -190,4 +190,37 @@ int stream_filter(struct stream_filter *, const char *input, size_t *isize_p, char *output, size_t *osize_p); +enum conv_attrs_classification { + /* + * The blob must be loaded into a buffer before it can be + * smudged. All smudging is done in-proc. + */ + CA_CLASS_INCORE, + + /* + * The blob must be loaded into a buffer, but uses a + * single-file driver filter, such as rot13. 
+ */ + CA_CLASS_INCORE_FILTER, + + /* + * The blob must be loaded into a buffer, but uses a + * long-running driver process, such as LFS. This might or + * might not use delayed operations. (The important thing is + * that there is a single subordinate long-running process + * handling all associated blobs and in case of delayed + * operations, may hold per-blob state.) + */ + CA_CLASS_INCORE_PROCESS, + + /* + * The blob can be streamed and smudged without needing to + * completely read it into a buffer. + */ + CA_CLASS_STREAMABLE, +}; + +enum conv_attrs_classification classify_conv_attrs( + const struct conv_attrs *ca); + #endif /* CONVERT_H */ From patchwork Thu Oct 29 02:14:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865035 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E68FBC388F7 for ; Thu, 29 Oct 2020 02:16:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9039820720 for ; Thu, 29 Oct 2020 02:16:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp.br header.i=@usp.br header.b="Em8OSLuh" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403942AbgJ2CQ5 (ORCPT ); Wed, 28 Oct 2020 22:16:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35502 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S1729557AbgJ2CPj (ORCPT ); Wed, 28 Oct 2020 22:15:39 -0400
Received: from mango.meuintelbras.local ([177.32.118.149]) by smtp.gmail.com with ESMTPSA id 
n201sm608371qka.32.2020.10.28.19.15.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Oct 2020 19:15:37 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com Subject: [PATCH v3 05/19] entry: extract a header file for entry.c functions Date: Wed, 28 Oct 2020 23:14:42 -0300 Message-Id: <2caa2c4345d524e9e3bb0c388f8dc0b99236d166.1603937110.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org The declarations of entry.c's public functions and structures currently reside in cache.h. Although not many, they contribute to the size of cache.h and, when changed, cause the unnecessary recompilation of modules that don't really use these functions. So let's move them to a new entry.h header. Original-patch-by: Nguyễn Thái Ngọc Duy Signed-off-by: Nguyễn Thái Ngọc Duy Signed-off-by: Matheus Tavares --- apply.c | 1 + builtin/checkout-index.c | 1 + builtin/checkout.c | 1 + builtin/difftool.c | 1 + cache.h | 24 ----------------------- entry.c | 9 +-------- entry.h | 41 ++++++++++++++++++++++++++++++++++++++++ unpack-trees.c | 1 + 8 files changed, 47 insertions(+), 32 deletions(-) create mode 100644 entry.h diff --git a/apply.c b/apply.c index 76dba93c97..ddec80b4b0 100644 --- a/apply.c +++ b/apply.c @@ -21,6 +21,7 @@ #include "quote.h" #include "rerere.h" #include "apply.h" +#include "entry.h" struct gitdiff_data { struct strbuf *root; diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c index 4bbfc92dce..9276ed0258 100644 --- a/builtin/checkout-index.c +++ b/builtin/checkout-index.c @@ -11,6 +11,7 @@ #include "quote.h" #include "cache-tree.h" #include "parse-options.h" +#include "entry.h" #define CHECKOUT_ALL 4 static int nul_term_line; diff --git a/builtin/checkout.c 
b/builtin/checkout.c index 0951f8fee5..b18b9d6f3c 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -26,6 +26,7 @@ #include "unpack-trees.h" #include "wt-status.h" #include "xdiff-interface.h" +#include "entry.h" static const char * const checkout_usage[] = { N_("git checkout [] "), diff --git a/builtin/difftool.c b/builtin/difftool.c index 7ac432b881..dfa22b67eb 100644 --- a/builtin/difftool.c +++ b/builtin/difftool.c @@ -23,6 +23,7 @@ #include "lockfile.h" #include "object-store.h" #include "dir.h" +#include "entry.h" static int trust_exit_code; diff --git a/cache.h b/cache.h index c0072d43b1..ccfeb9ba2b 100644 --- a/cache.h +++ b/cache.h @@ -1706,30 +1706,6 @@ const char *show_ident_date(const struct ident_split *id, */ int ident_cmp(const struct ident_split *, const struct ident_split *); -struct checkout { - struct index_state *istate; - const char *base_dir; - int base_dir_len; - struct delayed_checkout *delayed_checkout; - struct checkout_metadata meta; - unsigned force:1, - quiet:1, - not_new:1, - clone:1, - refresh_cache:1; -}; -#define CHECKOUT_INIT { NULL, "" } - -#define TEMPORARY_FILENAME_LENGTH 25 -int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath, int *nr_checkouts); -void enable_delayed_checkout(struct checkout *state); -int finish_delayed_checkout(struct checkout *state, int *nr_checkouts); -/* - * Unlink the last component and schedule the leading directories for - * removal, such that empty directories get removed. 
- */ -void unlink_entry(const struct cache_entry *ce); - struct cache_def { struct strbuf path; int flags; diff --git a/entry.c b/entry.c index a0532f1f00..b0b8099699 100644 --- a/entry.c +++ b/entry.c @@ -6,6 +6,7 @@ #include "submodule.h" #include "progress.h" #include "fsmonitor.h" +#include "entry.h" static void create_directories(const char *path, int path_len, const struct checkout *state) @@ -429,14 +430,6 @@ static void mark_colliding_entries(const struct checkout *state, } } -/* - * Write the contents from ce out to the working tree. - * - * When topath[] is not NULL, instead of writing to the working tree - * file named by ce, a temporary file is created by this function and - * its name is returned in topath[], which must be able to hold at - * least TEMPORARY_FILENAME_LENGTH bytes long. - */ int checkout_entry(struct cache_entry *ce, const struct checkout *state, char *topath, int *nr_checkouts) { diff --git a/entry.h b/entry.h new file mode 100644 index 0000000000..2d69185448 --- /dev/null +++ b/entry.h @@ -0,0 +1,41 @@ +#ifndef ENTRY_H +#define ENTRY_H + +#include "cache.h" +#include "convert.h" + +struct checkout { + struct index_state *istate; + const char *base_dir; + int base_dir_len; + struct delayed_checkout *delayed_checkout; + struct checkout_metadata meta; + unsigned force:1, + quiet:1, + not_new:1, + clone:1, + refresh_cache:1; +}; +#define CHECKOUT_INIT { NULL, "" } + +#define TEMPORARY_FILENAME_LENGTH 25 + +/* + * Write the contents from ce out to the working tree. + * + * When topath[] is not NULL, instead of writing to the working tree + * file named by ce, a temporary file is created by this function and + * its name is returned in topath[], which must be able to hold at + * least TEMPORARY_FILENAME_LENGTH bytes long. 
+ */ +int checkout_entry(struct cache_entry *ce, const struct checkout *state, + char *topath, int *nr_checkouts); +void enable_delayed_checkout(struct checkout *state); +int finish_delayed_checkout(struct checkout *state, int *nr_checkouts); +/* + * Unlink the last component and schedule the leading directories for + * removal, such that empty directories get removed. + */ +void unlink_entry(const struct cache_entry *ce); + +#endif /* ENTRY_H */ diff --git a/unpack-trees.c b/unpack-trees.c index 323280dd48..a511fadd89 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -16,6 +16,7 @@ #include "fsmonitor.h" #include "object-store.h" #include "promisor-remote.h" +#include "entry.h" /* * Error messages expected by scripts out of plumbing commands such as From patchwork Thu Oct 29 02:14:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865031 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F783C5DF9D for ; Thu, 29 Oct 2020 02:16:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3831A20747 for ; Thu, 29 Oct 2020 02:16:59 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp.br header.i=@usp.br header.b="PLxqdMNP" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403926AbgJ2CQ4 (ORCPT ); Wed, 28 Oct 2020 22:16:56 -0400 Received: from lindbergh.monkeyblade.net 
([23.128.96.19]:35510 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729710AbgJ2CPn (ORCPT ); Wed, 28 Oct 2020 22:15:43 -0400 Received: from mail-qk1-x741.google.com (mail-qk1-x741.google.com [IPv6:2607:f8b0:4864:20::741]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 55193C0613CF for ; Wed, 28 Oct 2020 19:15:43 -0700 (PDT) Received: by mail-qk1-x741.google.com with SMTP id s14so850138qkg.11 for ; Wed, 28 Oct 2020 19:15:43 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp.br; s=usp-google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=pnjvx5nwJkdEA5uoqvN7tjpCjIhkVeUGZe2ysSBLLsA=; b=PLxqdMNPyLHNqcYV0AqQB4FL5N87WswBQoXvR4g8SxwiORgkXyoueFncovGPPpuLFq y1XqLF5utoT6ZxLM5yHzZ7LZzyo7I/RI+5OmXXiTu8FIbXGZHs8b9whoKE0gdnZ/9gfD y3y6mMLjXbJyqRZkH7FRvmnTQH3Pm9Seho1/lR6EG9w4D29CKFHINEUyydgcAgm8ZH/L O/WElwawdfAWuNTKs0TSW95Xov9Hh8sDVVIiW87mjPld3CE0uG8bILDzXPcftQcR7lc+ oKUccemZe7vJLSo0+/6TNr5HkLDQaYnGNIaopMtFgT694C/rtS/F2sc6GC51JJfkR3YZ 3WjQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=pnjvx5nwJkdEA5uoqvN7tjpCjIhkVeUGZe2ysSBLLsA=; b=RLZFAvjh8/mPluDt0HzHVZyABHbSjbyUHbcZpzAbT+AQJ4VE8mqywKVLYjMbyju+lY oTPnViIKkCif+LiXQJmO4vgk3VBM1251xxQxQP9+CSlzWb77wDd3nU0sBd9HzPvcrf2e ZhYvruCpNrE267c/IH3f0ce64XmrdanV6F/qlfugTWGoys/Vb66WbPMIAtUC9GLdwwOp vn3SRb8N6+DGlPmt5/4YsKCfJh7gXZpIVfup5F6aL/NG/HTdRoUpSrjj79wLsPvndgvf q2FXgZOc4+Mh9dqQXDwoUMEqZP2vsjsw7+sWbh79rIOpjHohJIRlr8TfSCulx4jB6O1l gUCQ== X-Gm-Message-State: AOAM531TXyB4CKKMxV6tM6zaO5TA+LRzjWFkZVCieIfvGf2pjm7i07L5 ajxo+Yh6hqhSzlHe2P/454CGdJqwBRmwjg== X-Google-Smtp-Source: ABdhPJznlAqRABL1uKCHCurLImOcukpfZrBw/FWLG2QPcyx8Dc/MxjIJVw26pLtZVoqFLPh7qd5nMw== X-Received: by 2002:a05:620a:2054:: with SMTP id d20mr1667905qka.175.1603937742186; Wed, 28 Oct 2020 
19:15:42 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.118.149]) by smtp.gmail.com with ESMTPSA id n201sm608371qka.32.2020.10.28.19.15.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Oct 2020 19:15:41 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com Subject: [PATCH v3 06/19] entry: make fstat_output() and read_blob_entry() public Date: Wed, 28 Oct 2020 23:14:43 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org These two functions will be used by the parallel checkout code, so let's make them public. Note: fstat_output() is renamed to fstat_checkout_output(), now that it has become public, seeking to avoid future name collisions. Signed-off-by: Matheus Tavares --- entry.c | 8 ++++---- entry.h | 2 ++ 2 files changed, 6 insertions(+), 4 deletions(-) diff --git a/entry.c b/entry.c index b0b8099699..b36071a610 100644 --- a/entry.c +++ b/entry.c @@ -84,7 +84,7 @@ static int create_file(const char *path, unsigned int mode) return open(path, O_WRONLY | O_CREAT | O_EXCL, mode); } -static void *read_blob_entry(const struct cache_entry *ce, unsigned long *size) +void *read_blob_entry(const struct cache_entry *ce, unsigned long *size) { enum object_type type; void *blob_data = read_object_file(&ce->oid, &type, size); @@ -109,7 +109,7 @@ static int open_output_fd(char *path, const struct cache_entry *ce, int to_tempf } } -static int fstat_output(int fd, const struct checkout *state, struct stat *st) +int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st) { /* use fstat() only when path == ce->name */ if (fstat_is_reliable() && @@ -132,7 +132,7 @@ static int streaming_write_entry(const struct cache_entry *ce, char *path, return -1; result |= 
stream_blob_to_fd(fd, &ce->oid, filter, 1); - *fstat_done = fstat_output(fd, state, statbuf); + *fstat_done = fstat_checkout_output(fd, state, statbuf); result |= close(fd); if (result) @@ -346,7 +346,7 @@ static int write_entry(struct cache_entry *ce, wrote = write_in_full(fd, new_blob, size); if (!to_tempfile) - fstat_done = fstat_output(fd, state, &st); + fstat_done = fstat_checkout_output(fd, state, &st); close(fd); free(new_blob); if (wrote < 0) diff --git a/entry.h b/entry.h index 2d69185448..f860e60846 100644 --- a/entry.h +++ b/entry.h @@ -37,5 +37,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts); * removal, such that empty directories get removed. */ void unlink_entry(const struct cache_entry *ce); +void *read_blob_entry(const struct cache_entry *ce, unsigned long *size); +int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st); #endif /* ENTRY_H */ From patchwork Thu Oct 29 02:14:44 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865041 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 50394C4363A for ; Thu, 29 Oct 2020 02:16:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F2DDF2076B for ; Thu, 29 Oct 2020 02:16:58 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp.br header.i=@usp.br header.b="wFVOJhnl" Received: 
(majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391085AbgJ2CQ4 (ORCPT ); Wed, 28 Oct 2020 22:16:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35526 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729725AbgJ2CPt (ORCPT ); Wed, 28 Oct 2020 22:15:49 -0400 Received: from mail-qv1-xf42.google.com (mail-qv1-xf42.google.com [IPv6:2607:f8b0:4864:20::f42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8FB62C0613CF for ; Wed, 28 Oct 2020 19:15:47 -0700 (PDT) Received: by mail-qv1-xf42.google.com with SMTP id t20so757464qvv.8 for ; Wed, 28 Oct 2020 19:15:47 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp.br; s=usp-google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=nN6bJrj4N/km2e+ROlSwUAEO25dq+vCA518kUob9Gws=; b=wFVOJhnl6iHEsSMayaEubiH7SFMOUli/rkv9VTIFRr1F7RIP+ZG2xh6tGrgROjRFmC jqfmL81Bto6x3xQEQ6fKHVTuwqwZOKxYy11B8Td4djjf/h2q29pXhzmhWYCDWPQIeX1y tI+RP32DOQGRze8u1YfWcUM4TJcazET0OlklVIV1KakiV7OtSaYIewpLvObu+mFl5Fu4 ZNfixdQ7QsiSUbhGKDWobEHKKgmUeriDhWi0IqRb9R+NLvrkz2UklRZudkhZsCA1iRMJ zEx1zLQh7qSQXIFmTUYlKqG0Pngxy/i0GgjK6xy7bSo2Bekz83FsKOgqv413YeabldYp UDXA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=nN6bJrj4N/km2e+ROlSwUAEO25dq+vCA518kUob9Gws=; b=HUdyITzgMZtzmrAu/+ahblR3qFClU9pxr5ClWaZDm4pt6W2KqG2EF0IC0yH2sG6oQr 846aiWVYfl9vJKKbMtR3pUtWkP8BU6t1/ZEpxJ2bBvKii6JfJLTxjdPdYKHLCiDNT5In 0kFKttjk8TayDkq6Sw+nwJxwGMdm4caW9eruondolDG7K4i9uGfb1o685wtxO5Gw4FLR jdv2yGm5rvJdZu+IWUX8ossKJPUuzz893CP5/Ghl8t/rzQE4TMFwAWlMtwNoYA6AVtcL /oXoeXSsYOrYRKxNxFtKJFym644XYuk8RkPTEoYjsa4tgHPeFZ7O3SsHS0NiZuIi1nMC hTBw== X-Gm-Message-State: AOAM533zXYoe8kSIe1TcE3xep+IVhtPSJaack/CNO9ukRNNXk5IraFC7 fUFs15dPMdSs9CO/bDgxeR6MXZVvLvGrJw== X-Google-Smtp-Source: 
ABdhPJyT4bieDGlyv5BE/Q707KWsuubgLgSo+N6Mj50e+sX1KMa0QDibZPuvlCVzw4qao0EbbIcLPg== X-Received: by 2002:a0c:b2c6:: with SMTP id d6mr2474795qvf.38.1603937746420; Wed, 28 Oct 2020 19:15:46 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.118.149]) by smtp.gmail.com with ESMTPSA id n201sm608371qka.32.2020.10.28.19.15.42 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Oct 2020 19:15:45 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com Subject: [PATCH v3 07/19] entry: extract cache_entry update from write_entry() Date: Wed, 28 Oct 2020 23:14:44 -0300 Message-Id: <91ef17f533e6ed8ba2410ca6b966f06ca40973bb.1603937110.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This code will be used by the parallel checkout functions, outside entry.c, so extract it to a public function. 

Signed-off-by: Matheus Tavares
---
 entry.c | 25 ++++++++++++++++---------
 entry.h |  2 ++
 2 files changed, 18 insertions(+), 9 deletions(-)

diff --git a/entry.c b/entry.c
index b36071a610..1d2df188e5 100644
--- a/entry.c
+++ b/entry.c
@@ -251,6 +251,18 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts)
 	return errs;
 }
 
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st)
+{
+	if (state->refresh_cache) {
+		assert(state->istate);
+		fill_stat_cache_info(state->istate, ce, st);
+		ce->ce_flags |= CE_UPDATE_IN_BASE;
+		mark_fsmonitor_invalid(state->istate, ce);
+		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+	}
+}
+
 static int write_entry(struct cache_entry *ce,
 		       char *path, const struct checkout *state, int to_tempfile)
 {
@@ -371,15 +383,10 @@ static int write_entry(struct cache_entry *ce,
 
 finish:
 	if (state->refresh_cache) {
-		assert(state->istate);
-		if (!fstat_done)
-			if (lstat(ce->name, &st) < 0)
-				return error_errno("unable to stat just-written file %s",
-						   ce->name);
-		fill_stat_cache_info(state->istate, ce, &st);
-		ce->ce_flags |= CE_UPDATE_IN_BASE;
-		mark_fsmonitor_invalid(state->istate, ce);
-		state->istate->cache_changed |= CE_ENTRY_CHANGED;
+		if (!fstat_done && lstat(ce->name, &st) < 0)
+			return error_errno("unable to stat just-written file %s",
+					   ce->name);
+		update_ce_after_write(state, ce , &st);
 	}
 delayed:
 	return 0;
diff --git a/entry.h b/entry.h
index f860e60846..664aed1576 100644
--- a/entry.h
+++ b/entry.h
@@ -39,5 +39,7 @@ int finish_delayed_checkout(struct checkout *state, int *nr_checkouts);
 void unlink_entry(const struct cache_entry *ce);
 void *read_blob_entry(const struct cache_entry *ce, unsigned long *size);
 int fstat_checkout_output(int fd, const struct checkout *state, struct stat *st);
+void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
+			   struct stat *st);
 
 #endif /* ENTRY_H */

From patchwork Thu Oct 29 02:14:45 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11865037
From: Matheus Tavares
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com
Subject: [PATCH v3 08/19] entry: move conv_attrs lookup up to checkout_entry()
Date: Wed, 28 Oct 2020 23:14:45 -0300
Message-Id: <81e03baab1dd7e28262e1d721eac1646c5908b5a.1603937110.git.matheus.bernardino@usp.br>
X-Mailer: git-send-email 2.28.0
In-Reply-To:
References:
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

In a following patch, checkout_entry() will use conv_attrs to decide
whether an entry should be enqueued for parallel checkout or not. But
the attributes lookup only happens lower in this call stack. To avoid
the unnecessary work of loading the attributes twice, let's move it up
to checkout_entry(), and pass the loaded struct down to write_entry().

Signed-off-by: Matheus Tavares
---
 entry.c | 38 +++++++++++++++++++++++++++-----------
 1 file changed, 27 insertions(+), 11 deletions(-)

diff --git a/entry.c b/entry.c
index 1d2df188e5..8237859b12 100644
--- a/entry.c
+++ b/entry.c
@@ -263,8 +263,9 @@ void update_ce_after_write(const struct checkout *state, struct cache_entry *ce,
 	}
 }
 
-static int write_entry(struct cache_entry *ce,
-		       char *path, const struct checkout *state, int to_tempfile)
+/* Note: ca is used (and required) iff the entry refers to a regular file. */
+static int write_entry(struct cache_entry *ce, char *path, struct conv_attrs *ca,
+		       const struct checkout *state, int to_tempfile)
 {
 	unsigned int ce_mode_s_ifmt = ce->ce_mode & S_IFMT;
 	struct delayed_checkout *dco = state->delayed_checkout;
@@ -281,8 +282,7 @@ static int write_entry(struct cache_entry *ce,
 	clone_checkout_metadata(&meta, &state->meta, &ce->oid);
 
 	if (ce_mode_s_ifmt == S_IFREG) {
-		struct stream_filter *filter = get_stream_filter(state->istate, ce->name,
-								 &ce->oid);
+		struct stream_filter *filter = get_stream_filter_ca(ca, &ce->oid);
 		if (filter &&
 		    !streaming_write_entry(ce, path, filter,
 					   state, to_tempfile,
@@ -329,14 +329,17 @@ static int write_entry(struct cache_entry *ce,
 		 * Convert from git internal format to working tree format
 		 */
 		if (dco && dco->state != CE_NO_DELAY) {
-			ret = async_convert_to_working_tree(state->istate, ce->name, new_blob,
-							    size, &buf, &meta, dco);
+			ret = async_convert_to_working_tree_ca(ca, ce->name,
+							       new_blob, size,
+							       &buf, &meta, dco);
 			if (ret && string_list_has_string(&dco->paths, ce->name)) {
 				free(new_blob);
 				goto delayed;
 			}
-		} else
-			ret = convert_to_working_tree(state->istate, ce->name, new_blob, size, &buf, &meta);
+		} else {
+			ret = convert_to_working_tree_ca(ca, ce->name, new_blob,
+							 size, &buf, &meta);
+		}
 
 		if (ret) {
 			free(new_blob);
@@ -442,6 +445,7 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 {
 	static struct strbuf path = STRBUF_INIT;
 	struct stat st;
+	struct conv_attrs ca;
 
 	if (ce->ce_flags & CE_WT_REMOVE) {
 		if (topath)
@@ -454,8 +458,13 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 	}
 
-	if (topath)
-		return write_entry(ce, topath, state, 1);
+	if (topath) {
+		if (S_ISREG(ce->ce_mode)) {
+			convert_attrs(state->istate, &ca, ce->name);
+			return write_entry(ce, topath, &ca, state, 1);
+		}
+		return write_entry(ce, topath, NULL, state, 1);
+	}
 
 	strbuf_reset(&path);
 	strbuf_add(&path, state->base_dir, state->base_dir_len);
@@ -517,9 +526,16 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state,
 		return 0;
 
 	create_directories(path.buf, path.len, state);
+
 	if (nr_checkouts)
 		(*nr_checkouts)++;
-	return write_entry(ce, path.buf, state, 0);
+
+	if (S_ISREG(ce->ce_mode)) {
+		convert_attrs(state->istate, &ca, ce->name);
+		return write_entry(ce, path.buf, &ca, state, 0);
+	}
+
+	return write_entry(ce, path.buf, NULL, state, 0);
 }
 
 void unlink_entry(const struct cache_entry *ce)

From patchwork Thu Oct 29 02:14:46 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11865021
From: Matheus Tavares
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com
Subject: [PATCH v3 09/19] entry: add checkout_entry_ca() which takes preloaded conv_attrs
Date: Wed, 28 Oct 2020 23:14:46 -0300
Message-Id:
X-Mailer: git-send-email 2.28.0
In-Reply-To:
References:
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org

The parallel checkout machinery will call checkout_entry() for entries
that could not be written in parallel due to path collisions. At this
point, we will already be holding the conversion attributes for each
entry, and it would be wasteful to let checkout_entry() load these
again. Instead, let's add the checkout_entry_ca() variant, which
optionally takes a preloaded conv_attrs struct.
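
For illustration only (this is not part of the patch): the
"use the caller's attributes if given, otherwise load them locally"
fallback described above can be sketched in isolation. Every demo_* name
below is a hypothetical stand-in; in particular, demo_load_attrs() only
plays the role of convert_attrs() and does no real attribute lookup.

```c
#include <stddef.h>

/* Hypothetical stand-in for git's struct conv_attrs. */
struct demo_attrs {
	int crlf_action;
};

/* Counts how many times the (supposedly expensive) lookup ran. */
static int demo_loads;

/* Hypothetical stand-in for convert_attrs(). */
static void demo_load_attrs(struct demo_attrs *out, const char *path)
{
	(void)path;
	out->crlf_action = 1;
	demo_loads++;
}

/*
 * Same shape as checkout_entry_ca(): when ca is NULL and the entry is a
 * regular file, load the attributes into a local buffer and point ca at
 * it. Entries that are not regular files never trigger a lookup.
 */
static int demo_checkout_ca(const char *path, int is_regular_file,
			    struct demo_attrs *ca)
{
	struct demo_attrs ca_buf;

	if (is_regular_file && !ca) {
		demo_load_attrs(&ca_buf, path);
		ca = &ca_buf;
	}
	return ca ? ca->crlf_action : 0;
}

/* Same shape as the checkout_entry() wrapper: no preloaded attributes. */
#define demo_checkout(path, is_regular_file) \
	demo_checkout_ca(path, is_regular_file, NULL)
```

A caller that already holds the attributes passes them through
demo_checkout_ca() and skips the lookup; everyone else keeps calling
demo_checkout() unchanged, which is presumably why the patch keeps
checkout_entry() as a thin macro.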
Signed-off-by: Matheus Tavares --- entry.c | 23 ++++++++++++----------- entry.h | 13 +++++++++++-- 2 files changed, 23 insertions(+), 13 deletions(-) diff --git a/entry.c b/entry.c index 8237859b12..9d79a5671f 100644 --- a/entry.c +++ b/entry.c @@ -440,12 +440,13 @@ static void mark_colliding_entries(const struct checkout *state, } } -int checkout_entry(struct cache_entry *ce, const struct checkout *state, - char *topath, int *nr_checkouts) +int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca, + const struct checkout *state, char *topath, + int *nr_checkouts) { static struct strbuf path = STRBUF_INIT; struct stat st; - struct conv_attrs ca; + struct conv_attrs ca_buf; if (ce->ce_flags & CE_WT_REMOVE) { if (topath) @@ -459,11 +460,11 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state, } if (topath) { - if (S_ISREG(ce->ce_mode)) { - convert_attrs(state->istate, &ca, ce->name); - return write_entry(ce, topath, &ca, state, 1); + if (S_ISREG(ce->ce_mode) && !ca) { + convert_attrs(state->istate, &ca_buf, ce->name); + ca = &ca_buf; } - return write_entry(ce, topath, NULL, state, 1); + return write_entry(ce, topath, ca, state, 1); } strbuf_reset(&path); @@ -530,12 +531,12 @@ int checkout_entry(struct cache_entry *ce, const struct checkout *state, if (nr_checkouts) (*nr_checkouts)++; - if (S_ISREG(ce->ce_mode)) { - convert_attrs(state->istate, &ca, ce->name); - return write_entry(ce, path.buf, &ca, state, 0); + if (S_ISREG(ce->ce_mode) && !ca) { + convert_attrs(state->istate, &ca_buf, ce->name); + ca = &ca_buf; } - return write_entry(ce, path.buf, NULL, state, 0); + return write_entry(ce, path.buf, ca, state, 0); } void unlink_entry(const struct cache_entry *ce) diff --git a/entry.h b/entry.h index 664aed1576..2081fbbbab 100644 --- a/entry.h +++ b/entry.h @@ -27,9 +27,18 @@ struct checkout { * file named by ce, a temporary file is created by this function and * its name is returned in topath[], which must be able to hold at * least 
TEMPORARY_FILENAME_LENGTH bytes long. + * + * With checkout_entry_ca(), callers can optionally pass a preloaded + * conv_attrs struct (to avoid reloading it), when ce refers to a + * regular file. If ca is NULL, the attributes will be loaded + * internally when (and if) needed. */ -int checkout_entry(struct cache_entry *ce, const struct checkout *state, - char *topath, int *nr_checkouts); +#define checkout_entry(ce, state, topath, nr_checkouts) \ + checkout_entry_ca(ce, NULL, state, topath, nr_checkouts) +int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca, + const struct checkout *state, char *topath, + int *nr_checkouts); + void enable_delayed_checkout(struct checkout *state); int finish_delayed_checkout(struct checkout *state, int *nr_checkouts); /* From patchwork Thu Oct 29 02:14:47 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865025 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 75144C388F7 for ; Thu, 29 Oct 2020 02:16:50 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1109220738 for ; Thu, 29 Oct 2020 02:16:50 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp.br header.i=@usp.br header.b="QXMh1g3m" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726790AbgJ2CQs (ORCPT ); Wed, 28 Oct 2020 22:16:48 -0400 Received: from 
lindbergh.monkeyblade.net ([23.128.96.19]:35564 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729751AbgJ2CQB (ORCPT ); Wed, 28 Oct 2020 22:16:01 -0400 Received: from mail-qt1-x844.google.com (mail-qt1-x844.google.com [IPv6:2607:f8b0:4864:20::844]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 38D5DC0613CF for ; Wed, 28 Oct 2020 19:16:01 -0700 (PDT) Received: by mail-qt1-x844.google.com with SMTP id f6so46194qtc.7 for ; Wed, 28 Oct 2020 19:16:01 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp.br; s=usp-google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=RH2x56Ypos6xh2n4oydkfqK5HVQbGiNc2vAF3WlAavk=; b=QXMh1g3mXaEppRqKlHfGqyS9FZEzwgW3jEafYkPZdEerO3LpC+XdrqJ3qnMq0b+EQL unDns5wVUw8F30sUUM11Tl552TEn8YFqIw3pksYF86DIry7GuAzylboDDIodAvr6NnOj 6ZmiZlwl0kvWopcnV3Huuikr1iqVAzuyrOB+9g2RtEPkCEZGXs8MkX4V6e7XK3oEYmAu fDDY/5HJNA+g4EsmULZjEy7RjGqyxP82HvedSeTq22fj4OtJGrUXPfdJsFCSujvXIRD0 j9ov1npPiUui7+784tweHP6Po5N0G8UM7BQzdm9t4UUnEbodxMBR+/KFjVL75VP+AbkJ zHRw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=RH2x56Ypos6xh2n4oydkfqK5HVQbGiNc2vAF3WlAavk=; b=GRc9zVMNtqkotuAZ66O1Yq5OXJLDmW4eZ/FpT3lv/E1W8+Kn8WiQ55Mux3ZW90OwgX wvj/EOxbiu7KcOwKeZLLpOgfUvVJgFHQR4n1O5eiqXR9MCOBPaVOua2j2wXXoyPj+igN gJn6rFeGNu7d65OPELCbhr8pw0wXxp7n8FJsfQ7HwZXI6WmMGduSguoeAckX7NxnV1LB qmyGU7625tHUzAU0RiOLb4DIr1p2enUZGjSyoQhIlwJ+6j38V9gf+n8e0k5nMKKf9m3/ zQQAhE9D03tMe+BrjxP/IVnzBGbhfPeufvc/ECbKg8BblaoA9FtSCvJ/C6/DKj37Oqwx ZDZw== X-Gm-Message-State: AOAM53126JMdD9sp3FyEsRkNaOvyO0V0MKCDCN0fG/ElD6WQttCJyJ0S skTWzU62/a67K9O7dOsbzLEABUScAekYqg== X-Google-Smtp-Source: ABdhPJz7Dn2gfREn0JBKSiqTmSOTbWk/YxsR2d014YblBEHD0+Qrd/RITTEW9briQwc5Vj2T1iDTtg== X-Received: by 2002:ac8:6f0e:: with SMTP id 
g14mr1714209qtv.279.1603937758468; Wed, 28 Oct 2020 19:15:58 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.118.149]) by smtp.gmail.com with ESMTPSA id n201sm608371qka.32.2020.10.28.19.15.54 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Oct 2020 19:15:57 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com Subject: [PATCH v3 10/19] unpack-trees: add basic support for parallel checkout Date: Wed, 28 Oct 2020 23:14:47 -0300 Message-Id: <2bdc13664e65a25607b8ecb4c0ea54fb2dad482c.1603937110.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org This new interface allows us to enqueue some of the entries being checked out to later call write_entry() for them in parallel. For now, the parallel checkout machinery is enabled by default and there is no user configuration, but run_parallel_checkout() just writes the queued entries in sequence (without spawning additional workers). The next patch will actually implement the parallelism and, later, we will make it configurable. When there are path collisions among the entries being written (which can happen e.g. with case-sensitive files in case-insensitive file systems), the parallel checkout code detects the problem and marks the item with PC_ITEM_COLLIDED. Later, these items are sequentially fed to checkout_entry() again. This is similar to the way the sequential code deals with collisions, overwriting the previously checked out entries with the subsequent ones. The only difference is that, when we start writing the entries in parallel, we won't be able to determine which of the colliding entries will survive on disk (for the sequential algorithm, it is always the last one). 
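The two-pass collision handling described above can be sketched in isolation. This is only a minimal illustration under stated assumptions, not the code from this patch: the `demo_*` names are hypothetical, the items carry just a path and a status, and a plain unlink() stands in for the cleanup that checkout_entry() would perform on the retry.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stddef.h>
#include <unistd.h>

/* Hypothetical, simplified stand-ins for the PC_ITEM_* states. */
enum demo_status {
	DEMO_PENDING = 0,
	DEMO_WRITTEN,
	DEMO_COLLIDED,
	DEMO_FAILED,
};

struct demo_item {
	const char *path;
	enum demo_status status;
};

/*
 * First pass: create the file exclusively. If the path already exists,
 * another queued item got there first, so record a collision instead
 * of reporting an error.
 */
static void demo_write_item(struct demo_item *item)
{
	int fd;

	/* Sanity check: each item is written at most once in this pass. */
	assert(item->status == DEMO_PENDING);

	fd = open(item->path, O_WRONLY | O_CREAT | O_EXCL, 0666);
	if (fd < 0) {
		if (errno == EEXIST || errno == EISDIR)
			item->status = DEMO_COLLIDED;
		else
			item->status = DEMO_FAILED;
		return;
	}
	item->status = close(fd) ? DEMO_FAILED : DEMO_WRITTEN;
}

/*
 * Second pass: retry the collided items one at a time, replacing
 * whatever currently occupies the path. Returns 0 on success and -1 if
 * any retry failed.
 */
static int demo_retry_collided(struct demo_item *items, size_t nr)
{
	int ret = 0;
	size_t i;

	for (i = 0; i < nr; i++) {
		struct demo_item *item = &items[i];
		int fd = -1;

		if (item->status != DEMO_COLLIDED)
			continue;

		/* Assumption: unlink() stands in for the real cleanup. */
		if (!unlink(item->path) || errno == ENOENT)
			fd = open(item->path, O_WRONLY | O_CREAT | O_EXCL, 0666);
		if (fd < 0 || close(fd)) {
			item->status = DEMO_FAILED;
			ret = -1;
		} else {
			item->status = DEMO_WRITTEN;
		}
	}
	return ret;
}
```

Calling demo_write_item() on every queued item and then demo_retry_collided() on the whole array mirrors the order used here: all writes are attempted first, and only afterwards are the collided items fed back one by one, so the last retried item of a colliding group is the one left on disk.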
I also experimented with the idea of not overwriting colliding entries,
and it seemed to work well in my simple tests. However, because just one
entry of each colliding group would actually be written, the others
would have null lstat() fields in the index. This might not be a problem
by itself, but it could cause performance penalties for subsequent
commands that need to refresh the index: when the cached st_size value
is 0, read-cache.c:ie_modified() will go to the filesystem to see if the
contents match. As mentioned in the function:

    * Immediately after read-tree or update-index --cacheinfo,
    * the length field is zero, as we have never even read the
    * lstat(2) information once, and we cannot trust DATA_CHANGED
    * returned by ie_match_stat() which in turn was returned by
    * ce_match_stat_basic() to signal that the filesize of the
    * blob changed. We have to actually go to the filesystem to
    * see if the contents match, and if so, should answer "unchanged".

So, if we have N entries in a colliding group and we decide to write and
lstat() only one of them, every subsequent git-status will have to read,
convert, and hash the written file N - 1 times, to check that the N - 1
unwritten entries are dirty. By checking out all colliding entries (like
the sequential code does), we only pay the overhead once.
Co-authored-by: Nguyễn Thái Ngọc Duy Co-authored-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- Makefile | 1 + entry.c | 17 +- parallel-checkout.c | 368 ++++++++++++++++++++++++++++++++++++++++++++ parallel-checkout.h | 27 ++++ unpack-trees.c | 6 +- 5 files changed, 416 insertions(+), 3 deletions(-) create mode 100644 parallel-checkout.c create mode 100644 parallel-checkout.h diff --git a/Makefile b/Makefile index 1fb0ec1705..10ee5e709b 100644 --- a/Makefile +++ b/Makefile @@ -945,6 +945,7 @@ LIB_OBJS += pack-revindex.o LIB_OBJS += pack-write.o LIB_OBJS += packfile.o LIB_OBJS += pager.o +LIB_OBJS += parallel-checkout.o LIB_OBJS += parse-options-cb.o LIB_OBJS += parse-options.o LIB_OBJS += patch-delta.o diff --git a/entry.c b/entry.c index 9d79a5671f..6676954431 100644 --- a/entry.c +++ b/entry.c @@ -7,6 +7,7 @@ #include "progress.h" #include "fsmonitor.h" #include "entry.h" +#include "parallel-checkout.h" static void create_directories(const char *path, int path_len, const struct checkout *state) @@ -426,8 +427,17 @@ static void mark_colliding_entries(const struct checkout *state, for (i = 0; i < state->istate->cache_nr; i++) { struct cache_entry *dup = state->istate->cache[i]; - if (dup == ce) - break; + if (dup == ce) { + /* + * Parallel checkout creates the files in no particular + * order. So the other side of the collision may appear + * after the given cache_entry in the array. 
+ */ + if (parallel_checkout_status() == PC_RUNNING) + continue; + else + break; + } if (dup->ce_flags & (CE_MATCHED | CE_VALID | CE_SKIP_WORKTREE)) continue; @@ -536,6 +546,9 @@ int checkout_entry_ca(struct cache_entry *ce, struct conv_attrs *ca, ca = &ca_buf; } + if (!enqueue_checkout(ce, ca)) + return 0; + return write_entry(ce, path.buf, ca, state, 0); } diff --git a/parallel-checkout.c b/parallel-checkout.c new file mode 100644 index 0000000000..981dbe6ff3 --- /dev/null +++ b/parallel-checkout.c @@ -0,0 +1,368 @@ +#include "cache.h" +#include "entry.h" +#include "parallel-checkout.h" +#include "streaming.h" + +enum pc_item_status { + PC_ITEM_PENDING = 0, + PC_ITEM_WRITTEN, + /* + * The entry could not be written because there was another file + * already present in its path or leading directories. Since + * checkout_entry_ca() removes such files from the working tree before + * enqueueing the entry for parallel checkout, it means that there was + * a path collision among the entries being written. + */ + PC_ITEM_COLLIDED, + PC_ITEM_FAILED, +}; + +struct parallel_checkout_item { + /* pointer to a istate->cache[] entry. Not owned by us. 
*/ + struct cache_entry *ce; + struct conv_attrs ca; + struct stat st; + enum pc_item_status status; +}; + +struct parallel_checkout { + enum pc_status status; + struct parallel_checkout_item *items; + size_t nr, alloc; +}; + +static struct parallel_checkout parallel_checkout = { 0 }; + +enum pc_status parallel_checkout_status(void) +{ + return parallel_checkout.status; +} + +void init_parallel_checkout(void) +{ + if (parallel_checkout.status != PC_UNINITIALIZED) + BUG("parallel checkout already initialized"); + + parallel_checkout.status = PC_ACCEPTING_ENTRIES; +} + +static void finish_parallel_checkout(void) +{ + if (parallel_checkout.status == PC_UNINITIALIZED) + BUG("cannot finish parallel checkout: not initialized yet"); + + free(parallel_checkout.items); + memset(&parallel_checkout, 0, sizeof(parallel_checkout)); +} + +static int is_eligible_for_parallel_checkout(const struct cache_entry *ce, + const struct conv_attrs *ca) +{ + enum conv_attrs_classification c; + + if (!S_ISREG(ce->ce_mode)) + return 0; + + c = classify_conv_attrs(ca); + switch (c) { + case CA_CLASS_INCORE: + return 1; + + case CA_CLASS_INCORE_FILTER: + /* + * It would be safe to allow concurrent instances of + * single-file smudge filters, like rot13, but we should not + * assume that all filters are parallel-process safe. So we + * don't allow this. + */ + return 0; + + case CA_CLASS_INCORE_PROCESS: + /* + * The parallel queue and the delayed queue are not compatible, + * so they must be kept completely separated. And we can't tell + * if a long-running process will delay its response without + * actually asking it to perform the filtering. Therefore, this + * type of filter is not allowed in parallel checkout. + * + * Furthermore, there should only be one instance of the + * long-running process filter as we don't know how it is + * managing its own concurrency.
So, spreading the entries that + * need such a filter among the parallel workers would + * require a lot more inter-process communication. We would + * probably have to designate a single process to interact with + * the filter and send all the necessary data to it, for each + * entry. + */ + return 0; + + case CA_CLASS_STREAMABLE: + return 1; + + default: + BUG("unsupported conv_attrs classification '%d'", c); + } +} + +int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca) +{ + struct parallel_checkout_item *pc_item; + + if (parallel_checkout.status != PC_ACCEPTING_ENTRIES || + !is_eligible_for_parallel_checkout(ce, ca)) + return -1; + + ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1, + parallel_checkout.alloc); + + pc_item = &parallel_checkout.items[parallel_checkout.nr++]; + pc_item->ce = ce; + memcpy(&pc_item->ca, ca, sizeof(pc_item->ca)); + pc_item->status = PC_ITEM_PENDING; + + return 0; +} + +static int handle_results(struct checkout *state) +{ + int ret = 0; + size_t i; + int have_pending = 0; + + /* + * We first update the successfully written entries with the collected + * stat() data, so that they can be found by mark_colliding_entries(), + * in the next loop, when necessary. + */ + for (i = 0; i < parallel_checkout.nr; ++i) { + struct parallel_checkout_item *pc_item = &parallel_checkout.items[i]; + if (pc_item->status == PC_ITEM_WRITTEN) + update_ce_after_write(state, pc_item->ce, &pc_item->st); + } + + for (i = 0; i < parallel_checkout.nr; ++i) { + struct parallel_checkout_item *pc_item = &parallel_checkout.items[i]; + + switch (pc_item->status) { + case PC_ITEM_WRITTEN: + /* Already handled */ + break; + case PC_ITEM_COLLIDED: + /* + * The entry could not be checked out due to a path + * collision with another entry. Since there can only + * be one entry of each colliding group on the disk, we + * could skip trying to check out this one and move on.
+ * However, this would leave the unwritten entries with + * null stat() fields on the index, which could + * potentially slow down subsequent operations that + * require refreshing it: git would not be able to + * trust st_size and would have to go to the filesystem + * to see if the contents match (see ie_modified()). + * + * Instead, let's pay the overhead only once, now, and + * call checkout_entry_ca() again for this file, to + * have its stat() data stored in the index. This also + * has the benefit of adding this entry and its + * colliding pair to the collision report message. + * Additionally, this overwriting behavior is consistent + * with what the sequential checkout does, so it doesn't + * add any extra overhead. + */ + ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca, + state, NULL, NULL); + break; + case PC_ITEM_PENDING: + have_pending = 1; + /* fall through */ + case PC_ITEM_FAILED: + ret = -1; + break; + default: + BUG("unknown checkout item status in parallel checkout"); + } + } + + if (have_pending) + error(_("parallel checkout finished with pending entries")); + + return ret; +} + +static int reset_fd(int fd, const char *path) +{ + if (lseek(fd, 0, SEEK_SET) != 0) + return error_errno("failed to rewind descriptor of %s", path); + if (ftruncate(fd, 0)) + return error_errno("failed to truncate file %s", path); + return 0; +} + +static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd, + const char *path) +{ + int ret; + struct stream_filter *filter; + struct strbuf buf = STRBUF_INIT; + char *new_blob; + unsigned long size; + size_t newsize = 0; + ssize_t wrote; + + /* Sanity check */ + assert(is_eligible_for_parallel_checkout(pc_item->ce, &pc_item->ca)); + + filter = get_stream_filter_ca(&pc_item->ca, &pc_item->ce->oid); + if (filter) { + if (stream_blob_to_fd(fd, &pc_item->ce->oid, filter, 1)) { + /* On error, reset fd to try writing without streaming */ + if (reset_fd(fd, path)) + return -1; + } else { + return 0; + } + }
+ + new_blob = read_blob_entry(pc_item->ce, &size); + if (!new_blob) + return error("unable to read sha1 file of %s (%s)", path, + oid_to_hex(&pc_item->ce->oid)); + + /* + * checkout metadata is used to give context for external process + * filters. Files requiring such filters are not eligible for parallel + * checkout, so pass NULL. + */ + ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name, + new_blob, size, &buf, NULL); + + if (ret) { + free(new_blob); + new_blob = strbuf_detach(&buf, &newsize); + size = newsize; + } + + wrote = write_in_full(fd, new_blob, size); + free(new_blob); + if (wrote < 0) + return error("unable to write file %s", path); + + return 0; +} + +static int close_and_clear(int *fd) +{ + int ret = 0; + + if (*fd >= 0) { + ret = close(*fd); + *fd = -1; + } + + return ret; +} + +static int check_leading_dirs(const char *path, int len, int prefix_len) +{ + const char *slash = path + len; + + while (slash > path && *slash != '/') + slash--; + + return has_dirs_only_path(path, slash - path, prefix_len); +} + +static void write_pc_item(struct parallel_checkout_item *pc_item, + struct checkout *state) +{ + unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666; + int fd = -1, fstat_done = 0; + struct strbuf path = STRBUF_INIT; + + strbuf_add(&path, state->base_dir, state->base_dir_len); + strbuf_add(&path, pc_item->ce->name, pc_item->ce->ce_namelen); + + /* + * At this point, leading dirs should have already been created. But if + * a symlink being checked out has collided with one of the dirs, due to + * file system folding rules, it's possible that the dirs are no longer + * present. So we have to check again, and report any path collisions. 
+ */ + if (!check_leading_dirs(path.buf, path.len, state->base_dir_len)) { + pc_item->status = PC_ITEM_COLLIDED; + goto out; + } + + fd = open(path.buf, O_WRONLY | O_CREAT | O_EXCL, mode); + + if (fd < 0) { + if (errno == EEXIST || errno == EISDIR) { + /* + * Errors which probably represent a path collision. + * Suppress the error message and mark the item to be + * retried later, sequentially. ENOTDIR and ENOENT are + * also interesting, but check_leading_dirs() should + * have already caught these cases. + */ + pc_item->status = PC_ITEM_COLLIDED; + } else { + error_errno("failed to open file %s", path.buf); + pc_item->status = PC_ITEM_FAILED; + } + goto out; + } + + if (write_pc_item_to_fd(pc_item, fd, path.buf)) { + /* Error was already reported. */ + pc_item->status = PC_ITEM_FAILED; + goto out; + } + + fstat_done = fstat_checkout_output(fd, state, &pc_item->st); + + if (close_and_clear(&fd)) { + error_errno("unable to close file %s", path.buf); + pc_item->status = PC_ITEM_FAILED; + goto out; + } + + if (state->refresh_cache && !fstat_done && lstat(path.buf, &pc_item->st) < 0) { + error_errno("unable to stat just-written file %s", path.buf); + pc_item->status = PC_ITEM_FAILED; + goto out; + } + + pc_item->status = PC_ITEM_WRITTEN; + +out: + /* + * No need to check close() return. At this point, either fd is already + * closed, or we are on an error path, that has already been reported. 
+ */ + close_and_clear(&fd); + strbuf_release(&path); +} + +static void write_items_sequentially(struct checkout *state) +{ + size_t i; + + for (i = 0; i < parallel_checkout.nr; ++i) + write_pc_item(&parallel_checkout.items[i], state); +} + +int run_parallel_checkout(struct checkout *state) +{ + int ret; + + if (parallel_checkout.status != PC_ACCEPTING_ENTRIES) + BUG("cannot run parallel checkout: uninitialized or already running"); + + parallel_checkout.status = PC_RUNNING; + + write_items_sequentially(state); + ret = handle_results(state); + + finish_parallel_checkout(); + return ret; +} diff --git a/parallel-checkout.h b/parallel-checkout.h new file mode 100644 index 0000000000..e6d6fc01ea --- /dev/null +++ b/parallel-checkout.h @@ -0,0 +1,27 @@ +#ifndef PARALLEL_CHECKOUT_H +#define PARALLEL_CHECKOUT_H + +struct cache_entry; +struct checkout; +struct conv_attrs; + +enum pc_status { + PC_UNINITIALIZED = 0, + PC_ACCEPTING_ENTRIES, + PC_RUNNING, +}; + +enum pc_status parallel_checkout_status(void); +void init_parallel_checkout(void); + +/* + * Return -1 if parallel checkout is currently not enabled or if the entry is + * not eligible for parallel checkout. Otherwise, enqueue the entry for later + * write and return 0.
+ */ +int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca); + +/* Write all the queued entries, returning 0 on success.*/ +int run_parallel_checkout(struct checkout *state); + +#endif /* PARALLEL_CHECKOUT_H */ diff --git a/unpack-trees.c b/unpack-trees.c index a511fadd89..1b1da7485a 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -17,6 +17,7 @@ #include "object-store.h" #include "promisor-remote.h" #include "entry.h" +#include "parallel-checkout.h" /* * Error messages expected by scripts out of plumbing commands such as @@ -438,7 +439,6 @@ static int check_updates(struct unpack_trees_options *o, if (should_update_submodules()) load_gitmodules_file(index, &state); - enable_delayed_checkout(&state); if (has_promisor_remote()) { /* * Prefetch the objects that are to be checked out in the loop @@ -461,6 +461,9 @@ static int check_updates(struct unpack_trees_options *o, to_fetch.oid, to_fetch.nr); oid_array_clear(&to_fetch); } + + enable_delayed_checkout(&state); + init_parallel_checkout(); for (i = 0; i < index->cache_nr; i++) { struct cache_entry *ce = index->cache[i]; @@ -474,6 +477,7 @@ static int check_updates(struct unpack_trees_options *o, } } stop_progress(&progress); + errs |= run_parallel_checkout(&state); errs |= finish_delayed_checkout(&state, NULL); git_attr_set_direction(GIT_ATTR_CHECKIN); From patchwork Thu Oct 29 02:14:48 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865033 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org 
[198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 10CE5C5517A for ; Thu, 29 Oct 2020 02:16:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9D61420738 for ; Thu, 29 Oct 2020 02:16:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp.br header.i=@usp.br header.b="fwv5XlGR" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391082AbgJ2CQs (ORCPT ); Wed, 28 Oct 2020 22:16:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35576 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729868AbgJ2CQE (ORCPT ); Wed, 28 Oct 2020 22:16:04 -0400 Received: from mail-qk1-x742.google.com (mail-qk1-x742.google.com [IPv6:2607:f8b0:4864:20::742]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5BE04C0613CF for ; Wed, 28 Oct 2020 19:16:04 -0700 (PDT) Received: by mail-qk1-x742.google.com with SMTP id q199so859662qke.10 for ; Wed, 28 Oct 2020 19:16:04 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp.br; s=usp-google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=FWT42qj1Qj723mVCon98hA86fVqcoBR0TQZlh3RTs4g=; b=fwv5XlGR0lYFXNau19ukEUinylXFqI66+ekSfw/9k7Lt6Ry2L1D6uiO0GaRu+utbLf HRq5grP42+0GvgbSQy91MRa8/SILx2Gg05aPNh5Yb/JJ3DVcipNBUkkridcK5k6zs4ef U1ldfOx66or4dp51MmlMR/hixyqRs/ZsX/SN3bqhIFX/H+tQyJYoi57RpsDNAodZFaf2 YKoN6CTQpDiLXEsXBCsY3Vhea+uCdbmbVdICE8buP4dIH4FmFISo2jHGOypm6MO/rcHx SbjRLPQR7P2FYXgsExfq8iqafTDsTcBLlodMwzSVEJuMl/7tK52tLtCHEPtdzQgW0GjW +HOQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=FWT42qj1Qj723mVCon98hA86fVqcoBR0TQZlh3RTs4g=; b=eMDWv1j5V5ckXgl4021rAeHOgDY5idK3E3L8LhkzTEECS1hWCfD7uKx4lFocO00b/c 
i0j5pIbBvMOwytPlKfx2q7qh3w63h/N4F2GTisxijmD9fdryMcH8AkMOEvVGFGUBvKU4 fmfqNJha+QP6CyGGVjvrh6mYR3OEzax+wH706bgGppVNEMH6wtaYgQcGco/6tM5EFhLN rkie4Ya1a8MH3awkKSeHMD8BJbOMRUzMardHxBxpUSoXIxSMuYEyopmBXyRVGDzhuLJC piHjZvy72Qodbqd/vXMYq2mMasr9Eq5QvwoFdkPcOVyj8OLZc+aU1UgAy9K4jAzLJpXB A5HQ== X-Gm-Message-State: AOAM531mqrw6FuIDp0isIKNtR4/fYhnmsv2WWWFq1F1aSABM0zFla4Um S7jDX4UqRozT+HfwTmRtnaS3Ll7phZL5yw== X-Google-Smtp-Source: ABdhPJz3SkzrwTNaDy67/Fyp3klFGajkvl5Qo/z9P0JIqEkcX/LavwE9O6u8zLDfxj1C5K6YZdDk9Q== X-Received: by 2002:a05:620a:214a:: with SMTP id m10mr1733346qkm.210.1603937762720; Wed, 28 Oct 2020 19:16:02 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.118.149]) by smtp.gmail.com with ESMTPSA id n201sm608371qka.32.2020.10.28.19.15.58 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Oct 2020 19:16:02 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com Subject: [PATCH v3 11/19] parallel-checkout: make it truly parallel Date: Wed, 28 Oct 2020 23:14:48 -0300 Message-Id: <096e543fd208d18cae875d3ea1a0ecfacf10fa08.1603937110.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Use multiple worker processes to distribute the queued entries and call write_checkout_item() in parallel for them. The items are distributed uniformly in contiguous chunks. This minimizes the chances of two workers writing to the same directory simultaneously, which could affect performance due to lock contention in the kernel. Work stealing (or any other format of re-distribution) is not implemented yet. The parallel version was benchmarked during three operations in the linux repo, with cold cache: cloning v5.8, checking out v5.8 from v2.6.15 (checkout I) and checking out v5.8 from v5.7 (checkout II). 
The four tables below show the mean run times and standard deviations
for 5 runs in: a local file system with SSD, a local file system with
HDD, a Linux NFS server, and Amazon EFS. The numbers of workers were
chosen based on what produces the best result for each case.

Local SSD:

             Clone                  Checkout I             Checkout II
Sequential   8.171 s ± 0.206 s      8.735 s ± 0.230 s      4.166 s ± 0.246 s
10 workers   3.277 s ± 0.138 s      3.774 s ± 0.188 s      2.561 s ± 0.120 s
Speedup      2.49 ± 0.12            2.31 ± 0.13            1.63 ± 0.12

Local HDD:

             Clone                  Checkout I             Checkout II
Sequential   35.157 s ± 0.205 s     48.835 s ± 0.407 s     47.302 s ± 1.435 s
8 workers    35.538 s ± 0.325 s     49.353 s ± 0.826 s     48.919 s ± 0.416 s
Speedup      0.99 ± 0.01            0.99 ± 0.02            0.97 ± 0.03

Linux NFS server (v4.1, on EBS, single availability zone):

             Clone                  Checkout I             Checkout II
Sequential   216.070 s ± 3.611 s    211.169 s ± 3.147 s    57.446 s ± 1.301 s
32 workers   67.997 s ± 0.740 s     66.563 s ± 0.457 s     23.708 s ± 0.622 s
Speedup      3.18 ± 0.06            3.17 ± 0.05            2.42 ± 0.08

EFS (v4.1, replicated over multiple availability zones):

             Clone                  Checkout I             Checkout II
Sequential   1249.329 s ± 13.857 s  1438.979 s ± 78.792 s  543.919 s ± 18.745 s
64 workers   225.864 s ± 12.433 s   316.345 s ± 1.887 s    183.648 s ± 10.095 s
Speedup      5.53 ± 0.31            4.55 ± 0.25            2.96 ± 0.19

The above benchmarks show that parallel checkout is most effective on
repositories located on an SSD or over a distributed file system. For
local file systems on spinning disks, and/or older machines, the
parallelism does not always bring good performance. In fact, it can
even increase the run time. For this reason, the sequential code is
still the default. Two settings are added to optionally enable and
configure the new parallel version as desired.

Local SSD tests were executed in an i7-7700HQ (4 cores with
hyper-threading) running Manjaro Linux. Local HDD tests were executed in
an i7-2600 (also 4 cores with hyper-threading), HDD Seagate Barracuda
7200 rpm SATA 3.0, running Debian 9.13.
NFS and EFS tests were executed in an Amazon EC2 c5n.large instance, with 2 vCPUs. The Linux NFS server was running on a m6g.large instance with 1 TB, EBS GP2 volume. Before each timing, the linux repository was removed (or checked out back), and `sync && sysctl vm.drop_caches=3` was executed. Co-authored-by: Nguyễn Thái Ngọc Duy Co-authored-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- .gitignore | 1 + Documentation/config/checkout.txt | 21 +++ Makefile | 1 + builtin.h | 1 + builtin/checkout--helper.c | 142 +++++++++++++++ git.c | 2 + parallel-checkout.c | 280 +++++++++++++++++++++++++++--- parallel-checkout.h | 84 ++++++++- unpack-trees.c | 10 +- 9 files changed, 508 insertions(+), 34 deletions(-) create mode 100644 builtin/checkout--helper.c diff --git a/.gitignore b/.gitignore index 6232d33924..1a341ea184 100644 --- a/.gitignore +++ b/.gitignore @@ -33,6 +33,7 @@ /git-check-mailmap /git-check-ref-format /git-checkout +/git-checkout--helper /git-checkout-index /git-cherry /git-cherry-pick diff --git a/Documentation/config/checkout.txt b/Documentation/config/checkout.txt index 6b646813ab..23e8f7cde0 100644 --- a/Documentation/config/checkout.txt +++ b/Documentation/config/checkout.txt @@ -16,3 +16,24 @@ will checkout the '' branch on another remote, and by linkgit:git-worktree[1] when 'git worktree add' refers to a remote branch. This setting might be used for other checkout-like commands or functionality in the future. + +checkout.workers:: + The number of parallel workers to use when updating the working tree. + The default is one, i.e. sequential execution. If set to a value less + than one, Git will use as many workers as the number of logical cores + available. This setting and `checkout.thresholdForParallelism` affect + all commands that perform checkout. E.g. checkout, clone, reset, + sparse-checkout, etc. ++ +Note: parallel checkout usually delivers better performance for repositories +located on SSDs or over NFS. 
For repositories on spinning disks and/or machines +with a small number of cores, the default sequential checkout often performs +better. The size and compression level of a repository might also influence how +well the parallel version performs. + +checkout.thresholdForParallelism:: + When running parallel checkout with a small number of files, the cost + of subprocess spawning and inter-process communication might outweigh + the parallelization gains. This setting defines the minimum + number of files for which parallel checkout should be attempted. The + default is 100. diff --git a/Makefile b/Makefile index 10ee5e709b..535e6e94aa 100644 --- a/Makefile +++ b/Makefile @@ -1063,6 +1063,7 @@ BUILTIN_OBJS += builtin/check-attr.o BUILTIN_OBJS += builtin/check-ignore.o BUILTIN_OBJS += builtin/check-mailmap.o BUILTIN_OBJS += builtin/check-ref-format.o +BUILTIN_OBJS += builtin/checkout--helper.o BUILTIN_OBJS += builtin/checkout-index.o BUILTIN_OBJS += builtin/checkout.o BUILTIN_OBJS += builtin/clean.o diff --git a/builtin.h b/builtin.h index 53fb290963..2abbe14b0b 100644 --- a/builtin.h +++ b/builtin.h @@ -123,6 +123,7 @@ int cmd_bugreport(int argc, const char **argv, const char *prefix); int cmd_bundle(int argc, const char **argv, const char *prefix); int cmd_cat_file(int argc, const char **argv, const char *prefix); int cmd_checkout(int argc, const char **argv, const char *prefix); +int cmd_checkout__helper(int argc, const char **argv, const char *prefix); int cmd_checkout_index(int argc, const char **argv, const char *prefix); int cmd_check_attr(int argc, const char **argv, const char *prefix); int cmd_check_ignore(int argc, const char **argv, const char *prefix); diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c new file mode 100644 index 0000000000..67fe37cf11 --- /dev/null +++ b/builtin/checkout--helper.c @@ -0,0 +1,142 @@ +#include "builtin.h" +#include "config.h" +#include "entry.h" +#include "parallel-checkout.h" +#include
"parse-options.h" +#include "pkt-line.h" + +static void packet_to_pc_item(char *line, int len, + struct parallel_checkout_item *pc_item) +{ + struct pc_item_fixed_portion *fixed_portion; + char *encoding, *variant; + + if (len < sizeof(struct pc_item_fixed_portion)) + BUG("checkout worker received too short item (got %dB, exp %dB)", + len, (int)sizeof(struct pc_item_fixed_portion)); + + fixed_portion = (struct pc_item_fixed_portion *)line; + + if (len - sizeof(struct pc_item_fixed_portion) != + fixed_portion->name_len + fixed_portion->working_tree_encoding_len) + BUG("checkout worker received corrupted item"); + + variant = line + sizeof(struct pc_item_fixed_portion); + + /* + * Note: the main process uses zero length to communicate that the + * encoding is NULL. There is no use case in actually sending an empty + * string since it's considered as NULL when ca.working_tree_encoding + * is set at git_path_check_encoding(). + */ + if (fixed_portion->working_tree_encoding_len) { + encoding = xmemdupz(variant, + fixed_portion->working_tree_encoding_len); + variant += fixed_portion->working_tree_encoding_len; + } else { + encoding = NULL; + } + + memset(pc_item, 0, sizeof(*pc_item)); + pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len); + pc_item->ce->ce_namelen = fixed_portion->name_len; + pc_item->ce->ce_mode = fixed_portion->ce_mode; + memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen); + oidcpy(&pc_item->ce->oid, &fixed_portion->oid); + + pc_item->id = fixed_portion->id; + pc_item->ca.crlf_action = fixed_portion->crlf_action; + pc_item->ca.ident = fixed_portion->ident; + pc_item->ca.working_tree_encoding = encoding; +} + +static void report_result(struct parallel_checkout_item *pc_item) +{ + struct pc_item_result res = { 0 }; + size_t size; + + res.id = pc_item->id; + res.status = pc_item->status; + + if (pc_item->status == PC_ITEM_WRITTEN) { + res.st = pc_item->st; + size = sizeof(res); + } else { + size = PC_ITEM_RESULT_BASE_SIZE; + } 
+ + packet_write(1, (const char *)&res, size); +} + +/* Free the worker-side malloced data, but not pc_item itself. */ +static void release_pc_item_data(struct parallel_checkout_item *pc_item) +{ + free((char *)pc_item->ca.working_tree_encoding); + discard_cache_entry(pc_item->ce); +} + +static void worker_loop(struct checkout *state) +{ + struct parallel_checkout_item *items = NULL; + size_t i, nr = 0, alloc = 0; + + while (1) { + int len; + char *line = packet_read_line(0, &len); + + if (!line) + break; + + ALLOC_GROW(items, nr + 1, alloc); + packet_to_pc_item(line, len, &items[nr++]); + } + + for (i = 0; i < nr; ++i) { + struct parallel_checkout_item *pc_item = &items[i]; + write_pc_item(pc_item, state); + report_result(pc_item); + release_pc_item_data(pc_item); + } + + packet_flush(1); + + free(items); +} + +static const char * const checkout_helper_usage[] = { + N_("git checkout--helper []"), + NULL +}; + +int cmd_checkout__helper(int argc, const char **argv, const char *prefix) +{ + struct checkout state = CHECKOUT_INIT; + struct option checkout_helper_options[] = { + OPT_STRING(0, "prefix", &state.base_dir, N_("string"), + N_("when creating files, prepend ")), + OPT_END() + }; + + if (argc == 2 && !strcmp(argv[1], "-h")) + usage_with_options(checkout_helper_usage, + checkout_helper_options); + + git_config(git_default_config, NULL); + argc = parse_options(argc, argv, prefix, checkout_helper_options, + checkout_helper_usage, 0); + if (argc > 0) + usage_with_options(checkout_helper_usage, checkout_helper_options); + + if (state.base_dir) + state.base_dir_len = strlen(state.base_dir); + + /* + * Setting this on worker won't actually update the index. We just need + * to pretend so to induce the checkout machinery to stat() the written + * entries. 
+ */ + state.refresh_cache = 1; + + worker_loop(&state); + return 0; +} diff --git a/git.c b/git.c index 4bdcdad2cc..384f144593 100644 --- a/git.c +++ b/git.c @@ -487,6 +487,8 @@ static struct cmd_struct commands[] = { { "check-mailmap", cmd_check_mailmap, RUN_SETUP }, { "check-ref-format", cmd_check_ref_format, NO_PARSEOPT }, { "checkout", cmd_checkout, RUN_SETUP | NEED_WORK_TREE }, + { "checkout--helper", cmd_checkout__helper, + RUN_SETUP | NEED_WORK_TREE | SUPPORT_SUPER_PREFIX }, { "checkout-index", cmd_checkout_index, RUN_SETUP | NEED_WORK_TREE}, { "cherry", cmd_cherry, RUN_SETUP }, diff --git a/parallel-checkout.c b/parallel-checkout.c index 981dbe6ff3..a5508e27c2 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -1,28 +1,15 @@ #include "cache.h" #include "entry.h" #include "parallel-checkout.h" +#include "pkt-line.h" +#include "run-command.h" #include "streaming.h" +#include "thread-utils.h" +#include "config.h" -enum pc_item_status { - PC_ITEM_PENDING = 0, - PC_ITEM_WRITTEN, - /* - * The entry could not be written because there was another file - * already present in its path or leading directories. Since - * checkout_entry_ca() removes such files from the working tree before - * enqueueing the entry for parallel checkout, it means that there was - * a path collision among the entries being written. - */ - PC_ITEM_COLLIDED, - PC_ITEM_FAILED, -}; - -struct parallel_checkout_item { - /* pointer to a istate->cache[] entry. Not owned by us. 
*/ - struct cache_entry *ce; - struct conv_attrs ca; - struct stat st; - enum pc_item_status status; +struct pc_worker { + struct child_process cp; + size_t next_to_complete, nr_to_complete; }; struct parallel_checkout { @@ -38,6 +25,19 @@ enum pc_status parallel_checkout_status(void) return parallel_checkout.status; } +#define DEFAULT_THRESHOLD_FOR_PARALLELISM 100 + +void get_parallel_checkout_configs(int *num_workers, int *threshold) +{ + if (git_config_get_int("checkout.workers", num_workers)) + *num_workers = 1; + else if (*num_workers < 1) + *num_workers = online_cpus(); + + if (git_config_get_int("checkout.thresholdForParallelism", threshold)) + *threshold = DEFAULT_THRESHOLD_FOR_PARALLELISM; +} + void init_parallel_checkout(void) { if (parallel_checkout.status != PC_UNINITIALIZED) @@ -115,10 +115,12 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca) ALLOC_GROW(parallel_checkout.items, parallel_checkout.nr + 1, parallel_checkout.alloc); - pc_item = &parallel_checkout.items[parallel_checkout.nr++]; + pc_item = &parallel_checkout.items[parallel_checkout.nr]; pc_item->ce = ce; memcpy(&pc_item->ca, ca, sizeof(pc_item->ca)); pc_item->status = PC_ITEM_PENDING; + pc_item->id = parallel_checkout.nr; + parallel_checkout.nr++; return 0; } @@ -231,7 +233,8 @@ static int write_pc_item_to_fd(struct parallel_checkout_item *pc_item, int fd, /* * checkout metadata is used to give context for external process * filters. Files requiring such filters are not eligible for parallel - * checkout, so pass NULL. + * checkout, so pass NULL. Note: if that changes, the metadata must also + * be passed from the main process to the workers.
*/ ret = convert_to_working_tree_ca(&pc_item->ca, pc_item->ce->name, new_blob, size, &buf, NULL); @@ -272,8 +275,8 @@ static int check_leading_dirs(const char *path, int len, int prefix_len) return has_dirs_only_path(path, slash - path, prefix_len); } -static void write_pc_item(struct parallel_checkout_item *pc_item, - struct checkout *state) +void write_pc_item(struct parallel_checkout_item *pc_item, + struct checkout *state) { unsigned int mode = (pc_item->ce->ce_mode & 0100) ? 0777 : 0666; int fd = -1, fstat_done = 0; @@ -343,6 +346,221 @@ static void write_pc_item(struct parallel_checkout_item *pc_item, strbuf_release(&path); } +static void send_one_item(int fd, struct parallel_checkout_item *pc_item) +{ + size_t len_data; + char *data, *variant; + struct pc_item_fixed_portion *fixed_portion; + const char *working_tree_encoding = pc_item->ca.working_tree_encoding; + size_t name_len = pc_item->ce->ce_namelen; + size_t working_tree_encoding_len = working_tree_encoding ? + strlen(working_tree_encoding) : 0; + + len_data = sizeof(struct pc_item_fixed_portion) + name_len + + working_tree_encoding_len; + + data = xcalloc(1, len_data); + + fixed_portion = (struct pc_item_fixed_portion *)data; + fixed_portion->id = pc_item->id; + fixed_portion->ce_mode = pc_item->ce->ce_mode; + fixed_portion->crlf_action = pc_item->ca.crlf_action; + fixed_portion->ident = pc_item->ca.ident; + fixed_portion->name_len = name_len; + fixed_portion->working_tree_encoding_len = working_tree_encoding_len; + /* + * We use hashcpy() instead of oidcpy() because the hash[] positions + * after `the_hash_algo->rawsz` might not be initialized. And Valgrind + * would complain about passing uninitialized bytes to a syscall + * (write(2)). There is no real harm in this case, but the warning could + * hinder the detection of actual errors. 
+ */ + hashcpy(fixed_portion->oid.hash, pc_item->ce->oid.hash); + + variant = data + sizeof(*fixed_portion); + if (working_tree_encoding_len) { + memcpy(variant, working_tree_encoding, working_tree_encoding_len); + variant += working_tree_encoding_len; + } + memcpy(variant, pc_item->ce->name, name_len); + + packet_write(fd, data, len_data); + + free(data); +} + +static void send_batch(int fd, size_t start, size_t nr) +{ + size_t i; + for (i = 0; i < nr; ++i) + send_one_item(fd, &parallel_checkout.items[start + i]); + packet_flush(fd); +} + +static struct pc_worker *setup_workers(struct checkout *state, int num_workers) +{ + struct pc_worker *workers; + int i, workers_with_one_extra_item; + size_t base_batch_size, next_to_assign = 0; + + ALLOC_ARRAY(workers, num_workers); + + for (i = 0; i < num_workers; ++i) { + struct child_process *cp = &workers[i].cp; + + child_process_init(cp); + cp->git_cmd = 1; + cp->in = -1; + cp->out = -1; + cp->clean_on_exit = 1; + strvec_push(&cp->args, "checkout--helper"); + if (state->base_dir_len) + strvec_pushf(&cp->args, "--prefix=%s", state->base_dir); + if (start_command(cp)) + die(_("failed to spawn checkout worker")); + } + + base_batch_size = parallel_checkout.nr / num_workers; + workers_with_one_extra_item = parallel_checkout.nr % num_workers; + + for (i = 0; i < num_workers; ++i) { + struct pc_worker *worker = &workers[i]; + size_t batch_size = base_batch_size; + + /* distribute the extra work evenly */ + if (i < workers_with_one_extra_item) + batch_size++; + + send_batch(worker->cp.in, next_to_assign, batch_size); + worker->next_to_complete = next_to_assign; + worker->nr_to_complete = batch_size; + + next_to_assign += batch_size; + } + + return workers; +} + +static void finish_workers(struct pc_worker *workers, int num_workers) +{ + int i; + + /* + * Close pipes before calling finish_command() to let the workers + * exit asynchronously and avoid spending extra time on wait().
+ */ + for (i = 0; i < num_workers; ++i) { + struct child_process *cp = &workers[i].cp; + if (cp->in >= 0) + close(cp->in); + if (cp->out >= 0) + close(cp->out); + } + + for (i = 0; i < num_workers; ++i) { + if (finish_command(&workers[i].cp)) + error(_("checkout worker %d finished with error"), i); + } + + free(workers); +} + +#define ASSERT_PC_ITEM_RESULT_SIZE(got, exp) \ +do { \ + if ((got) != (exp)) \ + BUG("corrupted result from checkout worker (got %dB, exp %dB)", \ + got, exp); \ +} while(0) + +static void parse_and_save_result(const char *line, int len, + struct pc_worker *worker) +{ + struct pc_item_result *res; + struct parallel_checkout_item *pc_item; + struct stat *st = NULL; + + if (len < PC_ITEM_RESULT_BASE_SIZE) + BUG("too short result from checkout worker (got %dB, exp %dB)", + len, (int)PC_ITEM_RESULT_BASE_SIZE); + + res = (struct pc_item_result *)line; + + /* + * Worker should send either the full result struct on success, or + * just the base (i.e. no stat data), otherwise. + */ + if (res->status == PC_ITEM_WRITTEN) { + ASSERT_PC_ITEM_RESULT_SIZE(len, (int)sizeof(struct pc_item_result)); + st = &res->st; + } else { + ASSERT_PC_ITEM_RESULT_SIZE(len, (int)PC_ITEM_RESULT_BASE_SIZE); + } + + if (!worker->nr_to_complete || res->id != worker->next_to_complete) + BUG("checkout worker sent unexpected item id"); + + worker->next_to_complete++; + worker->nr_to_complete--; + + pc_item = &parallel_checkout.items[res->id]; + pc_item->status = res->status; + if (st) + pc_item->st = *st; +} + + +static void gather_results_from_workers(struct pc_worker *workers, + int num_workers) +{ + int i, active_workers = num_workers; + struct pollfd *pfds; + + CALLOC_ARRAY(pfds, num_workers); + for (i = 0; i < num_workers; ++i) { + pfds[i].fd = workers[i].cp.out; + pfds[i].events = POLLIN; + } + + while (active_workers) { + int nr = poll(pfds, num_workers, -1); + + if (nr < 0) { + if (errno == EINTR) + continue; + die_errno("failed to poll checkout workers"); + } + + for (i = 0; i <
num_workers && nr > 0; ++i) { + struct pc_worker *worker = &workers[i]; + struct pollfd *pfd = &pfds[i]; + + if (!pfd->revents) + continue; + + if (pfd->revents & POLLIN) { + int len; + const char *line = packet_read_line(pfd->fd, &len); + + if (!line) { + pfd->fd = -1; + active_workers--; + } else { + parse_and_save_result(line, len, worker); + } + } else if (pfd->revents & POLLHUP) { + pfd->fd = -1; + active_workers--; + } else if (pfd->revents & (POLLNVAL | POLLERR)) { + die(_("error polling from checkout worker")); + } + + nr--; + } + } + + free(pfds); +} + static void write_items_sequentially(struct checkout *state) { size_t i; @@ -351,7 +569,7 @@ static void write_items_sequentially(struct checkout *state) write_pc_item(&parallel_checkout.items[i], state); } -int run_parallel_checkout(struct checkout *state) +int run_parallel_checkout(struct checkout *state, int num_workers, int threshold) { int ret; @@ -360,7 +578,17 @@ int run_parallel_checkout(struct checkout *state) parallel_checkout.status = PC_RUNNING; - write_items_sequentially(state); + if (parallel_checkout.nr < num_workers) + num_workers = parallel_checkout.nr; + + if (num_workers <= 1 || parallel_checkout.nr < threshold) { + write_items_sequentially(state); + } else { + struct pc_worker *workers = setup_workers(state, num_workers); + gather_results_from_workers(workers, num_workers); + finish_workers(workers, num_workers); + } + ret = handle_results(state); finish_parallel_checkout(); diff --git a/parallel-checkout.h b/parallel-checkout.h index e6d6fc01ea..0c9984584e 100644 --- a/parallel-checkout.h +++ b/parallel-checkout.h @@ -1,9 +1,12 @@ #ifndef PARALLEL_CHECKOUT_H #define PARALLEL_CHECKOUT_H -struct cache_entry; -struct checkout; -struct conv_attrs; +#include "entry.h" +#include "convert.h" + +/**************************************************************** + * Users of parallel checkout + ****************************************************************/ enum pc_status { PC_UNINITIALIZED = 0, @@
-12,6 +15,7 @@ enum pc_status { }; enum pc_status parallel_checkout_status(void); +void get_parallel_checkout_configs(int *num_workers, int *threshold); void init_parallel_checkout(void); /* @@ -21,7 +25,77 @@ void init_parallel_checkout(void); */ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca); -/* Write all the queued entries, returning 0 on success.*/ -int run_parallel_checkout(struct checkout *state); +/* + * Write all the queued entries, returning 0 on success. If the number of + * entries is smaller than the specified threshold, the operation is performed + * sequentially. + */ +int run_parallel_checkout(struct checkout *state, int num_workers, int threshold); + +/**************************************************************** + * Interface with checkout--helper + ****************************************************************/ + +enum pc_item_status { + PC_ITEM_PENDING = 0, + PC_ITEM_WRITTEN, + /* + * The entry could not be written because there was another file + * already present in its path or leading directories. Since + * checkout_entry_ca() removes such files from the working tree before + * enqueueing the entry for parallel checkout, it means that there was + * a path collision among the entries being written. + */ + PC_ITEM_COLLIDED, + PC_ITEM_FAILED, +}; + +struct parallel_checkout_item { + /* + * In main process ce points to a istate->cache[] entry. Thus, it's not + * owned by us. In workers they own the memory, which *must be* released. + */ + struct cache_entry *ce; + struct conv_attrs ca; + size_t id; /* position in parallel_checkout.items[] of main process */ + + /* Output fields, sent from workers. */ + enum pc_item_status status; + struct stat st; +}; + +/* + * The fixed-size portion of `struct parallel_checkout_item` that is sent to the + * workers. Following this will be 2 strings: ca.working_tree_encoding and + * ce.name; These are NOT null terminated, since we have the size in the fixed + * portion. 
+ * + * Note that not all fields of conv_attrs and cache_entry are passed, only the + * ones that will be required by the workers to smudge and write the entry. + */ +struct pc_item_fixed_portion { + size_t id; + struct object_id oid; + unsigned int ce_mode; + enum crlf_action crlf_action; + int ident; + size_t working_tree_encoding_len; + size_t name_len; +}; + +/* + * The fields of `struct parallel_checkout_item` that are returned by the + * workers. Note: `st` must be the last one, as it is omitted on error. + */ +struct pc_item_result { + size_t id; + enum pc_item_status status; + struct stat st; +}; + +#define PC_ITEM_RESULT_BASE_SIZE offsetof(struct pc_item_result, st) + +void write_pc_item(struct parallel_checkout_item *pc_item, + struct checkout *state); #endif /* PARALLEL_CHECKOUT_H */ diff --git a/unpack-trees.c b/unpack-trees.c index 1b1da7485a..117ed42370 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -399,7 +399,7 @@ static int check_updates(struct unpack_trees_options *o, int errs = 0; struct progress *progress; struct checkout state = CHECKOUT_INIT; - int i; + int i, pc_workers, pc_threshold; trace_performance_enter(); state.force = 1; @@ -462,8 +462,11 @@ static int check_updates(struct unpack_trees_options *o, oid_array_clear(&to_fetch); } + get_parallel_checkout_configs(&pc_workers, &pc_threshold); + enable_delayed_checkout(&state); - init_parallel_checkout(); + if (pc_workers > 1) + init_parallel_checkout(); for (i = 0; i < index->cache_nr; i++) { struct cache_entry *ce = index->cache[i]; @@ -477,7 +480,8 @@ static int check_updates(struct unpack_trees_options *o, } } stop_progress(&progress); - errs |= run_parallel_checkout(&state); + if (pc_workers > 1) + errs |= run_parallel_checkout(&state, pc_workers, pc_threshold); errs |= finish_delayed_checkout(&state, NULL); git_attr_set_direction(GIT_ATTR_CHECKIN); From patchwork Thu Oct 29 02:14:49 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit 
X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865045 From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com Subject: [PATCH v3 12/19] parallel-checkout: support progress displaying Date: Wed, 28 Oct 2020 23:14:49 -0300 Message-Id: <9cfeb4821ca88fe25122798239316c4524f28c92.1603937110.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 X-Mailing-List: git@vger.kernel.org Original-patch-by: Nguyễn Thái Ngọc Duy Signed-off-by: Nguyễn Thái Ngọc
Duy Signed-off-by: Matheus Tavares --- parallel-checkout.c | 34 +++++++++++++++++++++++++++++++--- parallel-checkout.h | 4 +++- unpack-trees.c | 11 ++++++++--- 3 files changed, 42 insertions(+), 7 deletions(-) diff --git a/parallel-checkout.c b/parallel-checkout.c index a5508e27c2..c5c449d224 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -2,6 +2,7 @@ #include "entry.h" #include "parallel-checkout.h" #include "pkt-line.h" +#include "progress.h" #include "run-command.h" #include "streaming.h" #include "thread-utils.h" @@ -16,6 +17,8 @@ struct parallel_checkout { enum pc_status status; struct parallel_checkout_item *items; size_t nr, alloc; + struct progress *progress; + unsigned int *progress_cnt; }; static struct parallel_checkout parallel_checkout = { 0 }; @@ -125,6 +128,20 @@ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca) return 0; } +size_t pc_queue_size(void) +{ + return parallel_checkout.nr; +} + +static void advance_progress_meter(void) +{ + if (parallel_checkout.progress) { + (*parallel_checkout.progress_cnt)++; + display_progress(parallel_checkout.progress, + *parallel_checkout.progress_cnt); + } +} + static int handle_results(struct checkout *state) { int ret = 0; @@ -173,6 +190,7 @@ static int handle_results(struct checkout *state) */ ret |= checkout_entry_ca(pc_item->ce, &pc_item->ca, state, NULL, NULL); + advance_progress_meter(); break; case PC_ITEM_PENDING: have_pending = 1; @@ -506,6 +524,9 @@ static void parse_and_save_result(const char *line, int len, pc_item->status = res->status; if (st) pc_item->st = *st; + + if (res->status != PC_ITEM_COLLIDED) + advance_progress_meter(); } @@ -565,11 +586,16 @@ static void write_items_sequentially(struct checkout *state) { size_t i; - for (i = 0; i < parallel_checkout.nr; ++i) - write_pc_item(&parallel_checkout.items[i], state); + for (i = 0; i < parallel_checkout.nr; ++i) { + struct parallel_checkout_item *pc_item = &parallel_checkout.items[i]; + write_pc_item(pc_item, state); + if
(pc_item->status != PC_ITEM_COLLIDED) + advance_progress_meter(); + } } -int run_parallel_checkout(struct checkout *state, int num_workers, int threshold) +int run_parallel_checkout(struct checkout *state, int num_workers, int threshold, + struct progress *progress, unsigned int *progress_cnt) { int ret; @@ -577,6 +603,8 @@ int run_parallel_checkout(struct checkout *state, int num_workers, int threshold BUG("cannot run parallel checkout: uninitialized or already running"); parallel_checkout.status = PC_RUNNING; + parallel_checkout.progress = progress; + parallel_checkout.progress_cnt = progress_cnt; if (parallel_checkout.nr < num_workers) num_workers = parallel_checkout.nr; diff --git a/parallel-checkout.h b/parallel-checkout.h index 0c9984584e..6c3a016c0b 100644 --- a/parallel-checkout.h +++ b/parallel-checkout.h @@ -24,13 +24,15 @@ void init_parallel_checkout(void); * write and return 0. */ int enqueue_checkout(struct cache_entry *ce, struct conv_attrs *ca); +size_t pc_queue_size(void); /* * Write all the queued entries, returning 0 on success. If the number of * entries is smaller than the specified threshold, the operation is performed * sequentially. 
*/ -int run_parallel_checkout(struct checkout *state, int num_workers, int threshold); +int run_parallel_checkout(struct checkout *state, int num_workers, int threshold, + struct progress *progress, unsigned int *progress_cnt); /**************************************************************** * Interface with checkout--helper diff --git a/unpack-trees.c b/unpack-trees.c index 117ed42370..e05e6ceff2 100644 --- a/unpack-trees.c +++ b/unpack-trees.c @@ -471,17 +471,22 @@ static int check_updates(struct unpack_trees_options *o, struct cache_entry *ce = index->cache[i]; if (ce->ce_flags & CE_UPDATE) { + size_t last_pc_queue_size = pc_queue_size(); + if (ce->ce_flags & CE_WT_REMOVE) BUG("both update and delete flags are set on %s", ce->name); - display_progress(progress, ++cnt); ce->ce_flags &= ~CE_UPDATE; errs |= checkout_entry(ce, &state, NULL, NULL); + + if (last_pc_queue_size == pc_queue_size()) + display_progress(progress, ++cnt); } } - stop_progress(&progress); if (pc_workers > 1) - errs |= run_parallel_checkout(&state, pc_workers, pc_threshold); + errs |= run_parallel_checkout(&state, pc_workers, pc_threshold, + progress, &cnt); + stop_progress(&progress); errs |= finish_delayed_checkout(&state, NULL); git_attr_set_direction(GIT_ATTR_CHECKIN); From patchwork Thu Oct 29 02:14:50 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865027 From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com Subject: [PATCH v3 13/19] make_transient_cache_entry(): optionally alloc from mem_pool Date: Wed, 28 Oct 2020 23:14:50 -0300 X-Mailer: git-send-email 2.28.0 X-Mailing-List: git@vger.kernel.org Allow make_transient_cache_entry() to optionally receive a mem_pool struct in which it should allocate the entry. This will be used in the following patch, to store some transient entries which should persist until parallel checkout finishes.
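The "optional pool" shape described above can be sketched in isolation. This is only an illustration, not git's code: `toy_pool`, `toy_entry`, `make_toy_entry()` and `discard_toy_entry()` are hypothetical stand-ins for `struct mem_pool`, `struct cache_entry` and the transient-entry helpers, and the `pooled` flag is an assumption of the sketch. The constructor allocates from the pool when one is given and from the heap otherwise, and the discard helper frees only heap-owned entries.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical bump allocator standing in for git's struct mem_pool. */
struct toy_pool {
	char *buf;
	size_t used, cap;
};

/* Hypothetical stand-in for struct cache_entry. */
struct toy_entry {
	unsigned int mode;
	int pooled;     /* assumption: records which allocator owns the entry */
	size_t namelen;
	char name[];    /* NUL-terminated because the memory starts zeroed */
};

static int toy_pool_init(struct toy_pool *p, size_t cap)
{
	p->buf = calloc(1, cap);
	p->used = 0;
	p->cap = p->buf ? cap : 0;
	return p->buf ? 0 : -1;
}

/* Hand out zeroed, 16-byte-aligned memory; NULL when the pool is exhausted. */
static void *toy_pool_calloc(struct toy_pool *p, size_t size)
{
	size_t rounded = (size + 15) & ~(size_t)15;
	void *ret;

	assert(p->used <= p->cap);
	if (rounded < size || rounded > p->cap - p->used)
		return NULL;
	ret = p->buf + p->used;
	p->used += rounded;
	return ret;
}

/* Release every pooled entry at once. */
static void toy_pool_discard(struct toy_pool *p)
{
	free(p->buf);
	p->buf = NULL;
	p->used = p->cap = 0;
}

/* Allocate from `pool` when it is non-NULL, from the heap otherwise. */
static struct toy_entry *make_toy_entry(unsigned int mode, const char *path,
					struct toy_pool *pool)
{
	size_t len = strlen(path);
	size_t size = sizeof(struct toy_entry) + len + 1;
	struct toy_entry *e = pool ? toy_pool_calloc(pool, size)
				   : calloc(1, size);

	if (!e)
		return NULL;
	e->mode = mode;
	e->pooled = pool != NULL;
	e->namelen = len;
	memcpy(e->name, path, len);
	return e;
}

/* Heap entries are freed here; pooled entries go away with their pool. */
static void discard_toy_entry(struct toy_entry *e)
{
	if (e && !e->pooled)
		free(e);
}
```

Existing callers that do not care about lifetime keep passing NULL and see no change in behavior, which is why the diff below only appends a NULL argument at most call sites.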
Signed-off-by: Matheus Tavares --- builtin/checkout--helper.c | 2 +- builtin/checkout.c | 2 +- builtin/difftool.c | 2 +- cache.h | 10 +++++----- read-cache.c | 12 ++++++++---- unpack-trees.c | 2 +- 6 files changed, 17 insertions(+), 13 deletions(-) diff --git a/builtin/checkout--helper.c b/builtin/checkout--helper.c index 67fe37cf11..9646ed9eeb 100644 --- a/builtin/checkout--helper.c +++ b/builtin/checkout--helper.c @@ -38,7 +38,7 @@ static void packet_to_pc_item(char *line, int len, } memset(pc_item, 0, sizeof(*pc_item)); - pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len); + pc_item->ce = make_empty_transient_cache_entry(fixed_portion->name_len, NULL); pc_item->ce->ce_namelen = fixed_portion->name_len; pc_item->ce->ce_mode = fixed_portion->ce_mode; memcpy(pc_item->ce->name, variant, pc_item->ce->ce_namelen); diff --git a/builtin/checkout.c b/builtin/checkout.c index b18b9d6f3c..c0bf5e6711 100644 --- a/builtin/checkout.c +++ b/builtin/checkout.c @@ -291,7 +291,7 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid)) die(_("Unable to add merge result for '%s'"), path); free(result_buf.ptr); - ce = make_transient_cache_entry(mode, &oid, path, 2); + ce = make_transient_cache_entry(mode, &oid, path, 2, NULL); if (!ce) die(_("make_cache_entry failed for path '%s'"), path); status = checkout_entry(ce, state, NULL, nr_checkouts); diff --git a/builtin/difftool.c b/builtin/difftool.c index dfa22b67eb..5e7a57c8c2 100644 --- a/builtin/difftool.c +++ b/builtin/difftool.c @@ -323,7 +323,7 @@ static int checkout_path(unsigned mode, struct object_id *oid, struct cache_entry *ce; int ret; - ce = make_transient_cache_entry(mode, oid, path, 0); + ce = make_transient_cache_entry(mode, oid, path, 0, NULL); ret = checkout_entry(ce, state, NULL, NULL); discard_cache_entry(ce); diff --git a/cache.h b/cache.h index ccfeb9ba2b..b5074b2cb2 100644 --- a/cache.h +++ 
b/cache.h
@@ -355,16 +355,16 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate,
 					   size_t name_len);
 
 /*
- * Create a cache_entry that is not intended to be added to an index.
- * Caller is responsible for discarding the cache_entry
- * with `discard_cache_entry`.
+ * Create a cache_entry that is not intended to be added to an index. If mp is
+ * not NULL, the entry is allocated within the given memory pool. Caller is
+ * responsible for discarding the cache_entry with `discard_cache_entry`.
  */
 struct cache_entry *make_transient_cache_entry(unsigned int mode,
 					       const struct object_id *oid,
 					       const char *path,
-					       int stage);
+					       int stage, struct mem_pool *mp);
 
-struct cache_entry *make_empty_transient_cache_entry(size_t name_len);
+struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp);
 
 /*
  * Discard cache entry.
diff --git a/read-cache.c b/read-cache.c
index ecf6f68994..f9bac760af 100644
--- a/read-cache.c
+++ b/read-cache.c
@@ -813,8 +813,10 @@ struct cache_entry *make_empty_cache_entry(struct index_state *istate, size_t le
 	return mem_pool__ce_calloc(find_mem_pool(istate), len);
 }
 
-struct cache_entry *make_empty_transient_cache_entry(size_t len)
+struct cache_entry *make_empty_transient_cache_entry(size_t len, struct mem_pool *mp)
 {
+	if (mp)
+		return mem_pool__ce_calloc(mp, len);
 	return xcalloc(1, cache_entry_size(len));
 }
 
@@ -848,8 +850,10 @@ struct cache_entry *make_cache_entry(struct index_state *istate,
 	return ret;
 }
 
-struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct object_id *oid,
-					       const char *path, int stage)
+struct cache_entry *make_transient_cache_entry(unsigned int mode,
+					       const struct object_id *oid,
+					       const char *path, int stage,
+					       struct mem_pool *mp)
 {
 	struct cache_entry *ce;
 	int len;
@@ -860,7 +864,7 @@ struct cache_entry *make_transient_cache_entry(unsigned int mode, const struct o
 	}
 
 	len = strlen(path);
-	ce = make_empty_transient_cache_entry(len);
+	ce = make_empty_transient_cache_entry(len, mp);
 
 	oidcpy(&ce->oid, oid);
 	memcpy(ce->name, path, len);
diff --git a/unpack-trees.c b/unpack-trees.c
index e05e6ceff2..dcb40dc8fa 100644
--- a/unpack-trees.c
+++ b/unpack-trees.c
@@ -1031,7 +1031,7 @@ static struct cache_entry *create_ce_entry(const struct traverse_info *info,
 	size_t len = traverse_path_len(info, tree_entry_len(n));
 	struct cache_entry *ce =
 		is_transient ?
-		make_empty_transient_cache_entry(len) :
+		make_empty_transient_cache_entry(len, NULL) :
 		make_empty_cache_entry(istate, len);
 
 	ce->ce_mode = create_ce_mode(n->mode);

From patchwork Thu Oct 29 02:14:51 2020
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11865017
From: Matheus Tavares
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com
Subject: [PATCH v3 14/19] builtin/checkout.c: complete parallel checkout support
Date: Wed, 28 Oct 2020 23:14:51 -0300

There is one code path in builtin/checkout.c which still doesn't benefit
from parallel checkout because it calls checkout_entry() directly,
instead of unpack_trees(). Let's add parallel support for this missing
spot as well. Note: the transient cache entries allocated in
checkout_merged() are now allocated in a mem_pool which is only
discarded after parallel checkout finishes. This is done because the
entries need to be valid when run_parallel_checkout() is called.
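
The lifetime requirement in the note above can be illustrated with a minimal
sketch. It assumes a deferred queue that only stores pointers to entries, so
an entry freed right after being enqueued would dangle by the time the queue
runs. Every name here (toy_pool, toy_enqueue, toy_run_queue, toy_demo) is a
hypothetical stand-in and not Git's mem_pool or parallel-checkout API:

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy arena; a stand-in for Git's struct mem_pool. */
struct toy_pool {
	void **blocks;
	size_t nr, alloc;
};

/* Allocate zeroed memory that is owned by the pool. */
void *toy_pool_calloc(struct toy_pool *pool, size_t size)
{
	void *mem;

	if (pool->nr == pool->alloc) {
		size_t new_alloc = pool->alloc ? 2 * pool->alloc : 8;
		void **grown = realloc(pool->blocks, new_alloc * sizeof(*grown));
		if (!grown)
			abort();
		pool->blocks = grown;
		pool->alloc = new_alloc;
	}
	mem = calloc(1, size);
	if (!mem)
		abort();
	pool->blocks[pool->nr++] = mem;
	return mem;
}

/* Release every allocation made from the pool at once. */
void toy_pool_discard(struct toy_pool *pool)
{
	size_t i;

	for (i = 0; i < pool->nr; i++)
		free(pool->blocks[i]);
	free(pool->blocks);
	pool->blocks = NULL;
	pool->nr = pool->alloc = 0;
}

struct toy_entry {
	char name[32];
};

/* Deferred work queue: it holds pointers only, so entries must stay alive. */
#define TOY_MAX_QUEUED 16
static struct toy_entry *toy_queue[TOY_MAX_QUEUED];
static size_t toy_queued;

void toy_enqueue(struct toy_pool *pool, const char *name)
{
	struct toy_entry *e = toy_pool_calloc(pool, sizeof(*e));

	/* The entry is zero-filled, so the copy stays NUL-terminated. */
	strncpy(e->name, name, sizeof(e->name) - 1);
	assert(toy_queued < TOY_MAX_QUEUED);
	toy_queue[toy_queued++] = e;
	/* No free() here: the queue still points at e. */
}

/* Process the queue; returns the total length of the queued names. */
size_t toy_run_queue(void)
{
	size_t i, total = 0;

	for (i = 0; i < toy_queued; i++)
		total += strlen(toy_queue[i]->name);
	toy_queued = 0;
	return total;
}

/* Enqueue first, run later, and discard the pool only after the run. */
size_t toy_demo(void)
{
	struct toy_pool pool = { NULL, 0, 0 };
	size_t total;

	toy_enqueue(&pool, "a");
	toy_enqueue(&pool, "b/f");
	total = toy_run_queue();
	toy_pool_discard(&pool);
	return total;
}
```

Moving toy_pool_discard() before toy_run_queue() in this sketch would make the
queue read freed memory, which is the kind of bug the patch avoids by
discarding the pool only after the parallel run.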
Signed-off-by: Matheus Tavares
---
 builtin/checkout.c | 20 ++++++++++++++++----
 1 file changed, 16 insertions(+), 4 deletions(-)

diff --git a/builtin/checkout.c b/builtin/checkout.c
index c0bf5e6711..ddc4079b85 100644
--- a/builtin/checkout.c
+++ b/builtin/checkout.c
@@ -27,6 +27,7 @@
 #include "wt-status.h"
 #include "xdiff-interface.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 static const char * const checkout_usage[] = {
 	N_("git checkout [] "),
@@ -230,7 +231,8 @@ static int checkout_stage(int stage, const struct cache_entry *ce, int pos,
 	return error(_("path '%s' does not have their version"), ce->name);
 }
 
-static int checkout_merged(int pos, const struct checkout *state, int *nr_checkouts)
+static int checkout_merged(int pos, const struct checkout *state,
+			   int *nr_checkouts, struct mem_pool *ce_mem_pool)
 {
 	struct cache_entry *ce = active_cache[pos];
 	const char *path = ce->name;
@@ -291,11 +293,10 @@ static int checkout_merged(int pos, const struct checkout *state, int *nr_checko
 	if (write_object_file(result_buf.ptr, result_buf.size, blob_type, &oid))
 		die(_("Unable to add merge result for '%s'"), path);
 	free(result_buf.ptr);
-	ce = make_transient_cache_entry(mode, &oid, path, 2, NULL);
+	ce = make_transient_cache_entry(mode, &oid, path, 2, ce_mem_pool);
 	if (!ce)
 		die(_("make_cache_entry failed for path '%s'"), path);
 	status = checkout_entry(ce, state, NULL, nr_checkouts);
-	discard_cache_entry(ce);
 	return status;
 }
 
@@ -359,16 +360,22 @@ static int checkout_worktree(const struct checkout_opts *opts,
 	int nr_checkouts = 0, nr_unmerged = 0;
 	int errs = 0;
 	int pos;
+	int pc_workers, pc_threshold;
+	struct mem_pool ce_mem_pool;
 
 	state.force = 1;
 	state.refresh_cache = 1;
 	state.istate = &the_index;
 
+	mem_pool_init(&ce_mem_pool, 0);
+	get_parallel_checkout_configs(&pc_workers, &pc_threshold);
 	init_checkout_metadata(&state.meta, info->refname,
 			       info->commit ? &info->commit->object.oid : &info->oid,
 			       NULL);
 
 	enable_delayed_checkout(&state);
+	if (pc_workers > 1)
+		init_parallel_checkout();
 	for (pos = 0; pos < active_nr; pos++) {
 		struct cache_entry *ce = active_cache[pos];
 		if (ce->ce_flags & CE_MATCHED) {
@@ -384,10 +391,15 @@ static int checkout_worktree(const struct checkout_opts *opts,
 						       &nr_checkouts, opts->overlay_mode);
 			else if (opts->merge)
 				errs |= checkout_merged(pos, &state,
-							&nr_unmerged);
+							&nr_unmerged,
+							&ce_mem_pool);
 			pos = skip_same_name(ce, pos) - 1;
 		}
 	}
+	if (pc_workers > 1)
+		errs |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					      NULL, NULL);
+	mem_pool_discard(&ce_mem_pool, should_validate_cache_entries());
 	remove_marked_cache_entries(&the_index, 1);
 	remove_scheduled_dirs();
 	errs |= finish_delayed_checkout(&state, &nr_checkouts);

From patchwork Thu Oct 29 02:14:52 2020
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11865015
From: Matheus Tavares
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com
Subject: [PATCH v3 15/19] checkout-index: add parallel checkout support
Date: Wed, 28 Oct 2020 23:14:52 -0300

Signed-off-by: Matheus Tavares
---
 builtin/checkout-index.c | 21 ++++++++++++++++++---
 1 file changed, 18 insertions(+), 3 deletions(-)

diff --git a/builtin/checkout-index.c b/builtin/checkout-index.c
index 9276ed0258..9a2e255f58 100644
--- a/builtin/checkout-index.c
+++ b/builtin/checkout-index.c
@@ -12,6 +12,7 @@
 #include "cache-tree.h"
 #include "parse-options.h"
 #include "entry.h"
+#include "parallel-checkout.h"
 
 #define CHECKOUT_ALL 4
 static int nul_term_line;
@@ -169,6 +170,7 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 	int force = 0, quiet = 0, not_new = 0;
 	int index_opt = 0;
 	int err = 0;
+	int pc_workers, pc_threshold;
 	struct option builtin_checkout_index_options[] = {
 		OPT_BOOL('a', "all", &all,
 			N_("check out all files in the index")),
@@ -223,6 +225,14 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 		hold_locked_index(&lock_file, LOCK_DIE_ON_ERROR);
 	}
 
+	if (!to_tempfile)
+		get_parallel_checkout_configs(&pc_workers, &pc_threshold);
+	else
+		pc_workers = 1;
+
+	if (pc_workers > 1)
+		init_parallel_checkout();
+
 	/* Check out named files first */
 	for (i = 0; i < argc; i++) {
 		const char *arg = argv[i];
@@ -262,12 +272,17 @@ int cmd_checkout_index(int argc, const char **argv, const char *prefix)
 		strbuf_release(&buf);
 	}
 
-	if (err)
-		return 1;
-
 	if (all)
 		checkout_all(prefix, prefix_length);
 
+	if (pc_workers > 1) {
+		err |= run_parallel_checkout(&state, pc_workers, pc_threshold,
+					     NULL, NULL);
+	}
+
+	if (err)
+		return 1;
+
 	if (is_lock_file_locked(&lock_file) &&
 	    write_locked_index(&the_index, &lock_file, COMMIT_LOCK))
 		die("Unable to write new index file");

From patchwork Thu Oct 29 02:14:53 2020
X-Patchwork-Submitter: Matheus Tavares
X-Patchwork-Id: 11865019
From: Matheus Tavares
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com
Subject: [PATCH v3 16/19] parallel-checkout: add tests for basic operations
Date: Wed, 28 Oct 2020 23:14:53 -0300
Message-Id: <05299a3cc0ae8ebb55d17ba35adb953aeb003dca.1603937110.git.matheus.bernardino@usp.br>

Add tests to populate the working tree during clone and checkout using
the sequential and parallel modes, to confirm that they produce
identical results. Also test basic checkout mechanics, such as checking
for symlinks in the leading directories and adherence to --force.

Note: some helper functions are added to a common lib file which is only
included by t2080 for now. But it will also be used by other
parallel-checkout tests in the following patches.

Original-patch-by: Jeff Hostetler
Signed-off-by: Jeff Hostetler
Signed-off-by: Matheus Tavares
---
 t/lib-parallel-checkout.sh          |  40 +++++++
 t/t2080-parallel-checkout-basics.sh | 170 ++++++++++++++++++++++++++++
 2 files changed, 210 insertions(+)
 create mode 100644 t/lib-parallel-checkout.sh
 create mode 100755 t/t2080-parallel-checkout-basics.sh

diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh
new file mode 100644
index 0000000000..4dad9043fb
--- /dev/null
+++ b/t/lib-parallel-checkout.sh
@@ -0,0 +1,40 @@
+# Helpers for t208* tests
+
+# Runs `git -c checkout.workers=$1 -c checkout.thresholdForParallelism=$2 ${@:4}`
+# and checks that the number of workers spawned is equal to $3.
+#
+git_pc()
+{
+	if test $# -lt 4
+	then
+		BUG "too few arguments to git_pc()"
+	fi &&
+
+	workers=$1 threshold=$2 expected_workers=$3 &&
+	shift 3 &&
+
+	rm -f trace &&
+	GIT_TRACE2="$(pwd)/trace" git \
+		-c checkout.workers=$workers \
+		-c checkout.thresholdForParallelism=$threshold \
+		-c advice.detachedHead=0 \
+		"$@" &&
+
+	# Check that the expected number of workers has been used. Note that it
+	# can be different from the requested number in two cases: when the
+	# threshold is not reached; and when there are not enough
+	# parallel-eligible entries for all workers.
+	#
+	local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) &&
+	test $workers_in_trace -eq $expected_workers &&
+	rm -f trace
+}
+
+# Verify that both the working tree and the index were created correctly
+verify_checkout()
+{
+	git -C "$1" diff-index --quiet HEAD -- &&
+	git -C "$1" diff-index --quiet --cached HEAD -- &&
+	git -C "$1" status --porcelain >"$1".status &&
+	test_must_be_empty "$1".status
+}
diff --git a/t/t2080-parallel-checkout-basics.sh b/t/t2080-parallel-checkout-basics.sh
new file mode 100755
index 0000000000..edea88f14f
--- /dev/null
+++ b/t/t2080-parallel-checkout-basics.sh
@@ -0,0 +1,170 @@
+#!/bin/sh
+
+test_description='parallel-checkout basics
+
+Ensure that parallel-checkout basically works on clone and checkout, spawning
+the required number of workers and correctly populating both the index and
+working tree.
+'
+
+TEST_NO_CREATE_REPO=1
+. ./test-lib.sh
+. "$TEST_DIRECTORY/lib-parallel-checkout.sh"
+
+# Test parallel-checkout with different operations (creation, deletion,
+# modification) and entry types. A branch switch from B1 to B2 will contain:
+#
+# - a (file): modified
+# - e/x (file): deleted
+# - b (symlink): deleted
+# - b/f (file): created
+# - e (symlink): created
+# - d (submodule): created
+#
+test_expect_success SYMLINKS 'setup repo for checkout with various operations' '
+	git init various &&
+	(
+		cd various &&
+		git checkout -b B1 &&
+		echo a>a &&
+		mkdir e &&
+		echo e/x >e/x &&
+		ln -s e b &&
+		git add -A &&
+		git commit -m B1 &&
+
+		git checkout -b B2 &&
+		echo modified >a &&
+		rm -rf e &&
+		rm b &&
+		mkdir b &&
+		echo b/f >b/f &&
+		ln -s b e &&
+		git init d &&
+		test_commit -C d f &&
+		git submodule add ./d &&
+		git add -A &&
+		git commit -m B2 &&
+
+		git checkout --recurse-submodules B1
+	)
+'
+
+test_expect_success SYMLINKS 'sequential checkout' '
+	cp -R various various_sequential &&
+	git_pc 1 0 0 -C various_sequential checkout --recurse-submodules B2 &&
+	verify_checkout various_sequential
+'
+
+test_expect_success SYMLINKS 'parallel checkout' '
+	cp -R various various_parallel &&
+	git_pc 2 0 2 -C various_parallel checkout --recurse-submodules B2 &&
+	verify_checkout various_parallel
+'
+
+test_expect_success SYMLINKS 'fallback to sequential checkout (threshold)' '
+	cp -R various various_sequential_fallback &&
+	git_pc 2 100 0 -C various_sequential_fallback checkout --recurse-submodules B2 &&
+	verify_checkout various_sequential_fallback
+'
+
+test_expect_success SYMLINKS 'parallel checkout on clone' '
+	git -C various checkout --recurse-submodules B2 &&
+	git_pc 2 0 2 clone --recurse-submodules various various_parallel_clone &&
+	verify_checkout various_parallel_clone
+'
+
+test_expect_success SYMLINKS 'fallback to sequential checkout on clone (threshold)' '
+	git -C various checkout --recurse-submodules B2 &&
+	git_pc 2 100 0 clone --recurse-submodules various various_sequential_fallback_clone &&
+	verify_checkout various_sequential_fallback_clone
+'
+
+# Just to be paranoid, actually compare the working trees' contents directly.
+test_expect_success SYMLINKS 'compare the working trees' '
+	rm -rf various_*/.git &&
+	rm -rf various_*/d/.git &&
+
+	diff -r various_sequential various_parallel &&
+	diff -r various_sequential various_sequential_fallback &&
+	diff -r various_sequential various_parallel_clone &&
+	diff -r various_sequential various_sequential_fallback_clone
+'
+
+test_cmp_str()
+{
+	echo "$1" >tmp &&
+	test_cmp tmp "$2"
+}
+
+test_expect_success 'parallel checkout respects --[no]-force' '
+	git init dirty &&
+	(
+		cd dirty &&
+		mkdir D &&
+		test_commit D/F &&
+		test_commit F &&
+
+		echo changed >F.t &&
+		rm -rf D &&
+		echo changed >D &&
+
+		# We expect 0 workers because there is nothing to be updated
+		git_pc 2 0 0 checkout HEAD &&
+		test_path_is_file D &&
+		test_cmp_str changed D &&
+		test_cmp_str changed F.t &&
+
+		git_pc 2 0 2 checkout --force HEAD &&
+		test_path_is_dir D &&
+		test_cmp_str D/F D/F.t &&
+		test_cmp_str F F.t
+	)
+'
+
+test_expect_success SYMLINKS 'parallel checkout checks for symlinks in leading dirs' '
+	git init symlinks &&
+	(
+		cd symlinks &&
+		mkdir D E &&
+
+		# Create two entries in D to have enough work for 2 parallel
+		# workers
+		test_commit D/A &&
+		test_commit D/B &&
+		test_commit E/C &&
+		rm -rf D &&
+		ln -s E D &&
+
+		git_pc 2 0 2 checkout --force HEAD &&
+		! test -L D &&
+		test_cmp_str D/A D/A.t &&
+		test_cmp_str D/B D/B.t
+	)
+'
+
+test_expect_success SYMLINKS,CASE_INSENSITIVE_FS 'symlink colliding with leading dir' '
+	git init colliding-symlink &&
+	(
+		cd colliding-symlink &&
+		file_hex=$(git hash-object -w --stdin tree &&
+		printf "100644 E/B\0${file_oct}" >>tree &&
+		printf "120000 e\0${sym_oct}" >>tree &&
+
+		tree_hex=$(git hash-object -w -t tree --stdin

X-Patchwork-Id: 11865023
From: Matheus Tavares
To: git@vger.kernel.org
Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com
Subject: [PATCH v3 17/19] parallel-checkout: add tests related to clone collisions
Date: Wed, 28 Oct 2020 23:14:54 -0300 Message-Id: <3d140dcacbd7fd49ea2dfb7bc0839e57b11427de.1603937110.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Add tests to confirm that path collisions are properly reported during a clone operation using parallel-checkout. Original-patch-by: Jeff Hostetler Signed-off-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- t/lib-parallel-checkout.sh | 4 +- t/t2081-parallel-checkout-collisions.sh | 98 +++++++++++++++++++++++++ 2 files changed, 100 insertions(+), 2 deletions(-) create mode 100755 t/t2081-parallel-checkout-collisions.sh diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh index 4dad9043fb..e62a433eb1 100644 --- a/t/lib-parallel-checkout.sh +++ b/t/lib-parallel-checkout.sh @@ -18,7 +18,7 @@ git_pc() -c checkout.workers=$workers \ -c checkout.thresholdForParallelism=$threshold \ -c advice.detachedHead=0 \ - "$@" && + "$@" 2>&8 && # Check that the expected number of workers has been used. Note that it # can be different from the requested number in two cases: when the @@ -28,7 +28,7 @@ git_pc() local workers_in_trace=$(grep "child_start\[..*\] git checkout--helper" trace | wc -l) && test $workers_in_trace -eq $expected_workers && rm -f trace -} +} 8>&2 2>&4 # Verify that both the working tree and the index were created correctly verify_checkout() diff --git a/t/t2081-parallel-checkout-collisions.sh b/t/t2081-parallel-checkout-collisions.sh new file mode 100755 index 0000000000..5cab2dcd2c --- /dev/null +++ b/t/t2081-parallel-checkout-collisions.sh @@ -0,0 +1,98 @@ +#!/bin/sh + +test_description='parallel-checkout collisions + +When there are path collisions during a clone, Git should report a warning +listing all of the colliding entries. The sequential code detects a collision +by calling lstat() before trying to open(O_CREAT) the file. 
Then, to find the +colliding pair of an item k, it searches cache_entry[0, k-1]. + +This is not sufficient in parallel checkout since: + +- A colliding file may be created between the lstat() and open() calls; +- A colliding entry might appear in the second half of the cache_entry array. + +The tests in this file make sure that the collision detection code is extended +for parallel checkout. +' + +. ./test-lib.sh +. "$TEST_DIRECTORY/lib-parallel-checkout.sh" + +TEST_ROOT="$PWD" + +test_expect_success CASE_INSENSITIVE_FS 'setup' ' + file_x_hex=$(git hash-object -w --stdin tree && + printf "100644 FILE_x\0${file_x_oct}" >>tree && + printf "100644 file_X\0${file_x_oct}" >>tree && + printf "100644 file_x\0${file_x_oct}" >>tree && + printf "100644 .gitattributes\0${attr_oct}" >>tree && + + tree_hex=$(git hash-object -w -t tree --stdin >filter.log + EOF +' + +for mode in parallel sequential-fallback +do + + case $mode in + parallel) workers=2 threshold=0 expected_workers=2 ;; + sequential-fallback) workers=2 threshold=100 expected_workers=0 ;; + esac + + test_expect_success CASE_INSENSITIVE_FS "collision detection on $mode clone" ' + git_pc $workers $threshold $expected_workers \ + clone --branch=collisions . $mode 2>$mode.stderr && + + grep FILE_X $mode.stderr && + grep FILE_x $mode.stderr && + grep file_X $mode.stderr && + grep file_x $mode.stderr && + test_i18ngrep "the following paths have collided" $mode.stderr + ' + + # The following test ensures that the collision detection code is + # correctly looking for colliding peers in the second half of the + # cache_entry array. This is done by defining a smudge command for the + # *last* array entry, which makes it non-eligible for parallel-checkout. + # The last entry is then checked out *before* any worker is spawned, + # making it succeed and the workers' entries collide. + # + # Note: this test don't work on Windows because, on this system, + # collision detection uses strcmp() when core.ignoreCase=false. 
And we + # have to set core.ignoreCase=false so that only 'file_x' matches the + # pattern of the filter attribute. But it works on OSX, where collision + # detection uses inode. + # + test_expect_success CASE_INSENSITIVE_FS,!MINGW,!CYGWIN "collision detection on $mode clone w/ filter" ' + git_pc $workers $threshold $expected_workers \ + -c core.ignoreCase=false \ + -c filter.logger.smudge="\"$TEST_ROOT/logger_script\" %f" \ + clone --branch=collisions . ${mode}_with_filter \ + 2>${mode}_with_filter.stderr && + + grep FILE_X ${mode}_with_filter.stderr && + grep FILE_x ${mode}_with_filter.stderr && + grep file_X ${mode}_with_filter.stderr && + grep file_x ${mode}_with_filter.stderr && + test_i18ngrep "the following paths have collided" ${mode}_with_filter.stderr && + + # Make sure only "file_x" was filtered + test_path_is_file ${mode}_with_filter/filter.log && + echo file_x >expected.filter.log && + test_cmp ${mode}_with_filter/filter.log expected.filter.log + ' +done + +test_done From patchwork Thu Oct 29 02:14:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865029 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id A9719C55179 for ; Thu, 29 Oct 2020 02:16:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4130720738 for ; Thu, 29 Oct 2020 02:16:42 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) 
header.d=usp.br header.i=@usp.br header.b="JyZGBtvx" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391065AbgJ2CQk (ORCPT ); Wed, 28 Oct 2020 22:16:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35662 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388367AbgJ2CQe (ORCPT ); Wed, 28 Oct 2020 22:16:34 -0400 Received: from mail-qv1-xf2e.google.com (mail-qv1-xf2e.google.com [IPv6:2607:f8b0:4864:20::f2e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA7ABC0613CF for ; Wed, 28 Oct 2020 19:16:33 -0700 (PDT) Received: by mail-qv1-xf2e.google.com with SMTP id 63so766559qva.7 for ; Wed, 28 Oct 2020 19:16:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp.br; s=usp-google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=5lbaPV17AcHOGfhR/neKqPzAHtfoFiFxXZYHRMTvLD8=; b=JyZGBtvx/zcwoEYwgCSW9uRwIDGtxYaSB49C4dk1eXVyPWc/pPzXzP9bDqHTUAu8LF R9v3yIHY9XA24l4BbWNEce/BwP/HqcYg9AQIzhWrj0KdXbJvQY5oOO5sJYHyebLBAys4 1yYZmqLX2GcuH6gj8w9eiqsgnz5j0aYoIMbuvteOo5YarX5Jlxv9kH094NGK3Zsg2sDY WLX14JttYzNJATi73JzfpHET5WIAa4bHQtD0ISfh1IUtkjtYDlvbvltSPsG8M9Mt0M6x JoWFyRB/4gM3beAwNW9g+2xyGzoC6cylRG25VZnV1qoRhXaRdq8RIRyBCO30PsNdobRW qSbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=5lbaPV17AcHOGfhR/neKqPzAHtfoFiFxXZYHRMTvLD8=; b=qBLPHLhN46TDC61tyDlHoDIIcKYQUO8C5Mf1beYA5x+P6eRpY95Pk3T7cPvPtewWWk 408/vK1122lHx0rhLVGhCmJuLBvu3MoAtKuHNATIIwEssJKnX5TLwK+olbsWjllBy8eh xTNeT4A/T1zBGNQeD/DsdX93rTLtQVUh/4BW/EYdojpgpAjbpeE8TKNeu1e/K0lCWsZ9 jJk3BBvpjzLO2Kqw42KHoy7T+TEdqsnMy2BAbU6n/n5WodqQ/L7CBFVwLWQgaUyKZwqZ DPnIkCao2jmOFxClWshUjKsdCp/u7Rk4n2OsWPs+lr4RVlPkDzXnmia9a+/90LK/zN8S RZpg== X-Gm-Message-State: AOAM533+Lsoqbn7iE7AmtcsXbBzCMoLy34i9pE7/yFZ+AVxeono1eyvE 
T6EXf81vHLQio1QxcrW9j/eIzYNNbeiKKQ== X-Google-Smtp-Source: ABdhPJz03mh9iwKBnlnbw/uSRaQgLvx3GelwSk15NcKBdVfyEbg/+FhqGdn/WWvKR6KrNivhpjZ31Q== X-Received: by 2002:a0c:f442:: with SMTP id h2mr2374716qvm.55.1603937792532; Wed, 28 Oct 2020 19:16:32 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.118.149]) by smtp.gmail.com with ESMTPSA id n201sm608371qka.32.2020.10.28.19.16.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Oct 2020 19:16:31 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com Subject: [PATCH v3 18/19] parallel-checkout: add tests related to .gitattributes Date: Wed, 28 Oct 2020 23:14:55 -0300 Message-Id: X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org Add tests to confirm that `struct conv_attrs` data is correctly passed from the main process to the workers, and that they properly smudge files before writing to the working tree. Also check that non-parallel-eligible entries, such as regular files that require external filters, are correctly smudged and written when parallel-checkout is enabled. Note: to avoid repeating code, some helper functions are extracted from t0028 into a common lib file.
Original-patch-by: Jeff Hostetler Signed-off-by: Jeff Hostetler Signed-off-by: Matheus Tavares --- t/lib-encoding.sh | 25 ++++ t/t0028-working-tree-encoding.sh | 25 +--- t/t2082-parallel-checkout-attributes.sh | 174 ++++++++++++++++++++++++ 3 files changed, 200 insertions(+), 24 deletions(-) create mode 100644 t/lib-encoding.sh create mode 100755 t/t2082-parallel-checkout-attributes.sh diff --git a/t/lib-encoding.sh b/t/lib-encoding.sh new file mode 100644 index 0000000000..c52ffbbed5 --- /dev/null +++ b/t/lib-encoding.sh @@ -0,0 +1,25 @@ +# Encoding helpers used by t0028 and t2082 + +test_lazy_prereq NO_UTF16_BOM ' + test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6 +' + +test_lazy_prereq NO_UTF32_BOM ' + test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12 +' + +write_utf16 () { + if test_have_prereq NO_UTF16_BOM + then + printf '\376\377' + fi && + iconv -f UTF-8 -t UTF-16 +} + +write_utf32 () { + if test_have_prereq NO_UTF32_BOM + then + printf '\0\0\376\377' + fi && + iconv -f UTF-8 -t UTF-32 +} diff --git a/t/t0028-working-tree-encoding.sh b/t/t0028-working-tree-encoding.sh index bfc4fb9af5..4fffc3a639 100755 --- a/t/t0028-working-tree-encoding.sh +++ b/t/t0028-working-tree-encoding.sh @@ -3,33 +3,10 @@ test_description='working-tree-encoding conversion via gitattributes' . ./test-lib.sh +. 
"$TEST_DIRECTORY/lib-encoding.sh" GIT_TRACE_WORKING_TREE_ENCODING=1 && export GIT_TRACE_WORKING_TREE_ENCODING -test_lazy_prereq NO_UTF16_BOM ' - test $(printf abc | iconv -f UTF-8 -t UTF-16 | wc -c) = 6 -' - -test_lazy_prereq NO_UTF32_BOM ' - test $(printf abc | iconv -f UTF-8 -t UTF-32 | wc -c) = 12 -' - -write_utf16 () { - if test_have_prereq NO_UTF16_BOM - then - printf '\376\377' - fi && - iconv -f UTF-8 -t UTF-16 -} - -write_utf32 () { - if test_have_prereq NO_UTF32_BOM - then - printf '\0\0\376\377' - fi && - iconv -f UTF-8 -t UTF-32 -} - test_expect_success 'setup test files' ' git config core.eol lf && diff --git a/t/t2082-parallel-checkout-attributes.sh b/t/t2082-parallel-checkout-attributes.sh new file mode 100755 index 0000000000..6800574588 --- /dev/null +++ b/t/t2082-parallel-checkout-attributes.sh @@ -0,0 +1,174 @@ +#!/bin/sh + +test_description='parallel-checkout: attributes + +Verify that parallel-checkout correctly creates files that require +conversions, as specified in .gitattributes. The main point here is +to check that the conv_attr data is correctly sent to the workers +and that it contains sufficient information to smudge files +properly (without access to the index or attribute stack). +' + +TEST_NO_CREATE_REPO=1 +. ./test-lib.sh +. "$TEST_DIRECTORY/lib-parallel-checkout.sh" +. 
"$TEST_DIRECTORY/lib-encoding.sh" + +test_expect_success 'parallel-checkout with ident' ' + git init ident && + ( + cd ident && + echo "A ident" >.gitattributes && + echo "\$Id\$" >A && + echo "\$Id\$" >B && + git add -A && + git commit -m id && + + rm A B && + git_pc 2 0 2 reset --hard && + hexsz=$(test_oid hexsz) && + grep -E "\\\$Id: [0-9a-f]{$hexsz} \\\$" A && + grep "\\\$Id\\\$" B + ) +' + +test_expect_success 'parallel-checkout with re-encoding' ' + git init encoding && + ( + cd encoding && + echo text >utf8-text && + cat utf8-text | write_utf16 >utf16-text && + + echo "A working-tree-encoding=UTF-16" >.gitattributes && + cp utf16-text A && + cp utf16-text B && + git add A B .gitattributes && + git commit -m encoding && + + # Check that A (and only A) is stored in UTF-8 + git cat-file -p :A >A.internal && + test_cmp_bin utf8-text A.internal && + git cat-file -p :B >B.internal && + test_cmp_bin utf16-text B.internal && + + # Check that A is re-encoded during checkout + rm A B && + git_pc 2 0 2 checkout A B && + test_cmp_bin utf16-text A + ) +' + +test_expect_success 'parallel-checkout with eol conversions' ' + git init eol && + ( + cd eol && + git config core.autocrlf false && + printf "multi\r\nline\r\ntext" >crlf-text && + printf "multi\nline\ntext" >lf-text && + + echo "A text eol=crlf" >.gitattributes && + echo "B -text" >>.gitattributes && + cp crlf-text A && + cp crlf-text B && + git add A B .gitattributes && + git commit -m eol && + + # Check that A (and only A) is stored with LF format + git cat-file -p :A >A.internal && + test_cmp_bin lf-text A.internal && + git cat-file -p :B >B.internal && + test_cmp_bin crlf-text B.internal && + + # Check that A is converted to CRLF during checkout + rm A B && + git_pc 2 0 2 checkout A B && + test_cmp_bin crlf-text A + ) +' + +test_cmp_str() +{ + echo "$1" >tmp && + test_cmp tmp "$2" +} + +# Entries that require an external filter are not eligible for parallel +# checkout. 
Check that both the parallel-eligible and non-eligible entries are +# properly written in a single checkout process. +# +test_expect_success 'parallel-checkout and external filter' ' + git init filter && + ( + cd filter && + git config filter.x2y.clean "tr x y" && + git config filter.x2y.smudge "tr y x" && + git config filter.x2y.required true && + + echo "A filter=x2y" >.gitattributes && + echo x >A && + echo x >B && + echo x >C && + git add -A && + git commit -m filter && + + # Check that A (and only A) was cleaned + git cat-file -p :A >A.internal && + test_cmp_str y A.internal && + git cat-file -p :B >B.internal && + test_cmp_str x B.internal && + git cat-file -p :C >C.internal && + test_cmp_str x C.internal && + + rm A B C *.internal && + git_pc 2 0 2 checkout A B C && + test_cmp_str x A && + test_cmp_str x B && + test_cmp_str x C + ) +' + +# The delayed queue is independent from the parallel queue, and they should be +# able to work together in the same checkout process. +# +test_expect_success PERL 'parallel-checkout and delayed checkout' ' + write_script rot13-filter.pl "$PERL_PATH" \ + <"$TEST_DIRECTORY"/t0021/rot13-filter.pl && + test_config_global filter.delay.process \ + "\"$(pwd)/rot13-filter.pl\" \"$(pwd)/delayed.log\" clean smudge delay" && + test_config_global filter.delay.required true && + + echo "a b c" >delay-content && + echo "n o p" >delay-rot13-content && + + git init delayed && + ( + cd delayed && + echo "*.a filter=delay" >.gitattributes && + cp ../delay-content test-delay10.a && + cp ../delay-content test-delay11.a && + echo parallel >parallel1.b && + echo parallel >parallel2.b && + git add -A && + git commit -m delayed && + + # Check that the stored data was cleaned + git cat-file -p :test-delay10.a > delay10.internal && + test_cmp delay10.internal ../delay-rot13-content && + git cat-file -p :test-delay11.a > delay11.internal && + test_cmp delay11.internal ../delay-rot13-content && + rm *.internal && + + rm *.a *.b + ) && + + git_pc 2 0 2 
-C delayed checkout -f && + verify_checkout delayed && + + # Check that the *.a files got to the delay queue and were filtered + grep "smudge test-delay10.a .* \[DELAYED\]" delayed.log && + grep "smudge test-delay11.a .* \[DELAYED\]" delayed.log && + test_cmp delayed/test-delay10.a delay-content && + test_cmp delayed/test-delay11.a delay-content +' + +test_done From patchwork Thu Oct 29 02:14:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Matheus Tavares X-Patchwork-Id: 11865013 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_PATCH, MAILING_LIST_MULTI,SIGNED_OFF_BY,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED, USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DE40AC4363A for ; Thu, 29 Oct 2020 02:16:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0C3AF20738 for ; Thu, 29 Oct 2020 02:16:41 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=usp.br header.i=@usp.br header.b="dT8zKcKp" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2391063AbgJ2CQj (ORCPT ); Wed, 28 Oct 2020 22:16:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35674 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2389009AbgJ2CQi (ORCPT ); Wed, 28 Oct 2020 22:16:38 -0400 Received: from mail-qt1-x835.google.com (mail-qt1-x835.google.com [IPv6:2607:f8b0:4864:20::835]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 113EAC0613CF for ; Wed, 28 Oct 2020 19:16:38 -0700 (PDT) Received: by 
mail-qt1-x835.google.com with SMTP id s39so1001018qtb.2 for ; Wed, 28 Oct 2020 19:16:38 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=usp.br; s=usp-google; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=CcfJ3MakZ1gPCKrlP6Fji0VV2INYQGopaTF/QWlRaGE=; b=dT8zKcKp5HpmgGTIukyBxTRnMoTR6driRr0y9s11ZNqoIDzk/B2Uj5wmE+X0HegccY maXJUYdBDkoHkyivcFlUydbOOWn6g0NPyGmGS9D1AmFtdQIs0o2671wwZz7l6vjV+YS1 duw3SOiQrHv+k1OaQ1sZ4VQMN1nGsMu3E63iww/tECxWz2DM0+QFWJEey6Jtyt94FuJq X2EMBoGDVG+RjEeveXljDhtSDFH/4U+O72jo0V6qE+m4765zMkMpuE4aTFHiOUC/tQIo ebdXZPZjbr919NFWb3qo/jypfLHRuxOBKlfK7TrDoayafFzxyqEYnMOfZRufXri9DXn8 ZJCQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=CcfJ3MakZ1gPCKrlP6Fji0VV2INYQGopaTF/QWlRaGE=; b=FnuPA2DWPB23++wdWq0bF3BysR2hh94ecmWYB56JfyI88oUihcTXwK4J9/thjaVXH1 lgXlK6aETVIoWNSEqLDpAApN+DYQKb6oo2sSGzhtuieER6RWRI7X4KKYFgf5gTzRv8hG BY+VEFVyaC1iFBDkAAnz+Vy2MVEGjDN6WUSVVZzQ+cey/1n5QRC+ufV48Xc+RyMQ3lyq lp8AQ025n5xNclfiwsurZQTqdzms+SyAH9/FHN6LJPX0QHTiYOLeEmZyYdKiD2z7+Zi9 wXgRqfC9UAv/Q6QoHkjzuC3kfELCHZFAVgM4trpWOhNueMab6EGDUp9/M44cKTE7otJv Gt5g== X-Gm-Message-State: AOAM533MBZPfKlAvB50R0gvJ+26aXqPNGondjepYixlBpVq+pNM489/6 hqiCaauYnDya7FeYX+eBA8GfJqJSo0bjHg== X-Google-Smtp-Source: ABdhPJyjj++tE+BAJRV8n+Kvsa0DUt5w7atqn6bPr4uUk5zdUyh0NVA64aUf+VRF2TzaTIGHA16cEQ== X-Received: by 2002:ac8:1c39:: with SMTP id a54mr1768177qtk.344.1603937796803; Wed, 28 Oct 2020 19:16:36 -0700 (PDT) Received: from mango.meuintelbras.local ([177.32.118.149]) by smtp.gmail.com with ESMTPSA id n201sm608371qka.32.2020.10.28.19.16.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 28 Oct 2020 19:16:36 -0700 (PDT) From: Matheus Tavares To: git@vger.kernel.org Cc: gitster@pobox.com, git@jeffhostetler.com, chriscool@tuxfamily.org, peff@peff.net, 
newren@gmail.com, jrnieder@gmail.com, martin.agren@gmail.com Subject: [PATCH v3 19/19] ci: run test round with parallel-checkout enabled Date: Wed, 28 Oct 2020 23:14:56 -0300 Message-Id: <641c61f9b65ee99be1f8ed3031b53ece90ec187c.1603937110.git.matheus.bernardino@usp.br> X-Mailer: git-send-email 2.28.0 In-Reply-To: References: MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org We already have tests for the basic parallel-checkout operations. But this code can also run in other commands, such as git-read-tree and git-sparse-checkout, which are currently not tested with multiple workers. To promote a wider test coverage without duplicating tests: 1. Add the GIT_TEST_CHECKOUT_WORKERS environment variable, to optionally force parallel-checkout execution during the whole test suite. 2. Include this variable in the second test round of the linux-gcc job of our ci scripts. This round runs `make test` again with some optional GIT_TEST_* variables enabled, so there is no additional overhead in exercising the parallel-checkout code here. Note: the specific parallel-checkout tests t208* cannot be used in combination with GIT_TEST_CHECKOUT_WORKERS as they need to set and check the number of workers by themselves. So skip those tests when this flag is set. 
Signed-off-by: Matheus Tavares --- ci/run-build-and-tests.sh | 1 + parallel-checkout.c | 14 ++++++++++++++ t/README | 4 ++++ t/lib-parallel-checkout.sh | 6 ++++++ 4 files changed, 25 insertions(+) diff --git a/ci/run-build-and-tests.sh b/ci/run-build-and-tests.sh index 6c27b886b8..aa32ddc361 100755 --- a/ci/run-build-and-tests.sh +++ b/ci/run-build-and-tests.sh @@ -22,6 +22,7 @@ linux-gcc) export GIT_TEST_COMMIT_GRAPH_CHANGED_PATHS=1 export GIT_TEST_MULTI_PACK_INDEX=1 export GIT_TEST_ADD_I_USE_BUILTIN=1 + export GIT_TEST_CHECKOUT_WORKERS=2 make test ;; linux-clang) diff --git a/parallel-checkout.c b/parallel-checkout.c index c5c449d224..7482447f2d 100644 --- a/parallel-checkout.c +++ b/parallel-checkout.c @@ -32,6 +32,20 @@ enum pc_status parallel_checkout_status(void) void get_parallel_checkout_configs(int *num_workers, int *threshold) { + char *env_workers = getenv("GIT_TEST_CHECKOUT_WORKERS"); + + if (env_workers && *env_workers) { + if (strtol_i(env_workers, 10, num_workers)) { + die("invalid value for GIT_TEST_CHECKOUT_WORKERS: '%s'", + env_workers); + } + if (*num_workers < 1) + *num_workers = online_cpus(); + + *threshold = 0; + return; + } + if (git_config_get_int("checkout.workers", num_workers)) *num_workers = 1; else if (*num_workers < 1) diff --git a/t/README b/t/README index 2adaf7c2d2..cd1b15c55a 100644 --- a/t/README +++ b/t/README @@ -425,6 +425,10 @@ GIT_TEST_DEFAULT_HASH=<hash-algo> specifies which hash algorithm to use in the test scripts. Recognized values for <hash-algo> are "sha1" and "sha256". +GIT_TEST_CHECKOUT_WORKERS=<n> overrides the 'checkout.workers' setting +to <n> and 'checkout.thresholdForParallelism' to 0, forcing the +execution of the parallel-checkout code. + Naming Tests ------------ diff --git a/t/lib-parallel-checkout.sh b/t/lib-parallel-checkout.sh index e62a433eb1..7b454da375 100644 --- a/t/lib-parallel-checkout.sh +++ b/t/lib-parallel-checkout.sh @@ -1,5 +1,11 @@ # Helpers for t208* tests +if ! 
test -z "$GIT_TEST_CHECKOUT_WORKERS" +then + skip_all="skipping test, GIT_TEST_CHECKOUT_WORKERS is set" + test_done +fi + # Runs `git -c checkout.workers=$1 -c checkout.thesholdForParallelism=$2 ${@:4}` # and checks that the number of workers spawned is equal to $3. #