From patchwork Sun Dec 13 08:04:08 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970493
Message-Id: <518dde86966cba9aba211f933e74672fba95509a.1607846667.git.gitgitgadget@gmail.com>
Date: Sun, 13 Dec 2020 08:04:08 +0000
Subject: [PATCH v3 01/20] merge-ort: setup basic internal data structures
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren, Ævar Arnfjörð Bjarmason
From: Elijah Newren

Set up some basic internal data structures.
The only carry-over from merge-recursive.c is call_depth, though
needed_rename_limit will be added later.

The central piece of data will definitely be the strmap "paths", which
will map every relevant pathname under consideration to either a
merged_info or a conflict_info. ("conflicted" is a strmap that is a
subset of "paths".)

merged_info contains all relevant information for a non-conflicted
entry. conflict_info contains a merged_info, plus any additional
information about a conflict such as the higher order stages involved
and the names of the paths those came from (handy once renames get
involved). If an entry remains conflicted, the merged_info portion of a
conflict_info will later be filled with whatever version of the file
should be placed in the working directory (e.g. an as-merged-as-possible
variation that contains conflict markers).

Signed-off-by: Elijah Newren
---
 merge-ort.c | 147 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 147 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index b487901d3ec..3325c9c0a2c 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,6 +17,153 @@
 #include "cache.h"
 #include "merge-ort.h"
+#include "strmap.h"
+
+struct merge_options_internal {
+	/*
+	 * paths: primary data structure in all of merge ort.
+	 *
+	 * The keys of paths:
+	 *   * are full relative paths from the toplevel of the repository
+	 *     (e.g. "drivers/firmware/raspberrypi.c").
+	 *   * store all relevant paths in the repo, both directories and
+	 *     files (e.g. drivers, drivers/firmware would also be included)
+	 *   * these keys serve to intern all the path strings, which allows
+	 *     us to do pointer comparison on directory names instead of
+	 *     strcmp; we just have to be careful to use the interned strings.
+	 *
+	 * The values of paths:
+	 *   * either a pointer to a merged_info, or a conflict_info struct
+	 *   * merged_info contains all relevant information for a
+	 *     non-conflicted entry.
+	 *   * conflict_info contains a merged_info, plus any additional
+	 *     information about a conflict such as the higher order stages
+	 *     involved and the names of the paths those came from (handy
+	 *     once renames get involved).
+	 *   * a path may start "conflicted" (i.e. point to a conflict_info)
+	 *     and then a later step (e.g. three-way content merge) determines
+	 *     it can be cleanly merged, at which point it'll be marked clean
+	 *     and the algorithm will ignore any data outside the contained
+	 *     merged_info for that entry
+	 *   * If an entry remains conflicted, the merged_info portion of a
+	 *     conflict_info will later be filled with whatever version of
+	 *     the file should be placed in the working directory (e.g. an
+	 *     as-merged-as-possible variation that contains conflict markers).
+	 */
+	struct strmap paths;
+
+	/*
+	 * conflicted: a subset of keys->values from "paths"
+	 *
+	 * conflicted is basically an optimization between process_entries()
+	 * and record_conflicted_index_entries(); the latter could loop over
+	 * ALL the entries in paths AGAIN and look for the ones that are
+	 * still conflicted, but since process_entries() has to loop over
+	 * all of them, it saves the ones it couldn't resolve in this strmap
+	 * so that record_conflicted_index_entries() can iterate just the
+	 * relevant entries.
+	 */
+	struct strmap conflicted;
+
+	/*
+	 * current_dir_name: temporary var used in collect_merge_info_callback()
+	 *
+	 * Used to set merged_info.directory_name; see documentation for that
+	 * variable and the requirements placed on that field.
+	 */
+	const char *current_dir_name;
+
+	/* call_depth: recursion level counter for merging merge bases */
+	int call_depth;
+};
+
+struct version_info {
+	struct object_id oid;
+	unsigned short mode;
+};
+
+struct merged_info {
+	/* if is_null, ignore result. otherwise result has oid & mode */
+	struct version_info result;
+	unsigned is_null:1;
+
+	/*
+	 * clean: whether the path in question is cleanly merged.
+	 *
+	 * see conflict_info.merged for more details.
+	 */
+	unsigned clean:1;
+
+	/*
+	 * basename_offset: offset of basename of path.
+	 *
+	 * perf optimization to avoid recomputing offset of final '/'
+	 * character in pathname (0 if no '/' in pathname).
+	 */
+	size_t basename_offset;
+
+	/*
+	 * directory_name: containing directory name.
+	 *
+	 * Note that we assume directory_name is constructed such that
+	 *    strcmp(dir1_name, dir2_name) == 0 iff dir1_name == dir2_name,
+	 * i.e. string equality is equivalent to pointer equality. For this
+	 * to hold, we have to be careful setting directory_name.
+	 */
+	const char *directory_name;
+};
+
+struct conflict_info {
+	/*
+	 * merged: the version of the path that will be written to working tree
+	 *
+	 * WARNING: It is critical to check merged.clean and ensure it is 0
+	 * before reading any conflict_info fields outside of merged.
+	 * Allocated merged_info structs will always have clean set to 1.
+	 * Allocated conflict_info structs will have merged.clean set to 0
+	 * initially. The merged.clean field is how we know if it is safe
+	 * to access other parts of conflict_info besides merged; if a
+	 * conflict_info's merged.clean is changed to 1, the rest of the
+	 * algorithm is not allowed to look at anything outside of the
+	 * merged member anymore.
+	 */
+	struct merged_info merged;
+
+	/* oids & modes from each of the three trees for this path */
+	struct version_info stages[3];
+
+	/* pathnames for each stage; may differ due to rename detection */
+	const char *pathnames[3];
+
+	/* Whether this path is/was involved in a directory/file conflict */
+	unsigned df_conflict:1;
+
+	/*
+	 * For filemask and dirmask, the ith bit corresponds to whether the
+	 * ith entry is a file (filemask) or a directory (dirmask). Thus,
+	 * filemask & dirmask is always zero, and filemask | dirmask is at
+	 * most 7 but can be less when a path does not appear as either a
+	 * file or a directory on at least one side of history.
+	 *
+	 * Note that these masks are related to enum merge_side, as the ith
+	 * entry corresponds to side i.
+	 *
+	 * These values come from a traverse_trees() call; more info may be
+	 * found looking at tree-walk.h's struct traverse_info,
+	 * particularly the documentation above the "fn" member (note that
+	 * filemask = mask & ~dirmask from that documentation).
+	 */
+	unsigned filemask:3;
+	unsigned dirmask:3;
+
+	/*
+	 * Optimization to track which stages match, to avoid the need to
+	 * recompute it in multiple steps. Either 0 or at least 2 bits are
+	 * set; if at least 2 bits are set, their corresponding stages match.
+	 */
+	unsigned match_mask:3;
+};
+
 void merge_switch_to_result(struct merge_options *opt,
 			    struct tree *head,
 			    struct merge_result *result,

From patchwork Sun Dec 13 08:04:09 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970501
Message-Id: <5827ec7f3ebf7f333c15a64aea6d25ce596bf0cf.1607846667.git.gitgitgadget@gmail.com>
Date: Sun, 13 Dec 2020 08:04:09 +0000
Subject: [PATCH v3 02/20] merge-ort: add some high-level algorithm structure
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren, Ævar Arnfjörð Bjarmason
From: Elijah Newren

merge_ort_nonrecursive_internal() will be used by both
merge_incore_nonrecursive() and merge_incore_recursive(); let's focus
on it for now. It involves some setup -- merge_start() -- followed by
the following chain of functions:

  collect_merge_info()
    This function will populate merge_options_internal's paths field,
    via a call to traverse_trees() and a new callback that will be added
    later.

  detect_and_process_renames()
    This function will detect renames, and then adjust entries in paths
    to move conflict stages from old pathnames into those for new
    pathnames, so that the next step doesn't have to think about renames
    and can just do three-way content merging and such.

  process_entries()
    This function determines how to take the various stages (versions of
    a file from the three different sides) and merge them, and whether
    to mark the result as conflicted or cleanly merged. It also writes
    out these merged file versions as it goes to create a tree.
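
As a rough illustration of the chain described above (a toy sketch, not
the merge-ort code): every name below is hypothetical, one int per
stage stands in for an object id, the rename step is a no-op, and the
last step applies only the trivial three-way rule.

```c
#include <assert.h>
#include <stddef.h>

/* Toy path entry; all names here are hypothetical, not from merge-ort.c. */
struct toy_entry {
	int stages[3];	/* 0 = merge base, 1 = side1, 2 = side2 */
	int result;	/* filled in by toy_process_entries() */
	int clean;	/* 1 if resolved without a conflict */
};

/* Step 1 stand-in: only validates input (real code would walk trees). */
static int toy_collect_merge_info(const struct toy_entry *entries, size_t n)
{
	return (entries || !n) ? 0 : -1;
}

/* Step 2 stand-in: no rename detection, so nothing moves; always clean. */
static int toy_detect_and_process_renames(struct toy_entry *entries, size_t n)
{
	(void)entries;
	(void)n;
	return 1;
}

/* Step 3 stand-in: trivial three-way rule; returns the conflict count. */
static size_t toy_process_entries(struct toy_entry *entries, size_t n)
{
	size_t i, conflicts = 0;

	assert(entries || !n);
	for (i = 0; i < n; i++) {
		struct toy_entry *e = &entries[i];
		int base = e->stages[0], s1 = e->stages[1], s2 = e->stages[2];

		e->clean = 1;
		if (s1 == s2) {
			e->result = s1;		/* both sides agree */
		} else if (s1 == base) {
			e->result = s2;		/* only side2 changed */
		} else if (s2 == base) {
			e->result = s1;		/* only side1 changed */
		} else {
			e->result = s1;		/* placeholder content */
			e->clean = 0;
			conflicts++;
		}
	}
	return conflicts;
}

/* Chain the steps: 1 if clean, 0 if conflicted, -1 on bad input. */
static int toy_merge(struct toy_entry *entries, size_t n)
{
	int clean;

	if (toy_collect_merge_info(entries, n) < 0)
		return -1;
	clean = toy_detect_and_process_renames(entries, n);
	/* any unresolved entry makes the whole merge unclean */
	clean &= (toy_process_entries(entries, n) == 0);
	return clean;
}
```

With stages {1, 1, 2} the toy resolves to 2 because only side2 changed;
with {1, 2, 3} all three differ, so the entry is marked unclean and
toy_merge() returns 0.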

Signed-off-by: Elijah Newren
---
 merge-ort.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 67 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 3325c9c0a2c..d0abee9b6ab 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -18,6 +18,7 @@
 #include "merge-ort.h"
 #include "strmap.h"
+#include "tree.h"
 
 struct merge_options_internal {
 	/*
@@ -164,6 +165,38 @@ struct conflict_info {
 	unsigned match_mask:3;
 };
 
+static int collect_merge_info(struct merge_options *opt,
+			      struct tree *merge_base,
+			      struct tree *side1,
+			      struct tree *side2)
+{
+	/* TODO: Implement this using traverse_trees() */
+	die("Not yet implemented.");
+}
+
+static int detect_and_process_renames(struct merge_options *opt,
+				      struct tree *merge_base,
+				      struct tree *side1,
+				      struct tree *side2)
+{
+	int clean = 1;
+
+	/*
+	 * Rename detection works by detecting file similarity. Here we use
+	 * a really easy-to-implement scheme: files are similar IFF they have
+	 * the same filename. Therefore, by this scheme, there are no renames.
+	 *
+	 * TODO: Actually implement a real rename detection scheme.
+	 */
+	return clean;
+}
+
+static void process_entries(struct merge_options *opt,
+			    struct object_id *result_oid)
+{
+	die("Not yet implemented.");
+}
+
 void merge_switch_to_result(struct merge_options *opt,
 			    struct tree *head,
 			    struct merge_result *result,
@@ -180,13 +213,46 @@ void merge_finalize(struct merge_options *opt,
 	die("Not yet implemented");
 }
 
+static void merge_start(struct merge_options *opt, struct merge_result *result)
+{
+	die("Not yet implemented.");
+}
+
+/*
+ * Originally from merge_trees_internal(); heavily adapted, though.
+ */
+static void merge_ort_nonrecursive_internal(struct merge_options *opt,
+					    struct tree *merge_base,
+					    struct tree *side1,
+					    struct tree *side2,
+					    struct merge_result *result)
+{
+	struct object_id working_tree_oid;
+
+	collect_merge_info(opt, merge_base, side1, side2);
+	result->clean = detect_and_process_renames(opt, merge_base,
+						   side1, side2);
+	process_entries(opt, &working_tree_oid);
+
+	/* Set return values */
+	result->tree = parse_tree_indirect(&working_tree_oid);
+	/* existence of conflicted entries implies unclean */
+	result->clean &= strmap_empty(&opt->priv->conflicted);
+	if (!opt->priv->call_depth) {
+		result->priv = opt->priv;
+		opt->priv = NULL;
+	}
+}
+
 void merge_incore_nonrecursive(struct merge_options *opt,
 			       struct tree *merge_base,
 			       struct tree *side1,
 			       struct tree *side2,
 			       struct merge_result *result)
 {
-	die("Not yet implemented");
+	assert(opt->ancestor != NULL);
+	merge_start(opt, result);
+	merge_ort_nonrecursive_internal(opt, merge_base, side1, side2, result);
 }
 
 void merge_incore_recursive(struct merge_options *opt,

From patchwork Sun Dec 13 08:04:10 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970499
Message-Id: <8295591ee13a500502cb35cc0fa1b557d4696ef6.1607846667.git.gitgitgadget@gmail.com>
Date: Sun, 13 Dec 2020 08:04:10 +0000
Subject: [PATCH v3 03/20] merge-ort: port merge_start() from merge-recursive
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren, Ævar Arnfjörð Bjarmason
From: Elijah Newren

merge_start() basically does a bunch of sanity checks, then allocates
and initializes opt->priv -- a struct merge_options_internal.

Most of the sanity checks are usable as-is. The
allocation/initialization is a bit different since merge-ort has a very
different merge_options_internal than merge-recursive, but the idea is
the same.

The weirdest part here is that merge-ort and merge-recursive use the
same struct merge_options, even though merge_options has a number of
fields that are oddly specific to merge-recursive's internal
implementation and don't even make sense with merge-ort's high-level
design (e.g. buffer_output, which merge-ort has to always do). I reused
the same data structure because:
  * most of the fields made sense to both merge algorithms
  * making a new struct would have required making new enums or somehow
    externalizing them, and that was getting messy.
  * it simplifies converting the existing callers by not having to
    have different code paths for merge_options setup.
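
The validate-then-allocate shape described above can be sketched in
isolation. This is a hedged toy rather than the real function: struct
toy_options, struct toy_priv and both functions are hypothetical, and
the sketch returns -1 where merge_start() would assert(), purely so
that the failure paths can be exercised.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical backend-private state, allocated by toy_merge_start(). */
struct toy_priv {
	int call_depth;
};

/* Hypothetical options struct shared by more than one merge backend. */
struct toy_options {
	const char *branch1;
	const char *branch2;
	int rename_limit;	/* -1 means "use the default" */
	int verbosity;		/* ignored by this backend, still range-checked */
	struct toy_priv *priv;	/* must be NULL on entry */
};

/* Check the caller-supplied fields, then allocate zeroed private state. */
static int toy_merge_start(struct toy_options *opt)
{
	if (!opt || !opt->branch1 || !opt->branch2)
		return -1;
	if (opt->rename_limit < -1)
		return -1;
	if (opt->verbosity < 0 || opt->verbosity > 5)
		return -1;
	if (opt->priv)
		return -1;	/* a previous merge was never finished */

	opt->priv = calloc(1, sizeof(*opt->priv));
	return opt->priv ? 0 : -1;
}

/* Release the private state so the options struct can be reused. */
static void toy_merge_finish(struct toy_options *opt)
{
	assert(opt);
	free(opt->priv);
	opt->priv = NULL;
}
```

Keeping the private state behind a single pointer is what lets two
backends share one options struct: each backend defines its own private
struct and the callers never look inside it.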

I also marked detect_renames as ignored. We can revisit that later, but
in short: merge-recursive allowed turning off rename detection because
it was sometimes glacially slow. When you speed something up by a few
orders of magnitude, it's worth revisiting whether that justification is
still relevant. Besides, if folks find it's still too slow, perhaps
they have a better scaling case than I could find and maybe it turns up
some more optimizations we can add. If it still is needed as an option,
it is easy to add later.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index d0abee9b6ab..fb07c8f2b30 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,6 +17,8 @@
 #include "cache.h"
 #include "merge-ort.h"
+#include "diff.h"
+#include "diffcore.h"
 #include "strmap.h"
 #include "tree.h"
 
@@ -215,7 +217,48 @@ void merge_finalize(struct merge_options *opt,
 
 static void merge_start(struct merge_options *opt, struct merge_result *result)
 {
-	die("Not yet implemented.");
+	/* Sanity checks on opt */
+	assert(opt->repo);
+
+	assert(opt->branch1 && opt->branch2);
+
+	assert(opt->detect_directory_renames >= MERGE_DIRECTORY_RENAMES_NONE &&
+	       opt->detect_directory_renames <= MERGE_DIRECTORY_RENAMES_TRUE);
+	assert(opt->rename_limit >= -1);
+	assert(opt->rename_score >= 0 && opt->rename_score <= MAX_SCORE);
+	assert(opt->show_rename_progress >= 0 && opt->show_rename_progress <= 1);
+
+	assert(opt->xdl_opts >= 0);
+	assert(opt->recursive_variant >= MERGE_VARIANT_NORMAL &&
+	       opt->recursive_variant <= MERGE_VARIANT_THEIRS);
+
+	/*
+	 * detect_renames, verbosity, buffer_output, and obuf are ignored
+	 * fields that were used by "recursive" rather than "ort" -- but
+	 * sanity check them anyway.
+	 */
+	assert(opt->detect_renames >= -1 &&
+	       opt->detect_renames <= DIFF_DETECT_COPY);
+	assert(opt->verbosity >= 0 && opt->verbosity <= 5);
+	assert(opt->buffer_output <= 2);
+	assert(opt->obuf.len == 0);
+
+	assert(opt->priv == NULL);
+
+	/* Initialization of opt->priv, our internal merge data */
+	opt->priv = xcalloc(1, sizeof(*opt->priv));
+
+	/*
+	 * Although we initialize opt->priv->paths with strdup_strings=0,
+	 * that's just to avoid making yet another copy of an allocated
+	 * string. Putting the entry into paths means we are taking
+	 * ownership, so we will later free it.
+	 *
+	 * In contrast, conflicted just has a subset of keys from paths, so
+	 * we don't want to free those (it'd be a duplicate free).
+	 */
+	strmap_init_with_options(&opt->priv->paths, NULL, 0);
+	strmap_init_with_options(&opt->priv->conflicted, NULL, 0);
 }
 
 /*

From patchwork Sun Dec 13 08:04:11 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970497
Message-Id: <38b4f9cf78c885e958158a3960dd74715dc22d97.1607846667.git.gitgitgadget@gmail.com>
Date: Sun, 13 Dec 2020 08:04:11 +0000
Subject: [PATCH v3 04/20] merge-ort: use histogram diff
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren, Ævar Arnfjörð Bjarmason
From: Elijah Newren

In my cursory investigation, histogram diffs are about 2% slower than
Myers diffs. Others have probably done more detailed benchmarks. But,
in short, histogram diffs have been around for years and in a number of
cases provide obviously better looking diffs where Myers diffs are
unintelligible, but the performance hit has kept them from becoming the
default.

However, there are real merge bugs we know about that have triggered on
git.git and linux.git, which I don't have a clue how to address without
the additional information that I believe is provided by histogram
diffs. See the following:

  https://lore.kernel.org/git/20190816184051.GB13894@sigill.intra.peff.net/
  https://lore.kernel.org/git/CABPp-BHvJHpSJT7sdFwfNcPn_sOXwJi3=o14qjZS3M8Rzcxe2A@mail.gmail.com/
  https://lore.kernel.org/git/CABPp-BGtez4qjbtFT1hQoREfcJPmk9MzjhY5eEq1QhXT23tFOw@mail.gmail.com/

I don't like mismerges. I really don't like silent mismerges. While I
am sometimes willing to make performance and correctness tradeoffs, I'm
much more interested in correctness in general. I want to fix the above
bugs. I have not yet started doing so, but I believe histogram diff at
least gives me an angle. Unfortunately, I can't rely on using the
information from histogram diff unless it's in use.
And it hasn't been used because of a performance hit of a few percent.

In testcases I have looked at, merge-ort is _much_ faster than
merge-recursive for non-trivial merges/rebases/cherry-picks. As such,
this is a golden opportunity to switch out the underlying diff algorithm
(at least the one used by the merge machinery; git-diff and git-log are
separate questions); doing so will allow me to get additional data and
improved diffs, and I believe it will help me fix the above bugs at some
point in the future.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index fb07c8f2b30..85942cfa7c7 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -21,6 +21,7 @@
 #include "diffcore.h"
 #include "strmap.h"
 #include "tree.h"
+#include "xdiff-interface.h"
 
 struct merge_options_internal {
 	/*
@@ -245,6 +246,9 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
 
 	assert(opt->priv == NULL);
 
+	/* Default to histogram diff. Actually, just hardcode it...for now. */
+	opt->xdl_opts = DIFF_WITH_ALG(opt, HISTOGRAM_DIFF);
+
 	/* Initialization of opt->priv, our internal merge data */
 	opt->priv = xcalloc(1, sizeof(*opt->priv));
 

From patchwork Sun Dec 13 08:04:12 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970509
Message-Id: <95143bebf09aba6a6dc985b0b33f76e633761115.1607846667.git.gitgitgadget@gmail.com>
Date: Sun, 13 Dec 2020 08:04:12 +0000
Subject: [PATCH v3 05/20] merge-ort: add an err() function similar to one from merge-recursive
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren, Ævar Arnfjörð Bjarmason
From: Elijah Newren

Various places in merge-recursive used an err() function when they hit
some kind of unrecoverable error. That code was from the reusable bits
of merge-recursive.c that we liked, such as merge_3way, writing object
files to the object store, reading blobs from the object store, etc. So
create a similar function to allow us to port that code over, and use it
for when we detect problems returned from collect_merge_info()'s
traverse_trees() call, which we will be adding next.

While we are at it, also add more documentation for the "clean" field
from struct merge_result, particularly since the name suggests a boolean
but it is not quite one and this is our first non-boolean usage.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 31 +++++++++++++++++++++++++++++--
 merge-ort.h |  9 ++++++++-
 2 files changed, 37 insertions(+), 3 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 85942cfa7c7..76c0f934279 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -168,12 +168,27 @@ struct conflict_info {
 	unsigned match_mask:3;
 };
 
+static int err(struct merge_options *opt, const char *err, ...)
+{
+	va_list params;
+	struct strbuf sb = STRBUF_INIT;
+
+	strbuf_addstr(&sb, "error: ");
+	va_start(params, err);
+	strbuf_vaddf(&sb, err, params);
+	va_end(params);
+
+	error("%s", sb.buf);
+	strbuf_release(&sb);
+
+	return -1;
+}
+
 static int collect_merge_info(struct merge_options *opt,
 			      struct tree *merge_base,
 			      struct tree *side1,
 			      struct tree *side2)
 {
-	/* TODO: Implement this using traverse_trees() */
 	die("Not yet implemented.");
 }

@@ -276,7 +291,19 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
 {
 	struct object_id working_tree_oid;

-	collect_merge_info(opt, merge_base, side1, side2);
+	if (collect_merge_info(opt, merge_base, side1, side2) != 0) {
+		/*
+		 * TRANSLATORS: The %s arguments are: 1) tree hash of a merge
+		 * base, and 2-3) the trees for the two trees we're merging.
+		 */
+		err(opt, _("collecting merge info failed for trees %s, %s, %s"),
+		    oid_to_hex(&merge_base->object.oid),
+		    oid_to_hex(&side1->object.oid),
+		    oid_to_hex(&side2->object.oid));
+		result->clean = -1;
+		return;
+	}
+
 	result->clean = detect_and_process_renames(opt, merge_base,
 						   side1, side2);
 	process_entries(opt, &working_tree_oid);
diff --git a/merge-ort.h b/merge-ort.h
index 74adccad162..55ae7ee865d 100644
--- a/merge-ort.h
+++ b/merge-ort.h
@@ -7,7 +7,14 @@ struct commit;
 struct tree;

 struct merge_result {
-	/* Whether the merge is clean */
+	/*
+	 * Whether the merge is clean; possible values:
+	 *    1: clean
+	 *    0: not clean (merge conflicts)
+	 *   <0: operation aborted prematurely. (object database
+	 *       unreadable, disk full, etc.) Worktree may be left in an
+	 *       inconsistent state if operation failed near the end.
+	 */
 	int clean;

 	/*

From patchwork Sun Dec 13 08:04:13 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970505
Message-Id: <242f6462ebb10f2cad50260d2301dc800408196b.1607846667.git.gitgitgadget@gmail.com>
Date: Sun, 13 Dec 2020 08:04:13 +0000
Subject: [PATCH v3 06/20] merge-ort: implement a very basic collect_merge_info()
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren, Ævar Arnfjörð Bjarmason
From: Elijah Newren

This does not
actually collect any necessary info other than the pathnames involved,
since it just allocates an all-zero conflict_info and stuffs that into
paths. However, it invokes the traverse_trees() machinery to walk over
all the paths and sets up the basic infrastructure we need.

I have left out a few obvious optimizations to try to make this patch as
short and obvious as possible. A subsequent patch will add some of
those back in with some more useful data fields before we introduce a
patch that actually sets up the conflict_info fields.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 135 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 134 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 76c0f934279..4a2c7de6e8e 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -23,6 +23,23 @@
 #include "tree.h"
 #include "xdiff-interface.h"

+/*
+ * We have many arrays of size 3. Whenever we have such an array, the
+ * indices refer to one of the sides of the three-way merge. This is so
+ * pervasive that the constants 0, 1, and 2 are used in many places in the
+ * code (especially in arithmetic operations to find the other side's index
+ * or to compute a relevant mask), but sometimes these enum names are used
+ * to aid code clarity.
+ *
+ * See also 'filemask' and 'dirmask' in struct conflict_info; the "ith side"
+ * referred to there is one of these three sides.
+ */
+enum merge_side {
+	MERGE_BASE = 0,
+	MERGE_SIDE1 = 1,
+	MERGE_SIDE2 = 2
+};
+
 struct merge_options_internal {
 	/*
 	 * paths: primary data structure in all of merge ort.
@@ -184,12 +201,128 @@ static int err(struct merge_options *opt, const char *err, ...)
 	return -1;
 }

+static int collect_merge_info_callback(int n,
+				       unsigned long mask,
+				       unsigned long dirmask,
+				       struct name_entry *names,
+				       struct traverse_info *info)
+{
+	/*
+	 * n is 3. Always.
+	 * common ancestor (mbase) has mask 1, and stored in index 0 of names
+	 * head of side 1 (side1) has mask 2, and stored in index 1 of names
+	 * head of side 2 (side2) has mask 4, and stored in index 2 of names
+	 */
+	struct merge_options *opt = info->data;
+	struct merge_options_internal *opti = opt->priv;
+	struct conflict_info *ci;
+	struct name_entry *p;
+	size_t len;
+	char *fullpath;
+	unsigned filemask = mask & ~dirmask;
+	unsigned mbase_null = !(mask & 1);
+	unsigned side1_null = !(mask & 2);
+	unsigned side2_null = !(mask & 4);
+
+	/* n = 3 is a fundamental assumption. */
+	if (n != 3)
+		BUG("Called collect_merge_info_callback wrong");
+
+	/*
+	 * A bunch of sanity checks verifying that traverse_trees() calls
+	 * us the way I expect. Could just remove these at some point,
+	 * though maybe they are helpful to future code readers.
+	 */
+	assert(mbase_null == is_null_oid(&names[0].oid));
+	assert(side1_null == is_null_oid(&names[1].oid));
+	assert(side2_null == is_null_oid(&names[2].oid));
+	assert(!mbase_null || !side1_null || !side2_null);
+	assert(mask > 0 && mask < 8);
+
+	/*
+	 * Get the name of the relevant filepath, which we'll pass to
+	 * setup_path_info() for tracking.
+	 */
+	p = names;
+	while (!p->mode)
+		p++;
+	len = traverse_path_len(info, p->pathlen);
+
+	/* +1 in both of the following lines to include the NUL byte */
+	fullpath = xmalloc(len + 1);
+	make_traverse_path(fullpath, len + 1, info, p->path, p->pathlen);
+
+	/*
+	 * TODO: record information about the path other than all zeros,
+	 * so we can resolve later in process_entries.
+	 */
+	ci = xcalloc(1, sizeof(struct conflict_info));
+	strmap_put(&opti->paths, fullpath, ci);
+
+	/* If dirmask, recurse into subdirectories */
+	if (dirmask) {
+		struct traverse_info newinfo;
+		struct tree_desc t[3];
+		void *buf[3] = {NULL, NULL, NULL};
+		const char *original_dir_name;
+		int i, ret;
+
+		ci->match_mask &= filemask;
+		newinfo = *info;
+		newinfo.prev = info;
+		newinfo.name = p->path;
+		newinfo.namelen = p->pathlen;
+		newinfo.pathlen = st_add3(newinfo.pathlen, p->pathlen, 1);
+
+		for (i = MERGE_BASE; i <= MERGE_SIDE2; i++) {
+			const struct object_id *oid = NULL;
+			if (dirmask & 1)
+				oid = &names[i].oid;
+			buf[i] = fill_tree_descriptor(opt->repo, t + i, oid);
+			dirmask >>= 1;
+		}
+
+		original_dir_name = opti->current_dir_name;
+		opti->current_dir_name = fullpath;
+		ret = traverse_trees(NULL, 3, t, &newinfo);
+		opti->current_dir_name = original_dir_name;
+
+		for (i = MERGE_BASE; i <= MERGE_SIDE2; i++)
+			free(buf[i]);
+
+		if (ret < 0)
+			return -1;
+	}
+
+	return mask;
+}
+
 static int collect_merge_info(struct merge_options *opt,
 			      struct tree *merge_base,
 			      struct tree *side1,
 			      struct tree *side2)
 {
-	die("Not yet implemented.");
+	int ret;
+	struct tree_desc t[3];
+	struct traverse_info info;
+	const char *toplevel_dir_placeholder = "";
+
+	opt->priv->current_dir_name = toplevel_dir_placeholder;
+	setup_traverse_info(&info, toplevel_dir_placeholder);
+	info.fn = collect_merge_info_callback;
+	info.data = opt;
+	info.show_all_errors = 1;
+
+	parse_tree(merge_base);
+	parse_tree(side1);
+	parse_tree(side2);
+	init_tree_desc(t + 0, merge_base->buffer, merge_base->size);
+	init_tree_desc(t + 1, side1->buffer, side1->size);
+	init_tree_desc(t + 2, side2->buffer, side2->size);
+
+	ret = traverse_trees(NULL, 3, t, &info);
+
+	return ret;
 }

 static int detect_and_process_renames(struct merge_options *opt,

From patchwork Sun Dec 13 08:04:14 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970507
Date: Sun, 13 Dec 2020 08:04:14 +0000
Subject: [PATCH v3 07/20] merge-ort: avoid repeating fill_tree_descriptor() on the same tree
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren, Ævar Arnfjörð Bjarmason
From: Elijah Newren

Three-way merges, by their nature, will often have two or more trees
match at a given subdirectory. We can avoid calling
fill_tree_descriptor() on the same tree by checking when these trees
match.
Noting when various oids match will also be useful in other
calculations and optimizations.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index 4a2c7de6e8e..690c64fe264 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -223,6 +223,15 @@ static int collect_merge_info_callback(int n,
 	unsigned mbase_null = !(mask & 1);
 	unsigned side1_null = !(mask & 2);
 	unsigned side2_null = !(mask & 4);
+	unsigned side1_matches_mbase = (!side1_null && !mbase_null &&
+					names[0].mode == names[1].mode &&
+					oideq(&names[0].oid, &names[1].oid));
+	unsigned side2_matches_mbase = (!side2_null && !mbase_null &&
+					names[0].mode == names[2].mode &&
+					oideq(&names[0].oid, &names[2].oid));
+	unsigned sides_match = (!side1_null && !side2_null &&
+				names[1].mode == names[2].mode &&
+				oideq(&names[1].oid, &names[2].oid));

 	/* n = 3 is a fundamental assumption. */
 	if (n != 3)
@@ -275,10 +284,19 @@ static int collect_merge_info_callback(int n,
 		newinfo.pathlen = st_add3(newinfo.pathlen, p->pathlen, 1);

 		for (i = MERGE_BASE; i <= MERGE_SIDE2; i++) {
-			const struct object_id *oid = NULL;
-			if (dirmask & 1)
-				oid = &names[i].oid;
-			buf[i] = fill_tree_descriptor(opt->repo, t + i, oid);
+			if (i == 1 && side1_matches_mbase)
+				t[1] = t[0];
+			else if (i == 2 && side2_matches_mbase)
+				t[2] = t[0];
+			else if (i == 2 && sides_match)
+				t[2] = t[1];
+			else {
+				const struct object_id *oid = NULL;
+				if (dirmask & 1)
+					oid = &names[i].oid;
+				buf[i] = fill_tree_descriptor(opt->repo,
+							      t + i, oid);
+			}
 			dirmask >>= 1;
 		}


From patchwork Sun Dec 13 08:04:15 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970503
Date: Sun, 13 Dec 2020 08:04:15 +0000
Subject: [PATCH v3 08/20] merge-ort: compute a few more useful fields for collect_merge_info
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren, Ævar Arnfjörð Bjarmason
From: Elijah Newren

Signed-off-by: Elijah Newren
---
 merge-ort.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index 690c64fe264..a6876191c02 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -220,6 +220,7 @@ static int collect_merge_info_callback(int n,
 	size_t len;
 	char *fullpath;
 	unsigned filemask = mask & ~dirmask;
+	unsigned match_mask = 0; /* will be updated below */
 	unsigned mbase_null = !(mask & 1);
 	unsigned side1_null = !(mask & 2);
 	unsigned side2_null = !(mask & 4);
@@ -233,6 +234,22 @@ static int collect_merge_info_callback(int n,
 				names[1].mode == names[2].mode &&
 				oideq(&names[1].oid, &names[2].oid));

+	/*
+	 * Note: When a path is a file on one side of history and a directory
+	 * in another, we have a directory/file conflict. In such cases, if
+	 * the conflict doesn't resolve from renames and deletions, then we
+	 * always leave directories where they are and move files out of the
+	 * way. Thus, while struct conflict_info has a df_conflict field to
+	 * track such conflicts, we ignore that field for any directories at
+	 * a path and only pay attention to it for files at the given path.
+	 * The fact that we leave directories where they are also means that
+	 * we do not need to worry about getting additional df_conflict
+	 * information propagated from parent directories down to children
+	 * (unlike, say traverse_trees_recursive() in unpack-trees.c, which
+	 * sets a newinfo.df_conflicts field specifically to propagate it).
+	 */
+	unsigned df_conflict = (filemask != 0) && (dirmask != 0);
+
 	/* n = 3 is a fundamental assumption. */
 	if (n != 3)
 		BUG("Called collect_merge_info_callback wrong");
@@ -248,6 +265,14 @@ static int collect_merge_info_callback(int n,
 	assert(!mbase_null || !side1_null || !side2_null);
 	assert(mask > 0 && mask < 8);

+	/* Determine match_mask */
+	if (side1_matches_mbase)
+		match_mask = (side2_matches_mbase ? 7 : 3);
+	else if (side2_matches_mbase)
+		match_mask = 5;
+	else if (sides_match)
+		match_mask = 6;
+
 	/*
 	 * Get the name of the relevant filepath, which we'll pass to
 	 * setup_path_info() for tracking.
@@ -266,6 +291,8 @@ static int collect_merge_info_callback(int n,
 	 * so we can resolve later in process_entries.
 	 */
 	ci = xcalloc(1, sizeof(struct conflict_info));
+	ci->df_conflict = df_conflict;
+	ci->match_mask = match_mask;
 	strmap_put(&opti->paths, fullpath, ci);

 	/* If dirmask, recurse into subdirectories */
@@ -282,6 +309,15 @@ static int collect_merge_info_callback(int n,
 		newinfo.name = p->path;
 		newinfo.namelen = p->pathlen;
 		newinfo.pathlen = st_add3(newinfo.pathlen, p->pathlen, 1);
+		/*
+		 * If this directory we are about to recurse into cared about
+		 * its parent directory (the current directory) having a D/F
+		 * conflict, then we'd propagate the masks in this way:
+		 *    newinfo.df_conflicts |= (mask & ~dirmask);
+		 * But we don't worry about propagating D/F conflicts. (See
+		 * comment near setting of local df_conflict variable near
+		 * the beginning of this function).
+		 */

 		for (i = MERGE_BASE; i <= MERGE_SIDE2; i++) {
 			if (i == 1 && side1_matches_mbase)

From patchwork Sun Dec 13 08:04:16 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970519
Date: Sun, 13 Dec 2020 08:04:16 +0000
Subject: [PATCH v3 09/20] merge-ort: record stage and auxiliary info for every path
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren, Ævar Arnfjörð Bjarmason
From: Elijah Newren

Create a helper function, setup_path_info(), which can be used to record
all the information we want in a merged_info or conflict_info. While
there is currently only one caller of this new function, and some of its
particular parameters are fixed, more callers will be added later.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 90 insertions(+), 7 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index a6876191c02..bbfc056300b 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -185,6 +185,26 @@ struct conflict_info {
 	unsigned match_mask:3;
 };

+/*
+ * For the next three macros, see warning for conflict_info.merged.
+ *
+ * In each of the below, mi is a struct merged_info*, and ci was defined
+ * as a struct conflict_info* (but we need to verify ci isn't actually
+ * pointed at a struct merged_info*).
+ *
+ * INITIALIZE_CI: Assign ci to mi but only if it's safe; set to NULL otherwise.
+ * VERIFY_CI: Ensure that something we assigned to a conflict_info* is one.
+ * ASSIGN_AND_VERIFY_CI: Similar to VERIFY_CI but do assignment first.
+ */
+#define INITIALIZE_CI(ci, mi) do { \
+	(ci) = (!(mi) || (mi)->clean) ? NULL : (struct conflict_info *)(mi); \
+} while (0)
+#define VERIFY_CI(ci) assert(ci && !ci->merged.clean);
+#define ASSIGN_AND_VERIFY_CI(ci, mi) do { \
+	(ci) = (struct conflict_info *)(mi); \
+	assert((ci) && !(mi)->clean); \
+} while (0)
+
 static int err(struct merge_options *opt, const char *err, ...)
 {
 	va_list params;
@@ -201,6 +221,65 @@ static int err(struct merge_options *opt, const char *err, ...)
 	return -1;
 }

+static void setup_path_info(struct merge_options *opt,
+			    struct string_list_item *result,
+			    const char *current_dir_name,
+			    int current_dir_name_len,
+			    char *fullpath, /* we'll take over ownership */
+			    struct name_entry *names,
+			    struct name_entry *merged_version,
+			    unsigned is_null,     /* boolean */
+			    unsigned df_conflict, /* boolean */
+			    unsigned filemask,
+			    unsigned dirmask,
+			    int resolved          /* boolean */)
+{
+	/* result->util is void*, so mi is a convenience typed variable */
+	struct merged_info *mi;
+
+	assert(!is_null || resolved);
+	assert(!df_conflict || !resolved); /* df_conflict implies !resolved */
+	assert(resolved == (merged_version != NULL));
+
+	mi = xcalloc(1, resolved ? sizeof(struct merged_info) :
+				   sizeof(struct conflict_info));
+	mi->directory_name = current_dir_name;
+	mi->basename_offset = current_dir_name_len;
+	mi->clean = !!resolved;
+	if (resolved) {
+		mi->result.mode = merged_version->mode;
+		oidcpy(&mi->result.oid, &merged_version->oid);
+		mi->is_null = !!is_null;
+	} else {
+		int i;
+		struct conflict_info *ci;
+
+		ASSIGN_AND_VERIFY_CI(ci, mi);
+		for (i = MERGE_BASE; i <= MERGE_SIDE2; i++) {
+			ci->pathnames[i] = fullpath;
+			ci->stages[i].mode = names[i].mode;
+			oidcpy(&ci->stages[i].oid, &names[i].oid);
+		}
+		ci->filemask = filemask;
+		ci->dirmask = dirmask;
+		ci->df_conflict = !!df_conflict;
+		if (dirmask)
+			/*
+			 * Assume is_null for now, but if we have entries
+			 * under the directory then when it is complete in
+			 * write_completed_directory() it'll update this.
+ * Also, for D/F conflicts, we have to handle the + * directory first, then clear this bit and process + * the file to see how it is handled -- that occurs + * near the top of process_entry(). + */ + mi->is_null = 1; + } + strmap_put(&opt->priv->paths, fullpath, mi); + result->string = fullpath; + result->util = mi; +} + static int collect_merge_info_callback(int n, unsigned long mask, unsigned long dirmask, @@ -215,10 +294,12 @@ static int collect_merge_info_callback(int n, */ struct merge_options *opt = info->data; struct merge_options_internal *opti = opt->priv; - struct conflict_info *ci; + struct string_list_item pi; /* Path Info */ + struct conflict_info *ci; /* typed alias to pi.util (which is void*) */ struct name_entry *p; size_t len; char *fullpath; + const char *dirname = opti->current_dir_name; unsigned filemask = mask & ~dirmask; unsigned match_mask = 0; /* will be updated below */ unsigned mbase_null = !(mask & 1); @@ -287,13 +368,15 @@ static int collect_merge_info_callback(int n, make_traverse_path(fullpath, len + 1, info, p->path, p->pathlen); /* - * TODO: record information about the path other than all zeros, - * so we can resolve later in process_entries. + * Record information about the path so we can resolve later in + * process_entries. 
*/ - ci = xcalloc(1, sizeof(struct conflict_info)); - ci->df_conflict = df_conflict; + setup_path_info(opt, &pi, dirname, info->pathlen, fullpath, + names, NULL, 0, df_conflict, filemask, dirmask, 0); + + ci = pi.util; + VERIFY_CI(ci); ci->match_mask = match_mask; - strmap_put(&opti->paths, fullpath, ci); /* If dirmask, recurse into subdirectories */ if (dirmask) { @@ -337,7 +420,7 @@ static int collect_merge_info_callback(int n, } original_dir_name = opti->current_dir_name; - opti->current_dir_name = fullpath; + opti->current_dir_name = pi.string; ret = traverse_trees(NULL, 3, t, &newinfo); opti->current_dir_name = original_dir_name; From patchwork Sun Dec 13 08:04:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11970511 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9F16C433FE for ; Sun, 13 Dec 2020 08:06:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5D7C922D72 for ; Sun, 13 Dec 2020 08:06:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2394097AbgLMIGf (ORCPT ); Sun, 13 Dec 2020 03:06:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48378 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2393181AbgLMIFx (ORCPT ); Sun, 13 Dec 2020 03:05:53 -0500 Received: from mail-wr1-x442.google.com 
(mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 39961C061282 for ; Sun, 13 Dec 2020 00:04:39 -0800 (PST) Received: by mail-wr1-x442.google.com with SMTP id r7so13253153wrc.5 for ; Sun, 13 Dec 2020 00:04:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ebAd0ejwDqj4cDZCIKl+zb9eM9wtPgQZTfvPCZY9VMA=; b=UFL3YIoV4M3QtwyZr2G1TbJ4RmnFQZQlgQEWrRr+tHg7lfooKN8CcYvEiVXkmFhiYV jZ9VGmgM50vcTNkPmQiyFSCbkyuazPTD1abRcjOavyNjFglOoA0+KOWigyyL08B5pLrZ VhE6bN1NWqqnpnUYH458erOJxpoXZLKYtW0f/VfkA8Z9M2QKcAeb6EdgIRsj5qrWVtMl fU09QmgrjSGTpkg/ukpi2EYK0V1WdOqqhAlXQ7bT3idxPkD0eZutlVDKgyJKKcvfP6bP OcIfGMXAOz/LVAk/ZRqQpX6QTD2qOVGTfGfcfw4qjOoxmjbVL8GR4ucOy/CxOqawpPYa VBkw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ebAd0ejwDqj4cDZCIKl+zb9eM9wtPgQZTfvPCZY9VMA=; b=Jh1Ogv91OLqEGAnbKupQRorMZvBCarv/jtQsUZmU08Da3ETWebxnEG8BXBhexpPQqp k7oNCm/59H8mS7CEZnr8nYyx79wHdKb8DdgTHbbmaE7/1svqlhqJlCT4c7vRkZd7sv84 1mBIumqgbv9ZWcqQMMGct0KS2kOK/LEYXmrlfgw4Ol7udlpwvMMF1XxfGqlwa1c+Ew0F 8wesaG/cu6SUkiCJ4ae6hkvuUN4CQbRBy2IpEsXv1u/6+LHaL+vjYveB5ioXGca6jN0A 4SwBg2U9pDVlApfKDKaWjxWG+nY/DfusnELijryUBf4+jjzed7u1nMwKw8sZ7j2AuFU7 IIYQ== X-Gm-Message-State: AOAM53236EKYparJq+67XMd5IFqq1BqHLDY/iNB8UW9hYYJ8kIgbRBzv im14bmZDGpUJ0wTmvIcgXHTOHH5uDtA= X-Google-Smtp-Source: ABdhPJwX2KN1db++7IVafM/OAUQaN3U2ID2Cz4U0CrnSWa3oRD1tke8i5z+i1AUDwYVIE5PKw8mdRg== X-Received: by 2002:adf:8503:: with SMTP id 3mr23099039wrh.56.1607846677804; Sun, 13 Dec 2020 00:04:37 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s13sm23697118wmj.28.2020.12.13.00.04.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Dec 
 2020 00:04:37 -0800 (PST)
Message-Id: <6fdf85c8f1a9c4ced61a81a051e8ee88b52eae01.1607846667.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
Date: Sun, 13 Dec 2020 08:04:17 +0000
Subject: [PATCH v3 10/20] merge-ort: avoid recursing into identical trees
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

When all three trees have the same oid, there is no need to recurse into
these trees to find that all files within them happen to match. We can
just record any one of the trees as the resolution of merging that
particular path.

Immediately resolving trees for other types of trivial tree merges (such
as one side matches the merge base, or the two sides match each other)
would prevent us from detecting renames for some paths, and thus prevent
us from doing three-way content merges for those paths whose renames we
did not detect.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 13 +++++++++++++
 1 file changed, 13 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index bbfc056300b..868ac65091b 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -367,6 +367,19 @@ static int collect_merge_info_callback(int n,
 	fullpath = xmalloc(len + 1);
 	make_traverse_path(fullpath, len + 1, info, p->path, p->pathlen);
 
+	/*
+	 * If mbase, side1, and side2 all match, we can resolve early. Even
+	 * if these are trees, there will be no renames or anything
+	 * underneath.
+	 */
+	if (side1_matches_mbase && side2_matches_mbase) {
+		/* mbase, side1, & side2 all match; use mbase as resolution */
+		setup_path_info(opt, &pi, dirname, info->pathlen, fullpath,
+				names, names+0, mbase_null, 0,
+				filemask, dirmask, 1);
+		return mask;
+	}
+
 	/*
 	 * Record information about the path so we can resolve later in
 	 * process_entries.
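
The early-exit rule in the patch above can be illustrated outside of git.
What follows is a minimal sketch, not code from merge-ort.c: struct
toy_entry, toy_entries_match(), toy_match_mask() and
toy_can_resolve_early() are hypothetical names, and the bit layout of the
mask (bit 0 for the merge base, bit 1 for side1, bit 2 for side2) is an
assumption modeled on how match_mask is used in this series. It shows that
only the case where all three versions agree is resolved immediately; the
other trivial cases (masks 3, 5 and 6) are deferred so that rename
detection still gets to see those paths.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical stand-in for git's struct name_entry: a mode plus a hex oid. */
struct toy_entry {
	unsigned mode;   /* 0 means the path is absent on this side */
	const char *oid; /* hex object id; ignored when mode is 0 */
};

/* Two versions match only if both exist and agree on mode and oid. */
static int toy_entries_match(const struct toy_entry *a,
			     const struct toy_entry *b)
{
	if (!a->mode || !b->mode)
		return 0;
	/* An entry that exists must carry an object id. */
	assert(a->oid && b->oid);
	return a->mode == b->mode && !strcmp(a->oid, b->oid);
}

/*
 * Assumed bit layout: bit 0 = merge base, bit 1 = side1, bit 2 = side2.
 * Returns 7 when all three match, 3 or 5 when exactly one side matches
 * the base, 6 when only the two sides match each other, and 0 otherwise.
 */
static unsigned toy_match_mask(const struct toy_entry names[3])
{
	int side1_matches_mbase = toy_entries_match(&names[0], &names[1]);
	int side2_matches_mbase = toy_entries_match(&names[0], &names[2]);
	int sides_match = toy_entries_match(&names[1], &names[2]);

	if (side1_matches_mbase)
		return side2_matches_mbase ? 7 : 3;
	if (side2_matches_mbase)
		return 5;
	if (sides_match)
		return 6;
	return 0;
}

/* Only the all-three-match case is resolved without recursing. */
static int toy_can_resolve_early(const struct toy_entry names[3])
{
	return toy_match_mask(names) == 7;
}
```

With three identical tree entries toy_can_resolve_early() returns nonzero.
If only side2 differs from the base, the mask is 3 and the path is left
for later processing rather than resolved on the spot.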
From patchwork Sun Dec 13 08:04:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11970533 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2229AC4361B for ; Sun, 13 Dec 2020 08:08:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D74B122D72 for ; Sun, 13 Dec 2020 08:08:11 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404088AbgLMIIH (ORCPT ); Sun, 13 Dec 2020 03:08:07 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48396 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2393439AbgLMIF7 (ORCPT ); Sun, 13 Dec 2020 03:05:59 -0500 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 20590C061285 for ; Sun, 13 Dec 2020 00:04:40 -0800 (PST) Received: by mail-wr1-x441.google.com with SMTP id w5so9519810wrm.11 for ; Sun, 13 Dec 2020 00:04:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=E53HHn76OO/MGWBVwoA3qT9wPi5++SZ9bkb3Jqk0C6M=; b=d4MRoFqJzLHWt5ADkRL6Kgy2gzRGrEY7mbEHPklNFKbvq+Gr/oGA21xbr5laWqFIim rY6Zk7jcxqwFHjwTvI3Om3+9YPdHh6dpq1Iu+fgcIVKlQAKnCAycDDqaNYdEIgETuJiU 
FI5MeuVDHgDV9SEirHQYmG8uhoE5gZIQnLFo9FT6ccs2w3KS1v0ys74chRKcy+NZ6dmf BphjSsGWd8OJDyrIBmB0akqmn+sn+u7QIcxwXSCjRPWU80DLzlUAhuL7He0uglIejXb9 QKvG6aYW4mgwkXobWJwQ7Gtuo1sVq3b9my2TWS36Kh2PLmH57ez27e0b/oESceJykZCb Si1A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=E53HHn76OO/MGWBVwoA3qT9wPi5++SZ9bkb3Jqk0C6M=; b=dyxGbUkSBF4JjjYnKpvjPSE+LBs2Vnv0etcOEkWiMQI/qWDu6HvDQUbe0cdKsDg+QG 3u9OOBp0BGFUx0tmU3Wxi1j8zsnJB+XQaX0WrLx45yQZSCeNchqxIyhdmPX6aAzhvOgg fsUCNXD38PMC+2uaOJJUubdH+LVuCrp8LgSaTgqxjSa3kYUg0+ILERV+KS19EKy5SkEI IQK8sgoCco5WXUr10YQ54+gMAwg3SMvwu99ORKK2bokHH9vjIckNdQcacWmvmJHPy6Hk X650aZ8lBx+XFkuG9q6wMphCx3N779ZvPQxUafpuD+bQw63UGnYKNZ2bVL2Cpq0o9Prz Dtwg== X-Gm-Message-State: AOAM530Si4b6FUYA+2QVk+mOUEi8mEE1y9x2OnoYraeB6KdKAiSEvIyx Qg7nMixhE2/hb9obWy72KSz6rTOa+EU= X-Google-Smtp-Source: ABdhPJxZt2xPQBC1muqYXv4Di+b7JhiYHw1EywDchCgfE7xqsnYI1e2ywiutlQuAjV08dmWxwv2s/A== X-Received: by 2002:adf:a319:: with SMTP id c25mr23130987wrb.262.1607846678701; Sun, 13 Dec 2020 00:04:38 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id g11sm24869409wrq.7.2020.12.13.00.04.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Dec 2020 00:04:38 -0800 (PST) Message-Id: <8b001ae643a2aba66e3c16fbe33bf145a4662703.1607846667.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 13 Dec 2020 08:04:18 +0000 Subject: [PATCH v3 11/20] merge-ort: add a preliminary simple process_entries() implementation Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Add a process_entries() implementation that just loops over the paths 
and processes each one individually with an auxiliary process_entry()
call. Add a basic process_entry() as well, which handles several cases
but leaves a few of the more involved ones with die-not-implemented
messages. Also, although process_entries() is supposed to create a tree,
it does not yet have code to do so -- except in the special case of
merging completely empty trees.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 103 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 102 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 868ac65091b..d78b6b0873d 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -492,10 +492,111 @@ static int detect_and_process_renames(struct merge_options *opt,
 	return clean;
 }
 
+/* Per entry merge function */
+static void process_entry(struct merge_options *opt,
+			  const char *path,
+			  struct conflict_info *ci)
+{
+	VERIFY_CI(ci);
+	assert(ci->filemask >= 0 && ci->filemask <= 7);
+	/* ci->match_mask == 7 was handled in collect_merge_info_callback() */
+	assert(ci->match_mask == 0 || ci->match_mask == 3 ||
+	       ci->match_mask == 5 || ci->match_mask == 6);
+
+	if (ci->df_conflict) {
+		die("Not yet implemented.");
+	}
+
+	/*
+	 * NOTE: Below there is a long switch-like if-elseif-elseif... block
+	 *       which the code goes through even for the df_conflict cases
+	 *       above. Well, it will once we don't die-not-implemented above.
+	 */
+	if (ci->match_mask) {
+		ci->merged.clean = 1;
+		if (ci->match_mask == 6) {
+			/* stages[1] == stages[2] */
+			ci->merged.result.mode = ci->stages[1].mode;
+			oidcpy(&ci->merged.result.oid, &ci->stages[1].oid);
+		} else {
+			/* determine the mask of the side that didn't match */
+			unsigned int othermask = 7 & ~ci->match_mask;
+			int side = (othermask == 4) ? 2 : 1;
+
+			ci->merged.result.mode = ci->stages[side].mode;
+			ci->merged.is_null = !ci->merged.result.mode;
+			oidcpy(&ci->merged.result.oid, &ci->stages[side].oid);
+
+			assert(othermask == 2 || othermask == 4);
+			assert(ci->merged.is_null ==
+			       (ci->filemask == ci->match_mask));
+		}
+	} else if (ci->filemask >= 6 &&
+		   (S_IFMT & ci->stages[1].mode) !=
+		   (S_IFMT & ci->stages[2].mode)) {
+		/*
+		 * Two different items from (file/submodule/symlink)
+		 */
+		die("Not yet implemented.");
+	} else if (ci->filemask >= 6) {
+		/*
+		 * TODO: Needs a two-way or three-way content merge, but we're
+		 * just being lazy and copying the version from HEAD and
+		 * leaving it as conflicted.
+		 */
+		ci->merged.clean = 0;
+		ci->merged.result.mode = ci->stages[1].mode;
+		oidcpy(&ci->merged.result.oid, &ci->stages[1].oid);
+	} else if (ci->filemask == 3 || ci->filemask == 5) {
+		/* Modify/delete */
+		die("Not yet implemented.");
+	} else if (ci->filemask == 2 || ci->filemask == 4) {
+		/* Added on one side */
+		int side = (ci->filemask == 4) ? 2 : 1;
+		ci->merged.result.mode = ci->stages[side].mode;
+		oidcpy(&ci->merged.result.oid, &ci->stages[side].oid);
+		ci->merged.clean = !ci->df_conflict;
+	} else if (ci->filemask == 1) {
+		/* Deleted on both sides */
+		ci->merged.is_null = 1;
+		ci->merged.result.mode = 0;
+		oidcpy(&ci->merged.result.oid, &null_oid);
+		ci->merged.clean = 1;
+	}
+
+	/*
+	 * If still conflicted, record it separately. This allows us to later
+	 * iterate over just conflicted entries when updating the index instead
+	 * of iterating over all entries.
+	 */
+	if (!ci->merged.clean)
+		strmap_put(&opt->priv->conflicted, path, ci);
+}
+
 static void process_entries(struct merge_options *opt,
 			    struct object_id *result_oid)
 {
-	die("Not yet implemented.");
+	struct hashmap_iter iter;
+	struct strmap_entry *e;
+
+	if (strmap_empty(&opt->priv->paths)) {
+		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
+		return;
+	}
+
+	strmap_for_each_entry(&opt->priv->paths, &iter, e) {
+		/*
+		 * NOTE: mi may actually be a pointer to a conflict_info, but
+		 * we have to check mi->clean first to see if it's safe to
+		 * reassign to such a pointer type.
+		 */
+		struct merged_info *mi = e->value;
+
+		if (!mi->clean)
+			process_entry(opt, e->key, e->value);
+	}
+
+	die("Tree creation not yet implemented");
 }
 
 void merge_switch_to_result(struct merge_options *opt,

From patchwork Sun Dec 13 08:04:19 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970515
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C76E3C4361B for ; Sun, 13 Dec 2020 08:07:02 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 82B4B22D72 for ; Sun, 13 Dec 2020 08:07:02 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403914AbgLMIG4 (ORCPT ); Sun, 13 Dec 2020 03:06:56 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48394 "EHLO lindbergh.monkeyblade.net"
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2393498AbgLMIF7 (ORCPT ); Sun, 13 Dec 2020 03:05:59 -0500 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 03606C061248 for ; Sun, 13 Dec 2020 00:04:41 -0800 (PST) Received: by mail-wm1-x343.google.com with SMTP id y23so12433367wmi.1 for ; Sun, 13 Dec 2020 00:04:40 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=GfaYyRxiX6gKvn9amEytdBpQnWUCwBNlm34sa3Sx/K0=; b=MOrZJX61IL3+511j37fie8NwzpBKj/0xb1uBlYrHc8Os6AQBF2GxMvdO5ouWXGS4Nv k13YPADBXgg5tF8MHElzS8V23sd513el3luYq/B4Zgq4nR3eho4IOyXPVyEnDjpoWKvL ZYHZUpFMYerBIDNQeJFedS4fIyuxMABJcj0MD3YG8xE5kI0I/X3ljADp2QIXdjuA5va+ m9vVjriqITY85bNlxQSLap1SJjTCvloP3Do6XVogpbJ2JH/ec9oKLsuD+xuPntpOVya8 yJvIYzsTua/0XoGiZgVAC55407nsnjNA4Y/VurBpNMIYJ83b9Eb8Xp4o8DAk/mb5lUp9 2n9g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=GfaYyRxiX6gKvn9amEytdBpQnWUCwBNlm34sa3Sx/K0=; b=eZLkYFzc1npVTfVn8+HJ7BWy0RPf+7BB1vDOLspadm+YFlMrqBdYzEc3vt09LaaV6y 84zNBvnwBp8jdI5Q7qlBUeNG1tyjt3p342GyEfD8+ot4KAAtPKNlUwqbZLcOHk7qMXfq bGtLratqS9Dr934J2OUd3t+sPArraXMNA8SQgZhkpLc3Nuk5SnaA+JG8aGHDnXvYgYsV Dtfite+hAjh7+/y0laexpgnNH7j13ZaccpvAcUqXUH2CaPvtHXjMuiu50QRxuhYQ2ujo Wtk6GSazUql2ZAJdKpEPkRNN4pRh3W/makeVJjy85UU9NMw4HR460F3mZL2TrG2ZPfs3 dMbA== X-Gm-Message-State: AOAM531jOK4PtBy1OVhZixpLcr5i2hWptmr7zHSC1gwNndOwcc57gkcZ /8Sx88zJyhcMUeyHX9ox9+IxWhoLv9o= X-Google-Smtp-Source: ABdhPJxvOcF3VRrB8iUvFqHNyPx5ZTVtiIUk9MaHZyV2dKv5u9V4BsMgPgOocf6u/7rXfcdUs9N9hQ== X-Received: by 2002:a1c:7318:: with SMTP id d24mr22265748wmb.39.1607846679542; Sun, 13 Dec 2020 00:04:39 -0800 (PST) Received: from [127.0.0.1] 
 ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id l1sm24436861wrq.64.2020.12.13.00.04.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Dec 2020 00:04:39 -0800 (PST)
Message-Id: <260b12290fb4cbbf69e5114906b29c65f09dfdaa.1607846667.git.gitgitgadget@gmail.com>
In-Reply-To:
References:
Date: Sun, 13 Dec 2020 08:04:19 +0000
Subject: [PATCH v3 12/20] merge-ort: have process_entries operate in a defined order
Fcc: Sent
MIME-Version: 1.0
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

We want to handle paths below a directory before needing to handle the
directory itself. Also, we want to handle the directory immediately
after the paths below it, so we can't use simple lexicographic ordering
from strcmp (which would insert foo.txt between foo and foo/file.c).
Copy string_list_df_name_compare() from merge-recursive.c, and set up a
string list of paths sorted by that function so that we can iterate in
the desired order.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++---
 1 file changed, 50 insertions(+), 3 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index d78b6b0873d..d83ed8768f5 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -492,6 +492,33 @@ static int detect_and_process_renames(struct merge_options *opt,
 	return clean;
 }
 
+static int string_list_df_name_compare(const char *one, const char *two)
+{
+	int onelen = strlen(one);
+	int twolen = strlen(two);
+	/*
+	 * Here we only care that entries for D/F conflicts are
+	 * adjacent, in particular with the file of the D/F conflict
+	 * appearing before files below the corresponding directory.
+	 * The order of the rest of the list is irrelevant for us.
+	 *
+	 * To achieve this, we sort with df_name_compare and provide
+	 * the mode S_IFDIR so that D/F conflicts will sort correctly.
+	 * We use the mode S_IFDIR for everything else for simplicity,
+	 * since in other cases any changes in their order due to
+	 * sorting cause no problems for us.
+	 */
+	int cmp = df_name_compare(one, onelen, S_IFDIR,
+				  two, twolen, S_IFDIR);
+	/*
+	 * Now that 'foo' and 'foo/bar' compare equal, we have to make sure
+	 * that 'foo' comes before 'foo/bar'.
+	 */
+	if (cmp)
+		return cmp;
+	return onelen - twolen;
+}
+
 /* Per entry merge function */
 static void process_entry(struct merge_options *opt,
 			  const char *path,
@@ -578,24 +605,44 @@ static void process_entries(struct merge_options *opt,
 {
 	struct hashmap_iter iter;
 	struct strmap_entry *e;
+	struct string_list plist = STRING_LIST_INIT_NODUP;
+	struct string_list_item *entry;
 
 	if (strmap_empty(&opt->priv->paths)) {
 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
 		return;
 	}
 
+	/* Hack to pre-allocate plist to the desired size */
+	ALLOC_GROW(plist.items, strmap_get_size(&opt->priv->paths), plist.alloc);
+
+	/* Put every entry from paths into plist, then sort */
 	strmap_for_each_entry(&opt->priv->paths, &iter, e) {
+		string_list_append(&plist, e->key)->util = e->value;
+	}
+	plist.cmp = string_list_df_name_compare;
+	string_list_sort(&plist);
+
+	/*
+	 * Iterate over the items in reverse order, so we can handle paths
+	 * below a directory before needing to handle the directory itself.
+	 */
+	for (entry = &plist.items[plist.nr-1]; entry >= plist.items; --entry) {
+		char *path = entry->string;
 		/*
 		 * NOTE: mi may actually be a pointer to a conflict_info, but
 		 * we have to check mi->clean first to see if it's safe to
 		 * reassign to such a pointer type.
 		 */
-		struct merged_info *mi = e->value;
+		struct merged_info *mi = entry->util;
 
-		if (!mi->clean)
-			process_entry(opt, e->key, e->value);
+		if (!mi->clean) {
+			struct conflict_info *ci = (struct conflict_info *)mi;
+			process_entry(opt, path, ci);
+		}
 	}
 
+	string_list_clear(&plist, 0);
 	die("Tree creation not yet implemented");
 }
 

From patchwork Sun Dec 13 08:04:20 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970531
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DB109C433FE for ; Sun, 13 Dec 2020 08:08:01 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 94A2522D72 for ; Sun, 13 Dec 2020 08:08:01 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2404019AbgLMIH5 (ORCPT ); Sun, 13 Dec 2020 03:07:57 -0500
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48400 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2393502AbgLMIF7 (ORCPT ); Sun, 13 Dec 2020 03:05:59 -0500
Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E3103C061257 for ; Sun, 13 Dec 2020 00:04:41 -0800 (PST)
Received: by mail-wr1-x443.google.com with SMTP id r3so13280466wrt.2 for ; Sun, 13 Dec 2020 00:04:41 -0800 (PST)
DKIM-Signature: v=1; a=rsa-sha256;
c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=blbyNMGGXSWVTiGKLoyZLBEOol9B9r6IVAGKeByG294=; b=PTsyWIndmvh8UuE5h5TN/YLFxIqT+KGQvtjC5iH+1EFDk1U2WSYaYDPAdHh/HoEGEa MunXagD+TEmnPabxl53GUCGZD7cZ+fjciaOuqAOI59I8R+OeUxsrFUQML01+HW+OC4kM V1ALJjcAsNCVSdCc8kE8o71fEDMkTLEHK7Qd8b5oH6jcR8/LEfdDK2M1849bnCi/YOge PNshvzgQ7tU1DD3dDLcdIt/97hTQzgi/0v5i439yJy7cyPStOUZtVuyS6q0wTRmbuv/W klFqylFuJ96SX4/uwYZ45JQvEwOECom4SkiUi3Z+uoy498SEUCg565j1abIVTGYwhUR6 6fkA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=blbyNMGGXSWVTiGKLoyZLBEOol9B9r6IVAGKeByG294=; b=m46Zr2ZBkc/on/LJmAfgX9mTkTFt3FlDSzzE3x8qrDOJpDh1VlOC2aeY4EqSXS/mMV ynvfD26db6lxhJh7san7fK4ze+kWLILEGkGd1xUi44GE+gkoKZcE+LjKqmRH/Jbknuhr 38GQuz9Ex+uPqpipx8JPzeQwI7z5xRGmTm9OVajgSFq+vYbz54Vol4D+brDrs9ZpTgem wxdiqW1hfOK/iUtqcs6eAOM2UmFYF8sxR3L9i0D/0PwnGhCUC/bLqSH8pQGWx+JNsPq1 vk3Ob5Y5z/X4c5NXuOxrY/8L5IY8A8z2HUFxG6epkGLUrpEAwiplgeK6CrOLyMuOWeeu ZLVA== X-Gm-Message-State: AOAM531XOBoyDZzj4bOJSuvCCkKcJ20FI4RoXidp2aY6Z/3DIF6MuOGw RjN0f/X1MjE6ZnO1wAbx5hRlekNsZRE= X-Google-Smtp-Source: ABdhPJxmVy0G9ZrD0L57Jx+OT31DVQoNNUBEDsbPeaecGYoyT2B8vOEBusYsTXOk7F7Y5Asi0wVGww== X-Received: by 2002:adf:9d49:: with SMTP id o9mr23319649wre.413.1607846680509; Sun, 13 Dec 2020 00:04:40 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z140sm24954045wmc.30.2020.12.13.00.04.39 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Dec 2020 00:04:40 -0800 (PST) Message-Id: <092e77bbb153048a21a524aeae7ed5f75403452b.1607846667.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 13 Dec 2020 08:04:20 +0000 Subject: [PATCH v3 13/20] merge-ort: step 1 of tree writing -- record basenames, modes, and oids Fcc: Sent MIME-Version: 1.0 To: 
 git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren
Precedence: bulk
List-ID:
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

As a step towards transforming the processed path->conflict_info entries
into an actual tree object, start recording basenames, modes, and oids
in a dir_metadata structure. Subsequent commits will make use of this to
actually write a tree.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 40 +++++++++++++++++++++++++++++++++++++---
 1 file changed, 37 insertions(+), 3 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index d83ed8768f5..95369c6a052 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -519,10 +519,31 @@ static int string_list_df_name_compare(const char *one, const char *two)
 	return onelen - twolen;
 }
 
+struct directory_versions {
+	struct string_list versions;
+};
+
+static void record_entry_for_tree(struct directory_versions *dir_metadata,
+				  const char *path,
+				  struct merged_info *mi)
+{
+	const char *basename;
+
+	if (mi->is_null)
+		/* nothing to record */
+		return;
+
+	basename = path + mi->basename_offset;
+	assert(strchr(basename, '/') == NULL);
+	string_list_append(&dir_metadata->versions,
+			   basename)->util = &mi->result;
+}
+
 /* Per entry merge function */
 static void process_entry(struct merge_options *opt,
 			  const char *path,
-			  struct conflict_info *ci)
+			  struct conflict_info *ci,
+			  struct directory_versions *dir_metadata)
 {
 	VERIFY_CI(ci);
 	assert(ci->filemask >= 0 && ci->filemask <= 7);
@@ -530,6 +551,14 @@ static void process_entry(struct merge_options *opt,
 	assert(ci->match_mask == 0 || ci->match_mask == 3 ||
 	       ci->match_mask == 5 || ci->match_mask == 6);
 
+	if (ci->dirmask) {
+		record_entry_for_tree(dir_metadata, path, &ci->merged);
+		if (ci->filemask == 0)
+			/* nothing else to handle */
+			return;
+		assert(ci->df_conflict);
+	}
+
 	if (ci->df_conflict) {
 		die("Not yet implemented.");
 	}
@@ -598,6 +627,7 @@ static void process_entry(struct merge_options *opt,
 	 */
 	if (!ci->merged.clean)
 		strmap_put(&opt->priv->conflicted, path, ci);
+	record_entry_for_tree(dir_metadata, path, &ci->merged);
 }
 
 static void process_entries(struct merge_options *opt,
@@ -607,6 +637,7 @@ static void process_entries(struct merge_options *opt,
 	struct strmap_entry *e;
 	struct string_list plist = STRING_LIST_INIT_NODUP;
 	struct string_list_item *entry;
+	struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP };
 
 	if (strmap_empty(&opt->priv->paths)) {
 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
@@ -636,13 +667,16 @@ static void process_entries(struct merge_options *opt,
 		 */
 		struct merged_info *mi = entry->util;
 
-		if (!mi->clean) {
+		if (mi->clean)
+			record_entry_for_tree(&dir_metadata, path, mi);
+		else {
 			struct conflict_info *ci = (struct conflict_info *)mi;
-			process_entry(opt, path, ci);
+			process_entry(opt, path, ci, &dir_metadata);
 		}
 	}
 
 	string_list_clear(&plist, 0);
+	string_list_clear(&dir_metadata.versions, 0);
 	die("Tree creation not yet implemented");
 }
 

From patchwork Sun Dec 13 08:04:21 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970513
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id DA67FC4361B for ; Sun, 13 Dec 2020 08:06:56 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9565B22D72 for ; Sun, 13
Dec 2020 08:06:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2393200AbgLMIGz (ORCPT ); Sun, 13 Dec 2020 03:06:55 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48402 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2393505AbgLMIF7 (ORCPT ); Sun, 13 Dec 2020 03:05:59 -0500 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4C71FC0611C5 for ; Sun, 13 Dec 2020 00:04:43 -0800 (PST) Received: by mail-wm1-x342.google.com with SMTP id w206so7089762wma.0 for ; Sun, 13 Dec 2020 00:04:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=B1g5reDOKhODNLyOrVGW7c/8Vk5Z0+9AlFs/EoWvjWo=; b=PL3yMOf2ifGpMlPf2e8xkLH9V30UMyRI+1I+X2Z+lnN3L9ngGuQbfyLh339s2cKsm2 YcMPSNUoihtzTL0Kqzvq7BptIy8WFgHX6D7bKSqRCcvXec5p8k2AdGwEKVBfiqkguToJ 7thcLUUKogwzEgF9yrJ4jUgEvcpv489dW+8RcnjncNhrca7xXTqmTB2UE5U+WwSfV4/+ kKqUPnkve6Rxunc8lWu27O6GclDwvza+ypv0XRPxotgNMv5uOcDS/3IcXeSTBxrcE57v 7htPVLMFUB5uyktZ4uSD0fJc4pzEYfvXCLH+aotvWJ6yV2H1By+haTuwICqTwy3n3Fjo FJ7w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=B1g5reDOKhODNLyOrVGW7c/8Vk5Z0+9AlFs/EoWvjWo=; b=ffiFh0s4xIRzFmTJZ0RkpPrNoImfOySCxWVgugmTO1UE1ZRrxKhgJocZQk6mQOcORW gASqi02mfgIbh4w3igB0L+AaIWqT4LjIFUB38pPaX0iJZ65/P4psiWvJNS87GOPYAZYF X+yRWFHhJthhskDeIuybqrR3XaGKR0bVAa3L0ZUcKxutX8+F/CnVbiSiJ5sKNviDFhoo UpQLhFVRFd+1jimTHX70vTpGT+jGN6WKDgXD1r5COdBZyjrObpw4Vw5zLTTgpx1pL0EM Nujt/UpPAgXhCBmOZ2HNxUSdukKs0rHAFnmfBRqhM0KWpjc9dGp2iXDgvruK8wePXPSL 1zMQ== X-Gm-Message-State: AOAM530lT73tNR0kJC2Y9OPi9JpWQtfZwP8mqfHwy5Ug9o/kYwluWo/b 
TF3q7girUoiiZKCYEl+KiYdAJEXfOzY= X-Google-Smtp-Source: ABdhPJwgpT4HgBpzMntuDp3VmgzwCE13PEMi9QLD2BmrHMRrWIC63I0VUGb0LiRDX0jwWsTfk0m1yg== X-Received: by 2002:a1c:3902:: with SMTP id g2mr21770126wma.117.1607846681381; Sun, 13 Dec 2020 00:04:41 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c4sm24589397wmf.19.2020.12.13.00.04.40 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Dec 2020 00:04:40 -0800 (PST) Message-Id: In-Reply-To: References: Date: Sun, 13 Dec 2020 08:04:21 +0000 Subject: [PATCH v3 14/20] merge-ort: step 2 of tree writing -- function to create tree object Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Create a new function, write_tree(), which will take a list of basenames, modes, and oids for a single directory and create a tree object in the object-store. We do not yet have just basenames, modes, and oids for just a single directory (we have a mixture of entries from all directory levels in the hierarchy) so we still die() before the current call to write_tree(), but the next patch will rectify that. 
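For readers who want to see the serialization described above in isolation, here is a minimal standalone sketch. It is not the write_tree() from this series: struct tree_entry, entry_order() and serialize_tree() are hypothetical names, the hash size is assumed to be 20 bytes (SHA-1), and nothing is written to an object store. It only illustrates the two ideas the function relies on: sorting basenames with directories compared as if they ended in '/', and emitting one "<octal mode> <name>" record followed by a NUL byte and the raw hash for each entry.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RAW_HASH_SIZE 20 /* assumption: SHA-1 sized object ids */
#define MODE_DIR 040000  /* assumption: the mode recorded for subtrees */

/* Hypothetical stand-in for one (basename, mode, oid) triple. */
struct tree_entry {
	const char *name;
	unsigned int mode;
	unsigned char hash[RAW_HASH_SIZE];
};

/*
 * Order entries bytewise by name, except that a directory name
 * compares as if it ended in '/'.
 */
static int entry_order(const void *a_, const void *b_)
{
	const struct tree_entry *a = a_, *b = b_;
	size_t alen = strlen(a->name), blen = strlen(b->name);
	size_t len = alen < blen ? alen : blen;
	int cmp = memcmp(a->name, b->name, len);
	unsigned char c1, c2;

	if (cmp)
		return cmp;
	c1 = a->name[len];
	c2 = b->name[len];
	if (!c1 && a->mode == MODE_DIR)
		c1 = '/';
	if (!c2 && b->mode == MODE_DIR)
		c2 = '/';
	return (c1 < c2) ? -1 : (c1 > c2) ? 1 : 0;
}

/*
 * Sort entries[0..nr) and serialize them into a freshly allocated
 * buffer. Returns the buffer (caller frees) and stores its length
 * in *out_len.
 */
static unsigned char *serialize_tree(struct tree_entry *entries, size_t nr,
				     size_t *out_len)
{
	size_t cap = 0, len = 0, i;
	unsigned char *buf;

	qsort(entries, nr, sizeof(*entries), entry_order);

	/* Upper bound per entry: 6 mode digits, space, name, NUL, hash. */
	for (i = 0; i < nr; i++)
		cap += 6 + 1 + strlen(entries[i].name) + 1 + RAW_HASH_SIZE;
	buf = malloc(cap ? cap : 1);
	assert(buf);

	for (i = 0; i < nr; i++) {
		/* snprintf also writes the NUL that terminates the name. */
		int n = snprintf((char *)buf + len, cap - len, "%o %s",
				 entries[i].mode, entries[i].name);
		len += (size_t)n + 1;
		memcpy(buf + len, entries[i].hash, RAW_HASH_SIZE);
		len += RAW_HASH_SIZE;
	}
	*out_len = len;
	return buf;
}
```

With the entries foo.c, foo (a directory) and foo-bar, this ordering yields foo-bar, foo.c, foo, whereas a plain strcmp() sort would put foo first.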
Signed-off-by: Elijah Newren --- merge-ort.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 66 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index 95369c6a052..f7041cfeac4 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -19,6 +19,7 @@ #include "diff.h" #include "diffcore.h" +#include "object-store.h" #include "strmap.h" #include "tree.h" #include "xdiff-interface.h" @@ -523,6 +524,62 @@ struct directory_versions { struct string_list versions; }; +static int tree_entry_order(const void *a_, const void *b_) +{ + const struct string_list_item *a = a_; + const struct string_list_item *b = b_; + + const struct merged_info *ami = a->util; + const struct merged_info *bmi = b->util; + return base_name_compare(a->string, strlen(a->string), ami->result.mode, + b->string, strlen(b->string), bmi->result.mode); +} + +static void write_tree(struct object_id *result_oid, + struct string_list *versions, + unsigned int offset, + size_t hash_size) +{ + size_t maxlen = 0, extra; + unsigned int nr = versions->nr - offset; + struct strbuf buf = STRBUF_INIT; + struct string_list relevant_entries = STRING_LIST_INIT_NODUP; + int i; + + /* + * We want to sort the last (versions->nr-offset) entries in versions. + * Do so by abusing the string_list API a bit: make another string_list + * that contains just those entries and then sort them. + * + * We won't use relevant_entries again and will let it just pop off the + * stack, so there won't be allocation worries or anything. 
+ */ + relevant_entries.items = versions->items + offset; + relevant_entries.nr = versions->nr - offset; + QSORT(relevant_entries.items, relevant_entries.nr, tree_entry_order); + + /* Pre-allocate some space in buf */ + extra = hash_size + 8; /* 8: 6 for mode, 1 for space, 1 for NUL char */ + for (i = 0; i < nr; i++) { + maxlen += strlen(versions->items[offset+i].string) + extra; + } + strbuf_grow(&buf, maxlen); + + /* Write each entry out to buf */ + for (i = 0; i < nr; i++) { + struct merged_info *mi = versions->items[offset+i].util; + struct version_info *ri = &mi->result; + strbuf_addf(&buf, "%o %s%c", + ri->mode, + versions->items[offset+i].string, '\0'); + strbuf_add(&buf, ri->oid.hash, hash_size); + } + + /* Write this object file out, and record in result_oid */ + write_object_file(buf.buf, buf.len, tree_type, result_oid); + strbuf_release(&buf); +} + static void record_entry_for_tree(struct directory_versions *dir_metadata, const char *path, struct merged_info *mi) @@ -675,9 +732,17 @@ static void process_entries(struct merge_options *opt, } } + /* + * TODO: We can't actually write a tree yet, because dir_metadata just + * contains all basenames of all files throughout the tree with their + * mode and hash. Not only is that a nonsensical tree, it will have + * lots of duplicates for paths such as "Makefile" or ".gitignore". 
+ */ + die("Not yet implemented; need to process subtrees separately"); + write_tree(result_oid, &dir_metadata.versions, 0, + opt->repo->hash_algo->rawsz); string_list_clear(&plist, 0); string_list_clear(&dir_metadata.versions, 0); - die("Tree creation not yet implemented"); } void merge_switch_to_result(struct merge_options *opt, From patchwork Sun Dec 13 08:04:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11970523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A681C4361B for ; Sun, 13 Dec 2020 08:07:32 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D9A4E22D72 for ; Sun, 13 Dec 2020 08:07:31 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403955AbgLMIHW (ORCPT ); Sun, 13 Dec 2020 03:07:22 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48372 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2393955AbgLMIGI (ORCPT ); Sun, 13 Dec 2020 03:06:08 -0500 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E9EB9C0611CA for ; Sun, 13 Dec 2020 00:04:43 -0800 (PST) Received: by mail-wr1-x443.google.com with SMTP id c5so9624340wrp.6 for ; Sun, 13 Dec 2020 00:04:43 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ywmNtzI9n8gN2YGRxQ7HimUCQVZ5FFp1PIcjZEJxkpg=; b=u1L+deDlyUmKRi7DAcrJvbBoSDIUttrnSAOaRFOJ7PbvATCGoqyJEx/Rg5VjdEMVK2 tI7d5AqWTvSdEQPthKvlgKAEbRjkh7TLieuvIYd5klfQENFtJcNw4v13o9B5cAn5Wcj0 HiX6/02uOwYLTTRHPBCdxVuxPGLpqyxGKCw4WU2wokA+CjP7ykJ9wUrjT9+giiePcJGK 1bsszA9bvqUviquaPAITNMYTZsYlgzByvB1SNzSIUVrI4/hEVRSfNY28RGi8hd0qDo/8 LJJbfWsCNqfQzVVJYaBiHSKJ9p0o5f/ZITLW38otSDeURAmw9LnWwREUfuHjKg3w9x1j CpHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ywmNtzI9n8gN2YGRxQ7HimUCQVZ5FFp1PIcjZEJxkpg=; b=UY8Ddg3cnABK0Ljb39s/2L8PcMyLmEkK3GkpnY2uHLLl1WR4WAAWjVXTZs8/Z/qDa7 g3ktG4Fo2nZTUduhyByKEanC3zsIFgd/OaIoF7CQq7tvPUqYmdwDBrER4EvJVbwPYEpC jWYqkWrWhIn5FjJhf7gDhQPctd7cUt1ZK/BE3yYBdbOrMYXxlsidQEWBAd2guAqK+I9n bXrTE0LqeOkv8kXOEAQwBNZPvaq2NjDhjZVR2JWjzYQzqMGE1fMY8nBIchP85+l0l9Mt vuKcQMYQw946nInzkSfSm9zynI/60sg6yJWB0JaIUIrTNyotjO8w+glxlfYNnq6I2OFS S2ZA== X-Gm-Message-State: AOAM530uZ2izJKkMVFoBwF/6Y5cqoIlKTg9YZ76+AIWrnG6zEpceuqbq hByFJ/H3EJQuCfcBKBrJgNoU5rMybcI= X-Google-Smtp-Source: ABdhPJxsroJO7hE+znEt9DKOCz1YpYP0x1flNGLpBirG47wys0RxHUZ64HA+O4u6J7hozoxAUKFtKg== X-Received: by 2002:adf:94c7:: with SMTP id 65mr22144338wrr.423.1607846682224; Sun, 13 Dec 2020 00:04:42 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s63sm26862271wms.18.2020.12.13.00.04.41 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Dec 2020 00:04:41 -0800 (PST) Message-Id: <81374cbf205d4fa24bef2f4850f95564f25fcfa1.1607846667.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 13 Dec 2020 08:04:22 +0000 Subject: [PATCH v3 15/20] merge-ort: step 3 of tree writing -- handling subdirectories as we go Fcc: Sent MIME-Version: 1.0 To: 
git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren

Our order for processing of entries means that if we have a tree of
files that looks like

    Makefile
    src/moduleA/foo.c
    src/moduleA/bar.c
    src/moduleB/baz.c
    src/moduleB/umm.c
    tokens.txt

Then we will process paths in the order of the leftmost column below. I
have added two additional columns that help explain the algorithm that
follows; the 2nd column is there to remind us we have oid & mode info we
are tracking for each of these paths (which differs between the paths
which I'm not representing well here), and the third column annotates
the parent directory of the entry:

    tokens.txt           <version_info>    ""
    src/moduleB/umm.c    <version_info>    src/moduleB
    src/moduleB/baz.c    <version_info>    src/moduleB
    src/moduleB          <version_info>    src
    src/moduleA/foo.c    <version_info>    src/moduleA
    src/moduleA/bar.c    <version_info>    src/moduleA
    src/moduleA          <version_info>    src
    src                  <version_info>    ""
    Makefile             <version_info>    ""

When the parent directory changes, if it's a subdirectory of the
previous parent directory (e.g. "" -> src/moduleB) then we can just keep
appending. If the parent directory differs from the previous parent
directory and is not a subdirectory, then we should process that
directory.

So, for example, when we get to this point:

    tokens.txt           <version_info>    ""
    src/moduleB/umm.c    <version_info>    src/moduleB
    src/moduleB/baz.c    <version_info>    src/moduleB

and note that the next entry (src/moduleB) has a different parent than
the last one that isn't a subdirectory, we should write out a tree for it

    100644 blob <HASH> umm.c
    100644 blob <HASH> baz.c

then pop all the entries under that directory while recording the new
hash for that directory, leaving us with

    tokens.txt           <version_info>    ""
    src/moduleB          <version_info>    src

This process repeats until at the end we get to

    tokens.txt           <version_info>    ""
    src                  <version_info>    ""
    Makefile             <version_info>    ""

and then we can write out the toplevel tree. Since we potentially have
entries in our string_list corresponding to multiple different toplevel
directories, e.g. a slightly different repository might have:

    whizbang.txt         <version_info>    ""
    tokens.txt           <version_info>    ""
    src/moduleD          <version_info>    src
    src/moduleC          <version_info>    src
    src/moduleB          <version_info>    src
    src/moduleA/foo.c    <version_info>    src/moduleA
    src/moduleA/bar.c    <version_info>    src/moduleA

When src/moduleA is popped off, we need to know that the "last
directory" reverts back to src, and how many entries in our string_list
are associated with that parent directory. So I use an auxiliary offsets
string_list which would have (parent_directory,offset) information of
the form

    ""             0
    src            2
    src/moduleA    5

Whenever I write out a tree for a subdirectory, I set versions.nr to
the final offset value and then decrement offsets.nr...and then add
an entry to versions with a hash for the new directory.

The idea is relatively simple, there's just a lot of accounting to
implement this.

Signed-off-by: Elijah Newren
---
merge-ort.c | 242 ++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 234 insertions(+), 8 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index f7041cfeac4..a7b0df8cb08 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -521,7 +521,46 @@ static int string_list_df_name_compare(const char *one, const char *two) } struct directory_versions { + /* + * versions: list of (basename -> version_info) + * + * The basenames are in reverse lexicographic order of full pathnames, + * as processed in process_entries(). This puts all entries within + * a directory together, and covers the directory itself after + * everything within it, allowing us to write subtrees before needing + * to record information for the tree itself. + */ struct string_list versions; + + /* + * offsets: list of (full relative path directories -> integer offsets) + * + * Since versions contains basenames from files in multiple different + * directories, we need to know which entries in versions correspond + * to which directories. Values of e.g.
+ * "" 0 + * src 2 + * src/moduleA 5 + * Would mean that entries 0-1 of versions are files in the toplevel + * directory, entries 2-4 are files under src/, and the remaining + * entries starting at index 5 are files under src/moduleA/. + */ + struct string_list offsets; + + /* + * last_directory: directory that previously processed file found in + * + * last_directory starts NULL, but records the directory in which the + * previous file was found within. As soon as + * directory(current_file) != last_directory + * then we need to start updating accounting in versions & offsets. + * Note that last_directory is always the last path in "offsets" (or + * NULL if "offsets" is empty) so this exists just for quick access. + */ + const char *last_directory; + + /* last_directory_len: cached computation of strlen(last_directory) */ + unsigned last_directory_len; }; static int tree_entry_order(const void *a_, const void *b_) @@ -596,6 +635,181 @@ static void record_entry_for_tree(struct directory_versions *dir_metadata, basename)->util = &mi->result; } +static void write_completed_directory(struct merge_options *opt, + const char *new_directory_name, + struct directory_versions *info) +{ + const char *prev_dir; + struct merged_info *dir_info = NULL; + unsigned int offset; + + /* + * Some explanation of info->versions and info->offsets... + * + * process_entries() iterates over all relevant files AND + * directories in reverse lexicographic order, and calls this + * function. 
Thus, an example of the paths that process_entries() + * could operate on (along with the directories for those paths + * being shown) is: + * + * xtract.c "" + * tokens.txt "" + * src/moduleB/umm.c src/moduleB + * src/moduleB/stuff.h src/moduleB + * src/moduleB/baz.c src/moduleB + * src/moduleB src + * src/moduleA/foo.c src/moduleA + * src/moduleA/bar.c src/moduleA + * src/moduleA src + * src "" + * Makefile "" + * + * info->versions: + * + * always contains the unprocessed entries and their + * version_info information. For example, after the first five + * entries above, info->versions would be: + * + * xtract.c + * tokens.txt + * umm.c + * stuff.h + * baz.c + * + * Once a subdirectory is completed we remove the entries in + * that subdirectory from info->versions, writing it as a tree + * (write_tree()). Thus, as soon as we get to src/moduleB, + * info->versions would be updated to + * + * xtract.c + * tokens.txt + * moduleB + * + * info->offsets: + * + * helps us track which entries in info->versions correspond to + * which directories. When we are N directories deep (e.g. 4 + * for src/modA/submod/subdir/), we have up to N+1 unprocessed + * directories (+1 because of toplevel dir). Corresponding to + * the info->versions example above, after processing five entries + * info->offsets will be: + * + * "" 0 + * src/moduleB 2 + * + * which is used to know that xtract.c & tokens.txt are from the + * toplevel directory, while umm.c & stuff.h & baz.c are from the + * src/moduleB directory. Again, following the example above, + * once we need to process src/moduleB, then info->offsets is + * updated to + * + * "" 0 + * src 2 + * + * which says that moduleB (and only moduleB so far) is in the + * src directory. + * + * One unique thing to note about info->offsets here is that + * "src" was not added to info->offsets until there was a path + * (a file OR directory) immediately below src/ that got + * processed.
+ * + * Since process_entry() just appends new entries to info->versions, + * write_completed_directory() only needs to do work if the next path + * is in a directory that is different than the last directory found + * in info->offsets. + */ + + /* + * If we are working with the same directory as the last entry, there + * is no work to do. (See comments above the directory_name member of + * struct merged_info for why we can use pointer comparison instead of + * strcmp here.) + */ + if (new_directory_name == info->last_directory) + return; + + /* + * If we are just starting (last_directory is NULL), or last_directory + * is a prefix of the current directory, then we can just update + * info->offsets to record the offset where we started this directory + * and update last_directory to have quick access to it. + */ + if (info->last_directory == NULL || + !strncmp(new_directory_name, info->last_directory, + info->last_directory_len)) { + uintptr_t offset = info->versions.nr; + + info->last_directory = new_directory_name; + info->last_directory_len = strlen(info->last_directory); + /* + * Record the offset into info->versions where we will + * start recording basenames of paths found within + * new_directory_name. + */ + string_list_append(&info->offsets, + info->last_directory)->util = (void*)offset; + return; + } + + /* + * The next entry that will be processed will be within + * new_directory_name. Since at this point we know that + * new_directory_name is within a different directory than + * info->last_directory, we have all entries for info->last_directory + * in info->versions and we need to create a tree object for them. + */ + dir_info = strmap_get(&opt->priv->paths, info->last_directory); + assert(dir_info); + offset = (uintptr_t)info->offsets.items[info->offsets.nr-1].util; + if (offset == info->versions.nr) { + /* + * Actually, we don't need to create a tree object in this + * case. Whenever all files within a directory disappear + * during the merge (e.g. 
unmodified on one side and + * deleted on the other, or files were renamed elsewhere), + * then we get here and the directory itself needs to be + * omitted from its parent tree as well. + */ + dir_info->is_null = 1; + } else { + /* + * Write out the tree to the git object directory, and also + * record the mode and oid in dir_info->result. + */ + dir_info->is_null = 0; + dir_info->result.mode = S_IFDIR; + write_tree(&dir_info->result.oid, &info->versions, offset, + opt->repo->hash_algo->rawsz); + } + + /* + * We've now used several entries from info->versions and one entry + * from info->offsets, so we get rid of those values. + */ + info->offsets.nr--; + info->versions.nr = offset; + + /* + * Now we've taken care of the completed directory, but we need to + * prepare things since future entries will be in + * new_directory_name. (In particular, process_entry() will be + * appending new entries to info->versions.) So, we need to make + * sure new_directory_name is the last entry in info->offsets. + */ + prev_dir = info->offsets.nr == 0 ? NULL : + info->offsets.items[info->offsets.nr-1].string; + if (new_directory_name != prev_dir) { + uintptr_t c = info->versions.nr; + string_list_append(&info->offsets, + new_directory_name)->util = (void*)c; + } + + /* And, of course, we need to update last_directory to match. 
*/ + info->last_directory = new_directory_name; + info->last_directory_len = strlen(info->last_directory); +} + /* Per entry merge function */ static void process_entry(struct merge_options *opt, const char *path, @@ -694,7 +908,9 @@ static void process_entries(struct merge_options *opt, struct strmap_entry *e; struct string_list plist = STRING_LIST_INIT_NODUP; struct string_list_item *entry; - struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP }; + struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP, + STRING_LIST_INIT_NODUP, + NULL, 0 }; if (strmap_empty(&opt->priv->paths)) { oidcpy(result_oid, opt->repo->hash_algo->empty_tree); @@ -714,6 +930,11 @@ static void process_entries(struct merge_options *opt, /* * Iterate over the items in reverse order, so we can handle paths * below a directory before needing to handle the directory itself. + * + * This allows us to write subtrees before we need to write trees, + * and it also enables sane handling of directory/file conflicts + * (because it allows us to know whether the directory is still in + * the way when it is time to process the file at the same path). */ for (entry = &plist.items[plist.nr-1]; entry >= plist.items; --entry) { char *path = entry->string; @@ -724,6 +945,8 @@ static void process_entries(struct merge_options *opt, */ struct merged_info *mi = entry->util; + write_completed_directory(opt, mi->directory_name, + &dir_metadata); if (mi->clean) record_entry_for_tree(&dir_metadata, path, mi); else { @@ -732,17 +955,20 @@ static void process_entries(struct merge_options *opt, } } - /* - * TODO: We can't actually write a tree yet, because dir_metadata just - * contains all basenames of all files throughout the tree with their - * mode and hash. Not only is that a nonsensical tree, it will have - * lots of duplicates for paths such as "Makefile" or ".gitignore". 
- */ - die("Not yet implemented; need to process subtrees separately"); + if (dir_metadata.offsets.nr != 1 || + (uintptr_t)dir_metadata.offsets.items[0].util != 0) { + printf("dir_metadata.offsets.nr = %d (should be 1)\n", + dir_metadata.offsets.nr); + printf("dir_metadata.offsets.items[0].util = %u (should be 0)\n", + (unsigned)(uintptr_t)dir_metadata.offsets.items[0].util); + fflush(stdout); + BUG("dir_metadata accounting completely off; shouldn't happen"); + } write_tree(result_oid, &dir_metadata.versions, 0, opt->repo->hash_algo->rawsz); string_list_clear(&plist, 0); string_list_clear(&dir_metadata.versions, 0); + string_list_clear(&dir_metadata.offsets, 0); } void merge_switch_to_result(struct merge_options *opt, From patchwork Sun Dec 13 08:04:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11970517 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 63475C433FE for ; Sun, 13 Dec 2020 08:07:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1CDFE22D72 for ; Sun, 13 Dec 2020 08:07:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403917AbgLMIG4 (ORCPT ); Sun, 13 Dec 2020 03:06:56 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48370 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2393872AbgLMIGI (ORCPT ); Sun, 
13 Dec 2020 03:06:08 -0500 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CC065C0611CB for ; Sun, 13 Dec 2020 00:04:44 -0800 (PST) Received: by mail-wm1-x343.google.com with SMTP id w206so7089790wma.0 for ; Sun, 13 Dec 2020 00:04:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=q30DZV4Ul5zUbF1Xb7wjc7z6AvKBgAzBlBhRSeesxD8=; b=EgRTK++EToSAKkkIM3mo0C3/5mMKnrcJEvN/Ps0J6i6ZT6jlVsOyttX9JHuKUEohWl Z2aBF9VXIzBtWop02rF1qxWDbVVsgJddlfvMjrv849NQfA7lDEvV6WKKwIlBIrVReduI uMy5rNh3vF3aKaEVG1xTE+nWcv5zs9+JFisdizFRK6uRb9VViyAAAjblmbWVa9Rs47vK at5A5o76j1dgazbpFTzRknwhO+rVs8tEtTy4xe4ulSFnogcMCOWb8Img0mwtTZVpwYft eBYKo0pDmuN3Md+oACviP+U5t9vzDuASsN7bIoCbGKtH2kJtmFCVCzvYwl1S/y6Cip+n t6zg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=q30DZV4Ul5zUbF1Xb7wjc7z6AvKBgAzBlBhRSeesxD8=; b=TqRSMSwbXma64G8/a95/MWNEqG6YvFnt3ZRzy22GYK2IJNOm/2J30De0Yh6mDdCa6w VCJTqs4Z+M4lzMFpH5ls9rpLB6G1GDkFD/FgbUSjnuRSGDvmIVM4Sp/bzkDstiqK/wZ7 au9xNv+LX/8YMVX/LgBqKRC6a/P+/NEwSKzdn6U4xtYCv0YudWgzDtGc+iPQU5sypO4J ogReyS/IXNIRyiiMfgOTa210ldt28fRRp0I/Ee1Q3a5RfClBi3whlTDye6pUxMdBU5/G aUmiQrK1tTSaJa+AJk1UK2eshFeylLjQHqZk5m4r7sKSamBcGE8f+6jkJIBmvyXZ92vf yX0g== X-Gm-Message-State: AOAM5318iTt7t5Tcfw4bU9Dup+cQpMw8Y4202re0vtLSDUveis+KeX6s F4qxfo1W9MCX+8DejNGZTfJ2Ngz3/oA= X-Google-Smtp-Source: ABdhPJw/ZRhdM6VJP68M0TDtshNaCgWmsUL9UZq0OzriyecQIsrB2onXXruZy5WuHhRXa7ZrZWkFYA== X-Received: by 2002:a1c:4684:: with SMTP id t126mr21945638wma.165.1607846683408; Sun, 13 Dec 2020 00:04:43 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h16sm19128910wrq.29.2020.12.13.00.04.42 
(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Dec 2020 00:04:42 -0800 (PST) Message-Id: <3198efe31882cac93af2d0625b4872e52697caf4.1607846667.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 13 Dec 2020 08:04:23 +0000 Subject: [PATCH v3 16/20] merge-ort: basic outline for merge_switch_to_result() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren This adds a basic implementation for merge_switch_to_result(), though just in terms of a few new empty functions that will be defined in subsequent commits. Signed-off-by: Elijah Newren --- merge-ort.c | 42 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index a7b0df8cb08..ee7fbe71404 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -971,13 +971,53 @@ static void process_entries(struct merge_options *opt, string_list_clear(&dir_metadata.offsets, 0); } +static int checkout(struct merge_options *opt, + struct tree *prev, + struct tree *next) +{ + die("Not yet implemented."); +} + +static int record_conflicted_index_entries(struct merge_options *opt, + struct index_state *index, + struct strmap *paths, + struct strmap *conflicted) +{ + if (strmap_empty(conflicted)) + return 0; + + die("Not yet implemented."); +} + void merge_switch_to_result(struct merge_options *opt, struct tree *head, struct merge_result *result, int update_worktree_and_index, int display_update_msgs) { - die("Not yet implemented"); + assert(opt->priv == NULL); + if (result->clean >= 0 && update_worktree_and_index) { + struct merge_options_internal *opti = result->priv; + + if (checkout(opt, head, result->tree)) { + /* failure to function */ + result->clean = -1; + return; + } + + if 
(record_conflicted_index_entries(opt, opt->repo->index, + &opti->paths, + &opti->conflicted)) { + /* failure to function */ + result->clean = -1; + return; + } + } + + if (display_update_msgs) { + /* TODO: print out CONFLICT and other informational messages. */ + } + merge_finalize(opt, result); } From patchwork Sun Dec 13 08:04:24 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11970525 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6FAF1C4361B for ; Sun, 13 Dec 2020 08:07:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 18C0622D72 for ; Sun, 13 Dec 2020 08:07:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2403970AbgLMIHb (ORCPT ); Sun, 13 Dec 2020 03:07:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48364 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2393956AbgLMIGI (ORCPT ); Sun, 13 Dec 2020 03:06:08 -0500 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BABD6C0611CC for ; Sun, 13 Dec 2020 00:04:45 -0800 (PST) Received: by mail-wr1-x442.google.com with SMTP id 91so13250259wrj.7 for ; Sun, 13 Dec 2020 00:04:45 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; 
h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=oqhHjw6EdnqKcciqU2lQHITMJpUz2e2Y8szWZFVr6pQ=; b=QQ7H/6l4+i5IVkpdm9fPYmjyKMydxZ19BiHJ55hZojoNsEQy4dFK34v2XxKP/rKvjZ VaSqB7QYnKQwJXJiodZ09ZglMuftaYZmpnaO8Ud7kHvXIl9m7vP4k94sZa+2DkiBvNoW y+EpNDsNeiPXCS8KwVFyEseRJ8kjiAl0mvwIO4b/6pRGBguDWzBJShFwpgzpgz4u7xls IqKasWxrj1SHOHI6vewAmG0+5eST5HGvNHPdDfnzjSjd/TP0ANww5DKQqq6rmu25ieEW wxydT1DJ9w6CiDGWKgk7WG1ToyaNTeLQPFbn3Q3mPiOk9y6e71oDgISOjWXcs2K86Vk5 McHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=oqhHjw6EdnqKcciqU2lQHITMJpUz2e2Y8szWZFVr6pQ=; b=TL6tEzqhwrQKPf7i3ht6UzHaWak1g7dTv1T4NQjhk6ydpNzocwNIui5n7yNkFIpP1E U3rwAKFQTEFxVSJhAghSMuh8rlOpprGuD10cMQztstL+9HE13bCKOe8QzGf3c1Z9IQCm 5a/lMEQO/N2lbj5yX6SzvMsfxb7UBj2iH7R8B+1TAp3qjw6MQV3rRWdqQpO/XzFqSjzJ cG7waUH87OalmkP8YrUqSMFrVq6GrPGe9uOmOyXCGjpAdZBUWneFsrMiNxtDceQ9CT1S Bxx8xubR5ZUlMkNE1KfsQzL2heCL6gigyBN/rzyFXJxsBeR9071XEVgr9c5Wj+32fMkD dQTQ== X-Gm-Message-State: AOAM530yOEh6GdCLzP1rNkSK692jhRRY5dffxYrEukgKuU6+M55Q2Jo0 feJDeKhhUiwZB03bkNJhsgSSgj2rcLE= X-Google-Smtp-Source: ABdhPJw/mQGLEdCA4IOUScg4J8++9o05AAQ2YJ4OhOFrCwHEv/Fii2v/fMS9LIeYbo8YHJ1/ZLpETQ== X-Received: by 2002:a5d:6909:: with SMTP id t9mr15791721wru.327.1607846684298; Sun, 13 Dec 2020 00:04:44 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s12sm2113767wmh.29.2020.12.13.00.04.43 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 13 Dec 2020 00:04:43 -0800 (PST) Message-Id: <119f40c77f8d44ff5d3a8b82d61678c06a690753.1607846667.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 13 Dec 2020 08:04:24 +0000 Subject: [PATCH v3 17/20] merge-ort: add implementation of checkout() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, 
Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Since merge-ort creates a tree for its output, when there are no conflicts, updating the working tree and index is as simple as using the unpack_trees() machinery with a twoway_merge (i.e. doing the equivalent of a "checkout" operation). If there were conflicts in the merge, then since the tree we created included all the conflict markers, using the unpack_trees() machinery in this manner will still update the working tree correctly. Further, all index entries corresponding to cleanly merged files will also be updated correctly by this procedure. Index entries corresponding to conflicted entries will appear as though the user had run "git add -u" after the merge to accept all files as-is with conflict markers. Thus, after running unpack_trees(), there needs to be a separate step for updating the entries in the index corresponding to conflicted files. This will be the job of the function record_conflicted_index_entries(), which will be implemented in a subsequent commit.
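As a rough mental model of the two-way step described above, and only that: the sketch below walks two path-sorted listings and reports what has to be removed, updated, or added to get from prev to next. It is a toy, not unpack_trees() or twoway_merge(), which also consult the index and working tree before touching anything. struct listing_entry, struct step and plan_checkout() are made-up names for illustration, and content_id stands in for an object id.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical flattened tree listing: sorted paths plus a content id. */
struct listing_entry {
	const char *path;
	int content_id;
};

enum action { ACT_ADD, ACT_UPDATE, ACT_REMOVE };

struct step {
	enum action act;
	const char *path;
};

/*
 * Walk two path-sorted listings in lockstep and record what moving
 * from `prev` to `next` requires. `out` must have room for
 * nprev + nnext steps. Returns the number of steps written;
 * unchanged paths produce none.
 */
static size_t plan_checkout(const struct listing_entry *prev, size_t nprev,
			    const struct listing_entry *next, size_t nnext,
			    struct step *out)
{
	size_t i = 0, j = 0, n = 0;

	assert(out != NULL);
	while (i < nprev || j < nnext) {
		int cmp;

		if (i == nprev)
			cmp = 1;	/* only `next` has entries left */
		else if (j == nnext)
			cmp = -1;	/* only `prev` has entries left */
		else
			cmp = strcmp(prev[i].path, next[j].path);

		if (cmp < 0) {
			/* Path exists only in prev: it goes away. */
			out[n].act = ACT_REMOVE;
			out[n].path = prev[i].path;
			n++;
			i++;
		} else if (cmp > 0) {
			/* Path exists only in next: it is new. */
			out[n].act = ACT_ADD;
			out[n].path = next[j].path;
			n++;
			j++;
		} else {
			/* Same path on both sides: update only if changed. */
			if (prev[i].content_id != next[j].content_id) {
				out[n].act = ACT_UPDATE;
				out[n].path = next[j].path;
				n++;
			}
			i++;
			j++;
		}
	}
	return n;
}
```

In this model a conflicted file is just another updated path, which mirrors the point made above: the index would still need a separate pass afterwards to record such paths as conflicted.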
Signed-off-by: Elijah Newren
---
 merge-ort.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index ee7fbe71404..3c4f64e2675 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -19,9 +19,11 @@
 
 #include "diff.h"
 #include "diffcore.h"
+#include "dir.h"
 #include "object-store.h"
 #include "strmap.h"
 #include "tree.h"
+#include "unpack-trees.h"
 #include "xdiff-interface.h"
 
 /*
@@ -975,7 +977,48 @@ static int checkout(struct merge_options *opt,
 		    struct tree *prev,
 		    struct tree *next)
 {
-	die("Not yet implemented.");
+	/* Switch the index/working copy from old to new */
+	int ret;
+	struct tree_desc trees[2];
+	struct unpack_trees_options unpack_opts;
+
+	memset(&unpack_opts, 0, sizeof(unpack_opts));
+	unpack_opts.head_idx = -1;
+	unpack_opts.src_index = opt->repo->index;
+	unpack_opts.dst_index = opt->repo->index;
+
+	setup_unpack_trees_porcelain(&unpack_opts, "merge");
+
+	/*
+	 * NOTE: if this were just "git checkout" code, we would probably
+	 * read or refresh the cache and check for a conflicted index, but
+	 * builtin/merge.c or sequencer.c really needs to read the index
+	 * and check for conflicted entries before starting merging for a
+	 * good user experience (no sense waiting for merges/rebases before
+	 * erroring out), so there's no reason to duplicate that work here.
+	 */
+
+	/* 2-way merge to the new branch */
+	unpack_opts.update = 1;
+	unpack_opts.merge = 1;
+	unpack_opts.quiet = 0; /* FIXME: sequencer might want quiet? */
+	unpack_opts.verbose_update = (opt->verbosity > 2);
+	unpack_opts.fn = twoway_merge;
+	if (1/* FIXME: opts->overwrite_ignore*/) {
+		unpack_opts.dir = xcalloc(1, sizeof(*unpack_opts.dir));
+		unpack_opts.dir->flags |= DIR_SHOW_IGNORED;
+		setup_standard_excludes(unpack_opts.dir);
+	}
+	parse_tree(prev);
+	init_tree_desc(&trees[0], prev->buffer, prev->size);
+	parse_tree(next);
+	init_tree_desc(&trees[1], next->buffer, next->size);
+
+	ret = unpack_trees(2, trees, &unpack_opts);
+	clear_unpack_trees_porcelain(&unpack_opts);
+	dir_clear(unpack_opts.dir);
+	FREE_AND_NULL(unpack_opts.dir);
+	return ret;
 }
 
 static int record_conflicted_index_entries(struct merge_options *opt,

From patchwork Sun Dec 13 08:04:25 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970521
Date: Sun, 13 Dec 2020 08:04:25 +0000
Subject: [PATCH v3 18/20] tree: enable cmp_cache_name_compare() to be used elsewhere
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

Signed-off-by: Elijah Newren
---
 tree.c | 2 +-
 tree.h | 2 ++
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/tree.c b/tree.c
index e76517f6b18..a52479812ce 100644
--- a/tree.c
+++ b/tree.c
@@ -144,7 +144,7 @@ int read_tree_recursive(struct repository *r,
 	return ret;
 }
 
-static int cmp_cache_name_compare(const void *a_, const void *b_)
+int cmp_cache_name_compare(const void *a_, const void *b_)
 {
 	const struct cache_entry *ce1, *ce2;
 
diff --git a/tree.h b/tree.h
index 93837450739..3eb0484cbf2 100644
--- a/tree.h
+++ b/tree.h
@@ -28,6 +28,8 @@ void free_tree_buffer(struct tree *tree);
 /* Parses and returns the tree in the given ent, chasing tags and commits. */
 struct tree *parse_tree_indirect(const struct object_id *oid);
 
+int cmp_cache_name_compare(const void *a_, const void *b_);
+
 #define READ_TREE_RECURSIVE 1
 typedef int (*read_tree_fn_t)(const struct object_id *, struct strbuf *, const char *, unsigned int, int, void *);
 

From patchwork Sun Dec 13 08:04:26 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970527
Date: Sun, 13 Dec 2020 08:04:26 +0000
Subject: [PATCH v3 19/20] merge-ort: add implementation of record_conflicted_index_entries()
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

After checkout(), the working tree has the appropriate contents, and the
index matches the working copy. That means that all unmodified and
cleanly merged files have correct index entries, but conflicted entries
need to be updated.

We do this by looping over the conflicted entries, marking the existing
index entry for the path with CE_REMOVE, adding new higher-order stage
entries for the path at the end of the index (ignoring normal index sort
order), and then at the end of the loop removing the CE_REMOVE-marked
cache entries and sorting the index.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 87 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 3c4f64e2675..47cd772e805 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,6 +17,7 @@
 #include "cache.h"
 #include "merge-ort.h"
 
+#include "cache-tree.h"
 #include "diff.h"
 #include "diffcore.h"
 #include "dir.h"
@@ -1026,10 +1027,95 @@ static int record_conflicted_index_entries(struct merge_options *opt,
 					   struct strmap *paths,
 					   struct strmap *conflicted)
 {
+	struct hashmap_iter iter;
+	struct strmap_entry *e;
+	int errs = 0;
+	int original_cache_nr;
+
 	if (strmap_empty(conflicted))
 		return 0;
 
-	die("Not yet implemented.");
+	original_cache_nr = index->cache_nr;
+
+	/* Put every entry from paths into plist, then sort */
+	strmap_for_each_entry(conflicted, &iter, e) {
+		const char *path = e->key;
+		struct conflict_info *ci = e->value;
+		int pos;
+		struct cache_entry *ce;
+		int i;
+
+		VERIFY_CI(ci);
+
+		/*
+		 * The index will already have a stage=0 entry for this path,
+		 * because we created an as-merged-as-possible version of the
+		 * file and checkout() moved the working copy and index over
+		 * to that version.
+		 *
+		 * However, previous iterations through this loop will have
+		 * added unstaged entries to the end of the cache which
+		 * ignore the standard alphabetical ordering of cache
+		 * entries and break invariants needed for index_name_pos()
+		 * to work. However, we know the entry we want is before
+		 * those appended cache entries, so do a temporary swap on
+		 * cache_nr to only look through entries of interest.
+		 */
+		SWAP(index->cache_nr, original_cache_nr);
+		pos = index_name_pos(index, path, strlen(path));
+		SWAP(index->cache_nr, original_cache_nr);
+		if (pos < 0) {
+			if (ci->filemask != 1)
+				BUG("Conflicted %s but nothing in basic working tree or index; this shouldn't happen", path);
+			cache_tree_invalidate_path(index, path);
+		} else {
+			ce = index->cache[pos];
+
+			/*
+			 * Clean paths with CE_SKIP_WORKTREE set will not be
+			 * written to the working tree by the unpack_trees()
+			 * call in checkout(). Our conflicted entries would
+			 * have appeared clean to that code since we ignored
+			 * the higher order stages. Thus, we need to override
+			 * the CE_SKIP_WORKTREE bit and manually write those
+			 * files to the working disk here.
+			 *
+			 * TODO: Implement this CE_SKIP_WORKTREE fixup.
+			 */
+
+			/*
+			 * Mark this cache entry for removal and instead add
+			 * new stage>0 entries corresponding to the
+			 * conflicts. If there are many conflicted entries, we
+			 * want to avoid memmove'ing O(NM) entries by
+			 * inserting the new entries one at a time. So,
+			 * instead, we just add the new cache entries to the
+			 * end (ignoring normal index requirements on sort
+			 * order) and sort the index once we're all done.
+			 */
+			ce->ce_flags |= CE_REMOVE;
+		}
+
+		for (i = MERGE_BASE; i <= MERGE_SIDE2; i++) {
+			struct version_info *vi;
+			if (!(ci->filemask & (1ul << i)))
+				continue;
+			vi = &ci->stages[i];
+			ce = make_cache_entry(index, vi->mode, &vi->oid,
+					      path, i+1, 0);
+			add_index_entry(index, ce, ADD_CACHE_JUST_APPEND);
+		}
+	}
+
+	/*
+	 * Remove the unused cache entries (and invalidate the relevant
+	 * cache-trees), then sort the index entries to get the conflicted
+	 * entries we added to the end into their right locations.
+	 */
+	remove_marked_cache_entries(index, 1);
+	QSORT(index->cache, index->cache_nr, cmp_cache_name_compare);
+
+	return errs;
 }
 
 void merge_switch_to_result(struct merge_options *opt,

From patchwork Sun Dec 13 08:04:27 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11970529
Message-Id: <55451a79eecf984d7adf54a324378227ea95e9d6.1607846667.git.gitgitgadget@gmail.com>
Date: Sun, 13 Dec 2020 08:04:27 +0000
Subject: [PATCH v3 20/20] merge-ort: free data structures in merge_finalize()
To: git@vger.kernel.org
Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren,
	Ævar Arnfjörð Bjarmason, Elijah Newren, Elijah Newren
Precedence: bulk
X-Mailing-List: git@vger.kernel.org
From: Elijah Newren

From: Elijah Newren

Signed-off-by: Elijah Newren
---
 merge-ort.c | 32 +++++++++++++++++++++++++++++++-
 1 file changed, 31 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 47cd772e805..51b049358e4 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -209,6 +209,16 @@ struct conflict_info {
 	assert((ci) && !(mi)->clean); \
 } while (0)
 
+static void free_strmap_strings(struct strmap *map)
+{
+	struct hashmap_iter iter;
+	struct strmap_entry *entry;
+
+	strmap_for_each_entry(map, &iter, entry) {
+		free((char*)entry->key);
+	}
+}
+
 static int err(struct merge_options *opt, const char *err, ...)
 {
 	va_list params;
@@ -1153,7 +1163,27 @@ void merge_switch_to_result(struct merge_options *opt,
 void merge_finalize(struct merge_options *opt,
 		    struct merge_result *result)
 {
-	die("Not yet implemented");
+	struct merge_options_internal *opti = result->priv;
+
+	assert(opt->priv == NULL);
+
+	/*
+	 * We marked opti->paths with strdup_strings = 0, so that we
+	 * wouldn't have to make another copy of the fullpath created by
+	 * make_traverse_path from setup_path_info(). But, now that we've
+	 * used it and have no other references to these strings, it is time
+	 * to deallocate them.
+	 */
+	free_strmap_strings(&opti->paths);
+	strmap_clear(&opti->paths, 1);
+
+	/*
+	 * All keys and values in opti->conflicted are a subset of those in
+	 * opti->paths. We don't want to deallocate anything twice, so we
+	 * don't free the keys and we pass 0 for free_values.
+	 */
+	strmap_clear(&opti->conflicted, 0);
+	FREE_AND_NULL(opti);
 }
 
 static void merge_start(struct merge_options *opt, struct merge_result *result)
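
Editorial aside on patch 19/20: the "mark for removal, append out of
order, then compact and sort once" approach described in that commit
message can be tried outside of git. What follows is only a minimal
sketch under stated assumptions. struct toy_entry, toy_entry_cmp(),
toy_find() and toy_record_conflicts() are hypothetical names invented
for this illustration and are not part of git's API. The real code works
on struct cache_entry and uses index_name_pos(),
remove_marked_cache_entries() and QSORT() instead. The sketch also
assumes every conflicted path gets all three stages, whereas the patch
consults ci->filemask.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Toy stand-in for a cache entry; NOT git's struct cache_entry. */
struct toy_entry {
	const char *name;
	int stage;	/* 0 = merged, 1..3 = conflict stages */
	int remove;	/* plays the role of the CE_REMOVE flag */
};

/* Order by name, then by stage, the way a sorted index is ordered. */
static int toy_entry_cmp(const void *a_, const void *b_)
{
	const struct toy_entry *a = a_, *b = b_;
	int cmp = strcmp(a->name, b->name);
	return cmp ? cmp : a->stage - b->stage;
}

/* Binary search by name among the first nr (still sorted) entries. */
static int toy_find(const struct toy_entry *e, size_t nr, const char *name)
{
	size_t lo = 0, hi = nr;
	while (lo < hi) {
		size_t mid = lo + (hi - lo) / 2;
		int cmp = strcmp(e[mid].name, name);
		if (!cmp)
			return (int)mid;
		if (cmp < 0)
			lo = mid + 1;
		else
			hi = mid;
	}
	return -1;
}

/*
 * Replace the stage-0 entry of each conflicted path with stages 1..3.
 * The caller must provide room for nr + 3 * nr_conflicted entries.
 * Returns the new number of entries.
 */
static size_t toy_record_conflicts(struct toy_entry *e, size_t nr,
				   const char **conflicted,
				   size_t nr_conflicted)
{
	size_t original_nr = nr, i, out = 0;
	int stage;

	for (i = 0; i < nr_conflicted; i++) {
		/* Search only the sorted prefix, never the appended tail. */
		int pos = toy_find(e, original_nr, conflicted[i]);
		if (pos >= 0)
			e[pos].remove = 1;
		/* Append out of order; no memmove per insertion. */
		for (stage = 1; stage <= 3; stage++) {
			e[nr].name = conflicted[i];
			e[nr].stage = stage;
			e[nr].remove = 0;
			nr++;
		}
	}

	/* One compaction pass over the marked entries, then one sort. */
	for (i = 0; i < nr; i++)
		if (!e[i].remove)
			e[out++] = e[i];
	qsort(e, out, sizeof(*e), toy_entry_cmp);
	return out;
}
```

Limiting the lookup to the first original_nr entries plays the same role
as the temporary SWAP() on cache_nr in the patch: binary search is only
valid on the part of the array that is still sorted. Deferring the sort
means the cost is one compaction and one sort, rather than one shifting
insertion per conflicted stage.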