From patchwork Fri Dec 4 20:47:51 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 23918C4361A for ; Fri, 4 Dec 2020 20:49:03 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DD2DA22CF7 for ; Fri, 4 Dec 2020 20:49:02 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728582AbgLDUtC (ORCPT ); Fri, 4 Dec 2020 15:49:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39866 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727517AbgLDUtB (ORCPT ); Fri, 4 Dec 2020 15:49:01 -0500 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A24D0C061A51 for ; Fri, 4 Dec 2020 12:48:15 -0800 (PST) Received: by mail-wr1-x444.google.com with SMTP id g14so6543388wrm.13 for ; Fri, 04 Dec 2020 12:48:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=gy/Ul4mnZ7gWvitV4aXNcvdTXFp79UiZbsQWtGrz1KM=; b=urxVLgQmoJKDozlV0AM96VE/K1CS2d9UIfjl2y1iLGX4v+nE3KMvIH83Son7jNgrUV FbU+nbldYPGB3DIQQBSjorGXIdbp8I1ikttvJZCiIfePm3WYyJHVIXNcSzCOOyEYR0fY 60lzoDM6oapRbF/8NfD2QUBIPHJ6jMzrSkiXe4BRem023UpFIrad8A6RsD6dOERvd0rQ ZMxovkm4OjjApTVBn5wSGneTL6QVLJgJRkeSfO0UVpR8L2c21rukL1Y+9jEtuJH4aouD uG7vmfIRKoRbNebzoRpxu3w/F9yMfOaeUPhSF45ELPIsK9iD1xq+l4hTHazvMz7kGOKZ EgHA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=gy/Ul4mnZ7gWvitV4aXNcvdTXFp79UiZbsQWtGrz1KM=; b=K+tbiL5g9Lv14vLmtqHelbTysQR8ZhId7gUWzqbRnCfucNSEqEVNUqvBrpf3eAAIej W5LRxoBdjbP2SLXCBhUmPMv7jxvLiQpPcW6NQm29qi2GSOUSkDoaINp/ZJ8BrDa+sfPG GJ17DQWhJV6MBIwtd2fSCQpt5ijMbkazDPlyoQx8Kr/Z9nr0J7fYER3y8J6l3w5Yrrjw Q8phR7O0LTEWcZ33HCpQBHasB5GZ7qcavxSSMa3LkrriKsvdWgmop1FUtbrKyduXQDv3 3kg3iiQzh/Bd5FpD6VDjg2Ip0voqmCQ+HsiCNXKvAKZ0upYHp6MqzK5sgJW2pdMBbHM3 L6EA== X-Gm-Message-State: AOAM531XSBRRLe+324FOgpMGc23Ug7Id1XZFb1GPIUiSQElb5tBtjFkI /QdRsya3jR6kWqnGhujvQyHwYf/2UME= X-Google-Smtp-Source: ABdhPJyPag2sDf2PTPPpy+akGO9/T1Ni5yXiSDOxB9kxYHCiN3IqYog8a6o32oydTML18pMtCE3UoQ== X-Received: by 2002:adf:fa0c:: with SMTP id m12mr7060784wrr.222.1607114894081; Fri, 04 Dec 2020 12:48:14 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h204sm4723144wme.17.2020.12.04.12.48.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:13 -0800 (PST) Message-Id: <2568ec92c6d96dc51aff4a411900eaec8d32ce27.1607114890.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:47:51 +0000 Subject: [PATCH v2 01/20] merge-ort: setup basic internal data structures Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Set up some basic internal data structures. The only carry-over from merge-recursive.c is call_depth, though needed_rename_limit will be added later. The central piece of data will definitely be the strmap "paths", which will map every relevant pathname under consideration to either a merged_info or a conflict_info. ("conflicted" is a strmap that is a subset of "paths".) merged_info contains all relevant information for a non-conflicted entry. conflict_info contains a merged_info, plus any additional information about a conflict such as the higher orders stages involved and the names of the paths those came from (handy once renames get involved). If an entry remains conflicted, the merged_info portion of a conflict_info will later be filled with whatever version of the file should be placed in the working directory (e.g. an as-merged-as-possible variation that contains conflict markers). Signed-off-by: Elijah Newren --- merge-ort.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 137 insertions(+) diff --git a/merge-ort.c b/merge-ort.c index b487901d3e..bb37fdf838 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -17,6 +17,143 @@ #include "cache.h" #include "merge-ort.h" +#include "strmap.h" + +struct merge_options_internal { + /* + * paths: primary data structure in all of merge ort. + * + * The keys of paths: + * * are full relative paths from the toplevel of the repository + * (e.g. "drivers/firmware/raspberrypi.c"). + * * store all relevant paths in the repo, both directories and + * files (e.g. drivers, drivers/firmware would also be included) + * * these keys serve to intern all the path strings, which allows + * us to do pointer comparison on directory names instead of + * strcmp; we just have to be careful to use the interned strings. + * + * The values of paths: + * * either a pointer to a merged_info, or a conflict_info struct + * * merged_info contains all relevant information for a + * non-conflicted entry. + * * conflict_info contains a merged_info, plus any additional + * information about a conflict such as the higher orders stages + * involved and the names of the paths those came from (handy + * once renames get involved). + * * a path may start "conflicted" (i.e. point to a conflict_info) + * and then a later step (e.g. three-way content merge) determines + * it can be cleanly merged, at which point it'll be marked clean + * and the algorithm will ignore any data outside the contained + * merged_info for that entry + * * If an entry remains conflicted, the merged_info portion of a + * conflict_info will later be filled with whatever version of + * the file should be placed in the working directory (e.g. an + * as-merged-as-possible variation that contains conflict markers). + */ + struct strmap paths; + + /* + * conflicted: a subset of keys->values from "paths" + * + * conflicted is basically an optimization between process_entries() + * and record_conflicted_index_entries(); the latter could loop over + * ALL the entries in paths AGAIN and look for the ones that are + * still conflicted, but since process_entries() has to loop over + * all of them, it saves the ones it couldn't resolve in this strmap + * so that record_conflicted_index_entries() can iterate just the + * relevant entries. + */ + struct strmap conflicted; + + /* + * current_dir_name: temporary var used in collect_merge_info_callback() + * + * Used to set merged_info.directory_name; see documentation for that + * variable and the requirements placed on that field. + */ + const char *current_dir_name; + + /* call_depth: recursion level counter for merging merge bases */ + int call_depth; +}; + +struct version_info { + struct object_id oid; + unsigned short mode; +}; + +struct merged_info { + /* if is_null, ignore result. otherwise result has oid & mode */ + struct version_info result; + unsigned is_null:1; + + /* + * clean: whether the path in question is cleanly merged. + * + * see conflict_info.merged for more details. + */ + unsigned clean:1; + + /* + * basename_offset: offset of basename of path. + * + * perf optimization to avoid recomputing offset of final '/' + * character in pathname (0 if no '/' in pathname). + */ + size_t basename_offset; + + /* + * directory_name: containing directory name. + * + * Note that we assume directory_name is constructed such that + * strcmp(dir1_name, dir2_name) == 0 iff dir1_name == dir2_name, + * i.e. string equality is equivalent to pointer equality. For this + * to hold, we have to be careful setting directory_name. + */ + const char *directory_name; +}; + +struct conflict_info { + /* + * merged: the version of the path that will be written to working tree + * + * WARNING: It is critical to check merged.clean and ensure it is 0 + * before reading any conflict_info fields outside of merged. + * Allocated merge_info structs will always have clean set to 1. + * Allocated conflict_info structs will have merged.clean set to 0 + * initially. The merged.clean field is how we know if it is safe + * to access other parts of conflict_info besides merged; if a + * conflict_info's merged.clean is changed to 1, the rest of the + * algorithm is not allowed to look at anything outside of the + * merged member anymore. + */ + struct merged_info merged; + + /* oids & modes from each of the three trees for this path */ + struct version_info stages[3]; + + /* pathnames for each stage; may differ due to rename detection */ + const char *pathnames[3]; + + /* Whether this path is/was involved in a directory/file conflict */ + unsigned df_conflict:1; + + /* + * For filemask and dirmask, see tree-walk.h's struct traverse_info, + * particularly the documentation above the "fn" member. Note that + * filemask = mask & ~dirmask from that documentation. + */ + unsigned filemask:3; + unsigned dirmask:3; + + /* + * Optimization to track which stages match, to avoid the need to + * recompute it in multiple steps. Either 0 or at least 2 bits are + * set; if at least 2 bits are set, their corresponding stages match. + */ + unsigned match_mask:3; +}; + void merge_switch_to_result(struct merge_options *opt, struct tree *head, struct merge_result *result, From patchwork Fri Dec 4 20:47:52 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 90748C4361B for ; Fri, 4 Dec 2020 20:48:58 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4DFFA22CF6 for ; Fri, 4 Dec 2020 20:48:58 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727916AbgLDUs5 (ORCPT ); Fri, 4 Dec 2020 15:48:57 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39872 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727517AbgLDUs5 (ORCPT ); Fri, 4 Dec 2020 15:48:57 -0500 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 92532C061A52 for ; Fri, 4 Dec 2020 12:48:16 -0800 (PST) Received: by mail-wr1-x441.google.com with SMTP id g14so6543411wrm.13 for ; Fri, 04 Dec 2020 12:48:16 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=s0PXW8ogrXFRO0LdlzBeU+/0bXqVOha+G8onxIjwhho=; b=meN5ca2q6+G5prn42glSifAyQsEmnZsRCjF+XJ8zkhhHylVr7TYAkYtyZD7q+0h15c qvDrmVybTFl/uLXmX0LiTwDfrS+GUjH1swDRCEWimT477dwhbHnRs1gTlMqyX9wWTvi3 9WfH9SSI0dDveyvbvJdC5K8NGoFTpKalqwjm/UZVY3vSWMrbHhgtHLpaSljz3AJj8gFV cdzWgc5WokJn3ideHTKYNHioPGC3romqwGyfBLGtGfLDDQaYyNolpIIGWmKQP7yLf0ew f1lz+lfo+TGQT4rD3YmO2LkVP8XRE+HRtBkvH0P9CHTra27WcR03lMm/L5Cx6GRMLfkm Hqpw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=s0PXW8ogrXFRO0LdlzBeU+/0bXqVOha+G8onxIjwhho=; b=IolnxFpo67LUylOyQ5JwMKsVtfuXg0nPAhx/av+bnaSEQ3BjzjMLA/mvcH+seBHAcf wDtz5xN/MUCiaTASbBnZkYvdgiKR+z67Ch2SoEZ0OO5hadMQNJqndbt1DRwPNOqjeyrt Y7Hsksdd6YmTWCgkgWYuIUdZawf6zYiZzRInIeJ9BTAfnrrZ6AJkXKlIx13vDQOKbD7w QgObIqefR4w7o5s6Mg/DFHnCeqm599Va2Y+DPK8gg40DcqOMWdcpqrQ/4FW2/Jk7ziuV 6t5fg9OVD4l9C9QhYROSr681dC9yXMEBlp9mroaP7txBjvHB3XRIaL/PNTZR/N9BB0EM TDLg== X-Gm-Message-State: AOAM532Z0ZuePaqqnKjuz3/YwQo/bFu3SPRkqgGsRh156O9FFQUOIbot ZN+t2JNmqdquhVK7FvQqqmtZNxjLdMg= X-Google-Smtp-Source: ABdhPJxBJhSGyL2rwTUwCArEFIU/QJ9bRZ/j45ScYb8cKQDikR2a7RzRClGkK7+ua0aicII/XkBW5Q== X-Received: by 2002:adf:ed89:: with SMTP id c9mr5858037wro.4.1607114895134; Fri, 04 Dec 2020 12:48:15 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a15sm5183391wrn.75.2020.12.04.12.48.14 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:14 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 04 Dec 2020 20:47:52 +0000 Subject: [PATCH v2 02/20] merge-ort: add some high-level algorithm structure Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren merge_ort_nonrecursive_internal() will be used by both merge_inmemory_nonrecursive() and merge_inmemory_recursive(); let's focus on it for now. It involves some setup -- merge_start() -- followed by the following chain of functions: collect_merge_info() This function will populate merge_options_internal's paths field, via a call to traverse_trees() and a new callback that will be added later. detect_and_process_renames() This function will detect renames, and then adjust entries in paths to move conflict stages from old pathnames into those for new pathnames, so that the next step doesn't have to think about renames and just can do three-way content merging and such. process_entries() This function determines how to take the various stages (versions of a file from the three different sides) and merge them, and whether to mark the result as conflicted or cleanly merged. It also writes out these merged file versions as it goes to create a tree. Signed-off-by: Elijah Newren --- merge-ort.c | 68 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 67 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index bb37fdf838..8c9fea1a5a 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -18,6 +18,7 @@ #include "merge-ort.h" #include "strmap.h" +#include "tree.h" struct merge_options_internal { /* @@ -154,6 +155,38 @@ struct conflict_info { unsigned match_mask:3; }; +static int collect_merge_info(struct merge_options *opt, + struct tree *merge_base, + struct tree *side1, + struct tree *side2) +{ + /* TODO: Implement this using traverse_trees() */ + die("Not yet implemented."); +} + +static int detect_and_process_renames(struct merge_options *opt, + struct tree *merge_base, + struct tree *side1, + struct tree *side2) +{ + int clean = 1; + + /* + * Rename detection works by detecting file similarity. Here we use + * a really easy-to-implement scheme: files are similar IFF they have + * the same filename. Therefore, by this scheme, there are no renames. + * + * TODO: Actually implement a real rename detection scheme. + */ + return clean; +} + +static void process_entries(struct merge_options *opt, + struct object_id *result_oid) +{ + die("Not yet implemented."); +} + void merge_switch_to_result(struct merge_options *opt, struct tree *head, struct merge_result *result, @@ -170,13 +203,46 @@ void merge_finalize(struct merge_options *opt, die("Not yet implemented"); } +static void merge_start(struct merge_options *opt, struct merge_result *result) +{ + die("Not yet implemented."); +} + +/* + * Originally from merge_trees_internal(); heavily adapted, though. + */ +static void merge_ort_nonrecursive_internal(struct merge_options *opt, + struct tree *merge_base, + struct tree *side1, + struct tree *side2, + struct merge_result *result) +{ + struct object_id working_tree_oid; + + collect_merge_info(opt, merge_base, side1, side2); + result->clean = detect_and_process_renames(opt, merge_base, + side1, side2); + process_entries(opt, &working_tree_oid); + + /* Set return values */ + result->tree = parse_tree_indirect(&working_tree_oid); + /* existence of conflicted entries implies unclean */ + result->clean &= strmap_empty(&opt->priv->conflicted); + if (!opt->priv->call_depth) { + result->priv = opt->priv; + opt->priv = NULL; + } +} + void merge_incore_nonrecursive(struct merge_options *opt, struct tree *merge_base, struct tree *side1, struct tree *side2, struct merge_result *result) { - die("Not yet implemented"); + assert(opt->ancestor != NULL); + merge_start(opt, result); + merge_ort_nonrecursive_internal(opt, merge_base, side1, side2, result); } void merge_incore_recursive(struct merge_options *opt, From patchwork Fri Dec 4 20:47:53 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6BAF9C433FE for ; Fri, 4 Dec 2020 20:48:59 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 248D422CF6 for ; Fri, 4 Dec 2020 20:48:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728161AbgLDUs6 (ORCPT ); Fri, 4 Dec 2020 15:48:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39874 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727517AbgLDUs6 (ORCPT ); Fri, 4 Dec 2020 15:48:58 -0500 Received: from mail-wr1-x434.google.com (mail-wr1-x434.google.com [IPv6:2a00:1450:4864:20::434]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9E9A9C061A53 for ; Fri, 4 Dec 2020 12:48:17 -0800 (PST) Received: by mail-wr1-x434.google.com with SMTP id x6so2636459wro.11 for ; Fri, 04 Dec 2020 12:48:17 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Hc3mWT9aEJqqlzm5KsR2jjqFQ0+PmgVyER5CrCXRFso=; b=myceDPXNSP3md5j6CNLcKUntR+8HuvC7y40VlwZSG/rKoZckq4KvtKmzkOaZV/BVnX ZpjTDhh3HL2SxvHA2YJ1T+E+GIrOVZGzkJvCqHwAyZfYntk2Xw7Cw66ZGahT8I4HR/Mf 7GomcoVMZljonoke/glJR+KyfmAfnK7wC+X8xWaLC4gWvYKNWjcWxbmzTC8jcKEPYEut QTab0xFjziDXkrar5C9UzJ53jIxO5SvHL30+CS+Bz2Oh3f65TAhp7GUkqqJyfZgvUPsU hz9ZI4eE4wPcruKeUhHcFShPwaQ3QgL+a7ekXDCb947CSBuw6VxR+TkDoO940zZ6Oiyg 5aoQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Hc3mWT9aEJqqlzm5KsR2jjqFQ0+PmgVyER5CrCXRFso=; b=bUd25Pi5NFWaraz7d+DGi4vsAkhfdGITTZ0i4blvE+cd1CY9kJIiyXfzBsSZEaiBtl PwLBf8OU4rVKUVxz33RhXaxsog+V5TT0uLXCCZ7TQyHLYlpxvz5FN2Rm8sv71/liaA1L Wjg4ZZgs4QV1B6H+TI2nCzkbDY6qqp0bsqAw2M+/BIUeTyZsPh874i9qOx3BfbNP8hFM k1prziO5vPxUZ7sV95SEhZn/mTNjEaF2ZlH9iBcmXhjFhVicrXjfSn7cRykiJy8F2at/ 41OpaQey25WNbBWWrPw7xNUT3NGewzhBcRpu/l42NheStgwnmRamMXD0w0kQJ5KoLWUJ iimA== X-Gm-Message-State: AOAM530t4tBsaoQYoCD4FdNZOh6FtI3bUwOQmWbfn4+O01N1teJr+3Qf PCtEJn3e6mwK80npnl6kFwqiALjugaY= X-Google-Smtp-Source: ABdhPJyWyPqZWB8UMZK1hz98v/XzvvJWmabjRpVenIzUzH1UGztvl4nxwXpApVZI3JHtswPLe1HX2A== X-Received: by 2002:adf:f94b:: with SMTP id q11mr7061607wrr.351.1607114896075; Fri, 04 Dec 2020 12:48:16 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 189sm4707254wma.22.2020.12.04.12.48.15 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:15 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 04 Dec 2020 20:47:53 +0000 Subject: [PATCH v2 03/20] merge-ort: port merge_start() from merge-recursive Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren merge_start() basically does a bunch of sanity checks, then allocates and initializes opt->priv -- a struct merge_options_internal. Most of the sanity checks are usable as-is. The allocation/intialization is a bit different since merge-ort has a very different merge_options_internal than merge-recursive, but the idea is the same. The weirdest part here is that merge-ort and merge-recursive use the same struct merge_options, even though merge_options has a number of fields that are oddly specific to merge-recursive's internal implementation and don't even make sense with merge-ort's high-level design (e.g. buffer_output, which merge-ort has to always do). I reused the same data structure because: * most the fields made sense to both merge algorithms * making a new struct would have required making new enums or somehow externalizing them, and that was getting messy. * it simplifies converting the existing callers by not having to have different code paths for merge_options setup. I also marked detect_renames as ignored. We can revisit that later, but in short: merge-recursive allowed turning off rename detection because it was sometimes glacially slow. When you speed something up by a few orders of magnitude, it's worth revisiting whether that justification is still relevant. Besides, if folks find it's still too slow, perhaps they have a better scaling case than I could find and maybe it turns up some more optimizations we can add. If it still is needed as an option, it is easy to add later. Signed-off-by: Elijah Newren --- merge-ort.c | 45 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index 8c9fea1a5a..f8ac721aa3 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -17,6 +17,8 @@ #include "cache.h" #include "merge-ort.h" +#include "diff.h" +#include "diffcore.h" #include "strmap.h" #include "tree.h" @@ -205,7 +207,48 @@ void merge_finalize(struct merge_options *opt, static void merge_start(struct merge_options *opt, struct merge_result *result) { - die("Not yet implemented."); + /* Sanity checks on opt */ + assert(opt->repo); + + assert(opt->branch1 && opt->branch2); + + assert(opt->detect_directory_renames >= MERGE_DIRECTORY_RENAMES_NONE && + opt->detect_directory_renames <= MERGE_DIRECTORY_RENAMES_TRUE); + assert(opt->rename_limit >= -1); + assert(opt->rename_score >= 0 && opt->rename_score <= MAX_SCORE); + assert(opt->show_rename_progress >= 0 && opt->show_rename_progress <= 1); + + assert(opt->xdl_opts >= 0); + assert(opt->recursive_variant >= MERGE_VARIANT_NORMAL && + opt->recursive_variant <= MERGE_VARIANT_THEIRS); + + /* + * detect_renames, verbosity, buffer_output, and obuf are ignored + * fields that were used by "recursive" rather than "ort" -- but + * sanity check them anyway. + */ + assert(opt->detect_renames >= -1 && + opt->detect_renames <= DIFF_DETECT_COPY); + assert(opt->verbosity >= 0 && opt->verbosity <= 5); + assert(opt->buffer_output <= 2); + assert(opt->obuf.len == 0); + + assert(opt->priv == NULL); + + /* Initialization of opt->priv, our internal merge data */ + opt->priv = xcalloc(1, sizeof(*opt->priv)); + + /* + * Although we initialize opt->priv->paths with strdup_strings=0, + * that's just to avoid making yet another copy of an allocated + * string. Putting the entry into paths means we are taking + * ownership, so we will later free it. + * + * In contrast, conflicted just has a subset of keys from paths, so + * we don't want to free those (it'd be a duplicate free). + */ + strmap_init_with_options(&opt->priv->paths, NULL, 0); + strmap_init_with_options(&opt->priv->conflicted, NULL, 0); } /* From patchwork Fri Dec 4 20:47:54 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952353 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 122ADC4361B for ; Fri, 4 Dec 2020 20:49:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id D65E222CF6 for ; Fri, 4 Dec 2020 20:49:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729391AbgLDUtL (ORCPT ); Fri, 4 Dec 2020 15:49:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39896 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727517AbgLDUtK (ORCPT ); Fri, 4 Dec 2020 15:49:10 -0500 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B3859C061A54 for ; Fri, 4 Dec 2020 12:48:18 -0800 (PST) Received: by mail-wm1-x343.google.com with SMTP id 3so8152330wmg.4 for ; Fri, 04 Dec 2020 12:48:18 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=WYOD5AtdxVLH1YV7SPTrpljjr50F7ig+c9RRQqMMot0=; b=U6gJQlNAOfBo44axlUuz4yDCmCJbiSKUJywicHqjZvSU5BxKvR2svJjE/BlIDgoVB3 omKqLckrg+g5KXMyjAhTxaUy/D39nZwGYP0I5H1XUg3cU5ZqECfgHQLsMy4TUYLKTyxW 2rBemZRd+IJVXfE31YYuiIgbDzVXkfBfbIhXT46oNhf31utHa/B52fqD5mr7+54UdIXd LkncIyKgaRg2zW77j6W57oqz0nfxsbZ2jWDEnK/t4ejiWUkc53UzHxJxTG2BlVq5+jeq JsyF1NgUGR3XS4DJfZuZM3QYloYzAkxmJuYUetj1jiFWG43p/++CJJpGWnA/lhbsu02B 8THA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=WYOD5AtdxVLH1YV7SPTrpljjr50F7ig+c9RRQqMMot0=; b=bixYaD5zFECCBZoOfihnUr/lMG+Tza0m9hXiiKDeVIIdhqhTRu592IPyrxfA1W7prp YlmMgM1t0yKxlr0dYlAhk86cKXv3eUAmNzF43i4PGn/ENM0dY8e1L6pKUu4RnLOtRChp Wu8gsHZfTNhLN7Q++7mPFHMJzM1Aiq8GB577iU1fgBDW5PppOgabc2hfhNkfMvXA0HUc CrA8NxYiVaieDOvfqChLm4umncPvkr1gw4hRiDkU6JZbPaujkucAowxHJ3yMBeHfjFGp CfdVoyUCahpJA7RZQBZKtF0g7R7GdlZpfOah2KaRv6v1zELqurEJePJxrgxVdRDQG/ZN kOdQ== X-Gm-Message-State: AOAM532W31qevd3UopLpzbC5lPolVwMIg0sTqLpFH+33w/8lDtFjuPKn +EtswEdZg1qsxI5KNBwKd3eA42dJACk= X-Google-Smtp-Source: ABdhPJw1yESpL5FAkaC7pNItBxAzmlrl+JjNP+MLYWUrajDva2tgBAOSrnn0AYDGuAdG8yvRCsdN8A== X-Received: by 2002:a7b:c2e8:: with SMTP id e8mr6010611wmk.103.1607114897237; Fri, 04 Dec 2020 12:48:17 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id n14sm4471123wmi.1.2020.12.04.12.48.16 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:16 -0800 (PST) Message-Id: <22fecf6ccd1e1eb80bb8391975b482b3d6233540.1607114890.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:47:54 +0000 Subject: [PATCH v2 04/20] merge-ort: use histogram diff Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren In my cursory investigation, histogram diffs are about 2% slower than Myers diffs. Others have probably done more detailed benchmarks. But, in short, histogram diffs have been around for years and in a number of cases provide obviously better looking diffs where Myers diffs are unintelligible but the performance hit has kept them from becoming the default. However, there are real merge bugs we know about that have triggered on git.git and linux.git, which I don't have a clue how to address without the additional information that I believe is provided by histogram diffs. See the following: https://lore.kernel.org/git/20190816184051.GB13894@sigill.intra.peff.net/ https://lore.kernel.org/git/CABPp-BHvJHpSJT7sdFwfNcPn_sOXwJi3=o14qjZS3M8Rzcxe2A@mail.gmail.com/ https://lore.kernel.org/git/CABPp-BGtez4qjbtFT1hQoREfcJPmk9MzjhY5eEq1QhXT23tFOw@mail.gmail.com/ I don't like mismerges. I really don't like silent mismerges. While I am sometimes willing to make performance and correctness tradeoff, I'm much more interested in correctness in general. I want to fix the above bugs. I have not yet started doing so, but I believe histogram diff at least gives me an angle. Unfortunately, I can't rely on using the information from histogram diff unless it's in use. And it hasn't been used because of a few percentage performance hit. In testcases I have looked at, merge-ort is _much_ faster than merge-recursive for non-trivial merges/rebases/cherry-picks. As such, this is a golden opportunity to switch out the underlying diff algorithm (at least the one used by the merge machinery; git-diff and git-log are separate questions); doing so will allow me to get additional data and improved diffs, and I believe it will help me fix the above bugs at some point in the future. Signed-off-by: Elijah Newren --- merge-ort.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/merge-ort.c b/merge-ort.c index f8ac721aa3..ff305bcbe4 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -21,6 +21,7 @@ #include "diffcore.h" #include "strmap.h" #include "tree.h" +#include "xdiff-interface.h" struct merge_options_internal { /* @@ -235,6 +236,9 @@ static void merge_start(struct merge_options *opt, struct merge_result *result) assert(opt->priv == NULL); + /* Default to histogram diff. Actually, just hardcode it...for now. */ + opt->xdl_opts = DIFF_WITH_ALG(opt, HISTOGRAM_DIFF); + /* Initialization of opt->priv, our internal merge data */ opt->priv = xcalloc(1, sizeof(*opt->priv)); From patchwork Fri Dec 4 20:47:55 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7AC8C4361A for ; Fri, 4 Dec 2020 20:49:12 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5D39722CF6 for ; Fri, 4 Dec 2020 20:49:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729762AbgLDUtL (ORCPT ); Fri, 4 Dec 2020 15:49:11 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728623AbgLDUtK (ORCPT ); Fri, 4 Dec 2020 15:49:10 -0500 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A28AEC061A55 for ; Fri, 4 Dec 2020 12:48:19 -0800 (PST) Received: by mail-wm1-x342.google.com with SMTP id c198so6591619wmd.0 for ; Fri, 04 Dec 2020 12:48:19 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=MAygxFzJxH6VuIUbTy5dbpWI5LMqtfnM4A8vTiQGh3o=; b=sv8foDz84MwPAZ2IuMTR+FZ/7RsiTRWEftlbcbAielR1Jg1BNQnG0vFhl/BRagudoj YOxdUj+KVqwAA/7WtI1jajLWqUaXiKFlDvsaKQIsZhDUbbPVPWIXXmkAG/ZcCdLBmxGA RUFdWsjjvPlCzNasR/MuO7ORqExQ4fAMLPWFRfithBRBaaWD7yLRTCuBr/DtUMXvWmsJ kcEcWIgghcqbGhta9HafE0rT1XG2AF2crXk0Xri6pxDmlVfdJsNVMfFER6oOO+zJ8qYu mAoj7D0JcBGvb1u78GMpMz0TkUtd8xG0YwN8ypjZELk98J/496XGOfrD8aTZ63E2GC30 RJ4A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=MAygxFzJxH6VuIUbTy5dbpWI5LMqtfnM4A8vTiQGh3o=; b=ZXDdDnkmrXDPbzWNZpAV8YZe5j/+WdWScUaZC0qmcOfunRAiKKwPOD1K6XpGXtFvas C8uDUbHKeh4RITVqEU96xOB5XLSGiCV1O0INWqETgto7S4UOSk/1DkhJaCyvdYEc6XKH EmQWhLL2QoENq3DTeeWFNbDwzEDZ3QMhFtprSr1ej9bQYl7M+wBM3wZhNm1a0asgTcar Hf9uJzeQVho/dsnpQAux8tqaCU1v9kOL/Zs0aD0OtsjlFlV+W7ZtZ2BMjSJBJiatyI+r hbzFDY1dyrZv3kB7Bmt5br0xK1vu77SxYR35CavlK4Q10O2rh1lh+b7WQhxVR2zRTZag nWsw== X-Gm-Message-State: AOAM530j20EcTjMS+0fE5v9RSR5oOjoZnTv4w9laiboRffs8CtTlFTCL osFzUvMKWNlUJzAA/nOnmNXayRgQZmU= X-Google-Smtp-Source: ABdhPJxQtf17WrputJOJj/qJdp4JiFBBIWMFItdUiWL0pUvBRLPvWko8gIQAXPJ0kA25L7+Wflirjw== X-Received: by 2002:a1c:a9c8:: with SMTP id s191mr6132615wme.89.1607114898208; Fri, 04 Dec 2020 12:48:18 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id w3sm4473945wma.3.2020.12.04.12.48.17 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:17 -0800 (PST) Message-Id: <6c4c0c15b3d1283d817149181cdf88903926a52b.1607114890.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:47:55 +0000 Subject: [PATCH v2 05/20] merge-ort: add an err() function similar to one from merge-recursive Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Various places in merge-recursive used an err() function when it hit some kind of unrecoverable error. That code was from the reusable bits of merge-recursive.c that we liked, such as merge_3way, writing object files to the object store, reading blobs from the object store, etc. So create a similar function to allow us to port that code over, and use it for when we detect problems returned from collect_merge_info()'s traverse_trees() call, which we will be adding next. Signed-off-by: Elijah Newren --- merge-ort.c | 31 +++++++++++++++++++++++++++++-- 1 file changed, 29 insertions(+), 2 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index ff305bcbe4..b056db6fc8 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -158,12 +158,27 @@ struct conflict_info { unsigned match_mask:3; }; +static int err(struct merge_options *opt, const char *err, ...) +{ + va_list params; + struct strbuf sb = STRBUF_INIT; + + strbuf_addstr(&sb, "error: "); + va_start(params, err); + strbuf_vaddf(&sb, err, params); + va_end(params); + + error("%s", sb.buf); + strbuf_release(&sb); + + return -1; +} + static int collect_merge_info(struct merge_options *opt, struct tree *merge_base, struct tree *side1, struct tree *side2) { - /* TODO: Implement this using traverse_trees() */ die("Not yet implemented."); } @@ -266,7 +281,19 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt, { struct object_id working_tree_oid; - collect_merge_info(opt, merge_base, side1, side2); + if (collect_merge_info(opt, merge_base, side1, side2) != 0) { + /* + * TRANSLATORS: The %s arguments are: 1) tree hash of a merge + * base, and 2-3) the trees for the two trees we're merging. + */ + err(opt, _("collecting merge info failed for trees %s, %s, %s"), + oid_to_hex(&merge_base->object.oid), + oid_to_hex(&side1->object.oid), + oid_to_hex(&side2->object.oid)); + result->clean = -1; + return; + } + result->clean = detect_and_process_renames(opt, merge_base, side1, side2); process_entries(opt, &working_tree_oid); From patchwork Fri Dec 4 20:47:56 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952377 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F7B9C0018C for ; Fri, 4 Dec 2020 20:49:46 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 0ED1F22D03 for ; Fri, 4 Dec 2020 20:49:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730292AbgLDUtl (ORCPT ); Fri, 4 Dec 2020 15:49:41 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39974 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730047AbgLDUtl (ORCPT ); Fri, 4 Dec 2020 15:49:41 -0500 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C70CAC061A56 for ; Fri, 4 Dec 2020 12:48:20 -0800 (PST) Received: by mail-wm1-x344.google.com with SMTP id 3so8152383wmg.4 for ; Fri, 04 Dec 2020 12:48:20 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=On7LwH2XUvC16/QYhBu9B2qHm1zMtNBUonmisYPUJGk=; b=JCbHinsZZOYM5Gn/h7iUuxucZVtYlErzeqIN62IvbZ0hppgb7V5qdfwMPGPskdUsHy Yf/GamVq9QbHRrsCfy448W+BrHNxCBC096kPbCU+uzfDJauMscrJIntYxv+sREEdEzzl TnQ882wQyHb0pLGmYEwpG205CnTr0O79pfRm6mZ39vKJ/XY+ILaxBpWTIibITab7z07O sxdqK7BuSQ/zrdLOWwIQUZata39iI2qqhWBb7Knigg3qGpFjnDX2FBcUNuWQo3TXd68L sbXCFBRgXko4cgM9LsvmdDsJvnCut65ihtP5E437e5xRjgCpCcmMaJYNsx496b/92PoZ 6cBg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=On7LwH2XUvC16/QYhBu9B2qHm1zMtNBUonmisYPUJGk=; b=IPtlmSOFRy0cm5HFZjHsllaxXoxcMd4Pkr8GFOKi4k+LhTnzzu9bqSJz2YnUvytn3H N/0GtmqEdRBn7veX95o9+GxZuBEpngFU3YCuRtEorYzg1NvbF2s/6+Wkhb20OmXuWnrz mLm44TPP7ogE0TMdxnUmNls1g5T62N5OmsFs402qQTN4Jov+KAMGupOTFG5r8MZ331Y5 3Y16EDefmg8BLDpZpFvoMv8PgjqB0mKfbnn0MiNZ44zLeUXzIhiw5NEeZHJ8E4Ds0xk1 mkOFM9l15cP9U//NKBCJdxGNxkvtdjjWaXFoYSlnkY738HgpS2tLMZ0gMYxvZt9UbsGS n54A== X-Gm-Message-State: AOAM533AltADFxfW+T8wAMAPdmdpa1rrzcViu6d0kz9+5s8olsjdadOt k6Hrw7/R+//sfdBKfgzQ2m7z5zcnZjQ= X-Google-Smtp-Source: ABdhPJwnZWPAcuMrZ1vZkv7jK7NXjCG+WQmoLbWXSJzUA39H7tymucBMVI49jjy5GdMakh6S6Gk+Tg== X-Received: by 2002:a7b:c3c8:: with SMTP id t8mr6126408wmj.88.1607114899276; Fri, 04 Dec 2020 12:48:19 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id k16sm4910574wrl.65.2020.12.04.12.48.18 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:18 -0800 (PST) Message-Id: <27268ef8a3cf01b5f938075b6a1a12f6d9d9d98d.1607114891.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:47:56 +0000 Subject: [PATCH v2 06/20] merge-ort: implement a very basic collect_merge_info() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren This does not actually collect any necessary info other than the pathnames involved, since it just allocates an all-zero conflict_info and stuffs that into paths. However, it invokes the traverse_trees() machinery to walk over all the paths and sets up the basic infrastructure we need. I have left out a few obvious optimizations to try to make this patch as short and obvious as possible. A subsequent patch will add some of those back in with some more useful data fields before we introduce a patch that actually sets up the conflict_info fields. Signed-off-by: Elijah Newren --- merge-ort.c | 118 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 117 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index b056db6fc8..0c37f8bf52 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -174,12 +174,128 @@ static int err(struct merge_options *opt, const char *err, ...) return -1; } +static int collect_merge_info_callback(int n, + unsigned long mask, + unsigned long dirmask, + struct name_entry *names, + struct traverse_info *info) +{ + /* + * n is 3. Always. + * common ancestor (mbase) has mask 1, and stored in index 0 of names + * head of side 1 (side1) has mask 2, and stored in index 1 of names + * head of side 2 (side2) has mask 4, and stored in index 2 of names + */ + struct merge_options *opt = info->data; + struct merge_options_internal *opti = opt->priv; + struct conflict_info *ci; + struct name_entry *p; + size_t len; + char *fullpath; + unsigned filemask = mask & ~dirmask; + unsigned mbase_null = !(mask & 1); + unsigned side1_null = !(mask & 2); + unsigned side2_null = !(mask & 4); + + /* n = 3 is a fundamental assumption. */ + if (n != 3) + BUG("Called collect_merge_info_callback wrong"); + + /* + * A bunch of sanity checks verifying that traverse_trees() calls + * us the way I expect. Could just remove these at some point, + * though maybe they are helpful to future code readers. + */ + assert(mbase_null == is_null_oid(&names[0].oid)); + assert(side1_null == is_null_oid(&names[1].oid)); + assert(side2_null == is_null_oid(&names[2].oid)); + assert(!mbase_null || !side1_null || !side2_null); + assert(mask > 0 && mask < 8); + + /* + * Get the name of the relevant filepath, which we'll pass to + * setup_path_info() for tracking. + */ + p = names; + while (!p->mode) + p++; + len = traverse_path_len(info, p->pathlen); + + /* +1 in both of the following lines to include the NUL byte */ + fullpath = xmalloc(len + 1); + make_traverse_path(fullpath, len + 1, info, p->path, p->pathlen); + + /* + * TODO: record information about the path other than all zeros, + * so we can resolve later in process_entries. + */ + ci = xcalloc(1, sizeof(struct conflict_info)); + strmap_put(&opti->paths, fullpath, ci); + + /* If dirmask, recurse into subdirectories */ + if (dirmask) { + struct traverse_info newinfo; + struct tree_desc t[3]; + void *buf[3] = {NULL, NULL, NULL}; + const char *original_dir_name; + int i, ret; + + ci->match_mask &= filemask; + newinfo = *info; + newinfo.prev = info; + newinfo.name = p->path; + newinfo.namelen = p->pathlen; + newinfo.pathlen = st_add3(newinfo.pathlen, p->pathlen, 1); + + for (i = 0; i < 3; i++) { + const struct object_id *oid = NULL; + if (dirmask & 1) + oid = &names[i].oid; + buf[i] = fill_tree_descriptor(opt->repo, t + i, oid); + dirmask >>= 1; + } + + original_dir_name = opti->current_dir_name; + opti->current_dir_name = fullpath; + ret = traverse_trees(NULL, 3, t, &newinfo); + opti->current_dir_name = original_dir_name; + + for (i = 0; i < 3; i++) + free(buf[i]); + + if (ret < 0) + return -1; + } + + return mask; +} + static int collect_merge_info(struct merge_options *opt, struct tree *merge_base, struct tree *side1, struct tree *side2) { - die("Not yet implemented."); + int ret; + struct tree_desc t[3]; + struct traverse_info info; + const char *toplevel_dir_placeholder = ""; + + opt->priv->current_dir_name = toplevel_dir_placeholder; + setup_traverse_info(&info, toplevel_dir_placeholder); + info.fn = collect_merge_info_callback; + info.data = opt; + info.show_all_errors = 1; + + parse_tree(merge_base); + parse_tree(side1); + parse_tree(side2); + init_tree_desc(t + 0, merge_base->buffer, merge_base->size); + init_tree_desc(t + 1, side1->buffer, side1->size); + init_tree_desc(t + 2, side2->buffer, side2->size); + + ret = traverse_trees(NULL, 3, t, &info); + + return ret; } static int detect_and_process_renames(struct merge_options *opt, From patchwork Fri Dec 4 20:47:57 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8B1EDC4361A for ; Fri, 4 Dec 2020 20:49:36 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 602FA22CF6 for ; Fri, 4 Dec 2020 20:49:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730027AbgLDUtf (ORCPT ); Fri, 4 Dec 2020 15:49:35 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39976 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729435AbgLDUtf (ORCPT ); Fri, 4 Dec 2020 15:49:35 -0500 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C539DC08C5F2 for ; Fri, 4 Dec 2020 12:48:21 -0800 (PST) Received: by mail-wm1-x341.google.com with SMTP id f190so8238618wme.1 for ; Fri, 04 Dec 2020 12:48:21 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=lKs/4/NmcQsdKxhykR0Z1o5EmnS1ZCM82SOYo5NY8Tc=; b=BWKxYiRDCkCh8E/BZoJbQcDhFnK09BRbskMXokqd6jjgSkyf0tNYQ57WQrSEFaJOno v2A8bThDC8FQPH/H1+zUOY/vQk6keGAH6xd4Ht/p1XdymL9OUJCIBh4EYd2b3x6YjE6A U6lAP1PG5MxBA9n9owka8YpmixNfUrcS1MP3s/KFfpbf3OXqtsj4h1zwEEAPTrL3azUV kiA81a0h2w8rpl+9TDb4SNU9bXD3haIkD9cGeCjnF5/yD+cNftcZXb2hllybxjjgfpup kfKFZCI3v8L4K/lOjgUGgvc2K1/jbdFUuxVoqx0CyRUNR7DvNhnS2v6bqgph7+TxtQFm ISQw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=lKs/4/NmcQsdKxhykR0Z1o5EmnS1ZCM82SOYo5NY8Tc=; b=NT8zSOT/rFPz5Vsw0wU5kdgFQA7CwC9T3Pr3m3S+eGbQSXADvFFTHi0fPKM1Wccvxp 0MoKR08o6S1XW03+nutM8TU3yv207SVJUocIn1cI59sKRzifzXNCxLGwu7sd0YfHwjEB /YriwrvxRkIL5r3SzRcQuBZvNgsi/Ot8mk/DJMtVjIJJ0Hoee+xoI1c6JcUqvb+ZG26Q VMAbGnmtTTNMdi4SVRChHGUrJtax5pHKocJH4co8DSMO3q9Gg9rPXWRNjNraqNIHTKL0 UQqtZ7bt8evCa/uPB7BEzwabUfgvPTTu4Ztk9Eqz77h9JAWojOIUZE83E/R2BRsQNEii 4GGw== X-Gm-Message-State: AOAM531GbuVmZ/FVjfWdQ2b7D2pTTiE/pQSDImNO0bEdfSLbSDDMmX2w Z1RfD/jM0bn5v1COGDg+WymwzqluFnw= X-Google-Smtp-Source: ABdhPJzIpmBP3nLrRVKqKH47reS/cBvM0cWkfPAy4KTJ1Reg1ALITvOcMUhutNKpaoMOjVxmHRLXvg== X-Received: by 2002:a05:600c:20f:: with SMTP id 15mr6202555wmi.36.1607114900317; Fri, 04 Dec 2020 12:48:20 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id j6sm4860976wrq.38.2020.12.04.12.48.19 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:19 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 04 Dec 2020 20:47:57 +0000 Subject: [PATCH v2 07/20] merge-ort: avoid repeating fill_tree_descriptor() on the same tree Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Three-way merges, by their nature, are going to often have two or more trees match at a given subdirectory. We can avoid calling fill_tree_descriptor() on the same tree by checking when these trees match. Noting when various oids match will also be useful in other calculations and optimizations as well. Signed-off-by: Elijah Newren --- merge-ort.c | 26 ++++++++++++++++++++++---- 1 file changed, 22 insertions(+), 4 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index 0c37f8bf52..ab3119d2d8 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -196,6 +196,15 @@ static int collect_merge_info_callback(int n, unsigned mbase_null = !(mask & 1); unsigned side1_null = !(mask & 2); unsigned side2_null = !(mask & 4); + unsigned side1_matches_mbase = (!side1_null && !mbase_null && + names[0].mode == names[1].mode && + oideq(&names[0].oid, &names[1].oid)); + unsigned side2_matches_mbase = (!side2_null && !mbase_null && + names[0].mode == names[2].mode && + oideq(&names[0].oid, &names[2].oid)); + unsigned sides_match = (!side1_null && !side2_null && + names[1].mode == names[2].mode && + oideq(&names[1].oid, &names[2].oid)); /* n = 3 is a fundamental assumption. */ if (n != 3) @@ -248,10 +257,19 @@ static int collect_merge_info_callback(int n, newinfo.pathlen = st_add3(newinfo.pathlen, p->pathlen, 1); for (i = 0; i < 3; i++) { - const struct object_id *oid = NULL; - if (dirmask & 1) - oid = &names[i].oid; - buf[i] = fill_tree_descriptor(opt->repo, t + i, oid); + if (i == 1 && side1_matches_mbase) + t[1] = t[0]; + else if (i == 2 && side2_matches_mbase) + t[2] = t[0]; + else if (i == 2 && sides_match) + t[2] = t[1]; + else { + const struct object_id *oid = NULL; + if (dirmask & 1) + oid = &names[i].oid; + buf[i] = fill_tree_descriptor(opt->repo, + t + i, oid); + } dirmask >>= 1; } From patchwork Fri Dec 4 20:47:58 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952367 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 49678C433FE for ; Fri, 4 Dec 2020 20:49:41 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id E110D22CF6 for ; Fri, 4 Dec 2020 20:49:40 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730090AbgLDUth (ORCPT ); Fri, 4 Dec 2020 15:49:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39984 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729435AbgLDUth (ORCPT ); Fri, 4 Dec 2020 15:49:37 -0500 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DB211C08E85E for ; Fri, 4 Dec 2020 12:48:22 -0800 (PST) Received: by mail-wr1-x444.google.com with SMTP id l1so6557155wrb.9 for ; Fri, 04 Dec 2020 12:48:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=XZLXtauf7A66fSwsU7ObDyVwg4P5JldgSKJXv1l6wP0=; b=J/Eta9o+dUt/PQ2tBeNw0ZoNMgCA1lIo6FeNOOU4+0KEVQDM54ZkNbAz5ck9js10eH qqe8f9CoC3gnDI89tftUAwImv9BmbEf+u4BD//Pv1T3nU/+xP0JRsIVrCoFGnOYga/Uj EmqRY5TyZbcrMzbJmA65FvMOKrghfq6rj1LiaTHkf617LNQeIdyZd4Ja7yf9Ti/W8Sek KgR35uIc4V5q9SK9t7S/xjGbJPwPeBcJSwhRSHLwwxEPf+iV9vuIBM9cPikqX3H5a4GX KjtffTlNdI1IfpjvMtLz4c0+xWPMWMAmwrhrhTmhuMkGVqCKmwndys9e8yu3a0f96c7G bgiA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=XZLXtauf7A66fSwsU7ObDyVwg4P5JldgSKJXv1l6wP0=; b=exRZFR9dM2iawN3Cl0TqgmmiTg4bObysrZVc+gJYe5ZkliASXiyJpnMS3eoXnh6pY/ C95bv1CweTfTkg8OIbm938eyquQxhmX9H/BTd/8YHDsPhk0mQftIFoxddSSM1NghEjgj T7vOKzmqbRQh//NO3yyBkRwAWguD9ccRCMdlJAAu9ObI+Ye72HnfnVI8IfMwhKqB95KQ bJs3jAnvhkQKJJkQK7eX9n/rBZpyDkvQLw4ZpccbacHERonIE9H7vwx9WxDzRERhSXw5 0p8m510rKx3pT1FiNAK7C1aOjxB04fZVYVzR+vwRvOeB+vyNc0hcEP2DdMU8bmq92IOp tQfw== X-Gm-Message-State: AOAM532WCfj6YasAM9hWo6nFIgii3WduuZhKwXojOUAGQ8gXBSsvrtjD Ljr89fCqJjK8rm9dFysHEcNpFPds8Ew= X-Google-Smtp-Source: ABdhPJyyiOaiIK+uygL4MwFZTOoHYnPhH0G+/2CCeVxDg/FEXjsiU2UbmBnnTVRZBDQ6OUT5pn7Z9w== X-Received: by 2002:adf:82cc:: with SMTP id 70mr6975137wrc.74.1607114901373; Fri, 04 Dec 2020 12:48:21 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id f18sm4842826wru.42.2020.12.04.12.48.20 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:20 -0800 (PST) Message-Id: <93fd69fa3c6095ea65b55c76c3357ccf642b1ca2.1607114891.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:47:58 +0000 Subject: [PATCH v2 08/20] merge-ort: compute a few more useful fields for collect_merge_info Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Signed-off-by: Elijah Newren --- merge-ort.c | 36 ++++++++++++++++++++++++++++++++++++ 1 file changed, 36 insertions(+) diff --git a/merge-ort.c b/merge-ort.c index ab3119d2d8..b4e4c1f157 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -193,6 +193,7 @@ static int collect_merge_info_callback(int n, size_t len; char *fullpath; unsigned filemask = mask & ~dirmask; + unsigned match_mask = 0; /* will be updated below */ unsigned mbase_null = !(mask & 1); unsigned side1_null = !(mask & 2); unsigned side2_null = !(mask & 4); @@ -206,6 +207,22 @@ static int collect_merge_info_callback(int n, names[1].mode == names[2].mode && oideq(&names[1].oid, &names[2].oid)); + /* + * Note: When a path is a file on one side of history and a directory + * in another, we have a directory/file conflict. In such cases, if + * the conflict doesn't resolve from renames and deletions, then we + * always leave directories where they are and move files out of the + * way. Thus, while struct conflict_info has a df_conflict field to + * track such conflicts, we ignore that field for any directories at + * a path and only pay attention to it for files at the given path. + * The fact that we leave directories were they are also means that + * we do not need to worry about getting additional df_conflict + * information propagated from parent directories down to children + * (unlike, say traverse_trees_recursive() in unpack-trees.c, which + * sets a newinfo.df_conflicts field specifically to propagate it). + */ + unsigned df_conflict = (filemask != 0) && (dirmask != 0); + /* n = 3 is a fundamental assumption. */ if (n != 3) BUG("Called collect_merge_info_callback wrong"); @@ -221,6 +238,14 @@ static int collect_merge_info_callback(int n, assert(!mbase_null || !side1_null || !side2_null); assert(mask > 0 && mask < 8); + /* Determine match_mask */ + if (side1_matches_mbase) + match_mask = (side2_matches_mbase ? 7 : 3); + else if (side2_matches_mbase) + match_mask = 5; + else if (sides_match) + match_mask = 6; + /* * Get the name of the relevant filepath, which we'll pass to * setup_path_info() for tracking. @@ -239,6 +264,8 @@ static int collect_merge_info_callback(int n, * so we can resolve later in process_entries. */ ci = xcalloc(1, sizeof(struct conflict_info)); + ci->df_conflict = df_conflict; + ci->match_mask = match_mask; strmap_put(&opti->paths, fullpath, ci); /* If dirmask, recurse into subdirectories */ @@ -255,6 +282,15 @@ static int collect_merge_info_callback(int n, newinfo.name = p->path; newinfo.namelen = p->pathlen; newinfo.pathlen = st_add3(newinfo.pathlen, p->pathlen, 1); + /* + * If this directory we are about to recurse into cared about + * its parent directory (the current directory) having a D/F + * conflict, then we'd propagate the masks in this way: + * newinfo.df_conflicts |= (mask & ~dirmask); + * But we don't worry about propagating D/F conflicts. (See + * comment near setting of local df_conflict variable near + * the beginning of this function). + */ for (i = 0; i < 3; i++) { if (i == 1 && side1_matches_mbase) From patchwork Fri Dec 4 20:47:59 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952361 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B9429C433FE for ; Fri, 4 Dec 2020 20:49:38 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 801CC22CF6 for ; Fri, 4 Dec 2020 20:49:38 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730113AbgLDUth (ORCPT ); Fri, 4 Dec 2020 15:49:37 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39986 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730047AbgLDUth (ORCPT ); Fri, 4 Dec 2020 15:49:37 -0500 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EB07BC08E85F for ; Fri, 4 Dec 2020 12:48:23 -0800 (PST) Received: by mail-wr1-x442.google.com with SMTP id l1so6557185wrb.9 for ; Fri, 04 Dec 2020 12:48:23 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=aBBagrEFLIkxxL3BVYZ7Bhc6hs4GxQo7fyhQpsMyIs8=; b=qvODyy5GixpJWQGKver/DiDMK99uXE1fg48dFUH1RPB0tgA1gDvdfoi/u7Q3Dl72H1 EF3eoJyoadLnBougvzL81LtNopCUNiLCGDuQII0ir4agojKdgFvTQOR2y42xGmTtSL8d C5zrsjRMdOkV3gzFBTQoag1B6vTWj7U4fMdFk5hettYJ9Gl3J0nB/oo2gjoEJFADJuNC cMEOuE3dGF/f8u8l1HNuJuTR21AYxSkK0DB/mm/2s+KcPr1pVya71Pxf2OjcJzYbKKVB DZiPTWmfd5Io4qV8WUM2QIFi0m5R9D9z1Ig9/aS0jvtNjVwGIT+YZG7iBT7Es/h2yO3w lScA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=aBBagrEFLIkxxL3BVYZ7Bhc6hs4GxQo7fyhQpsMyIs8=; b=drek1Q8GA2ijXL+6cenasEgtEz26ILqO149WXvnPU+FEeNZ6OXO/ylvptMKZOTSgsd 2OTGRG0QkZSG2EekbdLoltNcfF4ZWKJo7lLiwRU6bwZPPhW3Vcla1Pi3uZKx3fFIM6KQ gILFGCa9GePZjr8Sc3w46k70TAHE6FMF4OBxWWHaQciBp5dKlvgzXdyQo2n1dcDimSbg wKpiYg8o5WNiwc4AKp3/OGMxA9uOl1aFShSJ3U8mz7DzmNWCwsqwkvjfW/LSvqg9GIyH Rd3ny8glxm7+wdStdPz65CYgEjPxooI6rGs6dHiWhVnrylXMweJYZyNEZZVn3xqOm5OY XLWQ== X-Gm-Message-State: AOAM532cWSjvkGDExTnTxXzUoE5cxCVq8jt8n9wZNUBcaI3OBU/+8+vq UTENihu50zHWbE7W5OGYRxrDaWnWYrE= X-Google-Smtp-Source: ABdhPJyQNEf1As6IEt6ih02QTMnxLrOICjbmZf6UaJGuw5nn+OCn9O18mD6PR9/GPsFa1CIzzOAGVA== X-Received: by 2002:a5d:6783:: with SMTP id v3mr6983633wru.45.1607114902396; Fri, 04 Dec 2020 12:48:22 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id r1sm4872836wra.97.2020.12.04.12.48.21 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:21 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 04 Dec 2020 20:47:59 +0000 Subject: [PATCH v2 09/20] merge-ort: record stage and auxiliary info for every path Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Create a helper function, setup_path_info(), which can be used to record all the information we want in a merged_info or conflict_info. While there is currently only one caller of this new function, and some of its particular parameters are fixed, future callers of this function will be added later. Signed-off-by: Elijah Newren --- merge-ort.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++---- 1 file changed, 90 insertions(+), 7 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index b4e4c1f157..007c6fc067 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -158,6 +158,26 @@ struct conflict_info { unsigned match_mask:3; }; +/* + * For the next three macros, see warning for conflict_info.merged. + * + * In each of the below, mi is a struct merged_info*, and ci was defined + * as a struct conflict_info* (but we need to verify ci isn't actually + * pointed at a struct merged_info*). + * + * INITIALIZE_CI: Assign ci to mi but only if it's safe; set to NULL otherwise. + * VERIFY_CI: Ensure that something we assigned to a conflict_info* is one. + * ASSIGN_AND_VERIFY_CI: Similar to VERIFY_CI but do assignment first. + */ +#define INITIALIZE_CI(ci, mi) do { \ + (ci) = (!(mi) || (mi)->clean) ? NULL : (struct conflict_info *)(mi); \ +} while (0) +#define VERIFY_CI(ci) assert(ci && !ci->merged.clean); +#define ASSIGN_AND_VERIFY_CI(ci, mi) do { \ + (ci) = (struct conflict_info *)(mi); \ + assert((ci) && !(mi)->clean); \ +} while (0) + static int err(struct merge_options *opt, const char *err, ...) { va_list params; @@ -174,6 +194,65 @@ static int err(struct merge_options *opt, const char *err, ...) return -1; } +static void setup_path_info(struct merge_options *opt, + struct string_list_item *result, + const char *current_dir_name, + int current_dir_name_len, + char *fullpath, /* we'll take over ownership */ + struct name_entry *names, + struct name_entry *merged_version, + unsigned is_null, /* boolean */ + unsigned df_conflict, /* boolean */ + unsigned filemask, + unsigned dirmask, + int resolved /* boolean */) +{ + /* result->util is void*, so mi is a convenience typed variable */ + struct merged_info *mi; + + assert(!is_null || resolved); + assert(!df_conflict || !resolved); /* df_conflict implies !resolved */ + assert(resolved == (merged_version != NULL)); + + mi = xcalloc(1, resolved ? sizeof(struct merged_info) : + sizeof(struct conflict_info)); + mi->directory_name = current_dir_name; + mi->basename_offset = current_dir_name_len; + mi->clean = !!resolved; + if (resolved) { + mi->result.mode = merged_version->mode; + oidcpy(&mi->result.oid, &merged_version->oid); + mi->is_null = !!is_null; + } else { + int i; + struct conflict_info *ci; + + ASSIGN_AND_VERIFY_CI(ci, mi); + for (i = 0; i < 3; i++) { + ci->pathnames[i] = fullpath; + ci->stages[i].mode = names[i].mode; + oidcpy(&ci->stages[i].oid, &names[i].oid); + } + ci->filemask = filemask; + ci->dirmask = dirmask; + ci->df_conflict = !!df_conflict; + if (dirmask) + /* + * Assume is_null for now, but if we have entries + * under the directory then when it is complete in + * write_completed_directory() it'll update this. + * Also, for D/F conflicts, we have to handle the + * directory first, then clear this bit and process + * the file to see how it is handled -- that occurs + * near the top of process_entry(). + */ + mi->is_null = 1; + } + strmap_put(&opt->priv->paths, fullpath, mi); + result->string = fullpath; + result->util = mi; +} + static int collect_merge_info_callback(int n, unsigned long mask, unsigned long dirmask, @@ -188,10 +267,12 @@ static int collect_merge_info_callback(int n, */ struct merge_options *opt = info->data; struct merge_options_internal *opti = opt->priv; - struct conflict_info *ci; + struct string_list_item pi; /* Path Info */ + struct conflict_info *ci; /* typed alias to pi.util (which is void*) */ struct name_entry *p; size_t len; char *fullpath; + const char *dirname = opti->current_dir_name; unsigned filemask = mask & ~dirmask; unsigned match_mask = 0; /* will be updated below */ unsigned mbase_null = !(mask & 1); @@ -260,13 +341,15 @@ static int collect_merge_info_callback(int n, make_traverse_path(fullpath, len + 1, info, p->path, p->pathlen); /* - * TODO: record information about the path other than all zeros, - * so we can resolve later in process_entries. + * Record information about the path so we can resolve later in + * process_entries. */ - ci = xcalloc(1, sizeof(struct conflict_info)); - ci->df_conflict = df_conflict; + setup_path_info(opt, &pi, dirname, info->pathlen, fullpath, + names, NULL, 0, df_conflict, filemask, dirmask, 0); + + ci = pi.util; + VERIFY_CI(ci); ci->match_mask = match_mask; - strmap_put(&opti->paths, fullpath, ci); /* If dirmask, recurse into subdirectories */ if (dirmask) { @@ -310,7 +393,7 @@ static int collect_merge_info_callback(int n, } original_dir_name = opti->current_dir_name; - opti->current_dir_name = fullpath; + opti->current_dir_name = pi.string; ret = traverse_trees(NULL, 3, t, &newinfo); opti->current_dir_name = original_dir_name; From patchwork Fri Dec 4 20:48:00 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952365 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id F3512C4167B for ; Fri, 4 Dec 2020 20:49:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id CE96322CF8 for ; Fri, 4 Dec 2020 20:49:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730178AbgLDUtj (ORCPT ); Fri, 4 Dec 2020 15:49:39 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39988 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730047AbgLDUti (ORCPT ); Fri, 4 Dec 2020 15:49:38 -0500 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E2E36C0613D1 for ; Fri, 4 Dec 2020 12:48:24 -0800 (PST) Received: by mail-wm1-x343.google.com with SMTP id a3so8172822wmb.5 for ; Fri, 04 Dec 2020 12:48:24 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=hll+1QbjL01rZ2JGt6n4wIIvpciV503XWM1AnIHMyy0=; b=iYz/oqI5qcmn7U2nvHhxOrEx9PRhMz9NQJGGu0jkLgWVFBpTOyDtwEm4JdtMD7+wZR ZAAFn/c0vcRwvm4z1IRAhqmVIARl0yRoc4LrEs1cdO8yiFN6i7GQxZaAGyiz5Ys632XQ 7mW82CV8xq21q5w5piC4Gz7etd7VGG3Qlp5US3S1+JWtBMv1E8f1176zSmmd2V7iaTxg 3E02C8am7WexugGcVi0Ogs95IXJkGxgzzQPljKO4NjZP5VJ++A0CygyvalULCh6S+NTO U9aPA1hBMtGyQp2xbvwyKjpw8agIudpkqg1fupskKG22429yXqJksFLieF3gnYhrDq54 Bswg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=hll+1QbjL01rZ2JGt6n4wIIvpciV503XWM1AnIHMyy0=; b=lw5jDBmTLLvBd+DjAxVVdkdy0q6XX0JpPVLDCerdTW94btMNhgBsldYTJwZR6J9YRg e5PfIakLbbIgy/9HX6feSS29Vm0zXcS3vQszXS671+YPGz6T7SmMWWqc5stJsC5lDUsR s/kiez4AJ3cM7zmB1wMSE1+m8TWD2hYzDMOboJ/eUknS2AydK8PcmYAWvbzMe0+GB7Hu kfPEtFU2cWpOXHHOU2I4GMSZNlSnqG+A4D5zMMdmTR5JqQrzF+fQVY2ldutZnB+DHN+v NQE+mJMDJHNQmZxXyKksxsMomSGfBh7fTN/CBy3cCY/byKiXxOz2Sjip1k08BL+U9Esh 0XHQ== X-Gm-Message-State: AOAM533OlbAW8d479DEQmpQ8vOhvdUxk5NNGmE9ijTylG14RLehQVa7B nBJeOACWlXMeNDxCoVO3S2UyMQRu/N0= X-Google-Smtp-Source: ABdhPJxZ2SUBSqElPLHkp+n5CUY1qTaU4to1B8yKlyUPJ3wvkyvraPCGJ7zM0mE0yR8W/GuHhkeqCw== X-Received: by 2002:a7b:c8da:: with SMTP id f26mr6224553wml.50.1607114903442; Fri, 04 Dec 2020 12:48:23 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id d3sm4767253wrr.2.2020.12.04.12.48.22 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:22 -0800 (PST) Message-Id: <86c661fe1eebf692b5b6cbb6379d12303f16cde0.1607114891.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:00 +0000 Subject: [PATCH v2 10/20] merge-ort: avoid recursing into identical trees Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren When all three trees have the same oid, there is no need to recurse into these trees to find that all files within them happen to match. We can just record any one of the trees as the resolution of merging that particular path. Immediately resolving trees for other types of trivial tree merges (such as one side matches the merge base, or the two sides match each other) would prevent us from detecting renames for some paths, and thus prevent us from doing three-way content merges for those paths whose renames we did not detect. Signed-off-by: Elijah Newren --- merge-ort.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/merge-ort.c b/merge-ort.c index 007c6fc067..2dd52ab426 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -340,6 +340,19 @@ static int collect_merge_info_callback(int n, fullpath = xmalloc(len + 1); make_traverse_path(fullpath, len + 1, info, p->path, p->pathlen); + /* + * If mbase, side1, and side2 all match, we can resolve early. Even + * if these are trees, there will be no renames or anything + * underneath. + */ + if (side1_matches_mbase && side2_matches_mbase) { + /* mbase, side1, & side2 all match; use mbase as resolution */ + setup_path_info(opt, &pi, dirname, info->pathlen, fullpath, + names, names+0, mbase_null, 0, + filemask, dirmask, 1); + return mask; + } + /* * Record information about the path so we can resolve later in * process_entries. From patchwork Fri Dec 4 20:48:01 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952363 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CD3A6C4361A for ; Fri, 4 Dec 2020 20:49:39 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 9A9EC22CF6 for ; Fri, 4 Dec 2020 20:49:39 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730171AbgLDUti (ORCPT ); Fri, 4 Dec 2020 15:49:38 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39990 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730128AbgLDUti (ORCPT ); Fri, 4 Dec 2020 15:49:38 -0500 Received: from mail-wm1-x32f.google.com (mail-wm1-x32f.google.com [IPv6:2a00:1450:4864:20::32f]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ED64AC08E860 for ; Fri, 4 Dec 2020 12:48:25 -0800 (PST) Received: by mail-wm1-x32f.google.com with SMTP id g185so8193713wmf.3 for ; Fri, 04 Dec 2020 12:48:25 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=gZA4vMNHGh9mjCJSJZpRA0igoqQutFqUbTBcYvzvJK8=; b=ewFl6dLwuY7nCY9XbyJ+u7N6NFGs5CW6APIMuFjGBzDOTxELbiurEmdeYsgN1dEw6j dIXl/cGb73OtGNS3qX2CWosrlt84oV8AKWPjx2tSWbUJyzFiggi9GLZDeLNHXkQI4hbp Sra20f734FWEhcizMlwg6sFHtXKb3hr+gKN8UmPo83hV2Fng2OWt4pOqoMNibMBPy2c/ uM+sGvj0B98ZgD1sTpUkxlMqveqXqMNtvwyJ3KbGnGX2ix6L+sQe/J/Mf0SNu623aDO/ ZI384H4aQdszStKtz7//BQHQYbUca7+yRjFTpvQKkwPnoZ7CEAm3ZiS8CENiCw7wlHNe 5Eqg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=gZA4vMNHGh9mjCJSJZpRA0igoqQutFqUbTBcYvzvJK8=; b=fTgWo6Huca77M6rUmqSO+qfHq5O0GnP12dKiCsnfK2kcLoYbiiyDSHjd7v8+o2bvmy vpwhKPesKV707SlrEgyON5fqBo49m5bdnSBJMG9dkKir9DlS+3HfyPjW65iwtE2pDqqN KCdJBPRXuGCPYCW5a6zcacdMqwZzQrDD25ZBGCFyDVdVAtY+ll+Nwa09baBCASxCMd2k TX4sKk/U4oOC/CPYsfqCGOiIBGpjy/r2yVqEVDfESlK6PoFmsz+0XGhQ35+9JVMatJO/ mVyeaa3xZiaLK1aUG+SaBO1bGAhlrzma5pCQZ4yNpg15VfNVHlgFlC3yxpWndvIt+sQu T0KA== X-Gm-Message-State: AOAM532BYMP4cOzhF+zpwgO8y4VXCa3TSoycuw03xTcFYnkkIvIKkkC5 4qlFi2+pr0XwUahMLKf4BnrOgx2YsX8= X-Google-Smtp-Source: ABdhPJx4OsYRx5GJY+pMDdw4yGM/KDkmM7zOoHNOOjaiZrIIwFlOQSzHUdDotEVqhsuQKi7MvY0rng== X-Received: by 2002:a1c:80cb:: with SMTP id b194mr6141555wmd.91.1607114904402; Fri, 04 Dec 2020 12:48:24 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t136sm4494722wmt.18.2020.12.04.12.48.23 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:23 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:01 +0000 Subject: [PATCH v2 11/20] merge-ort: add a preliminary simple process_entries() implementation Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Add a process_entries() implementation that just loops over the paths and processes each one individually with an auxiliary process_entry() call. Add a basic process_entry() as well, which handles several cases but leaves a few of the more involved ones with die-not-implemented messages. Also, although process_entries() is supposed to create a tree, it does not yet have code to do so -- except in the special case of merging completely empty trees. Signed-off-by: Elijah Newren --- merge-ort.c | 103 +++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 102 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index 2dd52ab426..fbbbde1c3f 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -465,10 +465,111 @@ static int detect_and_process_renames(struct merge_options *opt, return clean; } +/* Per entry merge function */ +static void process_entry(struct merge_options *opt, + const char *path, + struct conflict_info *ci) +{ + VERIFY_CI(ci); + assert(ci->filemask >= 0 && ci->filemask <= 7); + /* ci->match_mask == 7 was handled in collect_merge_info_callback() */ + assert(ci->match_mask == 0 || ci->match_mask == 3 || + ci->match_mask == 5 || ci->match_mask == 6); + + if (ci->df_conflict) { + die("Not yet implemented."); + } + + /* + * NOTE: Below there is a long switch-like if-elseif-elseif... block + * which the code goes through even for the df_conflict cases + * above. Well, it will once we don't die-not-implemented above. + */ + if (ci->match_mask) { + ci->merged.clean = 1; + if (ci->match_mask == 6) { + /* stages[1] == stages[2] */ + ci->merged.result.mode = ci->stages[1].mode; + oidcpy(&ci->merged.result.oid, &ci->stages[1].oid); + } else { + /* determine the mask of the side that didn't match */ + unsigned int othermask = 7 & ~ci->match_mask; + int side = (othermask == 4) ? 2 : 1; + + ci->merged.result.mode = ci->stages[side].mode; + ci->merged.is_null = !ci->merged.result.mode; + oidcpy(&ci->merged.result.oid, &ci->stages[side].oid); + + assert(othermask == 2 || othermask == 4); + assert(ci->merged.is_null == + (ci->filemask == ci->match_mask)); + } + } else if (ci->filemask >= 6 && + (S_IFMT & ci->stages[1].mode) != + (S_IFMT & ci->stages[2].mode)) { + /* + * Two different items from (file/submodule/symlink) + */ + die("Not yet implemented."); + } else if (ci->filemask >= 6) { + /* + * TODO: Needs a two-way or three-way content merge, but we're + * just being lazy and copying the version from HEAD and + * leaving it as conflicted. + */ + ci->merged.clean = 0; + ci->merged.result.mode = ci->stages[1].mode; + oidcpy(&ci->merged.result.oid, &ci->stages[1].oid); + } else if (ci->filemask == 3 || ci->filemask == 5) { + /* Modify/delete */ + die("Not yet implemented."); + } else if (ci->filemask == 2 || ci->filemask == 4) { + /* Added on one side */ + int side = (ci->filemask == 4) ? 2 : 1; + ci->merged.result.mode = ci->stages[side].mode; + oidcpy(&ci->merged.result.oid, &ci->stages[side].oid); + ci->merged.clean = !ci->df_conflict; + } else if (ci->filemask == 1) { + /* Deleted on both sides */ + ci->merged.is_null = 1; + ci->merged.result.mode = 0; + oidcpy(&ci->merged.result.oid, &null_oid); + ci->merged.clean = 1; + } + + /* + * If still conflicted, record it separately. This allows us to later + * iterate over just conflicted entries when updating the index instead + * of iterating over all entries. + */ + if (!ci->merged.clean) + strmap_put(&opt->priv->conflicted, path, ci); +} + static void process_entries(struct merge_options *opt, struct object_id *result_oid) { - die("Not yet implemented."); + struct hashmap_iter iter; + struct strmap_entry *e; + + if (strmap_empty(&opt->priv->paths)) { + oidcpy(result_oid, opt->repo->hash_algo->empty_tree); + return; + } + + strmap_for_each_entry(&opt->priv->paths, &iter, e) { + /* + * NOTE: mi may actually be a pointer to a conflict_info, but + * we have to check mi->clean first to see if it's safe to + * reassign to such a pointer type. + */ + struct merged_info *mi = e->value; + + if (!mi->clean) + process_entry(opt, e->key, e->value); + } + + die("Tree creation not yet implemented"); } void merge_switch_to_result(struct merge_options *opt, From patchwork Fri Dec 4 20:48:02 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952373 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 991D8C4167B for ; Fri, 4 Dec 2020 20:49:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 5EC4F22CE3 for ; Fri, 4 Dec 2020 20:49:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730417AbgLDUtn (ORCPT ); Fri, 4 Dec 2020 15:49:43 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40002 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730332AbgLDUtm (ORCPT ); Fri, 4 Dec 2020 15:49:42 -0500 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DA00FC08E861 for ; Fri, 4 Dec 2020 12:48:26 -0800 (PST) Received: by mail-wm1-x343.google.com with SMTP id g185so8193734wmf.3 for ; Fri, 04 Dec 2020 12:48:26 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=ajsah9vP6m5FStxN4iJIKQQYqeKlqqcbNP8dSaxtu+0=; b=LrNxzJKzV+J2YoYYyS+60OHSQ7x3C7uIqTd1OD10x9304efG1sthMiLqPf65LQCOis CT7XObcbeXxpsCxHdjwHs1IEAJnGXKBbq+L8dQjiVmypX8YiG5V9O0dDFV2144oY0afS W/SmFLGSejpoI2dSdnGEfSSyyf7adI1tdE5f8PaNZ7Mxj6LDFc8Sa8NqqTB/Amipw6ir PidIDKAq60LAS7vIxmhHUbRFoRx37b83soutRxGIc+W/9N3u69gIFfSJFaGO2LFvF0D+ IcUIkx0TFzt0np8Z9hMrOVZUoZnYI1XZPtLzysyJrRGtlvsStfPlP69iWr2IdOYH4zk9 JDMw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=ajsah9vP6m5FStxN4iJIKQQYqeKlqqcbNP8dSaxtu+0=; b=LIIHwZJuo7jwi90ekFEkmZf4fAi1GifzlAOhYB6/WUiY+KlxMq660zeckSVlNQfgQ7 thhOjGQopMhbtcg1kxBNf4AMsnMJIFNThWjL9N4v3BQKK9MfSl6IUz+4Jw4lhliCvDcz 23eAodIGIQtSqvU/1/mVKFwaAmQcCZks5XlFddZpj1mlQQsmTx8fTwRFZ3Wulmg1uPfA FsIYGBLvl6dOT9WA3fA9tEzezITVqdF4SpOc2cWy58lnAjAjgtQG9c562E5s6dIEiGjk AJVoP2mVWc+7lzH6jvyIwaJBY+fdpbTVmhUZQ4oGnSp5cVxlKpXBIVLVab3w8xP5KfG3 28AA== X-Gm-Message-State: AOAM533idH5tIeSC4gfQfvO9sFEk2HglBMRq9Eaul6sD3h2Kxdk00LUU px/fhsWsKiyIeP35s6m8EUq6o9/Ff+E= X-Google-Smtp-Source: ABdhPJyvJ8my+CpvPd9vTNNBMF3hKbU7bkaIr6Q5HVH/6RwwEoncAkd5G1iXLhOT6FSvfx4/YehGZA== X-Received: by 2002:a1c:17:: with SMTP id 23mr6034021wma.35.1607114905413; Fri, 04 Dec 2020 12:48:25 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y7sm4854142wrp.3.2020.12.04.12.48.24 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:24 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:02 +0000 Subject: [PATCH v2 12/20] merge-ort: have process_entries operate in a defined order Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren We want to handle paths below a directory before needing to handle the directory itself. Also, we want to handle the directory immediately after the paths below it, so we can't use simple lexicographic ordering from strcmp (which would insert foo.txt between foo and foo/file.c). Copy string_list_df_name_compare() from merge-recursive.c, and set up a string list of paths sorted by that function so that we can iterate in the desired order. Signed-off-by: Elijah Newren --- merge-ort.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 50 insertions(+), 3 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index fbbbde1c3f..c54837999f 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -465,6 +465,33 @@ static int detect_and_process_renames(struct merge_options *opt, return clean; } +static int string_list_df_name_compare(const char *one, const char *two) +{ + int onelen = strlen(one); + int twolen = strlen(two); + /* + * Here we only care that entries for D/F conflicts are + * adjacent, in particular with the file of the D/F conflict + * appearing before files below the corresponding directory. + * The order of the rest of the list is irrelevant for us. + * + * To achieve this, we sort with df_name_compare and provide + * the mode S_IFDIR so that D/F conflicts will sort correctly. + * We use the mode S_IFDIR for everything else for simplicity, + * since in other cases any changes in their order due to + * sorting cause no problems for us. + */ + int cmp = df_name_compare(one, onelen, S_IFDIR, + two, twolen, S_IFDIR); + /* + * Now that 'foo' and 'foo/bar' compare equal, we have to make sure + * that 'foo' comes before 'foo/bar'. + */ + if (cmp) + return cmp; + return onelen - twolen; +} + /* Per entry merge function */ static void process_entry(struct merge_options *opt, const char *path, @@ -551,24 +578,44 @@ static void process_entries(struct merge_options *opt, { struct hashmap_iter iter; struct strmap_entry *e; + struct string_list plist = STRING_LIST_INIT_NODUP; + struct string_list_item *entry; if (strmap_empty(&opt->priv->paths)) { oidcpy(result_oid, opt->repo->hash_algo->empty_tree); return; } + /* Hack to pre-allocate plist to the desired size */ + ALLOC_GROW(plist.items, strmap_get_size(&opt->priv->paths), plist.alloc); + + /* Put every entry from paths into plist, then sort */ strmap_for_each_entry(&opt->priv->paths, &iter, e) { + string_list_append(&plist, e->key)->util = e->value; + } + plist.cmp = string_list_df_name_compare; + string_list_sort(&plist); + + /* + * Iterate over the items in reverse order, so we can handle paths + * below a directory before needing to handle the directory itself. + */ + for (entry = &plist.items[plist.nr-1]; entry >= plist.items; --entry) { + char *path = entry->string; /* * NOTE: mi may actually be a pointer to a conflict_info, but * we have to check mi->clean first to see if it's safe to * reassign to such a pointer type. */ - struct merged_info *mi = e->value; + struct merged_info *mi = entry->util; - if (!mi->clean) - process_entry(opt, e->key, e->value); + if (!mi->clean) { + struct conflict_info *ci = (struct conflict_info *)mi; + process_entry(opt, path, ci); + } } + string_list_clear(&plist, 0); die("Tree creation not yet implemented"); } From patchwork Fri Dec 4 20:48:03 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952375 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01C34C4361A for ; Fri, 4 Dec 2020 20:49:44 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C7D5222CF6 for ; Fri, 4 Dec 2020 20:49:43 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730394AbgLDUtm (ORCPT ); Fri, 4 Dec 2020 15:49:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40004 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730232AbgLDUtm (ORCPT ); Fri, 4 Dec 2020 15:49:42 -0500 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CDE98C08E862 for ; Fri, 4 Dec 2020 12:48:27 -0800 (PST) Received: by mail-wm1-x344.google.com with SMTP id a3so8172922wmb.5 for ; Fri, 04 Dec 2020 12:48:27 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=XJPfvf2mmPYAO+WF6XjbOW5AJ00Zwdiy5fArUe9FNv8=; b=JRET0NH4e+TBKL0BjMQCpoKiS92AwMtIFdax0pXYYHvizO9+uMANEPG9odM/rN8BRZ J/LinS4bc3a98kkQu7FkfZLh/kX2wMnODldiol4pEzrBt3Sqgc9bzszGSDvpRKm/QJmh fnH4yWrBC4v3ygHgDqy5XsYlQh4CqTWuh2Tn4PK9gcue63UqQsDaiyYeYSnBXkusv3Ir lOhnDJFkAewmjPE37hqbh27Zel30djdl1hB0pcD4scl5OXYVhwErnabQy3NO9ZwvZNl8 KQ26UBDzVRr2eLc5Gl9Fs6ukdzOIT8xcAR2tUPRm7vVdNnuFMgvx+CJ62LCM6S7h2kA9 yYNw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=XJPfvf2mmPYAO+WF6XjbOW5AJ00Zwdiy5fArUe9FNv8=; b=VkQDU5TRqplX/wV9YvR3R7LPXklhIuPFonCftZMYfsa7bB681GNqs6h/8Kx/ew7KNV T5sKuL0U7BuC5kcc8cEKE+N7niZQ9Yc/57xJNXhTcLImmFIbeg0b9zMxl3HeioUnvZTo HJYHn83+EFx04EBtlQK2iqrlza/xYpiYxaPxvqrp6mPkAyd6a61XXhKHPWPuP6/EJMaV hXwSbXXkCh17U0N+6BmoFwyRmcJ0Xy5ciHvHZRPiY9q1+8vJRVIkaxedB3NI9z2WIaCg H0QzmwLaFri/lt8EK+ZWVT0eo9QATvAAlbEOiM/oO5vyhH+Awz4Al+FAmIqbMQ4njFRK 3Uug== X-Gm-Message-State: AOAM5300tp1mAFBY2DGovBy2PFegbrxqJkJExAqAQed5eh8+XrKsBhXz wz2Q0g1Hv0vCUIxfhQvOIpvdUaQSAjo= X-Google-Smtp-Source: ABdhPJy6SEgGp+tTx9RlrmsWmcE5j2WmwHe20q2mhOzAghmJYjWvhdNVMsxKgAzS6avthoInU8OZzQ== X-Received: by 2002:a1c:f707:: with SMTP id v7mr6281656wmh.85.1607114906426; Fri, 04 Dec 2020 12:48:26 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y68sm3563239wmc.0.2020.12.04.12.48.25 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:25 -0800 (PST) Message-Id: <8ee8561d7a3b4cb76ccffbd6061f3a874f386675.1607114891.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:03 +0000 Subject: [PATCH v2 13/20] merge-ort: step 1 of tree writing -- record basenames, modes, and oids Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren As a step towards transforming the processed path->conflict_info entries into an actual tree object, start recording basenames, modes, and oids in a dir_metadata structure. Subsequent commits will make use of this to actually write a tree. Signed-off-by: Elijah Newren --- merge-ort.c | 40 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 37 insertions(+), 3 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index c54837999f..60cd73416e 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -492,10 +492,31 @@ static int string_list_df_name_compare(const char *one, const char *two) return onelen - twolen; } +struct directory_versions { + struct string_list versions; +}; + +static void record_entry_for_tree(struct directory_versions *dir_metadata, + const char *path, + struct merged_info *mi) +{ + const char *basename; + + if (mi->is_null) + /* nothing to record */ + return; + + basename = path + mi->basename_offset; + assert(strchr(basename, '/') == NULL); + string_list_append(&dir_metadata->versions, + basename)->util = &mi->result; +} + /* Per entry merge function */ static void process_entry(struct merge_options *opt, const char *path, - struct conflict_info *ci) + struct conflict_info *ci, + struct directory_versions *dir_metadata) { VERIFY_CI(ci); assert(ci->filemask >= 0 && ci->filemask <= 7); @@ -503,6 +524,14 @@ static void process_entry(struct merge_options *opt, assert(ci->match_mask == 0 || ci->match_mask == 3 || ci->match_mask == 5 || ci->match_mask == 6); + if (ci->dirmask) { + record_entry_for_tree(dir_metadata, path, &ci->merged); + if (ci->filemask == 0) + /* nothing else to handle */ + return; + assert(ci->df_conflict); + } + if (ci->df_conflict) { die("Not yet implemented."); } @@ -571,6 +600,7 @@ static void process_entry(struct merge_options *opt, */ if (!ci->merged.clean) strmap_put(&opt->priv->conflicted, path, ci); + record_entry_for_tree(dir_metadata, path, &ci->merged); } static void process_entries(struct merge_options *opt, @@ -580,6 +610,7 @@ static void process_entries(struct merge_options *opt, struct strmap_entry *e; struct string_list plist = STRING_LIST_INIT_NODUP; struct string_list_item *entry; + struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP }; if (strmap_empty(&opt->priv->paths)) { oidcpy(result_oid, opt->repo->hash_algo->empty_tree); @@ -609,13 +640,16 @@ static void process_entries(struct merge_options *opt, */ struct merged_info *mi = entry->util; - if (!mi->clean) { + if (mi->clean) + record_entry_for_tree(&dir_metadata, path, mi); + else { struct conflict_info *ci = (struct conflict_info *)mi; - process_entry(opt, path, ci); + process_entry(opt, path, ci, &dir_metadata); } } string_list_clear(&plist, 0); + string_list_clear(&dir_metadata.versions, 0); die("Tree creation not yet implemented"); } From patchwork Fri Dec 4 20:48:04 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952357 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7F3E7C4361A for ; Fri, 4 Dec 2020 20:49:28 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 38C9A22CF6 for ; Fri, 4 Dec 2020 20:49:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729924AbgLDUt0 (ORCPT ); Fri, 4 Dec 2020 15:49:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39896 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729852AbgLDUtZ (ORCPT ); Fri, 4 Dec 2020 15:49:25 -0500 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C9ADBC08E863 for ; Fri, 4 Dec 2020 12:48:28 -0800 (PST) Received: by mail-wr1-x441.google.com with SMTP id g14so6543777wrm.13 for ; Fri, 04 Dec 2020 12:48:28 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=AVSHYFXW3EbaNbEARjffEMI0emS4ihimICcXp5mLfeg=; b=k0lMsiawkf6iRaGTMraSROn0li8gdQtBYTn7WI/mMOUYfHNe2/Zl0djqpWh4zCAUx3 fPJEW76hSpcWYXXpqY0KkUMPRntTDBR49RKbrbP0tz2YtEVDtqNoXYmQ9se+8CXoEnKR TcR4rFbjcYPc12+QpQnEpXKtnK7Yr+/96ec3+z8fAiAJPHK+ffQwBR/oduz5OYdmjIb6 7I5JO2xBgsCkBRW7YQkQpiLraMI0E08dBo4e73+kQ5oF2KzqZGk60qJe/UnQQGWUkmg3 h9iQIcpkUe8vjLoDOjo/lPnGvN6vZWDtvqRgBA7G/CsNAlFWk9DY0QbTvjeGDDRkwUAk HAkQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=AVSHYFXW3EbaNbEARjffEMI0emS4ihimICcXp5mLfeg=; b=TOcSecqP3waqGAOVPgCxEWGjEvCSI3p+0w6JdXfzQzV8FsrMWgwJmkaj0l6Tqzd09y 8LgxjvZ0kWOZNAxuI6e0ty3uWpeERNAzVjg3xX/bqGAc40pTRaPJG5XvUgwl4y1Ud3zG CHzWXqyZ3XMz+HKYGiX/0/SZaAC2TSSEfdvpB0wyrd5QGB9+tkmJVQrCla3Z4jGy75xI GOWTPPdaZHmu2EVu8BjY3AdHHx+NZocdgydN8DFeGNOoEn59fDZufZlFJD90cDR0GGO9 72zBWt0mDk05w/6msO/fE2v1r6gbdSXeaHutAWU0JfbtZcXC4xTqfbTGzhXNpTMalmDm nPsA== X-Gm-Message-State: AOAM530MF925Q7E4z9cmEco4Bgi1c9GaZYMJUuxbnLInSmA7kNccI0il q7dlb+278m1s61bHiz19tU6k42bz140= X-Google-Smtp-Source: ABdhPJykApnvv+l8KmGJuiB04dupiWAYaf/rHfJPj+aEmWJcNc3xg1gBz6q2e3XUjw0ICkNWRA6Nag== X-Received: by 2002:a5d:4f10:: with SMTP id c16mr6835815wru.39.1607114907355; Fri, 04 Dec 2020 12:48:27 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id t136sm4494816wmt.18.2020.12.04.12.48.26 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:26 -0800 (PST) Message-Id: <6ff56824c336e6268620928f7eb022b71ea07ea9.1607114891.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:04 +0000 Subject: [PATCH v2 14/20] merge-ort: step 2 of tree writing -- function to create tree object Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Create a new function, write_tree(), which will take a list of basenames, modes, and oids for a single directory and create a tree object in the object-store. We do not yet have just basenames, modes, and oids for just a single directory (we have a mixture of entries from all directory levels in the hierarchy) so we still die() before the current call to write_tree(), but the next patch will rectify that. Signed-off-by: Elijah Newren --- merge-ort.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 66 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index 60cd73416e..eec6874943 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -19,6 +19,7 @@ #include "diff.h" #include "diffcore.h" +#include "object-store.h" #include "strmap.h" #include "tree.h" #include "xdiff-interface.h" @@ -496,6 +497,62 @@ struct directory_versions { struct string_list versions; }; +static int tree_entry_order(const void *a_, const void *b_) +{ + const struct string_list_item *a = a_; + const struct string_list_item *b = b_; + + const struct merged_info *ami = a->util; + const struct merged_info *bmi = b->util; + return base_name_compare(a->string, strlen(a->string), ami->result.mode, + b->string, strlen(b->string), bmi->result.mode); +} + +static void write_tree(struct object_id *result_oid, + struct string_list *versions, + unsigned int offset, + size_t hash_size) +{ + size_t maxlen = 0, extra; + unsigned int nr = versions->nr - offset; + struct strbuf buf = STRBUF_INIT; + struct string_list relevant_entries = STRING_LIST_INIT_NODUP; + int i; + + /* + * We want to sort the last (versions->nr-offset) entries in versions. + * Do so by abusing the string_list API a bit: make another string_list + * that contains just those entries and then sort them. + * + * We won't use relevant_entries again and will let it just pop off the + * stack, so there won't be allocation worries or anything. + */ + relevant_entries.items = versions->items + offset; + relevant_entries.nr = versions->nr - offset; + QSORT(relevant_entries.items, relevant_entries.nr, tree_entry_order); + + /* Pre-allocate some space in buf */ + extra = hash_size + 8; /* 8: 6 for mode, 1 for space, 1 for NUL char */ + for (i = 0; i < nr; i++) { + maxlen += strlen(versions->items[offset+i].string) + extra; + } + strbuf_grow(&buf, maxlen); + + /* Write each entry out to buf */ + for (i = 0; i < nr; i++) { + struct merged_info *mi = versions->items[offset+i].util; + struct version_info *ri = &mi->result; + strbuf_addf(&buf, "%o %s%c", + ri->mode, + versions->items[offset+i].string, '\0'); + strbuf_add(&buf, ri->oid.hash, hash_size); + } + + /* Write this object file out, and record in result_oid */ + write_object_file(buf.buf, buf.len, tree_type, result_oid); + strbuf_release(&buf); +} + static void record_entry_for_tree(struct directory_versions *dir_metadata, const char *path, struct merged_info *mi) @@ -648,9 +705,17 @@ static void process_entries(struct merge_options *opt, } } + /* + * TODO: We can't actually write a tree yet, because dir_metadata just + * contains all basenames of all files throughout the tree with their + * mode and hash. Not only is that a nonsensical tree, it will have + * lots of duplicates for paths such as "Makefile" or ".gitignore". + */ + die("Not yet implemented; need to process subtrees separately"); + write_tree(result_oid, &dir_metadata.versions, 0, + opt->repo->hash_algo->rawsz); string_list_clear(&plist, 0); string_list_clear(&dir_metadata.versions, 0); - die("Tree creation not yet implemented"); } void merge_switch_to_result(struct merge_options *opt, From patchwork Fri Dec 4 20:48:05 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952355 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 74E6BC433FE for ; Fri, 4 Dec 2020 20:49:27 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 28D2222CF6 for ; Fri, 4 Dec 2020 20:49:27 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1729961AbgLDUt0 (ORCPT ); Fri, 4 Dec 2020 15:49:26 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1729897AbgLDUtZ (ORCPT ); Fri, 4 Dec 2020 15:49:25 -0500 Received: from mail-wr1-x441.google.com (mail-wr1-x441.google.com [IPv6:2a00:1450:4864:20::441]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 436CAC08E864 for ; Fri, 4 Dec 2020 12:48:30 -0800 (PST) Received: by mail-wr1-x441.google.com with SMTP id u12so6609193wrt.0 for ; Fri, 04 Dec 2020 12:48:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=t2s9uic3y9iLNpiRf7m/spmlzdIvl6f1Dda1So6z99w=; b=QgXYsAHkGp+m8snXjpneAMvJhMoFcbu5sAoZNkxYFTMpylY9N6zO6vGELVuRhsy5BL JUsHKTqYDzSGeApQcgGrOl23grGBRmG/9IMWbGHafrHgkbvzO7awkNts/+bn9aQMxsCk Xc3Sbd3klXDznfft3cGgf80H4mxUvRuvTLIyoB8Xq9/YmjjVigAB+nZEYGPICr00DwWH G9zXEpggdrGrZfGczer/Ymd48tfvXm95EnOm3wAEr+4zNNoXc4OCJXVoENM/q+vSAh1r 9Wu1RLFS+8ddZW6usUNuDd8POO8OP9Oxahfzhr9wuuG0AiiQr0AVp5/c4s992xSnzXHC c26A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=t2s9uic3y9iLNpiRf7m/spmlzdIvl6f1Dda1So6z99w=; b=iAFn0x6D7Fb4ObuTJ131xG/vocBwicTb+kgiq9LW/9Fc4sc5zjxx9EqVOvcT0Ib+VQ iYGN6Dsqkf3ZZZrzf8HNciObeP34i/yt8qEhz8jPXBWW8JAHNiAYGTy6cJvBU6ZxcmIi Ppo+aQWCCEyaNixQDsTA7K7PlYjtVihHCpSGm4efxHpcvG5COrKT5tu2/3OnTwJEkQvf dgPL0QNjQr/H6muG5uk4tWv+aZb0D70IFtC4q93ej5m0DLSKKlVeWeC4x9Lmu8YLmsg5 1a22QTTQaYJDNpUDm6RpT/fRPIbJ21T0BYSljHRB90q8R7OgG+zF9/gDNQXASWMvsGxR FPog== X-Gm-Message-State: AOAM530aXkYDQrNStREfLAU0CcMToZzD9YzkKJqLYdw5GlmxKwV//6qm wmjJ/J/OuQgFyrThFMXnjXFhB/0pw/4= X-Google-Smtp-Source: ABdhPJyDviQrCMMysSBJDffaPbxjQCqgxaybq4r1An6wh+sWu0y6yThUmVO5MIzvy6J29g6oGrKbKg== X-Received: by 2002:adf:f607:: with SMTP id t7mr6879579wrp.169.1607114908399; Fri, 04 Dec 2020 12:48:28 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id h15sm4918046wrw.15.2020.12.04.12.48.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:27 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:05 +0000 Subject: [PATCH v2 15/20] merge-ort: step 3 of tree writing -- handling subdirectories as we go Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Our order for processing of entries means that if we have a tree of files that looks like Makefile src/moduleA/foo.c src/moduleA/bar.c src/moduleB/baz.c src/moduleB/umm.c tokens.txt Then we will process paths in the order of the leftmost column below. I have added two additional columns that help explain the algorithm that follows; the 2nd column is there to remind us we have oid & mode info we are tracking for each of these paths (which differs between the paths which I'm not representing well here), and the third column annotates the parent directory of the entry: tokens.txt "" src/moduleB/umm.c src/moduleB src/moduleB/baz.c src/moduleB src/moduleB src src/moduleA/foo.c src/moduleA src/moduleA/bar.c src/moduleA src/moduleA src src "" Makefile "" When the parent directory changes, if it's a subdirectory of the previous parent directory (e.g. "" -> src/moduleB) then we can just keep appending. If the parent directory differs from the previous parent directory and is not a subdirectory, then we should process that directory. So, for example, when we get to this point: tokens.txt "" src/moduleB/umm.c src/moduleB src/moduleB/baz.c src/moduleB and note that the next entry (src/moduleB) has a different parent than the last one that isn't a subdirectory, we should write out a tree for it 100644 blob umm.c 100644 blob baz.c then pop all the entries under that directory while recording the new hash for that directory, leaving us with tokens.txt "" src/moduleB src This process repeats until at the end we get to tokens.txt "" src "" Makefile "" and then we can write out the toplevel tree. Since we potentially have entries in our string_list corresponding to multiple different toplevel directories, e.g. a slightly different repository might have: whizbang.txt "" tokens.txt "" src/moduleD src src/moduleC src src/moduleB src src/moduleA/foo.c src/moduleA src/moduleA/bar.c src/moduleA When src/moduleA is popped off, we need to know that the "last directory" reverts back to src, and how many entries in our string_list are associated with that parent directory. So I use an auxiliary offsets string_list which would have (parent_directory,offset) information of the form "" 0 src 2 src/moduleA 5 Whenever I write out a tree for a subdirectory, I set versions.nr to the final offset value and then decrement offsets.nr...and then add an entry to versions with a hash for the new directory. The idea is relatively simple, there's just a lot of accounting to implement this. Signed-off-by: Elijah Newren --- merge-ort.c | 242 ++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 234 insertions(+), 8 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index eec6874943..cf6f395c69 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -494,7 +494,46 @@ static int string_list_df_name_compare(const char *one, const char *two) } struct directory_versions { + /* + * versions: list of (basename -> version_info) + * + * The basenames are in reverse lexicographic order of full pathnames, + * as processed in process_entries(). This puts all entries within + * a directory together, and covers the directory itself after + * everything within it, allowing us to write subtrees before needing + * to record information for the tree itself. + */ struct string_list versions; + + /* + * offsets: list of (full relative path directories -> integer offsets) + * + * Since versions contains basenames from files in multiple different + * directories, we need to know which entries in versions correspond + * to which directories. Values of e.g. + * "" 0 + * src 2 + * src/moduleA 5 + * Would mean that entries 0-1 of versions are files in the toplevel + * directory, entries 2-4 are files under src/, and the remaining + * entries starting at index 5 are files under src/moduleA/. + */ + struct string_list offsets; + + /* + * last_directory: directory that previously processed file found in + * + * last_directory starts NULL, but records the directory in which the + * previous file was found within. As soon as + * directory(current_file) != last_directory + * then we need to start updating accounting in versions & offsets. + * Note that last_directory is always the last path in "offsets" (or + * NULL if "offsets" is empty) so this exists just for quick access. + */ + const char *last_directory; + + /* last_directory_len: cached computation of strlen(last_directory) */ + unsigned last_directory_len; }; static int tree_entry_order(const void *a_, const void *b_) @@ -569,6 +608,181 @@ static void record_entry_for_tree(struct directory_versions *dir_metadata, basename)->util = &mi->result; } +static void write_completed_directory(struct merge_options *opt, + const char *new_directory_name, + struct directory_versions *info) +{ + const char *prev_dir; + struct merged_info *dir_info = NULL; + unsigned int offset; + + /* + * Some explanation of info->versions and info->offsets... + * + * process_entries() iterates over all relevant files AND + * directories in reverse lexicographic order, and calls this + * function. Thus, an example of the paths that process_entries() + * could operate on (along with the directories for those paths + * being shown) is: + * + * xtract.c "" + * tokens.txt "" + * src/moduleB/umm.c src/moduleB + * src/moduleB/stuff.h src/moduleB + * src/moduleB/baz.c src/moduleB + * src/moduleB src + * src/moduleA/foo.c src/moduleA + * src/moduleA/bar.c src/moduleA + * src/moduleA src + * src "" + * Makefile "" + * + * info->versions: + * + * always contains the unprocessed entries and their + * version_info information. For example, after the first five + * entries above, info->versions would be: + * + * xtract.c + * token.txt + * umm.c + * stuff.h + * baz.c + * + * Once a subdirectory is completed we remove the entries in + * that subdirectory from info->versions, writing it as a tree + * (write_tree()). Thus, as soon as we get to src/moduleB, + * info->versions would be updated to + * + * xtract.c + * token.txt + * moduleB + * + * info->offsets: + * + * helps us track which entries in info->versions correspond to + * which directories. When we are N directories deep (e.g. 4 + * for src/modA/submod/subdir/), we have up to N+1 unprocessed + * directories (+1 because of toplevel dir). Corresponding to + * the info->versions example above, after processing five entries + * info->offsets will be: + * + * "" 0 + * src/moduleB 2 + * + * which is used to know that xtract.c & token.txt are from the + * toplevel dirctory, while umm.c & stuff.h & baz.c are from the + * src/moduleB directory. Again, following the example above, + * once we need to process src/moduleB, then info->offsets is + * updated to + * + * "" 0 + * src 2 + * + * which says that moduleB (and only moduleB so far) is in the + * src directory. + * + * One unique thing to note about info->offsets here is that + * "src" was not added to info->offsets until there was a path + * (a file OR directory) immediately below src/ that got + * processed. + * + * Since process_entry() just appends new entries to info->versions, + * write_completed_directory() only needs to do work if the next path + * is in a directory that is different than the last directory found + * in info->offsets. + */ + + /* + * If we are working with the same directory as the last entry, there + * is no work to do. (See comments above the directory_name member of + * struct merged_info for why we can use pointer comparison instead of + * strcmp here.) + */ + if (new_directory_name == info->last_directory) + return; + + /* + * If we are just starting (last_directory is NULL), or last_directory + * is a prefix of the current directory, then we can just update + * info->offsets to record the offset where we started this directory + * and update last_directory to have quick access to it. + */ + if (info->last_directory == NULL || + !strncmp(new_directory_name, info->last_directory, + info->last_directory_len)) { + uintptr_t offset = info->versions.nr; + + info->last_directory = new_directory_name; + info->last_directory_len = strlen(info->last_directory); + /* + * Record the offset into info->versions where we will + * start recording basenames of paths found within + * new_directory_name. + */ + string_list_append(&info->offsets, + info->last_directory)->util = (void*)offset; + return; + } + + /* + * The next entry that will be processed will be within + * new_directory_name. Since at this point we know that + * new_directory_name is within a different directory than + * info->last_directory, we have all entries for info->last_directory + * in info->versions and we need to create a tree object for them. + */ + dir_info = strmap_get(&opt->priv->paths, info->last_directory); + assert(dir_info); + offset = (uintptr_t)info->offsets.items[info->offsets.nr-1].util; + if (offset == info->versions.nr) { + /* + * Actually, we don't need to create a tree object in this + * case. Whenever all files within a directory disappear + * during the merge (e.g. unmodified on one side and + * deleted on the other, or files were renamed elsewhere), + * then we get here and the directory itself needs to be + * omitted from its parent tree as well. + */ + dir_info->is_null = 1; + } else { + /* + * Write out the tree to the git object directory, and also + * record the mode and oid in dir_info->result. + */ + dir_info->is_null = 0; + dir_info->result.mode = S_IFDIR; + write_tree(&dir_info->result.oid, &info->versions, offset, + opt->repo->hash_algo->rawsz); + } + + /* + * We've now used several entries from info->versions and one entry + * from info->offsets, so we get rid of those values. + */ + info->offsets.nr--; + info->versions.nr = offset; + + /* + * Now we've taken care of the completed directory, but we need to + * prepare things since future entries will be in + * new_directory_name. (In particular, process_entry() will be + * appending new entries to info->versions.) So, we need to make + * sure new_directory_name is the last entry in info->offsets. + */ + prev_dir = info->offsets.nr == 0 ? NULL : + info->offsets.items[info->offsets.nr-1].string; + if (new_directory_name != prev_dir) { + uintptr_t c = info->versions.nr; + string_list_append(&info->offsets, + new_directory_name)->util = (void*)c; + } + + /* And, of course, we need to update last_directory to match. */ + info->last_directory = new_directory_name; + info->last_directory_len = strlen(info->last_directory); +} + /* Per entry merge function */ static void process_entry(struct merge_options *opt, const char *path, @@ -667,7 +881,9 @@ static void process_entries(struct merge_options *opt, struct strmap_entry *e; struct string_list plist = STRING_LIST_INIT_NODUP; struct string_list_item *entry; - struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP }; + struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP, + STRING_LIST_INIT_NODUP, + NULL, 0 }; if (strmap_empty(&opt->priv->paths)) { oidcpy(result_oid, opt->repo->hash_algo->empty_tree); @@ -687,6 +903,11 @@ static void process_entries(struct merge_options *opt, /* * Iterate over the items in reverse order, so we can handle paths * below a directory before needing to handle the directory itself. + * + * This allows us to write subtrees before we need to write trees, + * and it also enables sane handling of directory/file conflicts + * (because it allows us to know whether the directory is still in + * the way when it is time to process the file at the same path). */ for (entry = &plist.items[plist.nr-1]; entry >= plist.items; --entry) { char *path = entry->string; @@ -697,6 +918,8 @@ static void process_entries(struct merge_options *opt, */ struct merged_info *mi = entry->util; + write_completed_directory(opt, mi->directory_name, + &dir_metadata); if (mi->clean) record_entry_for_tree(&dir_metadata, path, mi); else { @@ -705,17 +928,20 @@ static void process_entries(struct merge_options *opt, } } - /* - * TODO: We can't actually write a tree yet, because dir_metadata just - * contains all basenames of all files throughout the tree with their - * mode and hash. Not only is that a nonsensical tree, it will have - * lots of duplicates for paths such as "Makefile" or ".gitignore". - */ - die("Not yet implemented; need to process subtrees separately"); + if (dir_metadata.offsets.nr != 1 || + (uintptr_t)dir_metadata.offsets.items[0].util != 0) { + printf("dir_metadata.offsets.nr = %d (should be 1)\n", + dir_metadata.offsets.nr); + printf("dir_metadata.offsets.items[0].util = %u (should be 0)\n", + (unsigned)(uintptr_t)dir_metadata.offsets.items[0].util); + fflush(stdout); + BUG("dir_metadata accounting completely off; shouldn't happen"); + } write_tree(result_oid, &dir_metadata.versions, 0, opt->repo->hash_algo->rawsz); string_list_clear(&plist, 0); string_list_clear(&dir_metadata.versions, 0); + string_list_clear(&dir_metadata.offsets, 0); } void merge_switch_to_result(struct merge_options *opt, From patchwork Fri Dec 4 20:48:06 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952381 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6E02DC4361A for ; Fri, 4 Dec 2020 20:50:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 2E97F22CF6 for ; Fri, 4 Dec 2020 20:50:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387728AbgLDUt6 (ORCPT ); Fri, 4 Dec 2020 15:49:58 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40030 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387533AbgLDUt4 (ORCPT ); Fri, 4 Dec 2020 15:49:56 -0500 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8E8DCC08E9AA for ; Fri, 4 Dec 2020 12:48:30 -0800 (PST) Received: by mail-wr1-x444.google.com with SMTP id x6so2636898wro.11 for ; Fri, 04 Dec 2020 12:48:30 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Uj/TZ3jFHMQeuQ7H4VVBYYgj6awrEN/LKO888rYPk+0=; b=ZQjr345rkCagPf3DKoL8kITcI5UcVWLs6NFIClRmmVhjbZDwgMj0m3ChvaK+UtUiO+ HWy02GWn+QMrjVRMskgya3GL8nQleFhK6NDqU97ekjeo32hktjTADRAsTNLG8CKAkotF 4H1rsS703Ej2yrfMqXneFfKjc7i/bfbWDOqqcJvI3tlHi3Rv7kXrukxFsw+R3RQiYRPz NHYHAip5vTCsgf+Q9CyCqA9EBunppr81xnfu0LVvMFtmrIXIBHo9dXGLhLkStocshNEq jtgScIS0uLLBa0rxRe4hZ38TVPKDR58M1Um8c+U5vTRRt2wY8NOU+ku+Y1VQr3Unte4L gicw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Uj/TZ3jFHMQeuQ7H4VVBYYgj6awrEN/LKO888rYPk+0=; b=MEdCuwz55xxKFdEGeXzB7dAJsbGArCp4vcbcTnUMDqjPEIOQY7hM1YRc8vtgchfDwd qebZzbWuaOdZmfqhb+Glz5VjV8TcJnMdTfop+/cIwTckNUpKdG8JUEAGNPf8Vd66nkwE kCSMToR2oRAA7e3OtSLaOMjAv6Eutmo+0aatIXrfkBiXLvlf+zHZtC2IkXF8MXWglwOS PUeHuPKhPMHFcA/aT9/TKzIXkWfn8TB++f3dPYd2qwsTnw/HsAzlVlXnbVjNTgNA+cBk Qysd0lL2vZ7U2zKGpDDrWwExQzWIQcu/D2ys84QJmPjZSJ6hZ21yE3t5OPe3AdUqbmpe MSfA== X-Gm-Message-State: AOAM53203m/NexulyB57r35hrMYi3pggGsuJB/b2PM9bTnH0tODYg3zX 4uA6ln+Rw0JzJLHpApWiZG47ZTjAmjA= X-Google-Smtp-Source: ABdhPJzjynF8SUk1N+Gzx16oKB8Ls1dJCWcA4rGZ77d73FS1pbarWHZI5pE699jkiNo/9h/7v+3XVA== X-Received: by 2002:adf:f1c2:: with SMTP id z2mr6843233wro.281.1607114909169; Fri, 04 Dec 2020 12:48:29 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a1sm4860291wrv.61.2020.12.04.12.48.28 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:28 -0800 (PST) Message-Id: <8e90d211c5dd5fd62dcd2f3500c47b37ab2b9abf.1607114891.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:06 +0000 Subject: [PATCH v2 16/20] merge-ort: basic outline for merge_switch_to_result() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren This adds a basic implementation for merge_switch_to_result(), though just in terms of a few new empty functions that will be defined in subsequent commits. Signed-off-by: Elijah Newren --- merge-ort.c | 42 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index cf6f395c69..fe22751d22 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -944,13 +944,53 @@ static void process_entries(struct merge_options *opt, string_list_clear(&dir_metadata.offsets, 0); } +static int checkout(struct merge_options *opt, + struct tree *prev, + struct tree *next) +{ + die("Not yet implemented."); +} + +static int record_conflicted_index_entries(struct merge_options *opt, + struct index_state *index, + struct strmap *paths, + struct strmap *conflicted) +{ + if (strmap_empty(conflicted)) + return 0; + + die("Not yet implemented."); +} + void merge_switch_to_result(struct merge_options *opt, struct tree *head, struct merge_result *result, int update_worktree_and_index, int display_update_msgs) { - die("Not yet implemented"); + assert(opt->priv == NULL); + if (result->clean >= 0 && update_worktree_and_index) { + struct merge_options_internal *opti = result->priv; + + if (checkout(opt, head, result->tree)) { + /* failure to function */ + result->clean = -1; + return; + } + + if (record_conflicted_index_entries(opt, opt->repo->index, + &opti->paths, + &opti->conflicted)) { + /* failure to function */ + result->clean = -1; + return; + } + } + + if (display_update_msgs) { + /* TODO: print out CONFLICT and other informational messages. */ + } + merge_finalize(opt, result); } From patchwork Fri Dec 4 20:48:07 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952379 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 58F05C433FE for ; Fri, 4 Dec 2020 20:50:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id F2C7F22CE3 for ; Fri, 4 Dec 2020 20:50:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387587AbgLDUtv (ORCPT ); Fri, 4 Dec 2020 15:49:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40032 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387533AbgLDUtv (ORCPT ); Fri, 4 Dec 2020 15:49:51 -0500 Received: from mail-wr1-x442.google.com (mail-wr1-x442.google.com [IPv6:2a00:1450:4864:20::442]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 86033C08ED7E for ; Fri, 4 Dec 2020 12:48:31 -0800 (PST) Received: by mail-wr1-x442.google.com with SMTP id 23so6574622wrc.8 for ; Fri, 04 Dec 2020 12:48:31 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=4Ef3rtLqdQ3bkXPrtOa/AQcwQqh9txT9O5D8ylnCFp4=; b=nTRm+COIQmx6Ax3aJtnUm8SZ5GguKUEBfhI5edmexWYM+6u2+01sGEJfhGmbBn6u4L QV6C7icdmLV3RJhBx6aCAZ+Uz9sXg4R1Rga9MQHna4BMFNkq0gv6K9/16lf3SfHkaZxx 4lWEDA8XmTcfjy4Ma2ItCIowIwF7562ntf35CV2BWMlU+A3vb4C+JQCtWaCnZXUjBmY1 m7Z38GFBGQlDryOo7QtHGVHMN0mVUyZQdWBRZrAMEDdBB1CiALFpsdOYvhP9tZ8+Ngqm Z8CFRt0ouTK/ur6Ky293c/3nwlXQJZDQxRd9sXAjdpbqFlmZEdRAuLdaECXqRDpqOi3q Ai3g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=4Ef3rtLqdQ3bkXPrtOa/AQcwQqh9txT9O5D8ylnCFp4=; b=PkeE+t+N2B6DvnKVtc/AF9XONlXcxCC/k0UexCjmFLCowibwNysyjSPKuq0nsEFKo9 0/isegWb4BXkSV3KdkUvhNVfuOJt6g818+ZFrNu0+2AppGnw6iHHBQfmrm9nEsGLgIh+ 6B0Er0IcsZhtWkAWMxz68FEKvxFTNoJTXiGcw7cCyhiSGu47akPtx4t1qX69AepIwZWc SZlmV9hqCPK2fOT6KGLWgRu7li7IiDopS1V3drJvuJjMdIC6MXGzoTtGXlINW8cWGEEs 5fjI2VVKTqLbLYGkRs4rk4vcrRppRwaEpQZabyUDBpbx5I2dt885WJrWBhikfObiYH8V rY1g== X-Gm-Message-State: AOAM530EM6TkbPlDyqlvFmp2SIKDm9GBu0I5u5wORUJByMMTZoo55AEa 5nMCLB389LieQUDuGZXplRo4nD1UeJ0= X-Google-Smtp-Source: ABdhPJyT7P4GS2wtLP5qcJXhHWPf3EJKF+qF0DLvR/bJK17yZDqlU3yb+HQte2+y+6SDo7A6266imA== X-Received: by 2002:adf:f8d2:: with SMTP id f18mr6779749wrq.379.1607114910036; Fri, 04 Dec 2020 12:48:30 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a15sm5184089wrn.75.2020.12.04.12.48.29 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:29 -0800 (PST) Message-Id: <61fada146cf26dd86705d6eeb375c28735f2364b.1607114891.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:07 +0000 Subject: [PATCH v2 17/20] merge-ort: add implementation of checkout() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Since merge-ort creates a tree for its output, when there are no conflicts, updating the working tree and index is as simple as using the unpack_trees() machinery with a twoway_merge (i.e. doing the equivalent of a "checkout" operation). If there were conflicts in the merge, then since the tree we created included all the conflict markers, then using the unpack_trees machinery in this manner will still update the working tree correctly. Further, all index entries corresponding to cleanly merged files will also be updated correctly by this procedure. Index entries corresponding to conflicted entries will appear as though the user had run "git add -u" after the merge to accept all files as-is with conflict markers. Thus, after running unpack_trees(), there needs to be a separate step for updating the entries in the index corresponding to conflicted files. This will be the job for the function record_conflicted_index_entris(), which will be implemented in a subsequent commit. Signed-off-by: Elijah Newren --- merge-ort.c | 45 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index fe22751d22..ba62f80420 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -19,9 +19,11 @@ #include "diff.h" #include "diffcore.h" +#include "dir.h" #include "object-store.h" #include "strmap.h" #include "tree.h" +#include "unpack-trees.h" #include "xdiff-interface.h" struct merge_options_internal { @@ -948,7 +950,48 @@ static int checkout(struct merge_options *opt, struct tree *prev, struct tree *next) { - die("Not yet implemented."); + /* Switch the index/working copy from old to new */ + int ret; + struct tree_desc trees[2]; + struct unpack_trees_options unpack_opts; + + memset(&unpack_opts, 0, sizeof(unpack_opts)); + unpack_opts.head_idx = -1; + unpack_opts.src_index = opt->repo->index; + unpack_opts.dst_index = opt->repo->index; + + setup_unpack_trees_porcelain(&unpack_opts, "merge"); + + /* + * NOTE: if this were just "git checkout" code, we would probably + * read or refresh the cache and check for a conflicted index, but + * builtin/merge.c or sequencer.c really needs to read the index + * and check for conflicted entries before starting merging for a + * good user experience (no sense waiting for merges/rebases before + * erroring out), so there's no reason to duplicate that work here. + */ + + /* 2-way merge to the new branch */ + unpack_opts.update = 1; + unpack_opts.merge = 1; + unpack_opts.quiet = 0; /* FIXME: sequencer might want quiet? */ + unpack_opts.verbose_update = (opt->verbosity > 2); + unpack_opts.fn = twoway_merge; + if (1/* FIXME: opts->overwrite_ignore*/) { + unpack_opts.dir = xcalloc(1, sizeof(*unpack_opts.dir)); + unpack_opts.dir->flags |= DIR_SHOW_IGNORED; + setup_standard_excludes(unpack_opts.dir); + } + parse_tree(prev); + init_tree_desc(&trees[0], prev->buffer, prev->size); + parse_tree(next); + init_tree_desc(&trees[1], next->buffer, next->size); + + ret = unpack_trees(2, trees, &unpack_opts); + clear_unpack_trees_porcelain(&unpack_opts); + dir_clear(unpack_opts.dir); + FREE_AND_NULL(unpack_opts.dir); + return ret; } static int record_conflicted_index_entries(struct merge_options *opt, From patchwork Fri Dec 4 20:48:08 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952371 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 28487C193FE for ; Fri, 4 Dec 2020 20:49:43 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id DEF6022CF8 for ; Fri, 4 Dec 2020 20:49:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730359AbgLDUtm (ORCPT ); Fri, 4 Dec 2020 15:49:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39896 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730182AbgLDUtl (ORCPT ); Fri, 4 Dec 2020 15:49:41 -0500 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 750CFC094240 for ; Fri, 4 Dec 2020 12:48:32 -0800 (PST) Received: by mail-wr1-x443.google.com with SMTP id e7so6568409wrv.6 for ; Fri, 04 Dec 2020 12:48:32 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=5Vqos9ENXW+RP/6IqXk99famSFdcxfCZlVpwjy1vIpo=; b=NJ4cr+iaKASDPi2g///wm8WW+4B8WzFx3h+WqujPiLElaElqIm1YpswEJJrBcA7ceQ 2mxmFcTpxITdwCzIWENI5TwqIiUptLRuyAqJr78wC0VNZEn1bPxka8n+Aza6eGrClmp7 vezToD+UoKDUa/m3U2KM2g+NXX8FzbwpasIT6JRiKHSp4XQ74cTObul+dpOzhudChqwj NSEmDRFylyfysrtW71cDHgg3N0dSuEXquajunYK87kIPsw629aCWORBjI2hK0fGdTYMz HenrUiV3iGX4sAUzVHXLPigwhKO6kb6u2dnhTMLhblOB2Ik0FL3uqu2pGkkFCBRmDVqB y2Qg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=5Vqos9ENXW+RP/6IqXk99famSFdcxfCZlVpwjy1vIpo=; b=QKP/xuzgbhADVwncOPsD4R1Bcjcl3XmHUdDJBdGGRFSg4COFf6zFXxWEevOwmUjPc3 Zv6hGdYgPhNCmIrL9k2tvSbFOCa68s1wgnEkcVeSWcdq/qjRi7txP6H5k88LMEjYKBPz WNI2U3T0wCvyT3blKJvHPPnbxL3z2uJGs3ee6Hmb8IZfIn8UwksNO5HowPx91UXD66+X 8D7PSxQgNnjoEa5vKbwIrXMCkd/x0IHP3AHT1wjlQAfEg1h6BRctlDjvvs8MSx43UEC3 pUAtaaH+4JsHeMzv69sr0z2vqTkoucINAXSXdntDb7AoilrIWDfjT6x08V3hFaorvkVb lUOQ== X-Gm-Message-State: AOAM5308qPxUhNEGXncAL1TLOV6R343M8bvLp2uL3vVc6VMWoc5eOROg LGLmaAPX6EIPqj41pBjXuulOzG9lRnU= X-Google-Smtp-Source: ABdhPJyHi+NYXSfNdiAbT32NsN0gKRtOMuFZGNRLGTvVEcq5N4Zk1JmiD1szFXw/YTrYN/7TBbuZ9A== X-Received: by 2002:adf:f085:: with SMTP id n5mr6709931wro.371.1607114911014; Fri, 04 Dec 2020 12:48:31 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id q25sm5254726wmq.37.2020.12.04.12.48.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:30 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:08 +0000 Subject: [PATCH v2 18/20] tree: enable cmp_cache_name_compare() to be used elsewhere Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Signed-off-by: Elijah Newren --- tree.c | 2 +- tree.h | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/tree.c b/tree.c index e76517f6b1..a52479812c 100644 --- a/tree.c +++ b/tree.c @@ -144,7 +144,7 @@ int read_tree_recursive(struct repository *r, return ret; } -static int cmp_cache_name_compare(const void *a_, const void *b_) +int cmp_cache_name_compare(const void *a_, const void *b_) { const struct cache_entry *ce1, *ce2; diff --git a/tree.h b/tree.h index 9383745073..3eb0484cbf 100644 --- a/tree.h +++ b/tree.h @@ -28,6 +28,8 @@ void free_tree_buffer(struct tree *tree); /* Parses and returns the tree in the given ent, chasing tags and commits. */ struct tree *parse_tree_indirect(const struct object_id *oid); +int cmp_cache_name_compare(const void *a_, const void *b_); + #define READ_TREE_RECURSIVE 1 typedef int (*read_tree_fn_t)(const struct object_id *, struct strbuf *, const char *, unsigned int, int, void *); From patchwork Fri Dec 4 20:48:09 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952369 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id E8490C4361B for ; Fri, 4 Dec 2020 20:49:42 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id A162822CF6 for ; Fri, 4 Dec 2020 20:49:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730312AbgLDUtm (ORCPT ); Fri, 4 Dec 2020 15:49:42 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:39898 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1730232AbgLDUtl (ORCPT ); Fri, 4 Dec 2020 15:49:41 -0500 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 812C4C094241 for ; Fri, 4 Dec 2020 12:48:33 -0800 (PST) Received: by mail-wm1-x342.google.com with SMTP id x22so6515418wmc.5 for ; Fri, 04 Dec 2020 12:48:33 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=WdcUpPrt+KeTZ/lMASTVX5s+4NhAknn3qnvaVYr3LmE=; b=Wc+ISZFCDE9tUj2znaQthFzMpSXfDNOKyWrMgKqFuuIdKnvhij6wPEcl9NBE3JE1q9 5WIClf1Emc4yIM5tuGfMcZbhR6NAF3O8Go36Pw/TcxK4MAyPeoaNyeoagy8+OVFXcWAi UAMEIxh848bZTPnh1iNsc0AlOsU6egSzYYhZoFRgLLOf48e2repPGEQrWdyq5+C7/lrX hZkdHqhlTE046lH9jGVA3ivDpO42RcD6RevT+mRdi4kZpN3ajJhvJ4GGz1fR3x4NGLS7 RyVgCq9kxxtLSAtIW/zM5vAF320MNJRIk4KdAAPG8CXVhsh1CTFQru76FGGvk06cLAof 7E+g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=WdcUpPrt+KeTZ/lMASTVX5s+4NhAknn3qnvaVYr3LmE=; b=qFdCVDAhfyIvnLs6Opp1APN1+dTmO/2+fPpt9tvDMHf8duErseZv+yhoo6GJPubKOI aiCKFxKZWZ5S8sDWUGinK0jy0KOdv/4kq6cn+RXIWmsIsbZXYBpap7fz88x7LiuFtzIp +lYdm+414mZ0OZHaxBLWbSScsbt9eCM1HsHSVTABFsypWX8mfDWsSLdXKRQhsD6SuWbU vVYDgjiU6hiHukMoPUEURIYiSZfsZPVH8vmz+mGp2hCIWQnIIO3KgoiCgMR9nT9sTvg0 URYZyrrYRz8zhfFZxxoTSq5ZPb8+7WZlfns/j69AudryY20I8ZoSWztU+Q6s6cgBqvP/ S23A== X-Gm-Message-State: AOAM533lI5LHEX+Tn3+SPVhqGqYdVHRAMGD7omlkM11kOz4B6d8Jg1NA NMaEX1Ftf6vyXoaLxdTk22ww4ayKeR0= X-Google-Smtp-Source: ABdhPJynhcSq3fSsshWqcyUBUsI3xiUG4DW8f+TY8cy8es8xTYSX1kqsAzdvrrZRiExuLZIAk/NBNg== X-Received: by 2002:a05:600c:210e:: with SMTP id u14mr6250580wml.48.1607114912028; Fri, 04 Dec 2020 12:48:32 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id w17sm4354453wmk.12.2020.12.04.12.48.31 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:31 -0800 (PST) Message-Id: <4efac38116dd43d50d0003d20f7cff301014315a.1607114891.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:09 +0000 Subject: [PATCH v2 19/20] merge-ort: add implementation of record_conflicted_index_entries() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren After checkout(), the working tree has the appropriate contents, and the index matches the working copy. That means that all unmodified and cleanly merged files have correct index entries, but conflicted entries need to be updated. We do this by looping over the conflicted entries, marking the existing index entry for the path with CE_REMOVE, adding new higher order staged for the path at the end of the index (ignoring normal index sort order), and then at the end of the loop removing the CE_REMOVED-marked cache entries and sorting the index. Signed-off-by: Elijah Newren --- merge-ort.c | 88 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 87 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index ba62f80420..faebee8e7e 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -17,6 +17,7 @@ #include "cache.h" #include "merge-ort.h" +#include "cache-tree.h" #include "diff.h" #include "diffcore.h" #include "dir.h" @@ -999,10 +1000,95 @@ static int record_conflicted_index_entries(struct merge_options *opt, struct strmap *paths, struct strmap *conflicted) { + struct hashmap_iter iter; + struct strmap_entry *e; + int errs = 0; + int original_cache_nr; + if (strmap_empty(conflicted)) return 0; - die("Not yet implemented."); + original_cache_nr = index->cache_nr; + + /* Put every entry from paths into plist, then sort */ + strmap_for_each_entry(conflicted, &iter, e) { + const char *path = e->key; + struct conflict_info *ci = e->value; + int pos; + struct cache_entry *ce; + int i; + + VERIFY_CI(ci); + + /* + * The index will already have a stage=0 entry for this path, + * because we created an as-merged-as-possible version of the + * file and checkout() moved the working copy and index over + * to that version. + * + * However, previous iterations through this loop will have + * added unstaged entries to the end of the cache which + * ignore the standard alphabetical ordering of cache + * entries and break invariants needed for index_name_pos() + * to work. However, we know the entry we want is before + * those appended cache entries, so do a temporary swap on + * cache_nr to only look through entries of interest. + */ + SWAP(index->cache_nr, original_cache_nr); + pos = index_name_pos(index, path, strlen(path)); + SWAP(index->cache_nr, original_cache_nr); + if (pos < 0) { + if (ci->filemask != 1) + BUG("Conflicted %s but nothing in basic working tree or index; this shouldn't happen", path); + cache_tree_invalidate_path(index, path); + } else { + ce = index->cache[pos]; + + /* + * Clean paths with CE_SKIP_WORKTREE set will not be + * written to the working tree by the unpack_trees() + * call in checkout(). Our conflicted entries would + * have appeared clean to that code since we ignored + * the higher order stages. Thus, we need override + * the CE_SKIP_WORKTREE bit and manually write those + * files to the working disk here. + * + * TODO: Implement this CE_SKIP_WORKTREE fixup. + */ + + /* + * Mark this cache entry for removal and instead add + * new stage>0 entries corresponding to the + * conflicts. If there are many conflicted entries, we + * want to avoid memmove'ing O(NM) entries by + * inserting the new entries one at a time. So, + * instead, we just add the new cache entries to the + * end (ignoring normal index requirements on sort + * order) and sort the index once we're all done. + */ + ce->ce_flags |= CE_REMOVE; + } + + for (i = 0; i < 3; i++) { + struct version_info *vi; + if (!(ci->filemask & (1ul << i))) + continue; + vi = &ci->stages[i]; + ce = make_cache_entry(index, vi->mode, &vi->oid, + path, i+1, 0); + add_index_entry(index, ce, ADD_CACHE_JUST_APPEND); + } + } + + /* + * Remove the unused cache entries (and invalidate the relevant + * cache-trees), then sort the index entries to get the conflicted + * entries we added to the end into their right locations. + */ + remove_marked_cache_entries(index, 1); + QSORT(index->cache, index->cache_nr, cmp_cache_name_compare); + + return errs; } void merge_switch_to_result(struct merge_options *opt, From patchwork Fri Dec 4 20:48:10 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11952383 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB40CC193FE for ; Fri, 4 Dec 2020 20:50:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 8A6BE22CF6 for ; Fri, 4 Dec 2020 20:50:29 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2387999AbgLDUuM (ORCPT ); Fri, 4 Dec 2020 15:50:12 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:40070 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2387781AbgLDUuM (ORCPT ); Fri, 4 Dec 2020 15:50:12 -0500 Received: from mail-wr1-x444.google.com (mail-wr1-x444.google.com [IPv6:2a00:1450:4864:20::444]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5E96BC094242 for ; Fri, 4 Dec 2020 12:48:34 -0800 (PST) Received: by mail-wr1-x444.google.com with SMTP id s8so6569084wrw.10 for ; Fri, 04 Dec 2020 12:48:34 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=K29zBB+AFMmeMA/iUOnzryDPtPH0Jw7ZArcsBIzNtsY=; b=qQ7x2IcwR4XUxqyQgRNj/0WQRLuQMtQXHj9qGc0VwdWylwdn3NNLnuAf+ClfClIpGT CHLGfAfO7V8+lOtMDZhQQ6TfsLPxsLFAboQT7LidhaxUxMbHz6ScTjUecyF1aMxsir9Z lDMIz1wbAUYk7o4N56ZZtZotkgSLA0UUKHAskrWE6ueUtuGu3X+XX1J9mbohkotZSWHp WjrlLA+KDqY+UrOLVgJIVvJz1QL2gqHWWTh34ZkEYQk9WVS8eAzYq5GIp3T7P086+im1 Wi13NDxivmhuV/ftuaYzZFf9Te0kJhf9SHqumt4uWCPjhIA2LGWc0KZ1TMN7QRln25d9 8BiA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=K29zBB+AFMmeMA/iUOnzryDPtPH0Jw7ZArcsBIzNtsY=; b=Al4dUggFqEDOsC1/Q7zsZlVbbH/w7kEHfkHhnR2qH47fklJxbA3pqXpjEXOTqXUVSm 5udMSAR4v2rBtL3BjNT6isA+hDbMgQuiFiZ19OMvZR8Yh8rUvnSFIkP4d/6oT/1BABQZ EEY4silIfxnjnGRc6qFgDsWS6GXJ2kmgq9YrcGEtFPKImVrQJLafNgNiJtbknl5PGe+O /qDmrh7Z2zlLCRhWni0/40P0dZBRqITUFqk1RdJXC8J23VUp5tTg5IEvXYbP3wot0GQZ 7Sn2W+Kr54ner4RhUcA7J0fUhj7/ox3pCZmm7JAaEvKjqPAYW9B7HaVYiohPEMBuYDxV B/HA== X-Gm-Message-State: AOAM532xC5drXgRCpJ5tgmm0W/VO+Y6Zxi4yhR4P0swllIgM/tvL4Sez NKrYjXgokp5y2pEfWKI0WeZhdCxCZgo= X-Google-Smtp-Source: ABdhPJzH25DmPZo/JtnChSb+eVK9an79u3bn1UY1pgMR2hNk5IbK4HGsRhXToicvzYYwJb5PlzgBRA== X-Received: by 2002:a5d:4f10:: with SMTP id c16mr6836099wru.39.1607114912985; Fri, 04 Dec 2020 12:48:32 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y130sm4690822wmc.22.2020.12.04.12.48.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Fri, 04 Dec 2020 12:48:32 -0800 (PST) Message-Id: In-Reply-To: References: Date: Fri, 04 Dec 2020 20:48:10 +0000 Subject: [PATCH v2 20/20] merge-ort: free data structures in merge_finalize() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: jonathantanmy@google.com, dstolee@microsoft.com, Elijah Newren , =?utf-8?b?w4Z2YXIgQXJuZmrDtnLDsA==?= Bjarmason , Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Signed-off-by: Elijah Newren --- merge-ort.c | 32 +++++++++++++++++++++++++++++++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index faebee8e7e..5d13932dd9 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -182,6 +182,16 @@ struct conflict_info { assert((ci) && !(mi)->clean); \ } while (0) +static void free_strmap_strings(struct strmap *map) +{ + struct hashmap_iter iter; + struct strmap_entry *entry; + + strmap_for_each_entry(map, &iter, entry) { + free((char*)entry->key); + } +} + static int err(struct merge_options *opt, const char *err, ...) { va_list params; @@ -1126,7 +1136,27 @@ void merge_switch_to_result(struct merge_options *opt, void merge_finalize(struct merge_options *opt, struct merge_result *result) { - die("Not yet implemented"); + struct merge_options_internal *opti = result->priv; + + assert(opt->priv == NULL); + + /* + * We marked opti->paths with strdup_strings = 0, so that we + * wouldn't have to make another copy of the fullpath created by + * make_traverse_path from setup_path_info(). But, now that we've + * used it and have no other references to these strings, it is time + * to deallocate them. + */ + free_strmap_strings(&opti->paths); + strmap_clear(&opti->paths, 1); + + /* + * All keys and values in opti->conflicted are a subset of those in + * opti->paths. We don't want to deallocate anything twice, so we + * don't free the keys and we pass 0 for free_values. + */ + strmap_clear(&opti->conflicted, 0); + FREE_AND_NULL(opti); } static void merge_start(struct merge_options *opt, struct merge_result *result)