From patchwork Sun Nov 29 07:43:04 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11938973
Message-Id: <2568ec92c6d96dc51aff4a411900eaec8d32ce27.1606635803.git.gitgitgadget@gmail.com>
Date: Sun, 29 Nov 2020 07:43:04 +0000
Subject: [PATCH 01/20] merge-ort: setup basic internal data structures
To: git@vger.kernel.org
Cc: Elijah Newren
From: Elijah Newren

Set up some basic internal data structures.
The only carry-over from merge-recursive.c is call_depth, though
needed_rename_limit will be added later.

The central piece of data will definitely be the strmap "paths", which
will map every relevant pathname under consideration to either a
merged_info or a conflict_info.  ("conflicted" is a strmap that is a
subset of "paths".)

merged_info contains all relevant information for a non-conflicted
entry.  conflict_info contains a merged_info, plus any additional
information about a conflict such as the higher order stages involved
and the names of the paths those came from (handy once renames get
involved).  If an entry remains conflicted, the merged_info portion of a
conflict_info will later be filled with whatever version of the file
should be placed in the working directory (e.g. an as-merged-as-possible
variation that contains conflict markers).

Signed-off-by: Elijah Newren
---
 merge-ort.c | 137 ++++++++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 137 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index b487901d3e..bb37fdf838 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,6 +17,143 @@
 #include "cache.h"
 #include "merge-ort.h"
 
+#include "strmap.h"
+
+struct merge_options_internal {
+	/*
+	 * paths: primary data structure in all of merge ort.
+	 *
+	 * The keys of paths:
+	 *   * are full relative paths from the toplevel of the repository
+	 *     (e.g. "drivers/firmware/raspberrypi.c").
+	 *   * store all relevant paths in the repo, both directories and
+	 *     files (e.g. drivers, drivers/firmware would also be included)
+	 *   * these keys serve to intern all the path strings, which allows
+	 *     us to do pointer comparison on directory names instead of
+	 *     strcmp; we just have to be careful to use the interned strings.
+	 *
+	 * The values of paths:
+	 *   * either a pointer to a merged_info, or a conflict_info struct
+	 *   * merged_info contains all relevant information for a
+	 *     non-conflicted entry.
+	 *   * conflict_info contains a merged_info, plus any additional
+	 *     information about a conflict such as the higher order stages
+	 *     involved and the names of the paths those came from (handy
+	 *     once renames get involved).
+	 *   * a path may start "conflicted" (i.e. point to a conflict_info)
+	 *     and then a later step (e.g. three-way content merge) determines
+	 *     it can be cleanly merged, at which point it'll be marked clean
+	 *     and the algorithm will ignore any data outside the contained
+	 *     merged_info for that entry
+	 *   * If an entry remains conflicted, the merged_info portion of a
+	 *     conflict_info will later be filled with whatever version of
+	 *     the file should be placed in the working directory (e.g. an
+	 *     as-merged-as-possible variation that contains conflict markers).
+	 */
+	struct strmap paths;
+
+	/*
+	 * conflicted: a subset of keys->values from "paths"
+	 *
+	 * conflicted is basically an optimization between process_entries()
+	 * and record_conflicted_index_entries(); the latter could loop over
+	 * ALL the entries in paths AGAIN and look for the ones that are
+	 * still conflicted, but since process_entries() has to loop over
+	 * all of them, it saves the ones it couldn't resolve in this strmap
+	 * so that record_conflicted_index_entries() can iterate just the
+	 * relevant entries.
+	 */
+	struct strmap conflicted;
+
+	/*
+	 * current_dir_name: temporary var used in collect_merge_info_callback()
+	 *
+	 * Used to set merged_info.directory_name; see documentation for that
+	 * variable and the requirements placed on that field.
+	 */
+	const char *current_dir_name;
+
+	/* call_depth: recursion level counter for merging merge bases */
+	int call_depth;
+};
+
+struct version_info {
+	struct object_id oid;
+	unsigned short mode;
+};
+
+struct merged_info {
+	/* if is_null, ignore result.  otherwise result has oid & mode */
+	struct version_info result;
+	unsigned is_null:1;
+
+	/*
+	 * clean: whether the path in question is cleanly merged.
+	 *
+	 * see conflict_info.merged for more details.
+	 */
+	unsigned clean:1;
+
+	/*
+	 * basename_offset: offset of basename of path.
+	 *
+	 * perf optimization to avoid recomputing offset of final '/'
+	 * character in pathname (0 if no '/' in pathname).
+	 */
+	size_t basename_offset;
+
+	/*
+	 * directory_name: containing directory name.
+	 *
+	 * Note that we assume directory_name is constructed such that
+	 *    strcmp(dir1_name, dir2_name) == 0 iff dir1_name == dir2_name,
+	 * i.e. string equality is equivalent to pointer equality.  For this
+	 * to hold, we have to be careful setting directory_name.
+	 */
+	const char *directory_name;
+};
+
+struct conflict_info {
+	/*
+	 * merged: the version of the path that will be written to working tree
+	 *
+	 * WARNING: It is critical to check merged.clean and ensure it is 0
+	 * before reading any conflict_info fields outside of merged.
+	 * Allocated merged_info structs will always have clean set to 1.
+	 * Allocated conflict_info structs will have merged.clean set to 0
+	 * initially.  The merged.clean field is how we know if it is safe
+	 * to access other parts of conflict_info besides merged; if a
+	 * conflict_info's merged.clean is changed to 1, the rest of the
+	 * algorithm is not allowed to look at anything outside of the
+	 * merged member anymore.
+	 */
+	struct merged_info merged;
+
+	/* oids & modes from each of the three trees for this path */
+	struct version_info stages[3];
+
+	/* pathnames for each stage; may differ due to rename detection */
+	const char *pathnames[3];
+
+	/* Whether this path is/was involved in a directory/file conflict */
+	unsigned df_conflict:1;
+
+	/*
+	 * For filemask and dirmask, see tree-walk.h's struct traverse_info,
+	 * particularly the documentation above the "fn" member.  Note that
+	 * filemask = mask & ~dirmask from that documentation.
+	 */
+	unsigned filemask:3;
+	unsigned dirmask:3;
+
+	/*
+	 * Optimization to track which stages match, to avoid the need to
+	 * recompute it in multiple steps.  Either 0 or at least 2 bits are
+	 * set; if at least 2 bits are set, their corresponding stages match.
+	 */
+	unsigned match_mask:3;
+};
+
 void merge_switch_to_result(struct merge_options *opt,
 			    struct tree *head,
 			    struct merge_result *result,

From patchwork Sun Nov 29 07:43:05 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11938969
Message-Id: <3a063865c39f9d1d03acc0cda9023811007d92b5.1606635803.git.gitgitgadget@gmail.com>
Date: Sun, 29 Nov 2020 07:43:05 +0000
Subject: [PATCH 02/20] merge-ort: add some high-level algorithm structure
To: git@vger.kernel.org
Cc: Elijah Newren
From: Elijah Newren

merge_ort_nonrecursive_internal() will be used by both
merge_incore_nonrecursive() and merge_incore_recursive(); let's focus on
it for now.  It involves some setup -- merge_start() -- followed by this
chain of functions:

  collect_merge_info()
    This function will populate merge_options_internal's paths field,
    via a call to traverse_trees() and a new callback that will be added
    later.

  detect_and_process_renames()
    This function will detect renames, and then adjust entries in paths
    to move conflict stages from old pathnames into those for new
    pathnames, so that the next step doesn't have to think about renames
    and can just do three-way content merging and such.

  process_entries()
    This function determines how to take the various stages (versions of
    a file from the three different sides) and merge them, and whether
    to mark the result as conflicted or cleanly merged.  It also writes
    out these merged file versions as it goes to create a tree.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 67 ++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 66 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index bb37fdf838..97ef2276bd 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -18,6 +18,7 @@
 #include "merge-ort.h"
 
 #include "strmap.h"
+#include "tree.h"
 
 struct merge_options_internal {
 	/*
@@ -154,6 +155,37 @@ struct conflict_info {
 	unsigned match_mask:3;
 };
 
+static int collect_merge_info(struct merge_options *opt,
+			      struct tree *merge_base,
+			      struct tree *side1,
+			      struct tree *side2)
+{
+	die("Not yet implemented.");
+}
+
+static int detect_and_process_renames(struct merge_options *opt,
+				      struct tree *merge_base,
+				      struct tree *side1,
+				      struct tree *side2)
+{
+	int clean = 1;
+
+	/*
+	 * Rename detection works by detecting file similarity.  Here we use
+	 * a really easy-to-implement scheme: files are similar IFF they have
+	 * the same filename.  Therefore, by this scheme, there are no renames.
+	 *
+	 * TODO: Actually implement a real rename detection scheme.
+	 */
+	return clean;
+}
+
+static void process_entries(struct merge_options *opt,
+			    struct object_id *result_oid)
+{
+	die("Not yet implemented.");
+}
+
 void merge_switch_to_result(struct merge_options *opt,
 			    struct tree *head,
 			    struct merge_result *result,
@@ -170,13 +202,46 @@ void merge_finalize(struct merge_options *opt,
 	die("Not yet implemented");
 }
 
+static void merge_start(struct merge_options *opt, struct merge_result *result)
+{
+	die("Not yet implemented.");
+}
+
+/*
+ * Originally from merge_trees_internal(); heavily adapted, though.
+ */
+static void merge_ort_nonrecursive_internal(struct merge_options *opt,
+					    struct tree *merge_base,
+					    struct tree *side1,
+					    struct tree *side2,
+					    struct merge_result *result)
+{
+	struct object_id working_tree_oid;
+
+	collect_merge_info(opt, merge_base, side1, side2);
+	result->clean = detect_and_process_renames(opt, merge_base,
+						   side1, side2);
+	process_entries(opt, &working_tree_oid);
+
+	/* Set return values */
+	result->tree = parse_tree_indirect(&working_tree_oid);
+	/* existence of conflicted entries implies unclean */
+	result->clean &= strmap_empty(&opt->priv->conflicted);
+	if (!opt->priv->call_depth) {
+		result->priv = opt->priv;
+		opt->priv = NULL;
+	}
+}
+
 void merge_incore_nonrecursive(struct merge_options *opt,
 			       struct tree *merge_base,
 			       struct tree *side1,
 			       struct tree *side2,
 			       struct merge_result *result)
 {
-	die("Not yet implemented");
+	assert(opt->ancestor != NULL);
+	merge_start(opt, result);
+	merge_ort_nonrecursive_internal(opt, merge_base, side1, side2, result);
 }
 
 void merge_incore_recursive(struct merge_options *opt,

From patchwork Sun Nov 29 07:43:06 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11938967
Message-Id: <5615f0eecb49a7605d316b71fc1c89229c0277f4.1606635803.git.gitgitgadget@gmail.com>
Date: Sun, 29 Nov 2020 07:43:06 +0000
Subject: [PATCH 03/20] merge-ort: port merge_start() from merge-recursive
To: git@vger.kernel.org
Cc: Elijah Newren
From: Elijah Newren

merge_start() basically does a bunch of sanity checks, then allocates
and initializes opt->priv -- a struct merge_options_internal.

Most of the sanity checks are usable as-is.
The allocation/initialization is a bit different since merge-ort has a
very different merge_options_internal than merge-recursive, but the idea
is the same.

The weirdest part here is that merge-ort and merge-recursive use the
same struct merge_options, even though merge_options has a number of
fields that are oddly specific to merge-recursive's internal
implementation and don't even make sense with merge-ort's high-level
design (e.g. buffer_output, which merge-ort has to always do).  I reused
the same data structure because:

  * most of the fields made sense to both merge algorithms
  * making a new struct would have required making new enums or somehow
    externalizing them, and that was getting messy.
  * it simplifies converting the existing callers by not having to have
    different code paths for merge_options setup.

I also marked detect_renames as ignored.  We can revisit that later, but
in short: merge-recursive allowed turning off rename detection because
it was sometimes glacially slow.  When you speed something up by a few
orders of magnitude, it's worth revisiting whether that justification is
still relevant.  Besides, if folks find it's still too slow, perhaps
they have a better scaling case than I could find and maybe it turns up
some more optimizations we can add.  If it still is needed as an option,
it is easy to add later.

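For readers who want the shape of merge_start() without the git-specific
types, here is a minimal standalone sketch of the pattern described
above: sanity check the caller-supplied options, then allocate
zero-initialized private state.  The names demo_options, demo_internal,
demo_start() and demo_finish() are hypothetical stand-ins, not part of
git, and plain calloc() is used here where git would use xcalloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for merge_options_internal. */
struct demo_internal {
	int call_depth;
};

/* Hypothetical stand-in for merge_options; only a few fields shown. */
struct demo_options {
	const char *branch1;
	const char *branch2;
	int rename_limit;
	struct demo_internal *priv;
};

/*
 * Sanity check the public options, then allocate zero-initialized
 * private state.  Returns 0 on success, -1 on allocation failure.
 */
static int demo_start(struct demo_options *opt)
{
	assert(opt->branch1 && opt->branch2);
	assert(opt->rename_limit >= -1);
	assert(opt->priv == NULL);	/* must not be started twice */

	opt->priv = calloc(1, sizeof(*opt->priv));
	return opt->priv ? 0 : -1;
}

/* Release the private state so the options struct can be reused. */
static void demo_finish(struct demo_options *opt)
{
	free(opt->priv);
	opt->priv = NULL;
}
```

Asserting that priv is NULL on entry is what lets one options struct be
shared by callers while still catching a start without a matching
finish.
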
Signed-off-by: Elijah Newren
---
 merge-ort.c | 45 ++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 44 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 97ef2276bd..3581a7d278 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -17,6 +17,8 @@
 #include "cache.h"
 #include "merge-ort.h"
 
+#include "diff.h"
+#include "diffcore.h"
 #include "strmap.h"
 #include "tree.h"
 
@@ -204,7 +206,48 @@ void merge_finalize(struct merge_options *opt,
 
 static void merge_start(struct merge_options *opt, struct merge_result *result)
 {
-	die("Not yet implemented.");
+	/* Sanity checks on opt */
+	assert(opt->repo);
+
+	assert(opt->branch1 && opt->branch2);
+
+	assert(opt->detect_directory_renames >= MERGE_DIRECTORY_RENAMES_NONE &&
+	       opt->detect_directory_renames <= MERGE_DIRECTORY_RENAMES_TRUE);
+	assert(opt->rename_limit >= -1);
+	assert(opt->rename_score >= 0 && opt->rename_score <= MAX_SCORE);
+	assert(opt->show_rename_progress >= 0 && opt->show_rename_progress <= 1);
+
+	assert(opt->xdl_opts >= 0);
+	assert(opt->recursive_variant >= MERGE_VARIANT_NORMAL &&
+	       opt->recursive_variant <= MERGE_VARIANT_THEIRS);
+
+	/*
+	 * detect_renames, verbosity, buffer_output, and obuf are ignored
+	 * fields that were used by "recursive" rather than "ort" -- but
+	 * sanity check them anyway.
+	 */
+	assert(opt->detect_renames >= -1 &&
+	       opt->detect_renames <= DIFF_DETECT_COPY);
+	assert(opt->verbosity >= 0 && opt->verbosity <= 5);
+	assert(opt->buffer_output <= 2);
+	assert(opt->obuf.len == 0);
+
+	assert(opt->priv == NULL);
+
+	/* Initialization of opt->priv, our internal merge data */
+	opt->priv = xcalloc(1, sizeof(*opt->priv));
+
+	/*
+	 * Although we initialize opt->priv->paths with strdup_strings=0,
+	 * that's just to avoid making yet another copy of an allocated
+	 * string.  Putting the entry into paths means we are taking
+	 * ownership, so we will later free it.
+	 *
+	 * In contrast, conflicted just has a subset of keys from paths, so
+	 * we don't want to free those (it'd be a duplicate free).
+	 */
+	strmap_init_with_options(&opt->priv->paths, NULL, 0);
+	strmap_init_with_options(&opt->priv->conflicted, NULL, 0);
 }
 
 /*

From patchwork Sun Nov 29 07:43:07 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11938971
Message-Id: <564b072ac105ee9c3ccb30c6046ce66270fbbf53.1606635803.git.gitgitgadget@gmail.com>
Date: Sun, 29 Nov 2020 07:43:07 +0000
Subject: [PATCH 04/20] merge-ort: use histogram diff
To: git@vger.kernel.org
Cc: Elijah Newren
From: Elijah Newren

In my cursory investigation, histogram diffs are about 2% slower than
Myers diffs.  Others have probably done more detailed benchmarks.  But,
in short, histogram diffs have been around for years and in a number of
cases provide obviously better looking diffs where Myers diffs are
unintelligible, but the performance hit has kept them from becoming the
default.

However, there are real merge bugs we know about that have triggered on
git.git and linux.git, which I don't have a clue how to address without
the additional information that I believe is provided by histogram
diffs.  See the following:

  https://lore.kernel.org/git/20190816184051.GB13894@sigill.intra.peff.net/
  https://lore.kernel.org/git/CABPp-BHvJHpSJT7sdFwfNcPn_sOXwJi3=o14qjZS3M8Rzcxe2A@mail.gmail.com/
  https://lore.kernel.org/git/CABPp-BGtez4qjbtFT1hQoREfcJPmk9MzjhY5eEq1QhXT23tFOw@mail.gmail.com/

I don't like mismerges.  I really don't like silent mismerges.  While I
am sometimes willing to trade off performance against correctness, I'm
much more interested in correctness in general.  I want to fix the above
bugs.  I have not yet started doing so, but I believe histogram diff at
least gives me an angle.  Unfortunately, I can't rely on using the
information from histogram diff unless it's in use.  And it hasn't been
used because of a performance hit of a few percent.

In testcases I have looked at, merge-ort is _much_ faster than
merge-recursive for non-trivial merges/rebases/cherry-picks.  As such,
this is a golden opportunity to switch out the underlying diff algorithm
(at least the one used by the merge machinery; git-diff and git-log are
separate questions); doing so will allow me to get additional data and
improved diffs, and I believe it will help me fix the above bugs at some
point in the future.

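To make the hardcoding concrete, here is a small sketch of what
selecting an algorithm in the xdl_opts bitfield amounts to, assuming
DIFF_WITH_ALG() works by clearing the algorithm-selection bits and
setting the requested one while leaving the other flags alone.  The
DEMO_* constants and demo_with_alg() are hypothetical stand-ins with
assumed bit values; the real definitions live in git's diff.h and
xdiff/xdiff.h and may differ.

```c
#include <assert.h>

/* Assumed flag values, for illustration only. */
#define DEMO_IGNORE_WHITESPACE	(1L << 1)
#define DEMO_PATIENCE_DIFF	(1L << 14)
#define DEMO_HISTOGRAM_DIFF	(1L << 15)
#define DEMO_ALGORITHM_MASK	(DEMO_PATIENCE_DIFF | DEMO_HISTOGRAM_DIFF)

/*
 * Return xdl_opts with any previously selected diff algorithm removed
 * and "alg" selected instead; unrelated flags are preserved.
 */
static long demo_with_alg(long xdl_opts, long alg)
{
	return (xdl_opts & ~DEMO_ALGORITHM_MASK) | alg;
}
```

Under these assumptions, a caller that asked for patience diff plus
whitespace-insensitive comparison would end up with histogram diff and
the whitespace flag still set, which is the "just hardcode it" behavior
the patch describes.
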
Signed-off-by: Elijah Newren
---
 merge-ort.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index 3581a7d278..d737762700 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -21,6 +21,7 @@
 #include "diffcore.h"
 #include "strmap.h"
 #include "tree.h"
+#include "xdiff-interface.h"
 
 struct merge_options_internal {
 	/*
@@ -234,6 +235,9 @@ static void merge_start(struct merge_options *opt, struct merge_result *result)
 
 	assert(opt->priv == NULL);
 
+	/* Default to histogram diff. Actually, just hardcode it...for now. */
+	opt->xdl_opts = DIFF_WITH_ALG(opt, HISTOGRAM_DIFF);
+
 	/* Initialization of opt->priv, our internal merge data */
 	opt->priv = xcalloc(1, sizeof(*opt->priv));
 

From patchwork Sun Nov 29 07:43:08 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11938977
Message-Id: <91516799e46ebbc91fb6b1811164fe7c9a15a3ad.1606635803.git.gitgitgadget@gmail.com>
Date: Sun, 29 Nov 2020 07:43:08 +0000
Subject: [PATCH 05/20] merge-ort: add an err() function similar to one from merge-recursive
To: git@vger.kernel.org
Cc: Elijah Newren, Elijah Newren
From: Elijah Newren

Various places in merge-recursive used an err() function when they hit
some kind of unrecoverable error. That code was from the reusable bits
of merge-recursive.c that we liked, such as merge_3way, writing object
files to the object store, reading blobs from the object store, etc. So
create a similar function to allow us to port that code over, and use it
when we detect problems returned from collect_merge_info()'s
traverse_trees() call, which we will be adding next.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 27 ++++++++++++++++++++++++++-
 1 file changed, 26 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index d737762700..baf31bcc28 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -158,11 +158,28 @@ struct conflict_info {
 	unsigned match_mask:3;
 };
 
+static int err(struct merge_options *opt, const char *err, ...)
+{
+	va_list params;
+	struct strbuf sb = STRBUF_INIT;
+
+	strbuf_addstr(&sb, "error: ");
+	va_start(params, err);
+	strbuf_vaddf(&sb, err, params);
+	va_end(params);
+
+	error("%s", sb.buf);
+	strbuf_release(&sb);
+
+	return -1;
+}
+
 static int collect_merge_info(struct merge_options *opt,
 			      struct tree *merge_base,
 			      struct tree *side1,
 			      struct tree *side2)
 {
+	/* TODO: Implement this using traverse_trees() */
 	die("Not yet implemented.");
 }
 
@@ -265,7 +282,15 @@ static void merge_ort_nonrecursive_internal(struct merge_options *opt,
 {
 	struct object_id working_tree_oid;
 
-	collect_merge_info(opt, merge_base, side1, side2);
+	if (collect_merge_info(opt, merge_base, side1, side2) != 0) {
+		err(opt, _("collecting merge info failed for trees %s, %s, %s"),
+		    oid_to_hex(&merge_base->object.oid),
+		    oid_to_hex(&side1->object.oid),
+		    oid_to_hex(&side2->object.oid));
+		result->clean = -1;
+		return;
+	}
+
 	result->clean = detect_and_process_renames(opt, merge_base,
 						   side1, side2);
 	process_entries(opt, &working_tree_oid);

From patchwork Sun Nov 29 07:43:09 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11938983
Date: Sun, 29 Nov 2020 07:43:09 +0000
Subject: [PATCH 06/20] merge-ort: implement a very basic collect_merge_info()
To: git@vger.kernel.org
Cc: Elijah Newren, Elijah Newren
From: Elijah Newren

This does not actually collect any necessary info other than the
pathnames involved, since it just allocates an all-zero conflict_info
and stuffs that into paths. However, it invokes the traverse_trees()
machinery to walk over all the paths and sets up the basic
infrastructure we need.

I have left out a few obvious optimizations to try to make this patch as
short and obvious as possible. A subsequent patch will add some of
those back in with some more useful data fields before we introduce a
patch that actually sets up the conflict_info fields.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 119 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 117 insertions(+), 2 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index baf31bcc28..a3096876d4 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -174,13 +174,128 @@ static int err(struct merge_options *opt, const char *err, ...)
 	return -1;
 }
 
+static int collect_merge_info_callback(int n,
+				       unsigned long mask,
+				       unsigned long dirmask,
+				       struct name_entry *names,
+				       struct traverse_info *info)
+{
+	/*
+	 * n is 3. Always.
+	 * common ancestor (mbase) has mask 1, and stored in index 0 of names
+	 * head of side 1 (side1) has mask 2, and stored in index 1 of names
+	 * head of side 2 (side2) has mask 4, and stored in index 2 of names
+	 */
+	struct merge_options *opt = info->data;
+	struct merge_options_internal *opti = opt->priv;
+	struct conflict_info *ci;
+	struct name_entry *p;
+	size_t len;
+	char *fullpath;
+	unsigned filemask = mask & ~dirmask;
+	unsigned mbase_null = !(mask & 1);
+	unsigned side1_null = !(mask & 2);
+	unsigned side2_null = !(mask & 4);
+
+	/* n = 3 is a fundamental assumption. */
+	if (n != 3)
+		BUG("Called collect_merge_info_callback wrong");
+
+	/*
+	 * A bunch of sanity checks verifying that traverse_trees() calls
+	 * us the way I expect. Could just remove these at some point,
+	 * though maybe they are helpful to future code readers.
+	 */
+	assert(mbase_null == is_null_oid(&names[0].oid));
+	assert(side1_null == is_null_oid(&names[1].oid));
+	assert(side2_null == is_null_oid(&names[2].oid));
+	assert(!mbase_null || !side1_null || !side2_null);
+	assert(mask > 0 && mask < 8);
+
+	/*
+	 * Get the name of the relevant filepath, which we'll pass to
+	 * setup_path_info() for tracking.
+	 */
+	p = names;
+	while (!p->mode)
+		p++;
+	len = traverse_path_len(info, p->pathlen);
+
+	/* +1 in both of the following lines to include the NUL byte */
+	fullpath = xmalloc(len + 1);
+	make_traverse_path(fullpath, len + 1, info, p->path, p->pathlen);
+
+	/*
+	 * TODO: record information about the path other than all zeros,
+	 * so we can resolve later in process_entries.
+	 */
+	ci = xcalloc(1, sizeof(struct conflict_info));
+	strmap_put(&opti->paths, fullpath, ci);
+
+	/* If dirmask, recurse into subdirectories */
+	if (dirmask) {
+		struct traverse_info newinfo;
+		struct tree_desc t[3];
+		void *buf[3] = {NULL, NULL, NULL};
+		const char *original_dir_name;
+		int i, ret;
+
+		ci->match_mask &= filemask;
+		newinfo = *info;
+		newinfo.prev = info;
+		newinfo.name = p->path;
+		newinfo.namelen = p->pathlen;
+		newinfo.pathlen = st_add3(newinfo.pathlen, p->pathlen, 1);
+
+		for (i = 0; i < 3; i++) {
+			const struct object_id *oid = NULL;
+			if (dirmask & 1)
+				oid = &names[i].oid;
+			buf[i] = fill_tree_descriptor(opt->repo, t + i, oid);
+			dirmask >>= 1;
+		}
+
+		original_dir_name = opti->current_dir_name;
+		opti->current_dir_name = fullpath;
+		ret = traverse_trees(NULL, 3, t, &newinfo);
+		opti->current_dir_name = original_dir_name;
+
+		for (i = 0; i < 3; i++)
+			free(buf[i]);
+
+		if (ret < 0)
+			return -1;
+	}
+
+	return mask;
+}
+
 static int collect_merge_info(struct merge_options *opt,
 			      struct tree *merge_base,
 			      struct tree *side1,
 			      struct tree *side2)
 {
-	/* TODO: Implement this using traverse_trees() */
-	die("Not yet implemented.");
+	int ret;
+	struct tree_desc t[3];
+	struct traverse_info info;
+	const char *toplevel_dir_placeholder = "";
+
+	opt->priv->current_dir_name = toplevel_dir_placeholder;
+	setup_traverse_info(&info, toplevel_dir_placeholder);
+	info.fn = collect_merge_info_callback;
+	info.data = opt;
+	info.show_all_errors = 1;
+
+	parse_tree(merge_base);
+	parse_tree(side1);
+	parse_tree(side2);
+	init_tree_desc(t + 0, merge_base->buffer, merge_base->size);
+	init_tree_desc(t + 1, side1->buffer, side1->size);
+	init_tree_desc(t + 2, side2->buffer, side2->size);
+
+	ret = traverse_trees(NULL, 3, t, &info);
+
+	return ret;
 }
 
 static int detect_and_process_renames(struct merge_options *opt,

From patchwork Sun Nov 29 07:43:10 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11938975
Date: Sun, 29 Nov 2020 07:43:10 +0000
Subject: [PATCH 07/20] merge-ort: avoid repeating fill_tree_descriptor() on the same tree
To: git@vger.kernel.org
Cc: Elijah Newren, Elijah Newren
From: Elijah Newren

Three-way merges, by their nature, will often have two or more trees
match at a given subdirectory. We can avoid calling
fill_tree_descriptor() on the same tree by checking when these trees
match.
Noting when various oids match will also be useful in other
calculations and optimizations.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 26 ++++++++++++++++++++++----
 1 file changed, 22 insertions(+), 4 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index a3096876d4..820809f67e 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -196,6 +196,15 @@ static int collect_merge_info_callback(int n,
 	unsigned mbase_null = !(mask & 1);
 	unsigned side1_null = !(mask & 2);
 	unsigned side2_null = !(mask & 4);
+	unsigned side1_matches_mbase = (!side1_null && !mbase_null &&
+					names[0].mode == names[1].mode &&
+					oideq(&names[0].oid, &names[1].oid));
+	unsigned side2_matches_mbase = (!side2_null && !mbase_null &&
+					names[0].mode == names[2].mode &&
+					oideq(&names[0].oid, &names[2].oid));
+	unsigned sides_match = (!side1_null && !side2_null &&
+				names[1].mode == names[2].mode &&
+				oideq(&names[1].oid, &names[2].oid));
 
 	/* n = 3 is a fundamental assumption. */
 	if (n != 3)
@@ -248,10 +257,19 @@ static int collect_merge_info_callback(int n,
 		newinfo.pathlen = st_add3(newinfo.pathlen, p->pathlen, 1);
 
 		for (i = 0; i < 3; i++) {
-			const struct object_id *oid = NULL;
-			if (dirmask & 1)
-				oid = &names[i].oid;
-			buf[i] = fill_tree_descriptor(opt->repo, t + i, oid);
+			if (i == 1 && side1_matches_mbase)
+				t[1] = t[0];
+			else if (i == 2 && side2_matches_mbase)
+				t[2] = t[0];
+			else if (i == 2 && sides_match)
+				t[2] = t[1];
+			else {
+				const struct object_id *oid = NULL;
+				if (dirmask & 1)
+					oid = &names[i].oid;
+				buf[i] = fill_tree_descriptor(opt->repo,
+							      t + i, oid);
+			}
 			dirmask >>= 1;
 		}
 

From patchwork Sun Nov 29 07:43:11 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11938981
Message-Id: <61b3d66fdcfaf3230c44243e09f3ff9068b2abd5.1606635803.git.gitgitgadget@gmail.com>
Date: Sun, 29 Nov 2020 07:43:11 +0000
Subject: [PATCH 08/20] merge-ort: compute a few more useful fields for collect_merge_info
To: git@vger.kernel.org
Cc: Elijah Newren, Elijah Newren
From: Elijah Newren

Signed-off-by: Elijah Newren
---
 merge-ort.c | 36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

diff --git a/merge-ort.c b/merge-ort.c
index 820809f67e..e5bca25a8d 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -193,6 +193,7 @@ static int collect_merge_info_callback(int n,
 	size_t len;
 	char *fullpath;
 	unsigned filemask = mask & ~dirmask;
+	unsigned match_mask = 0; /* will be updated below */
 	unsigned mbase_null = !(mask & 1);
 	unsigned side1_null = !(mask & 2);
 	unsigned side2_null = !(mask & 4);
@@ -206,6 +207,22 @@ static int collect_merge_info_callback(int n,
 				names[1].mode == names[2].mode &&
 				oideq(&names[1].oid, &names[2].oid));
 
+	/*
+	 * Note: When a path is a file on one side of history and a directory
+	 * in another, we have a directory/file conflict. In such cases, if
+	 * the conflict doesn't resolve from renames and deletions, then we
+	 * always leave directories where they are and move files out of the
+	 * way. Thus, while struct conflict_info has a df_conflict field to
+	 * track such conflicts, we ignore that field for any directories at
+	 * a path and only pay attention to it for files at the given path.
+	 * The fact that we leave directories where they are also means that
+	 * we do not need to worry about getting additional df_conflict
+	 * information propagated from parent directories down to children
+	 * (unlike, say, traverse_trees_recursive() in unpack-trees.c, which
+	 * sets a newinfo.df_conflicts field specifically to propagate it).
+	 */
+	unsigned df_conflict = (filemask != 0) && (dirmask != 0);
+
 	/* n = 3 is a fundamental assumption. */
 	if (n != 3)
 		BUG("Called collect_merge_info_callback wrong");
@@ -221,6 +238,14 @@ static int collect_merge_info_callback(int n,
 	assert(!mbase_null || !side1_null || !side2_null);
 	assert(mask > 0 && mask < 8);
 
+	/* Determine match_mask */
+	if (side1_matches_mbase)
+		match_mask = (side2_matches_mbase ? 7 : 3);
+	else if (side2_matches_mbase)
+		match_mask = 5;
+	else if (sides_match)
+		match_mask = 6;
+
 	/*
 	 * Get the name of the relevant filepath, which we'll pass to
 	 * setup_path_info() for tracking.
@@ -239,6 +264,8 @@ static int collect_merge_info_callback(int n,
 	 * so we can resolve later in process_entries.
 	 */
 	ci = xcalloc(1, sizeof(struct conflict_info));
+	ci->df_conflict = df_conflict;
+	ci->match_mask = match_mask;
 	strmap_put(&opti->paths, fullpath, ci);
 
 	/* If dirmask, recurse into subdirectories */
@@ -255,6 +282,15 @@ static int collect_merge_info_callback(int n,
 		newinfo.name = p->path;
 		newinfo.namelen = p->pathlen;
 		newinfo.pathlen = st_add3(newinfo.pathlen, p->pathlen, 1);
+		/*
+		 * If this directory we are about to recurse into cared about
+		 * its parent directory (the current directory) having a D/F
+		 * conflict, then we'd propagate the masks in this way:
+		 *    newinfo.df_conflicts |= (mask & ~dirmask);
+		 * But we don't worry about propagating D/F conflicts. (See
+		 * comment near setting of local df_conflict variable near
+		 * the beginning of this function).
+		 */
 
 		for (i = 0; i < 3; i++) {
 			if (i == 1 && side1_matches_mbase)

From patchwork Sun Nov 29 07:43:12 2020
X-Patchwork-Submitter: Elijah Newren
X-Patchwork-Id: 11938989
Message-Id: <4e4298fa7077b2f972915b96825528e505f8035e.1606635803.git.gitgitgadget@gmail.com>
Date: Sun, 29 Nov 2020 07:43:12 +0000
Subject: [PATCH 09/20] merge-ort: record stage and auxiliary info for every path
To: git@vger.kernel.org
Cc: Elijah Newren, Elijah Newren
From: Elijah Newren

Create a helper function, setup_path_info(), which can be used to record
all the information we want in a merged_info or conflict_info. While
there is currently only one caller of this new function, and some of its
particular parameters are fixed, future callers of this function will be
added later.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 97 +++++++++++++++++++++++++++++++++++++++++++++++++----
 1 file changed, 90 insertions(+), 7 deletions(-)

diff --git a/merge-ort.c b/merge-ort.c
index e5bca25a8d..52a8c41cf8 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -158,6 +158,26 @@ struct conflict_info {
 	unsigned match_mask:3;
 };
 
+/*
+ * For the next three macros, see warning for conflict_info.merged.
+ *
+ * In each of the below, mi is a struct merged_info*, and ci was defined
+ * as a struct conflict_info* (but we need to verify ci isn't actually
+ * pointed at a struct merged_info*).
+ *
+ * INITIALIZE_CI: Assign mi to ci but only if it's safe; set to NULL otherwise.
+ * VERIFY_CI: Ensure that something we assigned to a conflict_info* is one.
+ * ASSIGN_AND_VERIFY_CI: Similar to VERIFY_CI but do assignment first.
+ */ +#define INITIALIZE_CI(ci, mi) do { \ + (ci) = (!(mi) || (mi)->clean) ? NULL : (struct conflict_info *)(mi); \ +} while (0) +#define VERIFY_CI(ci) assert(ci && !ci->merged.clean); +#define ASSIGN_AND_VERIFY_CI(ci, mi) do { \ + (ci) = (struct conflict_info *)(mi); \ + assert((ci) && !(mi)->clean); \ +} while (0) + static int err(struct merge_options *opt, const char *err, ...) { va_list params; @@ -174,6 +194,65 @@ static int err(struct merge_options *opt, const char *err, ...) return -1; } +static void setup_path_info(struct merge_options *opt, + struct string_list_item *result, + const char *current_dir_name, + int current_dir_name_len, + char *fullpath, /* we'll take over ownership */ + struct name_entry *names, + struct name_entry *merged_version, + unsigned is_null, /* boolean */ + unsigned df_conflict, /* boolean */ + unsigned filemask, + unsigned dirmask, + int resolved /* boolean */) +{ + /* result->util is void*, so mi is a convenience typed variable */ + struct merged_info *mi; + + assert(!is_null || resolved); + assert(!df_conflict || !resolved); /* df_conflict implies !resolved */ + assert(resolved == (merged_version != NULL)); + + mi = xcalloc(1, resolved ? sizeof(struct merged_info) : + sizeof(struct conflict_info)); + mi->directory_name = current_dir_name; + mi->basename_offset = current_dir_name_len; + mi->clean = !!resolved; + if (resolved) { + mi->result.mode = merged_version->mode; + oidcpy(&mi->result.oid, &merged_version->oid); + mi->is_null = !!is_null; + } else { + int i; + struct conflict_info *ci; + + ASSIGN_AND_VERIFY_CI(ci, mi); + for (i = 0; i < 3; i++) { + ci->pathnames[i] = fullpath; + ci->stages[i].mode = names[i].mode; + oidcpy(&ci->stages[i].oid, &names[i].oid); + } + ci->filemask = filemask; + ci->dirmask = dirmask; + ci->df_conflict = !!df_conflict; + if (dirmask) + /* + * Assume is_null for now, but if we have entries + * under the directory then when it is complete in + * write_completed_directory() it'll update this. 
+			 * Also, for D/F conflicts, we have to handle the
+			 * directory first, then clear this bit and process
+			 * the file to see how it is handled -- that occurs
+			 * near the top of process_entry().
+			 */
+			mi->is_null = 1;
+	}
+	strmap_put(&opt->priv->paths, fullpath, mi);
+	result->string = fullpath;
+	result->util = mi;
+}
+
 static int collect_merge_info_callback(int n,
 				       unsigned long mask,
 				       unsigned long dirmask,
@@ -188,10 +267,12 @@ static int collect_merge_info_callback(int n,
 	 */
 	struct merge_options *opt = info->data;
 	struct merge_options_internal *opti = opt->priv;
-	struct conflict_info *ci;
+	struct string_list_item pi;  /* Path Info */
+	struct conflict_info *ci; /* typed alias to pi.util (which is void*) */
 	struct name_entry *p;
 	size_t len;
 	char *fullpath;
+	const char *dirname = opti->current_dir_name;
 	unsigned filemask = mask & ~dirmask;
 	unsigned match_mask = 0; /* will be updated below */
 	unsigned mbase_null = !(mask & 1);
@@ -260,13 +341,15 @@ static int collect_merge_info_callback(int n,
 	make_traverse_path(fullpath, len + 1, info, p->path, p->pathlen);
 
 	/*
-	 * TODO: record information about the path other than all zeros,
-	 * so we can resolve later in process_entries.
+	 * Record information about the path so we can resolve later in
+	 * process_entries.
*/ - ci = xcalloc(1, sizeof(struct conflict_info)); - ci->df_conflict = df_conflict; + setup_path_info(opt, &pi, dirname, info->pathlen, fullpath, + names, NULL, 0, df_conflict, filemask, dirmask, 0); + + ci = pi.util; + VERIFY_CI(ci); ci->match_mask = match_mask; - strmap_put(&opti->paths, fullpath, ci); /* If dirmask, recurse into subdirectories */ if (dirmask) { @@ -310,7 +393,7 @@ static int collect_merge_info_callback(int n, } original_dir_name = opti->current_dir_name; - opti->current_dir_name = fullpath; + opti->current_dir_name = pi.string; ret = traverse_trees(NULL, 3, t, &newinfo); opti->current_dir_name = original_dir_name; From patchwork Sun Nov 29 07:43:13 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11938979 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5C87CC64E8A for ; Sun, 29 Nov 2020 07:45:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1EA0320809 for ; Sun, 29 Nov 2020 07:45:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="MAiEjYiM" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726885AbgK2Hou (ORCPT ); Sun, 29 Nov 2020 02:44:50 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47314 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with 
ESMTP id S1726841AbgK2Hot (ORCPT ); Sun, 29 Nov 2020 02:44:49 -0500 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 146D9C061A4B for ; Sat, 28 Nov 2020 23:43:35 -0800 (PST) Received: by mail-wm1-x344.google.com with SMTP id 3so11633701wmg.4 for ; Sat, 28 Nov 2020 23:43:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=GjWh6CVSL/ax4SCb+UslMA7Nn+yKmwId9N+80UoDxe4=; b=MAiEjYiMQqC6ba1KQeWrevhNC2bEK5GKTUptTiTo7X6tc4/Ylyz0anflNTaudvt1MV IDyILuvmfK+VNxYQ5aNOz+DxtzZMi0EGK88+hoejq3OrWsne2+4TA9SEytUp39WW7k5d o5FHO7tlERlK/KYhEzKBWfD3DpXIXosokfkleOZmxAldQ9VlBwav50iInTKFnkOQwzi/ dvCXbPQl+ig0TY5utY6Fln6ejXMQQ7l7Iz8JjOqoKyi8etbTW4lmqmxM9OU/t1ZWt9h3 kGlMuktsL34xdr+a6t6Y0ipg/oFcYYfGVdSUBnk7xxZ+aNNbqWfw6bCuvrGyV6Ixsnl1 bqOw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=GjWh6CVSL/ax4SCb+UslMA7Nn+yKmwId9N+80UoDxe4=; b=XFtEvP2+v8EUTeAfzP+ZDtXfOzSjILQ1/rxOy5PfWRnY9mWJ91YnI4WT2fuxQo7PoK 2Fk1KZ0vXX9yMug0pBrPm1VZy6ctBUiZXwbSK14jOfGFIqyCzlrMLuWS0hgGlJl85mH4 xIIufBXeS5x4mW4y01ReHUM6EwTEVEvKPCGywWoosO4U4HhYPIs2jjReLsvSaOsOZDDO VL3TIISTUTYsHj5eA3G9fjfC3dOglFce9S9M21uHStj173eA3lbVli+McRNw0lt3oAGy Fm/RUCuHKQKlOELfZU+FOwylBWpYsDKwPv4pRzkizAeCpK6HoZ8G4nPDbyQg0tdRHOva D85g== X-Gm-Message-State: AOAM532kmq857tGegBUVqbwMhkdeWzpSz0apd9cbOkv3M26O7L2S1wLd xu3hRC6BAgnTEpVxzXWrDLsqgIy9zJI= X-Google-Smtp-Source: ABdhPJx8v8VJnlfNGPkZ3rQDUiPR9HQc0pDcLlxZtPSlhvx17AD1NsQ08AODeBzCdNDxXfEosms5lA== X-Received: by 2002:a7b:c7d3:: with SMTP id z19mr17185024wmk.4.1606635813726; Sat, 28 Nov 2020 23:43:33 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id 
b4sm21894728wmc.1.2020.11.28.23.43.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Nov 2020 23:43:33 -0800 (PST) Message-Id: <3ec087eb68619f4e0587fa098f7de1f137a9e3d3.1606635803.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 29 Nov 2020 07:43:13 +0000 Subject: [PATCH 10/20] merge-ort: avoid recursing into identical trees Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren When all three trees have the same oid, there is no need to recurse into these trees to find that all files within them happen to match. We can just record any one of the trees as the resolution of merging that particular path. Immediately resolving trees for other types of trivial tree merges (such as one side matches the merge base, or the two sides match each other) would prevent us from detecting renames for some paths, and thus prevent us from doing three-way content merges for those paths whose renames we did not detect. Signed-off-by: Elijah Newren --- merge-ort.c | 13 +++++++++++++ 1 file changed, 13 insertions(+) diff --git a/merge-ort.c b/merge-ort.c index 52a8c41cf8..0789816ae9 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -340,6 +340,19 @@ static int collect_merge_info_callback(int n, fullpath = xmalloc(len + 1); make_traverse_path(fullpath, len + 1, info, p->path, p->pathlen); + /* + * If mbase, side1, and side2 all match, we can resolve early. Even + * if these are trees, there will be no renames or anything + * underneath. + */ + if (side1_matches_mbase && side2_matches_mbase) { + /* mbase, side1, & side2 all match; use mbase as resolution */ + setup_path_info(opt, &pi, dirname, info->pathlen, fullpath, + names, names+0, mbase_null, 0, + filemask, dirmask, 1); + return mask; + } + /* * Record information about the path so we can resolve later in * process_entries. 
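
The early-resolution rule in the patch above can be illustrated with a minimal standalone sketch. It is not code from merge-ort.c: `toy_entry`, `toy_entries_match()`, `compute_match_mask()`, and `can_resolve_early()` are hypothetical names, the 20-byte hash length is an assumption, and the handling of missing (null) entries is left out. The bit layout (bit 0 for the merge base, bit 1 for side1, bit 2 for side2) is inferred from the match_mask values 3, 5, 6, and 7 that the series uses.

```c
#include <assert.h>
#include <string.h>

#define TOY_HASH_LEN 20 /* assumed hash size for this sketch */

/* Hypothetical stand-in for struct name_entry: a mode plus a raw hash. */
struct toy_entry {
	unsigned mode;
	unsigned char hash[TOY_HASH_LEN];
};

/* Two entries match when both mode and hash are identical. */
static int toy_entries_match(const struct toy_entry *a,
			     const struct toy_entry *b)
{
	assert(a && b);
	return a->mode == b->mode &&
	       !memcmp(a->hash, b->hash, TOY_HASH_LEN);
}

/*
 * names[0] is the merge base, names[1] is side1, names[2] is side2.
 * Returns 7 if all three match, 3 if only side1 matches the base,
 * 5 if only side2 matches the base, 6 if only the two sides match
 * each other, and 0 if nothing matches.
 */
static unsigned compute_match_mask(const struct toy_entry names[3])
{
	int side1_matches_mbase = toy_entries_match(&names[0], &names[1]);
	int side2_matches_mbase = toy_entries_match(&names[0], &names[2]);
	int sides_match = toy_entries_match(&names[1], &names[2]);

	if (side1_matches_mbase)
		return side2_matches_mbase ? 7 : 3;
	if (side2_matches_mbase)
		return 5;
	if (sides_match)
		return 6;
	return 0;
}

/*
 * Only the "all three match" case is resolved without recursing; the
 * other trivial cases (3, 5, 6) still need traversal so that renames
 * underneath a tree can be detected later.
 */
static int can_resolve_early(const struct toy_entry names[3])
{
	return compute_match_mask(names) == 7;
}
```

In this sketch a caller would check `can_resolve_early()` before deciding whether to descend into a subtree; masks 3, 5, and 6 are deliberately not treated as early exits, mirroring the reasoning in the commit message.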
From patchwork Sun Nov 29 07:43:14 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11938985 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id CED09C71156 for ; Sun, 29 Nov 2020 07:45:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 979CB2080C for ; Sun, 29 Nov 2020 07:45:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="vWXp//2w" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726948AbgK2How (ORCPT ); Sun, 29 Nov 2020 02:44:52 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47318 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726893AbgK2Hou (ORCPT ); Sun, 29 Nov 2020 02:44:50 -0500 Received: from mail-wm1-x335.google.com (mail-wm1-x335.google.com [IPv6:2a00:1450:4864:20::335]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id E4E57C061A4C for ; Sat, 28 Nov 2020 23:43:35 -0800 (PST) Received: by mail-wm1-x335.google.com with SMTP id d3so9022695wmb.4 for ; Sat, 28 Nov 2020 23:43:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=Vc5zshdJXJ5K2pdJI4ibeIgV2YdLAdDIL4aCX2GIXEA=; 
b=vWXp//2wV1VXqBKzJ/wXQ5dM2yL9ZEU/RT9eTygYxw2GZYEsXkTGr1H76Mc73He7n7 fzb/sSgMLkpswAQRveZ9Kn1nc+J/HZhSRHnFRv1noSqvJINKA96B07IKVPl5QcOm4vMf GcLxhhe8nNx3h3Z3Y+HNMBLeARhnqXfvZPYrKohlEMUgh5OewT9m789fw37HSnNjk+zC iMpvF6okfUCdxwhyoeyP3Q+HPDBT1Lo3ubgQc/8sBdxx8sL+i/8sbdbKAXMz7R5MHl5g VGsBRFS+vnn+cY9KRRPR8CsWcELU5KKxzLQCugq6qXvT14wqIQMRr3vySwwIV7xYIerf Jcog== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=Vc5zshdJXJ5K2pdJI4ibeIgV2YdLAdDIL4aCX2GIXEA=; b=h9DxDN9TVGKy6t/fEimMSkQccsKlUb9AmbWJh6RFIxW1Hguq2k/GXG1r0CqtUSuFO8 1Z1ahjyxAAhA432m8ufyjVUHsTNrQYvpLuILaGFs31Wunc+bKG8jXYsoi+lehq0uof3K eTqRwd2H847YcI+L1KKrfRBZvjfRzopAmLxXBXeH6qAQacfJJwjl1otp36k+vvH4xJAE PNJ3Z93BhZ7VcxOKPmJrwriOTnqzYg6YYiK8JvzXKOnYsoGJGCBOzD5LwqiKc8Po3Bj8 RygXmoSyS8w5CX0TyL25OG2SMQqN6A/Jkv9RxqJinuEeNrZQnaixBlLOWPC3SzyMTbv5 JQhQ== X-Gm-Message-State: AOAM531M5cMfXeOxyIx8ybYUDrVAw+PRLqunEgDH6znMTF/CgVB2Frk8 jtObv7YyeNpiqN1w8Bs0h8RutMSWcB0= X-Google-Smtp-Source: ABdhPJwEmPxBFo8LqyLXJUEhX2S2mCh34dp+mOCGPNu57S0Ztx3O12MdY5bOTGzJjYCkQB7fCGKGnw== X-Received: by 2002:a1c:32c6:: with SMTP id y189mr17616754wmy.133.1606635814462; Sat, 28 Nov 2020 23:43:34 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id y7sm3769685wrp.3.2020.11.28.23.43.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Nov 2020 23:43:34 -0800 (PST) Message-Id: <0c89cee34e5055dcd08013684acbe5d292e1a2dd.1606635803.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 29 Nov 2020 07:43:14 +0000 Subject: [PATCH 11/20] merge-ort: add a preliminary simple process_entries() implementation Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Add a process_entries() implementation that just 
loops over the paths and processes each one individually with an
auxiliary process_entry() call. Add a basic process_entry() as well,
which handles several cases but leaves a few of the more involved ones
with die-not-implemented messages. Also, although process_entries() is
supposed to create a tree, it does not yet have code to do so -- except
in the special case of merging completely empty trees.

Signed-off-by: Elijah Newren
---
 merge-ort.c | 103 +++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 102 insertions(+), 1 deletion(-)

diff --git a/merge-ort.c b/merge-ort.c
index 0789816ae9..04127a32f8 100644
--- a/merge-ort.c
+++ b/merge-ort.c
@@ -465,10 +465,111 @@ static int detect_and_process_renames(struct merge_options *opt,
 	return clean;
 }
 
+/* Per entry merge function */
+static void process_entry(struct merge_options *opt,
+			  const char *path,
+			  struct conflict_info *ci)
+{
+	VERIFY_CI(ci);
+	assert(ci->filemask >= 0 && ci->filemask <= 7);
+	/* ci->match_mask == 7 was handled in collect_merge_info_callback() */
+	assert(ci->match_mask == 0 || ci->match_mask == 3 ||
+	       ci->match_mask == 5 || ci->match_mask == 6);
+
+	if (ci->df_conflict) {
+		die("Not yet implemented.");
+	}
+
+	/*
+	 * NOTE: Below there is a long switch-like if-elseif-elseif... block
+	 * which the code goes through even for the df_conflict cases
+	 * above. Well, it will once we don't die-not-implemented above.
+	 */
+	if (ci->match_mask) {
+		ci->merged.clean = 1;
+		if (ci->match_mask == 6) {
+			/* stages[1] == stages[2] */
+			ci->merged.result.mode = ci->stages[1].mode;
+			oidcpy(&ci->merged.result.oid, &ci->stages[1].oid);
+		} else {
+			/* determine the mask of the side that didn't match */
+			unsigned int othermask = 7 & ~ci->match_mask;
+			int side = (othermask == 4) ? 2 : 1;
+
+			ci->merged.result.mode = ci->stages[side].mode;
+			ci->merged.is_null = !ci->merged.result.mode;
+			oidcpy(&ci->merged.result.oid, &ci->stages[side].oid);
+
+			assert(othermask == 2 || othermask == 4);
+			assert(ci->merged.is_null ==
+			       (ci->filemask == ci->match_mask));
+		}
+	} else if (ci->filemask >= 6 &&
+		   (S_IFMT & ci->stages[1].mode) !=
+		   (S_IFMT & ci->stages[2].mode)) {
+		/*
+		 * Two different items from (file/submodule/symlink)
+		 */
+		die("Not yet implemented.");
+	} else if (ci->filemask >= 6) {
+		/*
+		 * TODO: Needs a two-way or three-way content merge, but we're
+		 * just being lazy and copying the version from HEAD and
+		 * leaving it as conflicted.
+		 */
+		ci->merged.clean = 0;
+		ci->merged.result.mode = ci->stages[1].mode;
+		oidcpy(&ci->merged.result.oid, &ci->stages[1].oid);
+	} else if (ci->filemask == 3 || ci->filemask == 5) {
+		/* Modify/delete */
+		die("Not yet implemented.");
+	} else if (ci->filemask == 2 || ci->filemask == 4) {
+		/* Added on one side */
+		int side = (ci->filemask == 4) ? 2 : 1;
+		ci->merged.result.mode = ci->stages[side].mode;
+		oidcpy(&ci->merged.result.oid, &ci->stages[side].oid);
+		ci->merged.clean = !ci->df_conflict;
+	} else if (ci->filemask == 1) {
+		/* Deleted on both sides */
+		ci->merged.is_null = 1;
+		ci->merged.result.mode = 0;
+		oidcpy(&ci->merged.result.oid, &null_oid);
+		ci->merged.clean = 1;
+	}
+
+	/*
+	 * If still conflicted, record it separately. This allows us to later
+	 * iterate over just conflicted entries when updating the index instead
+	 * of iterating over all entries.
+ */ + if (!ci->merged.clean) + strmap_put(&opt->priv->conflicted, path, ci); +} + static void process_entries(struct merge_options *opt, struct object_id *result_oid) { - die("Not yet implemented."); + struct hashmap_iter iter; + struct strmap_entry *e; + + if (strmap_empty(&opt->priv->paths)) { + oidcpy(result_oid, opt->repo->hash_algo->empty_tree); + return; + } + + strmap_for_each_entry(&opt->priv->paths, &iter, e) { + /* + * NOTE: mi may actually be a pointer to a conflict_info, but + * we have to check mi->clean first to see if it's safe to + * reassign to such a pointer type. + */ + struct merged_info *mi = e->value; + + if (!mi->clean) + process_entry(opt, e->key, e->value); + } + + die("Tree creation not yet implemented"); } void merge_switch_to_result(struct merge_options *opt, From patchwork Sun Nov 29 07:43:15 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11938991 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id AB7F0C64E90 for ; Sun, 29 Nov 2020 07:45:04 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 69F0F20809 for ; Sun, 29 Nov 2020 07:45:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="iA/5Ulnq" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726917AbgK2Hov (ORCPT ); Sun, 29 Nov 2020 
02:44:51 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47320 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726901AbgK2Hou (ORCPT ); Sun, 29 Nov 2020 02:44:50 -0500 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AF593C061A4F for ; Sat, 28 Nov 2020 23:43:36 -0800 (PST) Received: by mail-wm1-x344.google.com with SMTP id g185so7315873wmf.3 for ; Sat, 28 Nov 2020 23:43:36 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=bxHGKizmJ6jjFkFgdOxDxkLQ0GowaVPZABJAzSl9eMk=; b=iA/5UlnqFfR6bjGW0CQ/tX/r0aLOiKvt8NHRIOcVlwNJpTNPf0rMuW9ZzyfbXh4Uqd qtFjeTmfNWe98ENeKNsfpSz0o2826wLTCYAqFWk+Qp81IxNt8Lj5zdAcsu3ZSd3hYVek Zzc4nmTh9Ll9COBFygy5prMoR0KzgsMXmaZ0jRfd7IQhc+SmbldfkPHpsksDj60MDxJe YwwnJdlB9HpkA+NDaX6MYjM0geKFsUJf/snIZpZflFnogvGUleq5+BGSC/aI7qZeAaAU MkdCmyNJsV/GbQ3HOO7NKXmhO1xdO6vA1garYe2DsQwyn2HqhOwVCW88XgAdQ6vosVvO 6ZAA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=bxHGKizmJ6jjFkFgdOxDxkLQ0GowaVPZABJAzSl9eMk=; b=hrm1R4iXl1okUEUDEbbEX8uCZw9xr/KryOZb3HGl3+D2+AAR3lzvhOAMLa6WG3RK51 PXPNZrz3kDDtte58XPhsLxj7A0dh3tr4qtJcWuQWrP1OX94xRHQBmfSnlkXlU2hVXysq 5SddrnDQ60zV2vorAeNXhE9jREOx9QvPA1YeNHiCduD0uc02lSK7oYJhny+8O83fuKiO oqFnZUkYG6DVuoEUnkE+73yk60V1mxveuxhUeJ1BR82mjPiJear4yjeWgLY0EJVES0xW HIjk4n1q6OQIvpqrBQXv/5LuXBHM7XjQvS9yrzxVTI7GJ6lJYSqL9QZ06LiOGqFY/4mF c1Ig== X-Gm-Message-State: AOAM533x0onLMzPv1uRKs9Pc16q4HrNT3zMqaa3n+gAG6bKQ8JE4bePx zsCTri7YkrHJoTdNcAXB7EiWiME7E9s= X-Google-Smtp-Source: ABdhPJwgMRJCbVbUt/2r6HzADTJ2MNo0MxcWTJYKEaNKe8jajI6qkrE1HnM+B0du6+04MXkHw1JHaQ== X-Received: by 2002:a1c:80cb:: with 
SMTP id b194mr4539095wmd.91.1606635815290; Sat, 28 Nov 2020 23:43:35 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a184sm20512564wmf.8.2020.11.28.23.43.34 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Nov 2020 23:43:34 -0800 (PST) Message-Id: <605cbc19d2553ab15ac8b1541c5b3442b0c381a1.1606635803.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 29 Nov 2020 07:43:15 +0000 Subject: [PATCH 12/20] merge-ort: have process_entries operate in a defined order Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren We want to handle paths below a directory before needing to handle the directory itself. Also, we want to handle the directory immediately after the paths below it, so we can't use simple lexicographic ordering from strcmp (which would insert foo.txt between foo and foo/file.c). Copy string_list_df_name_compare() from merge-recursive.c, and set up a string list of paths sorted by that function so that we can iterate in the desired order. Signed-off-by: Elijah Newren --- merge-ort.c | 53 ++++++++++++++++++++++++++++++++++++++++++++++++++--- 1 file changed, 50 insertions(+), 3 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index 04127a32f8..eec3b41e7e 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -465,6 +465,33 @@ static int detect_and_process_renames(struct merge_options *opt, return clean; } +static int string_list_df_name_compare(const char *one, const char *two) +{ + int onelen = strlen(one); + int twolen = strlen(two); + /* + * Here we only care that entries for D/F conflicts are + * adjacent, in particular with the file of the D/F conflict + * appearing before files below the corresponding directory. + * The order of the rest of the list is irrelevant for us. 
+	 *
+	 * To achieve this, we sort with df_name_compare and provide
+	 * the mode S_IFDIR so that D/F conflicts will sort correctly.
+	 * We use the mode S_IFDIR for everything else for simplicity,
+	 * since in other cases any changes in their order due to
+	 * sorting cause no problems for us.
+	 */
+	int cmp = df_name_compare(one, onelen, S_IFDIR,
+				  two, twolen, S_IFDIR);
+	/*
+	 * Now that 'foo' and 'foo/bar' compare equal, we have to make sure
+	 * that 'foo' comes before 'foo/bar'.
+	 */
+	if (cmp)
+		return cmp;
+	return onelen - twolen;
+}
+
 /* Per entry merge function */
 static void process_entry(struct merge_options *opt,
 			  const char *path,
@@ -551,24 +578,44 @@ static void process_entries(struct merge_options *opt,
 {
 	struct hashmap_iter iter;
 	struct strmap_entry *e;
+	struct string_list plist = STRING_LIST_INIT_NODUP;
+	struct string_list_item *entry;
 
 	if (strmap_empty(&opt->priv->paths)) {
 		oidcpy(result_oid, opt->repo->hash_algo->empty_tree);
 		return;
 	}
 
+	/* Hack to pre-allocate plist to the desired size */
+	ALLOC_GROW(plist.items, strmap_get_size(&opt->priv->paths), plist.alloc);
+
+	/* Put every entry from paths into plist, then sort */
 	strmap_for_each_entry(&opt->priv->paths, &iter, e) {
+		string_list_append(&plist, e->key)->util = e->value;
+	}
+	plist.cmp = string_list_df_name_compare;
+	string_list_sort(&plist);
+
+	/*
+	 * Iterate over the items in reverse order, so we can handle paths
+	 * below a directory before needing to handle the directory itself.
+	 */
+	for (entry = &plist.items[plist.nr-1]; entry >= plist.items; --entry) {
+		char *path = entry->string;
 		/*
 		 * NOTE: mi may actually be a pointer to a conflict_info, but
 		 * we have to check mi->clean first to see if it's safe to
 		 * reassign to such a pointer type.
*/ - struct merged_info *mi = e->value; + struct merged_info *mi = entry->util; - if (!mi->clean) - process_entry(opt, e->key, e->value); + if (!mi->clean) { + struct conflict_info *ci = (struct conflict_info *)mi; + process_entry(opt, path, ci); + } } + string_list_clear(&plist, 0); die("Tree creation not yet implemented"); } From patchwork Sun Nov 29 07:43:16 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11938997 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4A0F4C83013 for ; Sun, 29 Nov 2020 07:45:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 1B61C20809 for ; Sun, 29 Nov 2020 07:45:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="cYALl4WA" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726992AbgK2Hoy (ORCPT ); Sun, 29 Nov 2020 02:44:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47330 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726893AbgK2Hox (ORCPT ); Sun, 29 Nov 2020 02:44:53 -0500 Received: from mail-wm1-x342.google.com (mail-wm1-x342.google.com [IPv6:2a00:1450:4864:20::342]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81373C061A51 for ; Sat, 28 Nov 2020 23:43:37 -0800 (PST) Received: by mail-wm1-x342.google.com 
with SMTP id a3so15343191wmb.5 for ; Sat, 28 Nov 2020 23:43:37 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=efzwHOWR7ZtL+4GXk0iNYDjO9FBieaEluKPMU0G8vwM=; b=cYALl4WAY1aqPSm8d8k+P1/lZox0OuB/fpCn/8CWBI8S0E38Sl6AZk+27MorHcPQc5 nxZwBpEmeu7WlvTlpQgBqe2GYRNVd9olhb6NeZH4R26g3q7fB1nnzxAblgd4DGbQUoBC UCJnKp+siMVGOuSaJ8zz7/BbSCpyi62QLSMfpSxSs3rkbOKMAMP3NLBdVLNG4aWc/57p nlVRM1q78KlbUHhFtNXMhJxkJzEiMS98uDqOup0Sb3B9BPPPTNiv77vuHEdyjdoHgm2b 5SLLOqDfSu0K3SQg0nlTisW6FqHmpDkJ6DJRNLcGdBtxB16dpRptjmJpDSm6KRwzvx5f kpVQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=efzwHOWR7ZtL+4GXk0iNYDjO9FBieaEluKPMU0G8vwM=; b=IlYB6NoBzhXMwj+4ay7654cOKQtMj0uS1j6w475pjY3+m3ez2Xlphd4xBPgS62ASnh VdbrJEKIc1j/KZiJnbcLiLOuGYSInrZpe/GMAhqKNI2dh5rA5W6XkmeQoKfLcKkLdHs5 5K1nXeUv0nMOsQ0XsYwevg+dsV4bDT5X9BKl7W6zZwC0WgPVsEEkudBGBL7atmTXzOeG bo+dJFXp54BAaM7x2rg3PTp4LTVMIJLOMDfOAy9tOeDj38LeAV3OmKC5LR3AKfrmjSx4 2m/GqjIoJu47ofDEsbBDpOoNucL9OTzSbDxiTYnFX+iwe6CZo6/KS9tAg687Itus6C81 s7fA== X-Gm-Message-State: AOAM532JOX0bXO73qsaqxuCklnvs4z1w3FU8wawOujUiwdlyxR8TBVNF iPdDxkJtwBGYCSeRdvkrfk5om+TFM3s= X-Google-Smtp-Source: ABdhPJwJwROeMQCo3+4gGO1z8JDojokshdAEOeF+z1KmcUiuQAvhwvCezz5dKgmWlgFkQ8E65lbTdg== X-Received: by 2002:a7b:c3ce:: with SMTP id t14mr17063575wmj.170.1606635816129; Sat, 28 Nov 2020 23:43:36 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id w3sm20599654wma.3.2020.11.28.23.43.35 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Nov 2020 23:43:35 -0800 (PST) Message-Id: <242c3cab1349905af9b645a0a3e213edd49846ad.1606635803.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 29 Nov 2020 07:43:16 +0000 Subject: [PATCH 13/20] 
merge-ort: step 1 of tree writing -- record basenames, modes, and oids Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren As a step towards transforming the processed path->conflict_info entries into an actual tree object, start recording basenames, modes, and oids in a dir_metadata structure. Subsequent commits will make use of this to actually write a tree. Signed-off-by: Elijah Newren --- merge-ort.c | 40 +++++++++++++++++++++++++++++++++++++--- 1 file changed, 37 insertions(+), 3 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index eec3b41e7e..970708fff9 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -492,10 +492,31 @@ static int string_list_df_name_compare(const char *one, const char *two) return onelen - twolen; } +struct directory_versions { + struct string_list versions; +}; + +static void record_entry_for_tree(struct directory_versions *dir_metadata, + const char *path, + struct merged_info *mi) +{ + const char *basename; + + if (mi->is_null) + /* nothing to record */ + return; + + basename = path + mi->basename_offset; + assert(strchr(basename, '/') == NULL); + string_list_append(&dir_metadata->versions, + basename)->util = &mi->result; +} + /* Per entry merge function */ static void process_entry(struct merge_options *opt, const char *path, - struct conflict_info *ci) + struct conflict_info *ci, + struct directory_versions *dir_metadata) { VERIFY_CI(ci); assert(ci->filemask >= 0 && ci->filemask <= 7); @@ -503,6 +524,14 @@ static void process_entry(struct merge_options *opt, assert(ci->match_mask == 0 || ci->match_mask == 3 || ci->match_mask == 5 || ci->match_mask == 6); + if (ci->dirmask) { + record_entry_for_tree(dir_metadata, path, &ci->merged); + if (ci->filemask == 0) + /* nothing else to handle */ + return; + assert(ci->df_conflict); + } + if (ci->df_conflict) { die("Not yet implemented."); } @@ -571,6 +600,7 
@@ static void process_entry(struct merge_options *opt, */ if (!ci->merged.clean) strmap_put(&opt->priv->conflicted, path, ci); + record_entry_for_tree(dir_metadata, path, &ci->merged); } static void process_entries(struct merge_options *opt, @@ -580,6 +610,7 @@ static void process_entries(struct merge_options *opt, struct strmap_entry *e; struct string_list plist = STRING_LIST_INIT_NODUP; struct string_list_item *entry; + struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP }; if (strmap_empty(&opt->priv->paths)) { oidcpy(result_oid, opt->repo->hash_algo->empty_tree); @@ -609,13 +640,16 @@ static void process_entries(struct merge_options *opt, */ struct merged_info *mi = entry->util; - if (!mi->clean) { + if (mi->clean) + record_entry_for_tree(&dir_metadata, path, mi); + else { struct conflict_info *ci = (struct conflict_info *)mi; - process_entry(opt, path, ci); + process_entry(opt, path, ci, &dir_metadata); } } string_list_clear(&plist, 0); + string_list_clear(&dir_metadata.versions, 0); die("Tree creation not yet implemented"); } From patchwork Sun Nov 29 07:43:17 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11938987 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D92CC83012 for ; Sun, 29 Nov 2020 07:45:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id EC2FA207CD for ; Sun, 29 Nov 2020 
07:45:04 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="VBZEfT8B" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727040AbgK2Hoy (ORCPT ); Sun, 29 Nov 2020 02:44:54 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47332 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726953AbgK2Hoy (ORCPT ); Sun, 29 Nov 2020 02:44:54 -0500 Received: from mail-wm1-x344.google.com (mail-wm1-x344.google.com [IPv6:2a00:1450:4864:20::344]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6AE6FC061A52 for ; Sat, 28 Nov 2020 23:43:38 -0800 (PST) Received: by mail-wm1-x344.google.com with SMTP id 3so11634088wmg.4 for ; Sat, 28 Nov 2020 23:43:38 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=0/wosQMub37QZjMN2F69GcoUKYnFBqc9X0QlVZOCJT4=; b=VBZEfT8BHjLmtFGOb+xkavWirfdslLanNQVwTJ2WXpUwncoK7Lg/XN9Bk5Ph5jI39A 725+j+pNA0mGPBebtfM2GBuR7XdWvKSlqNwGzImI75qV68E9zh0ayuWfcxKkD46SJf84 siMddBzbaa7bZmrtA7WO/uUEsaIJt76sX+ni7MriggFVUEDLnBLE5iGGy0cpFyTTJt5Q 5UUeX8Pf5uWzaUJPqoo7AYYloBBoyMkUCgONZOFr9OlFOBbb6IGPNYBtNMLXixPSlGNs dGcJOaQLop9poankQ7FklLqbJp+83N9g+n29gSAp/0Nrv32NrvQhXE+lzuyCO2xCxDCS mMeQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=0/wosQMub37QZjMN2F69GcoUKYnFBqc9X0QlVZOCJT4=; b=sgi99zEVz8wm2u8tM+D86DveOPGAb8pUp0Jvyl7hzpIrP/Xa1o9eSfe1EaspDUU8Bc FfMIeKjDGpdLW9jRhhQVF62ZWeAacslDHymw6K/lbYc5u/fdFNOhB8WOAefjQ/EwwWVn h3Mog12cWwr0lIDA1Hc1/p7OVTfaBrf5eqaX2HtfC6jyvOs+AtG/OsI6/+/1hQrGISYa Bdf7xaYaFJeuvqcgs9NTB6lZqhotPeFtsW2WSKLYBNVKj0kJ06uMZPnp5UBfdBQQ8OOJ 
+R19ocfQFArCvoUhHfW2ZQWTsxeqqT6BBUE/mQrBurQXS4BBzbv9pNGPn12GbFgNV4Mg 5d8Q== X-Gm-Message-State: AOAM532ZSx0T/uopAiJUMYWWp0VxA2UQC9eXySvTfgD2dDdArK5dL15I eOaccrxnsr+3hu/y4j5YJlnM2g5DyBg= X-Google-Smtp-Source: ABdhPJzYKewluhnFzE5yrveRO5Tf8sGtKMLm82Z6xGl2EOgCX2F9oBYFzbY8qfDGKg8gxL9jt93HNQ== X-Received: by 2002:a7b:c385:: with SMTP id s5mr17752879wmj.30.1606635817056; Sat, 28 Nov 2020 23:43:37 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z189sm20160606wme.23.2020.11.28.23.43.36 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Nov 2020 23:43:36 -0800 (PST) Message-Id: <33a5d23c85357607997a5138da7eaa8ac401049b.1606635803.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 29 Nov 2020 07:43:17 +0000 Subject: [PATCH 14/20] merge-ort: step 2 of tree writing -- function to create tree object Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren Create a new function, write_tree(), which will take a list of basenames, modes, and oids for a single directory and create a tree object in the object-store. We do not yet have just basenames, modes, and oids for just a single directory (we have a mixture of entries from all directory levels in the hierarchy) so we still die() before the current call to write_tree(), but the next patch will rectify that. 
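For readers unfamiliar with the on-disk layout, here is a minimal standalone sketch of the serialization that write_tree() performs for one directory: sort the entries by name, then emit "<octal mode> <name>", a NUL byte, and the raw hash for each. It is not part of this patch. The names tree_entry, serialize_tree and RAW_HASH_SIZE are hypothetical, the hash size is assumed to be 20 bytes (SHA-1), and plain strcmp() ordering is used as a stand-in for string_list_sort(); real git tree ordering compares directory names as if they ended in '/'.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define RAW_HASH_SIZE 20 /* assumption: SHA-1 raw hash size */

/* Hypothetical stand-in for a (basename, mode, oid) triple. */
struct tree_entry {
	unsigned int mode;
	const char *name;
	unsigned char oid[RAW_HASH_SIZE];
};

static int entry_name_cmp(const void *a, const void *b)
{
	const struct tree_entry *ea = a, *eb = b;
	return strcmp(ea->name, eb->name);
}

/*
 * Sort entries by name and serialize them in tree-object layout:
 *   "<octal mode> <name>" NUL <raw hash bytes>
 * Returns a malloc()ed buffer (caller frees) and stores its length in *len.
 */
static unsigned char *serialize_tree(struct tree_entry *entries, size_t nr,
				     size_t *len)
{
	size_t maxlen = 0, pos = 0, i;
	unsigned char *buf;

	qsort(entries, nr, sizeof(*entries), entry_name_cmp);

	/* Pre-size the buffer; 8: 6 for mode, 1 for space, 1 for NUL */
	for (i = 0; i < nr; i++)
		maxlen += strlen(entries[i].name) + RAW_HASH_SIZE + 8;
	buf = malloc(maxlen ? maxlen : 1);
	if (!buf)
		return NULL;

	for (i = 0; i < nr; i++) {
		int n;

		/* Modes must fit in 6 octal digits for the sizing above. */
		assert(entries[i].mode <= 0777777);
		n = snprintf((char *)buf + pos, maxlen - pos, "%o %s",
			     entries[i].mode, entries[i].name);
		assert(n > 0 && (size_t)n < maxlen - pos);
		pos += (size_t)n + 1; /* keep the NUL written by snprintf */
		memcpy(buf + pos, entries[i].oid, RAW_HASH_SIZE);
		pos += RAW_HASH_SIZE;
	}
	*len = pos;
	return buf;
}
```

The real function hands the resulting buffer to write_object_file() with tree_type; the sketch stops at the byte layout, since that is the part this patch introduces.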
Signed-off-by: Elijah Newren --- merge-ort.c | 56 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 55 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index 970708fff9..59355de628 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -19,6 +19,7 @@ #include "diff.h" #include "diffcore.h" +#include "object-store.h" #include "strmap.h" #include "tree.h" #include "xdiff-interface.h" @@ -496,6 +497,51 @@ struct directory_versions { struct string_list versions; }; +static void write_tree(struct object_id *result_oid, + struct string_list *versions, + unsigned int offset, + size_t hash_size) +{ + size_t maxlen = 0, extra; + unsigned int nr = versions->nr - offset; + struct strbuf buf = STRBUF_INIT; + struct string_list relevant_entries = STRING_LIST_INIT_NODUP; + int i; + + /* + * We want to sort the last (versions->nr-offset) entries in versions. + * Do so by abusing the string_list API a bit: make another string_list + * that contains just those entries and then sort them. + * + * We won't use relevant_entries again and will let it just pop off the + * stack, so there won't be allocation worries or anything. 
+ */ + relevant_entries.items = versions->items + offset; + relevant_entries.nr = versions->nr - offset; + string_list_sort(&relevant_entries); + + /* Pre-allocate some space in buf */ + extra = hash_size + 8; /* 8: 6 for mode, 1 for space, 1 for NUL char */ + for (i = 0; i < nr; i++) { + maxlen += strlen(versions->items[offset+i].string) + extra; + } + strbuf_grow(&buf, maxlen); + + /* Write each entry out to buf */ + for (i = 0; i < nr; i++) { + struct merged_info *mi = versions->items[offset+i].util; + struct version_info *ri = &mi->result; + strbuf_addf(&buf, "%o %s%c", + ri->mode, + versions->items[offset+i].string, '\0'); + strbuf_add(&buf, ri->oid.hash, hash_size); + } + + /* Write this object file out, and record in result_oid */ + write_object_file(buf.buf, buf.len, tree_type, result_oid); + strbuf_release(&buf); +} + static void record_entry_for_tree(struct directory_versions *dir_metadata, const char *path, struct merged_info *mi) @@ -648,9 +694,17 @@ static void process_entries(struct merge_options *opt, } } + /* + * TODO: We can't actually write a tree yet, because dir_metadata just + * contains all basenames of all files throughout the tree with their + * mode and hash. Not only is that a nonsensical tree, it will have + * lots of duplicates for paths such as "Makefile" or ".gitignore". 
+ */ + die("Not yet implemented; need to process subtrees separately"); + write_tree(result_oid, &dir_metadata.versions, 0, + opt->repo->hash_algo->rawsz); string_list_clear(&plist, 0); string_list_clear(&dir_metadata.versions, 0); - die("Tree creation not yet implemented"); } void merge_switch_to_result(struct merge_options *opt, From patchwork Sun Nov 29 07:43:18 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11938993 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9ADE7C83017 for ; Sun, 29 Nov 2020 07:45:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 6389820809 for ; Sun, 29 Nov 2020 07:45:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="sb3MJTCs" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727050AbgK2HpC (ORCPT ); Sun, 29 Nov 2020 02:45:02 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47302 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726021AbgK2HpC (ORCPT ); Sun, 29 Nov 2020 02:45:02 -0500 Received: from mail-wm1-x343.google.com (mail-wm1-x343.google.com [IPv6:2a00:1450:4864:20::343]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CA3A1C061A53 for ; Sat, 28 Nov 2020 23:43:39 -0800 (PST) Received: by 
mail-wm1-x343.google.com with SMTP id k10so5763058wmi.3 for ; Sat, 28 Nov 2020 23:43:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=SxD6GYPL287a7bMQwLJr0FJjpCh//Ag9+xFuDJAjJNE=; b=sb3MJTCs834zk58a0YRuEVwtDdDpHHV6/fiPxB5nY8Rd8Ba8opFWUYyOeA5p9UnVIe Ad+PqQmYxvdHK8BH77Jwhyg+9AiifwyJlPPnZY54MXGZ+0SCH/M57JnKcDSYsZU+bkcQ jGvm0bOGe5YXKrVgFMgx2BwZHErk+KR+N3KAm31rSbT6sK6SjJcWp2kgdbJzzrwEiI8z EL7eUQ3biQ4WRqgp0cqz3rX39YhO3m64Bl7fvGsXLI7vBAOCH31ExJ4ZXK6Tz4kgDMX/ 09JYqJNxM3fprXb6zGDvSQSXR4C+lQT97bDhKxneu1Bi+AC4KLB6ZjpCGOqIfqBPefME CRwQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=SxD6GYPL287a7bMQwLJr0FJjpCh//Ag9+xFuDJAjJNE=; b=e3oQ/uYRhoZzRIrHHHHi4foWZxjD8W6HXHP/bal4V7sYO5PVC+De2e3AzRWNnHQyIQ uUkF/R3SnxvvSMNrn8a4R9qPWSYhyYjKtq5nYA9DahsPgH9BeROe1uYYuetY/4Gmxl35 VgpTNTmu/u9kbuHW4F4fYuaayBiXp7+7oaSJcx0Tx616XVrlHSaIYR3JHLpeQq61CeNu aty+SSI5aIM5dfublyLreQRa+GHlz29wi81Meeha9BIzbwM8e1/2dqn9j+BjYo37cojJ xTTuuevdHPlhKtB9lX053KyEIf++r0D7SLkk+vIHhmQsU3O2C+lgnFANywqYgwnNPRoB HVaQ== X-Gm-Message-State: AOAM530OQSyFr5GIzqPm3tND4j87kHcgkeOoRb/thYRunPoiMfagrsxt f7fSUqxbiYsJpFnTtkx6oBYQVj9B8ag= X-Google-Smtp-Source: ABdhPJy3gSYnCglfdnATKFnBndZq+eroxSnjzRQ0DEfcv9v4WSkH8+713Wqp23PBnFVQaABINI5z1w== X-Received: by 2002:a1c:2203:: with SMTP id i3mr11929548wmi.6.1606635817821; Sat, 28 Nov 2020 23:43:37 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a18sm16217044wrr.20.2020.11.28.23.43.37 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Nov 2020 23:43:37 -0800 (PST) Message-Id: <29615c366f46ced1a4b0a17d8e3ec570f60ec437.1606635803.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 29 Nov 2020 07:43:18 +0000 
Subject: [PATCH 15/20] merge-ort: step 3 of tree writing -- handling subdirectories as we go Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren

Our order for processing of entries means that if we have a tree of
files that looks like

   Makefile
   src/moduleA/foo.c
   src/moduleA/bar.c
   src/moduleB/baz.c
   src/moduleB/umm.c
   tokens.txt

Then we will process paths in the order of the leftmost column below. I
have added two additional columns that help explain the algorithm that
follows; the 2nd column is there to remind us we have oid & mode info we
are tracking for each of these paths (which differs between the paths
which I'm not representing well here), and the third column annotates
the parent directory of the entry:

   tokens.txt           ""
   src/moduleB/umm.c    src/moduleB
   src/moduleB/baz.c    src/moduleB
   src/moduleB          src
   src/moduleA/foo.c    src/moduleA
   src/moduleA/bar.c    src/moduleA
   src/moduleA          src
   src                  ""
   Makefile             ""

When the parent directory changes, if it's a subdirectory of the previous
parent directory (e.g. "" -> src/moduleB) then we can just keep
appending. If the parent directory differs from the previous parent
directory and is not a subdirectory, then we should process that
directory.

So, for example, when we get to this point:

   tokens.txt           ""
   src/moduleB/umm.c    src/moduleB
   src/moduleB/baz.c    src/moduleB

and note that the next entry (src/moduleB) has a different parent than
the last one that isn't a subdirectory, we should write out a tree for it

   100644 blob umm.c
   100644 blob baz.c

then pop all the entries under that directory while recording the new
hash for that directory, leaving us with

   tokens.txt           ""
   src/moduleB          src

This process repeats until at the end we get to

   tokens.txt           ""
   src                  ""
   Makefile             ""

and then we can write out the toplevel tree.
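
The accounting described above can be sketched as a small standalone simulation. This is only an illustration under stated assumptions, not the code in this patch. The names pending, feed, open_dir, close_last_dir and finish are hypothetical. Trees are "written" by logging the directory name and its entry count. Directories are compared with strcmp() rather than by pointer, and the subdirectory test additionally requires a '/' after the prefix. The case where a directory ends up empty is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_ITEMS 32 /* fixed capacity; enough for a small example */

struct pending {
	const char *versions[MAX_ITEMS]; /* basenames awaiting a tree */
	size_t versions_nr;
	const char *dirs[MAX_ITEMS];     /* open directories ... */
	size_t starts[MAX_ITEMS];        /* ... and where each starts in versions */
	size_t offsets_nr;
	const char *written[MAX_ITEMS];  /* log: directories "written" so far */
	size_t written_len[MAX_ITEMS];   /* log: entry count of each such tree */
	size_t written_nr;
};

/* Is dir strictly below ancestor? ("" is the toplevel directory.) */
static int is_inside(const char *dir, const char *ancestor)
{
	size_t len = strlen(ancestor);

	if (!len)
		return 1;
	return !strncmp(dir, ancestor, len) && dir[len] == '/';
}

static void open_dir(struct pending *p, const char *dir)
{
	p->dirs[p->offsets_nr] = dir;
	p->starts[p->offsets_nr++] = p->versions_nr;
}

/* "Write" a tree for the innermost open directory and pop its entries. */
static void close_last_dir(struct pending *p)
{
	size_t top = p->offsets_nr - 1;

	p->written[p->written_nr] = p->dirs[top];
	p->written_len[p->written_nr++] = p->versions_nr - p->starts[top];
	p->versions_nr = p->starts[top];
	p->offsets_nr--;
}

/* Feed one (path, parent directory) pair, in reverse-sorted order. */
static void feed(struct pending *p, const char *path, const char *parent)
{
	const char *last = p->offsets_nr ? p->dirs[p->offsets_nr - 1] : NULL;
	const char *slash = strrchr(path, '/');

	if (!last || (strcmp(parent, last) && is_inside(parent, last))) {
		/* Just starting, or descending: keep appending. */
		open_dir(p, parent);
	} else if (strcmp(parent, last)) {
		/* Left the previous directory: it is now complete. */
		close_last_dir(p);
		if (!p->offsets_nr ||
		    strcmp(p->dirs[p->offsets_nr - 1], parent))
			open_dir(p, parent);
	}
	/* Record the basename; for a directory this stands for its tree. */
	p->versions[p->versions_nr++] = slash ? slash + 1 : path;
}

/* After the last path only the toplevel directory should remain open. */
static void finish(struct pending *p)
{
	assert(p->offsets_nr == 1 && p->starts[0] == 0);
	close_last_dir(p);
}
```

Feeding the nine (path, parent) pairs from the table above and then calling finish() closes src/moduleB, src/moduleA, src and finally the toplevel directory, in that order, which matches the walkthrough in this message.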
Since we potentially have entries in our string_list corresponding to multiple different toplevel directories, e.g. a slightly different repository might have: whizbang.txt "" tokens.txt "" src/moduleD src src/moduleC src src/moduleB src src/moduleA/foo.c src/moduleA src/moduleA/bar.c src/moduleA When src/moduleA is popped off, we need to know that the "last directory" reverts back to src, and how many entries in our string_list are associated with that parent directory. So I use an auxiliary offsets string_list which would have (parent_directory,offset) information of the form "" 0 src 2 src/moduleA 5 Whenever I write out a tree for a subdirectory, I set versions.nr to the final offset value and then decrement offsets.nr...and then add an entry to versions with a hash for the new directory. The idea is relatively simple, there's just a lot of accounting to implement this. Signed-off-by: Elijah Newren --- merge-ort.c | 242 ++++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 234 insertions(+), 8 deletions(-) diff --git a/merge-ort.c b/merge-ort.c index 59355de628..65dbdadc5e 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -494,7 +494,46 @@ static int string_list_df_name_compare(const char *one, const char *two) } struct directory_versions { + /* + * versions: list of (basename -> version_info) + * + * The basenames are in reverse lexicographic order of full pathnames, + * as processed in process_entries(). This puts all entries within + * a directory together, and covers the directory itself after + * everything within it, allowing us to write subtrees before needing + * to record information for the tree itself. + */ struct string_list versions; + + /* + * offsets: list of (full relative path directories -> integer offsets) + * + * Since versions contains basenames from files in multiple different + * directories, we need to know which entries in versions correspond + * to which directories. Values of e.g. 
+ * "" 0 + * src 2 + * src/moduleA 5 + * Would mean that entries 0-1 of versions are files in the toplevel + * directory, entries 2-4 are files under src/, and the remaining + * entries starting at index 5 are files under src/moduleA/. + */ + struct string_list offsets; + + /* + * last_directory: directory that previously processed file found in + * + * last_directory starts NULL, but records the directory in which the + * previous file was found within. As soon as + * directory(current_file) != last_directory + * then we need to start updating accounting in versions & offsets. + * Note that last_directory is always the last path in "offsets" (or + * NULL if "offsets" is empty) so this exists just for quick access. + */ + const char *last_directory; + + /* last_directory_len: cached computation of strlen(last_directory) */ + unsigned last_directory_len; }; static void write_tree(struct object_id *result_oid, @@ -558,6 +597,181 @@ static void record_entry_for_tree(struct directory_versions *dir_metadata, basename)->util = &mi->result; } +static void write_completed_directory(struct merge_options *opt, + const char *new_directory_name, + struct directory_versions *info) +{ + const char *prev_dir; + struct merged_info *dir_info = NULL; + unsigned int offset; + + /* + * Some explanation of info->versions and info->offsets... + * + * process_entries() iterates over all relevant files AND + * directories in reverse lexicographic order, and calls this + * function. 
Thus, an example of the paths that process_entries()
+	 * could operate on (along with the directories for those paths
+	 * being shown) is:
+	 *
+	 *     xtract.c             ""
+	 *     tokens.txt           ""
+	 *     src/moduleB/umm.c    src/moduleB
+	 *     src/moduleB/stuff.h  src/moduleB
+	 *     src/moduleB/baz.c    src/moduleB
+	 *     src/moduleB          src
+	 *     src/moduleA/foo.c    src/moduleA
+	 *     src/moduleA/bar.c    src/moduleA
+	 *     src/moduleA          src
+	 *     src                  ""
+	 *     Makefile             ""
+	 *
+	 * info->versions:
+	 *
+	 *     always contains the unprocessed entries and their
+	 *     version_info information.  For example, after the first five
+	 *     entries above, info->versions would be:
+	 *
+	 *         xtract.c
+	 *         tokens.txt
+	 *         umm.c
+	 *         stuff.h
+	 *         baz.c
+	 *
+	 *     Once a subdirectory is completed we remove the entries in
+	 *     that subdirectory from info->versions, writing it as a tree
+	 *     (write_tree()).  Thus, as soon as we get to src/moduleB,
+	 *     info->versions would be updated to
+	 *
+	 *         xtract.c
+	 *         tokens.txt
+	 *         moduleB
+	 *
+	 * info->offsets:
+	 *
+	 *     helps us track which entries in info->versions correspond to
+	 *     which directories.  When we are N directories deep (e.g. 4
+	 *     for src/modA/submod/subdir/), we have up to N+1 unprocessed
+	 *     directories (+1 because of toplevel dir).  Corresponding to
+	 *     the info->versions example above, after processing five entries
+	 *     info->offsets will be:
+	 *
+	 *         ""           0
+	 *         src/moduleB  2
+	 *
+	 *     which is used to know that xtract.c & tokens.txt are from the
+	 *     toplevel directory, while umm.c & stuff.h & baz.c are from the
+	 *     src/moduleB directory.  Again, following the example above,
+	 *     once we need to process src/moduleB, then info->offsets is
+	 *     updated to
+	 *
+	 *         ""           0
+	 *         src          2
+	 *
+	 *     which says that moduleB (and only moduleB so far) is in the
+	 *     src directory.
+	 *
+	 *     One unique thing to note about info->offsets here is that
+	 *     "src" was not added to info->offsets until there was a path
+	 *     (a file OR directory) immediately below src/ that got
+	 *     processed.
+ * + * Since process_entry() just appends new entries to info->versions, + * write_completed_directory() only needs to do work if the next path + * is in a directory that is different than the last directory found + * in info->offsets. + */ + + /* + * If we are working with the same directory as the last entry, there + * is no work to do. (See comments above the directory_name member of + * struct merged_info for why we can use pointer comparison instead of + * strcmp here.) + */ + if (new_directory_name == info->last_directory) + return; + + /* + * If we are just starting (last_directory is NULL), or last_directory + * is a prefix of the current directory, then we can just update + * info->offsets to record the offset where we started this directory + * and update last_directory to have quick access to it. + */ + if (info->last_directory == NULL || + !strncmp(new_directory_name, info->last_directory, + info->last_directory_len)) { + uintptr_t offset = info->versions.nr; + + info->last_directory = new_directory_name; + info->last_directory_len = strlen(info->last_directory); + /* + * Record the offset into info->versions where we will + * start recording basenames of paths found within + * new_directory_name. + */ + string_list_append(&info->offsets, + info->last_directory)->util = (void*)offset; + return; + } + + /* + * The next entry that will be processed will be within + * new_directory_name. Since at this point we know that + * new_directory_name is within a different directory than + * info->last_directory, we have all entries for info->last_directory + * in info->versions and we need to create a tree object for them. + */ + dir_info = strmap_get(&opt->priv->paths, info->last_directory); + assert(dir_info); + offset = (uintptr_t)info->offsets.items[info->offsets.nr-1].util; + if (offset == info->versions.nr) { + /* + * Actually, we don't need to create a tree object in this + * case. Whenever all files within a directory disappear + * during the merge (e.g. 
unmodified on one side and + * deleted on the other, or files were renamed elsewhere), + * then we get here and the directory itself needs to be + * omitted from its parent tree as well. + */ + dir_info->is_null = 1; + } else { + /* + * Write out the tree to the git object directory, and also + * record the mode and oid in dir_info->result. + */ + dir_info->is_null = 0; + dir_info->result.mode = S_IFDIR; + write_tree(&dir_info->result.oid, &info->versions, offset, + opt->repo->hash_algo->rawsz); + } + + /* + * We've now used several entries from info->versions and one entry + * from info->offsets, so we get rid of those values. + */ + info->offsets.nr--; + info->versions.nr = offset; + + /* + * Now we've taken care of the completed directory, but we need to + * prepare things since future entries will be in + * new_directory_name. (In particular, process_entry() will be + * appending new entries to info->versions.) So, we need to make + * sure new_directory_name is the last entry in info->offsets. + */ + prev_dir = info->offsets.nr == 0 ? NULL : + info->offsets.items[info->offsets.nr-1].string; + if (new_directory_name != prev_dir) { + uintptr_t c = info->versions.nr; + string_list_append(&info->offsets, + new_directory_name)->util = (void*)c; + } + + /* And, of course, we need to update last_directory to match. 
*/ + info->last_directory = new_directory_name; + info->last_directory_len = strlen(info->last_directory); +} + /* Per entry merge function */ static void process_entry(struct merge_options *opt, const char *path, @@ -656,7 +870,9 @@ static void process_entries(struct merge_options *opt, struct strmap_entry *e; struct string_list plist = STRING_LIST_INIT_NODUP; struct string_list_item *entry; - struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP }; + struct directory_versions dir_metadata = { STRING_LIST_INIT_NODUP, + STRING_LIST_INIT_NODUP, + NULL, 0 }; if (strmap_empty(&opt->priv->paths)) { oidcpy(result_oid, opt->repo->hash_algo->empty_tree); @@ -676,6 +892,11 @@ static void process_entries(struct merge_options *opt, /* * Iterate over the items in reverse order, so we can handle paths * below a directory before needing to handle the directory itself. + * + * This allows us to write subtrees before we need to write trees, + * and it also enables sane handling of directory/file conflicts + * (because it allows us to know whether the directory is still in + * the way when it is time to process the file at the same path). */ for (entry = &plist.items[plist.nr-1]; entry >= plist.items; --entry) { char *path = entry->string; @@ -686,6 +907,8 @@ static void process_entries(struct merge_options *opt, */ struct merged_info *mi = entry->util; + write_completed_directory(opt, mi->directory_name, + &dir_metadata); if (mi->clean) record_entry_for_tree(&dir_metadata, path, mi); else { @@ -694,17 +917,20 @@ static void process_entries(struct merge_options *opt, } } - /* - * TODO: We can't actually write a tree yet, because dir_metadata just - * contains all basenames of all files throughout the tree with their - * mode and hash. Not only is that a nonsensical tree, it will have - * lots of duplicates for paths such as "Makefile" or ".gitignore". 
- */ - die("Not yet implemented; need to process subtrees separately"); + if (dir_metadata.offsets.nr != 1 || + (uintptr_t)dir_metadata.offsets.items[0].util != 0) { + printf("dir_metadata.offsets.nr = %d (should be 1)\n", + dir_metadata.offsets.nr); + printf("dir_metadata.offsets.items[0].util = %u (should be 0)\n", + (unsigned)(uintptr_t)dir_metadata.offsets.items[0].util); + fflush(stdout); + BUG("dir_metadata accounting completely off; shouldn't happen"); + } write_tree(result_oid, &dir_metadata.versions, 0, opt->repo->hash_algo->rawsz); string_list_clear(&plist, 0); string_list_clear(&dir_metadata.versions, 0); + string_list_clear(&dir_metadata.offsets, 0); } void merge_switch_to_result(struct merge_options *opt, From patchwork Sun Nov 29 07:43:19 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11938995 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 70027C83014 for ; Sun, 29 Nov 2020 07:45:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3DCE32078D for ; Sun, 29 Nov 2020 07:45:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="vIRsnKTT" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727068AbgK2HpD (ORCPT ); Sun, 29 Nov 2020 02:45:03 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47300 
"EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726669AbgK2HpC (ORCPT ); Sun, 29 Nov 2020 02:45:02 -0500 Received: from mail-wr1-x443.google.com (mail-wr1-x443.google.com [IPv6:2a00:1450:4864:20::443]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F06D9C061A54 for ; Sat, 28 Nov 2020 23:43:39 -0800 (PST) Received: by mail-wr1-x443.google.com with SMTP id 64so10808657wra.11 for ; Sat, 28 Nov 2020 23:43:39 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=HAnw3NafiijmKfmnnIszEnLNjmX8+tY7kIWxaN1h+ko=; b=vIRsnKTTRSjq1sjNUIRoaT8V3e/kptmdoyTA29180M/Pce4NrTgz653/5vNPIjizzG 8NJ5zRqVqmtGyrgCgg4yyemtza1cAZIFpFMcKnvnUp2XVdQgzlDF2fUAAUDm7EA4WR8E YsEWpc6SzS4y6o9rUe96CuAhCyCYAKK6ojNqxRTOmJqb3qj3Vk+GSw/BbSmsudWh+mSB wnqO5TgnbiDmrsgrdjlXocm2M/zpmCjuLy9bLm/5vt04LW1IxOIyTvKYZiODsZsYJbD6 Od2bfg2jhcRFSgXpf5s+YKhSSzEWsJ0hYROMB2sKJMHBgT+SzJOeREFXlXZn5AzEB2T4 OOtg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=HAnw3NafiijmKfmnnIszEnLNjmX8+tY7kIWxaN1h+ko=; b=SpMHZ6S3JzH8S3v/cnP0wT5J8O/QopwbcKf5f+DpIiISgBXrBxHOkBTcjeAMOXxph9 1zeb3hlQP5xlAjPAUg/9Us2gu34OYCJnO9/OUyn9wT3bb9D19stWC25dpS1KOJR4Zv9k xf+xNPInTjNt2K68DjhT+TnnTUNnGJIeR7ndwobkMLgbeAnyP9vkQUGbvpjL2UfJ0zeN Tq97irfxYUlOri7AFnOzw3aWeuFBsvhj9mYgWCDkSc8XwGY/QLAdXurnuo6FCFNg9GuO aNLNhElgR2rxixvHG2oGSWvSbZBDjg7H9zEOn6wnwz/iHUtdaG6uRq3WTCGFgkIO3x0Y g2Wg== X-Gm-Message-State: AOAM531Gc+X11hJB/ryrKS1fDsRhtdDQ+zEa3uVmKigPDs5zS33sMeck Lk7UmFWcG8LJI7ZPdPf8VCfIyfubqJ0= X-Google-Smtp-Source: ABdhPJxgl7eMmrus0uSvK90zT++plr40k8UUcRr6PDXRKfp4jLHWlaoVnx/ORs/rS16ybrfwwCBqrg== X-Received: by 2002:a5d:474f:: with SMTP id o15mr20966952wrs.377.1606635818570; Sat, 28 Nov 2020 23:43:38 -0800 
(PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s25sm19048289wmh.16.2020.11.28.23.43.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Nov 2020 23:43:38 -0800 (PST) Message-Id: In-Reply-To: References: Date: Sun, 29 Nov 2020 07:43:19 +0000 Subject: [PATCH 16/20] merge-ort: basic outline for merge_switch_to_result() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren This adds a basic implementation for merge_switch_to_result(), though just in terms of a few new empty functions that will be defined in subsequent commits. Signed-off-by: Elijah Newren --- merge-ort.c | 42 +++++++++++++++++++++++++++++++++++++++++- 1 file changed, 41 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index 65dbdadc5e..1ef32a4053 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -933,13 +933,53 @@ static void process_entries(struct merge_options *opt, string_list_clear(&dir_metadata.offsets, 0); } +static int checkout(struct merge_options *opt, + struct tree *prev, + struct tree *next) +{ + die("Not yet implemented."); +} + +static int record_conflicted_index_entries(struct merge_options *opt, + struct index_state *index, + struct strmap *paths, + struct strmap *conflicted) +{ + if (strmap_empty(conflicted)) + return 0; + + die("Not yet implemented."); +} + void merge_switch_to_result(struct merge_options *opt, struct tree *head, struct merge_result *result, int update_worktree_and_index, int display_update_msgs) { - die("Not yet implemented"); + assert(opt->priv == NULL); + if (result->clean >= 0 && update_worktree_and_index) { + struct merge_options_internal *opti = result->priv; + + if (checkout(opt, head, result->tree)) { + /* failure to function */ + result->clean = -1; + return; + } + + if (record_conflicted_index_entries(opt, opt->repo->index, + &opti->paths, + 
&opti->conflicted)) { + /* failure to function */ + result->clean = -1; + return; + } + } + + if (display_update_msgs) { + /* TODO: print out CONFLICT and other informational messages. */ + } + merge_finalize(opt, result); } From patchwork Sun Nov 29 07:43:20 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11939003 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-12.8 required=3.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FORGED_FROMDOMAIN,FREEMAIL_FROM, HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7EFABC64E7C for ; Sun, 29 Nov 2020 07:45:29 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 4C2F8207CD for ; Sun, 29 Nov 2020 07:45:29 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="P3hhB1UB" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727168AbgK2Hp1 (ORCPT ); Sun, 29 Nov 2020 02:45:27 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47420 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726669AbgK2Hp1 (ORCPT ); Sun, 29 Nov 2020 02:45:27 -0500 Received: from mail-wm1-x341.google.com (mail-wm1-x341.google.com [IPv6:2a00:1450:4864:20::341]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C33B8C061A55 for ; Sat, 28 Nov 2020 23:43:40 -0800 (PST) Received: by mail-wm1-x341.google.com with SMTP id v14so6543820wml.1 for ; Sat, 28 Nov 2020 23:43:40 -0800 (PST) DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=W3Bhf2BLQ8bIe2cDp1LDmVIo03O0xRmiZCI1HPp2dfU=; b=P3hhB1UB8xUtGe5hLoEVLeF49Cx/4YL9YzbggjbDOf05huEI3pIHjGNP4xRRJK4tKI RuGNh2E+wUH3ob2Fkm0ybzCTZykmabLiM15hvpq8N+yNlSFthQ4/mfpDwjilbYdH+/xX E1RzbepdT79VpJ0WydLRTIpZrvpDRLV71TappI/1F0JotDwb3BUtaQQNX6AAGVEyR5ky WdIEzKBAzcfrxerpQgtxJ00pnnqUGiMs6Y3hhWunkzRGzx0YXCOOFjhDlWt65STh+IhK Vz0dTAG0eOp0kVShqL1U/yT/3cUEirjOA/6nCcnIo79To/lDjHxW47z6yph9ZHHO0mYk js8Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=W3Bhf2BLQ8bIe2cDp1LDmVIo03O0xRmiZCI1HPp2dfU=; b=iXrpnFnKnRQtUaFOWpOBAPXwxGoxJYyvpqeMrJ5cHL5C1471YyIsA5zWYe62cermaA bQ8A1fQI8SrJnvVLVQn0ZgWCh07MNONdCJCrQqGuvs2FUelrEmDqI2TtknP7W8QN8nuc boH5SDONSxFsQM9BNyeIBueA3WqIkLuByIl3m9aksWb5KBfgZRYhEAxAS4mqrcbYznO8 ekn7Esonxci4toKpLPs4Ezpr4q88kqC2JZ0gKAB9pQEu/7rCpN0wk64w4f0XOlUso+cl gud0TT0dNI1ALOfN8HguRlC1IW/l+/+D4wl6Yz7UwQoOAbBHbfKF1hUF2+PUVxjHV6Qa 8zMA== X-Gm-Message-State: AOAM530mm0dWX5Wpy4AekOlC8qK2JHN1R4ml4Oyo3pftmXGFRUaoTUJe FpEW+8KxG51CFMSVIePJwMIKlt2XXj8= X-Google-Smtp-Source: ABdhPJyiB7iiRhW5XLfVJwL2ofbFfDtmZjNOYbfZEf+41dUP3b01dJ2oiQlK4IukwfS9U6yOCUvLqQ== X-Received: by 2002:a1c:b742:: with SMTP id h63mr16619656wmf.122.1606635819344; Sat, 28 Nov 2020 23:43:39 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a1sm23061481wrv.61.2020.11.28.23.43.38 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sat, 28 Nov 2020 23:43:38 -0800 (PST) Message-Id: <68307f1b676d0ece1ca8b586ecc7d73a331e6b1a.1606635803.git.gitgitgadget@gmail.com> In-Reply-To: References: Date: Sun, 29 Nov 2020 07:43:20 +0000 Subject: [PATCH 17/20] merge-ort: add implementation of checkout() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org 
Cc: Elijah Newren , Elijah Newren Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren From: Elijah Newren

Since merge-ort creates a tree for its output, when there are no
conflicts, updating the working tree and index is as simple as using the
unpack_trees() machinery with a twoway_merge (i.e. doing the equivalent
of a "checkout" operation).

If there were conflicts in the merge, then since the tree we created
includes all the conflict markers, using the unpack_trees() machinery in
this manner will still update the working tree correctly. Further, all
index entries corresponding to cleanly merged files will also be updated
correctly by this procedure. Index entries corresponding to conflicted
entries will appear as though the user had run "git add -u" after the
merge to accept all files as-is with conflict markers.

Thus, after running unpack_trees(), there needs to be a separate step for
updating the entries in the index corresponding to conflicted files. This
will be the job of the function record_conflicted_index_entries(), which
will be implemented in a subsequent commit.
Signed-off-by: Elijah Newren --- merge-ort.c | 45 ++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 44 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index 1ef32a4053..69b9fbe591 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -19,9 +19,11 @@ #include "diff.h" #include "diffcore.h" +#include "dir.h" #include "object-store.h" #include "strmap.h" #include "tree.h" +#include "unpack-trees.h" #include "xdiff-interface.h" struct merge_options_internal { @@ -937,7 +939,48 @@ static int checkout(struct merge_options *opt, struct tree *prev, struct tree *next) { - die("Not yet implemented."); + /* Switch the index/working copy from old to new */ + int ret; + struct tree_desc trees[2]; + struct unpack_trees_options unpack_opts; + + memset(&unpack_opts, 0, sizeof(unpack_opts)); + unpack_opts.head_idx = -1; + unpack_opts.src_index = opt->repo->index; + unpack_opts.dst_index = opt->repo->index; + + setup_unpack_trees_porcelain(&unpack_opts, "merge"); + + /* + * NOTE: if this were just "git checkout" code, we would probably + * read or refresh the cache and check for a conflicted index, but + * builtin/merge.c or sequencer.c really needs to read the index + * and check for conflicted entries before starting merging for a + * good user experience (no sense waiting for merges/rebases before + * erroring out), so there's no reason to duplicate that work here. + */ + + /* 2-way merge to the new branch */ + unpack_opts.update = 1; + unpack_opts.merge = 1; + unpack_opts.quiet = 0; /* FIXME: sequencer might want quiet? */ + unpack_opts.verbose_update = (opt->verbosity > 2); + unpack_opts.fn = twoway_merge; + if (1/* FIXME: opts->overwrite_ignore*/) { + unpack_opts.dir = xcalloc(1, sizeof(*unpack_opts.dir)); + unpack_opts.dir->flags |= DIR_SHOW_IGNORED; + setup_standard_excludes(unpack_opts.dir); + } + parse_tree(prev); + init_tree_desc(&trees[0], prev->buffer, prev->size); + parse_tree(next); + init_tree_desc(&trees[1], next->buffer, next->size); + + ret = unpack_trees(2, trees, &unpack_opts); + clear_unpack_trees_porcelain(&unpack_opts); + dir_clear(unpack_opts.dir); + FREE_AND_NULL(unpack_opts.dir); + return ret; } static int record_conflicted_index_entries(struct merge_options *opt,
From patchwork Sun Nov 29 07:43:21 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11939001 Date: Sun, 29 Nov 2020 07:43:21 +0000 Subject: [PATCH 18/20] tree: enable cmp_cache_name_compare() to be used elsewhere To: git@vger.kernel.org Cc: Elijah Newren From: Elijah Newren
Signed-off-by: Elijah Newren --- tree.c | 2 +- tree.h | 2 ++ 2 files changed, 3 insertions(+), 1 deletion(-) diff --git a/tree.c b/tree.c index e76517f6b1..a52479812c 100644 --- a/tree.c +++ b/tree.c @@ -144,7 +144,7 @@ int read_tree_recursive(struct repository *r, return ret; } -static int cmp_cache_name_compare(const void *a_, const void *b_) +int cmp_cache_name_compare(const void *a_, const void *b_) { const struct cache_entry *ce1, *ce2; diff --git a/tree.h b/tree.h index 9383745073..3eb0484cbf 100644 --- a/tree.h +++ b/tree.h @@ -28,6 +28,8 @@ void free_tree_buffer(struct tree *tree); /* Parses and returns the tree in the given ent, chasing tags and commits. 
*/ struct tree *parse_tree_indirect(const struct object_id *oid); +int cmp_cache_name_compare(const void *a_, const void *b_); + #define READ_TREE_RECURSIVE 1 typedef int (*read_tree_fn_t)(const struct object_id *, struct strbuf *, const char *, unsigned int, int, void *);
From patchwork Sun Nov 29 07:43:22 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11938999 Message-Id: <56b162c60993061e29c100ce4a27839d758033b8.1606635803.git.gitgitgadget@gmail.com> Date: Sun, 29 Nov 2020 07:43:22 +0000 Subject: [PATCH 19/20] merge-ort: add implementation of record_conflicted_index_entries() To: git@vger.kernel.org Cc: Elijah Newren From: Elijah Newren
After checkout(), the working tree has the appropriate contents, and the index matches the working copy. That means that all unmodified and cleanly merged files have correct index entries, but conflicted entries need to be updated. We do this by looping over the conflicted entries, marking the existing index entry for the path with CE_REMOVE, adding new higher order stages for the path at the end of the index (ignoring normal index sort order), and then at the end of the loop removing the CE_REMOVE-marked cache entries and sorting the index. Signed-off-by: Elijah Newren --- merge-ort.c | 89 ++++++++++++++++++++++++++++++++++++++++++++++++++++- 1 file changed, 88 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index 69b9fbe591..d1b98e2fca 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -17,6 +17,7 @@ #include "cache.h" #include "merge-ort.h" +#include "cache-tree.h" #include "diff.h" #include "diffcore.h" #include "dir.h" @@ -988,10 +989,96 @@ static int record_conflicted_index_entries(struct merge_options *opt, struct strmap *paths, struct strmap *conflicted) { + struct hashmap_iter iter; + struct strmap_entry *e; + int errs = 0; + int original_cache_nr; + if (strmap_empty(conflicted)) return 0; - die("Not yet implemented."); + original_cache_nr = index->cache_nr; + + /* Replace the stage-0 entry of each conflicted path with its higher order stages */ + strmap_for_each_entry(conflicted, &iter, e) { + const char *path = e->key; + struct conflict_info *ci = e->value; + int pos; + struct cache_entry *ce; + int i; + + VERIFY_CI(ci); + + /* + * The index will already have a stage=0 entry for this path, + * because we created an as-merged-as-possible version of the + * file and checkout() moved the working copy and index over + * to that version. 
+ * + * However, previous iterations through this loop will have + * added unmerged entries to the end of the cache which + * ignore the standard alphabetical ordering of cache + * entries and break invariants needed for index_name_pos() + * to work. Fortunately, we know the entry we want is before + * those appended cache entries, so do a temporary swap on + * cache_nr to only look through entries of interest. + */ + SWAP(index->cache_nr, original_cache_nr); + pos = index_name_pos(index, path, strlen(path)); + SWAP(index->cache_nr, original_cache_nr); + if (pos < 0) { + if (ci->filemask == 1) + cache_tree_invalidate_path(index, path); + else + BUG("Conflicted %s but nothing in basic working tree or index; this shouldn't happen", path); + } else { + ce = index->cache[pos]; + + /* + * Clean paths with CE_SKIP_WORKTREE set will not be + * written to the working tree by the unpack_trees() + * call in checkout(). Our conflicted entries would + * have appeared clean to that code since we ignored + * the higher order stages. Thus, we need to override + * the CE_SKIP_WORKTREE bit and manually write those + * files to the working disk here. + * + * TODO: Implement this CE_SKIP_WORKTREE fixup. + */ + + /* + * Mark this cache entry for removal and instead add + * new stage>0 entries corresponding to the + * conflicts. If there are many conflicted entries, we + * want to avoid memmove'ing O(NM) entries by + * inserting the new entries one at a time. So, + * instead, we just add the new cache entries to the + * end (ignoring normal index requirements on sort + * order) and sort the index once we're all done. 
+ */ + ce->ce_flags |= CE_REMOVE; + } + + for (i = 0; i < 3; i++) { + struct version_info *vi; + if (!(ci->filemask & (1ul << i))) + continue; + vi = &ci->stages[i]; + ce = make_cache_entry(index, vi->mode, &vi->oid, + path, i+1, 0); + add_index_entry(index, ce, ADD_CACHE_JUST_APPEND); + } + } + + /* + * Remove the unused cache entries (and invalidate the relevant + * cache-trees), then sort the index entries to get the conflicted + * entries we added to the end into their right locations. + */ + remove_marked_cache_entries(index, 1); + QSORT(index->cache, index->cache_nr, cmp_cache_name_compare); + + return errs; } void merge_switch_to_result(struct merge_options *opt,
From patchwork Sun Nov 29 07:43:23 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Elijah Newren X-Patchwork-Id: 11939005 Date: Sun, 29 Nov 2020 07:43:23 +0000 Subject: [PATCH 20/20] merge-ort: free data structures in merge_finalize() To: git@vger.kernel.org Cc: Elijah Newren From: Elijah Newren
Signed-off-by: Elijah Newren --- merge-ort.c | 32 +++++++++++++++++++++++++++++++- 1 file changed, 31 insertions(+), 1 deletion(-) diff --git a/merge-ort.c b/merge-ort.c index d1b98e2fca..ea6a9d7348 100644 --- a/merge-ort.c +++ b/merge-ort.c @@ -182,6 +182,16 @@ struct conflict_info { assert((ci) && !(mi)->clean); \ } while (0) +static void free_strmap_strings(struct strmap *map) +{ + struct hashmap_iter iter; + struct strmap_entry *entry; + + strmap_for_each_entry(map, &iter, entry) { + free((char*)entry->key); + } +} + static int err(struct merge_options *opt, const char *err, ...) { va_list params; @@ -1116,7 +1126,27 @@ void merge_switch_to_result(struct merge_options *opt, void merge_finalize(struct merge_options *opt, struct merge_result *result) { - die("Not yet implemented"); + struct merge_options_internal *opti = result->priv; + + assert(opt->priv == NULL); + + /* + * We marked opti->paths with strdup_strings = 0, so that we + * wouldn't have to make another copy of the fullpath created by + * make_traverse_path from setup_path_info(). But, now that we've + * used it and have no other references to these strings, it is time + * to deallocate them. + */ + free_strmap_strings(&opti->paths); + strmap_clear(&opti->paths, 1); + + /* + * All keys and values in opti->conflicted are a subset of those in + * opti->paths. We don't want to deallocate anything twice, so we + * don't free the keys and we pass 0 for free_values. 
+ */ + strmap_clear(&opti->conflicted, 0); + FREE_AND_NULL(opti); } static void merge_start(struct merge_options *opt, struct merge_result *result)
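
The ownership rule described in the merge_finalize() comments above can be illustrated with a small stand-alone sketch. This is not the strmap API: struct toy_map and every toy_* function below are hypothetical, and the free_keys/free_values flags stand in for the combination of free_strmap_strings() and the free_values argument of strmap_clear() used in the patch.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical toy map: parallel arrays of borrowed key/value pointers. */
struct toy_map {
	char **keys;
	void **values;
	size_t nr, alloc;
};

size_t toy_frees; /* counts free() calls made on keys and values */

/* Store the given pointers in the map; nothing is copied. */
void toy_map_put(struct toy_map *map, char *key, void *value)
{
	if (map->nr == map->alloc) {
		map->alloc = map->alloc ? 2 * map->alloc : 4;
		map->keys = realloc(map->keys,
				    map->alloc * sizeof(*map->keys));
		map->values = realloc(map->values,
				      map->alloc * sizeof(*map->values));
		assert(map->keys && map->values);
	}
	map->keys[map->nr] = key;
	map->values[map->nr] = value;
	map->nr++;
}

/* Release the map; free keys and/or values only if this map owns them. */
void toy_map_clear(struct toy_map *map, int free_keys, int free_values)
{
	size_t i;

	for (i = 0; i < map->nr; i++) {
		if (free_keys) {
			free(map->keys[i]);
			toy_frees++;
		}
		if (free_values) {
			free(map->values[i]);
			toy_frees++;
		}
	}
	free(map->keys);
	free(map->values);
	memset(map, 0, sizeof(*map));
}

/* Return a malloc'd copy of s. */
char *toy_dup(const char *s)
{
	size_t n = strlen(s) + 1;
	char *p = malloc(n);

	assert(p);
	memcpy(p, s, n);
	return p;
}

/*
 * Build "paths" (the owner) and "conflicted" (a borrowed subset), then
 * tear both down: the owner frees keys and values, the borrower frees
 * neither. Returns the number of key/value frees performed.
 */
size_t toy_finalize_demo(void)
{
	struct toy_map paths = { 0 }, conflicted = { 0 };
	const char *names[] = { "Makefile", "merge-ort.c", "tree.c" };
	size_t i;

	toy_frees = 0;
	for (i = 0; i < 3; i++) {
		char *key = toy_dup(names[i]);
		int *value = malloc(sizeof(*value));

		assert(value);
		*value = (int)i;
		toy_map_put(&paths, key, value);
		if (i == 1) /* pretend only merge-ort.c conflicted */
			toy_map_put(&conflicted, key, value);
	}
	toy_map_clear(&paths, 1, 1);
	toy_map_clear(&conflicted, 0, 0);
	return toy_frees;
}
```

Each key and value is freed exactly once, through the owning map, so three paths yield six frees even though one pair is also reachable from the second map. Clearing the borrower with either flag set would be a double free, which is the situation the comment in the patch guards against.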