From patchwork Thu Nov 1 13:46:16 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Derrick Stolee X-Patchwork-Id: 10663975 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4980D109C for ; Thu, 1 Nov 2018 13:46:31 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3DE232BBC3 for ; Thu, 1 Nov 2018 13:46:31 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 3025E2BCA0; Thu, 1 Nov 2018 13:46:31 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-8.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,FREEMAIL_FROM,MAILING_LIST_MULTI,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 511522BC69 for ; Thu, 1 Nov 2018 13:46:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728589AbeKAWta (ORCPT ); Thu, 1 Nov 2018 18:49:30 -0400 Received: from mail-qk1-f195.google.com ([209.85.222.195]:46351 "EHLO mail-qk1-f195.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728515AbeKAWt3 (ORCPT ); Thu, 1 Nov 2018 18:49:29 -0400 Received: by mail-qk1-f195.google.com with SMTP id q1so10560256qkf.13 for ; Thu, 01 Nov 2018 06:46:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=fc00YL8L4NYk6BchbaDvrOXOpyZ+hXlBbdStPrcooHU=; b=W9leqqUjxRDAEvYUZ8K4KJsTWKSVZm7udSSX+p0JB6I0WTV6RyB6Vl1LSDWt/5Lrtq zlKgbnZvr6lA3dEvHu7wlDXO7xAlHWqCzH3zi85vOkpwcrr83ZtSVe4jjc6M5F+oIp/h padu0ePyv340nua50dSMXfVULuRj+XNE2ryP52q/t83EnH7Jh90b4yHoTLpbQhtJ+KPO MtDJG6rKhvI3ZFxyzn+ve+w5iYP/P9sAO3g9tQJjuKJvyr3oN1YR6Ch0WHG2IlI8kkm5 6AN2UOOvgnFsy13WeFd08GY8ADhQ1LRcWXuyug+bNWRot7QCi8zIbh5N9LMPpyML8efi jhiA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=fc00YL8L4NYk6BchbaDvrOXOpyZ+hXlBbdStPrcooHU=; b=oeXMqiZluSn8dANiIkehDE9Y8uF9t2S65eJesiqIzsJWjyXvU6mySuFsC3cd3cOlpP c1cT2WEOrGUfrLhv8tuA/32WXwD2POv8r8SThcBu6trto4m2EJusWxkVBOhf/7NdCP4G AgYMH0nfkItHS5aCqs9rq6VmqiVA5+/VTFh7edWEg6chrWswdinzxx1rMH9i4xtF+8lM VQdJtZadl7ENCeoJ00KizdchoBQX0Mqo0UGTn333tj6xR6JpsextzFQ3EMh24oK8iFb6 nTh6n6CgjK0DzDPzB58gmpj/cTEBawDDD9LpWGhJobMllDtcYcRBT71PXqyvca55AmxC kk4g== X-Gm-Message-State: AGRZ1gI2FN3IPnxqsefjHyrz6rIPBbmC3U/LQ17Ffs+gva6kz5QuTg0Z +ghfPq2ikRIeGg7lMRzHKq+6wWFC X-Google-Smtp-Source: AJdET5fhXOVTRmr93TtDuR1HwPy8kgZlamUH045YwBZqYmJr/3kEx8MqhtAgPf+juO41HK+xwcqZwA== X-Received: by 2002:a37:4bd1:: with SMTP id y200-v6mr6672887qka.55.1541079986546; Thu, 01 Nov 2018 06:46:26 -0700 (PDT) Received: from stolee-linux.mshome.net ([167.220.148.125]) by smtp.gmail.com with ESMTPSA id f75sm7347357qkf.96.2018.11.01.06.46.25 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Thu, 01 Nov 2018 06:46:25 -0700 (PDT) From: Derrick Stolee X-Google-Original-From: Derrick Stolee To: git@vger.kernel.org Cc: gitster@pobox.com, avarab@gmail.com, szeder.dev@gmail.com, peff@peff.net, jnareb@gmail.com, Derrick Stolee Subject: [PATCH v5 0/7] Use generation numbers for --topo-order Date: Thu, 1 Nov 2018 13:46:16 +0000 Message-Id: <20181101134623.84055-1-dstolee@microsoft.com> X-Mailer: git-send-email 2.19.1.542.gc4df23f792 In-Reply-To: References: MIME-Version: 1.0 Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP This patch series performs a decently-sized refactoring of the revision-walk machinery. Well, "refactoring" is probably the wrong word, as I don't actually remove the old code. Instead, when we see certain options in the 'rev_info' struct, we redirect the commit-walk logic to a new set of methods that distribute the workload differently. By using generation numbers in the commit-graph, we can significantly improve 'git log --graph' commands (and the underlying 'git rev-list --topo-order'). On the Linux repository, I got the following performance results when comparing to the previous version with or without a commit-graph: Test: git rev-list --topo-order -100 HEAD HEAD~1, no commit-graph: 6.80 s HEAD~1, w/ commit-graph: 0.77 s HEAD, w/ commit-graph: 0.02 s Test: git rev-list --topo-order -100 HEAD -- tools HEAD~1, no commit-graph: 9.63 s HEAD~1, w/ commit-graph: 6.06 s HEAD, w/ commit-graph: 0.06 s If you want to read this series but are unfamiliar with the commit-graph and generation numbers, then I recommend reading `Documentation/technical/commit-graph.txt` or a blog post [1] I wrote on the subject. In particular, the three-part walk described in "revision.c: refactor basic topo-order logic" is present (but underexplained) as an animated PNG [2]. **UPDATED** Now that we have had some review and some dogfooding, I'm removing the paragraph I had here about "RFC quality". I think this is ready to merge! One notable case that is not included in this series is the case of a history comparison such as 'git rev-list --topo-order A..B'. The existing code in limit_list() has ways to cut the walk short when all pending commits are UNINTERESTING. Since this code depends on commit_list instead of the prio_queue we are using here, I chose to leave it untouched for now. We can revisit it in a separate series later. Since handle_commit() turns on revs->limited when a commit is UNINTERESTING, we do not hit the new code in this case. Removing this 'revs->limited = 1;' line yields correct results, but the performance can be worse. **UPDATED** See the discussion about Generation Number V2 [4] for more on this topic. Changes in V5: Thanks Jakub for feedback on the huge commit! I think I've responded to all the code feedback. See the range-diff at the end of this cover-page. Thanks, -Stolee [1] https://blogs.msdn.microsoft.com/devops/2018/07/09/supercharging-the-git-commit-graph-iii-generations/ Supercharging the Git Commit Graph III: Generations and Graph Algorithms [2] https://msdnshared.blob.core.windows.net/media/2018/06/commit-graph-topo-order-b-a.png Animation showing three-part walk [3] https://github.com/derrickstolee/git/tree/topo-order/test A branch containing this series along with commits to compute commit-graph in entire test suite. [4] https://public-inbox.org/git/6367e30a-1b3a-4fe9-611b-d931f51effef@gmail.com/ [RFC] Generation Number v2 Note: I'm not submitting this version via GitGitGadget because it's currently struggling with how to handle a PR in a conflict state. The new flags in revision.h have a conflict with recent changes in master. Derrick Stolee (7): prio-queue: add 'peek' operation test-reach: add run_three_modes method test-reach: add rev-list tests revision.c: begin refactoring --topo-order logic commit/revisions: bookkeeping before refactoring revision.c: generation-based topo-order algorithm t6012: make rev-list tests more interesting commit.c | 9 +- commit.h | 7 + object.h | 4 +- prio-queue.c | 9 ++ prio-queue.h | 6 + revision.c | 243 +++++++++++++++++++++++++++++++++-- revision.h | 6 + t/helper/test-prio-queue.c | 26 ++-- t/t0009-prio-queue.sh | 14 ++ t/t6012-rev-list-simplify.sh | 45 +++++-- t/t6600-test-reach.sh | 96 +++++++++++++- 11 files changed, 426 insertions(+), 39 deletions(-) base-commit: 2d3b1c576c85b7f5db1f418907af00ab88e0c303