From patchwork Wed Jan 29 22:03:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11356923 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4B434921 for ; Wed, 29 Jan 2020 22:03:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 206662071E for ; Wed, 29 Jan 2020 22:03:52 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="oXAP1AX4" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726723AbgA2WDu (ORCPT ); Wed, 29 Jan 2020 17:03:50 -0500 Received: from mail-wm1-f65.google.com ([209.85.128.65]:54906 "EHLO mail-wm1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726283AbgA2WDs (ORCPT ); Wed, 29 Jan 2020 17:03:48 -0500 Received: by mail-wm1-f65.google.com with SMTP id g1so1464784wmh.4 for ; Wed, 29 Jan 2020 14:03:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=kZr6bRchdKE3uJo7GDfpg3MFt0eukyWBsmuwQflY8MA=; b=oXAP1AX4voI3CZourcspkSA5C881Lh3Ju86eS7SG44GDaUtmQVoicmSO2Qfq24cWDY 2uD4BCm/suO36FTQ0SuyGXEdi0BnKqh+yx7KivISEjZp65gB3H1nLrpMrpinynUTJiU2 +S6h2+Oqh4ao6D8dFlUcqVZfPBdBvyh37I+cbPczA5W5gJs7L+40Bj31gezETFO+MnQD AwQqIQV5DT6OKi4a5qfHWA5kqGvemZpdNZNMgkm6iKk9gPVtvsrDTFx0rWAEg2M0k5Ri Qn1LvzisB6QUzNpeIy4eLAVAmbJMPOjsbIipqDR7LDh/zRz/Vcqv990Wt0aUNy9nqvJr F1jQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; 
bh=kZr6bRchdKE3uJo7GDfpg3MFt0eukyWBsmuwQflY8MA=; b=KLfW50+5uxBTCM1L0PU6/dV9V8rQt5dRf3vtX+R6prKY5yRli9NAhKSQkpaGDWWv7S vqryCwxMNIC9VM+iH8B5zEjLqwXWGY40D5sqV4zNbU2v2+UQhWOQDP40BaRW7XzjtDNQ iOpBh+nZxnCVrj+WZYM9176fGc/krV5X3K9dkxGF0+tf7yL5vmFIy8ciz8UtZInCLqAe yUV4VFMnJGpwSC4yIr2VPKeAnaiL0dRtpSrhhzbjF1lV3O0tHWcoxr3NE9rV8ARjHFQI E7X5LJCfajX0DXjyz/vXcDAh05ZnOwp+y+8SA+EBsw+2Xx54HyB07QFRqNbRHO+wJIw5 hE/g== X-Gm-Message-State: APjAAAUQJ78gP3yWsYDxFIWCv/auk361xugqkWUL5zP2Vy2b/SYyz5ns WxQuk7ojiz5THiOzAZ89JQolHeP/ X-Google-Smtp-Source: APXvYqxbH+jDLnvW/q0q3DLIaN5pla3uFLTX5jXDRLq3y2lA9qccgu8LIHNGeTD0Bj88lYyYWkv1tg== X-Received: by 2002:a7b:c450:: with SMTP id l16mr1367221wmi.31.1580335426335; Wed, 29 Jan 2020 14:03:46 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id c9sm3869590wme.41.2020.01.29.14.03.45 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jan 2020 14:03:45 -0800 (PST) Message-Id: <27bc1357964662faeeb1f4cadd31772ed6caf109.1580335424.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Elijah Newren via GitGitGadget" Date: Wed, 29 Jan 2020 22:03:38 +0000 Subject: [PATCH 1/6] dir: consolidate treat_path() and treat_one_path() Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Martin Melka , SZEDER =?utf-8?b?R8OhYm9y?= , Samuel Lijin , =?utf-8?b?Tmd1eeG7hW4gVGjDoWkgTmfhu41j?= Duy , Elijah Newren , Elijah Newren Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren Commit 16e2cfa90993 ("read_directory(): further split treat_path()", 2010-01-08) split treat_one_path() out of treat_path(), because treat_leading_path() would not have access to a dirent but wanted to re-use as much of treat_path() as possible. Not re-using all of treat_path() caused other bugs, as noted in commit b9670c1f5e6b ("dir: fix checks on common prefix directory", 2019-12-19). 
Finally, in commit ad6f2157f951 ("dir: restructure in a way to avoid passing around a struct dirent", 2020-01-16), dirents were removed from treat_path() and other functions entirely. Since the only reason for splitting these functions was the lack of a dirent -- which no longer applies to either function -- and since the split caused problems in the past resulting in us not using treat_one_path() separately anymore, just undo the split. Signed-off-by: Elijah Newren --- dir.c | 121 ++++++++++++++++++++++++++-------------------------------- 1 file changed, 55 insertions(+), 66 deletions(-) diff --git a/dir.c b/dir.c index b460211e61..68c56aeddb 100644 --- a/dir.c +++ b/dir.c @@ -1863,21 +1863,65 @@ static int resolve_dtype(int dtype, struct index_state *istate, return dtype; } -static enum path_treatment treat_one_path(struct dir_struct *dir, - struct untracked_cache_dir *untracked, - struct index_state *istate, - struct strbuf *path, - int baselen, - const struct pathspec *pathspec, - int dtype) -{ - int exclude; - int has_path_in_index = !!index_file_exists(istate, path->buf, path->len, ignore_case); +static enum path_treatment treat_path_fast(struct dir_struct *dir, + struct untracked_cache_dir *untracked, + struct cached_dir *cdir, + struct index_state *istate, + struct strbuf *path, + int baselen, + const struct pathspec *pathspec) +{ + strbuf_setlen(path, baselen); + if (!cdir->ucd) { + strbuf_addstr(path, cdir->file); + return path_untracked; + } + strbuf_addstr(path, cdir->ucd->name); + /* treat_one_path() does this before it calls treat_directory() */ + strbuf_complete(path, '/'); + if (cdir->ucd->check_only) + /* + * check_only is set as a result of treat_directory() getting + * to its bottom. Verify again the same set of directories + * with check_only set. 
+ */ + return read_directory_recursive(dir, istate, path->buf, path->len, + cdir->ucd, 1, 0, pathspec); + /* + * We get path_recurse in the first run when + * directory_exists_in_index() returns index_nonexistent. We + * are sure that new changes in the index does not impact the + * outcome. Return now. + */ + return path_recurse; +} + +static enum path_treatment treat_path(struct dir_struct *dir, + struct untracked_cache_dir *untracked, + struct cached_dir *cdir, + struct index_state *istate, + struct strbuf *path, + int baselen, + const struct pathspec *pathspec) +{ + int has_path_in_index, dtype, exclude; enum path_treatment path_treatment; - dtype = resolve_dtype(dtype, istate, path->buf, path->len); + if (!cdir->d_name) + return treat_path_fast(dir, untracked, cdir, istate, path, + baselen, pathspec); + if (is_dot_or_dotdot(cdir->d_name) || !fspathcmp(cdir->d_name, ".git")) + return path_none; + strbuf_setlen(path, baselen); + strbuf_addstr(path, cdir->d_name); + if (simplify_away(path->buf, path->len, pathspec)) + return path_none; + + dtype = resolve_dtype(cdir->d_type, istate, path->buf, path->len); /* Always exclude indexed files */ + has_path_in_index = !!index_file_exists(istate, path->buf, path->len, + ignore_case); if (dtype != DT_DIR && has_path_in_index) return path_none; @@ -1942,61 +1986,6 @@ static enum path_treatment treat_one_path(struct dir_struct *dir, } } -static enum path_treatment treat_path_fast(struct dir_struct *dir, - struct untracked_cache_dir *untracked, - struct cached_dir *cdir, - struct index_state *istate, - struct strbuf *path, - int baselen, - const struct pathspec *pathspec) -{ - strbuf_setlen(path, baselen); - if (!cdir->ucd) { - strbuf_addstr(path, cdir->file); - return path_untracked; - } - strbuf_addstr(path, cdir->ucd->name); - /* treat_one_path() does this before it calls treat_directory() */ - strbuf_complete(path, '/'); - if (cdir->ucd->check_only) - /* - * check_only is set as a result of treat_directory() getting - * 
to its bottom. Verify again the same set of directories - * with check_only set. - */ - return read_directory_recursive(dir, istate, path->buf, path->len, - cdir->ucd, 1, 0, pathspec); - /* - * We get path_recurse in the first run when - * directory_exists_in_index() returns index_nonexistent. We - * are sure that new changes in the index does not impact the - * outcome. Return now. - */ - return path_recurse; -} - -static enum path_treatment treat_path(struct dir_struct *dir, - struct untracked_cache_dir *untracked, - struct cached_dir *cdir, - struct index_state *istate, - struct strbuf *path, - int baselen, - const struct pathspec *pathspec) -{ - if (!cdir->d_name) - return treat_path_fast(dir, untracked, cdir, istate, path, - baselen, pathspec); - if (is_dot_or_dotdot(cdir->d_name) || !fspathcmp(cdir->d_name, ".git")) - return path_none; - strbuf_setlen(path, baselen); - strbuf_addstr(path, cdir->d_name); - if (simplify_away(path->buf, path->len, pathspec)) - return path_none; - - return treat_one_path(dir, untracked, istate, path, baselen, pathspec, - cdir->d_type); -} - static void add_untracked(struct untracked_cache_dir *dir, const char *name) { if (!dir) From patchwork Wed Jan 29 22:03:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11356921 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7B38E921 for ; Wed, 29 Jan 2020 22:03:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5AF3E2071E for ; Wed, 29 Jan 2020 22:03:51 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="bRarUl44" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726735AbgA2WDu (ORCPT ); 
Wed, 29 Jan 2020 17:03:50 -0500 Received: from mail-wr1-f65.google.com ([209.85.221.65]:37011 "EHLO mail-wr1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726339AbgA2WDt (ORCPT ); Wed, 29 Jan 2020 17:03:49 -0500 Received: by mail-wr1-f65.google.com with SMTP id w15so1416801wru.4 for ; Wed, 29 Jan 2020 14:03:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=9gJy8rO8rX3RtYiNe+N6NAtJQNv32fz37U/5D9/Csog=; b=bRarUl44iOptLWY4ChH6DQH1AAaTIy3n4YXAKZuy4VA15hcKCOfdSp2BuoYsIReicT HDfL7Z0kzbaJCIQPKqdkG+74tb7HHjoIBI+ZAnAvlHa3Vd1CfjUI+RVOOtRQV+LWQ5LQ PIg3EtXYv5ixB9k2jCgk9xx6fED+7iDD1Nt6AOOMDmZMf7w3aSRsVdeZhtHi3jmyZDZq 4HoUt3Qt/AVfbwIf1s43FfsEaiDynbbYycmTGL52DGGviTe2kL0C65YhLJ4hL3I3YDhC NNKe1rovzal4h1+3XNOi+6QYxK/whqibixVutmjSCySAvWJFR4htlBy3nO6xWU4IWpml Jdmw== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=9gJy8rO8rX3RtYiNe+N6NAtJQNv32fz37U/5D9/Csog=; b=Dfk14agYuohdhrnP6GEKuBcq6jLNCiWURqpOzdVttXkeXjFWcySleuopDFSWw4ZnrP mA1N0QtUraliuIgGwWqyLaJWpT48QhlhFHqrMglc/dGDkjT2CeNEGI1EJpDK0+0IIzKV FWMC1wisAOQUJ5C44AB2KZPbvOFo1AM+eNC+VcSj/sC217Nv+jEpMxV0Hz976OEOEkl4 JhZvqFTHaemupNk9by63QjqbtRJfBdDwzMy5b3D1tjMovVGIGPOZfqaWyGuZROrWovWU dfUh7a07nMDLHL63LJ9Xf8Tjs6J5UfSPPOHtBOJxO0+/bTQryKtkUjVuylhsXWj4mwBo XSIQ== X-Gm-Message-State: APjAAAW3icXOU3Ke5hS1ZsaRJU3pCzXboo804hy07A6c8GSp3cr60zaq kLenMBZJiQQxWp8WycB770j4+33G X-Google-Smtp-Source: APXvYqx/imBl1nW9t1IOyoN09dRxiaVTEnhtBPR4eCUU/2STB6xk65o7Ct5NH/yYSjhJ/P/R2kmPgg== X-Received: by 2002:a5d:488c:: with SMTP id g12mr1041237wrq.67.1580335427167; Wed, 29 Jan 2020 14:03:47 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id a1sm4510986wrr.80.2020.01.29.14.03.46 
(version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jan 2020 14:03:46 -0800 (PST) Message-Id: <2ceb64ae61eccd662922f6156e00d4044bef515c.1580335424.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Elijah Newren via GitGitGadget" Date: Wed, 29 Jan 2020 22:03:39 +0000 Subject: [PATCH 2/6] dir: fix broken comment Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Martin Melka , SZEDER =?utf-8?b?R8OhYm9y?= , Samuel Lijin , =?utf-8?b?Tmd1eeG7hW4gVGjDoWkgTmfhu41j?= Duy , Elijah Newren , Elijah Newren Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren Signed-off-by: Elijah Newren --- dir.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/dir.c b/dir.c index 68c56aeddb..c358158f55 100644 --- a/dir.c +++ b/dir.c @@ -2259,7 +2259,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir, add_untracked(untracked, path.buf + baselen); break; } - /* skip the dir_add_* part */ + /* skip the add_path_to_appropriate_result_list() */ continue; } From patchwork Wed Jan 29 22:03:40 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11356927 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 08CCA14B4 for ; Wed, 29 Jan 2020 22:03:55 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id D011620716 for ; Wed, 29 Jan 2020 22:03:54 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="CPxU3cX1" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726762AbgA2WDx (ORCPT ); Wed, 29 Jan 2020 17:03:53 -0500 Received: from mail-wr1-f67.google.com ([209.85.221.67]:39873 "EHLO 
mail-wr1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726222AbgA2WDv (ORCPT ); Wed, 29 Jan 2020 17:03:51 -0500 Received: by mail-wr1-f67.google.com with SMTP id y11so1392669wrt.6 for ; Wed, 29 Jan 2020 14:03:48 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=eOW7ZR7XOpzw/b1ufbziIWWepfolUPPVh6H8feOJQDQ=; b=CPxU3cX1jmCGf1Z0Gchka/UuMGBoaziIhhWbjPfqiXzDBdIvJOnaCisj1EN23zMrkN FNvBsLti64KfnvD0Cw6d+opyDMdYNITCOlaAHNf11Ake+B79XMBa0zoSht9V8qfOZ/R9 dwh3N9mVlhKYqtTyvmaW27hGzrVI3RasJfgCmHiz9YNK68Shky3TUMJ/DeTslF5Ji2xZ duByQjnsUZjNALjGAHER0UrvWoQwqbNrdoOxf8y2HXZGDXgJKtq0ZPR15IRWNMWwlfFy 8V+5rRsbK+MKnok+pMStrXmgYLge2QBBX0qmS/3bWZE96pDeijvSOfEqnf2638xPM3qM ju5Q== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=eOW7ZR7XOpzw/b1ufbziIWWepfolUPPVh6H8feOJQDQ=; b=AN6xiLelOWPkGY4o8PyQMHmy5d0ONY0ImaBG16s08xpt6My4qzDOixtWPkkPJazfW+ Ub8anitjNPUFBNr9UV3nPs0B/qdZEIj53OZlxngeGV6k4YwOPPyd+dLjVSnLmxC6r6up zxAmyzQc4wv4Rbtcu1YuVAr87bxknrxYgAAssyzvH7ikP5wjAbCTB+Pr39uzsk8P+2if HucA7p08x8PXYQO/Rcd+KTg9lvzMKWD9mR/NoT6mhI9vE/P+oQWXIcpgEGgIG4UO6eGI WSexSkvmjSVIsIxRQqh5Ur4kkxHdmotDrfFsF1KnUQAc/e4mTs9m4z1GsdZODZR7+npH ybGg== X-Gm-Message-State: APjAAAX96/Uvh6qjADXJAWo8IJMk8rzNtMrfZj9j+Ch942vWmISzoXNT WSOByDM6y5gooFYfi4ndyiLMySTq X-Google-Smtp-Source: APXvYqy99gdkyuUmZ4Q1sRnE7n94pFZhYLGAEcPwuL1tCp72d4iN8HEf8dNG+5cUDm/pZHkn6LphVA== X-Received: by 2002:a5d:6ac5:: with SMTP id u5mr1058650wrw.271.1580335427936; Wed, 29 Jan 2020 14:03:47 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id x132sm9251436wmg.0.2020.01.29.14.03.47 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jan 2020 14:03:47 -0800 (PST) 
Message-Id: In-Reply-To: References: From: "Elijah Newren via GitGitGadget" Date: Wed, 29 Jan 2020 22:03:40 +0000 Subject: [PATCH 3/6] dir: fix confusion based on variable tense Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Martin Melka , SZEDER =?utf-8?b?R8OhYm9y?= , Samuel Lijin , =?utf-8?b?Tmd1eeG7hW4gVGjDoWkgTmfhu41j?= Duy , Elijah Newren , Elijah Newren Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren Despite having contributed several fixes in this area, I have for months (years?) assumed that the "exclude" variable was a directive; this caused me to think of it as a different mode we operate in and left me confused as I tried to build up a mental model around why we'd need such a directive. I mostly tried to ignore it while focusing on the pieces I was trying to understand. Then I finally traced this variable all back to a call to is_excluded(), meaning it was actually functioning as an adjective. In particular, it was a checked property ("Does this path match a rule in .gitignore?"), rather than a mode passed in from the caller. Change the variable name to match the part of speech used by the function called to define it, which will hopefully make these bits of code slightly clearer to the next reader. 
Signed-off-by: Elijah Newren --- dir.c | 26 +++++++++++++------------- 1 file changed, 13 insertions(+), 13 deletions(-) diff --git a/dir.c b/dir.c index c358158f55..225f0bc082 100644 --- a/dir.c +++ b/dir.c @@ -1656,7 +1656,7 @@ static enum exist_status directory_exists_in_index(struct index_state *istate, static enum path_treatment treat_directory(struct dir_struct *dir, struct index_state *istate, struct untracked_cache_dir *untracked, - const char *dirname, int len, int baselen, int exclude, + const char *dirname, int len, int baselen, int excluded, const struct pathspec *pathspec) { int nested_repo = 0; @@ -1679,13 +1679,13 @@ static enum path_treatment treat_directory(struct dir_struct *dir, } if (nested_repo) return ((dir->flags & DIR_SKIP_NESTED_GIT) ? path_none : - (exclude ? path_excluded : path_untracked)); + (excluded ? path_excluded : path_untracked)); if (dir->flags & DIR_SHOW_OTHER_DIRECTORIES) break; - if (exclude && - (dir->flags & DIR_SHOW_IGNORED_TOO) && - (dir->flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING)) { + if (excluded && + (dir->flags & DIR_SHOW_IGNORED_TOO) && + (dir->flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING)) { /* * This is an excluded directory and we are @@ -1713,7 +1713,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, /* This is the "show_other_directories" case */ if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES)) - return exclude ? path_excluded : path_untracked; + return excluded ? path_excluded : path_untracked; untracked = lookup_untracked(dir->untracked, untracked, dirname + baselen, len - baselen); @@ -1723,7 +1723,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, * the directory contains any files. 
*/ return read_directory_recursive(dir, istate, dirname, len, - untracked, 1, exclude, pathspec); + untracked, 1, excluded, pathspec); } /* @@ -1904,7 +1904,7 @@ static enum path_treatment treat_path(struct dir_struct *dir, int baselen, const struct pathspec *pathspec) { - int has_path_in_index, dtype, exclude; + int has_path_in_index, dtype, excluded; enum path_treatment path_treatment; if (!cdir->d_name) @@ -1949,13 +1949,13 @@ static enum path_treatment treat_path(struct dir_struct *dir, (directory_exists_in_index(istate, path->buf, path->len) == index_nonexistent)) return path_none; - exclude = is_excluded(dir, istate, path->buf, &dtype); + excluded = is_excluded(dir, istate, path->buf, &dtype); /* * Excluded? If we don't explicitly want to show * ignored files, ignore it */ - if (exclude && !(dir->flags & (DIR_SHOW_IGNORED|DIR_SHOW_IGNORED_TOO))) + if (excluded && !(dir->flags & (DIR_SHOW_IGNORED|DIR_SHOW_IGNORED_TOO))) return path_excluded; switch (dtype) { @@ -1965,7 +1965,7 @@ static enum path_treatment treat_path(struct dir_struct *dir, strbuf_addch(path, '/'); path_treatment = treat_directory(dir, istate, untracked, path->buf, path->len, - baselen, exclude, pathspec); + baselen, excluded, pathspec); /* * If 1) we only want to return directories that * match an exclude pattern and 2) this directory does @@ -1974,7 +1974,7 @@ static enum path_treatment treat_path(struct dir_struct *dir, * recurse into this directory (instead of marking the * directory itself as an ignored path). */ - if (!exclude && + if (!excluded && path_treatment == path_excluded && (dir->flags & DIR_SHOW_IGNORED_TOO) && (dir->flags & DIR_SHOW_IGNORED_TOO_MODE_MATCHING)) @@ -1982,7 +1982,7 @@ static enum path_treatment treat_path(struct dir_struct *dir, return path_treatment; case DT_REG: case DT_LNK: - return exclude ? path_excluded : path_untracked; + return excluded ? 
path_excluded : path_untracked; } } From patchwork Wed Jan 29 22:03:41 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11356925 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6BDBF921 for ; Wed, 29 Jan 2020 22:03:53 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 486E620716 for ; Wed, 29 Jan 2020 22:03:53 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="jGs5d9Xi" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726757AbgA2WDw (ORCPT ); Wed, 29 Jan 2020 17:03:52 -0500 Received: from mail-wr1-f67.google.com ([209.85.221.67]:35045 "EHLO mail-wr1-f67.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726708AbgA2WDv (ORCPT ); Wed, 29 Jan 2020 17:03:51 -0500 Received: by mail-wr1-f67.google.com with SMTP id g17so1441997wro.2 for ; Wed, 29 Jan 2020 14:03:49 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=+EEP5cvZz+JJ7/eTYAgYq85d/6J7CRuOEJVck0LRXxw=; b=jGs5d9Xi6uGbIRnvXJjTAWDotS5W9x+C7Uy7alF3o5IAoLZTgNtB22mGjgkcruMDiY uxdMaycFjwrMR9XQOxKKyF4ZH78/BgissRzlnvaEaRQbbaufDT3OwiL41eOojyj2WW4C xwSxKLiaCYhHmSMGbjuq/sDB13EidlyCUWO7CXAjpwIgTbU4+oN8lFuiydp3j4cSThaA qTS9ybnw0fSa2KYhkKSm15MTSOox5ZDE6iHGj9sUQ/oPwqArggAmhet8lS45NyrWY+Zv 3z0ctQCbRnev1ygvQbu8xmeaeHrK6lHB2tLWo7QzbNo99Wp1b2GB4TXbR1hZed+GqDne xbvQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date 
:subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=+EEP5cvZz+JJ7/eTYAgYq85d/6J7CRuOEJVck0LRXxw=; b=FDm8oQZnHkanQ79l4W7F3TNCjlo7va+UOrolc5BV8FFLE4c9CryJzzoxjq/IP6xy+y q2mqXdfZxRmKLwpL2MFfYah7waPWnnfK+sHn48mU00gdF+WD+fmWCIMqnCVp7xCqNuCS otlZ4M24V0SNjmXkpQHp8dyfZXZ8edMOGt8pY2H/aolFBl8SxuOBsWBWdSVLmCES8f82 kg52uz39hb5adsvpScOwpM5YOYvIOcZoTStk81qe6UAo2IMHwIeymlUVv1Vh7aELl5QX FeDNm4/XI2j2gteePZZpsngJZiGiPTPcM5NAs6bHgYB6eUdLUjkK2Z4APBSbu0QU6drZ uGtA== X-Gm-Message-State: APjAAAW993UF8FiHLTTVnqlpEVUO8Paaktw85zz6tFI0X/5WhhS3IrCb Kr5AZXO152z71NDWnT2piBT7qOjx X-Google-Smtp-Source: APXvYqzdgPaH5PtF5xhg3x223vBiNh5Y4Y+nRJNVAxjASSnd4F7o9egN+xMrcIb+XvF5G6ffLZAMRw== X-Received: by 2002:adf:f581:: with SMTP id f1mr1048018wro.264.1580335428864; Wed, 29 Jan 2020 14:03:48 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id s22sm3711705wmh.4.2020.01.29.14.03.48 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jan 2020 14:03:48 -0800 (PST) Message-Id: <3b2ec5eaf65c9fe44c4337a4cc2fc3dae6203d54.1580335424.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Elijah Newren via GitGitGadget" Date: Wed, 29 Jan 2020 22:03:41 +0000 Subject: [PATCH 4/6] dir: move setting of nested_repo next to its actual usage Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Martin Melka , SZEDER =?utf-8?b?R8OhYm9y?= , Samuel Lijin , =?utf-8?b?Tmd1eeG7hW4gVGjDoWkgTmfhu41j?= Duy , Elijah Newren , Elijah Newren Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren Signed-off-by: Elijah Newren Signed-off-by: Derrick Stolee --- dir.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/dir.c b/dir.c index 225f0bc082..ef3307718a 100644 --- a/dir.c +++ b/dir.c @@ -1659,7 +1659,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, const char *dirname, int len, int baselen, int excluded, const struct pathspec 
*pathspec) { - int nested_repo = 0; + int nested_repo; /* The "len-1" is to strip the final '/' */ switch (directory_exists_in_index(istate, dirname, len-1)) { @@ -1670,6 +1670,7 @@ static enum path_treatment treat_directory(struct dir_struct *dir, return path_none; case index_nonexistent: + nested_repo = 0; if ((dir->flags & DIR_SKIP_NESTED_GIT) || !(dir->flags & DIR_NO_GITLINKS)) { struct strbuf sb = STRBUF_INIT; From patchwork Wed Jan 29 22:03:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11356931 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AD58B921 for ; Wed, 29 Jan 2020 22:03:57 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 78B7620716 for ; Wed, 29 Jan 2020 22:03:57 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="VXfYd5La" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726767AbgA2WDz (ORCPT ); Wed, 29 Jan 2020 17:03:55 -0500 Received: from mail-wr1-f68.google.com ([209.85.221.68]:40412 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726339AbgA2WDx (ORCPT ); Wed, 29 Jan 2020 17:03:53 -0500 Received: by mail-wr1-f68.google.com with SMTP id j104so1386011wrj.7 for ; Wed, 29 Jan 2020 14:03:50 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=PBqD72JD96S1oUXJ8dD8iNE1T6KRFyoZKwQmq0BjZ1Y=; b=VXfYd5La+xQRAgOO6Nh9DjZ2HtIecP1gNBdZKnxHtnv+O/Za5CFAq664UQufgTRby3 8oBp04c5KlZjLoHfDD6f6SiGfsKzqeIoiHD8BxJmv4nkWtNXEr6BWKWfbj6pXq6Gh1+q 
h1srtSft1B5p5aFqwHDUrA7B5Lu3ruV8JdRPiJxZtqEEzurmB1nB7WZcIfsvlCVFDZrh Sw9eqo+5jCcuUNg636rHd6kxHjbtfSg0xGrBOXm/S2evWU8ghfHIrlpvNN0k4aTPT9Sr IlQ6cYRIHBBtge3512rOw/Id+mMVvxF0SWlxfddLPg3Lr4J4sm1+6kii31Q4+mSvadza f6bg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=PBqD72JD96S1oUXJ8dD8iNE1T6KRFyoZKwQmq0BjZ1Y=; b=SvMzW7QIiKkbRhbhm7mN1YyAyIDKNQHcxKV1KohtrWA4di/TMzWWOXeL8m1iImhMqQ FtBlBQNeNhoO7gLKxSDBOYPUQg/nVNgOKq+adteLsx1VYgJHVeS4nz/JB3gp2z3fvUWd uH6yjD5RhSexfb1UPzfR3eOWKGhfy+YtD4lywlyWiS0uaxKh7AJkgHEsP4znpVtWWTcE dSq3Mp+qIA8U8/59gwQERcdUqjZGzGnwi+0B4geqeTfgZUFBu0SMswgTxoyvQPujKv3V VvNEhYruByXAjzwqgzxRZIpDHxcCU9Fr5gS98faNlN64w6Bm+X7bo8dG/B9Ehb49Bj17 F+qg== X-Gm-Message-State: APjAAAWbXE+in450j7AjWATRfUH8kjlpb4BidXOLS2xLTNMxkKaH2fk6 T/GiibhL/xirkvfPHH/6zFAxgFLX X-Google-Smtp-Source: APXvYqw+xcpuhTbco8+n5LkAu3pqdidvocL2yXiC7drXarvxb+yGDHT9XEH4uHAOBPvvV2H8SAKZLg== X-Received: by 2002:adf:ee01:: with SMTP id y1mr1098402wrn.152.1580335429693; Wed, 29 Jan 2020 14:03:49 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id z19sm3600541wmi.43.2020.01.29.14.03.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jan 2020 14:03:49 -0800 (PST) Message-Id: <40b378e7adbbff5ecfd95fd888465fd0f99791c8.1580335424.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Elijah Newren via GitGitGadget" Date: Wed, 29 Jan 2020 22:03:42 +0000 Subject: [PATCH 5/6] dir: replace exponential algorithm with a linear one Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Martin Melka , SZEDER =?utf-8?b?R8OhYm9y?= , Samuel Lijin , =?utf-8?b?Tmd1eeG7hW4gVGjDoWkgTmfhu41j?= Duy , Elijah Newren , Elijah Newren Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren dir's read_directory_recursive() naturally operates 
recursively in order to walk the directory tree. The treatment of
directories is sometimes weird because there are so many different
permutations of how to handle them. Some examples:

  * 'git ls-files -o --directory' only needs to know that a directory
    itself is untracked; it doesn't need to recurse into it to see what
    is underneath.

  * 'git status' needs to recurse into an untracked directory, but only
    to determine whether or not it is empty. If there are no files
    underneath, the directory itself will be omitted from the output.
    If it is not empty, only the directory will be listed.

  * 'git status --ignored' needs to recurse into untracked directories
    and report all the ignored entries and then report the directory as
    untracked -- UNLESS all the entries under the directory are
    ignored, in which case we don't print any of the entries under the
    directory and just report the directory itself as ignored.

  * For 'git clean', we may need to recurse into a directory that
    doesn't match any specified pathspecs, if it's possible that there
    is an entry underneath the directory that can match one of the
    pathspecs. In such a case, we need to be careful to omit the
    directory itself from the list of paths (see e.g. commit
    404ebceda01c ("dir: also check directories for matching pathspecs",
    2019-09-17)).

Part of the tension noted above is that the treatment of a directory
can change based on the files within it, and based on the various
settings in dir->flags. Trying to keep this in mind while reading over
the code, it is easy to (accidentally?) think in terms of
"treat_directory() tells us what to do with a directory, and
read_directory_recursive() is the thing that recurses". Since we need
to look into a directory to know how to treat it, though, it was quite
easy to decide to recurse into the directory from treat_directory() by
adding a read_directory_recursive() call.
Adding such a call would actually be fine, IF we didn't also cause
read_directory_recursive() to recurse into the same directory again.
Unfortunately, commit df5bcdf83aeb ("dir: recurse into untracked dirs
for ignored files", 2017-05-18) added exactly such a case to the code,
meaning we'd have two calls to read_directory_recursive() for an
untracked directory. So, if we had a file named

    one/two/three/four/five/somefile.txt

and nothing in one/ was tracked, then 'git status --ignored' would call
read_directory_recursive() twice on the directory 'one/', and each of
those would call read_directory_recursive() twice on the directory
'one/two/', and so on until read_directory_recursive() was called 2^5
times for 'one/two/three/four/five/'.

Avoid calling read_directory_recursive() twice per level by moving a
lot of the special logic into treat_directory().

Since dir.c is somewhat complex, extra cruft built up around this over
time. While trying to unravel it, I noticed several instances where the
first call to read_directory_recursive() would return e.g.
path_untracked for some directory and a later one would return e.g.
path_none, and the code relied on the side-effect of the first adding
untracked entries to dir->entries in order to get the correct output
despite the supposed override in return value by the later call.

I am somewhat concerned that there are still bugs and maybe even
testcases with the wrong expectation. I have tried to carefully
document treat_directory() since it becomes more complex after this
change (though much of this complexity came from elsewhere that
probably deserved better comments to begin with). However, much of my
work felt more like a game of whack-a-mole while attempting to make the
code match the existing regression tests than an attempt to create an
implementation that matched some clear design.
That seems wrong to me, but the rules of existing behavior had so many
special cases that I had a hard time coming up with some overarching rules
about what correct behavior is for all cases, forcing me to hope that the
regression tests are correct and sufficient. (I'll note that this turmoil
makes working with dir.c extremely unpleasant for me; I keep hoping it'll
get better, but it never seems to.)

However, on the positive side, it does make the code much faster. For the
following simple shell loop in an empty repository:

    for depth in $(seq 10 25)
    do
      dirs=$(for i in $(seq 1 $depth) ; do printf 'dir/' ; done)
      rm -rf dir
      mkdir -p $dirs
      >$dirs/untracked-file
      /usr/bin/time --format="$depth: %e" git status --ignored >/dev/null
    done

I saw the following timings, in seconds (note that the numbers are a little
noisy from run-to-run, but the trend is very clear with every run):

    10: 0.03
    11: 0.05
    12: 0.08
    13: 0.19
    14: 0.29
    15: 0.50
    16: 1.05
    17: 2.11
    18: 4.11
    19: 8.60
    20: 17.55
    21: 33.87
    22: 68.71
    23: 140.05
    24: 274.45
    25: 551.15

After this fix, those drop to:

    10: 0.00
    11: 0.00
    12: 0.00
    13: 0.00
    14: 0.00
    15: 0.00
    16: 0.00
    17: 0.00
    18: 0.00
    19: 0.00
    20: 0.00
    21: 0.00
    22: 0.00
    23: 0.00
    24: 0.00
    25: 0.00

In fact, it isn't until a depth of 190 nested directories that it sometimes
starts reporting a time of 0.01 seconds and doesn't consistently report
0.01 seconds until there are 240 nested directories. The previous code
would have taken

    17.55 * 2^220 / (60*60*24*365) = 9.4 * 10^59 YEARS

to have completed the 240 nested directories case. It's not often that you
get to speed something up by a factor of 3*10^69.

WARNING: This change breaks t7063. I don't know whether that is to be
expected (I now intentionally visit untracked directories differently so
naturally the untracked cache should change), or if I've broken something.
I'm hoping to get an untracked cache expert to chime in...
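As a rough illustration of the doubling described above, here is a small
standalone model in shell. It is only a sketch and not code from dir.c: the
names visits_old and visits_new are hypothetical, and the model assumes a
single chain of nested directories where the old code makes two recursive
calls per level and the new code makes one.

```shell
#!/bin/sh
# Hypothetical model of the traversal cost; none of this is git code.
# $1 is the depth of the directory of interest. Depth 0 is the top-level
# directory, which is read exactly once.

# Old behavior: each directory is recursed into twice by its parent, so the
# number of reads doubles with every level.
visits_old() {
	if [ "$1" -eq 0 ]; then
		echo 1
		return
	fi
	echo $(( $(visits_old $(($1 - 1))) + $(visits_old $(($1 - 1))) ))
}

# New behavior: each directory is recursed into once by its parent.
visits_new() {
	if [ "$1" -eq 0 ]; then
		echo 1
		return
	fi
	visits_new $(($1 - 1))
}

visits_old 5	# prints 32, i.e. 2^5 reads of one/two/three/four/five/
visits_new 5	# prints 1
```

Note that visits_old really performs the exponential number of calls it is
counting, so it is best kept to small depths.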
Signed-off-by: Elijah Newren --- dir.c | 151 ++++++++++++++++++++++++++++++++++++++++------------------ 1 file changed, 105 insertions(+), 46 deletions(-) diff --git a/dir.c b/dir.c index ef3307718a..aaf038a9c4 100644 --- a/dir.c +++ b/dir.c @@ -1659,7 +1659,13 @@ static enum path_treatment treat_directory(struct dir_struct *dir, const char *dirname, int len, int baselen, int excluded, const struct pathspec *pathspec) { - int nested_repo; + /* + * WARNING: From this function, you can return path_recurse or you + * can call read_directory_recursive() (or neither), but + * you CAN'T DO BOTH. + */ + enum path_treatment state; + int nested_repo, old_ignored_nr, stop_early; /* The "len-1" is to strip the final '/' */ switch (directory_exists_in_index(istate, dirname, len-1)) { @@ -1713,18 +1719,101 @@ static enum path_treatment treat_directory(struct dir_struct *dir, /* This is the "show_other_directories" case */ - if (!(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES)) + /* + * We only need to recurse into untracked/ignored directories if + * either of the following bits is set: + * - DIR_SHOW_IGNORED_TOO (because then we need to determine if + * there are ignored directories below) + * - DIR_HIDE_EMPTY_DIRECTORIES (because we have to determine if + * the directory is empty) + */ + if (!(dir->flags & (DIR_SHOW_IGNORED_TOO | DIR_HIDE_EMPTY_DIRECTORIES))) return excluded ? path_excluded : path_untracked; + /* + * If we only want to determine if dirname is empty, then we can + * stop at the first file we find underneath that directory rather + * than continuing to recurse beyond it. If DIR_SHOW_IGNORED_TOO + * is set, then we want MORE than just determining if dirname is + * empty. + */ + stop_early = ((dir->flags & DIR_HIDE_EMPTY_DIRECTORIES) && + !(dir->flags & DIR_SHOW_IGNORED_TOO)); + + /* + * If /every/ file within an untracked directory is ignored, then + * we want to treat the directory as ignored (for e.g. 
status + * --porcelain), without listing the individual ignored files + * underneath. To do so, we'll save the current ignored_nr, and + * pop all the ones added after it if it turns out the entire + * directory is ignored. + */ + old_ignored_nr = dir->ignored_nr; + + /* Actually recurse into dirname now, we'll fixup the state later. */ untracked = lookup_untracked(dir->untracked, untracked, dirname + baselen, len - baselen); + state = read_directory_recursive(dir, istate, dirname, len, untracked, + stop_early, stop_early, pathspec); + + /* There are a variety of reasons we may need to fixup the state... */ + if (state == path_excluded) { + int i; + + /* + * When stop_early is set, read_directory_recursive() will + * never return path_untracked regardless of whether + * underlying paths were untracked or ignored (because + * returning early means it excluded some paths, or + * something like that -- see commit 5aaa7fd39aaf ("Improve + * performance of git status --ignored", 2017-09-18)). + * However, we're not really concerned with the status of + * files under the directory, we just wanted to know + * whether the directory was empty (state == path_none) or + * not (state == path_excluded), and if not, we'd return + * our original status based on whether the untracked + * directory matched an exclusion pattern. + */ + if (stop_early) + state = excluded ? path_excluded : path_untracked; + + else { + /* + * When + * !stop_early && state == path_excluded + * then all paths under dirname were ignored. For + * this case, git status --porcelain wants to just + * list the directory itself as ignored and not + * list the individual paths underneath. Remove + * the individual paths underneath. + */ + for (i = old_ignored_nr + 1; i<dir->ignored_nr; ++i) + free(dir->ignored[i]); + dir->ignored_nr = old_ignored_nr; + } + } /* - * If this is an excluded directory, then we only need to check if - * the directory contains any files.
+ * If there is nothing under the current directory and we are not + * hiding empty directories, then we need to report on the + * untracked or ignored status of the directory itself. */ - return read_directory_recursive(dir, istate, dirname, len, - untracked, 1, excluded, pathspec); + if (state == path_none && !(dir->flags & DIR_HIDE_EMPTY_DIRECTORIES)) + state = excluded ? path_excluded : path_untracked; + + /* + * We can recurse into untracked directories that don't match any + * of the given pathspecs when some file underneath the directory + * might match one of the pathspecs. If so, we should make sure + * to note that the directory itself did not match. + */ + if (pathspec && + !match_pathspec(istate, pathspec, dirname, len, + 0 /* prefix */, NULL, + 0 /* do NOT special case dirs */)) + state = path_none; + + return state; } /* @@ -1872,6 +1961,11 @@ static enum path_treatment treat_path_fast(struct dir_struct *dir, int baselen, const struct pathspec *pathspec) { + /* + * WARNING: From this function, you can return path_recurse or you + * can call read_directory_recursive() (or neither), but + * you CAN'T DO BOTH. + */ strbuf_setlen(path, baselen); if (!cdir->ucd) { strbuf_addstr(path, cdir->file); @@ -2177,14 +2271,10 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir, int stop_at_first_file, const struct pathspec *pathspec) { /* - * WARNING WARNING WARNING: - * - * Any updates to the traversal logic here may need corresponding - * updates in treat_leading_path(). See the commit message for the - * commit adding this warning as well as the commit preceding it - * for details. + * WARNING: Do NOT call recurse unless path_recurse is returned + * from treat_path(). Recursing on any other return value + * results in exponential slowdown. 
*/ - struct cached_dir cdir; enum path_treatment state, subdir_state, dir_state = path_none; struct strbuf path = STRBUF_INIT; @@ -2206,13 +2296,7 @@ static enum path_treatment read_directory_recursive(struct dir_struct *dir, dir_state = state; /* recurse into subdir if instructed by treat_path */ - if ((state == path_recurse) || - ((state == path_untracked) && - (resolve_dtype(cdir.d_type, istate, path.buf, path.len) == DT_DIR) && - ((dir->flags & DIR_SHOW_IGNORED_TOO) || - (pathspec && - do_match_pathspec(istate, pathspec, path.buf, path.len, - baselen, NULL, DO_MATCH_LEADING_PATHSPEC) == MATCHED_RECURSIVELY_LEADING_PATHSPEC)))) { + if (state == path_recurse) { struct untracked_cache_dir *ud; ud = lookup_untracked(dir->untracked, untracked, path.buf + baselen, @@ -2296,15 +2380,6 @@ static int treat_leading_path(struct dir_struct *dir, const char *path, int len, const struct pathspec *pathspec) { - /* - * WARNING WARNING WARNING: - * - * Any updates to the traversal logic here may need corresponding - * updates in read_directory_recursive(). See 777b420347 (dir: - * synchronize treat_leading_path() and read_directory_recursive(), - * 2019-12-19) and its parent commit for details. 
- */ - struct strbuf sb = STRBUF_INIT; struct strbuf subdir = STRBUF_INIT; int prevlen, baselen; @@ -2355,23 +2430,7 @@ static int treat_leading_path(struct dir_struct *dir, strbuf_reset(&subdir); strbuf_add(&subdir, path+prevlen, baselen-prevlen); cdir.d_name = subdir.buf; - state = treat_path(dir, NULL, &cdir, istate, &sb, prevlen, - pathspec); - if (state == path_untracked && - resolve_dtype(cdir.d_type, istate, sb.buf, sb.len) == DT_DIR && - (dir->flags & DIR_SHOW_IGNORED_TOO || - do_match_pathspec(istate, pathspec, sb.buf, sb.len, - baselen, NULL, DO_MATCH_LEADING_PATHSPEC) == MATCHED_RECURSIVELY_LEADING_PATHSPEC)) { - if (!match_pathspec(istate, pathspec, sb.buf, sb.len, - 0 /* prefix */, NULL, - 0 /* do NOT special case dirs */)) - state = path_none; - add_path_to_appropriate_result_list(dir, NULL, &cdir, - istate, - &sb, baselen, - pathspec, state); - state = path_recurse; - } + state = treat_path(dir, NULL, &cdir, istate, &sb, prevlen, pathspec); if (state != path_recurse) break; /* do not recurse into it */ From patchwork Wed Jan 29 22:03:43 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Linus Arver via GitGitGadget X-Patchwork-Id: 11356929 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A18EB921 for ; Wed, 29 Jan 2020 22:03:56 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 755482071E for ; Wed, 29 Jan 2020 22:03:56 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=pass (2048-bit key) header.d=gmail.com header.i=@gmail.com header.b="awPYU07t" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726770AbgA2WDz (ORCPT ); Wed, 29 Jan 2020 17:03:55 -0500 Received: from mail-wm1-f67.google.com ([209.85.128.67]:36161 "EHLO mail-wm1-f67.google.com" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726736AbgA2WDw (ORCPT ); Wed, 29 Jan 2020 17:03:52 -0500 Received: by mail-wm1-f67.google.com with SMTP id p17so1764941wma.1 for ; Wed, 29 Jan 2020 14:03:51 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=message-id:in-reply-to:references:from:date:subject:fcc :content-transfer-encoding:mime-version:to:cc; bh=7Vq73R40espVKmFSR4Gemv2ZRIhs5+Vas2Or/9uunBY=; b=awPYU07tTgyfjesUMxnzVlZeQuuKrZXiYgWFlsZOy36AtdHRcHOAfSV2kQndE4eL1x tGxj7Ftjsp1NpRJDaeKgZe/BhEC/Bq3wNTskUZxJnRsEwEm2iLh0W6/1aAzarkFhCglE 13VTKOYxoyMHjsSVOExRfREOKODRdOhIn4eeO7E7qanpE3h+DDUB/n2kNa6j/bQSWGkF tiR6zwLmWe/COkMpDvtooPfk4YLKHvVl0x7rpKuugxTnQK1+HWWM9mgiG7CVQ2nVHEdV 9uouw+kBe3jbBUwcSBcOnohV3B3/N0dVfi3cCkI/Okuy7KQHx4FHJzzQvir7NWPPgr+O naNg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:message-id:in-reply-to:references:from:date :subject:fcc:content-transfer-encoding:mime-version:to:cc; bh=7Vq73R40espVKmFSR4Gemv2ZRIhs5+Vas2Or/9uunBY=; b=dBZ0K2QAatiXR0uftluqlbagOZGzoRTzpvrSbghxT3lzxSuRcb4Tb9I5+Ib8GYWy5+ Ow2gCuX/A+9vnywJJ5hMPMUPOXuyxRmy72lHT02Oady43B/zBLuQGAq92N65xx4+S/ae S5j2VIjYFNJLfG8vhKxcJtiuW/UaqwBYMvj6dvB8/qThMggv5svhnu5g4EW9y+CFCNGL CKjmJDdUP+linMBRXjt4spih/DK77m+9WCjtADV+Qd5oJyYil0qtE1RFTSq8WaQf0XK4 VAZ91yQWEz8SjoZmMvEL9l34gq+Be9D/WF8O0rDPUGDZE+5qnkhVUAMN8d1jqB33+nuc P5/w== X-Gm-Message-State: APjAAAUkgWhTX6XoEIBbnvyLa8Q6FL7WOs0ffdIqQTNIkMCtU3edDjvp qNUjpkdklQkUUia8CfvhUfoYPlPz X-Google-Smtp-Source: APXvYqxp/KwF5pMeNa0FyOPD4P1dZXgrlmaMkuv5wIn1zp08ekVqQY9ZD3k7cVJ6eZvOlVvZQrn+WQ== X-Received: by 2002:a7b:c8d3:: with SMTP id f19mr1308162wml.26.1580335430333; Wed, 29 Jan 2020 14:03:50 -0800 (PST) Received: from [127.0.0.1] ([13.74.141.28]) by smtp.gmail.com with ESMTPSA id o187sm3814880wme.36.2020.01.29.14.03.49 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 29 Jan 2020 14:03:49 -0800 (PST) Message-Id: 
<7fb8063541248b7b91d9559dbca1445c634a87f1.1580335424.git.gitgitgadget@gmail.com> In-Reply-To: References: From: "Elijah Newren via GitGitGadget" Date: Wed, 29 Jan 2020 22:03:43 +0000 Subject: [PATCH 6/6] t7063: blindly accept diffs Fcc: Sent MIME-Version: 1.0 To: git@vger.kernel.org Cc: Martin Melka , SZEDER =?utf-8?b?R8OhYm9y?= , Samuel Lijin , =?utf-8?b?Tmd1eeG7hW4gVGjDoWkgTmfhu41j?= Duy , Elijah Newren , Elijah Newren Sender: git-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: git@vger.kernel.org From: Elijah Newren Assuming that the changes I made in the last commit to drastically modify how and when and especially how frequently untracked paths are visited should result in changes to the untracked-cache, this commit simply updates the t7063 testcases to match what the code now reports. If this is correct, this commit should be squashed into the previous one. It'd be nice if I could get an untracked-cache expert to comment on this... Signed-off-by: Elijah Newren --- t/t7063-status-untracked-cache.sh | 50 ++++++++++++------------------- 1 file changed, 19 insertions(+), 31 deletions(-) diff --git a/t/t7063-status-untracked-cache.sh b/t/t7063-status-untracked-cache.sh index 190ae149cf..c1b0fd0540 100755 --- a/t/t7063-status-untracked-cache.sh +++ b/t/t7063-status-untracked-cache.sh @@ -85,9 +85,7 @@ dtwo/ three /done/ 0000000000000000000000000000000000000000 recurse valid /dthree/ 0000000000000000000000000000000000000000 recurse check_only valid -three /dtwo/ 0000000000000000000000000000000000000000 recurse check_only valid -two EOF test_expect_success 'status first time (empty cache)' ' @@ -140,8 +138,6 @@ test_expect_success 'modify in root directory, one dir invalidation' ' A done/one A one A two -?? dthree/ -?? dtwo/ ?? four ?? 
three EOF @@ -164,15 +160,11 @@ core.excludesfile 0000000000000000000000000000000000000000 exclude_per_dir .gitignore flags 00000006 / 0000000000000000000000000000000000000000 recurse valid -dthree/ -dtwo/ four three /done/ 0000000000000000000000000000000000000000 recurse valid /dthree/ 0000000000000000000000000000000000000000 recurse check_only valid -three /dtwo/ 0000000000000000000000000000000000000000 recurse check_only valid -two EOF test_cmp ../expect ../actual ' @@ -217,9 +209,7 @@ dtwo/ three /done/ 0000000000000000000000000000000000000000 recurse valid /dthree/ 0000000000000000000000000000000000000000 recurse check_only valid -three /dtwo/ 0000000000000000000000000000000000000000 recurse check_only valid -two EOF test_cmp ../expect ../actual ' @@ -235,6 +225,7 @@ A done/one A one A two ?? .gitignore +?? dthree/ ?? dtwo/ EOF test_cmp ../status.expect ../actual && @@ -256,11 +247,11 @@ exclude_per_dir .gitignore flags 00000006 / e6fcc8f2ee31bae321d66afd183fcb7237afae6e recurse valid .gitignore +dthree/ dtwo/ /done/ 0000000000000000000000000000000000000000 recurse valid /dthree/ 0000000000000000000000000000000000000000 recurse check_only valid /dtwo/ 0000000000000000000000000000000000000000 recurse check_only valid -two EOF test_cmp ../expect ../actual ' @@ -277,7 +268,6 @@ flags 00000006 /done/ 0000000000000000000000000000000000000000 recurse valid /dthree/ 0000000000000000000000000000000000000000 recurse check_only valid /dtwo/ 0000000000000000000000000000000000000000 recurse check_only valid -two EOF test_cmp ../expect ../actual ' @@ -290,7 +280,6 @@ test_expect_success 'status after the move' ' A done/one A one ?? .gitignore -?? dtwo/ ?? 
two EOF test_cmp ../status.expect ../actual && @@ -312,12 +301,10 @@ exclude_per_dir .gitignore flags 00000006 / e6fcc8f2ee31bae321d66afd183fcb7237afae6e recurse valid .gitignore -dtwo/ two /done/ 0000000000000000000000000000000000000000 recurse valid /dthree/ 0000000000000000000000000000000000000000 recurse check_only valid /dtwo/ 0000000000000000000000000000000000000000 recurse check_only valid -two EOF test_cmp ../expect ../actual ' @@ -334,7 +321,6 @@ flags 00000006 /done/ 0000000000000000000000000000000000000000 recurse valid /dthree/ 0000000000000000000000000000000000000000 recurse check_only valid /dtwo/ 0000000000000000000000000000000000000000 recurse check_only valid -two EOF test_cmp ../expect ../actual ' @@ -348,7 +334,6 @@ A done/one A one A two ?? .gitignore -?? dtwo/ EOF test_cmp ../status.expect ../actual && cat >../trace.expect <../actual && cat >../status.expect <../trace.expect <../trace.expect <../trace.expect <../trace.expect <../actual && + cat >../expect-from-test-dump <