Message ID | 20231104000209.916189-1-eantoranz@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] status: avoid reporting worktrees as "Untracked files" | expand |
On Fri, Nov 3, 2023 at 8:03 PM Edmundo Carmona Antoranz <eantoranz@gmail.com> wrote: > Given that worktrees are tracked in their own special fashion separately, > it makes sense to _not_ report them as "untracked". Also, when seeing the > directory of a worktree listed as Untracked, it might be tempting to try > to do operations (like 'git add') on them from the parent worktree which, > at the moment, will silently do nothing. > > With this patch, we check items against the list of worktrees to add > them into the untracked items list effectively hiding them. > > END OF PATCH > > Here are a few questions more inline with the "RFC" part of the patch. > > About UI > - Would it make more sense to separate them from Untracked files instead > of hiding them (perhaps add a --worktrees option to display them)? > - Follow-up if the previous answer is 'yes': List a worktree only if it > is not clean? > > About code: > - If keeping the idea/patch, Would it make more sense (performance-wise) to > fist check an item in the list of worktrees before checking it in the > index? In other words, reverse the conditions to add an item to the > untracked list? I have slightly mixed feelings about this idea since I'm sympathetic to the motivation, however, my knee-jerk reaction is that these really _are_ untracked considering that Git is a "content tracker" and worktrees are not project content. Git already has general mechanisms such as .git/info/exclude and .gitignore for suppressing certain untracked items, so introducing special-purpose code to suppress worktrees from being considered untracked may be a case of adding complexity for little gain. Moreover, although your personal workflow may be to create worktrees within your main directory: git worktree add new-feature other people use a workflow in which worktrees are created at other locations, such as making them siblings: git worktree add ../new-feature For the former case, it's easy enough to mention worktrees in .git/info/exclude, especially if you use a standard naming convention for your worktrees, in which case a single wildcard pattern may allow you to set it once and forget about it. For the latter workflow, the extra "is untracked" checking is simply wasteful. Having said all that, I think that someone may have recently floated the idea on the mailing list about suppressing dirty submodules from showing up as "dirty" in git-status. Although the underlying concepts and mechanisms are quite distinct (especially since a submodule _is_ content), perhaps there is some sort of analogy between worktrees and dirty submodules which invalidates my knee-jerk reaction. Also, I'm just one person responding without having put all that much thought into it. Others may feel differently. Regarding the patch itself... > diff --git a/wt-status.c b/wt-status.c > @@ -795,9 +796,12 @@ static void wt_status_collect_untracked(struct wt_status *s) > + worktrees = get_worktrees(); > + > for (i = 0; i < dir.nr; i++) { > struct dir_entry *ent = dir.entries[i]; > - if (index_name_is_other(istate, ent->name, ent->len)) > + if (index_name_is_other(istate, ent->name, ent->len) && > + !find_worktree_by_path(worktrees, ent->name)) > string_list_insert(&s->untracked, ent->name); > } This first-stab implementation unfortunately has worse than quadratic complexity, perhaps even cubic complexity since find_worktree_by_path() performs a linear scan through the worktree list. So, for each path in `dir`, it's performing a character-by-character string comparison with each path in `worktrees`. Worse, find_worktree_by_path() calls strbuf_realpath() which hits the filesystem for each path in `worktrees` each time it's called. So, a real (non-RFC) implementation would probably need to perform a preparatory step of creating a hash-table/set in which the keys are the realpath'd elements from `worktrees`, and then simply consult the hash-table/set for each path in `dir`. > @@ -809,6 +813,9 @@ static void wt_status_collect_untracked(struct wt_status *s) > + if (worktrees) > + free_worktrees(worktrees); Nit: At this point, we _know_ that `worktrees` is non-NULL, so free_worktrees() can be called unconditionally.
Edmundo Carmona Antoranz <eantoranz@gmail.com> writes: > Given that worktrees are tracked in their own special fashion separately, > it makes sense to _not_ report them as "untracked". My gut feeling is that a much better solution to the unstated problem you are solving is to make sure that "git worktree add" will complain and not allow adding a subdirectory of any existing working tree of a repository as a new worktree. What problem are you trying to solve? "git add foo" where "foo" is actually a different worktree of the repository would add it as a submodule that causes confusion? If that is the case, I think the right solution is not to get into such a state, i.e. not create a worktree of the repository inside a different worktree in the first place.
Hey, guys! Thanks Junio and Eric for sharing your thoughts. And I candidly thought this was going to be an "easy sell".... :-D On Sat, Nov 4, 2023 at 7:58 AM Junio C Hamano <gitster@pobox.com> wrote: > > What problem are you trying to solve? "git add foo" where "foo" is > actually a different worktree of the repository would add it as a > submodule that causes confusion? If that is the case, I think the > right solution is not to get into such a state, i.e. not create a > worktree of the repository inside a different worktree in the first > place. > I am not against the idea of creating worktrees outside of the repository... however, I like them to be _inside_ the repository. Am I the only one? IDK. I might be! It feels completely natural, if you ask me.... but that's just my opinion, I acknowledge that. While I was running a couple of quick tests to add more information about git behaviour with "my use case" I think I found something to work on so more RFCs might be on the way in the next few days or weeks. About adding an error message when 'git add' will skip doing something because it is working on a different worktree, I think it makes sense.... will probably work on that too. Thanks again! BR!
On Sat, Nov 11, 2023 at 4:22 AM Edmundo Carmona Antoranz <eantoranz@gmail.com> wrote: > On Sat, Nov 4, 2023 at 7:58 AM Junio C Hamano <gitster@pobox.com> wrote: > > What problem are you trying to solve? "git add foo" where "foo" is > > actually a different worktree of the repository would add it as a > > submodule that causes confusion? If that is the case, I think the > > right solution is not to get into such a state, i.e. not create a > > worktree of the repository inside a different worktree in the first > > place. > > > Hey, guys! Thanks Junio and Eric for sharing your thoughts. > > I am not against the idea of creating worktrees outside of the > repository... however, I like them to be _inside_ the repository. Am I > the only one? IDK. I might be! It feels completely natural, if you ask > me.... but that's just my opinion, I acknowledge that. I doubt you're the only one, but, based upon, list emails over the years, it seems that both in-main-tree and outside-main-tree (often sibling) worktrees are common. More recently, we've also heard from people who don't even have a main-worktree; instead, they hang their multiple worktrees off of a bare repository (which is an explicitly-supported use-case); i.e.: git clone --bare https://.../foobar.git git -C foobar.git worktree add worktree1 git -C foobar.git worktree add worktree2 ...
Eric Sunshine <sunshine@sunshineco.com> writes: > I doubt you're the only one, but, based upon, list emails over the > years, it seems that both in-main-tree and outside-main-tree (often > sibling) worktrees are common. More recently, we've also heard from > people who don't even have a main-worktree; instead, they hang their > multiple worktrees off of a bare repository (which is an > explicitly-supported use-case); i.e.: > > git clone --bare https://.../foobar.git > git -C foobar.git worktree add worktree1 > git -C foobar.git worktree add worktree2 > ... I am not sure why you brought in that layout in this discussion, because it places worktree1 and worktree2 next to each other, just like placing worktree1 and worktree2 next to the non-bare repository. git clone https://.../foobar.git foobar git -C foobar worktree add worktree1 git -C foobar worktree add worktree2 The layout to create worktrees attached to a bare repository and add them next to each other, and the same starting from a non-bare repository, share an important trait. They do not have an untracked and untrackable "cruft" in their working tree, unlike the crazy layout that places worktrees of the repository inside the working tree of the primary worktree as untracked subdirectories. Really, what is the advantage of doing so? It is not like the build recipe recorded in the primary worktree can work recursively on different branches that are checked out---worktree names and paths at which they are checked out are totally local matter, and the upstream project that supplies the build recipe would not know or care. Even worse, when the project wants to add a new subdirectory or a file, the name chosen for the subdirectory may happen to collide with the name of an untracked subdirectory you happened to have used (again, because the worktree names and locations are totally local matter, the upstream project are unaware of them and cannot avoid such name clashes even if they cared). You can imagine the confusion that happens to your next "git pull". Compared to such an insanity, attaching worktrees to a bare repository, so that all worktrees are equals and there is no "primary" worktree that you cannot remove, behave just as normal as a set of worktrees attached to a non-bare repository and sit outside the primary worktree, often as immediate siblings.
diff --git a/wt-status.c b/wt-status.c index 9f45bf6949..5fd1e6007a 100644 --- a/wt-status.c +++ b/wt-status.c @@ -775,6 +775,7 @@ static void wt_status_collect_untracked(struct wt_status *s) struct dir_struct dir = DIR_INIT; uint64_t t_begin = getnanotime(); struct index_state *istate = s->repo->index; + struct worktree **worktrees; if (!s->show_untracked_files) return; @@ -795,9 +796,12 @@ static void wt_status_collect_untracked(struct wt_status *s) fill_directory(&dir, istate, &s->pathspec); + worktrees = get_worktrees(); + for (i = 0; i < dir.nr; i++) { struct dir_entry *ent = dir.entries[i]; - if (index_name_is_other(istate, ent->name, ent->len)) + if (index_name_is_other(istate, ent->name, ent->len) && + !find_worktree_by_path(worktrees, ent->name)) string_list_insert(&s->untracked, ent->name); } @@ -809,6 +813,9 @@ static void wt_status_collect_untracked(struct wt_status *s) dir_clear(&dir); + if (worktrees) + free_worktrees(worktrees); + if (advice_enabled(ADVICE_STATUS_U_OPTION)) s->untracked_in_ms = (getnanotime() - t_begin) / 1000000; }