Message ID | pull.1041.v5.git.1644612979.gitgitgadget@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | Builtin FSMonitor Part 2 | expand |
Hi Jeff, On Fri, 11 Feb 2022, Jeff Hostetler via GitGitGadget wrote: > Here is V5 of Part 2 of my Builtin FSMonitor series. I apologize for the > delay since V4 that I submitted back in October. (Insert the usual $DayJob > excuse...) > > I have rebased this branch onto the current "master" branch. Thank you for your tireless work on this. I do see that it requires a ton of effort on your part and just wanted to let you know that I appreciate it very much! > In this version I removed the core.useBuiltinFSMonitor config setting and > instead extended the existing core.fsmonitor. I am somewhat surprised that a reviewer suggested this, as it breaks the common paradigm we use to allow using several Git versions on the same worktree. Imagine, for example, that you run a Git version that understands `core.fsmonitor=true` to imply the built-in FSMonitor, while you _also_ use an IDE that bundles a slightly older Git version that mistakes the `true` for meaning the executable `true` (which is not a FSMonitor at all, but its exit code suggests that everything's fine and dandy). The result would be that the IDE does not see _any_ updates anymore, but nothing would suggest that anything is wrong. We can probably warn users about this, and we can also work around the fact that Git for Windows already uses `core.useBuiltinFSMonitor`, but it makes me somewhat uneasy nevertheless. Thank you, Dscho
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: >> In this version I removed the core.useBuiltinFSMonitor config setting and >> instead extended the existing core.fsmonitor. > > I am somewhat surprised that a reviewer suggested this, as it breaks the > common paradigm we use to allow using several Git versions on the same > worktree. I do not think sharing the same repository with different versions of Git was considered as a possible source of problems during the review discussion. https://lore.kernel.org/git/74282d08-aaeb-0a1e-cad3-1de17d59b4d1@jeffhostetler.com/ I am not saying that we should not consider it; I am just stating the fact that there was nobody who raised as a potential issue during the discussion that lead to the cited message.
On 2/17/22 11:06 AM, Johannes Schindelin wrote: > Hi Jeff, > > On Fri, 11 Feb 2022, Jeff Hostetler via GitGitGadget wrote: > >> Here is V5 of Part 2 of my Builtin FSMonitor series. I apologize for the >> delay since V4 that I submitted back in October. (Insert the usual $DayJob >> excuse...) >> >> I have rebased this branch onto the current "master" branch. > > Thank you for your tireless work on this. I do see that it requires a ton > of effort on your part and just wanted to let you know that I appreciate > it very much! > >> In this version I removed the core.useBuiltinFSMonitor config setting and >> instead extended the existing core.fsmonitor. > > I am somewhat surprised that a reviewer suggested this, as it breaks the > common paradigm we use to allow using several Git versions on the same > worktree. > > Imagine, for example, that you run a Git version that understands > `core.fsmonitor=true` to imply the built-in FSMonitor, while you _also_ > use an IDE that bundles a slightly older Git version that mistakes the > `true` for meaning the executable `true` (which is not a FSMonitor at all, > but its exit code suggests that everything's fine and dandy). The result > would be that the IDE does not see _any_ updates anymore, but nothing > would suggest that anything is wrong. > > We can probably warn users about this, and we can also work around the > fact that Git for Windows already uses `core.useBuiltinFSMonitor`, but it > makes me somewhat uneasy nevertheless. > > Thank you, > Dscho > This is a valid concern and I should have thought to mention it when the suggestion came up on the list. Yes, extending `core.fsmonitor` to take a boolean or a path could confuse older clients (like ones bundled with an IDE, like VS). My assumption was that since we shipped `core.useBuiltinFSMonitor` in GFW with an experimental label, that normal users would not be using it at all and especially not from their IDEs, so it wouldn't matter. And experimental features are just that -- experimental and subject to change. But your point is valid -- if someone does have the odd hook called "true" or "1", they'll get an unexpected result. Jeff
Hi Junio, On Thu, 17 Feb 2022, Junio C Hamano wrote: > Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > > >> In this version I removed the core.useBuiltinFSMonitor config setting and > >> instead extended the existing core.fsmonitor. > > > > I am somewhat surprised that a reviewer suggested this, as it breaks the > > common paradigm we use to allow using several Git versions on the same > > worktree. > > I do not think sharing the same repository with different versions > of Git was considered as a possible source of problems during the > review discussion. > > https://lore.kernel.org/git/74282d08-aaeb-0a1e-cad3-1de17d59b4d1@jeffhostetler.com/ > > I am not saying that we should not consider it; I am just stating > the fact that there was nobody who raised as a potential issue > during the discussion that lead to the cited message. Just to make sure: I did not intend to insult anyone (and in hindsight I wish that I had made that clearer). Ciao, Dscho
Hi Jeff, On Tue, 22 Feb 2022, Jeff Hostetler wrote: > On 2/17/22 11:06 AM, Johannes Schindelin wrote: > > > On Fri, 11 Feb 2022, Jeff Hostetler via GitGitGadget wrote: > > > > > In this version I removed the core.useBuiltinFSMonitor config > > > setting and instead extended the existing core.fsmonitor. > > > > I am somewhat surprised that a reviewer suggested this, as it breaks > > the common paradigm we use to allow using several Git versions on the > > same worktree. > > > > Imagine, for example, that you run a Git version that understands > > `core.fsmonitor=true` to imply the built-in FSMonitor, while you > > _also_ use an IDE that bundles a slightly older Git version that > > mistakes the `true` for meaning the executable `true` (which is not a > > FSMonitor at all, but its exit code suggests that everything's fine > > and dandy). The result would be that the IDE does not see _any_ > > updates anymore, but nothing would suggest that anything is wrong. > > > > We can probably warn users about this, and we can also work around the > > fact that Git for Windows already uses `core.useBuiltinFSMonitor`, but > > it makes me somewhat uneasy nevertheless. > > This is a valid concern and I should have thought to mention it when > the suggestion came up on the list. Yes, extending `core.fsmonitor` to > take a boolean or a path could confuse older clients (like ones bundled > with an IDE, like VS). > > My assumption was that since we shipped `core.useBuiltinFSMonitor` > in GFW with an experimental label, that normal users would not be > using it at all and especially not from their IDEs, so it wouldn't > matter. And experimental features are just that -- experimental > and subject to change. > > But your point is valid -- if someone does have the odd hook called > "true" or "1", they'll get an unexpected result. I wondered about that for a while, and put that to a test last night. I set `core.fsmonitor = true` and then modified a file and ran `git status`. Something I did not expect happened: it picked up on the modified file! It also printed out a warning: warning: Empty last update token. This is the reason why it works: by default, current Git versions assume that the FSMonitor hook understands the FSMonitor protocol v2, which starts by the client sending out a token, receiving a new token and then the paths of the files/directories/symlinks to inspect. Since the program `true` does _not_ write that token, Git warns that it did not receive a token and continues as if no FSMonitor had been configured. So that's good news! The less good news is that prior to v2.26.0, Git did not support v2 of the FSMonitor protocol, but only v1. And v1 does not expect such a token. Git versions between v2.16.0 and v2.26.0 will interpret a successful run of the `true` executable with an empty output to mean that no files have been modified. And indeed, in my tests, after making sure that the Git index had been refreshed explicitly and then modifying a file and then running `git status` with v2.16.0, Git did not pick up on the modification. That's the less good news. At first I thought that we're pretty safe because nobody should use older Git versions and enable FSMonitor because FSMonitor protocol v1 is known to be subject to racy behavior. But then, Git users sometimes do not completely control which Git versions they use. Take for example Visual Studio users who also use the Git Bash to work on their worktree. While their Git Bash might be reasonably recent, Visual Studio comes with its own embedded Git version. Therefore, a user might want to play with the built-in FSMonitor in Git Bash, find that it dramatically speeds up everything (as it does for me, thank you so much!), and not realize that the Git executable used by Visual Studio totally misinterprets `core.fsmonitor` to refer to `/usr/bin/true.exe` and then miss any modifications. As long as the embedded Git version is at least v2.26.0, Visual Studio will at least work correctly (because it ignores `true.exe`'s output and continue as if no FSMonitor had been configured). But as soon as an older version is used, Git would work incorrectly, without any indication what is going wrong. I tried to come up with alternatives (because I _really_ dislike being a reviewer who only points out what's wrong without any constructive suggestion how to do it better), and the best alternatives I came up were: - stick with `core.useBuiltinFSMonitor` as before, or - use a special value of `core.fsmonitor` that simply is not a valid executable name. In 2019, when I worked on the original precursor of the built-in FSMonitor (before I had to drop working on FSMonitor because of all the security work that went into v2.24.1), I had picked `:builtin:` because colons are illegal on Windows, but of _course_ they are legal everywhere else. But one thing is not possible, even on Linux: to have a trailing slash in an executable name. So something like `/builtin-fsmonitor/` would work. However, after seeing how nicely your latest iteration cleans up the code by simply interpreting a Boolean value to refer to the built-in FSMonitor, I _really_ would like to make it work. Maybe we can declare that it is "safe enough" to rely on new enough Git versions to be used by users who use multiple Git versions on the same worktree? They should use _at least_ v2.26.1 anyway, because that one fixed a rather important vulnerability (CVE-2020-5260)? At least for Visual Studio, this is true: it ships with Git version 2.33.0.windows.2. What do you think? Can we somehow make `core.fsmonitor = true` work? Ciao, Dscho
Johannes Schindelin <Johannes.Schindelin@gmx.de> writes: > Just to make sure: I did not intend to insult anyone (and in hindsight I > wish that I had made that clearer). It is OK that you are wiser in hindsight. We all are, and we try to do better the next time ;-) Thanks for reminding of the topic. As a general principle, when introducing a new feature that achieves the same goal in a new and improved way, it is safer to introduce it in such a way that users of older implementations that lack the feature cannot choose it by mistake. One way to do so is to we reuse the same configuration knob by adding a new settings value that older implementations would choke on, the users will be forced to ensure that the older and proven way is used consistently everywhere, until every tool the user uses are ready. For this instance, I think it is OK to split and allow two to operate on the same data at the same time, because I believe that both old and new implementation will leave a permanent difference to the on-disk data that cannot later be reused by the other [*]. But it is an exception than a norm when adding a new thing that extends an existing feature (as opposed to inventing totally a new thing that won't overlap with any existing one). As a general principle, it is much safer to make sure it breaks (and have users hold off) when the new setting is given to an old implementation. * Side note. For example, if we introduce the index-v5 feature by not reusing the index.version but with index.usev5 variable, new implementations that know about the knob would write out v5 data that existing implementations will not work with. Also, from the point of view of the longer-term maintenance, of course not having to deal with orthogonal looking different configuration variables where newer ones override the older ones will induce more pain over time.
On 2/24/22 11:22 AM, Johannes Schindelin wrote: > Hi Jeff, > > On Tue, 22 Feb 2022, Jeff Hostetler wrote: > >> On 2/17/22 11:06 AM, Johannes Schindelin wrote: >> >>> On Fri, 11 Feb 2022, Jeff Hostetler via GitGitGadget wrote: >>> >>>> In this version I removed the core.useBuiltinFSMonitor config >>>> setting and instead extended the existing core.fsmonitor. >>> >>> I am somewhat surprised that a reviewer suggested this, as it breaks >>> the common paradigm we use to allow using several Git versions on the >>> same worktree. >>> >>> Imagine, for example, that you run a Git version that understands >>> `core.fsmonitor=true` to imply the built-in FSMonitor, while you >>> _also_ use an IDE that bundles a slightly older Git version that >>> mistakes the `true` for meaning the executable `true` (which is not a >>> FSMonitor at all, but its exit code suggests that everything's fine >>> and dandy). The result would be that the IDE does not see _any_ >>> updates anymore, but nothing would suggest that anything is wrong. >>> >>> We can probably warn users about this, and we can also work around the >>> fact that Git for Windows already uses `core.useBuiltinFSMonitor`, but >>> it makes me somewhat uneasy nevertheless. >> >> This is a valid concern and I should have thought to mention it when >> the suggestion came up on the list. Yes, extending `core.fsmonitor` to >> take a boolean or a path could confuse older clients (like ones bundled >> with an IDE, like VS). >> >> My assumption was that since we shipped `core.useBuiltinFSMonitor` >> in GFW with an experimental label, that normal users would not be >> using it at all and especially not from their IDEs, so it wouldn't >> matter. And experimental features are just that -- experimental >> and subject to change. >> >> But your point is valid -- if someone does have the odd hook called >> "true" or "1", they'll get an unexpected result. > > I wondered about that for a while, and put that to a test last night. I > set `core.fsmonitor = true` and then modified a file and ran `git status`. > Something I did not expect happened: it picked up on the modified file! > > It also printed out a warning: > > warning: Empty last update token. > > This is the reason why it works: by default, current Git versions assume > that the FSMonitor hook understands the FSMonitor protocol v2, which > starts by the client sending out a token, receiving a new token and then > the paths of the files/directories/symlinks to inspect. Since the program > `true` does _not_ write that token, Git warns that it did not receive a > token and continues as if no FSMonitor had been configured. > > So that's good news! > > The less good news is that prior to v2.26.0, Git did not support v2 of the > FSMonitor protocol, but only v1. And v1 does not expect such a token. Git > versions between v2.16.0 and v2.26.0 will interpret a successful run of > the `true` executable with an empty output to mean that no files have been > modified. > > And indeed, in my tests, after making sure that the Git index had been > refreshed explicitly and then modifying a file and then running `git > status` with v2.16.0, Git did not pick up on the modification. > > That's the less good news. > > At first I thought that we're pretty safe because nobody should use older > Git versions and enable FSMonitor because FSMonitor protocol v1 is known > to be subject to racy behavior. But then, Git users sometimes do not > completely control which Git versions they use. Take for example Visual > Studio users who also use the Git Bash to work on their worktree. While > their Git Bash might be reasonably recent, Visual Studio comes with its > own embedded Git version. Therefore, a user might want to play with the > built-in FSMonitor in Git Bash, find that it dramatically speeds up > everything (as it does for me, thank you so much!), and not realize that > the Git executable used by Visual Studio totally misinterprets > `core.fsmonitor` to refer to `/usr/bin/true.exe` and then miss any > modifications. > > As long as the embedded Git version is at least v2.26.0, Visual Studio > will at least work correctly (because it ignores `true.exe`'s output and > continue as if no FSMonitor had been configured). But as soon as an older > version is used, Git would work incorrectly, without any indication what > is going wrong. > > I tried to come up with alternatives (because I _really_ dislike being a > reviewer who only points out what's wrong without any constructive > suggestion how to do it better), and the best alternatives I came up were: > > - stick with `core.useBuiltinFSMonitor` as before, or > > - use a special value of `core.fsmonitor` that simply is not a valid > executable name. In 2019, when I worked on the original precursor of the > built-in FSMonitor (before I had to drop working on FSMonitor > because of all the security work that went into v2.24.1), I had picked > `:builtin:` because colons are illegal on Windows, but of _course_ they > are legal everywhere else. But one thing is not possible, even on Linux: > to have a trailing slash in an executable name. So something like > `/builtin-fsmonitor/` would work. > > However, after seeing how nicely your latest iteration cleans up the code > by simply interpreting a Boolean value to refer to the built-in FSMonitor, > I _really_ would like to make it work. > > Maybe we can declare that it is "safe enough" to rely on new enough Git > versions to be used by users who use multiple Git versions on the same > worktree? They should use _at least_ v2.26.1 anyway, because that one > fixed a rather important vulnerability (CVE-2020-5260)? At least for > Visual Studio, this is true: it ships with Git version 2.33.0.windows.2. > > What do you think? Can we somehow make `core.fsmonitor = true` work? > > Ciao, > Dscho > Thanks for the testing and background information here! I agree. I would like to keep the current "core.fsmonitor = <bool> | <path>" usage that I have in V5. It cleaned up things very nicely and it got rid of the somewhat awkward usage of having "core.useBuiltinFSMonitor" override the existing "core.fsmonitor" setting. It is unfortunate that it might cause a breakage for users who are *also* running a Git version between 2.16 ... 2.26. I have to wonder if it wouldn't be better to spend our energy documenting that users should upgrade, rather than trying to support interop with them. The 2.16 ... 2.26 versions don't have the security fix referenced above. The 2.16 ... 2.26 versions are based upon the V1 FSMonitor protocol and don't have fixes for several serious bugs or races in the original implementation: * 398a3b0899 (fsmonitor: force a refresh after the index was discarded, 2019-05-07) * 679f2f9fdd (unpack-trees: skip stat on fsmonitor-valid files, 2019-11-20) * dfaed02862 (fsmonitor: update documentation for hook version and watchman hooks, 2020-01-23) That last fix added the V2 FSMonitor protocol. This is used by both the hook- and the IPC-based providers. So anything still using the V1 FSMonitor protocol is going to unreliable and racy and users should not use it, so I don't think it is worth the effort to complicate our current solution to maintain compatibility. (I hate to say that, but they just shouldn't be using V1 any more.) On a slight tangent, the current code (before my patch series) does support a "core.fsmonitorhookversion" to allow the client to talk to a V1 or V2 provider explicitly (vs the default of trying V2 and then trying V1). The IPC implementation does not use this config setting, but I could see adding something to emit a warning if it was set to 1 when using the builtin FSMonitor. This might help users who are *also* running a Git version between 2.26 and 2.35 to understand the fallback after the true.exe warning that Johannes described. On another slight tangent, I'm wondering if we want to officially deprecate the V1 hook code and/or remove support for it from the code. Jeff
Hi Jeff, On Thu, 24 Feb 2022, Jeff Hostetler wrote: > On 2/24/22 11:22 AM, Johannes Schindelin wrote: > > > On Tue, 22 Feb 2022, Jeff Hostetler wrote: > > > > > On 2/17/22 11:06 AM, Johannes Schindelin wrote: > > > > > > > On Fri, 11 Feb 2022, Jeff Hostetler via GitGitGadget wrote: > > > > > > > > > In this version I removed the core.useBuiltinFSMonitor config > > > > > setting and instead extended the existing core.fsmonitor. > > > > > > > > I am somewhat surprised that a reviewer suggested this, as it > > > > breaks the common paradigm we use to allow using several Git > > > > versions on the same worktree. > > [...] > > > > I wondered about that for a while, and put that to a test last night. > > I set `core.fsmonitor = true` and then modified a file and ran `git > > status`. Something I did not expect happened: it picked up on the > > modified file! > > > > [... describing how FSMonitor protocol v1 is affected...] > > > > However, after seeing how nicely your latest iteration cleans up the > > code by simply interpreting a Boolean value to refer to the built-in > > FSMonitor, I _really_ would like to make it work. > > > > Maybe we can declare that it is "safe enough" to rely on new enough > > Git versions to be used by users who use multiple Git versions on the > > same worktree? They should use _at least_ v2.26.1 anyway, because that > > one fixed a rather important vulnerability (CVE-2020-5260)? At least > > for Visual Studio, this is true: it ships with Git version > > 2.33.0.windows.2. > > > > What do you think? Can we somehow make `core.fsmonitor = true` work? > > [...] > > I agree. I would like to keep the current > > "core.fsmonitor = <bool> | <path>" > > usage that I have in V5. > > It cleaned up things very nicely and it got rid of the somewhat awkward > usage of having "core.useBuiltinFSMonitor" override the existing > "core.fsmonitor" setting. Yes! > It is unfortunate that it might cause a breakage for users who are > *also* running a Git version between 2.16 ... 2.26. I have to wonder > if it wouldn't be better to spend our energy documenting that users > should upgrade, rather than trying to support interop with them. That's a good point: I guess if you added a comment to the documentation of `core.fsmonitor = true`, that should be good enough. > [...] > > So anything still using the V1 FSMonitor protocol is going to unreliable > and racy and users should not use it, so I don't think it is worth the effort > to complicate our current solution to maintain compatibility. > (I hate to say that, but they just shouldn't be using V1 any more.) That's a really good point. > On a slight tangent, the current code (before my patch series) does > support a "core.fsmonitorhookversion" to allow the client to talk to > a V1 or V2 provider explicitly (vs the default of trying V2 and then > trying V1). The IPC implementation does not use this config setting, > but I could see adding something to emit a warning if it was set to > 1 when using the builtin FSMonitor. This might help users who are > *also* running a Git version between 2.26 and 2.35 to understand the > fallback after the true.exe warning that Johannes described. How about making it an error instead? That should really be helpful: if `core.fsmonitor = true` and `core.fsmonitorHookVersion = 1`, just error out. That way, users will more likely fall into the pit of success. > On another slight tangent, I'm wondering if we want to officially > deprecate the V1 hook code and/or remove support for it from the code. Oooh! That's _also_ a good point. Maybe we can keep this deprecation out of this here patch series, though. It would be good to get this finished and into `next`, I think. I spent some time (in two separate thrusts) to review the entire patch series, and hope that my feedback was useful to you. In particular with your idea to document the incompatibilities of `core.fsmonitor = true` with Git v2.16.0..v2.26.0, I am really eager to see (the next iteration of) this patch series advance to the `next` branch, and then into an official Git version so that more users can benefit from it. Thank you for all your work on this, Dscho