Message ID | xmqqa6thcn1n.fsf_-_@gitster.c.googlers.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Re* [PATCH v2] fixup! mergetool: add automerge configuration | expand |
On Sat, Jan 09, 2021 at 10:40:20PM -0800, Junio C Hamano wrote: > An ugly workaround patch that caters only to difftool breakage is > attached at the end; I did not look if a similar treatment is > necessary for the mergetool side. That fixup does the trick on my machine too. Thank you. How wary are you of continuing with `initialize_merge_tool`? Do you see a better approach to get `automerge_enabled `into scope? While it is a nice feature to have, is it worth the risk vs. reward?
Seth House <seth@eseth.com> writes: > On Sat, Jan 09, 2021 at 10:40:20PM -0800, Junio C Hamano wrote: >> An ugly workaround patch that caters only to difftool breakage is >> attached at the end; I did not look if a similar treatment is >> necessary for the mergetool side. > > That fixup does the trick on my machine too. Thank you. Note that with t7800 fixed with the patch, non Windows jobs all seem to pass, but t7610 seems to have problem(s) on Windows. https://github.com/git/git/runs/1675932107?check_suite_focus=true#step:7:10373 > How wary are you of continuing with `initialize_merge_tool`? Do you see > a better approach to get `automerge_enabled `into scope? While it is > a nice feature to have, is it worth the risk vs. reward? Hmph. We need to have a way to define helper routines customized for the tool, and in the codebase without this series, it is the job for setup_tool. It defines fallback implementations, allows tool specific customizations. Your initialize_merge_tool is just a thin wrapper around setup_tool. It calls setup_tool, and if the function exits with non-zero status, returns with status==1 (and otherwise returns with status==0). As I expect all the callers of setup_tool or initialize_merge_tool would either ignore the status or check if it succeeded (i.e. compare $? against 0 and any non-zero values are treated equally), it does not seem to do anything useful. I think we may be able to get rid of initialize_merge_tool, but you would need to call setup_tool in places initialize_merge_tool was called in your patch, as you must have needed to make sure that the tool specific customizations have been carried out before going forward in these places. So, no, I do not see a reason to be wary of initialize/setup. With or without the seemingly needless initialize wrapper, I think calling setup before starting to do certain operations that need tool specific customization is just necessary. The same machanism has been in use to give can_merge/can_diff to each tool and the way it works ought to be fairly well understood. It is a different story if it makes sense not to exit when you see failure from initialize/setup, and instead _skip_ running helpers like run_merge_tool. It was just a mistake we all make every once in a while (i.e. a bug), and I am reasonably sure that we will introduce more of them but we will be capable of fixing all. Thanks.
On Sun, Jan 10, 2021 at 03:24:48AM -0800, Junio C Hamano wrote: > Note that with t7800 fixed with the patch, non Windows jobs all seem > to pass, but t7610 seems to have problem(s) on Windows. The autocrlf test is breaking because the sed that ships with some mingw versions (and also some minsys and cygwin versions) will *automatically* remove carriage returns: $ printf 'foo\r\nbar\r\n' | sed -e '/bar/d' | cat -A foo$ $ printf 'foo\r\nbar\r\n' | sed -b -e '/bar/d' | cat -A foo^M$ (Note: the -b flag above is just for comparison. We can't use it here. It's not in POSIX and is not present in sed for busybox or OSX.) I haven't found official docs that describe this behavior yet but I found a few discussions about it across a few lists. E.g.: https://cygwin.com/pipermail/cygwin/2017-June/233121.html Suggestions welcome. Obviously we could try to detect crlf's before the sed and then try to re-add them back in after sed but that strikes me as error-prone. I recall someone mentioning calling merge-file twice, once with --ours and once with --theirs, to independently generate each "side" of the conflict instead of using sed to split MERGED. Is that a viable approach?
On 2021-01-16 at 04:24:54, Seth House wrote: > On Sun, Jan 10, 2021 at 03:24:48AM -0800, Junio C Hamano wrote: > > Note that with t7800 fixed with the patch, non Windows jobs all seem > > to pass, but t7610 seems to have problem(s) on Windows. > > The autocrlf test is breaking because the sed that ships with some mingw > versions (and also some minsys and cygwin versions) will *automatically* > remove carriage returns: > > $ printf 'foo\r\nbar\r\n' | sed -e '/bar/d' | cat -A > foo$ > > $ printf 'foo\r\nbar\r\n' | sed -b -e '/bar/d' | cat -A > foo^M$ > > (Note: the -b flag above is just for comparison. We can't use it here. > It's not in POSIX and is not present in sed for busybox or OSX.) Can you report this as a bug? This behavior isn't compliant with POSIX and it makes it really hard for folks to write portable code if these versions implement POSIX utilities in a nonstandard way. As a non-Windows user, I have no hope of writing code that works on Windows if we can't rely on our standard utilities working properly.
Hi brian, On Fri, 22 Jan 2021, brian m. carlson wrote: > On 2021-01-16 at 04:24:54, Seth House wrote: > > On Sun, Jan 10, 2021 at 03:24:48AM -0800, Junio C Hamano wrote: > > > Note that with t7800 fixed with the patch, non Windows jobs all seem > > > to pass, but t7610 seems to have problem(s) on Windows. > > > > The autocrlf test is breaking because the sed that ships with some mingw > > versions (and also some minsys and cygwin versions) will *automatically* > > remove carriage returns: > > > > $ printf 'foo\r\nbar\r\n' | sed -e '/bar/d' | cat -A > > foo$ > > > > $ printf 'foo\r\nbar\r\n' | sed -b -e '/bar/d' | cat -A > > foo^M$ > > > > (Note: the -b flag above is just for comparison. We can't use it here. > > It's not in POSIX and is not present in sed for busybox or OSX.) > > Can you report this as a bug? This behavior isn't compliant with POSIX > and it makes it really hard for folks to write portable code if these > versions implement POSIX utilities in a nonstandard way. As a > non-Windows user, I have no hope of writing code that works on Windows > if we can't rely on our standard utilities working properly. I fear that the Windows-based tools do the correct thing, though: they are meant to process _text_, and newlines are supposed to be platform-dependent in text. From that perspective, it sounds to me as if we're trying to ask `sed` to do something it was not designed to do: binary editing. Ciao, Dscho
On 2021-01-22 at 16:29:46, Johannes Schindelin wrote: > Hi brian, > > On Fri, 22 Jan 2021, brian m. carlson wrote: > > > On 2021-01-16 at 04:24:54, Seth House wrote: > > > The autocrlf test is breaking because the sed that ships with some mingw > > > versions (and also some minsys and cygwin versions) will *automatically* > > > remove carriage returns: > > > > > > $ printf 'foo\r\nbar\r\n' | sed -e '/bar/d' | cat -A > > > foo$ > > > > > > $ printf 'foo\r\nbar\r\n' | sed -b -e '/bar/d' | cat -A > > > foo^M$ > > > > > > (Note: the -b flag above is just for comparison. We can't use it here. > > > It's not in POSIX and is not present in sed for busybox or OSX.) > > > > Can you report this as a bug? This behavior isn't compliant with POSIX > > and it makes it really hard for folks to write portable code if these > > versions implement POSIX utilities in a nonstandard way. As a > > non-Windows user, I have no hope of writing code that works on Windows > > if we can't rely on our standard utilities working properly. > > I fear that the Windows-based tools do the correct thing, though: they are > meant to process _text_, and newlines are supposed to be > platform-dependent in text. Ah, but POSIX gives a very specific meaning to "newline", and it refers to a single byte. If you want tools that process CRLF line endings like that, then that should be opt-in as either different tools or additional options, not the default behavior of a POSIX tool. This behavior is not conforming to POSIX and it is therefore a defect. > From that perspective, it sounds to me as if we're trying to ask `sed` to > do something it was not designed to do: binary editing. Most Windows tools are perfectly capable of handling LF line endings. Even the famously incapable Notepad can now handle LF without CR. With the advent of WSL, handling LF line endings is now pretty much required.
Hi brian, On Fri, 22 Jan 2021, brian m. carlson wrote: > On 2021-01-22 at 16:29:46, Johannes Schindelin wrote: > > > > On Fri, 22 Jan 2021, brian m. carlson wrote: > > > > > On 2021-01-16 at 04:24:54, Seth House wrote: > > > > The autocrlf test is breaking because the sed that ships with some mingw > > > > versions (and also some minsys and cygwin versions) will *automatically* > > > > remove carriage returns: > > > > > > > > $ printf 'foo\r\nbar\r\n' | sed -e '/bar/d' | cat -A > > > > foo$ > > > > > > > > $ printf 'foo\r\nbar\r\n' | sed -b -e '/bar/d' | cat -A > > > > foo^M$ > > > > > > > > (Note: the -b flag above is just for comparison. We can't use it here. > > > > It's not in POSIX and is not present in sed for busybox or OSX.) > > > > > > Can you report this as a bug? This behavior isn't compliant with POSIX > > > and it makes it really hard for folks to write portable code if these > > > versions implement POSIX utilities in a nonstandard way. As a > > > non-Windows user, I have no hope of writing code that works on Windows > > > if we can't rely on our standard utilities working properly. > > > > I fear that the Windows-based tools do the correct thing, though: they are > > meant to process _text_, and newlines are supposed to be > > platform-dependent in text. > > Ah, but POSIX gives a very specific meaning to "newline", and it refers > to a single byte. If you want tools that process CRLF line endings like > that, then that should be opt-in as either different tools or additional > options, not the default behavior of a POSIX tool. This behavior is not > conforming to POSIX and it is therefore a defect. If we needed another reminder that we never say "It's in POSIX; we'll happily ignore your needs should your system not conform to it." (https://github.com/git/git/blob/v2.30.0/Documentation/CodingGuidelines), then we have it right here ;-) > > From that perspective, it sounds to me as if we're trying to ask `sed` to > > do something it was not designed to do: binary editing. > > Most Windows tools are perfectly capable of handling LF line endings. > Even the famously incapable Notepad can now handle LF without CR. With > the advent of WSL, handling LF line endings is now pretty much required. I have two comments on that: 1. We could spend a splendid time questioning MSYS2's (and before that, MSys') choices regarding newlines, but I think we can spend that time a lot better. 2. Newer software _seems_ to handle LF line endings just fine. And the `sed` invocation you mentioned above does so: it groks input delimited by Line Feed as newline characters. That does not mean that its output is LF-only by default. I seem to remember that recent Visual Studio versions did something similar: happily read an LF-only `.xml` file, but then write out a modified version using CR/LF. Would I wish that Windows used LF-only instead of CR/LF, just like macOS switched away from CR-only to LF-only after MacOS 9? Sure I do. It would remove quite a few obstacles in my daily work. Can I do anything about it? No. So I'd rather see `git mergetool` be turned into a portable C program, or alternatively using a built-in helper that _is_ written in C, to perform that desired text munging instead of losing the battle to the challenge of cross-platform, advanced text processing in pure shell script. Ciao, Dscho
On Tue, Jan 26, 2021 at 03:32:13PM +0100, Johannes Schindelin wrote: > So I'd rather see `git mergetool` be turned into a portable C program, or > alternatively using a built-in helper that _is_ written in C, to perform > that desired text munging I tend to agree. Though my personal preference is Cygwin's (eventual) approach, I can appreciate the arguments made by the MSYS2 folk. But setting that aside, IMO, the ideal place to handle this would be the same place where the conflict markers are written in the first place, xmerge.c if my limited C literacy is correct. I don't see a big distinction between writing a single file with conflict markers and writing two, diff-able files with each "side" of the conflict -- they're ultimately two different formats for expressing the same information. That would give us the portability you described and the (pretty amazing) performance that merge-file already enjoys. :) I'm more than happy with calling merge-file twice for now. A future C optimisation, perhaps exposed via merge-file as a new (e.g.) --write-conflict-files flag, would be even more awesome.
Seth House <seth@eseth.com> writes: > On Tue, Jan 26, 2021 at 03:32:13PM +0100, Johannes Schindelin wrote: >> So I'd rather see `git mergetool` be turned into a portable C program, or >> alternatively using a built-in helper that _is_ written in C, to perform >> that desired text munging > > I tend to agree. Though my personal preference is Cygwin's (eventual) > approach, I can appreciate the arguments made by the MSYS2 folk. But > setting that aside, IMO, the ideal place to handle this would be the > same place where the conflict markers are written in the first place, > xmerge.c if my limited C literacy is correct. > > I don't see a big distinction between writing a single file with > conflict markers and writing two, diff-able files with each "side" of > the conflict -- they're ultimately two different formats for expressing > the same information. That would give us the portability you described > and the (pretty amazing) performance that merge-file already enjoys. :) > > I'm more than happy with calling merge-file twice for now. A future > C optimisation, perhaps exposed via merge-file as a new (e.g.) > --write-conflict-files flag, would be even more awesome. I am OK with that "two merge-file invocations, one with --ours and then another with --theirs" approach, as I already said in https://lore.kernel.org/git/xmqqh7n9aer5.fsf@gitster.c.googlers.com/
On Tue, Jan 26, 2021 at 12:10:17PM -0800, Junio C Hamano wrote: > I am OK with that "two merge-file invocations, one with --ours and > then another with --theirs" approach, as I already said in > https://lore.kernel.org/git/xmqqh7n9aer5.fsf@gitster.c.googlers.com/ Sorry if it seemed like I left that thread hanging (busy week). Thanks for your reply(s). I'm finishing a v10 patchset with that change which I'll submit to the list for review this week.
Seth House <seth@eseth.com> writes: > Sorry if it seemed like I left that thread hanging (busy week). Thanks > for your reply(s). I'm finishing a v10 patchset with that change which > I'll submit to the list for review this week. Thanks.
diff --git a/git-difftool--helper.sh b/git-difftool--helper.sh index 234dd6944e..992124cc67 100755 --- a/git-difftool--helper.sh +++ b/git-difftool--helper.sh @@ -61,7 +61,9 @@ launch_merge_tool () { export BASE eval $GIT_DIFFTOOL_EXTCMD '"$LOCAL"' '"$REMOTE"' else - initialize_merge_tool "$merge_tool" && + initialize_merge_tool "$merge_tool" + # ignore the error from the above --- run_merge_tool + # will diagnose unusable tool by itself run_merge_tool "$merge_tool" fi } @@ -80,7 +82,9 @@ if test -n "$GIT_DIFFTOOL_DIRDIFF" then LOCAL="$1" REMOTE="$2" - initialize_merge_tool "$merge_tool" && + initialize_merge_tool "$merge_tool" + # ignore the error from the above --- run_merge_tool + # will diagnose unusable tool by itself run_merge_tool "$merge_tool" false else # Launch the merge tool on each path provided by 'git diff'