Message ID | 1eb33969-1739-4a27-a77b-3f4268f5519d@gmail.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | `--whitespace=fix` with `--no-ignore-whitespace` | expand |
Rubén Justo <rjusto@gmail.com> writes: > Ensure strict matching of context lines when applying with > `--whitespace=fix` combined with `--no-ignore-whitespace`. > > Signed-off-by: Rubén Justo <rjusto@gmail.com> > --- > apply.c | 3 ++- > t/t4124-apply-ws-rule.sh | 27 +++++++++++++++++++++++++++ > 2 files changed, 29 insertions(+), 1 deletion(-) > > diff --git a/apply.c b/apply.c > index 63e58086f1..0cb9d38e5a 100644 > --- a/apply.c > +++ b/apply.c > @@ -2596,7 +2596,8 @@ static int match_fragment(struct apply_state *state, > goto out; > } > > - if (state->ws_error_action != correct_ws_error) { > + if (state->ws_error_action != correct_ws_error || > + state->ws_ignore_action == ignore_ws_none) { > ret = 0; > goto out; > } Hmph, if we are correcting for whitespace violations, even if whitespace fuzz is not allowed externally, wouldn't the issue that c1beba5b (git-apply --whitespace=fix: fix whitespace fuzz introduced by previous run, 2008-01-30) corrected still apply? IOW, isn't this change introducing a regression when an input touches a file with a change with broken whitespaces, and then touches the same file to replace the broken whitespace lines with something else?
On Mon, Aug 26, 2024 at 05:35:14PM -0700, Junio C Hamano wrote: > Rubén Justo <rjusto@gmail.com> writes: > > > Ensure strict matching of context lines when applying with > > `--whitespace=fix` combined with `--no-ignore-whitespace`. > > > > Signed-off-by: Rubén Justo <rjusto@gmail.com> > > --- > > apply.c | 3 ++- > > t/t4124-apply-ws-rule.sh | 27 +++++++++++++++++++++++++++ > > 2 files changed, 29 insertions(+), 1 deletion(-) > > > > diff --git a/apply.c b/apply.c > > index 63e58086f1..0cb9d38e5a 100644 > > --- a/apply.c > > +++ b/apply.c > > @@ -2596,7 +2596,8 @@ static int match_fragment(struct apply_state *state, > > goto out; > > } > > > > - if (state->ws_error_action != correct_ws_error) { > > + if (state->ws_error_action != correct_ws_error || > > + state->ws_ignore_action == ignore_ws_none) { > > ret = 0; > > goto out; > > } > > Hmph, if we are correcting for whitespace violations, even if > whitespace fuzz is not allowed externally, wouldn't the issue that > c1beba5b (git-apply --whitespace=fix: fix whitespace fuzz introduced > by previous run, 2008-01-30) corrected still apply? IOW, isn't this > change introducing a regression when an input touches a file with a > change with broken whitespaces, and then touches the same file to > replace the broken whitespace lines with something else? Yes, that is the center of the blast this series is producing; affecting to what `--whitespace=fix` does (just a reminder for other readers): - stop fixing whitespace errors in context lines and - no longer warning about them. Clearly, as you point out, the change poses a problem when applying changes with whitespace errors in lines involved multiple times, either during the same apply session or across multiple sessions executed in sequence. The new `ignore_ws_default` enum option is intended to mitigate the blast. It would be unexpected IMHO for someone who wants the behavior described in c1beba5b to be indicating `--no-ignore-whitespace`. And with it we allow the possibility for someone who finds that behavior undesirable to avoid it. I'm not very happy with the new enum, but I haven't come up with a better idea. There are other alternatives: --whitespace=fix-strict --do-not-fix-ws-errors-in-context-lines None of them are better, I think.
Rubén Justo <rjusto@gmail.com> writes: > I'm not very happy with the new enum, but I haven't come up with a > better idea. > ... > None of them are better, I think. Not adding a new enum is probably much better. See the "additional thought" in my review on [3/5], for example. Thanks.
On Thu, Aug 29, 2024 at 04:13:14PM -0700, Junio C Hamano wrote: > Rubén Justo <rjusto@gmail.com> writes: > > > I'm not very happy with the new enum, but I haven't come up with a > > better idea. > > ... > > None of them are better, I think. > > Not adding a new enum is probably much better. See the "additional > thought" in my review on [3/5], for example. If I understand correctly the example you mentioned, using `in_fn_table()` cannot help us in `parse_fragment()`. But I could be completely wrong and misunderstanding your intention. I still don't see a better option than introducing a new value `default`. Perhaps described like this: diff --git a/Documentation/config/apply.txt b/Documentation/config/apply.txt index f9908e210a..7b642d2f3a 100644 --- a/Documentation/config/apply.txt +++ b/Documentation/config/apply.txt @@ -4,6 +4,10 @@ apply.ignoreWhitespace:: option. When set to one of: no, none, never, false, it tells 'git apply' to respect all whitespace differences. + When not set or set to `default`, it tells `git apply` to + behave like the previous setting: `no`. However, when + combined with 'whitespace=fix', some whitespace errors + will still be ignored because they are being fixed. See linkgit:git-apply[1]. apply.whitespace:
Rubén Justo <rjusto@gmail.com> writes: > If I understand correctly the example you mentioned, using > `in_fn_table()` cannot help us in `parse_fragment()`. But I could be > completely wrong and misunderstanding your intention. Hmph. It's unfortunate that the control flow goes like this: apply_all_patches() -> apply_patch() -> parse_chunk() -> parse_single_patch() -> parse_fragment() -> use_patch() -> check_patch_list() -> check_patch() -> apply_data() -> add_to_fn_table() So deciding if a context line (or a new line for that matter) contains a whitespace error is done too early in the current code, and if you want to do this correctly, you'd need to move the check down so that it happens in apply_one_fragment() that is called in apply_data(). The rest of the whitespace checks are done there, and regardless of the "patch that touches the same path twice" issue, it feels like the right thing to do anyway. Such a "right" fix might be involved. If we want to punt, I think you can still inspect the *patch inside parse_single_patch(), and figure out if the target path of the current fragment you are looking at has already been touched in the current session (we parse everything into the patch struct whose fragments member has a chained list of fragments). Normally that should not be the case. If we know we are not being fed such a patch with duplicated paths, we do not have to inspect whitespace issues while parsing a context ' ' line in parse_fragments(). > I still don't see a better option than introducing a new value > `default`. As long as we can tell that our input is not a patch that touches the same path twice, we shouldn't need a new knob or a command line option. When we are dealing with Git generated patch that does not mention the same path twice, we can unconditionally say "unless --ignore-space-change and other options that allow loose matching of context lines are given, it is an error if context lines do not exactly match, even when we are correcting for whitespace rule violations". Otherwise, we may need to keep the current workaround logic in case a later change has a line as a ' ' context for a file that an earlier change modified (and fixed with --whitespace=fix). It is a different story if all you want to change is to add to --whitespace family of options a new "silent-fix" that makes corrections in the same way as "fix" (or "strip"), but wihtout giving any warnings. I think it makes sense to have such a mode, but that is largely orthogonal to the discussion we are having, I think, even though it might result in a similar effect visible to the end-users. Thanks.
On Tue, Sep 03, 2024 at 09:41:31PM -0700, Junio C Hamano wrote: > ... a new "silent-fix" that makes > corrections in the same way as "fix" (or "strip"), but wihtout > giving any warnings. My main intention is not to make "fix" quieter, although it will probably be a pleasant consequence. Let's consider a sequence of patches where some whitespace errors in one patch are carried over to the next: $ # underscore (_) is whitespace for readability $ cat >file <<END a END $ cat >patch1 <<END # this adds a whitespace error --- v/file +++ m/file @@ -1 +1,2 @@ a +b_ END $ cat >patch2 <<END # this adds another one --- v/file +++ m/file @@ -1,2 +1,3 @@ a b_ +c_ END When applying "patch1" with `--whitespace=fix`, we'll fix the whitespace error in "b ". Consequently, when applying "patch2" we'll need to fix that line in the patch before applying it, so that it matches the context line, now with "b". Makes sense. This applies not only to: $ git apply --whitespace=fix patch1 && \ git apply --whitespace=fix patch2 But to: $ git apply --whitespace=fix patch1 patch2 Even to: $ git apply --whitespace=fix <(cat patch1 patch2) So far, so good. However, a legit question is: Why "a " is being modified here?: $ cat >foo <<END a_ b_ END $ cat >patch <<END --- v/foo +++ m/foo @@ -1,2 +1,2 @@ a -b_ +b END $ git apply -v --whitespace=fix patch Checking patch foo... Applied patch foo cleanly. $ xxd foo 00000000: 610a a. Is this a bug? I don't think so. But the result, IMHO, is questionable, and "git apply" rejecting the patch could also be an expected outcome. We are assuming an implicit, and perhaps unwanted, step: --- v/foo +++ m/foo @@ -1,2 +1,2 @@ -a_ +a b_ END We cannot change the default behaviour (I won't dig into this), but I think we can give a knob to allow changing it. This is the main intention of the, ugly, new "_default" enum value. And... as a nice side effect, we'll know when to stop showing warnings when the user touches lines next to other lines with whitespace errors that they don't want, or can, change ;-)
diff --git a/apply.c b/apply.c index 63e58086f1..0cb9d38e5a 100644 --- a/apply.c +++ b/apply.c @@ -2596,7 +2596,8 @@ static int match_fragment(struct apply_state *state, goto out; } - if (state->ws_error_action != correct_ws_error) { + if (state->ws_error_action != correct_ws_error || + state->ws_ignore_action == ignore_ws_none) { ret = 0; goto out; } diff --git a/t/t4124-apply-ws-rule.sh b/t/t4124-apply-ws-rule.sh index 485c7d2d12..573200da67 100755 --- a/t/t4124-apply-ws-rule.sh +++ b/t/t4124-apply-ws-rule.sh @@ -545,6 +545,33 @@ test_expect_success 'whitespace=fix to expand' ' git -c core.whitespace=tab-in-indent apply --whitespace=fix patch ' +test_expect_success 'whitespace=fix honors no-ignore-whitespace' ' + qz_to_tab_space >preimage <<-\EOF && + AZ + BZZ + EOF + qz_to_tab_space >patch <<-\EOF && + diff --git a/preimage b/preimage + --- a/preimage + +++ b/preimage + @@ -1,2 +1,2 @@ + -AZ + +A + BZZZ + EOF + test_must_fail git apply --no-ignore-whitespace --whitespace=fix patch && + qz_to_tab_space >patch <<-\EOF && + diff --git a/preimage b/preimage + --- a/preimage + +++ b/preimage + @@ -1,2 +1,2 @@ + -AZ + +A + BZZ + EOF + git apply --no-ignore-whitespace --whitespace=fix patch +' + test_expect_success 'whitespace check skipped for excluded paths' ' git config core.whitespace blank-at-eol && >used &&
Ensure strict matching of context lines when applying with `--whitespace=fix` combined with `--no-ignore-whitespace`. Signed-off-by: Rubén Justo <rjusto@gmail.com> --- apply.c | 3 ++- t/t4124-apply-ws-rule.sh | 27 +++++++++++++++++++++++++++ 2 files changed, 29 insertions(+), 1 deletion(-)