[1/2] git-jump: always specify column 1 for diff entries

Message ID 20240915111846.GA2017851@coredump.intra.peff.net (mailing list archive)
State Accepted
Commit 9f5978e7778843bb729baef121c92f98bd187044
Series a few git-jump quality-of-life fixes

Commit Message

Jeff King Sept. 15, 2024, 11:18 a.m. UTC
When we generate a quickfix entry for a diff hunk, we provide just the
filename and line number along with the content, like:

  file:1: contents of the line

This can be a problem if the line itself looks like a quickfix header.
For example (and this is adapted from a real-world case that bit me):

  echo 'static_lease 10:11:12:13:14:15:16 10.0.0.1' >file
  git add file
  echo change >file

produces:

  file:1: static_lease 10:11:12:13:14:15:16 10.0.0.1

which is ambiguous. It could be line 1 of "file", or line 11 of the file
"file:1: static_lease 10", and so on. In the case of vim's default
config, it seems to prefer the latter (you can configure "errorformat"
with a variety of patterns, but out of the box it matches some common
ones).
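
To make the ambiguity concrete, here is a rough sketch that lists every
way such an entry could be split into a "file:line:" prefix. The
list_splits helper is hypothetical and is not part of git-jump, and it
does not reproduce how vim's errorformat actually matches; it only
enumerates the candidate readings.

```shell
# Hypothetical helper (not part of git-jump): print every "file:line:"
# reading of a quickfix-style entry. Illustration only; vim's real
# errorformat matching is more involved than this.
list_splits() {
	printf '%s\n' "$1" | awk '{
		s = $0
		for (i = 1; i <= length(s); i++) {
			if (substr(s, i, 1) != ":")
				continue
			rest = substr(s, i + 1)
			# A colon followed by digits and another colon could be a line number.
			if (match(rest, /^[0-9]+:/))
				printf "file=[%s] line=%s\n", substr(s, 1, i - 1), substr(rest, 1, RLENGTH - 1)
		}
	}'
}

list_splits 'file:1: static_lease 10:11:12:13:14:15:16 10.0.0.1'
```

For the sample entry this prints six candidates, starting with line 1 of
"file" and line 11 of "file:1: static_lease 10".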

One easy way to fix this is to provide a column number, like:

  file:1:1: static_lease 10:11:12:13:14:15:16 10.0.0.1

which causes vim to prefer line 1 of "file" again (due to the preference
order of the various patterns in the default errorformat).

There are other options. For example, at least in my version of vim,
wrapping the file in quotation marks like:

  "file":1: static_lease 10:11:12:13:14:15:16 10.0.0.1

also works. That perhaps would do the right thing even if you had the silly
file name "file:1:1: foo 10". But it's not clear what would happen if
you had a filename with quotes in it.

This feature is inherently scraping text, and there's bound to be some
ambiguities. I don't think it's worth worrying too much about unlikely
filenames, as it's the file content that is more likely to introduce
unexpected characters.

So let's just go with the extra ":1" column specifier. We know this is
supported everywhere, as git-jump's "grep" mode already uses it (and
thus doesn't exhibit the same problem).

The "merge" mode is mostly immune to this, as it only matches "<<<<<<<"
conflict marker lines. It's possible of course to have a marker that
says "foo 10:11" later in the line, but in practice these will only have
branches and perhaps file names, so it's probably not worth worrying
about (and fixing it would involve passing --column to the system grep,
which may not be portable).

I also gave some thought as to whether we could put something more
useful than "1" in the column field for diffs. In theory we could find
the first changed character of the line, but this is tricky in practice.
You'd have to correlate before/after lines of the hunk to decide what
changed. So:

  -this is a foo line
  +this is a bar line

is easy (column 11). But:

  -this is a foo line
  +another line
  +this is a bar line

is harder.
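
For the easy case only, the column could be found with something like
the sketch below. The first_diff_col helper is hypothetical and assumes
the before/after lines have already been paired up, which is exactly the
part that is hard for the second hunk above.

```shell
# Hypothetical helper (not part of git-jump): print the 1-based column of
# the first character that differs between two already-paired lines, or 0
# if the lines are identical. Whether a "character" is a byte or a
# multibyte character depends on the awk implementation and locale.
first_diff_col() {
	OLD="$1" NEW="$2" awk 'BEGIN {
		a = ENVIRON["OLD"]
		b = ENVIRON["NEW"]
		n = length(a) < length(b) ? length(a) : length(b)
		for (i = 1; i <= n; i++) {
			if (substr(a, i, 1) != substr(b, i, 1)) {
				print i
				exit
			}
		}
		# One line is a prefix of the other, or they are the same.
		if (length(a) != length(b))
			print n + 1
		else
			print 0
	}'
}

first_diff_col 'this is a foo line' 'this is a bar line'   # prints 11
```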

This commit certainly doesn't preclude trying to do something more
clever later, but it's a much deeper rabbit hole than just fixing the
syntactic ambiguity.

Signed-off-by: Jeff King <peff@peff.net>
---
 contrib/git-jump/git-jump | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Martin Ågren Sept. 16, 2024, 8:13 p.m. UTC | #1
On Sun, 15 Sept 2024 at 13:18, Jeff King <peff@peff.net> wrote:
>
> When we generate a quickfix entry for a diff hunk, we provide just the
> filename and line number along with the content, like:
>
>   file:1: contents of the line
>
> This can be a problem if the line itself looks like a quickfix header.
> For example (and this is adapted from a real-world case that bit me):
>
>   echo 'static_lease 10:11:12:13:14:15:16 10.0.0.1' >file
>   git add file
>   echo change >file
>
> produces:
>
>   file:1: static_lease 10:11:12:13:14:15:16 10.0.0.1
>
> which is ambiguous. It could be line 1 of "file", or line 11 of the file
> "file:1: static_lease 10", and so on. In the case of vim's default
> config, it seems to prefer the latter (you can configure "errorformat"
> with a variety of patterns, but out of the box it matches some common
> ones).

I've never hit this, but it doesn't look too crazy. A couple of digits
and a colon and things begin to match. Ok.

> One easy way to fix this is to provide a column number, like:
>
>   file:1:1: static_lease 10:11:12:13:14:15:16 10.0.0.1
>
> which causes vim to prefer line 1 of "file" again (due to the preference
> order of the various patterns in the default errorformat).

Makes sense.

> There are other options. For example, at least in my version of vim,
> wrapping the file in quotation marks like:
>
>   "file":1: static_lease 10:11:12:13:14:15:16 10.0.0.1
>
> also works. That perhaps would the right thing even if you had the silly
> file name "file:1:1: foo 10". But it's not clear what would happen if
> you had a filename with quotes in it.

Right. Looking around, I can find someone asking the Internet how to
escape the filename and not getting any response.

> This feature is inherently scraping text, and there's bound to be some
> ambiguities. I don't think it's worth worrying too much about unlikely
> filenames, as its the file content that is more likely to introduce
> unexpected characters.

Agreed. (s/its/it's/)

> So let's just go with the extra ":1" column specifier. We know this is
> supported everywhere, as git-jump's "grep" mode already uses it (and
> thus doesn't exhibit the same problem).
>
> The "merge" mode is mostly immune to this, as it only matches "<<<<<<<"
> conflict marker lines. It's possible of course to have a marker that
> says "foo 10:11" later in the line, but in practice these will only have
> branches and perhaps file names, so it's probably not worth worrying
> about (and fixing it would involve passing --column to the system grep,
> which may not be portable).

I suppose we could use `git grep --no-index` instead of `grep` for `git
jump merge`. Anyway, that's out of scope here.

> I also gave some thought as to whether we could put something more
> useful than "1" in the column field for diffs. In theory we could find

Heh. Yes, in theory everything is possible. Your approach makes sense.

> -               print "$file:$line: $1\n";
> +               print "$file:$line:1: $1\n";

Looks good to me and from my testing, this fixes the problem as
described.

Martin
Patch

diff --git a/contrib/git-jump/git-jump b/contrib/git-jump/git-jump
index 47e0c557e6..78e7394406 100755
--- a/contrib/git-jump/git-jump
+++ b/contrib/git-jump/git-jump
@@ -50,7 +50,7 @@  mode_diff() {
 	defined($line) or next;
 	if (/^ /) { $line++; next }
 	if (/^[-+]\s*(.*)/) {
-		print "$file:$line: $1\n";
+		print "$file:$line:1: $1\n";
 		$line = undef;
 	}
 	'
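
As a way to try the new output format without the full script, the
following is a minimal awk approximation of what the patched loop emits.
It is a sketch under assumptions: the diff_to_quickfix name is
hypothetical, awk stands in for the Perl one-liner, and the handling of
the "+++" and "@@" header lines is guessed, since those parts of the
script lie outside the hunk shown above.

```shell
# Hypothetical stand-in for the Perl loop in mode_diff: read a unified
# diff (as from "git diff --no-prefix") on stdin and print one
# "file:line:1: text" entry for the first changed line of each hunk.
diff_to_quickfix() {
	awk '
	/^[+][+][+] / {
		file = substr($0, 5)
		if (file == "/dev/null")
			file = ""
		have_line = 0
		next
	}
	file == "" { next }
	/^@@ / {
		# Assumption: the first "+N" in the hunk header is the new-file line.
		if (match($0, /[+][0-9]+/)) {
			line = substr($0, RSTART + 1, RLENGTH - 1) + 0
			have_line = 1
		}
		next
	}
	!have_line { next }
	/^ / { line++; next }
	/^[-+]/ {
		text = substr($0, 2)
		sub(/^[ \t]+/, "", text)
		# The fixed ":1" column is what removes the ambiguity.
		printf "%s:%d:1: %s\n", file, line, text
		have_line = 0
	}
	'
}

diff_to_quickfix <<'EOF'
diff --git file file
index 1234567..89abcde 100644
--- file
+++ file
@@ -1 +1 @@
-static_lease 10:11:12:13:14:15:16 10.0.0.1
+change
EOF
```

With the sample diff from the commit message as input, this prints the
single entry "file:1:1: static_lease 10:11:12:13:14:15:16 10.0.0.1".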