diff mbox series

[2/2] Documentation/git-log: mention that line-log regex must match in starting revision

Message ID 4ea4eeae0c1e23221012855168bf6640be93fd4f.1576559263.git.gitgitgadget@gmail.com (mailing list archive)
State New, archived
Headers show
Series Improve line log documentation | expand

Commit Message

Johannes Schindelin via GitGitGadget Dec. 17, 2019, 5:07 a.m. UTC
From: Philippe Blain <levraiphilippeblain@gmail.com>

When giving a regex as parameter <start> or <end> in
`git log -L <start>,<end>:<file>`, or a function name in
`git log -L :<funcname>:<file>`, the given regex must match in the starting
revision, or else the command exits with a fatal error.

This is not obvious in the documentation, so add a note to that
effect.

Signed-off-by: Philippe Blain <levraiphilippeblain@gmail.com>
---
 Documentation/git-log.txt | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Comments

Derrick Stolee Dec. 17, 2019, 3:34 p.m. UTC | #1
On 12/17/2019 12:07 AM, Philippe Blain via GitGitGadget wrote:
> From: Philippe Blain <levraiphilippeblain@gmail.com>
> 
> When giving a regex as parameter <start> or <end> in
> `git log -L <start>,<end>:<file>`, or a function name in
> `git log -L :<funcname>:<file>`, the given regex must match in the starting
> revision, or else the command exits with a fatal error.
> 
> This is not obvious in the documentation, so add a note to that
> effect.

This seems helpful. Thanks.

-Stolee
Junio C Hamano Dec. 17, 2019, 6:16 p.m. UTC | #2
Derrick Stolee <stolee@gmail.com> writes:

> On 12/17/2019 12:07 AM, Philippe Blain via GitGitGadget wrote:
>> From: Philippe Blain <levraiphilippeblain@gmail.com>
>> 
>> When giving a regex as parameter <start> or <end> in
>> `git log -L <start>,<end>:<file>`, or a function name in
>> `git log -L :<funcname>:<file>`, the given regex must match in the starting
>> revision, or else the command exits with a fatal error.
>> 
>> This is not obvious in the documentation, so add a note to that
>> effect.
>
> This seems helpful. Thanks.

Even when you specify <start> or <end> as a line number, they must
exist in the starting revision or it would be a fatal error.  If we
are clarifying with this patch for completeness, I think we should
also mention it together.  

Or we could even drop the mention of regex and say <start> and
<end>, regardless of how they are specified (including :<funcname>
format), must exist in the starting revision.
Philippe Blain Dec. 18, 2019, 3:28 a.m. UTC | #3
> Le 17 déc. 2019 à 13:16, Junio C Hamano <gitster@pobox.com> a écrit :
> Even when you specify <start> or <end> as a line number, they must
> exist in the starting revision or it would be a fatal error.  If we
> are clarifying with this patch for completeness, I think we should
> also mention it together.  
Thanks for the feedback. I did some tests : 
    git log -L 73,2000000085:Documentation/git-log.txt
does not error and shows the history from line 73 to the end of the file.
    git log -L 300,2000000085:Documentation/git-log.txt 
errors out:
    fatal: file Documentation/git-log.txt has only 239 lines
But 
    git log -L 300,-2000000085:Documentation/git-log.txt
does not error out and shows the history for the whole file. Also,
    git log -L 1,-2000000085:Documentation/git-log.txt
does not error out and gives the history for the first line.

So I think that it’s really only regex that must match...
Derrick Stolee Dec. 18, 2019, 10:55 a.m. UTC | #4
On 12/17/2019 10:28 PM, Philippe Blain wrote:
> 
>> Le 17 déc. 2019 à 13:16, Junio C Hamano <gitster@pobox.com> a écrit :
>> Even when you specify <start> or <end> as a line number, they must
>> exist in the starting revision or it would be a fatal error.  If we
>> are clarifying with this patch for completeness, I think we should
>> also mention it together.  
...
>     git log -L 300,2000000085:Documentation/git-log.txt 
> errors out:
>     fatal: file Documentation/git-log.txt has only 239 lines

This case seems important enough to include what it means to "match".

Specifically, the range must match at least one line in the file.

> But 
>     git log -L 300,-2000000085:Documentation/git-log.txt
> does not error out and shows the history for the whole file. Also,
>     git log -L 1,-2000000085:Documentation/git-log.txt
> does not error out and gives the history for the first line.

Negative numbers in the ranges are a bit strange to include, and the
large magnitude you include here seems unnecessary for the test.
However, it appears that we do store signed values in the line-log
machinery:

/* A range [start,end].  Lines are numbered starting at 0, and the
 * ranges include start but exclude end. */
struct range {
	long start, end;
};

Perhaps these should be replaced with an unsigned value instead,
just to be clear that negative values do not make sense in this
context?

Or rather, do we want to enforce start < end if they are to be
valuable? We apparently do assert() this in
range_set_check_invariants() for all that does.

The current behavior of showing only the first line
is a strange byproduct of allowing these odd ranges, but we may
want to keep it for consistency. That would imply we allow
start <= end, and auto-correct these negative values to have
end = start. (This does not fix the 300,-1 case, but that
is strange enough to be considered a bug, right?)

Thanks,
-Stolee
SZEDER Gábor Dec. 18, 2019, 11:09 a.m. UTC | #5
On Tue, Dec 17, 2019 at 10:28:37PM -0500, Philippe Blain wrote:
> 
> > Le 17 déc. 2019 à 13:16, Junio C Hamano <gitster@pobox.com> a écrit :
> > Even when you specify <start> or <end> as a line number, they must
> > exist in the starting revision or it would be a fatal error.  If we
> > are clarifying with this patch for completeness, I think we should
> > also mention it together.  
> Thanks for the feedback. I did some tests : 

I'll swap the order of your first two tests:

>     git log -L 300,2000000085:Documentation/git-log.txt 
> errors out:
>     fatal: file Documentation/git-log.txt has only 239 lines

Here the entire line range is outside of the file, so there is not
much we can do about it, but error out.

An alternative would be to treat it as an empty line range and then
don't show any commits but exit silently, but I think that would be
confusing ("why didn't I get any output?!"), and telling the user
what's wrong is better ("Ah, ok, I mistyped 192 as 912").

>     git log -L 73,2000000085:Documentation/git-log.txt
> does not error and shows the history from line 73 to the end of the file.

Here there is an overlap between the given line range and the file
(lines 73-239), so we have two possibilities:

  - be strict and error out saying that the <end> doesn't make sense.

  - be lax about it, and interpret the <end> as the end of file.  This
    allows for cases like "I want line log from here to the end of
    file, but instead of finding out the exact number of lines in the
    file I'll just say 999999 that is surely larger than the file, and
    you shall do what I mean".

Those who implemented the line-log feature chose the latter.  I think
it's the better choice.

> But 
>     git log -L 300,-2000000085:Documentation/git-log.txt
> does not error out and shows the history for the whole file. Also,
>     git log -L 1,-2000000085:Documentation/git-log.txt
> does not error out and gives the history for the first line.

These are a variant of the previous case, with the difference that
because of the '-' in <end> it is not an absolute line number but
relative, and is interpreted as that many lines before <start> (i.e.
in this case <start> actually means the end of the line range).

I think the same argument can be made about <end> pointing past the
beginning of the file, though, arguably, it's not as useful, because
the first line as always 1, and '-L 123,-99999' is more keystrokes
than '-L 1,123'.

> So I think that it’s really only regex that must match...
SZEDER Gábor Dec. 18, 2019, 11:49 a.m. UTC | #6
On Wed, Dec 18, 2019 at 05:55:25AM -0500, Derrick Stolee wrote:
> On 12/17/2019 10:28 PM, Philippe Blain wrote:
> > 
> >> Le 17 déc. 2019 à 13:16, Junio C Hamano <gitster@pobox.com> a écrit :
> >> Even when you specify <start> or <end> as a line number, they must
> >> exist in the starting revision or it would be a fatal error.  If we
> >> are clarifying with this patch for completeness, I think we should
> >> also mention it together.  
> ...
> >     git log -L 300,2000000085:Documentation/git-log.txt 
> > errors out:
> >     fatal: file Documentation/git-log.txt has only 239 lines
> 
> This case seems important enough to include what it means to "match".
> 
> Specifically, the range must match at least one line in the file.
> 
> > But 
> >     git log -L 300,-2000000085:Documentation/git-log.txt
> > does not error out and shows the history for the whole file. Also,
> >     git log -L 1,-2000000085:Documentation/git-log.txt
> > does not error out and gives the history for the first line.
> 
> Negative numbers in the ranges are a bit strange to include, and the
> large magnitude you include here seems unnecessary for the test.
> However, it appears that we do store signed values in the line-log
> machinery:
> 
> /* A range [start,end].  Lines are numbered starting at 0, and the
>  * ranges include start but exclude end. */
> struct range {
> 	long start, end;
> };
> 
> Perhaps these should be replaced with an unsigned value instead,
> just to be clear that negative values do not make sense in this
> context?
> 
> Or rather, do we want to enforce start < end if they are to be
> valuable? We apparently do assert() this in
> range_set_check_invariants() for all that does.

We have different contexts for start,end here: the '-L <start>,<end>'
specified on the command line and the 'start' and 'end' fields of the
internal 'struct range' are not the same and different constraints
apply to them.

'-L X,+Y' and '-L X,-Y' are explicitly allowed, and mean that Y is
interpreted relative to X, the former meaning Y lines starting at X,
while the latter meaning Y lines ending at X.  These are interpreted
while parsing the '-L ...' options, during which we do pass around
negative values in (signed) long parameters, see
line-range.c:parse_loc().  (though, arguably, having unsigned types
and a separate flag parameter might make the code more clear)

The purpose of range_set_check_invariants() is different: as the
file's content changes over its history the internal line ranges must
be adjusted (grown, shrunk, shifted around or sometimes even merged)
accordingly, and this function makes sure that the resulting 'struct
range' instances after these adjustments still make sense, because the
rest of the line log machinery expects sorted and non-overlapping line
ranges.


> The current behavior of showing only the first line
> is a strange byproduct of allowing these odd ranges, but we may
> want to keep it for consistency. That would imply we allow
> start <= end, and auto-correct these negative values to have
> end = start. (This does not fix the 300,-1 case, but that
> is strange enough to be considered a bug, right?)
> 
> Thanks,
> -Stolee
>
Junio C Hamano Dec. 18, 2019, 5:59 p.m. UTC | #7
Derrick Stolee <stolee@gmail.com> writes:

> On 12/17/2019 10:28 PM, Philippe Blain wrote:
>> 
>>> Le 17 déc. 2019 à 13:16, Junio C Hamano <gitster@pobox.com> a écrit :
>>> Even when you specify <start> or <end> as a line number, they must
>>> exist in the starting revision or it would be a fatal error.  If we
>>> are clarifying with this patch for completeness, I think we should
>>> also mention it together.  
> ...
>>     git log -L 300,2000000085:Documentation/git-log.txt 
>> errors out:
>>     fatal: file Documentation/git-log.txt has only 239 lines
>
> This case seems important enough to include what it means to "match".

Yup, that is why I suggested not to focus too much on regexp and
"match"-ing, but express it in terms of the lines that are specified
by given arguments to "exist".  The word "match" gives an incorrect
connotation that the description in the paragraph that uses the verb
applies only to regexp.
Philippe Blain Dec. 26, 2019, 5:46 p.m. UTC | #8
> Le 18 déc. 2019 à 12:59, Junio C Hamano <gitster@pobox.com> a écrit :
> ...
> Yup, that is why I suggested not to focus too much on regexp and
> "match"-ing, but express it in terms of the lines that are specified
> by given arguments to "exist".  The word "match" gives an incorrect
> connotation that the description in the paragraph that uses the verb
> applies only to regexp.

I agree. Thanks for the suggestion. I changed it to « exist » in v2.
diff mbox series

Patch

diff --git a/Documentation/git-log.txt b/Documentation/git-log.txt
index 1c52bf184d..5c3fd39048 100644
--- a/Documentation/git-log.txt
+++ b/Documentation/git-log.txt
@@ -76,7 +76,8 @@  produced by `--stat`, etc.
 	(or the function name regex <funcname>) within the <file>.  You may
 	not give any pathspec limiters.  This is currently limited to
 	a walk starting from a single revision, i.e., you may only
-	give zero or one positive revision arguments.
+	give zero or one positive revision arguments, and any given regex for
+	<start> or <end> (or <funcname>) must match in the starting revision.
 	You can specify this option more than once. Implies `--patch`.
 	If ``:<funcname>'' is given, implies `--function-context`.
 	Patch output can be suppressed using `-s`, but other diff formats