Message ID | 20210613004434.10278-1-felipe.contreras@gmail.com (mailing list archive) |
---|---|
State | New, archived |
Series | doc: revisions: improve single range explanation |
Hi,

On 13/06/21 07.44, Felipe Contreras wrote:
> The original explanation didn't seem clear enough to some people.
>
> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> ---
>  Documentation/revisions.txt | 22 +++++++++++-----------
>  1 file changed, 11 insertions(+), 11 deletions(-)
>
> diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
> index f5f17b65a1..d8cf512686 100644
> --- a/Documentation/revisions.txt
> +++ b/Documentation/revisions.txt
> @@ -299,22 +299,22 @@ empty range that is both reachable and unreachable from HEAD.
>
>  Commands that are specifically designed to take two distinct ranges
>  (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
> -they are exceptions. Unless otherwise noted, all "git" commands
> +they are exceptions. Unless otherwise noted, all git commands
>  that operate on a set of commits work on a single revision range.
> -In other words, writing two "two-dot range notation" next to each
> -other, e.g.
>
> -    $ git log A..B C..D
> +For example, if you have a linear history like this:
>
> -does *not* specify two revision ranges for most commands. Instead
> -it will name a single connected set of commits, i.e. those that are
> -reachable from either B or D but are reachable from neither A or C.
> -In a linear history like this:
> +    ---A---B---C---D---E---F
>
> -    ---A---B---o---o---C---D
> +Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3
> +commits, but doing A..F B..E will not retrieve two revision ranges
> +totalling 8 commits. Instead the starting point A gets overriden by B,
> +and the ending point of E by F, effectively becoming B..F, a single
> +revision range.
>

AFAIK, A..F means all commits from A to F. But in case of branched
history like

    ---A---B---C---G---H---I <- main
                \
                 ---D---E---F <- mybranch

the notation main..mybranch means all commits that are reachable from
mybranch but not from main, but the opposite (mybranch..main) means the
opposite!

So basically the right-hand side of two dot notation specifies from what commit I want to select the range, and the left-hand side specifies the commit which I don't want to reach.
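
As a rough illustration of that reading of the two-dot notation, the branched history can be modeled in Python as a map from each commit to its parents. This is only a sketch of the reachability rule, not how git implements it; the `PARENTS` and `REFS` tables and the helper names are made up for the example, and it assumes that mybranch forks off main at C.

```python
# Toy model of the branched history: each commit maps to its parents.
# Assumption for illustration: mybranch forks off main at commit C.
PARENTS = {
    "A": [], "B": ["A"], "C": ["B"],
    "G": ["C"], "H": ["G"], "I": ["H"],   # main
    "D": ["C"], "E": ["D"], "F": ["E"],   # mybranch
}
REFS = {"main": "I", "mybranch": "F"}     # hypothetical branch tips


def reachable(tip):
    """Return the tip commit plus all of its ancestors."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def two_dot(left, right):
    """Model `left..right`: reachable from right, minus reachable from left."""
    return reachable(REFS[right]) - reachable(REFS[left])


main_to_mybranch = sorted(two_dot("main", "mybranch"))
mybranch_to_main = sorted(two_dot("mybranch", "main"))
print(main_to_mybranch)  # ['D', 'E', 'F']
print(mybranch_to_main)  # ['G', 'H', 'I']
```

Swapping the two sides swaps which branch contributes commits and which one only excludes them, which matches the "opposite" behavior described above.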
Bagas Sanjaya wrote: > On 13/06/21 07.44, Felipe Contreras wrote: > > The original explanation didn't seem clear enough to some people. > > > > Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> > > --- > > Documentation/revisions.txt | 22 +++++++++++----------- > > 1 file changed, 11 insertions(+), 11 deletions(-) > > > > diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt > > index f5f17b65a1..d8cf512686 100644 > > --- a/Documentation/revisions.txt > > +++ b/Documentation/revisions.txt > > @@ -299,22 +299,22 @@ empty range that is both reachable and unreachable from HEAD. > > > > Commands that are specifically designed to take two distinct ranges > > (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but > > -they are exceptions. Unless otherwise noted, all "git" commands > > +they are exceptions. Unless otherwise noted, all git commands > > that operate on a set of commits work on a single revision range. > > -In other words, writing two "two-dot range notation" next to each > > -other, e.g. > > > > - $ git log A..B C..D > > +For example, if you have a linear history like this: > > > > -does *not* specify two revision ranges for most commands. Instead > > -it will name a single connected set of commits, i.e. those that are > > -reachable from either B or D but are reachable from neither A or C. > > -In a linear history like this: > > + ---A---B---C---D---E---F > > > > - ---A---B---o---o---C---D > > +Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3 > > +commits, but doing A..F B..E will not retrieve two revision ranges > > +totalling 8 commits. Instead the starting point A gets overriden by B, > > +and the ending point of E by F, effectively becoming B..F, a single > > +revision range. > > AFAIK, A..F means all commits from A to F. 
> But in case of branched history like
>
>     ---A---B---C---G---H---I <- main
>                 \
>                  ---D---E---F <- mybranch
>
> the notation main..mybranch means all commits that are reachable from
> mybranch but not from main, but the opposite (mybranch..main) means the
> opposite!
>
> So basically the right-hand side of two dot notation specifies from what
> commit I want to select the range, and the left-hand side specifies the
> commit which I don't want to reach.

Yes, `A..F` is the same as `^A F`.
On Sat, Jun 12, 2021 at 8:44 PM Felipe Contreras <felipe.contreras@gmail.com> wrote:
> The original explanation didn't seem clear enough to some people.
>
> Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
> ---
> diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
> @@ -299,22 +299,22 @@ empty range that is both reachable and unreachable from HEAD.
> +For example, if you have a linear history like this:
>
> +    ---A---B---C---D---E---F
>
> +Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3
> +commits, but doing A..F B..E will not retrieve two revision ranges
> +totalling 8 commits. Instead the starting point A gets overriden by B,
> +and the ending point of E by F, effectively becoming B..F, a single
> +revision range.

s/overriden/overridden/

For what it's worth, as a person who is far from expert at revision
ranges, I had to read this revised text five or six times and think
about it quite a bit to understand what it is saying, whereas with
Junio's original[1], I understood it on the first read with only a
little thought.

Also, if this explanation is aimed at newcomers, then saying only
"doing A..F will retrieve 5 commits" without actually saying _which_
commits those are is perhaps not so helpful. A newcomer might be
helped more by enumerating the precise commits:

    The range A..F represents five commits B, C, D, E, F, and the
    range B..E represents three commits C, D, E, ...

[1]: https://lore.kernel.org/git/xmqqv97g2svd.fsf@gitster.g/
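
The enumeration suggested here can be checked with a small Python model of the linear history from the patch. This is a hedged sketch under the "reachable from the right side, minus reachable from the left side" rule; `HISTORY`, `reachable` and `rev_list` are names invented for the example and only handle two-dot arguments.

```python
HISTORY = "ABCDEF"  # the linear history from the patch, oldest commit first


def reachable(commit):
    """A commit plus all of its ancestors in the linear history."""
    return set(HISTORY[: HISTORY.index(commit) + 1])


def rev_list(*ranges):
    """Model several `X..Y` arguments given together as ONE revision range."""
    include, exclude = set(), set()
    for spec in ranges:
        left, right = spec.split("..")
        exclude |= reachable(left)    # `X..Y` contributes `^X` ...
        include |= reachable(right)   # ... and `Y`
    return sorted(include - exclude)


a_f = rev_list("A..F")
b_e = rev_list("B..E")
both = rev_list("A..F", "B..E")
print(a_f)   # ['B', 'C', 'D', 'E', 'F']
print(b_e)   # ['C', 'D', 'E']
print(both)  # ['C', 'D', 'E', 'F']
```

In this model the combined arguments select the same four commits as `B..F` alone, rather than five plus three.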
Eric Sunshine wrote: > On Sat, Jun 12, 2021 at 8:44 PM Felipe Contreras > <felipe.contreras@gmail.com> wrote: > > The original explanation didn't seem clear enough to some people. > > > > Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> > > --- > > diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt > > @@ -299,22 +299,22 @@ empty range that is both reachable and unreachable from HEAD. > > +For example, if you have a linear history like this: > > > > + ---A---B---C---D---E---F > > > > +Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3 > > +commits, but doing A..F B..E will not retrieve two revision ranges > > +totalling 8 commits. Instead the starting point A gets overriden by B, > > +and the ending point of E by F, effectively becoming B..F, a single > > +revision range. > > s/overriden/overridden/ > > For what it's worth, as a person who is far from expert at revision > ranges, I had to read this revised text five or six times and think > about it quite a bit to understand what it is saying, Can you explain why? This is the context: commands don't generally take two ranges: 1. Unless otherwise noted, all git commands that operate on a set of commits work on a single revision range. 2. Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3 commits, but doing A..F B..E will not retrieve two revision ranges totalling 8 commits. At this point what isn't clear? Isn't it clear that `A..F B..E` aren't two revision ranges? 3. Instead the starting point A gets overridden by B, and the ending point of E by F, effectively becoming B..F, a single revision range. What isn't clear about that? A gets superseded by B because it's higher in the graph. And if you do `git log D E F` it's clear that doing `git log F` will get you the same thing, isn't it? 
> Also, if this explanation is aimed at newcomers, then saying only
> "doing A..F will retrieve 5 commits" without actually saying _which_
> commits those are is perhaps not so helpful.

It doesn't matter which specific commits are retrieved; the only thing
that matters is that `X op Y` is not additive.
On Sat, Jun 12, 2021 at 9:25 PM Felipe Contreras <felipe.contreras@gmail.com> wrote: > > Eric Sunshine wrote: > > On Sat, Jun 12, 2021 at 8:44 PM Felipe Contreras > > <felipe.contreras@gmail.com> wrote: > > > The original explanation didn't seem clear enough to some people. > > > > > > Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> > > > --- > > > diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt > > > @@ -299,22 +299,22 @@ empty range that is both reachable and unreachable from HEAD. > > > +For example, if you have a linear history like this: > > > > > > + ---A---B---C---D---E---F > > > > > > +Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3 > > > +commits, but doing A..F B..E will not retrieve two revision ranges > > > +totalling 8 commits. Instead the starting point A gets overriden by B, > > > +and the ending point of E by F, effectively becoming B..F, a single > > > +revision range. > > > > s/overriden/overridden/ > > > > For what it's worth, as a person who is far from expert at revision > > ranges, I had to read this revised text five or six times and think > > about it quite a bit to understand what it is saying, > > Can you explain why? I tend to agree with Eric. I think the example you chose is likely to be misinterpreted and your wording magnifies it. A..F B..E simplifies to B..F which is *almost* the union of A..F and B..E, it's only missing A. Off-by-one errors are easy to miss. You make it more likely that they'll miss it, because there are only 6 commits total in the union, and you are trying to explain why listing A..F B..E while not be 8 commits, which readers can easily respond with, "Well, of course it's not 8 commits. There's only 6. When you do the union operation, of course the duplicates go away", and miss the actual point that A got excluded. Junio's wording and example just seemed better to me here. > > This is the context: commands don't generally take two ranges: > > 1. 
Unless otherwise noted, all git commands that operate on a set of > commits work on a single revision range. > > 2. Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3 > commits, but doing A..F B..E will not retrieve two revision ranges > totalling 8 commits. > > At this point what isn't clear? Isn't it clear that `A..F B..E` aren't > two revision ranges? > > 3. Instead the starting point A gets overridden by B, and the ending > point of E by F, effectively becoming B..F, a single revision range. > > What isn't clear about that? A gets superseded by B because it's higher > in the graph. And if you do `git log D E F` it's clear that doing > `git log F` will get you the same thing, isn't it? > > > Also, if this explanation is aimed at newcomers, then saying only > > "doing A..F will retrieve 5 commits" without actually saying _which_ > > commits those are is perhaps not so helpful. > > It doesn't matter which specific commits are retrieved, the only thing > that matters is that `X op Y` is not additive. > > -- > Felipe Contreras
On Sun, Jun 13, 2021 at 12:26 AM Felipe Contreras <felipe.contreras@gmail.com> wrote: > Eric Sunshine wrote: > > For what it's worth, as a person who is far from expert at revision > > ranges, I had to read this revised text five or six times and think > > about it quite a bit to understand what it is saying, > > Can you explain why? Probably not to a degree which will satisfy you. And I'm not being flippant by saying that. I mean only that it is more than a little difficult to explain why one thing "clicks" easily in the brain while something else doesn't. I can only relate (to some extent) what I experienced while reading your revised text. > This is the context: commands don't generally take two ranges: > > 1. Unless otherwise noted, all git commands that operate on a set of > commits work on a single revision range. > > 2. Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3 > commits, but doing A..F B..E will not retrieve two revision ranges > totalling 8 commits. > > At this point what isn't clear? Isn't it clear that `A..F B..E` aren't > two revision ranges? The documentation stating explicitly that `A..F B..E` is not two ranges is fine. What was difficult to understand was your explanation of _why_ those are not two ranges. In contrast, I had no difficulty understanding Junio's explanation of why that is not two ranges. > 3. Instead the starting point A gets overridden by B, and the ending > point of E by F, effectively becoming B..F, a single revision range. > > What isn't clear about that? A gets superseded by B because it's higher > in the graph. And if you do `git log D E F` it's clear that doing > `git log F` will get you the same thing, isn't it? One of the reasons I had to re-read your text so many times was because it was difficult to build a mental model of what you were saying, and to follow along with all the "this replaces that" and "this other thing replaces that other thing". 
While doing so, I repeatedly had to glance back at the original `A..F B..E` to make sure the mental model I was building was correct or still made sense. The word "overridden" didn't help because I couldn't tell if, by "overridden", you meant that something got replaced by something else or if something was merely ignored. (Or maybe those are the same thing in this case, but how will a newcomer -- who is trying to learn this from scratch -- know which it is?) However, an even bigger problem I experienced while reading your revised text is that it felt like it was trying to express some rule which the reader should internalize ("replace this with that, and replace this other thing too") with no proper explanation of _why_ the rule works that way. Worse, the rule (whatever it is) never actually materialized or solidified in a way which I could understand and thus apply to in other situations. Junio's explanation, on the other hand, was simple and to the point, and (for whatever reason) clicked easily in my brain, such that I came away feeling that I could apply the knowledge immediately to other situations. On the other hand, after reading your proposed text, I did not feel as if I had gained any knowledge, and even had I picked up the rule which seems to be in there, I likely still wouldn't have understood _why_ that rule works or is needed; it would just have been some black box. > > Also, if this explanation is aimed at newcomers, then saying only > > "doing A..F will retrieve 5 commits" without actually saying _which_ > > commits those are is perhaps not so helpful. > > It doesn't matter which specific commits are retrieved, the only thing > that matters is that `X op Y` is not additive. The very first question which popped into my head upon reading "Doing A..F will retrieve 5 commits" was "which five commits?". Not being told the answer by the text did not help me feel confident that I knew the correct five commits. 
Had the text stated explicitly "the five commits B, C, D, E, F", then there would be no question and no feeling of uncertainty about it. So, whatever precision your above statement might have, it is likely to be lost on the general newcomer who is simply trying to learn about and understand Git revisions.
Eric Sunshine wrote: > On Sun, Jun 13, 2021 at 12:26 AM Felipe Contreras > <felipe.contreras@gmail.com> wrote: > > Eric Sunshine wrote: > > > For what it's worth, as a person who is far from expert at revision > > > ranges, I had to read this revised text five or six times and think > > > about it quite a bit to understand what it is saying, > > > > Can you explain why? > > Probably not to a degree which will satisfy you. And I'm not being > flippant by saying that. I mean only that it is more than a little > difficult to explain why one thing "clicks" easily in the brain while > something else doesn't. I can only relate (to some extent) what I > experienced while reading your revised text. Yes, but the documentation is not for you, it's for the majority of users, so it behooves to try to understand the reason to see if it applies to the population in general. > > This is the context: commands don't generally take two ranges: > > > > 1. Unless otherwise noted, all git commands that operate on a set of > > commits work on a single revision range. > > > > 2. Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3 > > commits, but doing A..F B..E will not retrieve two revision ranges > > totalling 8 commits. > > > > At this point what isn't clear? Isn't it clear that `A..F B..E` aren't > > two revision ranges? > > The documentation stating explicitly that `A..F B..E` is not two > ranges is fine. What was difficult to understand was your explanation > of _why_ those are not two ranges. At this point the _why_ has not been explained, merely that these two things don't result in two ranges. > > 3. Instead the starting point A gets overridden by B, and the ending > > point of E by F, effectively becoming B..F, a single revision range. > > > > What isn't clear about that? A gets superseded by B because it's higher > > in the graph. And if you do `git log D E F` it's clear that doing > > `git log F` will get you the same thing, isn't it? 
> > One of the reasons I had to re-read your text so many times was > because it was difficult to build a mental model of what you were > saying, and to follow along with all the "this replaces that" and > "this other thing replaces that other thing". While doing so, I > repeatedly had to glance back at the original `A..F B..E` to make sure > the mental model I was building was correct or still made sense. I wonder why that is the case. A..F is so simple it doesn't have to be explained, Ruby even expands that obvious range. ---A---B---C---D---E---F ^ ^ from to And B..E: ---A---B---C---D---E---F ^ ^ from to In Ruby the range can be defined simply as: 'A'..'F' ["A", "B", "C", "D", "E", "F"] Would 1..6 be easier to picture? > The word "overridden" didn't help because I couldn't tell if, by > "overridden", you meant that something got replaced by something else > or if something was merely ignored. (Or maybe those are the same thing > in this case, but how will a newcomer -- who is trying to learn this > from scratch -- know which it is?) If I say Lucy is available from 1 to 6 p.m. and Michael from 2 to 5 p.m. why would 2 p.m supersede 1 p.m.? If we are trying to define a starting point, obviously the latest starting point is the one that wins. No? > However, an even bigger problem I experienced while reading your > revised text is that it felt like it was trying to express some rule > which the reader should internalize ("replace this with that, and > replace this other thing too") The text starts with *for example*. Therefore it's not something general, it's an example. > Junio's explanation, on the other hand, was simple and to the point, > and (for whatever reason) clicked easily in my brain, such that I came > away feeling that I could apply the knowledge immediately to other > situations. Junio's explanation is inaccurate because it stated that this: Unless otherwise noted, all git commands that operate on a set of commits work on a single revision range. 
Is the same as this: writing two "two-dot range notation" next to each does *not* specify two revision ranges for most commands. But it is not the same. Can you tell me why? > On the other hand, after reading your proposed text, I did not feel as > if I had gained any knowledge, and even had I picked up the rule which > seems to be in there, The text never mentioned any rule. > > > Also, if this explanation is aimed at newcomers, then saying only > > > "doing A..F will retrieve 5 commits" without actually saying _which_ > > > commits those are is perhaps not so helpful. > > > > It doesn't matter which specific commits are retrieved, the only thing > > that matters is that `X op Y` is not additive. > > The very first question which popped into my head upon reading "Doing > A..F will retrieve 5 commits" was "which five commits?". Keep reading. > So, whatever precision your above statement might have, it is likely > to be lost on the general newcomer who is simply trying to learn about > and understand Git revisions. Or maybe it's something that only applies to you. Cheers.
Elijah Newren wrote: > On Sat, Jun 12, 2021 at 9:25 PM Felipe Contreras > <felipe.contreras@gmail.com> wrote: > > > > Eric Sunshine wrote: > > > On Sat, Jun 12, 2021 at 8:44 PM Felipe Contreras > > > <felipe.contreras@gmail.com> wrote: > > > > The original explanation didn't seem clear enough to some people. > > > > > > > > Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> > > > > --- > > > > diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt > > > > @@ -299,22 +299,22 @@ empty range that is both reachable and unreachable from HEAD. > > > > +For example, if you have a linear history like this: > > > > > > > > + ---A---B---C---D---E---F > > > > > > > > +Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3 > > > > +commits, but doing A..F B..E will not retrieve two revision ranges > > > > +totalling 8 commits. Instead the starting point A gets overriden by B, > > > > +and the ending point of E by F, effectively becoming B..F, a single > > > > +revision range. > > > > > > s/overriden/overridden/ > > > > > > For what it's worth, as a person who is far from expert at revision > > > ranges, I had to read this revised text five or six times and think > > > about it quite a bit to understand what it is saying, > > > > Can you explain why? > > I tend to agree with Eric. I think the example you chose is likely to > be misinterpreted and your wording magnifies it. A..F B..E simplifies > to B..F which is *almost* the union of A..F and B..E, it's only > missing A. Off-by-one errors are easy to miss. Yes, but right before it's explained that the ending point is F. Not E, F. > You make it more likely that they'll miss it, because there are only 6 > commits total in the union, and you are trying to explain why listing > A..F B..E while not be 8 commits, which readers can easily respond > with, "Well, of course it's not 8 commits. There's only 6. 
If the reader understands that no more than 6 commits can be returned,
then the reader has understood the point that the operation is not
addition.

> When you do the union operation, of course the duplicates go away",
> and miss the actual point that A got excluded.

But that is not the point. This is the point:

  Unless otherwise noted, all git commands that operate on a set of
  commits work on a single revision range.

You are missing the forest for the trees.

In the context of gitrevisions(7) the user has just been told that:

 1. We are trying to specify a graph of commits reachable from a
    commit, or commits.

    The user was shown this graph:

      G   H   I   J
       \ /     \ /
        D   E   F
         \  |  / \
          \ | /   |
           \|/    |
            B     C
             \   /
              \ /
               A

    And that B is A^, therefore doing `git log A B` is redundant, as is
    doing `git log A B D`.

 2. The caret notation `^r1 r2` means commits reachable from r2, but
    exclude commits reachable from r1 (r1 and its ancestors)

    That means '^D A' will exclude D G and H.

 3. The two-dot range notation `r1..r2` is the same as `^r1 r2`

Now, with this context in mind, we are trying to hedge the corner-case
of `r1..r2 r3..r4` in other words: `^r1 r2 ^r3 r4`.

The user has been told already that C..A is the same as `^C A` (I'm
changing the order to be consistent with the graph above). And to make
my point clear I actually don't need two starting points.

So how about this:

  Commands that are specifically designed to take two distinct ranges
  (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
  they are exceptions. Unless otherwise noted, all git commands
  that operate on a set of commits work on a single revision range. Just
  like 'A A' coalesces to 'A', 'B..A C..A' is the same as the
  single revision range '^B ^C A'.
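
The three points above can be tried out on a small Python model of the gitrevisions(7) graph. This is a sketch under the stated reachability rules, not git's actual implementation; the `PARENTS` table is transcribed from the graph, and `reachable` and `rev_list` are illustrative helpers that only understand plain names, `^name`, and two-dot ranges.

```python
# Parents of each commit in the gitrevisions(7) example graph (A is the tip).
PARENTS = {
    "A": ["B", "C"], "B": ["D", "E", "F"], "C": ["F"],
    "D": ["G", "H"], "F": ["I", "J"],
    "E": [], "G": [], "H": [], "I": [], "J": [],
}


def reachable(tip):
    """Return the tip commit plus all of its ancestors."""
    seen, stack = set(), [tip]
    while stack:
        commit = stack.pop()
        if commit not in seen:
            seen.add(commit)
            stack.extend(PARENTS[commit])
    return seen


def rev_list(*args):
    """Model a single revision range built from `r`, `^r` and `r1..r2`."""
    include, exclude = set(), set()
    for arg in args:
        if ".." in arg:               # `r1..r2` is shorthand for `^r1 r2`
            left, right = arg.split("..")
            exclude |= reachable(left)
            include |= reachable(right)
        elif arg.startswith("^"):     # `^r` excludes r and its ancestors
            exclude |= reachable(arg[1:])
        else:                         # plain `r` includes r and its ancestors
            include |= reachable(arg)
    return include - exclude


print(sorted(rev_list("^D", "A")))         # ['A', 'B', 'C', 'E', 'F', 'I', 'J']
print(sorted(rev_list("B..A", "C..A")))    # ['A']
print(sorted(rev_list("^B", "^C", "A")))   # ['A']
```

In this model `rev_list("A", "B")` equals `rev_list("A")`, which is the redundancy mentioned in point 1, and the two spellings of the proposed example select the same single commit.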
On Sun, Jun 13, 2021 at 10:09 AM Felipe Contreras <felipe.contreras@gmail.com> wrote: > > Elijah Newren wrote: > > On Sat, Jun 12, 2021 at 9:25 PM Felipe Contreras > > <felipe.contreras@gmail.com> wrote: > > > > > > Eric Sunshine wrote: > > > > On Sat, Jun 12, 2021 at 8:44 PM Felipe Contreras > > > > <felipe.contreras@gmail.com> wrote: > > > > > The original explanation didn't seem clear enough to some people. > > > > > > > > > > Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com> > > > > > --- > > > > > diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt > > > > > @@ -299,22 +299,22 @@ empty range that is both reachable and unreachable from HEAD. > > > > > +For example, if you have a linear history like this: > > > > > > > > > > + ---A---B---C---D---E---F > > > > > > > > > > +Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3 > > > > > +commits, but doing A..F B..E will not retrieve two revision ranges > > > > > +totalling 8 commits. Instead the starting point A gets overriden by B, > > > > > +and the ending point of E by F, effectively becoming B..F, a single > > > > > +revision range. > > > > > > > > s/overriden/overridden/ > > > > > > > > For what it's worth, as a person who is far from expert at revision > > > > ranges, I had to read this revised text five or six times and think > > > > about it quite a bit to understand what it is saying, > > > > > > Can you explain why? > > > > I tend to agree with Eric. I think the example you chose is likely to > > be misinterpreted and your wording magnifies it. A..F B..E simplifies > > to B..F which is *almost* the union of A..F and B..E, it's only > > missing A. Off-by-one errors are easy to miss. > > Yes, but right before it's explained that the ending point is F. > Not E, F. I think this is somewhat of a useless distinction -- not for the end result, but in terms of helping users understand. 
We started adding an explanation to the manual because users misunderstand how "start1..end1 start2..end2" is treated and we want to correct their misunderstandings. In that context, the only misunderstanding I can think of that is dispelled by specifying F is the endpoint would be "two ranges are intersected to get the range of commits that log will operate on". I've never seen users assume that or make such a mistake. I've always seen them assume that the "two ranges are combined with a union". In that case, F matches their misunderstanding, so this part of the explanation does nothing to help correct their assumptions. The only place their misunderstanding disagrees with the correct answer for your example is on the other side of those ranges. They would have gotten an incorrect answer of "A..F B..E" == "A..F" , whereas the correct answer is "B..F". That's an off-by-one error, but I think they're likely to miss it. Especially given that folks already mess up the left hand side of single "FOO..BAR" expressions with off-by-one errors. > > You make it more likely that they'll miss it, because there are only 6 > > commits total in the union, and you are trying to explain why listing > > A..F B..E while not be 8 commits, which readers can easily respond > > with, "Well, of course it's not 8 commits. There's only 6. > > If the reader understands that no more than 6 commits can be returned, > then the reader has understood the point that the operation is not > addition. Who in the world ever assumes that "two dotted ranges are combined via list addition"? I've only ever come across users assuming the operation is a union (or, equivalently, addition on sets). I don't understand why you even try to make that point, and think it's a distraction that does more harm than good. > > When you do the union operation, of course the duplicates go away", > > and miss the actual point that A got excluded. > > But that is not the point. 
This is the point: > > Unless otherwise noted, all git commands that operate on a set of > commits work on a single revision range. > > You are missing the forest for the trees. I think you are missing the boat. That sentence on its own is completely insufficient to dispel the misunderstanding. All that sentence says to users is that if they specify what they think of as "two ranges" that we'll somehow treat it as one; but since users are prone to think that "revision range" is interchangeable with "set of revisions" (especially since we defined A..B elsewhere in set operations), this will merely make them think in terms of what set operation they need to perform on the "two ranges" to get the set of commits the operation will function on. Most users I've seen simply do that via applying a simple operation to combine two ranges into one. Everyone I've ever run across that misunderstands this "two range" thing, does so in the same way: by assuming that the two ranges are combined via a union to get an interesting set of commits. The example you provide should attempt to help explain why that mental model is mistaken and provide them with a corrected one. Your response to Eric suggests you're not even trying to provide a corrected mental model, and your response here suggests you are trying to only correct mistakes of the form "take two revision ranges and add them keeping duplicates" and "take two revision ranges and intersect them", neither of which I've observed in the wild. > In the context of gitrevisions(7) the user has just been told that: > > 1. We are trying to specify a graph of commits reachable from a > commit, or commits. > > The user was shown this graph: > > G H I J > \ / \ / > D E F > \ | / \ > \ | / | > \|/ | > B C > \ / > \ / > A > > And that B is A^, therefore doing `git log A B` is redundant, as is > doing `git log A B D`. > > 2. 
The caret notation `^r1 r2` means commits reachable from r2, but > exclude commits reachable from r1 (r1 and it's ancestors) > > That means '^D A' will exclude D G and H. > > 3. The two-dot range notation `r1..r2` is the same as `^r1 r2` > > > Now, whith this context in mind, we are trying to hedge the corner-case > of `r1..r2 r3..r4` in other words: `^r1 r2 ^r3 r4`. > > The user has been told already that C..A is the same as `^C A` (I'm > changing the order to be consistent with the graph above). And to make > my point clear I actually don't need two starting points. > > So how about this: > > Commands that are specifically designed to take two distinct ranges > (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but > they are exceptions. Unless otherwise noted, all git commands > that operate on a set of commits work on a single revision range. Just > like 'A A' coalesces to 'A', 'B..A C..A' is the same as the > single revision range '^B ^C A'. Your example here almost seems to suggest that we do an intersection of the "two ranges" to get the answer. It's certainly not your intent, but I think the users I've helped would be prone to read it that way due to your focus on coalescing, and due to your selection of an example which happens to give the correct answer when using the intersection misinterpretation. I would be much happier with something like this: """ Note: There is no shorthand for getting a union or intersection of multiple dotted ranges. Commands that are specifically designed to take two distinct ranges (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but they are exceptions. Unless otherwise noted, all git commands that operate on a set of commits work on a single revision range. Thus, just as "A..B" translates to "^A B", the expression "A..B C..D" translates to "^A B ^C D", i.e. all commits reachable from either B or D, as long as they are not reachable from either A or C. 
This is quite different from what you would get by trying to do either
an intersection or union of the two separate ranges A..B and C..D.
Compare the differences on the following simple linear history:

    ---A---B---C---D---E---F---G---H

The command

    $ git log A..E C..H

would be the same as

    $ git log C..H

(since E is reachable from H, and A is reachable from C). In
contrast, the union of A..E and C..H would be A..H, while the
intersection would be C..E.
"""
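
The comparison in the proposed text can be reproduced with a short Python model of that eight-commit linear history. It is a sketch only: `HISTORY`, `reachable` and `rev_list` are hypothetical helpers, and the model assumes nothing beyond the rule that `X..Y` means `^X Y`.

```python
HISTORY = "ABCDEFGH"  # linear history from the example, oldest commit first


def reachable(commit):
    """A commit plus all of its ancestors in the linear history."""
    return set(HISTORY[: HISTORY.index(commit) + 1])


def rev_list(*ranges):
    """Model several `X..Y` arguments given together as ONE revision range."""
    include, exclude = set(), set()
    for spec in ranges:
        left, right = spec.split("..")
        exclude |= reachable(left)
        include |= reachable(right)
    return include - exclude


actual = rev_list("A..E", "C..H")                    # what `git log A..E C..H` selects
union = rev_list("A..E") | rev_list("C..H")          # the common misreading
intersection = rev_list("A..E") & rev_list("C..H")   # the other misreading

print(sorted(actual))        # ['D', 'E', 'F', 'G', 'H']
print(sorted(union))         # ['B', 'C', 'D', 'E', 'F', 'G', 'H']
print(sorted(intersection))  # ['D', 'E']
```

The three results differ from one another, and they line up with `C..H`, `A..H` and `C..E` respectively, as the proposed text states.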
Elijah Newren wrote:
> On Sun, Jun 13, 2021 at 10:09 AM Felipe Contreras
> <felipe.contreras@gmail.com> wrote:
> > Elijah Newren wrote:
> > > I tend to agree with Eric. I think the example you chose is likely to
> > > be misinterpreted and your wording magnifies it. A..F B..E simplifies
> > > to B..F which is *almost* the union of A..F and B..E, it's only
> > > missing A. Off-by-one errors are easy to miss.
> >
> > Yes, but right before it's explained that the ending point is F.
> > Not E, F.
>
> I think this is somewhat of a useless distinction -- not for the end
> result, but in terms of helping users understand. We started adding
> an explanation to the manual because users misunderstand how
> "start1..end1 start2..end2" is treated and we want to correct their
> misunderstandings. In that context, the only misunderstanding I can
> think of that is dispelled by specifying F is the endpoint would be
> "two ranges are intersected to get the range of commits that log will
> operate on". I've never seen users assume that or make such a
> mistake. I've always seen them assume that the "two ranges are
> combined with a union".

Then that warrants yet another paragraph, because this one is for:

  Commands that are specifically designed to take two distinct ranges
  (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
  they are exceptions.

Probably outside the section of Dotted Range Notations, because if the
user is confused about what 'C B A A' should do, that has nothing to do
with these dotted ranges.

Maybe after the user has been told that:

  Specifying several revisions means the set of commits reachable from
  any of the given commits.

  A commit's reachable set is the commit itself and the commits in its
  ancestry chain.
> > > You make it more likely that they'll miss it, because there are only 6
> > > commits total in the union, and you are trying to explain why listing
> > > A..F B..E will not be 8 commits, which readers can easily respond
> > > with, "Well, of course it's not 8 commits. There's only 6.
> >
> > If the reader understands that no more than 6 commits can be returned,
> > then the reader has understood the point that the operation is not
> > addition.
>
> Who in the world ever assumes that "two dotted ranges are combined via
> list addition"?

I don't know, but that is the paragraph we are on:

  Commands that are specifically designed to take two distinct ranges
  (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
  they are exceptions.

If you are arguing for the removal of this entire paragraph and its
examples, I'd be fine with that.

> I've only ever come across users assuming the
> operation is a union (or, equivalently, addition on sets). I don't
> understand why you even try to make that point, and think it's a
> distraction that does more harm than good.

If you think it's impossible for the user to assume two dotted ranges
means addition, please explain what is the point of this sentence:

  Unless otherwise noted, all "git" commands that operate on a set of
  commits work on a single revision range.

> > > When you do the union operation, of course the duplicates go away",
> > > and miss the actual point that A got excluded.
> >
> > But that is not the point. This is the point:
> >
> >   Unless otherwise noted, all git commands that operate on a set of
> >   commits work on a single revision range.
> >
> > You are missing the forest for the trees.
>
> I think you are missing the boat.
>
> That sentence on its own is completely insufficient to dispel the
> misunderstanding.

One misunderstanding, perhaps, not the one we are trying to tackle
here.
> All that sentence says to users is that if they specify what they
> think of as "two ranges" that we'll somehow treat it as one;

Didn't you just say the user would never think it's actually two
ranges? What's the point in saying that if the user already knows it?

> but since users are prone to think that "revision range" is
> interchangeable with "set of revisions" (especially since we defined
> A..B elsewhere in set operations), this will merely make them think in
> terms of what set operation they need to perform on the "two ranges"
> to get the set of commits the operation will function on.

That belongs in a separate paragraph.

> The example you provide should attempt to help explain why that mental
> model is mistaken and provide them with a corrected one. Your
> response to Eric suggests you're not even trying to provide a
> corrected mental model, and your response here suggests you are trying
> to only correct mistakes of the form "take two revision ranges and add
> them keeping duplicates" and "take two revision ranges and intersect
> them", neither of which I've observed in the wild.

I'm providing an example for the paragraph that is already written. If
you want me to rewrite the entire section I can certainly give it a
try.

> Commands that are specifically designed to take two distinct ranges
> (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but they
> are exceptions. Unless otherwise noted, all git commands that operate
> on a set of commits work on a single revision range.

Isn't this obvious for all users?

> Thus, just as "A..B" translates to "^A B", the expression "A..B C..D"
> translates to "^A B ^C D", i.e. all commits reachable from either B or
> D, as long as they are not reachable from either A or C.
How about we remove the entire paragraph and replace it with:

  When specifying two ranges, such as 'A..B C..D', the way this is
  interpreted is as a single range '^A B ^C D', that is: all commits
  reachable from either B or D, as long as they are not reachable from
  either A or C. Assuming a linear history, B would be reachable from
  C, so this is the same as '^C D'.
diff --git a/Documentation/revisions.txt b/Documentation/revisions.txt
index f5f17b65a1..d8cf512686 100644
--- a/Documentation/revisions.txt
+++ b/Documentation/revisions.txt
@@ -299,22 +299,22 @@ empty range that is both reachable and unreachable from HEAD.

 Commands that are specifically designed to take two distinct ranges
 (e.g. "git range-diff R1 R2" to compare two ranges) do exist, but
-they are exceptions. Unless otherwise noted, all "git" commands
+they are exceptions. Unless otherwise noted, all git commands
 that operate on a set of commits work on a single revision range.
-In other words, writing two "two-dot range notation" next to each
-other, e.g.

-    $ git log A..B C..D
+For example, if you have a linear history like this:

-does *not* specify two revision ranges for most commands. Instead
-it will name a single connected set of commits, i.e. those that are
-reachable from either B or D but are reachable from neither A or C.
-In a linear history like this:
+    ---A---B---C---D---E---F

-    ---A---B---o---o---C---D
+Doing A..F will retrieve 5 commits, and doing B..E will retrieve 3
+commits, but doing A..F B..E will not retrieve two revision ranges
+totalling 8 commits. Instead the starting point A gets overriden by B,
+and the ending point of E by F, effectively becoming B..F, a single
+revision range.

-because A and B are reachable from C, the revision range specified
-by these two dotted ranges is a single commit D.
+With more complex graphs the result is not so simple and might result in
+two disconnected sets of commits, but that is still considered a single
+revision range.

 Other <rev>{caret} Parent Shorthand Notations
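The counts claimed in the patch text (5 commits, 3 commits, and B..F
for the combination), as well as the remark about disconnected sets,
can be reproduced with a toy parent map. This is a sketch, not git
itself: the BRANCHED graph, the commit named G on it, and the helper
names reachable and revision_range are assumptions introduced only for
this illustration, and A is modelled as a root commit.

```python
# Hypothetical commit graphs: each commit maps to the list of its parents.
# LINEAR models ---A---B---C---D---E---F (with A assumed to be the root).
LINEAR = {"A": [], "B": ["A"], "C": ["B"], "D": ["C"], "E": ["D"], "F": ["E"]}

# BRANCHED is an assumed example graph, not taken from the patch:
#   A---B---C---G
#        \
#         D---E---F
BRANCHED = {"A": [], "B": ["A"], "C": ["B"], "G": ["C"],
            "D": ["B"], "E": ["D"], "F": ["E"]}


def reachable(parents, commit):
    """Return the commit plus every ancestor, walking parent links."""
    seen, stack = set(), [commit]
    while stack:
        current = stack.pop()
        if current not in seen:
            seen.add(current)
            stack.extend(parents[current])
    return seen


def revision_range(parents, *dotted_ranges):
    """Model 'a..b c..d' as the single range '^a b ^c d'."""
    include, exclude = set(), set()
    for dotted in dotted_ranges:
        start, end = dotted.split("..")
        exclude |= reachable(parents, start)
        include |= reachable(parents, end)
    return include - exclude


five = revision_range(LINEAR, "A..F")               # B C D E F
three = revision_range(LINEAR, "B..E")              # C D E
combined = revision_range(LINEAR, "A..F", "B..E")   # C D E F, same as B..F
split = revision_range(BRANCHED, "B..G", "E..F")    # C G and F

print(len(five), len(three), sorted(combined))  # 5 3 ['C', 'D', 'E', 'F']
print(sorted(split))                            # ['C', 'F', 'G']
```

In the branched case F ends up in the result without its parent E, so
the result forms two disconnected groups (C and G on one side, F on the
other) while still being one revision range in the sense of the patch.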
The original explanation didn't seem clear enough to some people.

Signed-off-by: Felipe Contreras <felipe.contreras@gmail.com>
---
 Documentation/revisions.txt | 22 +++++++++++-----------
 1 file changed, 11 insertions(+), 11 deletions(-)