Add 'preserve' subcommand to 'git stash'

Message ID	pull.1528.git.git.1686913210137.gitgitgadget@gmail.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <git-owner@vger.kernel.org> Message-Id: <pull.1528.git.git.1686913210137.gitgitgadget@gmail.com> Date: Fri, 16 Jun 2023 11:00:09 +0000 Subject: [PATCH] Add 'preserve' subcommand to 'git stash' Fcc: Sent Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit MIME-Version: 1.0 To: git@vger.kernel.org Cc: Nadav Goldstein <nadav.goldstein96@gmail.com>, Nadav Goldstein <nadav.goldstein@blazepod.co> Precedence: bulk From: Nadav Goldstein <nadav.goldstein@blazepod.co>
Series	Add 'preserve' subcommand to 'git stash' \| expand Add 'preserve' subcommand to 'git stash'

Nadav Goldstein June 16, 2023, 11 a.m. UTC

From: Nadav Goldstein <nadav.goldstein@blazepod.co>

In this patch, we introduce a new subcommand preserve to
git stash. The purpose of this subcommand is to save the
current changes into the stash and then immediately re-apply
those changes to the working directory.

Implementation-wise, this is achieved by adding a new branch
to the conditional in the cmd_stash function, where we check
if argv[0] is "preserve". If it is, we push_stash with the
new argument that we added to it preserve=1.
In all other cases we call push_stack/do_push_stack preserve=0

Signed-off-by: Nadav Goldstein <nadav.goldstein96@gmail.com>
---
    Add 'preserve' subcommand to 'git stash'
    
    In this patch, we introduce a new subcommand preserve to git stash. The
    purpose of this subcommand is to save the current changes into the stash
    and then immediately re-apply those changes to the working directory.
    
    Implementation-wise, this is achieved by adding a new branch to the
    conditional in the cmd_stash function, where we check if argv[0] is
    "preserve". If it is, we push_stash with the new argument that we added
    to it preserve=1. In all other cases we call push_stack/do_push_stack
    preserve=0
    
    If the community will approve, I will modify the patch to include help
    messages for the new subcommand

Published-As: https://github.com/gitgitgadget/git/releases/tag/pr-git-1528%2Fnadav96%2Fstash_preserve-v1
Fetch-It-Via: git fetch https://github.com/gitgitgadget/git pr-git-1528/nadav96/stash_preserve-v1
Pull-Request: https://github.com/git/git/pull/1528

 builtin/stash.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)


base-commit: d7d8841f67f29e6ecbad85a11805c907d0f00d5d

Junio C Hamano June 16, 2023, 4:42 p.m. UTC | #1

"Nadav Goldstein via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Nadav Goldstein <nadav.goldstein@blazepod.co>
>
> In this patch, we introduce a new subcommand preserve to
> git stash. The purpose of this subcommand is to save the
> current changes into the stash and then immediately re-apply
> those changes to the working directory.

Why a new subcommand, not a new option to "push"?  Adding a new
subcommand would mean it would be another unfamiliar thing users
need to learn, as opposed to a slight variation of what they are
already familiar with.

> Implementation-wise, this is achieved by adding a new branch
> to the conditional in the cmd_stash function, where we check
> if argv[0] is "preserve". If it is, we push_stash with the
> new argument that we added to it preserve=1.
> In all other cases we call push_stack/do_push_stack preserve=0

The proposed log message is lacking "why".  What problem does it
want to solve?  What are the things you cannot do without the
feature, and what does the new feature allow you to do that you
couldn't do before?  On the other hand, the implementation detail,
unless it is tricky to read from the patch, does not have to be
repeated here---please aim to make the patch and new code clear
enough not to require such commentary.

More on this in "Describe your changes well" section in the
Documentation/SubmittingPatches document.

> Signed-off-by: Nadav Goldstein <nadav.goldstein96@gmail.com>
> ---
>     Add 'preserve' subcommand to 'git stash'
>     
>     In this patch, we introduce a new subcommand preserve to git stash. The
>     purpose of this subcommand is to save the current changes into the stash
>     and then immediately re-apply those changes to the working directory.
>     
>     Implementation-wise, this is achieved by adding a new branch to the
>     conditional in the cmd_stash function, where we check if argv[0] is
>     "preserve". If it is, we push_stash with the new argument that we added
>     to it preserve=1. In all other cases we call push_stack/do_push_stack
>     preserve=0

Please do not repeat what you already said in your log message here.
This is a place to give additional message that would help only
during the review.

>     If the community will approve, I will modify the patch to include help
>     messages for the new subcommand

Please don't think this way.  If the new feature is not worth
completing to document and tests for your own use, it is not worth
community's time to review or "approve" it.  Instead, we try to send
a patch that is already perfected, with tests and docs, in order to
improve the chance reviewers will understand the new feature and its
motivation better when they review the patch.

>  static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int quiet,
> -			 int keep_index, int patch_mode, int include_untracked, int only_staged)
> +			 int keep_index, int patch_mode, int include_untracked, int only_staged, int preserve)
>  {
>  	int ret = 0;
>  	struct stash_info info = STASH_INFO_INIT;
> @@ -1643,7 +1643,7 @@ static int do_push_stash(const struct pathspec *ps, const char *stash_msg, int q
>  				ret = -1;
>  				goto done;
>  			}
> -		} else {
> +		} else if (!preserve) {
>  			struct child_process cp = CHILD_PROCESS_INIT;
>  			cp.git_cmd = 1;
>  			/* BUG: this nukes untracked files in the way */

While this does skip the "reset --hard" step,

 * With --include-untracked and without a pathspec, we have run "clean
   --force --quiet" already, removing the untracked files that have
   been stored in the stash.  Wasn't the objective of the change to
   ensure that the working tree files are unaffected?

 * When run with a pathspec, this change has no effect, no?  We've
   added updated files with "add -u" to the index, compared the
   contents between HEAD and what was added with "add -u" and
   applied the reverse of it to both the index and the working tree,
   which amounts to reverting the changes to the paths that match
   the pathspec.

 * When run with --keep-index, there is another block after this
   part that uses "checkout --no-overlay" to ensure that the
   contents of the working tree matches what is in the index.

 * When the outermost "if (!(patch_mode || only_staged))" block is
   not entered, we'd run "apply -R" to revert the changes in the
   patch, whose point is to update the working tree contents.

What the patch does seem to fall far short of "preserving" in many
cases.  In some modes, I suspect that it might not make sense to use
"preserve" (for example, it could be that "--keep-index" may be
totally incompatible with "--preserve"), but that is hard to guess
without knowing _why_ you wanted to this new feature in the first
place.  Adding a separate subcommand that gives users no access to
these modes would be an easy way to avoid having to think about
these issues, but may close too many doors, including ones those
that do not have to be closed if this feature is done as a new
option to existing "push" command.  I dunno.

Thanks.

Oswald Buddenhagen June 16, 2023, 8:03 p.m. UTC | #2

On Fri, Jun 16, 2023 at 09:42:41AM -0700, Junio C Hamano wrote:
>"Nadav Goldstein via GitGitGadget" <gitgitgadget@gmail.com> writes:
>> In this patch, we introduce a new subcommand preserve to
>> git stash. The purpose of this subcommand is to save the
>> current changes into the stash and then immediately re-apply
>> those changes to the working directory.
>
>Why a new subcommand, not a new option to "push"?  Adding a new
>subcommand would mean it would be another unfamiliar thing users
>need to learn, as opposed to a slight variation of what they are
>already familiar with.
>
to be fair, there's also `apply` and not `pop --keep`.

of course, `preserve` seems a bit unspecific, but `save` and `create` 
are already taken.

>>     If the community will approve, I will modify the patch to include 
>>     help
>>     messages for the new subcommand
>
>Please don't think this way.  If the new feature is not worth
>completing to document and tests for your own use, it is not worth
>community's time to review or "approve" it.
>
for one's own use, one usually wouldn't do the polishing.

>Instead, we try to send a patch that is already perfected, with tests 
>and docs,
>
it's nice when "we" do that, but i think that this is a somewhat too 
one-sided committment to *ask* for.

>in order to improve the chance reviewers will understand the new 
>feature and its motivation better when they review the patch.
>
i think one can achieve that without doing the full monty.
that's what the design-driven process is for, after all. the crux is at 
what contribution size one considers it "worth it", but you can be sure 
that drive-by contributors have a significantly lower threshold than 
regulars.

i'm not saying that nadav succeeded, but a focus on final artifacts 
alone is unlikely to have changed anything.

regards,
ossi

Junio C Hamano June 16, 2023, 8:11 p.m. UTC | #3

Oswald Buddenhagen <oswald.buddenhagen@gmx.de> writes:

>>Why a new subcommand, not a new option to "push"?  Adding a new
>>subcommand would mean it would be another unfamiliar thing users
>>need to learn, as opposed to a slight variation of what they are
>>already familiar with.
>>
> to be fair, there's also `apply` and not `pop --keep`.

I do not care all that much if that is fair, but I do not think it
is a meaningful comparison.  "stash apply" is merely exposing the
first half (the other half is "stash drop") of a two step operation
that is "stash pop".

Oswald Buddenhagen June 17, 2023, 8:39 a.m. UTC | #4

On Fri, Jun 16, 2023 at 01:11:58PM -0700, Junio C Hamano wrote:
>Oswald Buddenhagen <oswald.buddenhagen@gmx.de> writes:
>
>>>Why a new subcommand, not a new option to "push"?  Adding a new
>>>subcommand would mean it would be another unfamiliar thing users
>>>need to learn, as opposed to a slight variation of what they are
>>>already familiar with.
>>>
>> to be fair, there's also `apply` and not `pop --keep`.
>
>I do not care all that much if that is fair, but I do not think it
>is a meaningful comparison.  "stash apply" is merely exposing the
>first half (the other half is "stash drop") of a two step operation
>that is "stash pop".
>
i may be totally wrong about it (because i don't understand the 
motivation behind this feature, either), but i think the _intent_ of 
nadav's patch is to merely expose the first half of "stash push" (the 
other half is the implicit "reset --hard"). it may not be a sufficiently 
good one, but there is clearly an analogy here.

regards,
ossi

Junio C Hamano June 17, 2023, 11:21 a.m. UTC | #5

Oswald Buddenhagen <oswald.buddenhagen@gmx.de> writes:

> i may be totally wrong about it (because i don't understand the
> motivation behind this feature, either), but i think the _intent_ of
> nadav's patch is to merely expose the first half of "stash push" (the
> other half is the implicit "reset --hard"). it may not be a
> sufficiently good one, but there is clearly an analogy here.

I do agree that it would be reasonable to want to expose the first
half (the other half is "now the local mod got saved in a stash,
adjust the working tree and/or the index"), but then that means the
code should cover the various operating modes we have, and let the
users perform their first half, so that the second half (which by
the way needs to be exposed by another series later) can be used on
top of the result to emulate as if the combined two (i.e. "stash
save/push") have been run, for the feature to be complete, no?

Lack of the second half can be excused away with "let's do these one
step at a time", but the analogy fails to hold with an incomplete
coverage of even the first half, I am afraid.

But as you said, I think the lack of concrete "here is how this
feature is expected to be used and why it is useful because it
allows us to do X that we haven't been able to before" is the
largest first issue in the posted patch, as that leaves reviewers
guessing without feeling they "understand the motivation behind" the
feature.  Such an understanding would help us to tell where to stop
(maybe in certain modes doing only the "first half" does not make
sense because the corresponding "second half" inherently does not
exist for some reason, in which case it is fine not to support such
a mode that is supported by "stash push").

Nadav Goldstein June 18, 2023, 9:05 a.m. UTC | #6

Hi,
Thanks for the feedback, and I totally agree I was very vague in my 
description, and I'm sorry for that.

Let me try to explain my motivation:
I heavily use stash to set quick points in my code so I could go back to 
them (during thought process), and I want to store my changes quickly 
and continue from there).

Currently doing git stash discard the current working tree, so I need to 
perform git stash apply 0 to restore it, so my new sub-command is aiming 
to replace doing:
`git stash
git stash apply 0`

with just `git stash preserve`

Regarding using it as a flag in the stash push, I went to this direction 
initially, but stopped because of all of the flags you mentioned 
(keep-index, include-untracked etc...), I wanted a clean slate, and to 
avoid using the push flags that seems overkill in this phase (they can 
be supported later if users requested it in forums, wanted to keep it 
simple).

If I understand correctly, the problem is my subcommand behind the 
scenes still support the push flags because they use the same method 
(do_push_stash).

Do you have any idea how to disable those flags in the new subcommand 
only? And do you still think it should be a flag?

Also what do you think regarding the way I choose to implement it? 
(Adding the extra argument to do_push_stash)

Thanks,
Nadav

On 17/06/2023 14:21, Junio C Hamano wrote:
> Oswald Buddenhagen <oswald.buddenhagen@gmx.de> writes:
>
>> i may be totally wrong about it (because i don't understand the
>> motivation behind this feature, either), but i think the _intent_ of
>> nadav's patch is to merely expose the first half of "stash push" (the
>> other half is the implicit "reset --hard"). it may not be a
>> sufficiently good one, but there is clearly an analogy here.
> I do agree that it would be reasonable to want to expose the first
> half (the other half is "now the local mod got saved in a stash,
> adjust the working tree and/or the index"), but then that means the
> code should cover the various operating modes we have, and let the
> users perform their first half, so that the second half (which by
> the way needs to be exposed by another series later) can be used on
> top of the result to emulate as if the combined two (i.e. "stash
> save/push") have been run, for the feature to be complete, no?
>
> Lack of the second half can be excused away with "let's do these one
> step at a time", but the analogy fails to hold with an incomplete
> coverage of even the first half, I am afraid.
>
> But as you said, I think the lack of concrete "here is how this
> feature is expected to be used and why it is useful because it
> allows us to do X that we haven't been able to before" is the
> largest first issue in the posted patch, as that leaves reviewers
> guessing without feeling they "understand the motivation behind" the
> feature.  Such an understanding would help us to tell where to stop
> (maybe in certain modes doing only the "first half" does not make
> sense because the corresponding "second half" inherently does not
> exist for some reason, in which case it is fine not to support such
> a mode that is supported by "stash push").
>
>
>
>

Oswald Buddenhagen June 18, 2023, 9:47 a.m. UTC | #7

On Sun, Jun 18, 2023 at 12:05:21PM +0300, Nadav Goldstein wrote:
>Let me try to explain my motivation:
>I heavily use stash to set quick points in my code so I could go back to 
>them (during thought process), and I want to store my changes quickly 
>and continue from there.
>
so why are you (ab-)using stash for that, rather than just committing 
each time, and later cleaning it by using `reset [--mixed]` and 
re-committing (or using `rebase --interactive`)? the reflog holds 
information about "lost" commits (the stash is just a somewhat special 
reflog, too).

regards,
ossi

Nadav Goldstein June 18, 2023, 10:57 a.m. UTC | #8

I see your point, but commits force me to detail the changes (and I know 
it's good practice).

I just want to give the user another option rather than committing their 
changes.


If we talking about commits, why use stash at all? why not just commit, 
push it/create and switch branch, and go back?


I know my argument is not so smart, but I just want to highlight that 
everything can be done with commits, as this what makes git so powerful :)


Thanks for you reply!

Nadav


On 18/06/2023 12:47, Oswald Buddenhagen wrote:
> On Sun, Jun 18, 2023 at 12:05:21PM +0300, Nadav Goldstein wrote:
>> Let me try to explain my motivation:
>> I heavily use stash to set quick points in my code so I could go back 
>> to them (during thought process), and I want to store my changes 
>> quickly and continue from there.
>>
> so why are you (ab-)using stash for that, rather than just committing 
> each time, and later cleaning it by using `reset [--mixed]` and 
> re-committing (or using `rebase --interactive`)? the reflog holds 
> information about "lost" commits (the stash is just a somewhat special 
> reflog, too).
>
> regards,
> ossi
>
>

Junio C Hamano June 19, 2023, 1:42 a.m. UTC | #9

Nadav Goldstein <nadav.goldstein96@gmail.com> writes:

> I heavily use stash to set quick points in my code so I could go back
> to them (during thought process), and I want to store my changes
> quickly and continue from there).

So it is more like "take a series of snapshots".  I do not think
"stash" is a good match for that, and I do not know how to bend
"stash" to make it a good match for that purpose while not hurting
the original use of the "stash" command.

Surely, the first part of "stash create" is a good way to take a
snapshot, but after creating a series of stash entries, we cannot
use them effectively as "snapshot we can go back to", and half the
problem lies in the application side ("stash apply") and the other
half comes from your design to leave the local changes in the
working tree and the index.

Let's imagine that while we are exploring, we came up with an idea,
wrote code and did a "git stash push && git stash apply" (or "git
stash snapshot").  We continue, and we do more modification, and do
another "git stash snapshot".  We do that again and create another
snapshot.  In total, on top of the HEAD that hasn't changed, we now
have three snapshots stash@{2}, stash@{1}, and stash@{0}.

Let's further imagine that the earliest first step was good
(i.e. the change stored in stash@{2}^..stash@{2} and the tree object
recorded in stash@{2}^{tree} are both good).  Also, the latest
change we made since we took the second snapshot (i.e. diff between
stash@{1} and stash@{0}) is good.

But the changes made in the second part was totally misdesigned and
beyond salvaging.  It has to be discarded and reworked from scratch.

Using stash@{2} to go back to the first snapshot may be trivial.
We'd do something like

    $ git reset --hard stash@{2} && git reset --soft HEAD^

But how would we salvage the latest part of the work on top of this
state, while excluding the crap we made before we took the second
snapshot?  Mostly because the main motivation behind "git stash" is
to preserve the change between HEAD and the current state while
preparing a clean slate to work on something totally unrelated for
us, its application side is geared towards applying the preserved
change (i.e. not the tree state of the stash entry, but the
differences between the tree state of the parent of the stash entry
and the tree state of the stash entry).  Asking "please apply the
difference recorded in the stash@{0}" would give us changes we made
while taking all three snapshots, which is bad for two reasons, (1)
because we have already recovered the changes we made before taking
the first snapshot above but stash@{0} contains that change
redundantly, and (2) stash@{0} also contains the change we made
between the first and the second snapshot, the crappy one we want to
discard.  We end up doing something unwierdy like this, perhaps?

    $ git diff stash@{1} stash@{0} | git apply

Besides, new users will be utterly confused if they realize that
stash@{0} would contain changes made in the all three phases and
their "stash pop/apply" on it would resurrect all of them.  So we
need to educate users with "stash entries created with 'stash
preserve' cannot be applied with 'stash pop' without care".

If we instead used normal commits during our exploring phase of the
development, this would have been much easier and cleaner:

    $ work work initial work
    $ git commit -a -m 'snapshot #1'
    $ work work more work
    $ git commit -a -m 'snapshot #2'
    $ work work great work
    $ git commit -a -m 'snapshot #3'

We only discover that earlier work was crap after trying to build on
top and the end result does not work well.  So the "great work" part
may have been very good, "initial work" part may be OK, but the
other part, i.e. "more work", while it may have looked like a good
idea, turns out to be unusable.  Fortunately, how to clean up such a
history is well established.  With "git rebase -i HEAD~3", we can
easily discard the "more work" part and preserve both "initial" and
"great" work in the result.

In short, we do not have to abuse stash to make a poor imitation of
already well understood workflow, and even if it were a good idea,
the "stash preserve" proposed in the thread would not be a good
ingredient to build something like that, because of its design,
i.e. the snapshots are not incremental and it is always relative to
the unmoving HEAD, hence recording older changes redundantly.

> Regarding using it as a flag in the stash push, I went to this
> direction initially, but stopped because of all of the flags you
> mentioned (keep-index, include-untracked etc...),...

As shown above, "stash preserve" is not a good alternative to
"commits then rebase".  In short, if you create N snapshots in a
row, if your changes between M-th snapshot and M+1-th snapshot were
so bad that they need to be discarded, it would make all of your
later snapshot hard to salvage.

Now let's see if we can salvage it for single-use snapshot that
cannot be combined.  The limitation is that you can only go back to
the state of a single stash entry, as if running "reset --hard".
Would the "stash preserve" with such a limited application still be
useful?  Even then lack of "include untracked" would probably be a
show-stopper for those people who create new files while "during
thought process", and here is why.

    $ work work work
    $ modify files A, B, C, D
    $ create new file E, F
    $ git stash snapshot
    $ work work more work
    $ git stash snapshot
    $ work work even more work
    $ git stash snapshot
    $ work work even more work
    $ ...

Now, if realize that these later work after the snapshot were all
bad, can we go back to the initial snapshot?

If you do not allow recording what was in the untracked files, the
contents of new files E and F may have been modified while you were
butchering your somewhat good initial step you recorded in your
first snapshot.

Not allowing pathspec would have similar issues.  While doing
exploratory programming, you may realize that changes you made to
some of the files are in good shape and worth including in a
snapshot while the state of other files are no better than what you
had in your previous snapshot, and you would want to be able to use
"git stash snapshot -- <pathspec>" to make the resulting snapshot
usable.

So whether it is done as a separate command "git stash preserve", or
it is done as an option to "git stash push --snapshot-only", the
users will need access to (some of) the options they expect to be
able to use the usual "stash push".  It may not have to support all
the options and operating modes from day one, but if it will have to
eventually learn many of them when it becomes complete, I do not see
a strong reason to have it as a separate command.

It is fine if some of the options are inherently incompatible with
"--snapshot-only" mode.  There are precedence in the code you can
find and mimic, like "--patch" and "--include-untracked" options
are marked to be incompatible.

But as I already said, I am not sure if it is worth pursuing this
route, as "commits then rebase" is well understood and established
workflow to help us during our exploratory programming use case.

Thanks.

Add 'preserve' subcommand to 'git stash'

Commit Message

Comments

Patch