diff mbox series

[v2,3/4] builtin/stash: provide a way to export stashes to a ref

Message ID 20220329214941.2018609-4-sandals@crustytoothpaste.net (mailing list archive)
State Superseded
Headers show
Series Importing and exporting stashes to refs | expand

Commit Message

brian m. carlson March 29, 2022, 9:49 p.m. UTC
A common user problem is how to sync in-progress work to another
machine.  Users currently must use some sort of transfer of the working
tree, which poses security risks and also necessarily causes the index
to become dirty.  The experience is suboptimal and frustrating for
users.

A reasonable idea is to use the stash for this purpose, but the stash is
stored in the reflog, not in a ref, and as such it cannot be pushed or
pulled.  This also means that it cannot be saved into a bundle or
preserved elsewhere, which is a problem when using throwaway development
environments.

Let's solve this problem by allowing the user to export the stash to a
ref (or, to just write it into the repository and print the hash, à la
git commit-tree).  Introduce git stash export, which writes a chain of
commits where the first parent is always a chain to the previous stash,
or to a single, empty commit (for the final item) and the second is the
stash commit normally written to the reflog.

Iterate over each stash from topmost to bottomost, looking up the data
for each one, and then create the chain from the single empty commit
back up in reverse order.  Generate a predictable empty commit so our
behavior is reproducible.  Create a useful commit message, preserving
the author and committer information, to help users identify stash
commits when viewing them as normal commits.

If the user has specified specific stashes they'd like to export
instead, use those instead of iterating over all of the stashes.

As part of this, specifically request quiet behavior when looking up the
OID for a revision because we will eventually hit a revision that
doesn't exist and we don't want to die when that occurs.

Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
---
 Documentation/git-stash.txt |  22 ++++-
 builtin/stash.c             | 183 ++++++++++++++++++++++++++++++++++++
 2 files changed, 204 insertions(+), 1 deletion(-)

Comments

Junio C Hamano March 30, 2022, 11:05 p.m. UTC | #1
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

> +static int do_export_stash(const char *ref, size_t argc, const char **argv)
> +{
> +	struct object_id base;
> +	struct object_context unused;
> +	struct commit *prev;
> +	struct object_id *items = NULL;
> +	int nitems = 0, nalloc = 0;
> +	int res = 0;
> +	struct strbuf revision;
> +	const char *author, *committer;
> +
> +	/*
> +	 * This is an arbitrary, fixed date, specifically the one used by git
> +	 * format-patch.  The goal is merely to produce reproducible output.
> +	 */
> +	prepare_fallback_ident("git stash", "git@stash");
> +	author = fmt_ident("git stash", "git@stash", WANT_BLANK_IDENT, "2001-09-17T00:00:00Z", 0);
> +	committer = fmt_ident("git stash", "git@stash", WANT_BLANK_IDENT, "2001-09-17T00:00:00Z", 0);
> +
> +	/* First, we create a single empty commit. */
> +	if (commit_tree_extended(NULL, 0, the_hash_algo->empty_tree, NULL, &base, author, committer, NULL, NULL))
> +		return error(_("unable to write base commit"));
> +
> +	prev = lookup_commit_reference(the_repository, &base);
> +
> +	if (argc) {
> +		/*
> +		 * Find each specified stash, and load data into the array.
> +		 */
> +		for (int i = 0; i < argc; i++) {
> +			ALLOC_GROW_BY(items, nitems, 1, nalloc);

Documentation/CodingGuidelines still says this

 - Declaring a variable in the for loop "for (int i = 0; i < 10; i++)"
   is still not allowed in this codebase.

We have been experimenting since we merged 44ba10d6 (revision: use
C99 declaration of variable in for() loop, 2021-11-14) at 5a4069a1
(Merge branch 'jc/c99-var-decl-in-for-loop', 2021-12-21), which is
shipped as a part of v2.35.0 just a few months ago.

Let's not add more until we are sure that we do not have to revert
44ba10d6 (revision: use C99 declaration of variable in for() loop,
2021-11-14) to limit the potential damage, especially when it is so
easy to do so.

Just declare "int i" at the beginning of the funcion and keep
reusing it would be fine here.
brian m. carlson March 30, 2022, 11:44 p.m. UTC | #2
On 2022-03-30 at 23:05:28, Junio C Hamano wrote:
> Documentation/CodingGuidelines still says this
> 
>  - Declaring a variable in the for loop "for (int i = 0; i < 10; i++)"
>    is still not allowed in this codebase.
> 
> We have been experimenting since we merged 44ba10d6 (revision: use
> C99 declaration of variable in for() loop, 2021-11-14) at 5a4069a1
> (Merge branch 'jc/c99-var-decl-in-for-loop', 2021-12-21), which is
> shipped as a part of v2.35.0 just a few months ago.
> 
> Let's not add more until we are sure that we do not have to revert
> 44ba10d6 (revision: use C99 declaration of variable in for() loop,
> 2021-11-14) to limit the potential damage, especially when it is so
> easy to do so.
> 
> Just declare "int i" at the beginning of the funcion and keep
> reusing it would be fine here.

Okay, I can reroll with that.
Ævar Arnfjörð Bjarmason March 31, 2022, 1:56 a.m. UTC | #3
On Tue, Mar 29 2022, brian m. carlson wrote:

> A common user problem is how to sync in-progress work to another
> machine.  Users currently must use some sort of transfer of the working
> tree, which poses security risks and also necessarily causes the index
> to become dirty.  The experience is suboptimal and frustrating for
> users.
>
> A reasonable idea is to use the stash for this purpose, but the stash is
> stored in the reflog, not in a ref, and as such it cannot be pushed or
> pulled.  This also means that it cannot be saved into a bundle or
> preserved elsewhere, which is a problem when using throwaway development
> environments.
>
> Let's solve this problem by allowing the user to export the stash to a
> ref (or, to just write it into the repository and print the hash, à la
> git commit-tree).  Introduce git stash export, which writes a chain of
> commits where the first parent is always a chain to the previous stash,
> or to a single, empty commit (for the final item) and the second is the
> stash commit normally written to the reflog.
>
> Iterate over each stash from topmost to bottomost, looking up the data
> for each one, and then create the chain from the single empty commit
> back up in reverse order.  Generate a predictable empty commit so our
> behavior is reproducible.  Create a useful commit message, preserving
> the author and committer information, to help users identify stash
> commits when viewing them as normal commits.
>
> If the user has specified specific stashes they'd like to export
> instead, use those instead of iterating over all of the stashes.
>
> As part of this, specifically request quiet behavior when looking up the
> OID for a revision because we will eventually hit a revision that
> doesn't exist and we don't want to die when that occurs.
>
> Signed-off-by: brian m. carlson <sandals@crustytoothpaste.net>
> ---
>  Documentation/git-stash.txt |  22 ++++-
>  builtin/stash.c             | 183 ++++++++++++++++++++++++++++++++++++
>  2 files changed, 204 insertions(+), 1 deletion(-)
>
> diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
> index 6e15f47525..162110314e 100644
> --- a/Documentation/git-stash.txt
> +++ b/Documentation/git-stash.txt
> @@ -20,6 +20,7 @@ SYNOPSIS
>  'git stash' clear
>  'git stash' create [<message>]
>  'git stash' store [-m|--message <message>] [-q|--quiet] <commit>
> +'git stash' export ( --print | --to-ref <ref> ) [<stash>...]
>  
>  DESCRIPTION
>  -----------
> @@ -151,6 +152,12 @@ store::
>  	reflog.  This is intended to be useful for scripts.  It is
>  	probably not the command you want to use; see "push" above.
>  
> +export ( --print | --to-ref <ref> ) [<stash>...]::
> +

I think this extra \n here isn't needed.

> +static const char * const git_stash_export_usage[] = {
> +	N_("git stash export (--print | --to-ref <ref>) [<stash>...]"),
> +	NULL
> +};
> +
> +

Stray too-much-whitespace.

>  static const char ref_stash[] = "refs/stash";
>  static struct strbuf stash_index_path = STRBUF_INIT;
>  
> @@ -1773,6 +1780,180 @@ static int save_stash(int argc, const char **argv, const char *prefix)
>  	return ret;
>  }
>  
> +static int write_commit_with_parents(struct object_id *out, const struct object_id *oid, struct commit_list *parents)
> +{
> +	size_t author_len, committer_len;
> +	struct commit *this;
> +	const char *orig_author, *orig_committer;
> +	char *author = NULL, *committer = NULL;
> +	const char *buffer;
> +	unsigned long bufsize;
> +	const char *p;
> +	struct strbuf msg = STRBUF_INIT;
> +	int ret = 0;

With this...

> +	this = lookup_commit_reference(the_repository, oid);
> +	buffer = get_commit_buffer(this, &bufsize);
> +	orig_author = find_commit_header(buffer, "author", &author_len);
> +	orig_committer = find_commit_header(buffer, "committer", &committer_len);
> +	p = memmem(buffer, bufsize, "\n\n", 2);
> +
> +	if (!orig_author || !orig_committer || !p) {
> +		error(_("cannot parse commit %s"), oid_to_hex(oid));
> +		goto out;

..isn't this a logic error, shouldn't we return -1 here?

> +	}
> +	/* Jump to message. */
> +	p += 2;
> +	strbuf_addstr(&msg, "git stash: ");
> +	strbuf_add(&msg, p, bufsize - (p - buffer));
> +
> +	author = xmemdupz(orig_author, author_len);
> +	committer = xmemdupz(orig_committer, committer_len);
> +
> +	if (commit_tree_extended(msg.buf, msg.len,
> +				 the_hash_algo->empty_tree, parents,
> +				 out, author, committer,
> +				 NULL, NULL)) {
> +		ret = -1;
> +		error(_("could not write commit"));

better as "ret = error(..."?

> +		goto out;
> +	}
> +out:
> +	strbuf_reset(&msg);
> +	unuse_commit_buffer(this, buffer);
> +	free(author);
> +	free(committer);
> +	return ret;
> +}
> +
> +static int do_export_stash(const char *ref, size_t argc, const char **argv)
> +{
> +	struct object_id base;
> +	struct object_context unused;
> +	struct commit *prev;
> +	struct object_id *items = NULL;
> +	int nitems = 0, nalloc = 0;

Can nalloc be moved into the if=else scopes?

Also shouldn't these be size_t...?

> +	int res = 0;
> +	struct strbuf revision;
> +	const char *author, *committer;
> +
> +	/*
> +	 * This is an arbitrary, fixed date, specifically the one used by git
> +	 * format-patch.  The goal is merely to produce reproducible output.
> +	 */
> +	prepare_fallback_ident("git stash", "git@stash");
> +	author = fmt_ident("git stash", "git@stash", WANT_BLANK_IDENT, "2001-09-17T00:00:00Z", 0);
> +	committer = fmt_ident("git stash", "git@stash", WANT_BLANK_IDENT, "2001-09-17T00:00:00Z", 0);
> +
> +	/* First, we create a single empty commit. */
> +	if (commit_tree_extended(NULL, 0, the_hash_algo->empty_tree, NULL, &base, author, committer, NULL, NULL))

Some very long lines here.

> +		return error(_("unable to write base commit"));
> +
> +	prev = lookup_commit_reference(the_repository, &base);
> +
> +	if (argc) {
> +		/*
> +		 * Find each specified stash, and load data into the array.
> +		 */
> +		for (int i = 0; i < argc; i++) {

...as this is size_t, not int.

> +			ALLOC_GROW_BY(items, nitems, 1, nalloc);
> +			if (parse_revision(&revision, argv[i], 0) ||
> +			    get_oid_with_context(the_repository, revision.buf,
> +						 GET_OID_QUIETLY | GET_OID_GENTLY,
> +						 &items[i], &unused)) {
> +				error(_("unable to find stash entry %s"), argv[i]);
> +				res = -1;

ditto "ret = error(..."
> +				goto out;
> +			}
> +		}
> +	} else {
> +		/*
> +		 * Walk the reflog, finding each stash entry, and load data into the
> +		 * array.
> +		 */
> +		for (int i = 0;; i++) {

Aside from the C99 dependency Junio mentioned, this should also be size_t.

> +	/*
> +	 * Now, create a set of commits identical to the regular stash commits,
> +	 * but where their first parents form a chain to our original empty
> +	 * base commit.
> +	 */
> +	for (int i = nitems - 1; i >= 0; i--) {

Did you spot my "count down" suggestion in
https://lore.kernel.org/git/220311.86bkydi65v.gmgdl@evledraar.gmail.com/
on the v1?

> +		struct commit_list *parents = NULL;
> +		struct commit_list **next = &parents;
> +		struct object_id out;
> +
> +		next = commit_list_append(prev, next);
> +		next = commit_list_append(lookup_commit_reference(the_repository, &items[i]), next);
> +		if (write_commit_with_parents(&out, &items[i], parents)) {

Here we returned -1 from this earlier, I think this would be more
straightforward as:
	
	res = write_commit_with_parents(...)
	if (res < 0)
		goto out;
	

> +			res = -1;
> +			goto out;

So one doesn't have to wonder why we're ignoring the error value, and
using -1, but then treating all non-zero as errors.

> +		}
> +		prev = lookup_commit_reference(the_repository, &out);
> +	}
> +	if (ref)
> +		update_ref(NULL, ref, &prev->object.oid, NULL, 0, UPDATE_REFS_DIE_ON_ERR);
> +	else
> +		puts(oid_to_hex(&prev->object.oid));
> +out:
> +	free(items);
> +
> +	return res;
> +}
> +
> +enum export_action {
> +	ACTION_NONE,
> +	ACTION_PRINT,
> +	ACTION_TO_REF,
> +};
> +
> +static int export_stash(int argc, const char **argv, const char *prefix)
> +{
> +	int ret = 0;

It looks like we don't need to initialize this.

> +	const char *ref = NULL;
> +	enum export_action action = ACTION_NONE;
> +	struct option options[] = {
> +		OPT_CMDMODE(0, "print", &action,
> +			    N_("print the object ID instead of writing it to a ref"),
> +			    ACTION_PRINT),
> +		OPT_CMDMODE(0, "to-ref", &action,
> +			    N_("save the data to the given ref"),
> +			    ACTION_TO_REF),
> +		OPT_END()
> +	};
> +
> +	argc = parse_options(argc, argv, prefix, options,
> +			     git_stash_export_usage,
> +			     PARSE_OPT_KEEP_DASHDASH);
> +
> +	if (action == ACTION_NONE) {
> +		return error(_("exactly one of --print and --to-ref is required"));
> +	} else if (action == ACTION_TO_REF) {
> +		if (!argc)
> +			return error(_("--to-ref requires an argument"));
> +		ref = argv[0];
> +		argc--;
> +		argv++;
> +	}
> +
> +

nit: Too much whitespace

> +	ret = do_export_stash(ref, argc, argv);
> +	return ret;

Aside from the "ret" case above, maybe this would be better if the
"action" check became a swith, then the compiler would help check it
against the enum, and this wouldn't implicitly be both ACTION_PRINT and
ACTION_TO_REF, but could be done via a fall-through.
Ævar Arnfjörð Bjarmason March 31, 2022, 2:09 a.m. UTC | #4
On Tue, Mar 29 2022, brian m. carlson wrote:

> +	if (argc) {
> +		/*
> +		 * Find each specified stash, and load data into the array.
> +		 */
> +		for (int i = 0; i < argc; i++) {
> +			ALLOC_GROW_BY(items, nitems, 1, nalloc);
> +			if (parse_revision(&revision, argv[i], 0) ||
> +			    get_oid_with_context(the_repository, revision.buf,
> +						 GET_OID_QUIETLY | GET_OID_GENTLY,
> +						 &items[i], &unused)) {
> +				error(_("unable to find stash entry %s"), argv[i]);
> +				res = -1;
> +				goto out;
> +			}
> +		}

One thing I missed on the first read-through, in the earlier commit you
factored out parse_revision(), which now contains this code (which was
here before this series):
	
	+	if (!commit) {
	+		if (!ref_exists(ref_stash)) {
	+			fprintf_ln(stderr, _("No stash entries found."));
	+			return -1;
	+		}

Aren't we going to print both "No stash entries" and "unable to find
stash entry %s" in those cases?
Junio C Hamano March 31, 2022, 5:43 p.m. UTC | #5
Ævar Arnfjörð Bjarmason <avarab@gmail.com> writes:

>> @@ -151,6 +152,12 @@ store::
>>  	reflog.  This is intended to be useful for scripts.  It is
>>  	probably not the command you want to use; see "push" above.
>>  
>> +export ( --print | --to-ref <ref> ) [<stash>...]::
>> +
>
> I think this extra \n here isn't needed.

I wrote the same in my review on one of the previous rounds, but
removed it before sending it out, because this is better with than
without.  The blank line isn't necessary but does not hurt.  And
other existing entries in this file seem to have them consistently.

Thanks for carefully reading the posted patch.
brian m. carlson April 5, 2022, 10:22 a.m. UTC | #6
On 2022-03-31 at 02:09:21, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Mar 29 2022, brian m. carlson wrote:
> 
> > +	if (argc) {
> > +		/*
> > +		 * Find each specified stash, and load data into the array.
> > +		 */
> > +		for (int i = 0; i < argc; i++) {
> > +			ALLOC_GROW_BY(items, nitems, 1, nalloc);
> > +			if (parse_revision(&revision, argv[i], 0) ||
> > +			    get_oid_with_context(the_repository, revision.buf,
> > +						 GET_OID_QUIETLY | GET_OID_GENTLY,
> > +						 &items[i], &unused)) {
> > +				error(_("unable to find stash entry %s"), argv[i]);
> > +				res = -1;
> > +				goto out;
> > +			}
> > +		}
> 
> One thing I missed on the first read-through, in the earlier commit you
> factored out parse_revision(), which now contains this code (which was
> here before this series):
> 	
> 	+	if (!commit) {
> 	+		if (!ref_exists(ref_stash)) {
> 	+			fprintf_ln(stderr, _("No stash entries found."));
> 	+			return -1;
> 	+		}
> 
> Aren't we going to print both "No stash entries" and "unable to find
> stash entry %s" in those cases?

No, we're not, because commit isn't NULL.  However, I'll put this under
the quiet flag nevertheless for the situation in the next chunk down
(where argc is 0) where this is a problem.
brian m. carlson April 5, 2022, 10:55 a.m. UTC | #7
On 2022-03-31 at 01:56:01, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Mar 29 2022, brian m. carlson wrote:
> 
> > diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
> > index 6e15f47525..162110314e 100644
> > --- a/Documentation/git-stash.txt
> > +++ b/Documentation/git-stash.txt
> > @@ -20,6 +20,7 @@ SYNOPSIS
> >  'git stash' clear
> >  'git stash' create [<message>]
> >  'git stash' store [-m|--message <message>] [-q|--quiet] <commit>
> > +'git stash' export ( --print | --to-ref <ref> ) [<stash>...]
> >  
> >  DESCRIPTION
> >  -----------
> > @@ -151,6 +152,12 @@ store::
> >  	reflog.  This is intended to be useful for scripts.  It is
> >  	probably not the command you want to use; see "push" above.
> >  
> > +export ( --print | --to-ref <ref> ) [<stash>...]::
> > +
> 
> I think this extra \n here isn't needed.

All the rest of the entries have it that way.  Junio likes it, you
don't, but it's consistent with the rest of the file and I'm just
following along.

> > +static const char * const git_stash_export_usage[] = {
> > +	N_("git stash export (--print | --to-ref <ref>) [<stash>...]"),
> > +	NULL
> > +};
> > +
> > +
> 
> Stray too-much-whitespace.

Fixed in v3 already.

> > +	this = lookup_commit_reference(the_repository, oid);
> > +	buffer = get_commit_buffer(this, &bufsize);
> > +	orig_author = find_commit_header(buffer, "author", &author_len);
> > +	orig_committer = find_commit_header(buffer, "committer", &committer_len);
> > +	p = memmem(buffer, bufsize, "\n\n", 2);
> > +
> > +	if (!orig_author || !orig_committer || !p) {
> > +		error(_("cannot parse commit %s"), oid_to_hex(oid));
> > +		goto out;
> 
> ..isn't this a logic error, shouldn't we return -1 here?

Yes, it is.

> better as "ret = error(..."?

Yup.  v4 will have it fixed in both places.

> Can nalloc be moved into the if=else scopes?

It _can_, but it's used in both, so I don't see a particular reason to
do so.

> Some very long lines here.

Will wrap in v4.

> > +		return error(_("unable to write base commit"));
> > +
> > +	prev = lookup_commit_reference(the_repository, &base);
> > +
> > +	if (argc) {
> > +		/*
> > +		 * Find each specified stash, and load data into the array.
> > +		 */
> > +		for (int i = 0; i < argc; i++) {
> 
> ...as this is size_t, not int.

I'll make argc int.

> > +				goto out;
> > +			}
> > +		}
> > +	} else {
> > +		/*
> > +		 * Walk the reflog, finding each stash entry, and load data into the
> > +		 * array.
> > +		 */
> > +		for (int i = 0;; i++) {
> 
> Aside from the C99 dependency Junio mentioned, this should also be size_t.

Nope.  I specifically decided not to do that and mentioned it in the
cover letter.  Since Windows doesn't let us have nice things like %zu,
I'm going to switch to int here and be consistent.  I'm not coding for
16-bit systems here and I feel 2 billion stashes is sufficient for the
needs of the project for the indefinite future based on the capabilities
of current human beings.

The C99 dependency has been removed in v3.

> > +	/*
> > +	 * Now, create a set of commits identical to the regular stash commits,
> > +	 * but where their first parents form a chain to our original empty
> > +	 * base commit.
> > +	 */
> > +	for (int i = nitems - 1; i >= 0; i--) {
> 
> Did you spot my "count down" suggestion in
> https://lore.kernel.org/git/220311.86bkydi65v.gmgdl@evledraar.gmail.com/
> on the v1?

I did, and I prefer this approach.  That would be necessary if we were
using size_t here, but we're not.

> > +		struct commit_list *parents = NULL;
> > +		struct commit_list **next = &parents;
> > +		struct object_id out;
> > +
> > +		next = commit_list_append(prev, next);
> > +		next = commit_list_append(lookup_commit_reference(the_repository, &items[i]), next);
> > +		if (write_commit_with_parents(&out, &items[i], parents)) {
> 
> Here we returned -1 from this earlier, I think this would be more
> straightforward as:
> 	
> 	res = write_commit_with_parents(...)
> 	if (res < 0)
> 		goto out;
> 	
> 
> > +			res = -1;
> > +			goto out;
> 
> So one doesn't have to wonder why we're ignoring the error value, and
> using -1, but then treating all non-zero as errors.

That will be fixed in v4.

> It looks like we don't need to initialize this.

It'll be removed in v4.

> > +	ret = do_export_stash(ref, argc, argv);
> > +	return ret;
> 
> Aside from the "ret" case above, maybe this would be better if the
> "action" check became a swith, then the compiler would help check it
> against the enum, and this wouldn't implicitly be both ACTION_PRINT and
> ACTION_TO_REF, but could be done via a fall-through.

Normally I'm a big fan of switch statements, but I don't feel in this
case that it logically represents the current code.  I think, given the
context, that an if-else is easier to read.
Ævar Arnfjörð Bjarmason April 6, 2022, 9:05 a.m. UTC | #8
On Tue, Apr 05 2022, brian m. carlson wrote:

> [[PGP Signed Part:Undecided]]
> On 2022-03-31 at 01:56:01, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Tue, Mar 29 2022, brian m. carlson wrote:
>> 
>> > diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
>> > index 6e15f47525..162110314e 100644
>> > --- a/Documentation/git-stash.txt
>> > +++ b/Documentation/git-stash.txt
>> > @@ -20,6 +20,7 @@ SYNOPSIS
>> >  'git stash' clear
>> >  'git stash' create [<message>]
>> >  'git stash' store [-m|--message <message>] [-q|--quiet] <commit>
>> > +'git stash' export ( --print | --to-ref <ref> ) [<stash>...]
>> >  
>> >  DESCRIPTION
>> >  -----------
>> > @@ -151,6 +152,12 @@ store::
>> >  	reflog.  This is intended to be useful for scripts.  It is
>> >  	probably not the command you want to use; see "push" above.
>> >  
>> > +export ( --print | --to-ref <ref> ) [<stash>...]::
>> > +
>> 
>> I think this extra \n here isn't needed.
>
> All the rest of the entries have it that way.  Junio likes it, you
> don't, but it's consistent with the rest of the file and I'm just
> following along.

FWIW I really don't mind. I vaguely thought it might be an ASCIIDOC
syntax error as I'm used to seeing the other form, as e.g. adding an
extra \n before the following "+" is.

But it's not, and it's indeed consistent with the rest. looks good.

>> Can nalloc be moved into the if=else scopes?
>
> It _can_, but it's used in both, so I don't see a particular reason to
> do so.

I don't mind, FWIW the reason is just to save the reader skimming to
track down the various bits of the manual memory allocation.

But as you noted in the v3 reply this can also just use oidset, so ...

>> > +				goto out;
>> > +			}
>> > +		}
>> > +	} else {
>> > +		/*
>> > +		 * Walk the reflog, finding each stash entry, and load data into the
>> > +		 * array.
>> > +		 */
>> > +		for (int i = 0;; i++) {
>> 
>> Aside from the C99 dependency Junio mentioned, this should also be size_t.
>
> Nope.  I specifically decided not to do that and mentioned it in the
> cover letter.  Since Windows doesn't let us have nice things like %zu,
> I'm going to switch to int here and be consistent.

You mean to avoid PRIuMAX instead of %d with:

    snprintf(buf, sizeof(buf), "%d", i);

?

> [...(moved around)...]
>> Did you spot my "count down" suggestion in
>> https://lore.kernel.org/git/220311.86bkydi65v.gmgdl@evledraar.gmail.com/
>> on the v1?
>
> I did, and I prefer this approach.  That would be necessary if we were
> using size_t here, but we're not.
> [...]

> I'm not coding for 16-bit systems here and I feel 2 billion stashes is
> sufficient for the needs of the project for the indefinite future
> based on the capabilities of current human beings.

I just thought you might not have seen the v1 feedback, fair enough. And
I'm not going to argue that point here.

Just as an explanation: The reason to use these sorts of patterns isn't
because we might need to support 16 bit in the future, which we won't,
or that I think it's plausible that someone might have >2^31-1 stashes.

It's that we've been changing to larger types across the codebase, and
e.g. in 99d60545f87 (string-list API: change "nr" and "alloc" to
"size_t", 2022-03-07) changing to an unsigned type required changing
such an iterator.

So if we really don't need negative numbers, but are just using -1 as a
value to "stop" in a for-loop it's IMO worth it in general to just stick
with unsigned, it makes things more future-proof.

Also, not gcc or clang, but e.g. HP/UX's aCC compiler screams bloddy
murder about various "why are you assuming unsigned here?" in
comparisons across the codebase, which gcc/clang stay quiet about, to
the point where it downs out other useful warnings it gives about actual
potential issues.

That's not *directly* related to this, which it wouldn't warn about as
it's "int". But generally it's a result of us being loose in mixing
unsigned and signed types when it comes to a count of a number of items
that can't be negative, and won't be exact for "guard clause" code like
this.

And finally, even if we might say that more than 2^31-1 stashes is
insane it's just easier to reason about code where multiple "counts" are
in play if you don't need to carefully think about that aspect.

Anyway, </by-way-of-explanation over>. But just to be clear I think it's
fine that you want to keep this as-is.

>> > +	ret = do_export_stash(ref, argc, argv);
>> > +	return ret;
>> 
>> Aside from the "ret" case above, maybe this would be better if the
>> "action" check became a swith, then the compiler would help check it
>> against the enum, and this wouldn't implicitly be both ACTION_PRINT and
>> ACTION_TO_REF, but could be done via a fall-through.
>
> Normally I'm a big fan of switch statements, but I don't feel in this
> case that it logically represents the current code.  I think, given the
> context, that an if-else is easier to read.

Makes sense. Thanks.
Junio C Hamano April 6, 2022, 4:38 p.m. UTC | #9
"brian m. carlson" <sandals@crustytoothpaste.net> writes:

>> >  	probably not the command you want to use; see "push" above.
>> >  
>> > +export ( --print | --to-ref <ref> ) [<stash>...]::
>> > +
>> 
>> I think this extra \n here isn't needed.
>
> All the rest of the entries have it that way.  Junio likes it, you
> don't, but it's consistent with the rest of the file and I'm just
> following along.

I do not have preference either way.  The only reason why I think
this patch is better with the blank line is because other existing
entries are with a blank line between the header and the body.

Thanks.
diff mbox series

Patch

diff --git a/Documentation/git-stash.txt b/Documentation/git-stash.txt
index 6e15f47525..162110314e 100644
--- a/Documentation/git-stash.txt
+++ b/Documentation/git-stash.txt
@@ -20,6 +20,7 @@  SYNOPSIS
 'git stash' clear
 'git stash' create [<message>]
 'git stash' store [-m|--message <message>] [-q|--quiet] <commit>
+'git stash' export ( --print | --to-ref <ref> ) [<stash>...]
 
 DESCRIPTION
 -----------
@@ -151,6 +152,12 @@  store::
 	reflog.  This is intended to be useful for scripts.  It is
 	probably not the command you want to use; see "push" above.
 
+export ( --print | --to-ref <ref> ) [<stash>...]::
+
+	Export the specified stashes, or all of them if none are specified, to
+	a chain of commits which can be transferred using the normal fetch and
+	push mechanisms, then imported using the `import` subcommand.
+
 OPTIONS
 -------
 -a::
@@ -239,6 +246,19 @@  literally (including newlines and quotes).
 +
 Quiet, suppress feedback messages.
 
+--print::
+	This option is only valid for `export`.
++
+Create the chain of commits representing the exported stashes without
+storing it anywhere in the ref namespace and print the object ID to
+standard output.  This is designed for scripts.
+
+--to-ref::
+	This option is only valid for `export`.
++
+Create the chain of commits representing the exported stashes and store
+it to the specified ref.
+
 \--::
 	This option is only valid for `push` command.
 +
@@ -256,7 +276,7 @@  For more details, see the 'pathspec' entry in linkgit:gitglossary[7].
 
 <stash>::
 	This option is only valid for `apply`, `branch`, `drop`, `pop`,
-	`show` commands.
+	`show`, and `export` commands.
 +
 A reference of the form `stash@{<revision>}`. When no `<stash>` is
 given, the latest stash is assumed (that is, `stash@{0}`).
diff --git a/builtin/stash.c b/builtin/stash.c
index 4c281a5781..6f1fa19172 100644
--- a/builtin/stash.c
+++ b/builtin/stash.c
@@ -33,6 +33,7 @@  static const char * const git_stash_usage[] = {
 	   "          [--] [<pathspec>...]]"),
 	N_("git stash save [-p|--patch] [-S|--staged] [-k|--[no-]keep-index] [-q|--quiet]\n"
 	   "          [-u|--include-untracked] [-a|--all] [<message>]"),
+	N_("git stash export (--print | --to-ref <ref>) [<stash>...]"),
 	NULL
 };
 
@@ -89,6 +90,12 @@  static const char * const git_stash_save_usage[] = {
 	NULL
 };
 
+static const char * const git_stash_export_usage[] = {
+	N_("git stash export (--print | --to-ref <ref>) [<stash>...]"),
+	NULL
+};
+
+
 static const char ref_stash[] = "refs/stash";
 static struct strbuf stash_index_path = STRBUF_INIT;
 
@@ -1773,6 +1780,180 @@  static int save_stash(int argc, const char **argv, const char *prefix)
 	return ret;
 }
 
+static int write_commit_with_parents(struct object_id *out, const struct object_id *oid, struct commit_list *parents)
+{
+	size_t author_len, committer_len;
+	struct commit *this;
+	const char *orig_author, *orig_committer;
+	char *author = NULL, *committer = NULL;
+	const char *buffer;
+	unsigned long bufsize;
+	const char *p;
+	struct strbuf msg = STRBUF_INIT;
+	int ret = 0;
+
+	this = lookup_commit_reference(the_repository, oid);
+	buffer = get_commit_buffer(this, &bufsize);
+	orig_author = find_commit_header(buffer, "author", &author_len);
+	orig_committer = find_commit_header(buffer, "committer", &committer_len);
+	p = memmem(buffer, bufsize, "\n\n", 2);
+
+	if (!orig_author || !orig_committer || !p) {
+		error(_("cannot parse commit %s"), oid_to_hex(oid));
+		goto out;
+	}
+	/* Jump to message. */
+	p += 2;
+	strbuf_addstr(&msg, "git stash: ");
+	strbuf_add(&msg, p, bufsize - (p - buffer));
+
+	author = xmemdupz(orig_author, author_len);
+	committer = xmemdupz(orig_committer, committer_len);
+
+	if (commit_tree_extended(msg.buf, msg.len,
+				 the_hash_algo->empty_tree, parents,
+				 out, author, committer,
+				 NULL, NULL)) {
+		ret = -1;
+		error(_("could not write commit"));
+		goto out;
+	}
+out:
+	strbuf_reset(&msg);
+	unuse_commit_buffer(this, buffer);
+	free(author);
+	free(committer);
+	return ret;
+}
+
+static int do_export_stash(const char *ref, size_t argc, const char **argv)
+{
+	struct object_id base;
+	struct object_context unused;
+	struct commit *prev;
+	struct object_id *items = NULL;
+	int nitems = 0, nalloc = 0;
+	int res = 0;
+	struct strbuf revision;
+	const char *author, *committer;
+
+	/*
+	 * This is an arbitrary, fixed date, specifically the one used by git
+	 * format-patch.  The goal is merely to produce reproducible output.
+	 */
+	prepare_fallback_ident("git stash", "git@stash");
+	author = fmt_ident("git stash", "git@stash", WANT_BLANK_IDENT, "2001-09-17T00:00:00Z", 0);
+	committer = fmt_ident("git stash", "git@stash", WANT_BLANK_IDENT, "2001-09-17T00:00:00Z", 0);
+
+	/* First, we create a single empty commit. */
+	if (commit_tree_extended(NULL, 0, the_hash_algo->empty_tree, NULL, &base, author, committer, NULL, NULL))
+		return error(_("unable to write base commit"));
+
+	prev = lookup_commit_reference(the_repository, &base);
+
+	if (argc) {
+		/*
+		 * Find each specified stash, and load data into the array.
+		 */
+		for (int i = 0; i < argc; i++) {
+			ALLOC_GROW_BY(items, nitems, 1, nalloc);
+			if (parse_revision(&revision, argv[i], 0) ||
+			    get_oid_with_context(the_repository, revision.buf,
+						 GET_OID_QUIETLY | GET_OID_GENTLY,
+						 &items[i], &unused)) {
+				error(_("unable to find stash entry %s"), argv[i]);
+				res = -1;
+				goto out;
+			}
+		}
+	} else {
+		/*
+		 * Walk the reflog, finding each stash entry, and load data into the
+		 * array.
+		 */
+		for (int i = 0;; i++) {
+			char buf[32];
+			struct object_id oid;
+
+			snprintf(buf, sizeof(buf), "%d", i);
+			if (parse_revision(&revision, buf, 1) ||
+			    get_oid_with_context(the_repository, revision.buf,
+						 GET_OID_QUIETLY | GET_OID_GENTLY,
+						 &oid, &unused))
+				break;
+			ALLOC_GROW_BY(items, nitems, 1, nalloc);
+			oidcpy(&items[i], &oid);
+		}
+	}
+
+	/*
+	 * Now, create a set of commits identical to the regular stash commits,
+	 * but where their first parents form a chain to our original empty
+	 * base commit.
+	 */
+	for (int i = nitems - 1; i >= 0; i--) {
+		struct commit_list *parents = NULL;
+		struct commit_list **next = &parents;
+		struct object_id out;
+
+		next = commit_list_append(prev, next);
+		next = commit_list_append(lookup_commit_reference(the_repository, &items[i]), next);
+		if (write_commit_with_parents(&out, &items[i], parents)) {
+			res = -1;
+			goto out;
+		}
+		prev = lookup_commit_reference(the_repository, &out);
+	}
+	if (ref)
+		update_ref(NULL, ref, &prev->object.oid, NULL, 0, UPDATE_REFS_DIE_ON_ERR);
+	else
+		puts(oid_to_hex(&prev->object.oid));
+out:
+	free(items);
+
+	return res;
+}
+
+enum export_action {
+	ACTION_NONE,
+	ACTION_PRINT,
+	ACTION_TO_REF,
+};
+
+static int export_stash(int argc, const char **argv, const char *prefix)
+{
+	int ret = 0;
+	const char *ref = NULL;
+	enum export_action action = ACTION_NONE;
+	struct option options[] = {
+		OPT_CMDMODE(0, "print", &action,
+			    N_("print the object ID instead of writing it to a ref"),
+			    ACTION_PRINT),
+		OPT_CMDMODE(0, "to-ref", &action,
+			    N_("save the data to the given ref"),
+			    ACTION_TO_REF),
+		OPT_END()
+	};
+
+	argc = parse_options(argc, argv, prefix, options,
+			     git_stash_export_usage,
+			     PARSE_OPT_KEEP_DASHDASH);
+
+	if (action == ACTION_NONE) {
+		return error(_("exactly one of --print and --to-ref is required"));
+	} else if (action == ACTION_TO_REF) {
+		if (!argc)
+			return error(_("--to-ref requires an argument"));
+		ref = argv[0];
+		argc--;
+		argv++;
+	}
+
+
+	ret = do_export_stash(ref, argc, argv);
+	return ret;
+}
+
 int cmd_stash(int argc, const char **argv, const char *prefix)
 {
 	pid_t pid = getpid();
@@ -1816,6 +1997,8 @@  int cmd_stash(int argc, const char **argv, const char *prefix)
 		return !!push_stash(argc, argv, prefix, 0);
 	else if (!strcmp(argv[0], "save"))
 		return !!save_stash(argc, argv, prefix);
+	else if (!strcmp(argv[0], "export"))
+		return !!export_stash(argc, argv, prefix);
 	else if (*argv[0] != '-')
 		usage_msg_optf(_("unknown subcommand: %s"),
 			       git_stash_usage, options, argv[0]);