diff mbox series

[07/11] bundle: safely handle --objects option

Message ID 1476a9495c53a165e6971afe75205889524fe7ca.1645638911.git.gitgitgadget@gmail.com (mailing list archive)
State Superseded
Headers show
Series Partial bundles | expand

Commit Message

Derrick Stolee Feb. 23, 2022, 5:55 p.m. UTC
From: Derrick Stolee <derrickstolee@github.com>

Since 'git bundle' uses setup_revisions() to specify the object walk,
some options do not make sense to include during the pack-objects child
process. Further, these options are used for a call to
traverse_commit_list() which would then require a callback which is
currently NULL.

By populating the callback we prevent a segfault in the case of adding
the --objects flag. This is really a redundant statement because the
bundles are constructing a pack-file containing all objects in the
discovered commit range.

Adding --objects to a 'git bundle' command might cause a slower command,
but at least it will not have a hard failure when the user supplies this
option. We can also disable walking trees and blobs in advance of this
walk.

Signed-off-by: Derrick Stolee <derrickstolee@github.com>
---
 bundle.c               | 10 +++++++++-
 t/t6020-bundle-misc.sh | 12 ++++++++++++
 2 files changed, 21 insertions(+), 1 deletion(-)

Comments

Jeff Hostetler Feb. 28, 2022, 4 p.m. UTC | #1
On 2/23/22 12:55 PM, Derrick Stolee via GitGitGadget wrote:
> From: Derrick Stolee <derrickstolee@github.com>
> 
> Since 'git bundle' uses setup_revisions() to specify the object walk,
> some options do not make sense to include during the pack-objects child
> process. Further, these options are used for a call to
> traverse_commit_list() which would then require a callback which is
> currently NULL.
> 
> By populating the callback we prevent a segfault in the case of adding
> the --objects flag. This is really a redundant statement because the

Nit: I stumbled over "...because the bundles are constructing..."
Is there a better wording here??

> bundles are constructing a pack-file containing all objects in the
> discovered commit range.
> 
> Adding --objects to a 'git bundle' command might cause a slower command,
> but at least it will not have a hard failure when the user supplies this
> option. We can also disable walking trees and blobs in advance of this
> walk.
> 
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>   bundle.c               | 10 +++++++++-
>   t/t6020-bundle-misc.sh | 12 ++++++++++++
>   2 files changed, 21 insertions(+), 1 deletion(-)
> 
> diff --git a/bundle.c b/bundle.c
> index a0bb687b0f4..dc56db9a50a 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -451,6 +451,12 @@ struct bundle_prerequisites_info {
>   	int fd;
>   };
>   
> +
> +static void ignore_object(struct object *obj, const char *v, void *data)
> +{
> +	/* Do nothing. */
> +}
> +
>   static void write_bundle_prerequisites(struct commit *commit, void *data)
>   {
>   	struct bundle_prerequisites_info *bpi = data;
> @@ -544,7 +550,9 @@ int create_bundle(struct repository *r, const char *path,
>   		die("revision walk setup failed");
>   	bpi.fd = bundle_fd;
>   	bpi.pending = &revs_copy.pending;
> -	traverse_commit_list(&revs, write_bundle_prerequisites, NULL, &bpi);
> +
> +	revs.blob_objects = revs.tree_objects = 0;
> +	traverse_commit_list(&revs, write_bundle_prerequisites, ignore_object, &bpi);
>   	object_array_remove_duplicates(&revs_copy.pending);
>   
>   	/* write bundle refs */
> diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
> index b13e8a52a93..6522401617d 100755
> --- a/t/t6020-bundle-misc.sh
> +++ b/t/t6020-bundle-misc.sh
> @@ -475,4 +475,16 @@ test_expect_success 'clone from bundle' '
>   	test_cmp expect actual
>   '
>   
> +test_expect_success 'unfiltered bundle with --objects' '
> +	git bundle create all-objects.bdl \
> +		--all --objects &&
> +	git bundle create all.bdl \
> +		--all &&
> +
> +	# Compare the headers of these files.
> +	head -11 all.bdl >expect &&
> +	head -11 all-objects.bdl >actual &&
> +	test_cmp expect actual
> +'
> +
>   test_done
>
Junio C Hamano March 4, 2022, 10:57 p.m. UTC | #2
"Derrick Stolee via GitGitGadget" <gitgitgadget@gmail.com> writes:

> From: Derrick Stolee <derrickstolee@github.com>
>
> Since 'git bundle' uses setup_revisions() to specify the object walk,
> some options do not make sense to include during the pack-objects child
> process. Further, these options are used for a call to
> traverse_commit_list() which would then require a callback which is
> currently NULL.
>
> By populating the callback we prevent a segfault in the case of adding
> the --objects flag. This is really a redundant statement because the
> bundles are constructing a pack-file containing all objects in the
> discovered commit range.
>
> Adding --objects to a 'git bundle' command might cause a slower command,
> but at least it will not have a hard failure when the user supplies this
> option. We can also disable walking trees and blobs in advance of this
> walk.

Wow.  That's fun.  

This commit makes me wonder if we are safe with --max-parents=,
--author=, and other nonsense options, but it is obvious that it is
a segfault waiting to happen by passing NULL to object callback,
which makes it worth singling out "--objects" and dedicate a commit
to fix it.
Junio C Hamano March 4, 2022, 10:58 p.m. UTC | #3
Jeff Hostetler <git@jeffhostetler.com> writes:

> On 2/23/22 12:55 PM, Derrick Stolee via GitGitGadget wrote:
>> From: Derrick Stolee <derrickstolee@github.com>
>> Since 'git bundle' uses setup_revisions() to specify the object
>> walk,
>> some options do not make sense to include during the pack-objects child
>> process. Further, these options are used for a call to
>> traverse_commit_list() which would then require a callback which is
>> currently NULL.
>> By populating the callback we prevent a segfault in the case of
>> adding
>> the --objects flag. This is really a redundant statement because the
>
> Nit: I stumbled over "...because the bundles are constructing..."
> Is there a better wording here??

"... because the command is constructing ..." should be sufficient,
I hope?
Derrick Stolee March 7, 2022, 2:09 p.m. UTC | #4
On 3/4/2022 5:58 PM, Junio C Hamano wrote:
> Jeff Hostetler <git@jeffhostetler.com> writes:
> 
>> On 2/23/22 12:55 PM, Derrick Stolee via GitGitGadget wrote:
>>> From: Derrick Stolee <derrickstolee@github.com>
>>> Since 'git bundle' uses setup_revisions() to specify the object
>>> walk,
>>> some options do not make sense to include during the pack-objects child
>>> process. Further, these options are used for a call to
>>> traverse_commit_list() which would then require a callback which is
>>> currently NULL.
>>> By populating the callback we prevent a segfault in the case of
>>> adding
>>> the --objects flag. This is really a redundant statement because the
>>
>> Nit: I stumbled over "...because the bundles are constructing..."
>> Is there a better wording here??
> 
> "... because the command is constructing ..." should be sufficient,
> I hope?

That works for me. Thanks!
-Stolee
Ævar Arnfjörð Bjarmason March 7, 2022, 3:35 p.m. UTC | #5
On Wed, Feb 23 2022, Derrick Stolee via GitGitGadget wrote:

> From: Derrick Stolee <derrickstolee@github.com>
>
> Since 'git bundle' uses setup_revisions() to specify the object walk,
> some options do not make sense to include during the pack-objects child
> process. Further, these options are used for a call to
> traverse_commit_list() which would then require a callback which is
> currently NULL.
>
> By populating the callback we prevent a segfault in the case of adding
> the --objects flag. This is really a redundant statement because the
> bundles are constructing a pack-file containing all objects in the
> discovered commit range.
>
> Adding --objects to a 'git bundle' command might cause a slower command,
> but at least it will not have a hard failure when the user supplies this
> option. We can also disable walking trees and blobs in advance of this
> walk.
>
> Signed-off-by: Derrick Stolee <derrickstolee@github.com>
> ---
>  bundle.c               | 10 +++++++++-
>  t/t6020-bundle-misc.sh | 12 ++++++++++++
>  2 files changed, 21 insertions(+), 1 deletion(-)
>
> diff --git a/bundle.c b/bundle.c
> index a0bb687b0f4..dc56db9a50a 100644
> --- a/bundle.c
> +++ b/bundle.c
> @@ -451,6 +451,12 @@ struct bundle_prerequisites_info {
>  	int fd;
>  };
>  
> +
> +static void ignore_object(struct object *obj, const char *v, void *data)
> +{
> +	/* Do nothing. */
> +}
> +
>  static void write_bundle_prerequisites(struct commit *commit, void *data)
>  {
>  	struct bundle_prerequisites_info *bpi = data;
> @@ -544,7 +550,9 @@ int create_bundle(struct repository *r, const char *path,
>  		die("revision walk setup failed");
>  	bpi.fd = bundle_fd;
>  	bpi.pending = &revs_copy.pending;
> -	traverse_commit_list(&revs, write_bundle_prerequisites, NULL, &bpi);
> +
> +	revs.blob_objects = revs.tree_objects = 0;
> +	traverse_commit_list(&revs, write_bundle_prerequisites, ignore_object, &bpi);
>  	object_array_remove_duplicates(&revs_copy.pending);

The callback dummy callback part of it seems like something we'd be
better off doing by just teaching traverse_commit_list() to pay
attention to our "NULL" in this case.

But maybe I'd don't quite get why it either can't say "oh it's, NULL,
don't need to call that", or alternatively die earlier as it notices it
needs to call it, but it wasn't provided.

The same presumably goes for show_commit_fn.
diff mbox series

Patch

diff --git a/bundle.c b/bundle.c
index a0bb687b0f4..dc56db9a50a 100644
--- a/bundle.c
+++ b/bundle.c
@@ -451,6 +451,12 @@  struct bundle_prerequisites_info {
 	int fd;
 };
 
+
+static void ignore_object(struct object *obj, const char *v, void *data)
+{
+	/* Do nothing. */
+}
+
 static void write_bundle_prerequisites(struct commit *commit, void *data)
 {
 	struct bundle_prerequisites_info *bpi = data;
@@ -544,7 +550,9 @@  int create_bundle(struct repository *r, const char *path,
 		die("revision walk setup failed");
 	bpi.fd = bundle_fd;
 	bpi.pending = &revs_copy.pending;
-	traverse_commit_list(&revs, write_bundle_prerequisites, NULL, &bpi);
+
+	revs.blob_objects = revs.tree_objects = 0;
+	traverse_commit_list(&revs, write_bundle_prerequisites, ignore_object, &bpi);
 	object_array_remove_duplicates(&revs_copy.pending);
 
 	/* write bundle refs */
diff --git a/t/t6020-bundle-misc.sh b/t/t6020-bundle-misc.sh
index b13e8a52a93..6522401617d 100755
--- a/t/t6020-bundle-misc.sh
+++ b/t/t6020-bundle-misc.sh
@@ -475,4 +475,16 @@  test_expect_success 'clone from bundle' '
 	test_cmp expect actual
 '
 
+test_expect_success 'unfiltered bundle with --objects' '
+	git bundle create all-objects.bdl \
+		--all --objects &&
+	git bundle create all.bdl \
+		--all &&
+
+	# Compare the headers of these files.
+	head -11 all.bdl >expect &&
+	head -11 all-objects.bdl >actual &&
+	test_cmp expect actual
+'
+
 test_done