diff mbox series

Don't ignore transport_disconnect error codes in fetch and clone

Message ID 20210928001726.2592734-1-mh@glandium.org (mailing list archive)
State New, archived
Headers show
Series Don't ignore transport_disconnect error codes in fetch and clone | expand

Commit Message

Mike Hommey Sept. 28, 2021, 12:17 a.m. UTC
When a remote-helper fails in a way that is not directly visible in the
remote-helper protocol, the helper failure is ignored by git during
fetch or clone.

For example, a helper cannot directly report an error during an `import`
command (short of sending `feature done` to the fast-import file
descriptor and not sending a `done` later on).

Or if the helper crashes at the wrong moment, git doesn't notice and
thinks everything went well.

Signed-off-by: Mike Hommey <mh@glandium.org>
---
 builtin/clone.c | 5 +++--
 builtin/fetch.c | 6 +++---
 2 files changed, 6 insertions(+), 5 deletions(-)

What I'm not sure about is whether a message should be explicitly
printed by git itself in those cases.

Comments

Ævar Arnfjörð Bjarmason Sept. 28, 2021, 2:56 a.m. UTC | #1
On Tue, Sep 28 2021, Mike Hommey wrote:

> When a remote-helper fails in a way that is not directly visible in the
> remote-helper protocol, the helper failure is ignored by git during
> fetch or clone.
>
> For example, a helper cannot directly report an error during an `import`
> command (short of sending `feature done` to the fast-import file
> descriptor and not sending a `done` later on).
>
> Or if the helper crashes at the wrong moment, git doesn't notice and
> thinks everything went well.
>
> Signed-off-by: Mike Hommey <mh@glandium.org>
> ---
>  builtin/clone.c | 5 +++--
>  builtin/fetch.c | 6 +++---
>  2 files changed, 6 insertions(+), 5 deletions(-)
>
> What I'm not sure about is whether a message should be explicitly
> printed by git itself in those cases.
>
> diff --git a/builtin/clone.c b/builtin/clone.c
> index 66fe66679c..f26fa027c5 100644
> --- a/builtin/clone.c
> +++ b/builtin/clone.c
> @@ -1398,7 +1398,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	submodule_progress = transport->progress;
>  
>  	transport_unlock_pack(transport);
> -	transport_disconnect(transport);
> +	err = transport_disconnect(transport);
>  
>  	if (option_dissociate) {
>  		close_object_store(the_repository->objects);
> @@ -1406,7 +1406,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>  	}
>  
>  	junk_mode = JUNK_LEAVE_REPO;
> -	err = checkout(submodule_progress);
> +	if (!err)
> +		err = checkout(submodule_progress);
>  
>  	free(remote_name);
>  	strbuf_release(&reflog_msg);

This seems buggy in some cases, e.g. just because we couldn't close()
some final socket we should just not run the checkout at all? Shouldn't
we note the disconnect status, but still see if we can do the checkout
etc?

> diff --git a/builtin/fetch.c b/builtin/fetch.c
> index 25740c13df..66bccf6f50 100644
> --- a/builtin/fetch.c
> +++ b/builtin/fetch.c
> @@ -1886,7 +1886,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
>  {
>  	struct refspec rs = REFSPEC_INIT_FETCH;
>  	int i;
> -	int exit_code;
> +	int exit_code, disconnect_code;
>  	int maybe_prune_tags;
>  	int remote_via_config = remote_is_configured(remote, 0);
>  


> @@ -1952,9 +1952,9 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
>  	exit_code = do_fetch(gtransport, &rs);
>  	sigchain_pop(SIGPIPE);
>  	refspec_clear(&rs);
> -	transport_disconnect(gtransport);
> +	disconnect_code = transport_disconnect(gtransport);
>  	gtransport = NULL;
> -	return exit_code;
> +	return exit_code || disconnect_code;
>  }
>  
>  int cmd_fetch(int argc, const char **argv, const char *prefix)

This seems like it really needs fixes in other areas,
i.e. disconnect_git() returns 0 unconditionally, but should check at
least finish_connect(), no?

Also once that's done you'll have a logic error here where you're
conflating exit and error codes, we should not return -1 from main() (we
do in a few cases, but it's nasty)>
Mike Hommey Oct. 5, 2021, 4:54 a.m. UTC | #2
Hi,

Sorry for the delay, I managed to miss this reply.

On Tue, Sep 28, 2021 at 04:56:37AM +0200, Ævar Arnfjörð Bjarmason wrote:
> 
> On Tue, Sep 28 2021, Mike Hommey wrote:
> 
> > When a remote-helper fails in a way that is not directly visible in the
> > remote-helper protocol, the helper failure is ignored by git during
> > fetch or clone.
> >
> > For example, a helper cannot directly report an error during an `import`
> > command (short of sending `feature done` to the fast-import file
> > descriptor and not sending a `done` later on).
> >
> > Or if the helper crashes at the wrong moment, git doesn't notice and
> > thinks everything went well.
> >
> > Signed-off-by: Mike Hommey <mh@glandium.org>
> > ---
> >  builtin/clone.c | 5 +++--
> >  builtin/fetch.c | 6 +++---
> >  2 files changed, 6 insertions(+), 5 deletions(-)
> >
> > What I'm not sure about is whether a message should be explicitly
> > printed by git itself in those cases.
> >
> > diff --git a/builtin/clone.c b/builtin/clone.c
> > index 66fe66679c..f26fa027c5 100644
> > --- a/builtin/clone.c
> > +++ b/builtin/clone.c
> > @@ -1398,7 +1398,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> >  	submodule_progress = transport->progress;
> >  
> >  	transport_unlock_pack(transport);
> > -	transport_disconnect(transport);
> > +	err = transport_disconnect(transport);
> >  
> >  	if (option_dissociate) {
> >  		close_object_store(the_repository->objects);
> > @@ -1406,7 +1406,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
> >  	}
> >  
> >  	junk_mode = JUNK_LEAVE_REPO;
> > -	err = checkout(submodule_progress);
> > +	if (!err)
> > +		err = checkout(submodule_progress);
> >  
> >  	free(remote_name);
> >  	strbuf_release(&reflog_msg);
> 
> This seems buggy in some cases, e.g. just because we couldn't close()
> some final socket we should just not run the checkout at all? Shouldn't
> we note the disconnect status, but still see if we can do the checkout
> etc?

I guess it could be seen both ways. In my particular use case, I do want
automated testing to be able to catch Leak Sanitizer failures from a
remote-helper.

> > diff --git a/builtin/fetch.c b/builtin/fetch.c
> > index 25740c13df..66bccf6f50 100644
> > --- a/builtin/fetch.c
> > +++ b/builtin/fetch.c
> > @@ -1886,7 +1886,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
> >  {
> >  	struct refspec rs = REFSPEC_INIT_FETCH;
> >  	int i;
> > -	int exit_code;
> > +	int exit_code, disconnect_code;
> >  	int maybe_prune_tags;
> >  	int remote_via_config = remote_is_configured(remote, 0);
> >  
> 
> 
> > @@ -1952,9 +1952,9 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
> >  	exit_code = do_fetch(gtransport, &rs);
> >  	sigchain_pop(SIGPIPE);
> >  	refspec_clear(&rs);
> > -	transport_disconnect(gtransport);
> > +	disconnect_code = transport_disconnect(gtransport);
> >  	gtransport = NULL;
> > -	return exit_code;
> > +	return exit_code || disconnect_code;
> >  }
> >  
> >  int cmd_fetch(int argc, const char **argv, const char *prefix)
> 
> This seems like it really needs fixes in other areas,
> i.e. disconnect_git() returns 0 unconditionally, but should check at
> least finish_connect(), no?
> 
> Also once that's done you'll have a logic error here where you're
> conflating exit and error codes, we should not return -1 from main() (we
> do in a few cases, but it's nasty)>

transport_disconnect returns the remote-helper exit code for external
helpers, though, but I guess we don't necessarily need to return that
particular exit code.

Mike
Ævar Arnfjörð Bjarmason Oct. 5, 2021, 12:21 p.m. UTC | #3
On Tue, Oct 05 2021, Mike Hommey wrote:

> Hi,
>
> Sorry for the delay, I managed to miss this reply.
>
> On Tue, Sep 28, 2021 at 04:56:37AM +0200, Ævar Arnfjörð Bjarmason wrote:
>> 
>> On Tue, Sep 28 2021, Mike Hommey wrote:
>> 
>> > When a remote-helper fails in a way that is not directly visible in the
>> > remote-helper protocol, the helper failure is ignored by git during
>> > fetch or clone.
>> >
>> > For example, a helper cannot directly report an error during an `import`
>> > command (short of sending `feature done` to the fast-import file
>> > descriptor and not sending a `done` later on).
>> >
>> > Or if the helper crashes at the wrong moment, git doesn't notice and
>> > thinks everything went well.
>> >
>> > Signed-off-by: Mike Hommey <mh@glandium.org>
>> > ---
>> >  builtin/clone.c | 5 +++--
>> >  builtin/fetch.c | 6 +++---
>> >  2 files changed, 6 insertions(+), 5 deletions(-)
>> >
>> > What I'm not sure about is whether a message should be explicitly
>> > printed by git itself in those cases.
>> >
>> > diff --git a/builtin/clone.c b/builtin/clone.c
>> > index 66fe66679c..f26fa027c5 100644
>> > --- a/builtin/clone.c
>> > +++ b/builtin/clone.c
>> > @@ -1398,7 +1398,7 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>> >  	submodule_progress = transport->progress;
>> >  
>> >  	transport_unlock_pack(transport);
>> > -	transport_disconnect(transport);
>> > +	err = transport_disconnect(transport);
>> >  
>> >  	if (option_dissociate) {
>> >  		close_object_store(the_repository->objects);
>> > @@ -1406,7 +1406,8 @@ int cmd_clone(int argc, const char **argv, const char *prefix)
>> >  	}
>> >  
>> >  	junk_mode = JUNK_LEAVE_REPO;
>> > -	err = checkout(submodule_progress);
>> > +	if (!err)
>> > +		err = checkout(submodule_progress);
>> >  
>> >  	free(remote_name);
>> >  	strbuf_release(&reflog_msg);
>> 
>> This seems buggy in some cases, e.g. just because we couldn't close()
>> some final socket we should just not run the checkout at all? Shouldn't
>> we note the disconnect status, but still see if we can do the checkout
>> etc?
>
> I guess it could be seen both ways. In my particular use case, I do want
> automated testing to be able to catch Leak Sanitizer failures from a
> remote-helper.

I'd like that too, but I'm pointing out something unrelated (or at least
I think I am), which is that this is changing the equivalent of this code:

    int fd = open(...);
    close(fd);
    do_stuff();

To be:

    int err;
    int fd = open(...); /* check err here too, but whatever, pseuodocode() */
    if (!close(fd))
        do_stuff();

But just with sockets, i.e. just because transport_disconnect() had
*some* error are we really confident that checkout() should not be run?

*Maybe*, but at the very least shouldn't we distinguish some of those
error states? Isn't any serious protocol error going to be caught
earlier transport_disconnect() is just going to give us the equivalent
of not being able to cleanly close the socket?

Except as noted in my second comment it doesn't, since we ignore
disconnect_git(). At this point I can't recall what we do check, and I
intentionally haven't re-looked, but that's also a suggestion for your
commit message :) I.e. what error states are going to be different now,
when don't we proceed because of a disconnect_git() failure?

>> > diff --git a/builtin/fetch.c b/builtin/fetch.c
>> > index 25740c13df..66bccf6f50 100644
>> > --- a/builtin/fetch.c
>> > +++ b/builtin/fetch.c
>> > @@ -1886,7 +1886,7 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
>> >  {
>> >  	struct refspec rs = REFSPEC_INIT_FETCH;
>> >  	int i;
>> > -	int exit_code;
>> > +	int exit_code, disconnect_code;
>> >  	int maybe_prune_tags;
>> >  	int remote_via_config = remote_is_configured(remote, 0);
>> >  
>> 
>> 
>> > @@ -1952,9 +1952,9 @@ static int fetch_one(struct remote *remote, int argc, const char **argv,
>> >  	exit_code = do_fetch(gtransport, &rs);
>> >  	sigchain_pop(SIGPIPE);
>> >  	refspec_clear(&rs);
>> > -	transport_disconnect(gtransport);
>> > +	disconnect_code = transport_disconnect(gtransport);
>> >  	gtransport = NULL;
>> > -	return exit_code;
>> > +	return exit_code || disconnect_code;
>> >  }
>> >  
>> >  int cmd_fetch(int argc, const char **argv, const char *prefix)
>> 
>> This seems like it really needs fixes in other areas,
>> i.e. disconnect_git() returns 0 unconditionally, but should check at
>> least finish_connect(), no?
>> 
>> Also once that's done you'll have a logic error here where you're
>> conflating exit and error codes, we should not return -1 from main() (we
>> do in a few cases, but it's nasty)>
>
> transport_disconnect returns the remote-helper exit code for external
> helpers, though, but I guess we don't necessarily need to return that
> particular exit code.

Right, some of these functions are quite naughty about that, and as
noted we get it wrong in a lot of existing cases.

But it is unportable (non-standard C, but POSIX plasters over it),
i.e. a return of -1 from main() is interpreted as a wait() status in
POSIX, and results in a return of -1 & 0xff == 255.

But for new code let's try to get it right, which in this case is going
to be transport_disconnect() being being a syscall error code, and the
"exit code" being a return code, which should not be mixed, no?
diff mbox series

Patch

diff --git a/builtin/clone.c b/builtin/clone.c
index 66fe66679c..f26fa027c5 100644
--- a/builtin/clone.c
+++ b/builtin/clone.c
@@ -1398,7 +1398,7 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	submodule_progress = transport->progress;
 
 	transport_unlock_pack(transport);
-	transport_disconnect(transport);
+	err = transport_disconnect(transport);
 
 	if (option_dissociate) {
 		close_object_store(the_repository->objects);
@@ -1406,7 +1406,8 @@  int cmd_clone(int argc, const char **argv, const char *prefix)
 	}
 
 	junk_mode = JUNK_LEAVE_REPO;
-	err = checkout(submodule_progress);
+	if (!err)
+		err = checkout(submodule_progress);
 
 	free(remote_name);
 	strbuf_release(&reflog_msg);
diff --git a/builtin/fetch.c b/builtin/fetch.c
index 25740c13df..66bccf6f50 100644
--- a/builtin/fetch.c
+++ b/builtin/fetch.c
@@ -1886,7 +1886,7 @@  static int fetch_one(struct remote *remote, int argc, const char **argv,
 {
 	struct refspec rs = REFSPEC_INIT_FETCH;
 	int i;
-	int exit_code;
+	int exit_code, disconnect_code;
 	int maybe_prune_tags;
 	int remote_via_config = remote_is_configured(remote, 0);
 
@@ -1952,9 +1952,9 @@  static int fetch_one(struct remote *remote, int argc, const char **argv,
 	exit_code = do_fetch(gtransport, &rs);
 	sigchain_pop(SIGPIPE);
 	refspec_clear(&rs);
-	transport_disconnect(gtransport);
+	disconnect_code = transport_disconnect(gtransport);
 	gtransport = NULL;
-	return exit_code;
+	return exit_code || disconnect_code;
 }
 
 int cmd_fetch(int argc, const char **argv, const char *prefix)