diff mbox series

[v4,1/4] transport-helper: no connection restriction in connect_helper

Message ID d343585cb5e696f521c2ee1dd6c0f0c2d86de113.1702562879.git.zhiyou.jx@alibaba-inc.com (mailing list archive)
State Accepted
Commit 4a61faf75d684eb31c23521bc0e3c3cac5fd1553
Headers show
Series support remote archive via stateless transport | expand

Commit Message

Jiang Xin Dec. 14, 2023, 2:13 p.m. UTC
From: Jiang Xin <zhiyou.jx@alibaba-inc.com>

When commit b236752a (Support remote archive from all smart transports,
2009-12-09) added "remote archive" support for "smart transports", it
was for transport that supports the ".connect" method. The
"connect_helper()" function protected itself from getting called for a
transport without the method before calling process_connect_service(),
which did not work with such a transport.

Later, commit edc9caf7 (transport-helper: introduce stateless-connect,
2018-03-15) added a way for a transport without the ".connect" method
to establish a "stateless" connection in protocol-v2, which
process_connect_service() was taught to handle the "stateless"
connection, making the old safety valve in its caller that insisted
that ".connect" method must be defined too strict, and forgot to loosen
it.

Remove the restriction in the "connect_helper()" function and give the
function "process_connect_service()" the opportunity to establish a
connection using ".connect" or ".stateless_connect" for protocol v2. So
we can connect with a stateless-rpc and do something useful. E.g., in a
later commit, implements remote archive for a repository over HTTP
protocol.

Helped-by: Junio C Hamano <gitster@pobox.com>
Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
---
 transport-helper.c | 2 --
 1 file changed, 2 deletions(-)

Comments

Linus Arver Jan. 12, 2024, 7:42 a.m. UTC | #1
Jiang Xin <worldhello.net@gmail.com> writes:

> From: Jiang Xin <zhiyou.jx@alibaba-inc.com>
>
> When commit b236752a (Support remote archive from all smart transports,
> 2009-12-09) added "remote archive" support for "smart transports", it
> was for transport that supports the ".connect" method. The
> "connect_helper()" function protected itself from getting called for a
> transport without the method before calling process_connect_service(),

OK.

> which did not work with such a transport.

How about 'which only worked with the ".connect" method.' ?

>
> Later, commit edc9caf7 (transport-helper: introduce stateless-connect,
> 2018-03-15) added a way for a transport without the ".connect" method
> to establish a "stateless" connection in protocol-v2, which

s/which/where

> process_connect_service() was taught to handle the "stateless"
> connection,

I think using 'the ".stateless_connect" method' is more consistent with
the rest of this text.

> making the old safety valve in its caller that insisted
> that ".connect" method must be defined too strict, and forgot to loosen
> it.

I think just "...making the old protection too strict. But edc9caf7
forgot to adjust this protection accordingly." is simpler to read.

> Remove the restriction in the "connect_helper()" function and give the
> function "process_connect_service()" the opportunity to establish a
> connection using ".connect" or ".stateless_connect" for protocol v2. So
> we can connect with a stateless-rpc and do something useful. E.g., in a
> later commit, implements remote archive for a repository over HTTP
> protocol.
>
> Helped-by: Junio C Hamano <gitster@pobox.com>
> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> ---
>  transport-helper.c | 2 --
>  1 file changed, 2 deletions(-)
>
> diff --git a/transport-helper.c b/transport-helper.c
> index 49811ef176..2e127d24a5 100644
> --- a/transport-helper.c
> +++ b/transport-helper.c
> @@ -662,8 +662,6 @@ static int connect_helper(struct transport *transport, const char *name,
>  
>  	/* Get_helper so connect is inited. */
>  	get_helper(transport);
> -	if (!data->connect)
> -		die(_("operation not supported by protocol"));

Should we still terminate early here if both data->connect and
data->stateless_connect are not truthy? It would save a few CPU cycles,
but even better, remain true to the the original intent of the code.
Maybe there was a really good reason to terminate early here that we're
not aware of?

But also, what about the case where both are enabled? Should we print an
error message? (Maybe this concern is outside the scope of this series?)

>  	if (!process_connect_service(transport, name, exec))
>  		die(_("can't connect to subservice %s"), name);
> -- 
> 2.41.0.232.g2f6f0bca4f.agit.8.0.4.dev
Junio C Hamano Jan. 12, 2024, 9:50 p.m. UTC | #2
Linus Arver <linusa@google.com> writes:

> Jiang Xin <worldhello.net@gmail.com> writes:
>
>> From: Jiang Xin <zhiyou.jx@alibaba-inc.com>
>>
>> When commit b236752a (Support remote archive from all smart transports,
>> 2009-12-09) added "remote archive" support for "smart transports", it
>> was for transport that supports the ".connect" method. The
>> "connect_helper()" function protected itself from getting called for a
>> transport without the method before calling process_connect_service(),
>
> OK.
>
>> which did not work with such a transport.
>
> How about 'which only worked with the ".connect" method.' ?
>
>>
>> Later, commit edc9caf7 (transport-helper: introduce stateless-connect,
>> 2018-03-15) added a way for a transport without the ".connect" method
>> to establish a "stateless" connection in protocol-v2, which
>
> s/which/where
>
>> process_connect_service() was taught to handle the "stateless"
>> connection,
>
> I think using 'the ".stateless_connect" method' is more consistent with
> the rest of this text.
>
>> making the old safety valve in its caller that insisted
>> that ".connect" method must be defined too strict, and forgot to loosen
>> it.
>
> I think just "...making the old protection too strict. But edc9caf7
> forgot to adjust this protection accordingly." is simpler to read.
>
>> Remove the restriction in the "connect_helper()" function and give the
>> function "process_connect_service()" the opportunity to establish a
>> connection using ".connect" or ".stateless_connect" for protocol v2. So
>> we can connect with a stateless-rpc and do something useful. E.g., in a
>> later commit, implements remote archive for a repository over HTTP
>> protocol.
>>
>> Helped-by: Junio C Hamano <gitster@pobox.com>
>> Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
>> ---
>>  transport-helper.c | 2 --
>>  1 file changed, 2 deletions(-)
>>
>> diff --git a/transport-helper.c b/transport-helper.c
>> index 49811ef176..2e127d24a5 100644
>> --- a/transport-helper.c
>> +++ b/transport-helper.c
>> @@ -662,8 +662,6 @@ static int connect_helper(struct transport *transport, const char *name,
>>  
>>  	/* Get_helper so connect is inited. */
>>  	get_helper(transport);
>> -	if (!data->connect)
>> -		die(_("operation not supported by protocol"));
>
> Should we still terminate early here if both data->connect and
> data->stateless_connect are not truthy? It would save a few CPU cycles,
> but even better, remain true to the the original intent of the code.
> Maybe there was a really good reason to terminate early here that we're
> not aware of?
>
> But also, what about the case where both are enabled? Should we print an
> error message? (Maybe this concern is outside the scope of this series?)
>
>>  	if (!process_connect_service(transport, name, exec))
>>  		die(_("can't connect to subservice %s"), name);
>> -- 
>> 2.41.0.232.g2f6f0bca4f.agit.8.0.4.dev

Thanks for a review to get the topic that hasn't seen much reviews
unstuck.  Very much appreciated.
Jiang Xin Jan. 16, 2024, 9:04 a.m. UTC | #3
On Fri, Jan 12, 2024 at 3:42 PM Linus Arver <linusa@google.com> wrote:
>
> Jiang Xin <worldhello.net@gmail.com> writes:
>
> > From: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> >
> > When commit b236752a (Support remote archive from all smart transports,
> > 2009-12-09) added "remote archive" support for "smart transports", it
> > was for transport that supports the ".connect" method. The
> > "connect_helper()" function protected itself from getting called for a
> > transport without the method before calling process_connect_service(),
>
> OK.
>
> > which did not work with such a transport.
>
> How about 'which only worked with the ".connect" method.' ?
>
> >
> > Later, commit edc9caf7 (transport-helper: introduce stateless-connect,
> > 2018-03-15) added a way for a transport without the ".connect" method
> > to establish a "stateless" connection in protocol-v2, which
>
> s/which/where
>
> > process_connect_service() was taught to handle the "stateless"
> > connection,
>
> I think using 'the ".stateless_connect" method' is more consistent with
> the rest of this text.
>
> > making the old safety valve in its caller that insisted
> > that ".connect" method must be defined too strict, and forgot to loosen
> > it.
>
> I think just "...making the old protection too strict. But edc9caf7
> forgot to adjust this protection accordingly." is simpler to read.

Thanks for the above suggestions, and will update in next reroll.

> > Remove the restriction in the "connect_helper()" function and give the
> > function "process_connect_service()" the opportunity to establish a
> > connection using ".connect" or ".stateless_connect" for protocol v2. So
> > we can connect with a stateless-rpc and do something useful. E.g., in a
> > later commit, implements remote archive for a repository over HTTP
> > protocol.
> >
> > Helped-by: Junio C Hamano <gitster@pobox.com>
> > Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> > ---
> >  transport-helper.c | 2 --
> >  1 file changed, 2 deletions(-)
> >
> > diff --git a/transport-helper.c b/transport-helper.c
> > index 49811ef176..2e127d24a5 100644
> > --- a/transport-helper.c
> > +++ b/transport-helper.c
> > @@ -662,8 +662,6 @@ static int connect_helper(struct transport *transport, const char *name,
> >
> >       /* Get_helper so connect is inited. */
> >       get_helper(transport);
> > -     if (!data->connect)
> > -             die(_("operation not supported by protocol"));
>
> Should we still terminate early here if both data->connect and
> data->stateless_connect are not truthy? It would save a few CPU cycles,
> but even better, remain true to the the original intent of the code.
> Maybe there was a really good reason to terminate early here that we're
> not aware of?
>

It's not necessary to check data->connect here, because it will
terminate if fail to connect by calling the function
"process_connect_service()".

> But also, what about the case where both are enabled? Should we print an
> error message? (Maybe this concern is outside the scope of this series?)

In the function "process_connect_service()", we can see that "connect"
has a higher priority than "stateless-connect".

>
> >       if (!process_connect_service(transport, name, exec))
> >               die(_("can't connect to subservice %s"), name);

Regardless of whether "connect" or "stateless-connect" is used, the
function process_connect_service() will return 1 if the connection is
successful. If the connection fails, it will terminate here.
Linus Arver Jan. 18, 2024, 10:26 p.m. UTC | #4
Jiang Xin <worldhello.net@gmail.com> writes:

>> > Remove the restriction in the "connect_helper()" function and give the
>> > function "process_connect_service()" the opportunity to establish a
>> > connection using ".connect" or ".stateless_connect" for protocol v2. So
>> > we can connect with a stateless-rpc and do something useful. E.g., in a
>> > later commit, implements remote archive for a repository over HTTP
>> > protocol.
>> >
>> > Helped-by: Junio C Hamano <gitster@pobox.com>
>> > Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
>> > ---
>> >  transport-helper.c | 2 --
>> >  1 file changed, 2 deletions(-)
>> >
>> > diff --git a/transport-helper.c b/transport-helper.c
>> > index 49811ef176..2e127d24a5 100644
>> > --- a/transport-helper.c
>> > +++ b/transport-helper.c
>> > @@ -662,8 +662,6 @@ static int connect_helper(struct transport *transport, const char *name,
>> >
>> >       /* Get_helper so connect is inited. */
>> >       get_helper(transport);
>> > -     if (!data->connect)
>> > -             die(_("operation not supported by protocol"));
>>
>> Should we still terminate early here if both data->connect and
>> data->stateless_connect are not truthy? It would save a few CPU cycles,
>> but even better, remain true to the the original intent of the code.
>> Maybe there was a really good reason to terminate early here that we're
>> not aware of?
>>
>
> It's not necessary to check data->connect here, because it will
> terminate if fail to connect by calling the function
> "process_connect_service()".

In the process_connect_service() we have

    if (data->connect) {
       ...
    } else if (data->stateless_connect && ...) {
       ...
    }

    strbuf_release(&cmdbuf);
    return ret;

and so if both data->connect and data->stateless_connect are false, that
function could silently do nothing. IOW that function expects the
connection type to be guaranteed to be set, so it makes sense to check
for the correctness of this in the connect_helper().

>> But also, what about the case where both are enabled? Should we print an
>> error message? (Maybe this concern is outside the scope of this series?)
>
> In the function "process_connect_service()", we can see that "connect"
> has a higher priority than "stateless-connect".

What I mean is, does it make sense for connect_helper() to recognize
invalid or possibly buggy states? IOW, is having both data->connect and
data->stateless_connect enabled a bug? If we only ever set one or the
other (we treat them as mutually exclusive) elsewhere in the codebase,
and if we are doing the sort of "correctness" check in the
connect_helper(), then it makes sense to detect that both are set and
print an error or warning (as a programmer bug).
Jiang Xin Jan. 19, 2024, 10:56 a.m. UTC | #5
On Fri, Jan 19, 2024 at 6:26 AM Linus Arver <linusa@google.com> wrote:
>
> Jiang Xin <worldhello.net@gmail.com> writes:
>
> >> > Remove the restriction in the "connect_helper()" function and give the
> >> > function "process_connect_service()" the opportunity to establish a
> >> > connection using ".connect" or ".stateless_connect" for protocol v2. So
> >> > we can connect with a stateless-rpc and do something useful. E.g., in a
> >> > later commit, implements remote archive for a repository over HTTP
> >> > protocol.
> >> >
> >> > Helped-by: Junio C Hamano <gitster@pobox.com>
> >> > Signed-off-by: Jiang Xin <zhiyou.jx@alibaba-inc.com>
> >> > ---
> >> >  transport-helper.c | 2 --
> >> >  1 file changed, 2 deletions(-)
> >> >
> >> > diff --git a/transport-helper.c b/transport-helper.c
> >> > index 49811ef176..2e127d24a5 100644
> >> > --- a/transport-helper.c
> >> > +++ b/transport-helper.c
> >> > @@ -662,8 +662,6 @@ static int connect_helper(struct transport *transport, const char *name,
> >> >
> >> >       /* Get_helper so connect is inited. */
> >> >       get_helper(transport);
> >> > -     if (!data->connect)
> >> > -             die(_("operation not supported by protocol"));
> >>
> >> Should we still terminate early here if both data->connect and
> >> data->stateless_connect are not truthy? It would save a few CPU cycles,
> >> but even better, remain true to the the original intent of the code.
> >> Maybe there was a really good reason to terminate early here that we're
> >> not aware of?
> >>
> >
> > It's not necessary to check data->connect here, because it will
> > terminate if fail to connect by calling the function
> > "process_connect_service()".
>
> In the process_connect_service() we have
>
>     if (data->connect) {
>        ...
>     } else if (data->stateless_connect && ...) {
>        ...
>     }
>
>     strbuf_release(&cmdbuf);
>     return ret;
>
> and so if both data->connect and data->stateless_connect are false, that
> function could silently do nothing. IOW that function expects the
> connection type to be guaranteed to be set, so it makes sense to check
> for the correctness of this in the connect_helper().

If both data->connect and data->stateless_connect are false,
process_connect_service() will return 0 instead of making a connection
and returning 1. The return value will be checked in the function
connect_helper() as follows:

        if (!process_connect_service(transport, name, exec))
                die(_("can't connect to subservice %s"), name);

So I think it's not necessary to make double check in connect_helper().

>
> >> But also, what about the case where both are enabled? Should we print an
> >> error message? (Maybe this concern is outside the scope of this series?)
> >
> > In the function "process_connect_service()", we can see that "connect"
> > has a higher priority than "stateless-connect".
>
> What I mean is, does it make sense for connect_helper() to recognize
> invalid or possibly buggy states? IOW, is having both data->connect and
> data->stateless_connect enabled a bug? If we only ever set one or the
> other (we treat them as mutually exclusive) elsewhere in the codebase,
> and if we are doing the sort of "correctness" check in the
> connect_helper(), then it makes sense to detect that both are set and
> print an error or warning (as a programmer bug).

The best position to address the bug that both data->connect and
data->stateless_connect are enabled is in the function get_helper() as
below:

        } else if (!strcmp(capname, "connect")) {
                data->connect = 1;
        } else if (!strcmp(capname, "stateless-connect")) {
                data->stateless_connect = 1;
        }
        ... ...
        if (data->connect && data->stateless_connect)
                die("cannot have both connect and stateless_connect enabled");

I consider this change to be off-topic and it will not be introduced
in this series.

--
Jiang Xin
Linus Arver Jan. 20, 2024, 8:25 p.m. UTC | #6
Jiang Xin <worldhello.net@gmail.com> writes:

> If both data->connect and data->stateless_connect are false,
> process_connect_service() will return 0 instead of making a connection
> and returning 1. The return value will be checked in the function
> connect_helper() as follows:
>
>         if (!process_connect_service(transport, name, exec))
>                 die(_("can't connect to subservice %s"), name);
>
> So I think it's not necessary to make double check in connect_helper().

Ah, thank you for the clarification.

> The best position to address the bug that both data->connect and
> data->stateless_connect are enabled is in the function get_helper() as
> below:
>
>         } else if (!strcmp(capname, "connect")) {
>                 data->connect = 1;
>         } else if (!strcmp(capname, "stateless-connect")) {
>                 data->stateless_connect = 1;
>         }
>         ... ...
>         if (data->connect && data->stateless_connect)
>                 die("cannot have both connect and stateless_connect enabled");
>
> I consider this change to be off-topic and it will not be introduced
> in this series.

SG.
diff mbox series

Patch

diff --git a/transport-helper.c b/transport-helper.c
index 49811ef176..2e127d24a5 100644
--- a/transport-helper.c
+++ b/transport-helper.c
@@ -662,8 +662,6 @@  static int connect_helper(struct transport *transport, const char *name,
 
 	/* Get_helper so connect is inited. */
 	get_helper(transport);
-	if (!data->connect)
-		die(_("operation not supported by protocol"));
 
 	if (!process_connect_service(transport, name, exec))
 		die(_("can't connect to subservice %s"), name);