diff mbox series

[v4,4/8] object-info: send attribute packet regardless of object ids

Message ID 20220502170904.2770649-5-calvinwan@google.com (mailing list archive)
State New, archived
Headers show
Series cat-file: add --batch-command remote-object-info command | expand

Commit Message

Calvin Wan May 2, 2022, 5:09 p.m. UTC
Currently on the server side of object-info, if the client does not send
any object ids in the request or if the client requests an attribute the
server does not support, the server will either not send any packets
back or error out. Consider the scenario where the client git version is
ahead of the server git version, and the client can request additional
object-info besides 'size'. There needs to be a way to tell whether the
server can honor all of the client attribute requests before the client
sends a request with all of the object ids.  In a future patch, if the
client were to make an initial command request with only attributes, the
server would be able to confirm which attributes it could return.

---
 protocol-caps.c | 14 +++++++-------
 1 file changed, 7 insertions(+), 7 deletions(-)

Comments

Junio C Hamano May 3, 2022, 12:05 a.m. UTC | #1
Calvin Wan <calvinwan@google.com> writes:

> Currently on the server side of object-info, if the client does not send
> any object ids in the request or if the client requests an attribute the
> server does not support, the server will either not send any packets
> back or error out.

There is an early return when oid_str_list->nr (i.e. number of
objects we are queried) is zero, and we return without showing the
"size" token in the output (called "attribute").  The change in this
patch changes the behaviour and makes such a request responded with
"size" but without any size information for no objects.

It is unclear why that is a good change, though.

The loop that accepts requests in today's code sends an error when
it sees a line that it does not understand.  The patch stops doing
so.  Also, after ignoring such an unknown line, the requests that
were understood are responded to by send_info() function.  The patch
instead refuses to give any output in such a case.

It is unclear why either of these two makes it a good change,
though.

> Consider the scenario where the client git version is
> ahead of the server git version, and the client can request additional
> object-info besides 'size'. There needs to be a way to tell whether the
> server can honor all of the client attribute requests before the client
> sends a request with all of the object ids.

Yes.  If we want to learn about "size", "frotz", and "nitfol"
attributes about these objects, we can send "size", "frotz",
"nitfol", and then start feeding object names.  Then we can observe
that "frotz" were not liked to learn that this old server does not
understand it, and the same for "nitfol".  At least we would have
learned about size of these objects, without this patch.

Now, an updated server side with this patch would respond with
"size" and no other response.  Does it mean it does not understand
"frotz", it does not understand "nitfol", or neither?  Did we get no
response because we didn't feed objects, or because we asked for
something they don't know about?

How well does the current client work with today's server side and
the server side updated with this patch?  I _think_ this change is
essentially a no-op when there is no version skew (i.e. we do not
ask for anything today's server does not know about and we do not
necessarily ask today's server about zero objects).

Am I correct to assume that those who are talking with today's
server side are all out-of-tree custom clients?

I am wondering how much damage we are causing them with this change
in behaviour, especially with the lack of "You asked for X, but I
don't understand X", which is replaced by no output, which would be
totally unexpected by them.

It totally is possible that this "object-info" request is treated as
an experimental curiosity that is not ready for production and has
no existing users.  If this were in 2006, I would just _declare_
such is a case and move on, breaking the protocol for existing users
whose number is zero.  But I am afraid that we no longer live in
such a simpler world, so...

> In a future patch, if the
> client were to make an initial command request with only attributes, the
> server would be able to confirm which attributes it could return.
>
> ---
>  protocol-caps.c | 14 +++++++-------
>  1 file changed, 7 insertions(+), 7 deletions(-)
>
> diff --git a/protocol-caps.c b/protocol-caps.c
> index bbde91810a..bc7def0727 100644
> --- a/protocol-caps.c
> +++ b/protocol-caps.c
> @@ -11,6 +11,7 @@
>  
>  struct requested_info {
>  	unsigned size : 1;
> +	unsigned unknown : 1;
>  };

OK.

>  /*
> @@ -40,12 +41,12 @@ static void send_info(struct repository *r, struct packet_writer *writer,
>  	struct string_list_item *item;
>  	struct strbuf send_buffer = STRBUF_INIT;
>  
> -	if (!oid_str_list->nr)
> -		return;
> -
>  	if (info->size)
>  		packet_writer_write(writer, "size");
>  
> +	if (info->unknown || !oid_str_list->nr)
> +		goto release;
> +
>  	for_each_string_list_item (item, oid_str_list) {
>  		const char *oid_str = item->string;
>  		struct object_id oid;
> @@ -72,12 +73,13 @@ static void send_info(struct repository *r, struct packet_writer *writer,
>  		packet_writer_write(writer, "%s", send_buffer.buf);
>  		strbuf_reset(&send_buffer);
>  	}
> +release:
>  	strbuf_release(&send_buffer);
>  }

OK, except for the bypass for info->unknown case, which I am not
sure about.

>  int cap_object_info(struct repository *r, struct packet_reader *request)
>  {
> -	struct requested_info info;
> +	struct requested_info info = { 0 };

Wow.  We have been using info uninitialized?  Is this a silent
bugfix?

If info.size bit was on due to on-stack garbage, we would have given
our response with "size" attribute prefixed, even when the client
side never requested it.

> @@ -92,9 +94,7 @@ int cap_object_info(struct repository *r, struct packet_reader *request)
>  		if (parse_oid(request->line, &oid_str_list))
>  			continue;
>  
> -		packet_writer_error(&writer,
> -				    "object-info: unexpected line: '%s'",
> -				    request->line);
> +		info.unknown = 1;
Calvin Wan May 3, 2022, 7:21 p.m. UTC | #2
> Now, an updated server side with this patch would respond with
> "size" and no other response.  Does it mean it does not understand
> "frotz", it does not understand "nitfol", or neither?  Did we get no
> response because we didn't feed objects, or because we asked for
> something they don't know about?

Patches 4-6 describe the following design idea I had:
 - Send initial request with only attributes
 - If server can fulfill object-info attributes, then go ahead
 - Else fallback to fetch

I responded to your comments on patch 5 regarding not needing
that first request, and sending the entire request from the beginning
so I imagine v5 would look something like this instead:
 - Send full request
 - Server either responds with a fulfilled request or only the attributes
   it can return.
 - Fallback to fetch if request is unfulfilled

> How well does the current client work with today's server side and
> the server side updated with this patch?  I _think_ this change is
> essentially a no-op when there is no version skew

You are correct that this change is essentially a no-op when there is
no version skew, which is intended. There is no current client; it's
implemented in patch 5.

> Am I correct to assume that those who are talking with today's
> server side are all out-of-tree custom clients?

> I am wondering how much damage we are causing them with this change
> in behaviour, especially with the lack of "You asked for X, but I
> don't understand X", which is replaced by no output, which would be
> totally unexpected by them.

Internally we don't have anyone using this yet. If we want to be
totally safe, I am also OK with having object-info return everything
it understands about the request rather than nothing. This was
intended to be an optimization since we would be defaulting back
to fetch to get object-info rather than only returning what object-info
knows.

> Wow.  We have been using info uninitialized?  Is this a silent
> bugfix?

Yes unfortunately.
On Mon, May 2, 2022 at 5:06 PM Junio C Hamano <gitster@pobox.com> wrote:
>
> Calvin Wan <calvinwan@google.com> writes:
>
> > Currently on the server side of object-info, if the client does not send
> > any object ids in the request or if the client requests an attribute the
> > server does not support, the server will either not send any packets
> > back or error out.
>
> There is an early return when oid_str_list->nr (i.e. number of
> objects we are queried) is zero, and we return without showing the
> "size" token in the output (called "attribute").  The change in this
> patch changes the behaviour and makes such a request responded with
> "size" but without any size information for no objects.
>
> It is unclear why that is a good change, though.
>
> The loop that accepts requests in today's code sends an error when
> it sees a line that it does not understand.  The patch stops doing
> so.  Also, after ignoring such an unknown line, the requests that
> were understood are responded to by send_info() function.  The patch
> instead refuses to give any output in such a case.
>
> It is unclear why either of these two makes it a good change,
> though.
>
> > Consider the scenario where the client git version is
> > ahead of the server git version, and the client can request additional
> > object-info besides 'size'. There needs to be a way to tell whether the
> > server can honor all of the client attribute requests before the client
> > sends a request with all of the object ids.
>
> Yes.  If we want to learn about "size", "frotz", and "nitfol"
> attributes about these objects, we can send "size", "frotz",
> "nitfol", and then start feeding object names.  Then we can observe
> that "frotz" were not liked to learn that this old server does not
> understand it, and the same for "nitfol".  At least we would have
> learned about size of these objects, without this patch.
>
> Now, an updated server side with this patch would respond with
> "size" and no other response.  Does it mean it does not understand
> "frotz", it does not understand "nitfol", or neither?  Did we get no
> response because we didn't feed objects, or because we asked for
> something they don't know about?
>
> How well does the current client work with today's server side and
> the server side updated with this patch?  I _think_ this change is
> essentially a no-op when there is no version skew (i.e. we do not
> ask for anything today's server does not know about and we do not
> necessarily ask today's server about zero objects).
>
> Am I correct to assume that those who are talking with today's
> server side are all out-of-tree custom clients?
>
> I am wondering how much damage we are causing them with this change
> in behaviour, especially with the lack of "You asked for X, but I
> don't understand X", which is replaced by no output, which would be
> totally unexpected by them.
>
> It totally is possible that this "object-info" request is treated as
> an experimental curiosity that is not ready for production and has
> no existing users.  If this were in 2006, I would just _declare_
> such is a case and move on, breaking the protocol for existing users
> whose number is zero.  But I am afraid that we no longer live in
> such a simpler world, so...
>
> > In a future patch, if the
> > client were to make an initial command request with only attributes, the
> > server would be able to confirm which attributes it could return.
> >
> > ---
> >  protocol-caps.c | 14 +++++++-------
> >  1 file changed, 7 insertions(+), 7 deletions(-)
> >
> > diff --git a/protocol-caps.c b/protocol-caps.c
> > index bbde91810a..bc7def0727 100644
> > --- a/protocol-caps.c
> > +++ b/protocol-caps.c
> > @@ -11,6 +11,7 @@
> >
> >  struct requested_info {
> >       unsigned size : 1;
> > +     unsigned unknown : 1;
> >  };
>
> OK.
>
> >  /*
> > @@ -40,12 +41,12 @@ static void send_info(struct repository *r, struct packet_writer *writer,
> >       struct string_list_item *item;
> >       struct strbuf send_buffer = STRBUF_INIT;
> >
> > -     if (!oid_str_list->nr)
> > -             return;
> > -
> >       if (info->size)
> >               packet_writer_write(writer, "size");
> >
> > +     if (info->unknown || !oid_str_list->nr)
> > +             goto release;
> > +
> >       for_each_string_list_item (item, oid_str_list) {
> >               const char *oid_str = item->string;
> >               struct object_id oid;
> > @@ -72,12 +73,13 @@ static void send_info(struct repository *r, struct packet_writer *writer,
> >               packet_writer_write(writer, "%s", send_buffer.buf);
> >               strbuf_reset(&send_buffer);
> >       }
> > +release:
> >       strbuf_release(&send_buffer);
> >  }
>
> OK, except for the bypass for info->unknown case, which I am not
> sure about.
>
> >  int cap_object_info(struct repository *r, struct packet_reader *request)
> >  {
> > -     struct requested_info info;
> > +     struct requested_info info = { 0 };
>
> Wow.  We have been using info uninitialized?  Is this a silent
> bugfix?
>
> If info.size bit was on due to on-stack garbage, we would have given
> our response with "size" attribute prefixed, even when the client
> side never requested it.
>
> > @@ -92,9 +94,7 @@ int cap_object_info(struct repository *r, struct packet_reader *request)
> >               if (parse_oid(request->line, &oid_str_list))
> >                       continue;
> >
> > -             packet_writer_error(&writer,
> > -                                 "object-info: unexpected line: '%s'",
> > -                                 request->line);
> > +             info.unknown = 1;
>
Jonathan Tan May 3, 2022, 11:11 p.m. UTC | #3
Calvin Wan <calvinwan@google.com> writes:
> Currently on the server side of object-info, if the client does not send
> any object ids in the request or if the client requests an attribute the
> server does not support, the server will either not send any packets
> back or error out. Consider the scenario where the client git version is
> ahead of the server git version, and the client can request additional
> object-info besides 'size'. There needs to be a way to tell whether the
> server can honor all of the client attribute requests before the client
> sends a request with all of the object ids.  

This part explains the problem...

> In a future patch, if the
> client were to make an initial command request with only attributes, the
> server would be able to confirm which attributes it could return.

...and this part explains what we want to do in the future, but this
commit message doesn't explain what this commit does. (The commit
subject has something related, but does not have enough detail. For
example, is it an empty attribute packet?)

Having said that, the way we usually communicate client-server version
differences is through capabilities - is there a reason why we can't use
that here?
diff mbox series

Patch

diff --git a/protocol-caps.c b/protocol-caps.c
index bbde91810a..bc7def0727 100644
--- a/protocol-caps.c
+++ b/protocol-caps.c
@@ -11,6 +11,7 @@ 
 
 struct requested_info {
 	unsigned size : 1;
+	unsigned unknown : 1;
 };
 
 /*
@@ -40,12 +41,12 @@  static void send_info(struct repository *r, struct packet_writer *writer,
 	struct string_list_item *item;
 	struct strbuf send_buffer = STRBUF_INIT;
 
-	if (!oid_str_list->nr)
-		return;
-
 	if (info->size)
 		packet_writer_write(writer, "size");
 
+	if (info->unknown || !oid_str_list->nr)
+		goto release;
+
 	for_each_string_list_item (item, oid_str_list) {
 		const char *oid_str = item->string;
 		struct object_id oid;
@@ -72,12 +73,13 @@  static void send_info(struct repository *r, struct packet_writer *writer,
 		packet_writer_write(writer, "%s", send_buffer.buf);
 		strbuf_reset(&send_buffer);
 	}
+release:
 	strbuf_release(&send_buffer);
 }
 
 int cap_object_info(struct repository *r, struct packet_reader *request)
 {
-	struct requested_info info;
+	struct requested_info info = { 0 };
 	struct packet_writer writer;
 	struct string_list oid_str_list = STRING_LIST_INIT_DUP;
 
@@ -92,9 +94,7 @@  int cap_object_info(struct repository *r, struct packet_reader *request)
 		if (parse_oid(request->line, &oid_str_list))
 			continue;
 
-		packet_writer_error(&writer,
-				    "object-info: unexpected line: '%s'",
-				    request->line);
+		info.unknown = 1;
 	}
 
 	if (request->status != PACKET_READ_FLUSH) {