[v2] qemu-nbd: Use SOMAXCONN for socket listen() backlog

Message ID 20210205185705.1502071-1-eblake@redhat.com (mailing list archive)
State New, archived
Series [v2] qemu-nbd: Use SOMAXCONN for socket listen() backlog

Commit Message

Eric Blake Feb. 5, 2021, 6:57 p.m. UTC
Our default of a backlog of 1 connection is rather puny, particularly
for scenarios where we expect multiple clients to connect (such as
qemu-nbd -e X).  This matters especially for Unix sockets, where a
larger backlog is a definite benefit to clients: at least on Linux, a
client trying to connect to a Unix socket whose backlog is full gets an
EAGAIN failure, with no way to poll() for when the backlog is no longer
full short of sleeping an arbitrary amount of time before retrying.

See https://bugzilla.redhat.com/1925045 for a demonstration of where
our low backlog prevents libnbd from connecting as many parallel
clients as it wants.

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
---

v2: target the correct API used by qemu-nbd, rather than an unrelated
legacy wrapper [Dan]

 qemu-nbd.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
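
For illustration (this is not part of the patch), a minimal sketch of the
client-side symptom described in the commit message: with a non-blocking
connect() to a Unix socket whose accept backlog is already full, Linux
fails with EAGAIN rather than EINPROGRESS, so there is no fd state to
poll() on.  The socket path below is only a placeholder.

    /* Not part of the patch: a minimal sketch of the client-side EAGAIN
     * failure when a Unix socket's accept backlog is full.  The path
     * "/tmp/qemu-nbd.sock" is only a placeholder. */
    #define _GNU_SOURCE
    #include <errno.h>
    #include <stdio.h>
    #include <string.h>
    #include <sys/socket.h>
    #include <sys/un.h>
    #include <unistd.h>

    int main(void)
    {
        struct sockaddr_un addr = { .sun_family = AF_UNIX };
        int fd = socket(AF_UNIX, SOCK_STREAM | SOCK_NONBLOCK, 0);

        if (fd < 0) {
            perror("socket");
            return 1;
        }
        strncpy(addr.sun_path, "/tmp/qemu-nbd.sock", sizeof(addr.sun_path) - 1);
        if (connect(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
            if (errno == EAGAIN) {
                /* Backlog full: no fd state to poll() for, so the client
                 * can only sleep an arbitrary time and retry. */
                fprintf(stderr, "connect: EAGAIN\n");
            } else if (errno == EINPROGRESS) {
                /* Normal non-blocking path: poll() for POLLOUT here. */
            }
        }
        close(fd);
        return 0;
    }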

Comments

Nir Soffer Feb. 5, 2021, 7:55 p.m. UTC | #1
On Fri, Feb 5, 2021 at 8:57 PM Eric Blake <eblake@redhat.com> wrote:
>
> Our default of a backlog of 1 connection is rather puny, particularly
> for scenarios where we expect multiple clients to connect (such as
> qemu-nbd -e X).  This matters especially for Unix sockets, where a
> larger backlog is a definite benefit to clients: at least on Linux, a
> client trying to connect to a Unix socket whose backlog is full gets an
> EAGAIN failure, with no way to poll() for when the backlog is no longer
> full short of sleeping an arbitrary amount of time before retrying.
>
> See https://bugzilla.redhat.com/1925045 for a demonstration of where
> our low backlog prevents libnbd from connecting as many parallel
> clients as it wants.
>
> Reported-by: Richard W.M. Jones <rjones@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
>
> v2: target the correct API used by qemu-nbd, rather than an unrelated
> legacy wrapper [Dan]
>
>  qemu-nbd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 608c63e82a25..cd20ee73be19 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -965,7 +965,8 @@ int main(int argc, char **argv)
>      server = qio_net_listener_new();
>      if (socket_activation == 0) {
>          saddr = nbd_build_socket_address(sockpath, bindto, port);
> -        if (qio_net_listener_open_sync(server, saddr, 1, &local_err) < 0) {
> +        if (qio_net_listener_open_sync(server, saddr, SOMAXCONN,

Shouldn't we use a value based on --shared=N?

Using the maximum value makes sense for a generic server that expects to
handle many connections from different clients.  qemu-nbd is typically
used by one client, and we need to make it possible to establish a known
number of connections quickly.

> +                                       &local_err) < 0) {
>              object_unref(OBJECT(server));
>              error_report_err(local_err);
>              exit(EXIT_FAILURE);
> --
> 2.30.0
>
>
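
A hypothetical illustration of Nir's suggestion (this variant was not
posted in the thread): size the backlog from the -e/--shared count
instead of always passing SOMAXCONN, assuming qemu-nbd's existing
'shared' option variable and QEMU's usual MIN() macro.

    /* Hypothetical variant of the hunk above, not an actual patch:
     * derive the listen() backlog from --shared=N.  'shared' is assumed
     * to be the variable qemu-nbd already uses for -e/--shared. */
    int backlog = MIN(shared, SOMAXCONN);

    if (qio_net_listener_open_sync(server, saddr, backlog,
                                   &local_err) < 0) {
        object_unref(OBJECT(server));
        error_report_err(local_err);
        exit(EXIT_FAILURE);
    }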
Richard W.M. Jones Feb. 5, 2021, 8:19 p.m. UTC | #2
On Fri, Feb 05, 2021 at 12:57:05PM -0600, Eric Blake wrote:
> Our default of a backlog of 1 connection is rather puny, particularly
> for scenarios where we expect multiple clients to connect (such as
> qemu-nbd -e X).  This matters especially for Unix sockets, where a
> larger backlog is a definite benefit to clients: at least on Linux, a
> client trying to connect to a Unix socket whose backlog is full gets an
> EAGAIN failure, with no way to poll() for when the backlog is no longer
> full short of sleeping an arbitrary amount of time before retrying.
> 
> See https://bugzilla.redhat.com/1925045 for a demonstration of where
> our low backlog prevents libnbd from connecting as many parallel
> clients as it wants.
> 
> Reported-by: Richard W.M. Jones <rjones@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 
> v2: target the correct API used by qemu-nbd, rather than an unrelated
> legacy wrapper [Dan]
> 
>  qemu-nbd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 608c63e82a25..cd20ee73be19 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -965,7 +965,8 @@ int main(int argc, char **argv)
>      server = qio_net_listener_new();
>      if (socket_activation == 0) {
>          saddr = nbd_build_socket_address(sockpath, bindto, port);
> -        if (qio_net_listener_open_sync(server, saddr, 1, &local_err) < 0) {
> +        if (qio_net_listener_open_sync(server, saddr, SOMAXCONN,
> +                                       &local_err) < 0) {
>              object_unref(OBJECT(server));
>              error_report_err(local_err);
>              exit(EXIT_FAILURE);

This one works:

Tested-by: Richard W.M. Jones <rjones@redhat.com>

Rich.
Daniel P. Berrangé Feb. 8, 2021, 10 a.m. UTC | #3
On Fri, Feb 05, 2021 at 12:57:05PM -0600, Eric Blake wrote:
> Our default of a backlog of 1 connection is rather puny, particularly
> for scenarios where we expect multiple clients to connect (such as
> qemu-nbd -e X).  This matters especially for Unix sockets, where a
> larger backlog is a definite benefit to clients: at least on Linux, a
> client trying to connect to a Unix socket whose backlog is full gets an
> EAGAIN failure, with no way to poll() for when the backlog is no longer
> full short of sleeping an arbitrary amount of time before retrying.
> 
> See https://bugzilla.redhat.com/1925045 for a demonstration of where
> our low backlog prevents libnbd from connecting as many parallel
> clients as it wants.
> 
> Reported-by: Richard W.M. Jones <rjones@redhat.com>
> Signed-off-by: Eric Blake <eblake@redhat.com>
> ---
> 
> v2: target the correct API used by qemu-nbd, rather than an unrelated
> legacy wrapper [Dan]
> 
>  qemu-nbd.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/qemu-nbd.c b/qemu-nbd.c
> index 608c63e82a25..cd20ee73be19 100644
> --- a/qemu-nbd.c
> +++ b/qemu-nbd.c
> @@ -965,7 +965,8 @@ int main(int argc, char **argv)
>      server = qio_net_listener_new();
>      if (socket_activation == 0) {
>          saddr = nbd_build_socket_address(sockpath, bindto, port);
> -        if (qio_net_listener_open_sync(server, saddr, 1, &local_err) < 0) {
> +        if (qio_net_listener_open_sync(server, saddr, SOMAXCONN,
> +                                       &local_err) < 0) {

This addresses qemu-nbd, but surely we want to be consistent with the
QMP nbd-server-start impl too, in blockdev-nbd.c

>              object_unref(OBJECT(server));
>              error_report_err(local_err);
>              exit(EXIT_FAILURE);

Regards,
Daniel
Eric Blake Feb. 8, 2021, 4:45 p.m. UTC | #4
On 2/8/21 4:00 AM, Daniel P. Berrangé wrote:
> On Fri, Feb 05, 2021 at 12:57:05PM -0600, Eric Blake wrote:
>> Our default of a backlog of 1 connection is rather puny, particularly
>> for scenarios where we expect multiple clients to connect (such as
>> qemu-nbd -e X).  This matters especially for Unix sockets, where a
>> larger backlog is a definite benefit to clients: at least on Linux, a
>> client trying to connect to a Unix socket whose backlog is full gets an
>> EAGAIN failure, with no way to poll() for when the backlog is no longer
>> full short of sleeping an arbitrary amount of time before retrying.
>>
>> See https://bugzilla.redhat.com/1925045 for a demonstration of where
>> our low backlog prevents libnbd from connecting as many parallel
>> clients as it wants.
>>
>> Reported-by: Richard W.M. Jones <rjones@redhat.com>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>
>> v2: target the correct API used by qemu-nbd, rather than an unrelated
>> legacy wrapper [Dan]
>>
>>  qemu-nbd.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/qemu-nbd.c b/qemu-nbd.c
>> index 608c63e82a25..cd20ee73be19 100644
>> --- a/qemu-nbd.c
>> +++ b/qemu-nbd.c
>> @@ -965,7 +965,8 @@ int main(int argc, char **argv)
>>      server = qio_net_listener_new();
>>      if (socket_activation == 0) {
>>          saddr = nbd_build_socket_address(sockpath, bindto, port);
>> -        if (qio_net_listener_open_sync(server, saddr, 1, &local_err) < 0) {
>> +        if (qio_net_listener_open_sync(server, saddr, SOMAXCONN,
>> +                                       &local_err) < 0) {
> 
> This addresses qemu-nbd, but surely we want to be consistent with the
> QMP  nbd-server-start impl too, in blockdev-nbd.c
> 

Good point, I'll send v3.
Eric Blake Feb. 8, 2021, 4:49 p.m. UTC | #5
On 2/5/21 1:55 PM, Nir Soffer wrote:
> On Fri, Feb 5, 2021 at 8:57 PM Eric Blake <eblake@redhat.com> wrote:
>>
>> Our default of a backlog of 1 connection is rather puny, particularly
>> for scenarios where we expect multiple clients to connect (such as
>> qemu-nbd -e X).  This matters especially for Unix sockets, where a
>> larger backlog is a definite benefit to clients: at least on Linux, a
>> client trying to connect to a Unix socket whose backlog is full gets an
>> EAGAIN failure, with no way to poll() for when the backlog is no longer
>> full short of sleeping an arbitrary amount of time before retrying.
>>
>> See https://bugzilla.redhat.com/1925045 for a demonstration of where
>> our low backlog prevents libnbd from connecting as many parallel
>> clients as it wants.
>>
>> Reported-by: Richard W.M. Jones <rjones@redhat.com>
>> Signed-off-by: Eric Blake <eblake@redhat.com>
>> ---
>>
>> v2: target the correct API used by qemu-nbd, rather than an unrelated
>> legacy wrapper [Dan]
>>
>>  qemu-nbd.c | 3 ++-
>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>
>> diff --git a/qemu-nbd.c b/qemu-nbd.c
>> index 608c63e82a25..cd20ee73be19 100644
>> --- a/qemu-nbd.c
>> +++ b/qemu-nbd.c
>> @@ -965,7 +965,8 @@ int main(int argc, char **argv)
>>      server = qio_net_listener_new();
>>      if (socket_activation == 0) {
>>          saddr = nbd_build_socket_address(sockpath, bindto, port);
>> -        if (qio_net_listener_open_sync(server, saddr, 1, &local_err) < 0) {
>> +        if (qio_net_listener_open_sync(server, saddr, SOMAXCONN,
> 
> Shouldn't we use a value based on --shared=N?

That's a possibility.  I don't know how many resources we tie up if we
allow more clients to try to connect() than we will ultimately accept
when --shared is small.

But there's also the issue that qemu-nbd defaults to 1 client unless you
pass -e/--shared on the command line, whereas starting an NBD server in
qemu via QMP (including via qemu-storage-daemon) defaults max-connections
to 0, meaning unlimited.

> 
> Using the maximum value makes sense for a generic server that expects
> to handle many connections from different clients.  qemu-nbd is
> typically used by one client, and we need to make it possible to
> establish a known number of connections quickly.

At any rate, when max-connections is set to something smaller than
SOMAXCONN, listen()ing with a correspondingly smaller backlog seems
reasonable, so I'll play with that for v3.
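
As a rough sketch of that direction (hypothetical, not the actual v3
that followed): clamp the backlog to the configured connection limit
when one is given, and keep SOMAXCONN when the limit is 0, i.e.
unlimited as in the QMP server.  MIN() is QEMU's usual macro; the
helper name is invented for illustration.

    /* Hypothetical sketch only, not the real v3: pick a listen() backlog
     * from a connection limit where 0 means "unlimited". */
    static int nbd_backlog_from_limit(int max_connections)
    {
        return max_connections ? MIN(max_connections, SOMAXCONN) : SOMAXCONN;
    }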

Patch

diff --git a/qemu-nbd.c b/qemu-nbd.c
index 608c63e82a25..cd20ee73be19 100644
--- a/qemu-nbd.c
+++ b/qemu-nbd.c
@@ -965,7 +965,8 @@ int main(int argc, char **argv)
     server = qio_net_listener_new();
     if (socket_activation == 0) {
         saddr = nbd_build_socket_address(sockpath, bindto, port);
-        if (qio_net_listener_open_sync(server, saddr, 1, &local_err) < 0) {
+        if (qio_net_listener_open_sync(server, saddr, SOMAXCONN,
+                                       &local_err) < 0) {
             object_unref(OBJECT(server));
             error_report_err(local_err);
             exit(EXIT_FAILURE);