
multifd: Set a higher "backlog" default value for listen()

Message ID 20230518085228.172816-1-lei4.wang@intel.com (mailing list archive)
State New, archived
Series multifd: Set a higher "backlog" default value for listen()

Commit Message

Lei Wang May 18, 2023, 8:52 a.m. UTC
When destination VM is launched, the "backlog" parameter for listen() is set
to 1 as default in socket_start_incoming_migration_internal(), which will
lead to socket connection error (the queue of pending connections is full)
when "multifd" and "multifd-channels" are set later on and a high number of
channels are used. Set it to a hard-coded higher default value 512 to fix
this issue.

Reported-by: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Lei Wang <lei4.wang@intel.com>
---
 migration/socket.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
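
For context, the function being patched looks roughly like the condensed sketch below, pieced together from the hunks quoted later in this thread; the listener-open call is paraphrased and may differ from the actual migration/socket.c. The point to note is that 'num' ends up as the backlog argument to listen(), and when the initial listen happens at startup it is still 1 unless multifd or postcopy-preempt has already been configured:

/* Condensed sketch, not the verbatim source. */
static void
socket_start_incoming_migration_internal(SocketAddress *saddr, Error **errp)
{
    QIONetListener *listener = qio_net_listener_new();
    MigrationIncomingState *mis = migration_incoming_get_current();
    size_t i;
    int num = 1;                           /* default listen() backlog */

    qio_net_listener_set_name(listener, "migration-socket-listener");

    /* Raised only if these features were configured before the listen,
     * which is not the case when -incoming <uri> is parsed at startup. */
    if (migrate_use_multifd()) {
        num = migrate_multifd_channels();
    } else if (migrate_postcopy_preempt()) {
        num = RAM_CHANNEL_MAX;
    }

    /* 'num' becomes the backlog passed to listen() (call paraphrased). */
    if (qio_net_listener_open_sync(listener, saddr, num, errp) < 0) {
        object_unref(OBJECT(listener));
        return;
    }

    /* ... mis and i are used in the omitted remainder ... */
}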

Comments

Wang, Wei W May 18, 2023, 9:13 a.m. UTC | #1
On Thursday, May 18, 2023 4:52 PM, Wang, Lei4 wrote:
> When destination VM is launched, the "backlog" parameter for listen() is set
> to 1 as default in socket_start_incoming_migration_internal(), which will
> lead to socket connection error (the queue of pending connections is full)
> when "multifd" and "multifd-channels" are set later on and a high number of
> channels are used. Set it to a hard-coded higher default value 512 to fix this
> issue.
> 
> Reported-by: Wei Wang <wei.w.wang@intel.com>
> Signed-off-by: Lei Wang <lei4.wang@intel.com>
> ---
>  migration/socket.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/migration/socket.c b/migration/socket.c
> index 1b6f5baefb..b43a66ef7e 100644
> --- a/migration/socket.c
> +++ b/migration/socket.c
> @@ -179,7 +179,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
>      QIONetListener *listener = qio_net_listener_new();
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      size_t i;
> -    int num = 1;
> +    int num = 512;
> 

Probably we need a macro for it, e.g.
#define MIGRATION_CHANNEL_MAX  512

Also, I think below lines could be removed, as using a larger value of num (i.e. 512)
doesn't seem to consume more resources anywhere:
-    if (migrate_use_multifd()) {
-        num = migrate_multifd_channels();
-    } else if (migrate_postcopy_preempt()) {
-        num = RAM_CHANNEL_MAX;
-    }
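
A rough sketch of how that suggestion might look if applied (hypothetical, not a posted patch; only the lines around the backlog value are shown):

#define MIGRATION_CHANNEL_MAX  512    /* macro name suggested above; value illustrative */

static void
socket_start_incoming_migration_internal(SocketAddress *saddr, Error **errp)
{
    QIONetListener *listener = qio_net_listener_new();
    MigrationIncomingState *mis = migration_incoming_get_current();
    size_t i;
    /* Always listen with a backlog large enough for any supported channel
     * count, so the runtime multifd/postcopy-preempt checks quoted above
     * become unnecessary. */
    int num = MIGRATION_CHANNEL_MAX;
    /* ... rest of the function unchanged ... */
}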
Juan Quintela May 18, 2023, 9:16 a.m. UTC | #2
Lei Wang <lei4.wang@intel.com> wrote:
> When destination VM is launched, the "backlog" parameter for listen() is set
> to 1 as default in socket_start_incoming_migration_internal(), which will
> lead to socket connection error (the queue of pending connections is full)
> when "multifd" and "multifd-channels" are set later on and a high number of
> channels are used. Set it to a hard-coded higher default value 512 to fix
> this issue.
>
> Reported-by: Wei Wang <wei.w.wang@intel.com>
> Signed-off-by: Lei Wang <lei4.wang@intel.com>

[cc'd Daniel, who is the maintainer of qio]

My understanding of that value is that 230 or something like that would
be more than enough.  The maximum number of multifd channels is 256.

Daniel, any opinion?

Later, Juan.

> ---
>  migration/socket.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/migration/socket.c b/migration/socket.c
> index 1b6f5baefb..b43a66ef7e 100644
> --- a/migration/socket.c
> +++ b/migration/socket.c
> @@ -179,7 +179,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
>      QIONetListener *listener = qio_net_listener_new();
>      MigrationIncomingState *mis = migration_incoming_get_current();
>      size_t i;
> -    int num = 1;
> +    int num = 512;
>  
>      qio_net_listener_set_name(listener, "migration-socket-listener");
Juan Quintela May 18, 2023, 11:44 a.m. UTC | #3
"Wang, Wei W" <wei.w.wang@intel.com> wrote:
> On Thursday, May 18, 2023 4:52 PM, Wang, Lei4 wrote:
>> When destination VM is launched, the "backlog" parameter for listen() is set
>> to 1 as default in socket_start_incoming_migration_internal(), which will
>> lead to socket connection error (the queue of pending connections is full)
>> when "multifd" and "multifd-channels" are set later on and a high number of
>> channels are used. Set it to a hard-coded higher default value 512 to fix this
>> issue.
>> 
>> Reported-by: Wei Wang <wei.w.wang@intel.com>
>> Signed-off-by: Lei Wang <lei4.wang@intel.com>
>> ---
>>  migration/socket.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>> 
>> diff --git a/migration/socket.c b/migration/socket.c
>> index 1b6f5baefb..b43a66ef7e 100644
>> --- a/migration/socket.c
>> +++ b/migration/socket.c
>> @@ -179,7 +179,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
>>      QIONetListener *listener = qio_net_listener_new();
>>      MigrationIncomingState *mis = migration_incoming_get_current();
>>      size_t i;
>> -    int num = 1;
>> +    int num = 512;
>> 
>
> Probably we need a macro for it, e.g.
> #define MIGRATION_CHANNEL_MAX  512
>
> Also, I think below lines could be removed, as using a larger value of num (i.e. 512)
> doesn't seem to consume more resources anywhere:

Could you confirm that?

> -    if (migrate_use_multifd()) {
> -        num = migrate_multifd_channels();
> -    } else if (migrate_postcopy_preempt()) {
> -        num = RAM_CHANNEL_MAX;
> -    }

Agreed that in this case we should drop this bit.

But on the other hand, if it doesn't consume more resources, why doesn't
the kernel just ignore the value passed to listen() and use a big
number?

Later, Juan.
Daniel P. Berrangé May 18, 2023, 12:29 p.m. UTC | #4
On Thu, May 18, 2023 at 09:13:58AM +0000, Wang, Wei W wrote:
> On Thursday, May 18, 2023 4:52 PM, Wang, Lei4 wrote:
> > When destination VM is launched, the "backlog" parameter for listen() is set
> > to 1 as default in socket_start_incoming_migration_internal(), which will
> > lead to socket connection error (the queue of pending connections is full)
> > when "multifd" and "multifd-channels" are set later on and a high number of
> > channels are used. Set it to a hard-coded higher default value 512 to fix this
> > issue.
> > 
> > Reported-by: Wei Wang <wei.w.wang@intel.com>
> > Signed-off-by: Lei Wang <lei4.wang@intel.com>
> > ---
> >  migration/socket.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/migration/socket.c b/migration/socket.c
> > index 1b6f5baefb..b43a66ef7e 100644
> > --- a/migration/socket.c
> > +++ b/migration/socket.c
> > @@ -179,7 +179,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
> >      QIONetListener *listener = qio_net_listener_new();
> >      MigrationIncomingState *mis = migration_incoming_get_current();
> >      size_t i;
> > -    int num = 1;
> > +    int num = 512;
> > 
> 
> Probably we need a macro for it, e.g.
> #define MIGRATION_CHANNEL_MAX  512
> 
> Also, I think below lines could be removed, as using a larger value of num (i.e. 512)
> doesn't seem to consume more resources anywhere:
> -    if (migrate_use_multifd()) {
> -        num = migrate_multifd_channels();
> -    } else if (migrate_postcopy_preempt()) {
> -        num = RAM_CHANNEL_MAX;
> -    }

Given that this code already exists, why is it not already sufficient?

The commit description says we're setting backlog == 1 with
multifd, but this later code sets it to match the multifd
channels.  Why isn't that enough?

With regards,
Daniel
Juan Quintela May 18, 2023, 12:42 p.m. UTC | #5
Daniel P. Berrangé <berrange@redhat.com> wrote:
> On Thu, May 18, 2023 at 09:13:58AM +0000, Wang, Wei W wrote:
>> On Thursday, May 18, 2023 4:52 PM, Wang, Lei4 wrote:
>> > When destination VM is launched, the "backlog" parameter for listen() is set
>> > to 1 as default in socket_start_incoming_migration_internal(), which will
>> > lead to socket connection error (the queue of pending connections is full)
>> > when "multifd" and "multifd-channels" are set later on and a high number of
>> > channels are used. Set it to a hard-coded higher default value 512 to fix this
>> > issue.
>> > 
>> > Reported-by: Wei Wang <wei.w.wang@intel.com>
>> > Signed-off-by: Lei Wang <lei4.wang@intel.com>
>> > ---
>> >  migration/socket.c | 2 +-
>> >  1 file changed, 1 insertion(+), 1 deletion(-)
>> > 
>> > diff --git a/migration/socket.c b/migration/socket.c
>> > index 1b6f5baefb..b43a66ef7e 100644
>> > --- a/migration/socket.c
>> > +++ b/migration/socket.c
>> > @@ -179,7 +179,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
>> >      QIONetListener *listener = qio_net_listener_new();
>> >      MigrationIncomingState *mis = migration_incoming_get_current();
>> >      size_t i;
>> > -    int num = 1;
>> > +    int num = 512;
>> > 
>> 
>> Probably we need a macro for it, e.g.
>> #define MIGRATION_CHANNEL_MAX  512
>> 
>> Also, I think below lines could be removed, as using a larger value of num (i.e. 512)
>> doesn't seem to consume more resources anywhere:
>> -    if (migrate_use_multifd()) {
>> -        num = migrate_multifd_channels();
>> -    } else if (migrate_postcopy_preempt()) {
>> -        num = RAM_CHANNEL_MAX;
>> -    }
>
> Given that this code already exists, why is it not already sufficient?

Ah, I "think" I remember now.

> The commit description says we're setting backlog == 1 with
> multifd, but this later code sets it to match the multifd
> channels.  Why isn't that enough?


Are you using -incoming defer?

No? right.

With multifd, you should use -incoming defer.  What is more, you should
always use -incoming defer.  The problem is that the way qemu starts,
when we do the initial listen, the parameters migration_channels and
migration_multifd haven't yet been parsed.

Can you confirm that if you start with -incoming defer, everything works
as expected?

Later, Juan.
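
For readers following along, the deferred setup being described looks roughly like this on the destination side (HMP commands shown; the exact capability and parameter names here are assumed from current QEMU, so treat it as a sketch):

qemu-system-x86_64 ... -incoming defer

(qemu) migrate_set_capability multifd on
(qemu) migrate_set_parameter multifd-channels 16
(qemu) migrate_incoming tcp:0.0.0.0:4444

Because multifd-channels is already known when migrate_incoming finally triggers the listen(), the backlog is sized from migrate_multifd_channels() rather than staying at the default of 1.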
Wang, Wei W May 18, 2023, 3:17 p.m. UTC | #6
On Thursday, May 18, 2023 8:43 PM, Juan Quintela wrote:
> 
> 
> Are you using -incoming defer?
> 
> No? right.
> 
> With multifd, you should use -incoming defer. 

Yes, just confirmed that it works well with deferred incoming.
I think we should enforce this kind of requirement in the code.
I'll make a patch for this.

Thanks,
Wei
Juan Quintela May 18, 2023, 3:28 p.m. UTC | #7
"Wang, Wei W" <wei.w.wang@intel.com> wrote:
> On Thursday, May 18, 2023 8:43 PM, Juan Quintela wrote:
>> 
>> 
>> Are you using -incoming defer?
>> 
>> No? right.
>> 
>> With multifd, you should use -incoming defer. 
>
> Yes, just confirmed that it works well with deferred incoming.
> I think we should enforce this kind of requirement in the code.
> I'll make a patch for this.

Actually, we should deprecate everything except defer.

Later, Juan.
Lei Wang May 19, 2023, 1:30 a.m. UTC | #8
On 5/18/2023 17:16, Juan Quintela wrote:
> Lei Wang <lei4.wang@intel.com> wrote:
>> When destination VM is launched, the "backlog" parameter for listen() is set
>> to 1 as default in socket_start_incoming_migration_internal(), which will
>> lead to socket connection error (the queue of pending connections is full)
>> when "multifd" and "multifd-channels" are set later on and a high number of
>> channels are used. Set it to a hard-coded higher default value 512 to fix
>> this issue.
>>
>> Reported-by: Wei Wang <wei.w.wang@intel.com>
>> Signed-off-by: Lei Wang <lei4.wang@intel.com>
> 
> [cc'd Daniel, who is the maintainer of qio]
> 
> My understanding of that value is that 230 or something like that would
> be more than enough.  The maximum number of multifd channels is 256.

You are right, the "multifd-channels" expects uint8_t, so 256 is enough.

> 
> Daniel, any opinion?
> 
> Later, Juan.
> 
>> ---
>>  migration/socket.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/migration/socket.c b/migration/socket.c
>> index 1b6f5baefb..b43a66ef7e 100644
>> --- a/migration/socket.c
>> +++ b/migration/socket.c
>> @@ -179,7 +179,7 @@ socket_start_incoming_migration_internal(SocketAddress *saddr,
>>      QIONetListener *listener = qio_net_listener_new();
>>      MigrationIncomingState *mis = migration_incoming_get_current();
>>      size_t i;
>> -    int num = 1;
>> +    int num = 512;
>>  
>>      qio_net_listener_set_name(listener, "migration-socket-listener");
>
Wang, Wei W May 19, 2023, 2:44 a.m. UTC | #9
On Friday, May 19, 2023 9:31 AM, Wang, Lei4 wrote:
> On 5/18/2023 17:16, Juan Quintela wrote:
> > Lei Wang <lei4.wang@intel.com> wrote:
> >> When destination VM is launched, the "backlog" parameter for listen()
> >> is set to 1 as default in socket_start_incoming_migration_internal(),
> >> which will lead to socket connection error (the queue of pending
> >> connections is full) when "multifd" and "multifd-channels" are set
> >> later on and a high number of channels are used. Set it to a
> >> hard-coded higher default value 512 to fix this issue.
> >>
> >> Reported-by: Wei Wang <wei.w.wang@intel.com>
> >> Signed-off-by: Lei Wang <lei4.wang@intel.com>
> >
> > [cc'd Daniel, who is the maintainer of qio]
> >
> > My understanding of that value is that 230 or something like that
> > would be more than enough.  The maximum number of multifd channels is 256.
> 
> You are right, the "multifd-channels" expects uint8_t, so 256 is enough.
> 

We can change it to uint16_t or uint32_t, but need to see if listening on a larger
value is OK to everyone.

Man page of listen mentions that the  maximum length of the queue for
incomplete sockets can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog,
and it is 4096 by default on my machine.
Lei Wang May 19, 2023, 2:51 a.m. UTC | #10
On 5/19/2023 10:44, Wang, Wei W wrote:
> On Friday, May 19, 2023 9:31 AM, Wang, Lei4 wrote:
>> On 5/18/2023 17:16, Juan Quintela wrote:
>>> Lei Wang <lei4.wang@intel.com> wrote:
>>>> When destination VM is launched, the "backlog" parameter for listen()
>>>> is set to 1 as default in socket_start_incoming_migration_internal(),
>>>> which will lead to socket connection error (the queue of pending
>>>> connections is full) when "multifd" and "multifd-channels" are set
>>>> later on and a high number of channels are used. Set it to a
>>>> hard-coded higher default value 512 to fix this issue.
>>>>
>>>> Reported-by: Wei Wang <wei.w.wang@intel.com>
>>>> Signed-off-by: Lei Wang <lei4.wang@intel.com>
>>>
>>> [cc'd Daniel, who is the maintainer of qio]
>>>
>>> My understanding of that value is that 230 or something like that
>>> would be more than enough.  The maximum number of multifd channels is 256.
>>
>> You are right, the "multifd-channels" expects uint8_t, so 256 is enough.
>>
> 
> We can change it to uint16_t or uint32_t, but need to see if listening on a larger
> value is OK to everyone.

Is there any use case to use >256 migration channels? If not, then I suppose
there's no need to increase it.

> 
> Man page of listen mentions that the  maximum length of the queue for
> incomplete sockets can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog,
> and it is 4096 by default on my machine
Wang, Wei W May 19, 2023, 3:33 a.m. UTC | #11
On Friday, May 19, 2023 10:52 AM, Wang, Lei4 wrote:
> > We can change it to uint16_t or uint32_t, but need to see if listening
> > on a larger value is OK to everyone.
> 
> Is there any use case to use >256 migration channels? If not, then I suppose
> it's no need to increase it.

People can choose to use more than 256 channels to boost performance.
If it is determined that using more than 256 channels doesn't increase performance
on any of the existing platforms, then we need to have that reflected in the code explicitly,
e.g. fail with an error message when the user does:
migrate_set_parameter multifd-channels 512

> 
> >
> > Man page of listen mentions that the  maximum length of the queue for
> > incomplete sockets can be set using
> > /proc/sys/net/ipv4/tcp_max_syn_backlog,
> > and it is 4096 by default on my machine
Juan Quintela May 19, 2023, 11:22 a.m. UTC | #12
"Wang, Wei W" <wei.w.wang@intel.com> wrote:
> On Friday, May 19, 2023 9:31 AM, Wang, Lei4 wrote:
>> On 5/18/2023 17:16, Juan Quintela wrote:
>> > Lei Wang <lei4.wang@intel.com> wrote:
>> >> When destination VM is launched, the "backlog" parameter for listen()
>> >> is set to 1 as default in socket_start_incoming_migration_internal(),
>> >> which will lead to socket connection error (the queue of pending
>> >> connections is full) when "multifd" and "multifd-channels" are set
>> >> later on and a high number of channels are used. Set it to a
>> >> hard-coded higher default value 512 to fix this issue.
>> >>
>> >> Reported-by: Wei Wang <wei.w.wang@intel.com>
>> >> Signed-off-by: Lei Wang <lei4.wang@intel.com>
>> >
>> > [cc'd Daniel, who is the maintainer of qio]
>> >
>> > My understanding of that value is that 230 or something like that
>> > would be more than enough.  The maximum number of multifd channels is 256.
>> 
>> You are right, the "multifd-channels" expects uint8_t, so 256 is enough.
>> 
>
> We can change it to uint16_t or uint32_t, but need to see if listening on a larger
> value is OK to everyone.

If we need more than 256 channels for migration, we are doing
something really weird.  We can saturate a 100 Gigabit network relatively
easily with 10 channels.  256 channels would mean that we have at least
2 Tbit/s networking.  I am not expecting that really soon.  And as soon
as that happens I would expect CPUs to easily handle more than
10 Gigabits/second.

> Man page of listen mentions that the  maximum length of the queue for
> incomplete sockets can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog,
> and it is 4096 by default on my machine.

I think that the current code is OK.  We just need to enforce that we use
defer.

Later, Juan.
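
(Rough arithmetic behind the estimate above: if 10 channels can saturate a 100 Gbit/s link, each channel moves on the order of 10 Gbit/s, so 256 channels at that per-channel rate would correspond to roughly 256 × 10 Gbit/s ≈ 2.5 Tbit/s of network bandwidth.)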
Juan Quintela May 19, 2023, 11:32 a.m. UTC | #13
"Wang, Wei W" <wei.w.wang@intel.com> wrote:
> On Friday, May 19, 2023 10:52 AM, Wang, Lei4 wrote:
>> > We can change it to uint16_t or uint32_t, but need to see if listening
>> > on a larger value is OK to everyone.
>> 
>> Is there any use case to use >256 migration channels? If not, then I suppose
>> there's no need to increase it.
>
> People can choose to use more than 256 channels to boost performance.

See my other email, I doubt it any time soon O:-)

> If it is determined that using more than 256 channels doesn't increase performance
> on any of the existing platforms, then we need to have that reflected in the code explicitly,
> e.g. fail with an error message when the user does:
> migrate_set_parameter multifd-channels 512


(qemu) migrate_set_parameter multifd-channels 300
Error: Parameter 'multifd-channels' expects uint8_t

So I think that is working.

>> 
>> >
>> > Man page of listen mentions that the  maximum length of the queue for
>> > incomplete sockets can be set using
>> > /proc/sys/net/ipv4/tcp_max_syn_backlog,
>> > and it is 4096 by default on my machine
Peter Xu May 19, 2023, 3:17 p.m. UTC | #14
On Fri, May 19, 2023 at 01:22:20PM +0200, Juan Quintela wrote:
> "Wang, Wei W" <wei.w.wang@intel.com> wrote:
> > On Friday, May 19, 2023 9:31 AM, Wang, Lei4 wrote:
> >> On 5/18/2023 17:16, Juan Quintela wrote:
> >> > Lei Wang <lei4.wang@intel.com> wrote:
> >> >> When destination VM is launched, the "backlog" parameter for listen()
> >> >> is set to 1 as default in socket_start_incoming_migration_internal(),
> >> >> which will lead to socket connection error (the queue of pending
> >> >> connections is full) when "multifd" and "multifd-channels" are set
> >> >> later on and a high number of channels are used. Set it to a
> >> >> hard-coded higher default value 512 to fix this issue.
> >> >>
> >> >> Reported-by: Wei Wang <wei.w.wang@intel.com>
> >> >> Signed-off-by: Lei Wang <lei4.wang@intel.com>
> >> >
> >> > [cc'd Daniel, who is the maintainer of qio]
> >> >
> >> > My understanding of that value is that 230 or something like that
> >> > would be more than enough.  The maximum number of multifd channels is 256.
> >> 
> >> You are right, the "multifd-channels" expects uint8_t, so 256 is enough.
> >> 
> >
> > We can change it to uint16_t or uint32_t, but need to see if listening on a larger
> > value is OK to everyone.
> 
> If we need more than 256 channels for migration, we are doing
> something really weird.  We can saturate a 100 Gigabit network relatively
> easily with 10 channels.  256 channels would mean that we have at least
> 2 Tbit/s networking.  I am not expecting that really soon.  And as soon
> as that happens I would expect CPUs to easily handle more than
> 10 Gigabits/second.

Besides the network limitation, I'd also bet that when the thread count grows
beyond some point it'll start to hit bottlenecks here and there, on either
scheduling or cache bouncing between the cores when atomically
updating the counters (afaiu those need memory bus locking).

So I agree 256 would be good enough.  We can wait and see when there are
higher speed networks.

> 
> > Man page of listen mentions that the  maximum length of the queue for
> > incomplete sockets can be set using /proc/sys/net/ipv4/tcp_max_syn_backlog,
> > and it is 4096 by default on my machine.
> 
> I think that the current code is OK.  We just need to enforce that we use
> defer.
> 
> Later, Juan.
>

Patch

diff --git a/migration/socket.c b/migration/socket.c
index 1b6f5baefb..b43a66ef7e 100644
--- a/migration/socket.c
+++ b/migration/socket.c
@@ -179,7 +179,7 @@  socket_start_incoming_migration_internal(SocketAddress *saddr,
     QIONetListener *listener = qio_net_listener_new();
     MigrationIncomingState *mis = migration_incoming_get_current();
     size_t i;
-    int num = 1;
+    int num = 512;
 
     qio_net_listener_set_name(listener, "migration-socket-listener");