[1/1] monitor: increase amount of data for monitor to read
diff mbox

Message ID 1493732857-10721-1-git-send-email-den@openvz.org
State New
Headers show

Commit Message

Denis V. Lunev May 2, 2017, 1:47 p.m. UTC
Right now QMP and HMP monitors read 1 byte at a time from the socket, which
is very inefficient. With 100+ VMs on the host this easily reasults in
a lot of unnecessary system calls and CPU usage in the system.

This patch changes the amount of data to read to 4096 bytes, which matches
buffer size on the channel level. Fortunately, monitor protocol is
synchronous right now thus we should not face side effects in reality.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
CC: Eric Blake <eblake@redhat.com>
---
 monitor.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Eric Blake May 2, 2017, 2:34 p.m. UTC | #1
On 05/02/2017 08:47 AM, Denis V. Lunev wrote:
> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
> is very inefficient. With 100+ VMs on the host this easily reasults in

s/reasults/results/

> a lot of unnecessary system calls and CPU usage in the system.
> 
> This patch changes the amount of data to read to 4096 bytes, which matches
> buffer size on the channel level. Fortunately, monitor protocol is
> synchronous right now thus we should not face side effects in reality.

Do you have any easy benchmarks or measurements to prove what sort of
efficiencies we get?  (I believe they exist, but quantifying them never
hurts)

> 
> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Markus Armbruster <armbru@redhat.com>
> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> ---
>  monitor.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/monitor.c b/monitor.c
> index be282ec..00df5d0 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -3698,7 +3698,7 @@ static int monitor_can_read(void *opaque)
>  {
>      Monitor *mon = opaque;
>  
> -    return (mon->suspend_cnt == 0) ? 1 : 0;
> +    return (mon->suspend_cnt == 0) ? 4096 : 0;

Is a hard-coded number correct, or should we be asking the channel for
an actual number?

>  }
>  
>  static void handle_qmp_command(JSONMessageParser *parser, GQueue *tokens)
>
Markus Armbruster May 2, 2017, 2:43 p.m. UTC | #2
"Denis V. Lunev" <den@openvz.org> writes:

> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
> is very inefficient. With 100+ VMs on the host this easily reasults in
> a lot of unnecessary system calls and CPU usage in the system.
>
> This patch changes the amount of data to read to 4096 bytes, which matches
> buffer size on the channel level. Fortunately, monitor protocol is
> synchronous right now thus we should not face side effects in reality.

Can you explain briefly why this relies on "synchronous"?  I've spent
all of two seconds on the question myself...

> Signed-off-by: Denis V. Lunev <den@openvz.org>
> CC: Markus Armbruster <armbru@redhat.com>
> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> CC: Eric Blake <eblake@redhat.com>
> ---
>  monitor.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/monitor.c b/monitor.c
> index be282ec..00df5d0 100644
> --- a/monitor.c
> +++ b/monitor.c
> @@ -3698,7 +3698,7 @@ static int monitor_can_read(void *opaque)
>  {
>      Monitor *mon = opaque;
>  
> -    return (mon->suspend_cnt == 0) ? 1 : 0;
> +    return (mon->suspend_cnt == 0) ? 4096 : 0;
>  }
>  
>  static void handle_qmp_command(JSONMessageParser *parser, GQueue *tokens)
Daniel P. Berrangé May 2, 2017, 2:44 p.m. UTC | #3
On Tue, May 02, 2017 at 09:34:55AM -0500, Eric Blake wrote:
> On 05/02/2017 08:47 AM, Denis V. Lunev wrote:
> > Right now QMP and HMP monitors read 1 byte at a time from the socket, which
> > is very inefficient. With 100+ VMs on the host this easily reasults in
> 
> s/reasults/results/
> 
> > a lot of unnecessary system calls and CPU usage in the system.
> > 
> > This patch changes the amount of data to read to 4096 bytes, which matches
> > buffer size on the channel level. Fortunately, monitor protocol is
> > synchronous right now thus we should not face side effects in reality.
> 
> Do you have any easy benchmarks or measurements to prove what sort of
> efficiencies we get?  (I believe they exist, but quantifying them never
> hurts)
> 
> > 
> > Signed-off-by: Denis V. Lunev <den@openvz.org>
> > CC: Markus Armbruster <armbru@redhat.com>
> > CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > CC: Eric Blake <eblake@redhat.com>
> > ---
> >  monitor.c | 2 +-
> >  1 file changed, 1 insertion(+), 1 deletion(-)
> > 
> > diff --git a/monitor.c b/monitor.c
> > index be282ec..00df5d0 100644
> > --- a/monitor.c
> > +++ b/monitor.c
> > @@ -3698,7 +3698,7 @@ static int monitor_can_read(void *opaque)
> >  {
> >      Monitor *mon = opaque;
> >  
> > -    return (mon->suspend_cnt == 0) ? 1 : 0;
> > +    return (mon->suspend_cnt == 0) ? 4096 : 0;
> 
> Is a hard-coded number correct, or should we be asking the channel for
> an actual number?

There's no need - this will cause the chardev code to just do a
gio_channel_read() with a 4096 byte buffer. The chardev backend
impl will then happily return fewer bytes than this - just whatever
happens to be pending. IOW this is just acting as an upper bound
on the amount of data we read at once. So 4k seems reasonable to
me, given the typical size of QMP/HMP command strings.

Regards,
Daniel
Dr. David Alan Gilbert May 2, 2017, 2:49 p.m. UTC | #4
* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Tue, May 02, 2017 at 09:34:55AM -0500, Eric Blake wrote:
> > On 05/02/2017 08:47 AM, Denis V. Lunev wrote:
> > > Right now QMP and HMP monitors read 1 byte at a time from the socket, which
> > > is very inefficient. With 100+ VMs on the host this easily reasults in
> > 
> > s/reasults/results/
> > 
> > > a lot of unnecessary system calls and CPU usage in the system.
> > > 
> > > This patch changes the amount of data to read to 4096 bytes, which matches
> > > buffer size on the channel level. Fortunately, monitor protocol is
> > > synchronous right now thus we should not face side effects in reality.
> > 
> > Do you have any easy benchmarks or measurements to prove what sort of
> > efficiencies we get?  (I believe they exist, but quantifying them never
> > hurts)
> > 
> > > 
> > > Signed-off-by: Denis V. Lunev <den@openvz.org>
> > > CC: Markus Armbruster <armbru@redhat.com>
> > > CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > CC: Eric Blake <eblake@redhat.com>
> > > ---
> > >  monitor.c | 2 +-
> > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > 
> > > diff --git a/monitor.c b/monitor.c
> > > index be282ec..00df5d0 100644
> > > --- a/monitor.c
> > > +++ b/monitor.c
> > > @@ -3698,7 +3698,7 @@ static int monitor_can_read(void *opaque)
> > >  {
> > >      Monitor *mon = opaque;
> > >  
> > > -    return (mon->suspend_cnt == 0) ? 1 : 0;
> > > +    return (mon->suspend_cnt == 0) ? 4096 : 0;
> > 
> > Is a hard-coded number correct, or should we be asking the channel for
> > an actual number?
> 
> There's no need - this will cause the chardev code to just do a
> gio_channel_read() with a 4096 byte buffer. The chardev backend
> impl will then happily return fewer bytes than this - just whatever
> happens to be pending. IOW this is just acting as an upper bound
> on the amount of data we read at once. So 4k seems reasonable to
> me, given the typical size of QMP/HMP command strings.

So there's *no* situation in which that will block?
I'm assuming the reason it read one byte was thats the only thing
that poll() coming back to you guarantees.

Dave


> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Daniel P. Berrangé May 2, 2017, 2:55 p.m. UTC | #5
On Tue, May 02, 2017 at 03:49:52PM +0100, Dr. David Alan Gilbert wrote:
> * Daniel P. Berrange (berrange@redhat.com) wrote:
> > On Tue, May 02, 2017 at 09:34:55AM -0500, Eric Blake wrote:
> > > On 05/02/2017 08:47 AM, Denis V. Lunev wrote:
> > > > Right now QMP and HMP monitors read 1 byte at a time from the socket, which
> > > > is very inefficient. With 100+ VMs on the host this easily reasults in
> > > 
> > > s/reasults/results/
> > > 
> > > > a lot of unnecessary system calls and CPU usage in the system.
> > > > 
> > > > This patch changes the amount of data to read to 4096 bytes, which matches
> > > > buffer size on the channel level. Fortunately, monitor protocol is
> > > > synchronous right now thus we should not face side effects in reality.
> > > 
> > > Do you have any easy benchmarks or measurements to prove what sort of
> > > efficiencies we get?  (I believe they exist, but quantifying them never
> > > hurts)
> > > 
> > > > 
> > > > Signed-off-by: Denis V. Lunev <den@openvz.org>
> > > > CC: Markus Armbruster <armbru@redhat.com>
> > > > CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
> > > > CC: Eric Blake <eblake@redhat.com>
> > > > ---
> > > >  monitor.c | 2 +-
> > > >  1 file changed, 1 insertion(+), 1 deletion(-)
> > > > 
> > > > diff --git a/monitor.c b/monitor.c
> > > > index be282ec..00df5d0 100644
> > > > --- a/monitor.c
> > > > +++ b/monitor.c
> > > > @@ -3698,7 +3698,7 @@ static int monitor_can_read(void *opaque)
> > > >  {
> > > >      Monitor *mon = opaque;
> > > >  
> > > > -    return (mon->suspend_cnt == 0) ? 1 : 0;
> > > > +    return (mon->suspend_cnt == 0) ? 4096 : 0;
> > > 
> > > Is a hard-coded number correct, or should we be asking the channel for
> > > an actual number?
> > 
> > There's no need - this will cause the chardev code to just do a
> > gio_channel_read() with a 4096 byte buffer. The chardev backend
> > impl will then happily return fewer bytes than this - just whatever
> > happens to be pending. IOW this is just acting as an upper bound
> > on the amount of data we read at once. So 4k seems reasonable to
> > me, given the typical size of QMP/HMP command strings.
> 
> So there's *no* situation in which that will block?

Correct.

> I'm assuming the reason it read one byte was thats the only thing
> that poll() coming back to you guarantees.

Poll returning with POLLIN set, guarantees there is at least one byte
pending. A read on a pipe, socket or other FD, will return at long as
it has read at least one byte. It'll never block to fill the entire
buffer [1].

Regards,
Daniel

[1] Except if reading from a regular file, in which case POSIX I/O
    is broken and it'll happily block while the disk seeks to wherever
    the data lives, but that's not an issue for the monitor.
Denis V. Lunev May 2, 2017, 3:29 p.m. UTC | #6
On 05/02/2017 05:43 PM, Markus Armbruster wrote:
> "Denis V. Lunev" <den@openvz.org> writes:
>
>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
>> is very inefficient. With 100+ VMs on the host this easily reasults in
>> a lot of unnecessary system calls and CPU usage in the system.
>>
>> This patch changes the amount of data to read to 4096 bytes, which matches
>> buffer size on the channel level. Fortunately, monitor protocol is
>> synchronous right now thus we should not face side effects in reality.
> Can you explain briefly why this relies on "synchronous"?  I've spent
> all of two seconds on the question myself...
Each command is processed in sequence as it appears in the
channel. The answer to the command is sent and only after that
next command is processed.

Theoretically tith asynchronous processing we can have some side
effects due to changed buffer size.

Den
Denis V. Lunev May 2, 2017, 3:37 p.m. UTC | #7
On 05/02/2017 05:34 PM, Eric Blake wrote:
> On 05/02/2017 08:47 AM, Denis V. Lunev wrote:
>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
>> is very inefficient. With 100+ VMs on the host this easily reasults in
> s/reasults/results/
>
>> a lot of unnecessary system calls and CPU usage in the system.
>>
>> This patch changes the amount of data to read to 4096 bytes, which matches
>> buffer size on the channel level. Fortunately, monitor protocol is
>> synchronous right now thus we should not face side effects in reality.
> Do you have any easy benchmarks or measurements to prove what sort of
> efficiencies we get?  (I believe they exist, but quantifying them never
> hurts)
>
Unfortunately I have not measured numbers, but I am sure that
this will improve the performance by the small number. I have
had in mind calculations like the following:
- our management software executes 6 QMP requests in 10 seconds
  for each VM to collect balloon statistics, disk statistics, CPU
  statistics etc
- lets assume we have 100 VMs
- each byte processing require poll(), which is expensive, and recvmsg,
  i.e. 2 syscalls per byte
- If the request is 50 bytes in length (this number is optimistic) we
  will have
   6 (amount of QMP reqs) * 50 (bytes in req) * 100 (VMs count) * 2
(syscalls per byte) / 10 (seconds) = 6000 syscalls/second

For me this number is not that small.


>> Signed-off-by: Denis V. Lunev <den@openvz.org>
>> CC: Markus Armbruster <armbru@redhat.com>
>> CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
>> CC: Eric Blake <eblake@redhat.com>
>> ---
>>  monitor.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/monitor.c b/monitor.c
>> index be282ec..00df5d0 100644
>> --- a/monitor.c
>> +++ b/monitor.c
>> @@ -3698,7 +3698,7 @@ static int monitor_can_read(void *opaque)
>>  {
>>      Monitor *mon = opaque;
>>  
>> -    return (mon->suspend_cnt == 0) ? 1 : 0;
>> +    return (mon->suspend_cnt == 0) ? 4096 : 0;
> Is a hard-coded number correct, or should we be asking the channel for
> an actual number?
Daniel has suggested good answer here. Though you are right,
it would be better to re-write commit message like this.
'4096 is takes as the number which allows to read most incoming
requests in one read'.

Den
Markus Armbruster May 2, 2017, 4:30 p.m. UTC | #8
"Denis V. Lunev" <den@openvz.org> writes:

> On 05/02/2017 05:43 PM, Markus Armbruster wrote:
>> "Denis V. Lunev" <den@openvz.org> writes:
>>
>>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
>>> is very inefficient. With 100+ VMs on the host this easily reasults in
>>> a lot of unnecessary system calls and CPU usage in the system.
>>>
>>> This patch changes the amount of data to read to 4096 bytes, which matches
>>> buffer size on the channel level. Fortunately, monitor protocol is
>>> synchronous right now thus we should not face side effects in reality.
>> Can you explain briefly why this relies on "synchronous"?  I've spent
>> all of two seconds on the question myself...
> Each command is processed in sequence as it appears in the
> channel. The answer to the command is sent and only after that
> next command is processed.

Yes, that's how QMP works.

> Theoretically tith asynchronous processing we can have some side
> effects due to changed buffer size.

What kind of side effects do you have in mind?

It's quite possible that this obviously inefficient way to read had some
deep reason back when it was created.  Hmm, git-blame is our friend:

commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
Author: Jan Kiszka <jan.kiszka@siemens.com>
Date:   Fri Dec 4 14:05:29 2009 +0100

    monitor: Accept input only byte-wise
    
    This allows to suspend command interpretation and execution
    synchronously, e.g. during migration.
    
    Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
    Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Dr. David Alan Gilbert May 2, 2017, 4:36 p.m. UTC | #9
* Markus Armbruster (armbru@redhat.com) wrote:
> "Denis V. Lunev" <den@openvz.org> writes:
> 
> > On 05/02/2017 05:43 PM, Markus Armbruster wrote:
> >> "Denis V. Lunev" <den@openvz.org> writes:
> >>
> >>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
> >>> is very inefficient. With 100+ VMs on the host this easily reasults in
> >>> a lot of unnecessary system calls and CPU usage in the system.
> >>>
> >>> This patch changes the amount of data to read to 4096 bytes, which matches
> >>> buffer size on the channel level. Fortunately, monitor protocol is
> >>> synchronous right now thus we should not face side effects in reality.
> >> Can you explain briefly why this relies on "synchronous"?  I've spent
> >> all of two seconds on the question myself...
> > Each command is processed in sequence as it appears in the
> > channel. The answer to the command is sent and only after that
> > next command is processed.
> 
> Yes, that's how QMP works.
> 
> > Theoretically tith asynchronous processing we can have some side
> > effects due to changed buffer size.
> 
> What kind of side effects do you have in mind?
> 
> It's quite possible that this obviously inefficient way to read had some
> deep reason back when it was created.  Hmm, git-blame is our friend:
> 
> commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
> Author: Jan Kiszka <jan.kiszka@siemens.com>
> Date:   Fri Dec 4 14:05:29 2009 +0100
> 
>     monitor: Accept input only byte-wise
>     
>     This allows to suspend command interpretation and execution
>     synchronously, e.g. during migration.
>     
>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

I don't think I understand why that's a problem; if we read more bytes,
we're not going to interpret them and execute them until after the previous
command returns are we?

Dave

--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Daniel P. Berrangé May 2, 2017, 4:48 p.m. UTC | #10
On Tue, May 02, 2017 at 05:36:30PM +0100, Dr. David Alan Gilbert wrote:
> * Markus Armbruster (armbru@redhat.com) wrote:
> > "Denis V. Lunev" <den@openvz.org> writes:
> > 
> > > On 05/02/2017 05:43 PM, Markus Armbruster wrote:
> > >> "Denis V. Lunev" <den@openvz.org> writes:
> > >>
> > >>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
> > >>> is very inefficient. With 100+ VMs on the host this easily reasults in
> > >>> a lot of unnecessary system calls and CPU usage in the system.
> > >>>
> > >>> This patch changes the amount of data to read to 4096 bytes, which matches
> > >>> buffer size on the channel level. Fortunately, monitor protocol is
> > >>> synchronous right now thus we should not face side effects in reality.
> > >> Can you explain briefly why this relies on "synchronous"?  I've spent
> > >> all of two seconds on the question myself...
> > > Each command is processed in sequence as it appears in the
> > > channel. The answer to the command is sent and only after that
> > > next command is processed.
> > 
> > Yes, that's how QMP works.
> > 
> > > Theoretically tith asynchronous processing we can have some side
> > > effects due to changed buffer size.
> > 
> > What kind of side effects do you have in mind?
> > 
> > It's quite possible that this obviously inefficient way to read had some
> > deep reason back when it was created.  Hmm, git-blame is our friend:
> > 
> > commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
> > Author: Jan Kiszka <jan.kiszka@siemens.com>
> > Date:   Fri Dec 4 14:05:29 2009 +0100
> > 
> >     monitor: Accept input only byte-wise
> >     
> >     This allows to suspend command interpretation and execution
> >     synchronously, e.g. during migration.
> >     
> >     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> 
> I don't think I understand why that's a problem; if we read more bytes,
> we're not going to interpret them and execute them until after the previous
> command returns are we?

Actually it sees we might do, due to the way the "migrate" command works
in HMP when you don't give the '-d' flag.

Most monitors commands will block the caller until they are finished,
but "migrate" is different. The hmp_migrate() method will return
immediately, but we call monitor_suspend() to block processing of
further commands. If another command has already been read off
the wire though (due to "monitor_read" having a buffer that contains
multiple commands), we would in fact start processing this command
despite having suspended the monitor.

This is only a problem, however, if the client app has issued "migrate"
followed by another command, at the same time without waiting for the
respond to "migrate". So in practice the only way you'd hit the bug
is probably if you just cut+paste a big chunk of commands into the
monitor at once without waiting for completion and one of the commands
was "migrate" without "-d".

Still, I think we would need to figure out a proper fix for this before
we could increase the buffer size.

Regards,
Daniel
Dr. David Alan Gilbert May 2, 2017, 5 p.m. UTC | #11
* Daniel P. Berrange (berrange@redhat.com) wrote:
> On Tue, May 02, 2017 at 05:36:30PM +0100, Dr. David Alan Gilbert wrote:
> > * Markus Armbruster (armbru@redhat.com) wrote:
> > > "Denis V. Lunev" <den@openvz.org> writes:
> > > 
> > > > On 05/02/2017 05:43 PM, Markus Armbruster wrote:
> > > >> "Denis V. Lunev" <den@openvz.org> writes:
> > > >>
> > > >>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
> > > >>> is very inefficient. With 100+ VMs on the host this easily reasults in
> > > >>> a lot of unnecessary system calls and CPU usage in the system.
> > > >>>
> > > >>> This patch changes the amount of data to read to 4096 bytes, which matches
> > > >>> buffer size on the channel level. Fortunately, monitor protocol is
> > > >>> synchronous right now thus we should not face side effects in reality.
> > > >> Can you explain briefly why this relies on "synchronous"?  I've spent
> > > >> all of two seconds on the question myself...
> > > > Each command is processed in sequence as it appears in the
> > > > channel. The answer to the command is sent and only after that
> > > > next command is processed.
> > > 
> > > Yes, that's how QMP works.
> > > 
> > > > Theoretically tith asynchronous processing we can have some side
> > > > effects due to changed buffer size.
> > > 
> > > What kind of side effects do you have in mind?
> > > 
> > > It's quite possible that this obviously inefficient way to read had some
> > > deep reason back when it was created.  Hmm, git-blame is our friend:
> > > 
> > > commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
> > > Author: Jan Kiszka <jan.kiszka@siemens.com>
> > > Date:   Fri Dec 4 14:05:29 2009 +0100
> > > 
> > >     monitor: Accept input only byte-wise
> > >     
> > >     This allows to suspend command interpretation and execution
> > >     synchronously, e.g. during migration.
> > >     
> > >     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> > >     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> > 
> > I don't think I understand why that's a problem; if we read more bytes,
> > we're not going to interpret them and execute them until after the previous
> > command returns are we?
> 
> Actually it sees we might do, due to the way the "migrate" command works
> in HMP when you don't give the '-d' flag.
> 
> Most monitors commands will block the caller until they are finished,
> but "migrate" is different. The hmp_migrate() method will return
> immediately, but we call monitor_suspend() to block processing of
> further commands. If another command has already been read off
> the wire though (due to "monitor_read" having a buffer that contains
> multiple commands), we would in fact start processing this command
> despite having suspended the monitor.

Ah OK; yes that's painful.

> This is only a problem, however, if the client app has issued "migrate"
> followed by another command, at the same time without waiting for the
> respond to "migrate". So in practice the only way you'd hit the bug
> is probably if you just cut+paste a big chunk of commands into the
> monitor at once without waiting for completion and one of the commands
> was "migrate" without "-d".

That's probably not unusual for scripts;  doing something like:
  migrate  ....
  info migrate

isn't that unusual.

> Still, I think we would need to figure out a proper fix for this before
> we could increase the buffer size.

Nod, it's a bit grim.

Dave

> 
> Regards,
> Daniel
> -- 
> |: https://berrange.com      -o-    https://www.flickr.com/photos/dberrange :|
> |: https://libvirt.org         -o-            https://fstop138.berrange.com :|
> |: https://entangle-photo.org    -o-    https://www.instagram.com/dberrange :|
--
Dr. David Alan Gilbert / dgilbert@redhat.com / Manchester, UK
Denis V. Lunev May 2, 2017, 5:07 p.m. UTC | #12
On 05/02/2017 07:48 PM, Daniel P. Berrange wrote:
> On Tue, May 02, 2017 at 05:36:30PM +0100, Dr. David Alan Gilbert wrote:
>> * Markus Armbruster (armbru@redhat.com) wrote:
>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>
>>>> On 05/02/2017 05:43 PM, Markus Armbruster wrote:
>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>
>>>>>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
>>>>>> is very inefficient. With 100+ VMs on the host this easily reasults in
>>>>>> a lot of unnecessary system calls and CPU usage in the system.
>>>>>>
>>>>>> This patch changes the amount of data to read to 4096 bytes, which matches
>>>>>> buffer size on the channel level. Fortunately, monitor protocol is
>>>>>> synchronous right now thus we should not face side effects in reality.
>>>>> Can you explain briefly why this relies on "synchronous"?  I've spent
>>>>> all of two seconds on the question myself...
>>>> Each command is processed in sequence as it appears in the
>>>> channel. The answer to the command is sent and only after that
>>>> next command is processed.
>>> Yes, that's how QMP works.
>>>
>>>> Theoretically tith asynchronous processing we can have some side
>>>> effects due to changed buffer size.
>>> What kind of side effects do you have in mind?
>>>
>>> It's quite possible that this obviously inefficient way to read had some
>>> deep reason back when it was created.  Hmm, git-blame is our friend:
>>>
>>> commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>> Date:   Fri Dec 4 14:05:29 2009 +0100
>>>
>>>     monitor: Accept input only byte-wise
>>>     
>>>     This allows to suspend command interpretation and execution
>>>     synchronously, e.g. during migration.
>>>     
>>>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>> I don't think I understand why that's a problem; if we read more bytes,
>> we're not going to interpret them and execute them until after the previous
>> command returns are we?
> Actually it sees we might do, due to the way the "migrate" command works
> in HMP when you don't give the '-d' flag.
>
> Most monitors commands will block the caller until they are finished,
> but "migrate" is different. The hmp_migrate() method will return
> immediately, but we call monitor_suspend() to block processing of
> further commands. If another command has already been read off
> the wire though (due to "monitor_read" having a buffer that contains
> multiple commands), we would in fact start processing this command
> despite having suspended the monitor.
>
> This is only a problem, however, if the client app has issued "migrate"
> followed by another command, at the same time without waiting for the
> respond to "migrate". So in practice the only way you'd hit the bug
> is probably if you just cut+paste a big chunk of commands into the
> monitor at once without waiting for completion and one of the commands
> was "migrate" without "-d".
>
> Still, I think we would need to figure out a proper fix for this before
> we could increase the buffer size.
>
> Regards,
> Daniel

There is one thing, which simplifies things a lot.
- suspend_cnt can be increased only from 2 places:
  1) monitor_event(), which is called for real HMP monitor only

  2) monitor_suspend(), which can increment suspend_cnt
      only if mon->rs != NULL, which also means that the
      monitor is specifically configured HMP monitor.

So, we can improve the patch (for now) with the following
tweak:

static int monitor_can_read(void *opaque)
{
    Monitor *mon = opaque;
   
    if (monitor_is_qmp(mon))
        return 4096;   
    return (mon->suspend_cnt == 0) ? 1 : 0;
}

This will solve my case completely and does not break any
backward compatibility.

Den
Markus Armbruster May 3, 2017, 11:29 a.m. UTC | #13
"Denis V. Lunev" <den@openvz.org> writes:

> On 05/02/2017 07:48 PM, Daniel P. Berrange wrote:
>> On Tue, May 02, 2017 at 05:36:30PM +0100, Dr. David Alan Gilbert wrote:
>>> * Markus Armbruster (armbru@redhat.com) wrote:
>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>
>>>>> On 05/02/2017 05:43 PM, Markus Armbruster wrote:
>>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>>
>>>>>>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
>>>>>>> is very inefficient. With 100+ VMs on the host this easily reasults in
>>>>>>> a lot of unnecessary system calls and CPU usage in the system.
>>>>>>>
>>>>>>> This patch changes the amount of data to read to 4096 bytes, which matches
>>>>>>> buffer size on the channel level. Fortunately, monitor protocol is
>>>>>>> synchronous right now thus we should not face side effects in reality.
>>>>>> Can you explain briefly why this relies on "synchronous"?  I've spent
>>>>>> all of two seconds on the question myself...
>>>>> Each command is processed in sequence as it appears in the
>>>>> channel. The answer to the command is sent and only after that
>>>>> next command is processed.
>>>> Yes, that's how QMP works.
>>>>
>>>>> Theoretically tith asynchronous processing we can have some side
>>>>> effects due to changed buffer size.
>>>> What kind of side effects do you have in mind?
>>>>
>>>> It's quite possible that this obviously inefficient way to read had some
>>>> deep reason back when it was created.  Hmm, git-blame is our friend:
>>>>
>>>> commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>> Date:   Fri Dec 4 14:05:29 2009 +0100
>>>>
>>>>     monitor: Accept input only byte-wise
>>>>     
>>>>     This allows to suspend command interpretation and execution
>>>>     synchronously, e.g. during migration.
>>>>     
>>>>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>> I don't think I understand why that's a problem; if we read more bytes,
>>> we're not going to interpret them and execute them until after the previous
>>> command returns are we?
>> Actually it sees we might do, due to the way the "migrate" command works
>> in HMP when you don't give the '-d' flag.
>>
>> Most monitors commands will block the caller until they are finished,
>> but "migrate" is different. The hmp_migrate() method will return
>> immediately, but we call monitor_suspend() to block processing of
>> further commands. If another command has already been read off
>> the wire though (due to "monitor_read" having a buffer that contains
>> multiple commands), we would in fact start processing this command
>> despite having suspended the monitor.
>>
>> This is only a problem, however, if the client app has issued "migrate"
>> followed by another command, at the same time without waiting for the
>> respond to "migrate". So in practice the only way you'd hit the bug
>> is probably if you just cut+paste a big chunk of commands into the
>> monitor at once without waiting for completion and one of the commands
>> was "migrate" without "-d".
>>
>> Still, I think we would need to figure out a proper fix for this before
>> we could increase the buffer size.
>>
>> Regards,
>> Daniel
>
> There is one thing, which simplifies things a lot.
> - suspend_cnt can be increased only from 2 places:
>   1) monitor_event(), which is called for real HMP monitor only
>
>   2) monitor_suspend(), which can increment suspend_cnt
>       only if mon->rs != NULL, which also means that the
>       monitor is specifically configured HMP monitor.

I think you're right.  Monitor member suspend_cnt could use a comment.

If there are more members that apply only to HMP, we should collect them
in a MonitorHMP struct, similar to MonitorQMP.

> So, we can improve the patch (for now) with the following
> tweak:
>
> static int monitor_can_read(void *opaque)
> {
>     Monitor *mon = opaque;
>    
>     if (monitor_is_qmp(mon))
>         return 4096;   
>     return (mon->suspend_cnt == 0) ? 1 : 0;
> }

Instead of adding the conditional, I'd split this into two functions,
one for HMP and one for QMP, just like we split the other two callbacks.

> This will solve my case completely and does not break any
> backward compatibility.

No change for HMP.  Okay.

For QMP, monitor_qmp_read() feeds whatever it gets to the JSON lexer.
It currently gets one character at a time, because that's how much
monitor_can_read() returns.  With your change, it gets up to 4KiB.

The JSON lexer feeds tokens to the JSON streamer one at a time until it
has consumed everything it was fed.

The JSON streamer accumulates tokens, parsing them just enough to know
when it has a complete expression.  It pushes the expression to the QMP
expression handler immediately.

The QMP expression handler calls the JSON parser to parse the tokens
into a QObject, then dispatches to QMP command handlers accordingly.

Everything's synchronous.  When a QMP command handler runs, the calling
JSON streamer invocation is handling the command's final closing brace,
and so is the calling JSON lexer.  After the QMP command handler
returns, the JSON streamer returns.  The JSON lexer then looks at the
next character if there are more, else it returns.

The only difference to before that I can see is that we can read ahead.
That's a feature.

Looks safe to me.  Opinions?
Denis V. Lunev May 3, 2017, 11:34 a.m. UTC | #14
On 05/03/2017 02:29 PM, Markus Armbruster wrote:
> "Denis V. Lunev" <den@openvz.org> writes:
>
>> On 05/02/2017 07:48 PM, Daniel P. Berrange wrote:
>>> On Tue, May 02, 2017 at 05:36:30PM +0100, Dr. David Alan Gilbert wrote:
>>>> * Markus Armbruster (armbru@redhat.com) wrote:
>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>
>>>>>> On 05/02/2017 05:43 PM, Markus Armbruster wrote:
>>>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>>>
>>>>>>>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
>>>>>>>> is very inefficient. With 100+ VMs on the host this easily reasults in
>>>>>>>> a lot of unnecessary system calls and CPU usage in the system.
>>>>>>>>
>>>>>>>> This patch changes the amount of data to read to 4096 bytes, which matches
>>>>>>>> buffer size on the channel level. Fortunately, monitor protocol is
>>>>>>>> synchronous right now thus we should not face side effects in reality.
>>>>>>> Can you explain briefly why this relies on "synchronous"?  I've spent
>>>>>>> all of two seconds on the question myself...
>>>>>> Each command is processed in sequence as it appears in the
>>>>>> channel. The answer to the command is sent and only after that
>>>>>> next command is processed.
>>>>> Yes, that's how QMP works.
>>>>>
>>>>>> Theoretically tith asynchronous processing we can have some side
>>>>>> effects due to changed buffer size.
>>>>> What kind of side effects do you have in mind?
>>>>>
>>>>> It's quite possible that this obviously inefficient way to read had some
>>>>> deep reason back when it was created.  Hmm, git-blame is our friend:
>>>>>
>>>>> commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
>>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>>> Date:   Fri Dec 4 14:05:29 2009 +0100
>>>>>
>>>>>     monitor: Accept input only byte-wise
>>>>>     
>>>>>     This allows to suspend command interpretation and execution
>>>>>     synchronously, e.g. during migration.
>>>>>     
>>>>>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>>> I don't think I understand why that's a problem; if we read more bytes,
>>>> we're not going to interpret them and execute them until after the previous
>>>> command returns are we?
>>> Actually it sees we might do, due to the way the "migrate" command works
>>> in HMP when you don't give the '-d' flag.
>>>
>>> Most monitors commands will block the caller until they are finished,
>>> but "migrate" is different. The hmp_migrate() method will return
>>> immediately, but we call monitor_suspend() to block processing of
>>> further commands. If another command has already been read off
>>> the wire though (due to "monitor_read" having a buffer that contains
>>> multiple commands), we would in fact start processing this command
>>> despite having suspended the monitor.
>>>
>>> This is only a problem, however, if the client app has issued "migrate"
>>> followed by another command, at the same time without waiting for the
>>> respond to "migrate". So in practice the only way you'd hit the bug
>>> is probably if you just cut+paste a big chunk of commands into the
>>> monitor at once without waiting for completion and one of the commands
>>> was "migrate" without "-d".
>>>
>>> Still, I think we would need to figure out a proper fix for this before
>>> we could increase the buffer size.
>>>
>>> Regards,
>>> Daniel
>> There is one thing, which simplifies things a lot.
>> - suspend_cnt can be increased only from 2 places:
>>   1) monitor_event(), which is called for real HMP monitor only
>>
>>   2) monitor_suspend(), which can increment suspend_cnt
>>       only if mon->rs != NULL, which also means that the
>>       monitor is specifically configured HMP monitor.
> I think you're right.  Monitor member suspend_cnt could use a comment.
>
> If there are more members that apply only to HMP, we should collect them
> in a MonitorHMP struct, similar to MonitorQMP.
>
I think that this make sense even if this will be a single member as
the readability would be improved.


>> So, we can improve the patch (for now) with the following
>> tweak:
>>
>> static int monitor_can_read(void *opaque)
>> {
>>     Monitor *mon = opaque;
>>    
>>     if (monitor_is_qmp(mon))
>>         return 4096;   
>>     return (mon->suspend_cnt == 0) ? 1 : 0;
>> }
> Instead of adding the conditional, I'd split this into two functions,
> one for HMP and one for QMP, just like we split the other two callbacks.
good idea

>> This will solve my case completely and does not break any
>> backward compatibility.
> No change for HMP.  Okay.
>
> For QMP, monitor_qmp_read() feeds whatever it gets to the JSON lexer.
> It currently gets one character at a time, because that's how much
> monitor_can_read() returns.  With your change, it gets up to 4KiB.
>
> The JSON lexer feeds tokens to the JSON streamer one at a time until it
> has consumed everything it was fed.
>
> The JSON streamer accumulates tokens, parsing them just enough to know
> when it has a complete expression.  It pushes the expression to the QMP
> expression handler immediately.
>
> The QMP expression handler calls the JSON parser to parse the tokens
> into a QObject, then dispatches to QMP command handlers accordingly.
>
> Everything's synchronous.  When a QMP command handler runs, the calling
> JSON streamer invocation is handling the command's final closing brace,
> and so is the calling JSON lexer.  After the QMP command handler
> returns, the JSON streamer returns.  The JSON lexer then looks at the
> next character if there are more, else it returns.
>
> The only difference to before that I can see is that we can read ahead.
> That's a feature.
>
> Looks safe to me.  Opinions?
Looks fair to me.

Den
Daniel P. Berrangé May 3, 2017, 11:35 a.m. UTC | #15
On Wed, May 03, 2017 at 01:29:57PM +0200, Markus Armbruster wrote:
> "Denis V. Lunev" <den@openvz.org> writes:
> 
> > On 05/02/2017 07:48 PM, Daniel P. Berrange wrote:
> >> On Tue, May 02, 2017 at 05:36:30PM +0100, Dr. David Alan Gilbert wrote:
> >>> * Markus Armbruster (armbru@redhat.com) wrote:
> >>>> "Denis V. Lunev" <den@openvz.org> writes:
> >>>>
> >>>>> On 05/02/2017 05:43 PM, Markus Armbruster wrote:
> >>>>>> "Denis V. Lunev" <den@openvz.org> writes:
> >>>>>>
> >>>>>>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
> >>>>>>> is very inefficient. With 100+ VMs on the host this easily reasults in
> >>>>>>> a lot of unnecessary system calls and CPU usage in the system.
> >>>>>>>
> >>>>>>> This patch changes the amount of data to read to 4096 bytes, which matches
> >>>>>>> buffer size on the channel level. Fortunately, monitor protocol is
> >>>>>>> synchronous right now thus we should not face side effects in reality.
> >>>>>> Can you explain briefly why this relies on "synchronous"?  I've spent
> >>>>>> all of two seconds on the question myself...
> >>>>> Each command is processed in sequence as it appears in the
> >>>>> channel. The answer to the command is sent and only after that
> >>>>> next command is processed.
> >>>> Yes, that's how QMP works.
> >>>>
> >>>>> Theoretically tith asynchronous processing we can have some side
> >>>>> effects due to changed buffer size.
> >>>> What kind of side effects do you have in mind?
> >>>>
> >>>> It's quite possible that this obviously inefficient way to read had some
> >>>> deep reason back when it was created.  Hmm, git-blame is our friend:
> >>>>
> >>>> commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
> >>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
> >>>> Date:   Fri Dec 4 14:05:29 2009 +0100
> >>>>
> >>>>     monitor: Accept input only byte-wise
> >>>>     
> >>>>     This allows to suspend command interpretation and execution
> >>>>     synchronously, e.g. during migration.
> >>>>     
> >>>>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
> >>>>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
> >>> I don't think I understand why that's a problem; if we read more bytes,
> >>> we're not going to interpret them and execute them until after the previous
> >>> command returns are we?
> >> Actually it sees we might do, due to the way the "migrate" command works
> >> in HMP when you don't give the '-d' flag.
> >>
> >> Most monitors commands will block the caller until they are finished,
> >> but "migrate" is different. The hmp_migrate() method will return
> >> immediately, but we call monitor_suspend() to block processing of
> >> further commands. If another command has already been read off
> >> the wire though (due to "monitor_read" having a buffer that contains
> >> multiple commands), we would in fact start processing this command
> >> despite having suspended the monitor.
> >>
> >> This is only a problem, however, if the client app has issued "migrate"
> >> followed by another command, at the same time without waiting for the
> >> respond to "migrate". So in practice the only way you'd hit the bug
> >> is probably if you just cut+paste a big chunk of commands into the
> >> monitor at once without waiting for completion and one of the commands
> >> was "migrate" without "-d".
> >>
> >> Still, I think we would need to figure out a proper fix for this before
> >> we could increase the buffer size.
> >>
> >> Regards,
> >> Daniel
> >
> > There is one thing, which simplifies things a lot.
> > - suspend_cnt can be increased only from 2 places:
> >   1) monitor_event(), which is called for real HMP monitor only
> >
> >   2) monitor_suspend(), which can increment suspend_cnt
> >       only if mon->rs != NULL, which also means that the
> >       monitor is specifically configured HMP monitor.
> 
> I think you're right.  Monitor member suspend_cnt could use a comment.
> 
> If there are more members that apply only to HMP, we should collect them
> in a MonitorHMP struct, similar to MonitorQMP.
> 
> > So, we can improve the patch (for now) with the following
> > tweak:
> >
> > static int monitor_can_read(void *opaque)
> > {
> >     Monitor *mon = opaque;
> >    
> >     if (monitor_is_qmp(mon))
> >         return 4096;   
> >     return (mon->suspend_cnt == 0) ? 1 : 0;
> > }
> 
> Instead of adding the conditional, I'd split this into two functions,
> one for HMP and one for QMP, just like we split the other two callbacks.
> 
> > This will solve my case completely and does not break any
> > backward compatibility.
> 
> No change for HMP.  Okay.
> 
> For QMP, monitor_qmp_read() feeds whatever it gets to the JSON lexer.
> It currently gets one character at a time, because that's how much
> monitor_can_read() returns.  With your change, it gets up to 4KiB.
> 
> The JSON lexer feeds tokens to the JSON streamer one at a time until it
> has consumed everything it was fed.
> 
> The JSON streamer accumulates tokens, parsing them just enough to know
> when it has a complete expression.  It pushes the expression to the QMP
> expression handler immediately.
> 
> The QMP expression handler calls the JSON parser to parse the tokens
> into a QObject, then dispatches to QMP command handlers accordingly.
> 
> Everything's synchronous.  When a QMP command handler runs, the calling
> JSON streamer invocation is handling the command's final closing brace,
> and so is the calling JSON lexer.  After the QMP command handler
> returns, the JSON streamer returns.  The JSON lexer then looks at the
> next character if there are more, else it returns.
> 
> The only difference to before that I can see is that we can read ahead.
> That's a feature.
> 
> Looks safe to me.  Opinions?

Yes, I concur, it looks safe for QMP.

I might suggest putting an assert(!qmp) in monitor_suspend() to guarantee
no one accidentally introduces usage of the suspend feature in QMP in
future.


I think we could make it safe for HMP too, eventually, if we really wanted
to. In monitor_read(), we feed bytes 1 at a time into the parser. After
each byte is processed, we would need to check whether the monitor is now
suspended or not. If it is suspended, we would have to stop feeding bytes
into the parser, and stash them in the Monitor object for later use. When
the monitor_resume was triggered, then we could carry on processing the
stashed bytes.

Of course, that's not a blocker for doing the quick change for QMP only
right now.

Regards,
Daniel
Denis V. Lunev May 3, 2017, 11:39 a.m. UTC | #16
On 05/03/2017 02:35 PM, Daniel P. Berrange wrote:
> On Wed, May 03, 2017 at 01:29:57PM +0200, Markus Armbruster wrote:
>> "Denis V. Lunev" <den@openvz.org> writes:
>>
>>> On 05/02/2017 07:48 PM, Daniel P. Berrange wrote:
>>>> On Tue, May 02, 2017 at 05:36:30PM +0100, Dr. David Alan Gilbert wrote:
>>>>> * Markus Armbruster (armbru@redhat.com) wrote:
>>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>>
>>>>>>> On 05/02/2017 05:43 PM, Markus Armbruster wrote:
>>>>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>>>>
>>>>>>>>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
>>>>>>>>> is very inefficient. With 100+ VMs on the host this easily reasults in
>>>>>>>>> a lot of unnecessary system calls and CPU usage in the system.
>>>>>>>>>
>>>>>>>>> This patch changes the amount of data to read to 4096 bytes, which matches
>>>>>>>>> buffer size on the channel level. Fortunately, monitor protocol is
>>>>>>>>> synchronous right now thus we should not face side effects in reality.
>>>>>>>> Can you explain briefly why this relies on "synchronous"?  I've spent
>>>>>>>> all of two seconds on the question myself...
>>>>>>> Each command is processed in sequence as it appears in the
>>>>>>> channel. The answer to the command is sent and only after that
>>>>>>> next command is processed.
>>>>>> Yes, that's how QMP works.
>>>>>>
>>>>>>> Theoretically tith asynchronous processing we can have some side
>>>>>>> effects due to changed buffer size.
>>>>>> What kind of side effects do you have in mind?
>>>>>>
>>>>>> It's quite possible that this obviously inefficient way to read had some
>>>>>> deep reason back when it was created.  Hmm, git-blame is our friend:
>>>>>>
>>>>>> commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
>>>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>> Date:   Fri Dec 4 14:05:29 2009 +0100
>>>>>>
>>>>>>     monitor: Accept input only byte-wise
>>>>>>     
>>>>>>     This allows to suspend command interpretation and execution
>>>>>>     synchronously, e.g. during migration.
>>>>>>     
>>>>>>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>>>> I don't think I understand why that's a problem; if we read more bytes,
>>>>> we're not going to interpret them and execute them until after the previous
>>>>> command returns are we?
>>>> Actually it sees we might do, due to the way the "migrate" command works
>>>> in HMP when you don't give the '-d' flag.
>>>>
>>>> Most monitors commands will block the caller until they are finished,
>>>> but "migrate" is different. The hmp_migrate() method will return
>>>> immediately, but we call monitor_suspend() to block processing of
>>>> further commands. If another command has already been read off
>>>> the wire though (due to "monitor_read" having a buffer that contains
>>>> multiple commands), we would in fact start processing this command
>>>> despite having suspended the monitor.
>>>>
>>>> This is only a problem, however, if the client app has issued "migrate"
>>>> followed by another command, at the same time without waiting for the
>>>> respond to "migrate". So in practice the only way you'd hit the bug
>>>> is probably if you just cut+paste a big chunk of commands into the
>>>> monitor at once without waiting for completion and one of the commands
>>>> was "migrate" without "-d".
>>>>
>>>> Still, I think we would need to figure out a proper fix for this before
>>>> we could increase the buffer size.
>>>>
>>>> Regards,
>>>> Daniel
>>> There is one thing, which simplifies things a lot.
>>> - suspend_cnt can be increased only from 2 places:
>>>   1) monitor_event(), which is called for real HMP monitor only
>>>
>>>   2) monitor_suspend(), which can increment suspend_cnt
>>>       only if mon->rs != NULL, which also means that the
>>>       monitor is specifically configured HMP monitor.
>> I think you're right.  Monitor member suspend_cnt could use a comment.
>>
>> If there are more members that apply only to HMP, we should collect them
>> in a MonitorHMP struct, similar to MonitorQMP.
>>
>>> So, we can improve the patch (for now) with the following
>>> tweak:
>>>
>>> static int monitor_can_read(void *opaque)
>>> {
>>>     Monitor *mon = opaque;
>>>    
>>>     if (monitor_is_qmp(mon))
>>>         return 4096;   
>>>     return (mon->suspend_cnt == 0) ? 1 : 0;
>>> }
>> Instead of adding the conditional, I'd split this into two functions,
>> one for HMP and one for QMP, just like we split the other two callbacks.
>>
>>> This will solve my case completely and does not break any
>>> backward compatibility.
>> No change for HMP.  Okay.
>>
>> For QMP, monitor_qmp_read() feeds whatever it gets to the JSON lexer.
>> It currently gets one character at a time, because that's how much
>> monitor_can_read() returns.  With your change, it gets up to 4KiB.
>>
>> The JSON lexer feeds tokens to the JSON streamer one at a time until it
>> has consumed everything it was fed.
>>
>> The JSON streamer accumulates tokens, parsing them just enough to know
>> when it has a complete expression.  It pushes the expression to the QMP
>> expression handler immediately.
>>
>> The QMP expression handler calls the JSON parser to parse the tokens
>> into a QObject, then dispatches to QMP command handlers accordingly.
>>
>> Everything's synchronous.  When a QMP command handler runs, the calling
>> JSON streamer invocation is handling the command's final closing brace,
>> and so is the calling JSON lexer.  After the QMP command handler
>> returns, the JSON streamer returns.  The JSON lexer then looks at the
>> next character if there are more, else it returns.
>>
>> The only difference to before that I can see is that we can read ahead.
>> That's a feature.
>>
>> Looks safe to me.  Opinions?
> Yes, I concur, it looks safe for QMP.
>
> I might suggest putting an assert(!qmp) in monitor_suspend() to guarantee
> no one accidentally introduces usage of the suspend feature in QMP in
> future.
>
>
> I think we could make it safe for HMP too, eventually, if we really wanted
> to. In monitor_read(), we feed bytes 1 at a time into the parser. After
> each byte is processed, we would need to check whether the monitor is now
> suspended or not. If it is suspended, we would have to stop feeding bytes
> into the parser, and stash them in the Monitor object for later use. When
> the monitor_resume was triggered, then we could carry on processing the
> stashed bytes.
>
> Of course, that's not a blocker for doing the quick change for QMP only
> right now.
>
> Regards,
> Daniel
I have though on this but fear about the context. Are we really able to
start
QMP processing from it? Usage of bottom-half is also unclear.
What about locking and other interesting stuff?

Den
Marc-André Lureau May 3, 2017, 11:55 a.m. UTC | #17
Hi

On Wed, May 3, 2017 at 3:36 PM Daniel P. Berrange <berrange@redhat.com>
wrote:

> On Wed, May 03, 2017 at 01:29:57PM +0200, Markus Armbruster wrote:
>
> > The only difference to before that I can see is that we can read ahead.
> > That's a feature.
> >
> > Looks safe to me.  Opinions?
>
> Yes, I concur, it looks safe for QMP.
>
> I might suggest putting an assert(!qmp) in monitor_suspend() to guarantee
> no one accidentally introduces usage of the suspend feature in QMP in
> future.
>

fwiw, in the qapi-async series, I added a bunch of related assert:
https://github.com/elmarco/qemu/commit/48d0691fef7602b652b8e2a2a8c0c6665f8e7c14
Markus Armbruster May 10, 2017, 3:54 p.m. UTC | #18
"Denis V. Lunev" <den@openvz.org> writes:

> On 05/03/2017 02:29 PM, Markus Armbruster wrote:
>> "Denis V. Lunev" <den@openvz.org> writes:
>>
>>> On 05/02/2017 07:48 PM, Daniel P. Berrange wrote:
>>>> On Tue, May 02, 2017 at 05:36:30PM +0100, Dr. David Alan Gilbert wrote:
>>>>> * Markus Armbruster (armbru@redhat.com) wrote:
>>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>>
>>>>>>> On 05/02/2017 05:43 PM, Markus Armbruster wrote:
>>>>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>>>>
>>>>>>>>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
>>>>>>>>> is very inefficient. With 100+ VMs on the host this easily reasults in
>>>>>>>>> a lot of unnecessary system calls and CPU usage in the system.
>>>>>>>>>
>>>>>>>>> This patch changes the amount of data to read to 4096 bytes, which matches
>>>>>>>>> buffer size on the channel level. Fortunately, monitor protocol is
>>>>>>>>> synchronous right now thus we should not face side effects in reality.
>>>>>>>> Can you explain briefly why this relies on "synchronous"?  I've spent
>>>>>>>> all of two seconds on the question myself...
>>>>>>> Each command is processed in sequence as it appears in the
>>>>>>> channel. The answer to the command is sent and only after that
>>>>>>> next command is processed.
>>>>>> Yes, that's how QMP works.
>>>>>>
>>>>>>> Theoretically tith asynchronous processing we can have some side
>>>>>>> effects due to changed buffer size.
>>>>>> What kind of side effects do you have in mind?
>>>>>>
>>>>>> It's quite possible that this obviously inefficient way to read had some
>>>>>> deep reason back when it was created.  Hmm, git-blame is our friend:
>>>>>>
>>>>>> commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
>>>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>> Date:   Fri Dec 4 14:05:29 2009 +0100
>>>>>>
>>>>>>     monitor: Accept input only byte-wise
>>>>>>     
>>>>>>     This allows to suspend command interpretation and execution
>>>>>>     synchronously, e.g. during migration.
>>>>>>     
>>>>>>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>>>> I don't think I understand why that's a problem; if we read more bytes,
>>>>> we're not going to interpret them and execute them until after the previous
>>>>> command returns are we?
>>>> Actually it sees we might do, due to the way the "migrate" command works
>>>> in HMP when you don't give the '-d' flag.
>>>>
>>>> Most monitors commands will block the caller until they are finished,
>>>> but "migrate" is different. The hmp_migrate() method will return
>>>> immediately, but we call monitor_suspend() to block processing of
>>>> further commands. If another command has already been read off
>>>> the wire though (due to "monitor_read" having a buffer that contains
>>>> multiple commands), we would in fact start processing this command
>>>> despite having suspended the monitor.
>>>>
>>>> This is only a problem, however, if the client app has issued "migrate"
>>>> followed by another command, at the same time without waiting for the
>>>> respond to "migrate". So in practice the only way you'd hit the bug
>>>> is probably if you just cut+paste a big chunk of commands into the
>>>> monitor at once without waiting for completion and one of the commands
>>>> was "migrate" without "-d".
>>>>
>>>> Still, I think we would need to figure out a proper fix for this before
>>>> we could increase the buffer size.
>>>>
>>>> Regards,
>>>> Daniel
>>> There is one thing, which simplifies things a lot.
>>> - suspend_cnt can be increased only from 2 places:
>>>   1) monitor_event(), which is called for real HMP monitor only
>>>
>>>   2) monitor_suspend(), which can increment suspend_cnt
>>>       only if mon->rs != NULL, which also means that the
>>>       monitor is specifically configured HMP monitor.
>> I think you're right.  Monitor member suspend_cnt could use a comment.
>>
>> If there are more members that apply only to HMP, we should collect them
>> in a MonitorHMP struct, similar to MonitorQMP.
>>
> I think that this make sense even if this will be a single member as
> the readability would be improved.
>
>
>>> So, we can improve the patch (for now) with the following
>>> tweak:
>>>
>>> static int monitor_can_read(void *opaque)
>>> {
>>>     Monitor *mon = opaque;
>>>    
>>>     if (monitor_is_qmp(mon))
>>>         return 4096;   
>>>     return (mon->suspend_cnt == 0) ? 1 : 0;
>>> }
>> Instead of adding the conditional, I'd split this into two functions,
>> one for HMP and one for QMP, just like we split the other two callbacks.
> good idea
>
>>> This will solve my case completely and does not break any
>>> backward compatibility.
>> No change for HMP.  Okay.
>>
>> For QMP, monitor_qmp_read() feeds whatever it gets to the JSON lexer.
>> It currently gets one character at a time, because that's how much
>> monitor_can_read() returns.  With your change, it gets up to 4KiB.
>>
>> The JSON lexer feeds tokens to the JSON streamer one at a time until it
>> has consumed everything it was fed.
>>
>> The JSON streamer accumulates tokens, parsing them just enough to know
>> when it has a complete expression.  It pushes the expression to the QMP
>> expression handler immediately.
>>
>> The QMP expression handler calls the JSON parser to parse the tokens
>> into a QObject, then dispatches to QMP command handlers accordingly.
>>
>> Everything's synchronous.  When a QMP command handler runs, the calling
>> JSON streamer invocation is handling the command's final closing brace,
>> and so is the calling JSON lexer.  After the QMP command handler
>> returns, the JSON streamer returns.  The JSON lexer then looks at the
>> next character if there are more, else it returns.
>>
>> The only difference to before that I can see is that we can read ahead.
>> That's a feature.
>>
>> Looks safe to me.  Opinions?
> Looks fair to me.

Care to post a formal patch?  Or did I miss it?
Denis V. Lunev May 10, 2017, 4:01 p.m. UTC | #19
On 05/10/2017 05:54 PM, Markus Armbruster wrote:
> "Denis V. Lunev" <den@openvz.org> writes:
>
>> On 05/03/2017 02:29 PM, Markus Armbruster wrote:
>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>
>>>> On 05/02/2017 07:48 PM, Daniel P. Berrange wrote:
>>>>> On Tue, May 02, 2017 at 05:36:30PM +0100, Dr. David Alan Gilbert wrote:
>>>>>> * Markus Armbruster (armbru@redhat.com) wrote:
>>>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>>>
>>>>>>>> On 05/02/2017 05:43 PM, Markus Armbruster wrote:
>>>>>>>>> "Denis V. Lunev" <den@openvz.org> writes:
>>>>>>>>>
>>>>>>>>>> Right now QMP and HMP monitors read 1 byte at a time from the socket, which
>>>>>>>>>> is very inefficient. With 100+ VMs on the host this easily reasults in
>>>>>>>>>> a lot of unnecessary system calls and CPU usage in the system.
>>>>>>>>>>
>>>>>>>>>> This patch changes the amount of data to read to 4096 bytes, which matches
>>>>>>>>>> buffer size on the channel level. Fortunately, monitor protocol is
>>>>>>>>>> synchronous right now thus we should not face side effects in reality.
>>>>>>>>> Can you explain briefly why this relies on "synchronous"?  I've spent
>>>>>>>>> all of two seconds on the question myself...
>>>>>>>> Each command is processed in sequence as it appears in the
>>>>>>>> channel. The answer to the command is sent and only after that
>>>>>>>> next command is processed.
>>>>>>> Yes, that's how QMP works.
>>>>>>>
>>>>>>>> Theoretically tith asynchronous processing we can have some side
>>>>>>>> effects due to changed buffer size.
>>>>>>> What kind of side effects do you have in mind?
>>>>>>>
>>>>>>> It's quite possible that this obviously inefficient way to read had some
>>>>>>> deep reason back when it was created.  Hmm, git-blame is our friend:
>>>>>>>
>>>>>>> commit c62313bbdc48f72e93fa8196f2fff96ba35e4e9d
>>>>>>> Author: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>> Date:   Fri Dec 4 14:05:29 2009 +0100
>>>>>>>
>>>>>>>     monitor: Accept input only byte-wise
>>>>>>>     
>>>>>>>     This allows to suspend command interpretation and execution
>>>>>>>     synchronously, e.g. during migration.
>>>>>>>     
>>>>>>>     Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
>>>>>>>     Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
>>>>>> I don't think I understand why that's a problem; if we read more bytes,
>>>>>> we're not going to interpret them and execute them until after the previous
>>>>>> command returns are we?
>>>>> Actually it sees we might do, due to the way the "migrate" command works
>>>>> in HMP when you don't give the '-d' flag.
>>>>>
>>>>> Most monitors commands will block the caller until they are finished,
>>>>> but "migrate" is different. The hmp_migrate() method will return
>>>>> immediately, but we call monitor_suspend() to block processing of
>>>>> further commands. If another command has already been read off
>>>>> the wire though (due to "monitor_read" having a buffer that contains
>>>>> multiple commands), we would in fact start processing this command
>>>>> despite having suspended the monitor.
>>>>>
>>>>> This is only a problem, however, if the client app has issued "migrate"
>>>>> followed by another command, at the same time without waiting for the
>>>>> respond to "migrate". So in practice the only way you'd hit the bug
>>>>> is probably if you just cut+paste a big chunk of commands into the
>>>>> monitor at once without waiting for completion and one of the commands
>>>>> was "migrate" without "-d".
>>>>>
>>>>> Still, I think we would need to figure out a proper fix for this before
>>>>> we could increase the buffer size.
>>>>>
>>>>> Regards,
>>>>> Daniel
>>>> There is one thing, which simplifies things a lot.
>>>> - suspend_cnt can be increased only from 2 places:
>>>>   1) monitor_event(), which is called for real HMP monitor only
>>>>
>>>>   2) monitor_suspend(), which can increment suspend_cnt
>>>>       only if mon->rs != NULL, which also means that the
>>>>       monitor is specifically configured HMP monitor.
>>> I think you're right.  Monitor member suspend_cnt could use a comment.
>>>
>>> If there are more members that apply only to HMP, we should collect them
>>> in a MonitorHMP struct, similar to MonitorQMP.
>>>
>> I think that this make sense even if this will be a single member as
>> the readability would be improved.
>>
>>
>>>> So, we can improve the patch (for now) with the following
>>>> tweak:
>>>>
>>>> static int monitor_can_read(void *opaque)
>>>> {
>>>>     Monitor *mon = opaque;
>>>>    
>>>>     if (monitor_is_qmp(mon))
>>>>         return 4096;   
>>>>     return (mon->suspend_cnt == 0) ? 1 : 0;
>>>> }
>>> Instead of adding the conditional, I'd split this into two functions,
>>> one for HMP and one for QMP, just like we split the other two callbacks.
>> good idea
>>
>>>> This will solve my case completely and does not break any
>>>> backward compatibility.
>>> No change for HMP.  Okay.
>>>
>>> For QMP, monitor_qmp_read() feeds whatever it gets to the JSON lexer.
>>> It currently gets one character at a time, because that's how much
>>> monitor_can_read() returns.  With your change, it gets up to 4KiB.
>>>
>>> The JSON lexer feeds tokens to the JSON streamer one at a time until it
>>> has consumed everything it was fed.
>>>
>>> The JSON streamer accumulates tokens, parsing them just enough to know
>>> when it has a complete expression.  It pushes the expression to the QMP
>>> expression handler immediately.
>>>
>>> The QMP expression handler calls the JSON parser to parse the tokens
>>> into a QObject, then dispatches to QMP command handlers accordingly.
>>>
>>> Everything's synchronous.  When a QMP command handler runs, the calling
>>> JSON streamer invocation is handling the command's final closing brace,
>>> and so is the calling JSON lexer.  After the QMP command handler
>>> returns, the JSON streamer returns.  The JSON lexer then looks at the
>>> next character if there are more, else it returns.
>>>
>>> The only difference to before that I can see is that we can read ahead.
>>> That's a feature.
>>>
>>> Looks safe to me.  Opinions?
>> Looks fair to me.
> Care to post a formal patch?  Or did I miss it?
I will. Sorry, we have big holidays in Russia which
were enlarged by my vacation.

I'll try to do that tomorrow.

Den

Patch
diff mbox

diff --git a/monitor.c b/monitor.c
index be282ec..00df5d0 100644
--- a/monitor.c
+++ b/monitor.c
@@ -3698,7 +3698,7 @@  static int monitor_can_read(void *opaque)
 {
     Monitor *mon = opaque;
 
-    return (mon->suspend_cnt == 0) ? 1 : 0;
+    return (mon->suspend_cnt == 0) ? 4096 : 0;
 }
 
 static void handle_qmp_command(JSONMessageParser *parser, GQueue *tokens)