diff mbox

[PULL,10/14] ui: fix VNC client throttling when audio capture is active

Message ID 20180112125854.18261-11-kraxel@redhat.com (mailing list archive)
State New, archived
Headers show

Commit Message

Gerd Hoffmann Jan. 12, 2018, 12:58 p.m. UTC
From: "Daniel P. Berrange" <berrange@redhat.com>

The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).

The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check must be disabled if audio capture is
enabled, because when streaming audio the output buffer offset will rarely be
zero due to queued audio data, and so this would starve framebuffer updates.

As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
They can first start something in the guest that triggers lots of framebuffer
updates eg play a youtube video. Then enable audio capture, and simply never
read data back from the server. This can easily make QEMU's VNC server send
buffer consume 100MB of RAM per second, until the OOM killer starts reaping
processes (hopefully the rogue QEMU process, but it might pick others...).

To address this we make the throttling more intelligent, so we can throttle
when audio capture is active too. To determine how to throttle incremental
updates or audio data, we calculate a size threshold. Normally the threshold is
the approximate number of bytes associated with a single complete framebuffer
update. ie width * height * bytes per pixel. We'll send incremental updates
until we hit this threshold, at which point we'll stop sending updates until
data has been written to the wire, causing the output buffer offset to fall
back below the threshold.

If audio capture is enabled, we increase the size of the threshold to also
allow for upto 1 seconds worth of audio data samples. ie nchannels * bytes
per sample * frequency. This allows the output buffer to have a mixture of
incremental framebuffer updates and audio data queued, but once the threshold
is exceeded, audio data will be dropped and incremental updates will be
throttled.

This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but its possible other processes can get taken out as
collateral damage.

This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:

  commit a7b20a8efa28e5f22c26c06cd06c2f12bc863493
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Oct 9 14:43:42 2017 +0100

    io: monitor encoutput buffer size from websocket GSource

This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-10-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
---
 ui/vnc.h |  6 ++++++
 ui/vnc.c | 72 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++--------
 2 files changed, 70 insertions(+), 8 deletions(-)

Comments

Peter Maydell Jan. 18, 2018, 1:29 p.m. UTC | #1
On 12 January 2018 at 12:58, Gerd Hoffmann <kraxel@redhat.com> wrote:
> From: "Daniel P. Berrange" <berrange@redhat.com>
>
> The VNC server must throttle data sent to the client to prevent the 'output'
> buffer size growing without bound, if the client stops reading data off the
> socket (either maliciously or due to stalled/slow network connection).

Hi. Coverity (CID 1385147) complains about a suspicious sign extension
in this patch:

> +/*
> + * Figure out how much pending data we should allow in the output
> + * buffer before we throttle incremental display updates, and/or
> + * drop audio samples.
> + *
> + * We allow for equiv of 1 full display's worth of FB updates,
> + * and 1 second of audio samples. If audio backlog was larger
> + * than that the client would already suffering awful audio
> + * glitches, so dropping samples is no worse really).
> + */
> +static void vnc_update_throttle_offset(VncState *vs)
> +{
> +    size_t offset =
> +        vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel;

because the multiply is done with the "int" type, and then may
be sign-extended when converted to the probably-64-bit unsigned
size_t, resulting in the high bits all being set if the
multiply ended up with a 1 in bit 31.

thanks
-- PMM
Daniel P. Berrangé Jan. 18, 2018, 1:36 p.m. UTC | #2
On Thu, Jan 18, 2018 at 01:29:35PM +0000, Peter Maydell wrote:
> On 12 January 2018 at 12:58, Gerd Hoffmann <kraxel@redhat.com> wrote:
> > From: "Daniel P. Berrange" <berrange@redhat.com>
> >
> > The VNC server must throttle data sent to the client to prevent the 'output'
> > buffer size growing without bound, if the client stops reading data off the
> > socket (either maliciously or due to stalled/slow network connection).
> 
> Hi. Coverity (CID 1385147) complains about a suspicious sign extension
> in this patch:
> 
> > +/*
> > + * Figure out how much pending data we should allow in the output
> > + * buffer before we throttle incremental display updates, and/or
> > + * drop audio samples.
> > + *
> > + * We allow for equiv of 1 full display's worth of FB updates,
> > + * and 1 second of audio samples. If audio backlog was larger
> > + * than that the client would already suffering awful audio
> > + * glitches, so dropping samples is no worse really).
> > + */
> > +static void vnc_update_throttle_offset(VncState *vs)
> > +{
> > +    size_t offset =
> > +        vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel;
> 
> because the multiply is done with the "int" type, and then may
> be sign-extended when converted to the probably-64-bit unsigned
> size_t, resulting in the high bits all being set if the
> multiply ended up with a 1 in bit 31.

I guess we can usefully change client_width/client_height to be an unsigned
int, since there's no valid scenario for them to be negative.

Regards,
Daniel
Paolo Bonzini Jan. 18, 2018, 1:54 p.m. UTC | #3
On 18/01/2018 14:36, Daniel P. Berrange wrote:
>>> +/*
>>> + * Figure out how much pending data we should allow in the output
>>> + * buffer before we throttle incremental display updates, and/or
>>> + * drop audio samples.
>>> + *
>>> + * We allow for equiv of 1 full display's worth of FB updates,
>>> + * and 1 second of audio samples. If audio backlog was larger
>>> + * than that the client would already suffering awful audio
>>> + * glitches, so dropping samples is no worse really).
>>> + */
>>> +static void vnc_update_throttle_offset(VncState *vs)
>>> +{
>>> +    size_t offset =
>>> +        vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel;
>> because the multiply is done with the "int" type, and then may
>> be sign-extended when converted to the probably-64-bit unsigned
>> size_t, resulting in the high bits all being set if the
>> multiply ended up with a 1 in bit 31.
> I guess we can usefully change client_width/client_height to be an unsigned
> int, since there's no valid scenario for them to be negative.

In addition to that, do we support a >= 2 GiB framebuffer at all? (Even
with unsigned ints, Coverity would rightly complain about a truncated
32-bit multiplication being assigned to a 64-bit value).

Paolo
Daniel P. Berrangé Jan. 18, 2018, 2:12 p.m. UTC | #4
On Thu, Jan 18, 2018 at 02:54:48PM +0100, Paolo Bonzini wrote:
> On 18/01/2018 14:36, Daniel P. Berrange wrote:
> >>> +/*
> >>> + * Figure out how much pending data we should allow in the output
> >>> + * buffer before we throttle incremental display updates, and/or
> >>> + * drop audio samples.
> >>> + *
> >>> + * We allow for equiv of 1 full display's worth of FB updates,
> >>> + * and 1 second of audio samples. If audio backlog was larger
> >>> + * than that the client would already suffering awful audio
> >>> + * glitches, so dropping samples is no worse really).
> >>> + */
> >>> +static void vnc_update_throttle_offset(VncState *vs)
> >>> +{
> >>> +    size_t offset =
> >>> +        vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel;
> >> because the multiply is done with the "int" type, and then may
> >> be sign-extended when converted to the probably-64-bit unsigned
> >> size_t, resulting in the high bits all being set if the
> >> multiply ended up with a 1 in bit 31.
> > I guess we can usefully change client_width/client_height to be an unsigned
> > int, since there's no valid scenario for them to be negative.
> 
> In addition to that, do we support a >= 2 GiB framebuffer at all? (Even
> with unsigned ints, Coverity would rightly complain about a truncated
> 32-bit multiplication being assigned to a 64-bit value).

client_width/client_height are values that are initialized from the
graphics card frontend config, and thus limited by amount of video
RAM QEMU allows.   bytes_per_pixel is limited to 8/16/32.

So I think we're safe from 2GB overflow in any normal case.

That said, VGA RAM size is configurable, so I'm curious what would happen
if someone configured an insanely large VGA RAM and asked for a big frame
buffer in guest.

VNC is protocol limited to uint16 for width/height size, and so is X11
so I imagine some exploding behavour would follow :-)

Regards,
Daniel
Paolo Bonzini Jan. 18, 2018, 2:46 p.m. UTC | #5
On 18/01/2018 15:12, Daniel P. Berrange wrote:
>> In addition to that, do we support a >= 2 GiB framebuffer at all? (Even
>> with unsigned ints, Coverity would rightly complain about a truncated
>> 32-bit multiplication being assigned to a 64-bit value).
> client_width/client_height are values that are initialized from the
> graphics card frontend config, and thus limited by amount of video
> RAM QEMU allows.   bytes_per_pixel is limited to 8/16/32.
> 
> So I think we're safe from 2GB overflow in any normal case.
> 
> That said, VGA RAM size is configurable, so I'm curious what would happen
> if someone configured an insanely large VGA RAM and asked for a big frame
> buffer in guest.
> 
> VNC is protocol limited to uint16 for width/height size, and so is X11
> so I imagine some exploding behavour would follow :-)

Indeed, and even 2^16 x 2^16 * 32bpp is already 34 bits.  So perhaps we
should limit VNC to 16384 pixels on each axis (maximum frame buffer size
1 GiB).

Paolo
Peter Maydell Jan. 18, 2018, 2:50 p.m. UTC | #6
On 18 January 2018 at 14:46, Paolo Bonzini <pbonzini@redhat.com> wrote:
> On 18/01/2018 15:12, Daniel P. Berrange wrote:
>>> In addition to that, do we support a >= 2 GiB framebuffer at all? (Even
>>> with unsigned ints, Coverity would rightly complain about a truncated
>>> 32-bit multiplication being assigned to a 64-bit value).
>> client_width/client_height are values that are initialized from the
>> graphics card frontend config, and thus limited by amount of video
>> RAM QEMU allows.   bytes_per_pixel is limited to 8/16/32.
>>
>> So I think we're safe from 2GB overflow in any normal case.
>>
>> That said, VGA RAM size is configurable, so I'm curious what would happen
>> if someone configured an insanely large VGA RAM and asked for a big frame
>> buffer in guest.
>>
>> VNC is protocol limited to uint16 for width/height size, and so is X11
>> so I imagine some exploding behavour would follow :-)
>
> Indeed, and even 2^16 x 2^16 * 32bpp is already 34 bits.  So perhaps we
> should limit VNC to 16384 pixels on each axis (maximum frame buffer size
> 1 GiB).

Google says you can already get graphics cards that can do 15360x8640,
which is really quite close to that 16384 limit...

thanks
-- PMM
Paolo Bonzini Jan. 18, 2018, 3:33 p.m. UTC | #7
On 18/01/2018 15:50, Peter Maydell wrote:
> On 18 January 2018 at 14:46, Paolo Bonzini <pbonzini@redhat.com> wrote:
>> On 18/01/2018 15:12, Daniel P. Berrange wrote:
>>>> In addition to that, do we support a >= 2 GiB framebuffer at all? (Even
>>>> with unsigned ints, Coverity would rightly complain about a truncated
>>>> 32-bit multiplication being assigned to a 64-bit value).
>>> client_width/client_height are values that are initialized from the
>>> graphics card frontend config, and thus limited by amount of video
>>> RAM QEMU allows.   bytes_per_pixel is limited to 8/16/32.
>>>
>>> So I think we're safe from 2GB overflow in any normal case.
>>>
>>> That said, VGA RAM size is configurable, so I'm curious what would happen
>>> if someone configured an insanely large VGA RAM and asked for a big frame
>>> buffer in guest.
>>>
>>> VNC is protocol limited to uint16 for width/height size, and so is X11
>>> so I imagine some exploding behavour would follow :-)
>>
>> Indeed, and even 2^16 x 2^16 * 32bpp is already 34 bits.  So perhaps we
>> should limit VNC to 16384 pixels on each axis (maximum frame buffer size
>> 1 GiB).
> 
> Google says you can already get graphics cards that can do 15360x8640,
> which is really quite close to that 16384 limit...

Then we can do 32767 * 16384 * 4, but I'm a bit afraid of off-by-ones.

Paolo
Thomas Huth Jan. 18, 2018, 4:06 p.m. UTC | #8
On 18.01.2018 16:33, Paolo Bonzini wrote:
> On 18/01/2018 15:50, Peter Maydell wrote:
>> On 18 January 2018 at 14:46, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>> On 18/01/2018 15:12, Daniel P. Berrange wrote:
>>>>> In addition to that, do we support a >= 2 GiB framebuffer at all? (Even
>>>>> with unsigned ints, Coverity would rightly complain about a truncated
>>>>> 32-bit multiplication being assigned to a 64-bit value).
>>>> client_width/client_height are values that are initialized from the
>>>> graphics card frontend config, and thus limited by amount of video
>>>> RAM QEMU allows.   bytes_per_pixel is limited to 8/16/32.
>>>>
>>>> So I think we're safe from 2GB overflow in any normal case.
>>>>
>>>> That said, VGA RAM size is configurable, so I'm curious what would happen
>>>> if someone configured an insanely large VGA RAM and asked for a big frame
>>>> buffer in guest.
>>>>
>>>> VNC is protocol limited to uint16 for width/height size, and so is X11
>>>> so I imagine some exploding behavour would follow :-)
>>>
>>> Indeed, and even 2^16 x 2^16 * 32bpp is already 34 bits.  So perhaps we
>>> should limit VNC to 16384 pixels on each axis (maximum frame buffer size
>>> 1 GiB).
>>
>> Google says you can already get graphics cards that can do 15360x8640,
>> which is really quite close to that 16384 limit...
> 
> Then we can do 32767 * 16384 * 4, but I'm a bit afraid of off-by-ones.

Simply limit it to 30000 * 20000 ?

 Thomas
Paolo Bonzini Jan. 18, 2018, 4:13 p.m. UTC | #9
On 18/01/2018 17:06, Thomas Huth wrote:
> On 18.01.2018 16:33, Paolo Bonzini wrote:
>> On 18/01/2018 15:50, Peter Maydell wrote:
>>> On 18 January 2018 at 14:46, Paolo Bonzini <pbonzini@redhat.com> wrote:
>>>> On 18/01/2018 15:12, Daniel P. Berrange wrote:
>>>>>> In addition to that, do we support a >= 2 GiB framebuffer at all? (Even
>>>>>> with unsigned ints, Coverity would rightly complain about a truncated
>>>>>> 32-bit multiplication being assigned to a 64-bit value).
>>>>> client_width/client_height are values that are initialized from the
>>>>> graphics card frontend config, and thus limited by amount of video
>>>>> RAM QEMU allows.   bytes_per_pixel is limited to 8/16/32.
>>>>>
>>>>> So I think we're safe from 2GB overflow in any normal case.
>>>>>
>>>>> That said, VGA RAM size is configurable, so I'm curious what would happen
>>>>> if someone configured an insanely large VGA RAM and asked for a big frame
>>>>> buffer in guest.
>>>>>
>>>>> VNC is protocol limited to uint16 for width/height size, and so is X11
>>>>> so I imagine some exploding behavour would follow :-)
>>>>
>>>> Indeed, and even 2^16 x 2^16 * 32bpp is already 34 bits.  So perhaps we
>>>> should limit VNC to 16384 pixels on each axis (maximum frame buffer size
>>>> 1 GiB).
>>>
>>> Google says you can already get graphics cards that can do 15360x8640,
>>> which is really quite close to that 16384 limit...
>>
>> Then we can do 32767 * 16384 * 4, but I'm a bit afraid of off-by-ones.
> 
> Simply limit it to 30000 * 20000 ?

That's too much (exceeds 2^31-1 at 32bpp), but yeah, 30720*17280 is
twice what Peter found and it's safe.

Paolo
Gerd Hoffmann Jan. 25, 2018, 9:08 a.m. UTC | #10
Hi,

> >>>>> VNC is protocol limited to uint16 for width/height size, and so is X11
> >>>>> so I imagine some exploding behavour would follow :-)
> >>>>
> >>>> Indeed, and even 2^16 x 2^16 * 32bpp is already 34 bits.  So perhaps we
> >>>> should limit VNC to 16384 pixels on each axis (maximum frame buffer size
> >>>> 1 GiB).
> >>>
> >>> Google says you can already get graphics cards that can do 15360x8640,
> >>> which is really quite close to that 16384 limit...
> >>
> >> Then we can do 32767 * 16384 * 4, but I'm a bit afraid of off-by-ones.
> > 
> > Simply limit it to 30000 * 20000 ?
> 
> That's too much (exceeds 2^31-1 at 32bpp), but yeah, 30720*17280 is
> twice what Peter found and it's safe.

We already have:

#define VNC_MAX_WIDTH ROUND_UP(2560, VNC_DIRTY_PIXELS_PER_BIT)
#define VNC_MAX_HEIGHT 2048

If the guest framebuffer is larger you'll get the upper left corner
only I think.  spice can handle larger resolutions.

cheers,
  Gerd
diff mbox

Patch

diff --git a/ui/vnc.h b/ui/vnc.h
index b9d310e640..8fe69595c6 100644
--- a/ui/vnc.h
+++ b/ui/vnc.h
@@ -298,6 +298,12 @@  struct VncState
 
     VncClientInfo *info;
 
+    /* We allow multiple incremental updates or audio capture
+     * samples to be queued in output buffer, provided the
+     * buffer size doesn't exceed this threshold. The value
+     * is calculating dynamically based on framebuffer size
+     * and audio sample settings in vnc_update_throttle_offset() */
+    size_t throttle_output_offset;
     Buffer output;
     Buffer input;
     /* current output mode information */
diff --git a/ui/vnc.c b/ui/vnc.c
index 4ba7fc076a..9e03cc7c01 100644
--- a/ui/vnc.c
+++ b/ui/vnc.c
@@ -60,6 +60,7 @@  static QTAILQ_HEAD(, VncDisplay) vnc_displays =
 
 static int vnc_cursor_define(VncState *vs);
 static void vnc_release_modifiers(VncState *vs);
+static void vnc_update_throttle_offset(VncState *vs);
 
 static void vnc_set_share_mode(VncState *vs, VncShareMode mode)
 {
@@ -766,6 +767,7 @@  static void vnc_dpy_switch(DisplayChangeListener *dcl,
         vnc_set_area_dirty(vs->dirty, vd, 0, 0,
                            vnc_width(vd),
                            vnc_height(vd));
+        vnc_update_throttle_offset(vs);
     }
 }
 
@@ -961,16 +963,67 @@  static int find_and_clear_dirty_height(VncState *vs,
     return h;
 }
 
+/*
+ * Figure out how much pending data we should allow in the output
+ * buffer before we throttle incremental display updates, and/or
+ * drop audio samples.
+ *
+ * We allow for equiv of 1 full display's worth of FB updates,
+ * and 1 second of audio samples. If audio backlog was larger
+ * than that the client would already suffering awful audio
+ * glitches, so dropping samples is no worse really).
+ */
+static void vnc_update_throttle_offset(VncState *vs)
+{
+    size_t offset =
+        vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel;
+
+    if (vs->audio_cap) {
+        int freq = vs->as.freq;
+        /* We don't limit freq when reading settings from client, so
+         * it could be upto MAX_INT in size. 48khz is a sensible
+         * upper bound for trustworthy clients */
+        int bps;
+        if (freq > 48000) {
+            freq = 48000;
+        }
+        switch (vs->as.fmt) {
+        default:
+        case  AUD_FMT_U8:
+        case  AUD_FMT_S8:
+            bps = 1;
+            break;
+        case  AUD_FMT_U16:
+        case  AUD_FMT_S16:
+            bps = 2;
+            break;
+        case  AUD_FMT_U32:
+        case  AUD_FMT_S32:
+            bps = 4;
+            break;
+        }
+        offset += freq * bps * vs->as.nchannels;
+    }
+
+    /* Put a floor of 1MB on offset, so that if we have a large pending
+     * buffer and the display is resized to a small size & back again
+     * we don't suddenly apply a tiny send limit
+     */
+    offset = MAX(offset, 1024 * 1024);
+
+    vs->throttle_output_offset = offset;
+}
+
 static bool vnc_should_update(VncState *vs)
 {
     switch (vs->update) {
     case VNC_STATE_UPDATE_NONE:
         break;
     case VNC_STATE_UPDATE_INCREMENTAL:
-        /* Only allow incremental updates if the output buffer
-         * is empty, or if audio capture is enabled.
+        /* Only allow incremental updates if the pending send queue
+         * is less than the permitted threshold
          */
-        if (!vs->output.offset || vs->audio_cap) {
+        if (vs->output.offset < vs->throttle_output_offset) {
             return true;
         }
         break;
@@ -1084,11 +1137,13 @@  static void audio_capture(void *opaque, void *buf, int size)
     VncState *vs = opaque;
 
     vnc_lock_output(vs);
-    vnc_write_u8(vs, VNC_MSG_SERVER_QEMU);
-    vnc_write_u8(vs, VNC_MSG_SERVER_QEMU_AUDIO);
-    vnc_write_u16(vs, VNC_MSG_SERVER_QEMU_AUDIO_DATA);
-    vnc_write_u32(vs, size);
-    vnc_write(vs, buf, size);
+    if (vs->output.offset < vs->throttle_output_offset) {
+        vnc_write_u8(vs, VNC_MSG_SERVER_QEMU);
+        vnc_write_u8(vs, VNC_MSG_SERVER_QEMU_AUDIO);
+        vnc_write_u16(vs, VNC_MSG_SERVER_QEMU_AUDIO_DATA);
+        vnc_write_u32(vs, size);
+        vnc_write(vs, buf, size);
+    }
     vnc_unlock_output(vs);
     vnc_flush(vs);
 }
@@ -2288,6 +2343,7 @@  static int protocol_client_msg(VncState *vs, uint8_t *data, size_t len)
         break;
     }
 
+    vnc_update_throttle_offset(vs);
     vnc_read_when(vs, protocol_client_msg, 1);
     return 0;
 }