diff mbox series

[1/3,for,9.0] chardev: lower priority of the HUP GSource in socket chardev

Message ID 20240318182330.96738-2-berrange@redhat.com (mailing list archive)
State New, archived
Headers show
Series Fix TLS support for chardevs and incoming data loss on EOF | expand

Commit Message

Daniel P. Berrangé March 18, 2024, 6:23 p.m. UTC
The socket chardev often has 2 GSource object registered against the
same FD. One is registered all the time and is just intended to handle
POLLHUP events, while the other gets registered & unregistered on the
fly as the frontend is ready to receive more data or not.

It is very common for poll() to signal a POLLHUP event at the same time
as there is pending incoming data from the disconnected client. It is
therefore essential to process incoming data prior to processing HUP.
The problem with having 2 GSource on the same FD is that there is no
guaranteed ordering of execution between them, so the chardev code may
process HUP first and thus discard data.

This failure scenario is non-deterministic but can be seen fairly
reliably by reverting a7077b8e354d90fec26c2921aa2dea85b90dff90, and
then running 'tests/unit/test-char', which will sometimes fail with
missing data.

Ideally QEMU would only have 1 GSource, but that's a complex code
refactoring job. The next best solution is to try to ensure ordering
between the 2 GSource objects. This can be achieved by lowering the
priority of the HUP GSource, so that it is never dispatched if the
main GSource is also ready to dispatch. Counter-intuitively, lowering
the priority of a GSource is done by raising its priority number.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
---
 chardev/char-socket.c | 16 ++++++++++++++++
 1 file changed, 16 insertions(+)

Comments

Marc-André Lureau March 18, 2024, 7:17 p.m. UTC | #1
On Mon, Mar 18, 2024 at 10:25 PM Daniel P. Berrangé <berrange@redhat.com> wrote:
>
> The socket chardev often has 2 GSource object registered against the
> same FD. One is registered all the time and is just intended to handle
> POLLHUP events, while the other gets registered & unregistered on the
> fly as the frontend is ready to receive more data or not.
>
> It is very common for poll() to signal a POLLHUP event at the same time
> as there is pending incoming data from the disconnected client. It is
> therefore essential to process incoming data prior to processing HUP.
> The problem with having 2 GSource on the same FD is that there is no
> guaranteed ordering of execution between them, so the chardev code may
> process HUP first and thus discard data.
>
> This failure scenario is non-deterministic but can be seen fairly
> reliably by reverting a7077b8e354d90fec26c2921aa2dea85b90dff90, and
> then running 'tests/unit/test-char', which will sometimes fail with
> missing data.
>
> Ideally QEMU would only have 1 GSource, but that's a complex code
> refactoring job. The next best solution is to try to ensure ordering
> between the 2 GSource objects. This can be achieved by lowering the
> priority of the HUP GSource, so that it is never dispatched if the
> main GSource is also ready to dispatch. Counter-intuitively, lowering
> the priority of a GSource is done by raising its priority number.
>
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>

> ---
>  chardev/char-socket.c | 16 ++++++++++++++++
>  1 file changed, 16 insertions(+)
>
> diff --git a/chardev/char-socket.c b/chardev/char-socket.c
> index 8a0406cc1e..2c4dffc0e6 100644
> --- a/chardev/char-socket.c
> +++ b/chardev/char-socket.c
> @@ -601,6 +601,22 @@ static void update_ioc_handlers(SocketChardev *s)
>
>      remove_hup_source(s);
>      s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
> +    /*
> +     * poll() is liable to return POLLHUP even when there is
> +     * still incoming data available to read on the FD. If
> +     * we have the hup_source at the same priority as the
> +     * main io_add_watch_poll GSource, then we might end up
> +     * processing the POLLHUP event first, closing the FD,
> +     * and as a result silently discard data we should have
> +     * read.
> +     *
> +     * By setting the hup_source to G_PRIORITY_DEFAULT + 1,
> +     * we ensure that io_add_watch_poll GSource will always
> +     * be dispatched first, thus guaranteeing we will be
> +     * able to process all incoming data before closing the
> +     * FD
> +     */
> +    g_source_set_priority(s->hup_source, G_PRIORITY_DEFAULT + 1);
>      g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
>                            chr, NULL);
>      g_source_attach(s->hup_source, chr->gcontext);
> --
> 2.43.0
>
>
Thomas Huth March 19, 2024, 8:10 a.m. UTC | #2
On 18/03/2024 19.23, Daniel P. Berrangé wrote:
> The socket chardev often has 2 GSource object registered against the
> same FD. One is registered all the time and is just intended to handle
> POLLHUP events, while the other gets registered & unregistered on the
> fly as the frontend is ready to receive more data or not.
> 
> It is very common for poll() to signal a POLLHUP event at the same time
> as there is pending incoming data from the disconnected client. It is
> therefore essential to process incoming data prior to processing HUP.
> The problem with having 2 GSource on the same FD is that there is no
> guaranteed ordering of execution between them, so the chardev code may
> process HUP first and thus discard data.
> 
> This failure scenario is non-deterministic but can be seen fairly
> reliably by reverting a7077b8e354d90fec26c2921aa2dea85b90dff90, and
> then running 'tests/unit/test-char', which will sometimes fail with
> missing data.
> 
> Ideally QEMU would only have 1 GSource, but that's a complex code
> refactoring job. The next best solution is to try to ensure ordering
> between the 2 GSource objects. This can be achieved by lowering the
> priority of the HUP GSource, so that it is never dispatched if the
> main GSource is also ready to dispatch. Counter-intuitively, lowering
> the priority of a GSource is done by raising its priority number.
> 
> Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
> ---
>   chardev/char-socket.c | 16 ++++++++++++++++
>   1 file changed, 16 insertions(+)

Reviewed-by: Thomas Huth <thuth@redhat.com>
diff mbox series

Patch

diff --git a/chardev/char-socket.c b/chardev/char-socket.c
index 8a0406cc1e..2c4dffc0e6 100644
--- a/chardev/char-socket.c
+++ b/chardev/char-socket.c
@@ -601,6 +601,22 @@  static void update_ioc_handlers(SocketChardev *s)
 
     remove_hup_source(s);
     s->hup_source = qio_channel_create_watch(s->ioc, G_IO_HUP);
+    /*
+     * poll() is liable to return POLLHUP even when there is
+     * still incoming data available to read on the FD. If
+     * we have the hup_source at the same priority as the
+     * main io_add_watch_poll GSource, then we might end up
+     * processing the POLLHUP event first, closing the FD,
+     * and as a result silently discard data we should have
+     * read.
+     *
+     * By setting the hup_source to G_PRIORITY_DEFAULT + 1,
+     * we ensure that io_add_watch_poll GSource will always
+     * be dispatched first, thus guaranteeing we will be
+     * able to process all incoming data before closing the
+     * FD
+     */
+    g_source_set_priority(s->hup_source, G_PRIORITY_DEFAULT + 1);
     g_source_set_callback(s->hup_source, (GSourceFunc)tcp_chr_hup,
                           chr, NULL);
     g_source_attach(s->hup_source, chr->gcontext);