diff mbox series

[PULL,11/13] ui/console: Remove dpy_cursor_define_supported()

Message ID 20240716180941.40211-12-philmd@linaro.org (mailing list archive)
State New, archived
Headers show
Series [PULL,01/13] hw/core/loader: allow loading larger ROMs | expand

Commit Message

Philippe Mathieu-Daudé July 16, 2024, 6:09 p.m. UTC
From: Akihiko Odaki <akihiko.odaki@daynix.com>

Remove dpy_cursor_define_supported() as it brings no benefit today and
it has a few inherent problems.

All graphical displays except egl-headless support cursor composition
without DMA-BUF, and egl-headless is meant to be used in conjunction
with another graphical display, so dpy_cursor_define_supported()
always returns true and meaningless.

Even if we add a new display without cursor composition in the future,
dpy_cursor_define_supported() will be problematic as a cursor display
fix for it because some display devices like virtio-gpu cannot tell the
lack of cursor composition capability to the guest and are unable to
utilize the value the function returns. Therefore, all non-headless
graphical displays must actually implement cursor composition for
correct cursor display.

Another problem with dpy_cursor_define_supported() is that it returns
true even if only some of the display listeners support cursor
composition, which is wrong unless all display listeners that lack
cursor composition is headless.

Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Phil Dennis-Jordan <phil@philjordan.eu>
Signed-off-by: Akihiko Odaki <akihiko.odaki@daynix.com>
Message-ID: <20240715-cursor-v3-4-afa5b9492dbf@daynix.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@linaro.org>
---
 include/ui/console.h    |  1 -
 hw/display/qxl-render.c |  4 ----
 hw/display/vmware_vga.c |  6 ++----
 ui/console.c            | 13 -------------
 4 files changed, 2 insertions(+), 22 deletions(-)

Comments

Michael Tokarev Oct. 29, 2024, 12:30 p.m. UTC | #1
16.07.2024 21:09, Philippe Mathieu-Daudé wrote:
> From: Akihiko Odaki <akihiko.odaki@daynix.com>
> 
> Remove dpy_cursor_define_supported() as it brings no benefit today and
> it has a few inherent problems.
> 
> All graphical displays except egl-headless support cursor composition
> without DMA-BUF, and egl-headless is meant to be used in conjunction
> with another graphical display, so dpy_cursor_define_supported()
> always returns true and meaningless.
> 
> Even if we add a new display without cursor composition in the future,
> dpy_cursor_define_supported() will be problematic as a cursor display
> fix for it because some display devices like virtio-gpu cannot tell the
> lack of cursor composition capability to the guest and are unable to
> utilize the value the function returns. Therefore, all non-headless
> graphical displays must actually implement cursor composition for
> correct cursor display.
> 
> Another problem with dpy_cursor_define_supported() is that it returns
> true even if only some of the display listeners support cursor
> composition, which is wrong unless all display listeners that lack
> cursor composition is headless.

Apparently this commit made windows10 guest to freeze.  There's a rather
hairy bugreport at https://bugs.debian.org/1084199 .  Also there's an
interesting bug filed against qemu, https://gitlab.com/qemu-project/qemu/-/issues/1628 ,
which seems to be relevant.

Can you take a look please?

This user did a great job bisecting qemu with no experience whatsoever..

Thanks,

/mjt
Phil Dennis-Jordan Oct. 29, 2024, 2:04 p.m. UTC | #2
On Tue, 29 Oct 2024 at 13:30, Michael Tokarev <mjt@tls.msk.ru> wrote:

> 16.07.2024 21:09, Philippe Mathieu-Daudé wrote:
> > From: Akihiko Odaki <akihiko.odaki@daynix.com>
> >
> > Remove dpy_cursor_define_supported() as it brings no benefit today and
> > it has a few inherent problems.
> >
> > All graphical displays except egl-headless support cursor composition
> > without DMA-BUF, and egl-headless is meant to be used in conjunction
> > with another graphical display, so dpy_cursor_define_supported()
> > always returns true and meaningless.
> >
> > Even if we add a new display without cursor composition in the future,
> > dpy_cursor_define_supported() will be problematic as a cursor display
> > fix for it because some display devices like virtio-gpu cannot tell the
> > lack of cursor composition capability to the guest and are unable to
> > utilize the value the function returns. Therefore, all non-headless
> > graphical displays must actually implement cursor composition for
> > correct cursor display.
> >
> > Another problem with dpy_cursor_define_supported() is that it returns
> > true even if only some of the display listeners support cursor
> > composition, which is wrong unless all display listeners that lack
> > cursor composition is headless.
>
> Apparently this commit made windows10 guest to freeze.  There's a rather
> hairy bugreport at https://bugs.debian.org/1084199 .  Also there's an
> interesting bug filed against qemu,
> https://gitlab.com/qemu-project/qemu/-/issues/1628 ,
> which seems to be relevant.
>
> Can you take a look please?
>
> This user did a great job bisecting qemu with no experience whatsoever..
>
>
It seems an odd commit to be causing this, but you never know… Some random
thoughts on this and a few questions:



As far as I can tell, in both cases the guest display adapter is QXL. In
the case of the Debian user, I think the UI backend is SPICE? Presumably
the it's the same for the user reporting the bug directly against Qemu, as
they also seem to be using libvirt. (No full command line available as far
as I can see?) In any case, the SPICE display specifies:

    .dpy_cursor_define       = display_mouse_define,

Which means the old dpy_cursor_define_supported() would have been returning
true prior to this commit, with the modified QXL and VMSvga code paths
theoretically unaffected. A bit odd, to say the least.

But I suppose it's possible that my understanding is incorrect, and
dpy_cursor_define_supported() might have returned false. The code in
qxl_render_cursor that follows this check and which will now always run, vs
would not have done previously includes:

    if (qxl->debug > 1 && cmd->type != QXL_CURSOR_MOVE) {
        fprintf(stderr, "%s", __func__);
        qxl_log_cmd_cursor(qxl, cmd, ext->group_id);
        fprintf(stderr, "\n");
    }

Can we get the user to set qxl->debug to a value > 1 and see if the freeze
coincides with logging from here? (Possibly tricky to intercept the fprintf
output from Qemu run via libvirt though.)

I also notice that the subsequent code in the QXL_CURSOR_SET case looks a
lot like the code being replaced by Michael's patch from the Qemu bug
tracker. (qxl_phys2virt calls) It looks OK to me, but I'm not familiar with
QXL at all, so perhaps a chunk of code that could use another review?


Or could it be a timing-dependent bug? Although the removed check should
have been rather cheap and shouldn't affect timing much:
- if (!dpy_cursor_define_supported(qxl->vga.con)) {
-        return 0;
-    }



Given that "The time before the freeze seems to be random, from a few
seconds to a couple of minutes." there is a possibility of a false
negative during the bisect. (i.e. commit marked GOOD that should be BAD
because it happened to not hit the freeze in the usual time) If that was
the case, we would expect the regression to be caused by a nearby *earlier*
commit. The only candidate I can spot that touches SPICE or QXL or the
display subsystem in general is a418e7a (ui/console: Convert mouse
visibility parameter into bool) from the same patchset which looks even
more benign, however.



We could ask the user to check whether there's any connection with mouse
cursor changes, e.g. whether he can more easily provoke the issue by
perform actions that rapidly change the mouse cursor. (For example by
visiting https://developer.mozilla.org/en-US/docs/Web/CSS/cursor in the
guest and moving back and forth over the test area.)


Is there an easy way to take a sampling profile on Linux that will show us
stack traces of all the threads in the frozen Qemu process? On macOS this
is easy with the Activity Monitor GUI or iprofiler on the command line.
That ought to confirm whether it's a deadlock or indefinite wait in some
Qemu subsystem.


Michael, what's the situation with the patch you suggested in your comment
on the Qemu bug:
https://gitlab.com/qemu-project/qemu/-/issues/1628#note_2144606625 ? Is
there any chance we can get the Debian user to try that?



Hope that helps!
Phil
diff mbox series

Patch

diff --git a/include/ui/console.h b/include/ui/console.h
index 82b573e680..fa986ab97e 100644
--- a/include/ui/console.h
+++ b/include/ui/console.h
@@ -324,7 +324,6 @@  void dpy_text_update(QemuConsole *con, int x, int y, int w, int h);
 void dpy_text_resize(QemuConsole *con, int w, int h);
 void dpy_mouse_set(QemuConsole *con, int x, int y, bool on);
 void dpy_cursor_define(QemuConsole *con, QEMUCursor *cursor);
-bool dpy_cursor_define_supported(QemuConsole *con);
 bool dpy_gfx_check_format(QemuConsole *con,
                           pixman_format_code_t format);
 
diff --git a/hw/display/qxl-render.c b/hw/display/qxl-render.c
index 8daae72c8d..335d01edde 100644
--- a/hw/display/qxl-render.c
+++ b/hw/display/qxl-render.c
@@ -307,10 +307,6 @@  int qxl_render_cursor(PCIQXLDevice *qxl, QXLCommandExt *ext)
         return 1;
     }
 
-    if (!dpy_cursor_define_supported(qxl->vga.con)) {
-        return 0;
-    }
-
     if (qxl->debug > 1 && cmd->type != QXL_CURSOR_MOVE) {
         fprintf(stderr, "%s", __func__);
         qxl_log_cmd_cursor(qxl, cmd, ext->group_id);
diff --git a/hw/display/vmware_vga.c b/hw/display/vmware_vga.c
index 512f224b9f..3db3ff98f7 100644
--- a/hw/display/vmware_vga.c
+++ b/hw/display/vmware_vga.c
@@ -904,10 +904,8 @@  static uint32_t vmsvga_value_read(void *opaque, uint32_t address)
         caps |= SVGA_CAP_RECT_FILL;
 #endif
 #ifdef HW_MOUSE_ACCEL
-        if (dpy_cursor_define_supported(s->vga.con)) {
-            caps |= SVGA_CAP_CURSOR | SVGA_CAP_CURSOR_BYPASS_2 |
-                    SVGA_CAP_CURSOR_BYPASS;
-        }
+        caps |= SVGA_CAP_CURSOR | SVGA_CAP_CURSOR_BYPASS_2 |
+                SVGA_CAP_CURSOR_BYPASS;
 #endif
         ret = caps;
         break;
diff --git a/ui/console.c b/ui/console.c
index 8e9cc1628a..e8f0083af7 100644
--- a/ui/console.c
+++ b/ui/console.c
@@ -1001,19 +1001,6 @@  void dpy_cursor_define(QemuConsole *c, QEMUCursor *cursor)
     }
 }
 
-bool dpy_cursor_define_supported(QemuConsole *con)
-{
-    DisplayState *s = con->ds;
-    DisplayChangeListener *dcl;
-
-    QLIST_FOREACH(dcl, &s->listeners, next) {
-        if (dcl->ops->dpy_cursor_define) {
-            return true;
-        }
-    }
-    return false;
-}
-
 QEMUGLContext dpy_gl_ctx_create(QemuConsole *con,
                                 struct QEMUGLParams *qparams)
 {