diff mbox series

[v7,1/7] xen/events: fix race with set_global_virq_handler()

Message ID 20250109105935.23585-2-jgross@suse.com (mailing list archive)
State New
Headers show
Series remove libxenctrl usage from xenstored | expand

Commit Message

Juergen Gross Jan. 9, 2025, 10:59 a.m. UTC
There is a possible race scenario between set_global_virq_handler()
and clear_global_virq_handlers() targeting the same domain, which
might result in that domain ending as a zombie domain.

In case set_global_virq_handler() is being called for a domain which
is just dying, it might happen that clear_global_virq_handlers() is
running first, resulting in set_global_virq_handler() taking a new
reference for that domain and entering in the global_virq_handlers[]
array afterwards. The reference will never be dropped, thus the domain
will never be freed completely.

This can be fixed by checking the is_dying state of the domain inside
the region guarded by global_virq_handlers_lock. In case the domain is
dying, handle it as if the domain wouldn't exist, which will be the
case in near future anyway.

Fixes: 87521589aa6a ("xen: allow global VIRQ handlers to be delegated to other domains")
Signed-off-by: Juergen Gross <jgross@suse.com>
---
V6:
- new patch
V7:
- add comment (Jan Beulich)
---
 xen/common/event_channel.c | 25 ++++++++++++++++++++++---
 1 file changed, 22 insertions(+), 3 deletions(-)

Comments

Jan Beulich Jan. 9, 2025, 3:46 p.m. UTC | #1
On 09.01.2025 11:59, Juergen Gross wrote:
> There is a possible race scenario between set_global_virq_handler()
> and clear_global_virq_handlers() targeting the same domain, which
> might result in that domain ending as a zombie domain.
> 
> In case set_global_virq_handler() is being called for a domain which
> is just dying, it might happen that clear_global_virq_handlers() is
> running first, resulting in set_global_virq_handler() taking a new
> reference for that domain and entering in the global_virq_handlers[]
> array afterwards. The reference will never be dropped, thus the domain
> will never be freed completely.
> 
> This can be fixed by checking the is_dying state of the domain inside
> the region guarded by global_virq_handlers_lock. In case the domain is
> dying, handle it as if the domain wouldn't exist, which will be the
> case in near future anyway.
> 
> Fixes: 87521589aa6a ("xen: allow global VIRQ handlers to be delegated to other domains")
> Signed-off-by: Juergen Gross <jgross@suse.com>

Reviewed-by: Jan Beulich <jbeulich@suse.com>
diff mbox series

Patch

diff --git a/xen/common/event_channel.c b/xen/common/event_channel.c
index 8db2ca4ba2..46281b16ce 100644
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -979,6 +979,7 @@  void send_global_virq(uint32_t virq)
 int set_global_virq_handler(struct domain *d, uint32_t virq)
 {
     struct domain *old;
+    int rc = 0;
 
     if (virq >= NR_VIRQS)
         return -EINVAL;
@@ -992,14 +993,32 @@  int set_global_virq_handler(struct domain *d, uint32_t virq)
         return -EINVAL;
 
     spin_lock(&global_virq_handlers_lock);
-    old = global_virq_handlers[virq];
-    global_virq_handlers[virq] = d;
+
+    /*
+     * Note that this check won't guarantee that a domain just going down can't
+     * be set as the handling domain of a virq, as the is_dying indicator might
+     * change just after testing it.
+     * This isn't going to be a major problem, as clear_global_virq_handlers()
+     * is guaranteed to run afterwards and it will reset the handling domain
+     * for the virq to the hardware domain.
+     */
+    if ( d->is_dying != DOMDYING_alive )
+    {
+        old = d;
+        rc = -EINVAL;
+    }
+    else
+    {
+        old = global_virq_handlers[virq];
+        global_virq_handlers[virq] = d;
+    }
+
     spin_unlock(&global_virq_handlers_lock);
 
     if (old != NULL)
         put_domain(old);
 
-    return 0;
+    return rc;
 }
 
 static void clear_global_virq_handlers(struct domain *d)