
[03/12] evtchn: don't call Xen consumer callback with per-channel lock held

Message ID 1bf3959d-c097-f8ef-cce4-3a325d0984c4@suse.com (mailing list archive)
State Superseded
Series evtchn: recent XSAs follow-on

Commit Message

Jan Beulich Sept. 28, 2020, 10:57 a.m. UTC
While there don't look to be any problems with this right now, the lock
order implications from holding the lock can be very difficult to follow
(and may be easy to violate unknowingly). The present callbacks don't
(and no such callback should) have any need for the lock to be held.

Signed-off-by: Jan Beulich <jbeulich@suse.com>

Comments

Julien Grall Sept. 29, 2020, 10:16 a.m. UTC | #1
Hi Jan,

On 28/09/2020 11:57, Jan Beulich wrote:
> While there don't look to be any problems with this right now, the lock
> order implications from holding the lock can be very difficult to follow
> (and may be easy to violate unknowingly).

I think this is a good idea given that we are disabling interrupts now. 
Unfortunately...

> The present callbacks don't
> (and no such callback should) have any need for the lock to be held.

... I think the lock is necessary for the vm_event subsystem to avoid 
racing with the vm_event_disable().

The notification callback will use a data structure that is freed by 
vm_event_disable(). There is a lock, but it is part of the data structure...

One solution would be to have the lock outside of the data structure.

> 
> Signed-off-by: Jan Beulich <jbeulich@suse.com>
> 
> --- a/xen/common/event_channel.c
> +++ b/xen/common/event_channel.c
> @@ -746,9 +746,18 @@ int evtchn_send(struct domain *ld, unsig
>           rport = lchn->u.interdomain.remote_port;
>           rchn  = evtchn_from_port(rd, rport);
>           if ( consumer_is_xen(rchn) )
> -            xen_notification_fn(rchn)(rd->vcpu[rchn->notify_vcpu_id], rport);
> -        else
> -            evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn);
> +        {
> +            /* Don't keep holding the lock for the call below. */
> +            xen_event_channel_notification_t fn = xen_notification_fn(rchn);
> +            struct vcpu *rv = rd->vcpu[rchn->notify_vcpu_id];
> +
> +            rcu_lock_domain(rd);
> +            spin_unlock_irqrestore(&lchn->lock, flags);
> +            fn(rv, rport);
> +            rcu_unlock_domain(rd);
> +            return 0;
> +        }
> +        evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn);
>           break;
>       case ECS_IPI:
>           evtchn_port_set_pending(ld, lchn->notify_vcpu_id, lchn);
>

Cheers,
Jan Beulich Sept. 29, 2020, 10:54 a.m. UTC | #2
On 29.09.2020 12:16, Julien Grall wrote:
> On 28/09/2020 11:57, Jan Beulich wrote:
>> While there don't look to be any problems with this right now, the lock
>> order implications from holding the lock can be very difficult to follow
>> (and may be easy to violate unknowingly).
> 
> I think this is a good idea given that we are disabling interrupts now. 
> Unfortunately...
> 
>> The present callbacks don't
>> (and no such callback should) have any need for the lock to be held.
> 
> ... I think the lock is necessary for the vm_event subsystem to avoid 
> racing with the vm_event_disable().
> 
> The notification callback will use a data structure that is freed by 
> vm_event_disable(). There is a lock, but it is part of the data structure...

Oh, indeed - somehow I didn't spot this despite looking there.

> One solution would be to have the lock outside of the data structure.

I don't think that's viable - the structures are intentionally
separated from struct vcpu. I see two other options: Either free
the structure via call_rcu(), or maintain a count of in-progress
calls, and wait for it to drop to zero when closing the port.

VM event maintainers / reviewers - what are your thoughts here?

Jan

Patch

--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -746,9 +746,18 @@ int evtchn_send(struct domain *ld, unsig
         rport = lchn->u.interdomain.remote_port;
         rchn  = evtchn_from_port(rd, rport);
         if ( consumer_is_xen(rchn) )
-            xen_notification_fn(rchn)(rd->vcpu[rchn->notify_vcpu_id], rport);
-        else
-            evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn);
+        {
+            /* Don't keep holding the lock for the call below. */
+            xen_event_channel_notification_t fn = xen_notification_fn(rchn);
+            struct vcpu *rv = rd->vcpu[rchn->notify_vcpu_id];
+
+            rcu_lock_domain(rd);
+            spin_unlock_irqrestore(&lchn->lock, flags);
+            fn(rv, rport);
+            rcu_unlock_domain(rd);
+            return 0;
+        }
+        evtchn_port_set_pending(rd, rchn->notify_vcpu_id, rchn);
         break;
     case ECS_IPI:
         evtchn_port_set_pending(ld, lchn->notify_vcpu_id, lchn);