[v3,1/5] evtchn: drop acquiring of per-channel lock from send_guest_{global,vcpu}_virq()

Message ID d709a9c3-dbe2-65c6-2c2f-6a12f486335d@suse.com (mailing list archive)
State Superseded
Series evtchn: (not so) recent XSAs follow-on

Commit Message

Jan Beulich Nov. 23, 2020, 1:28 p.m. UTC
The per-vCPU virq_lock, which is being held anyway, together with there
not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
is zero, provide sufficient guarantees. Undo the lock addition done for
XSA-343 (commit e045199c7c9c "evtchn: address races with
evtchn_reset()"). Update the description next to struct evtchn_port_ops
accordingly.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Re-base.
v2: New.
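
For reference, the ordering relied upon, in condensed (not verbatim) form:

    /* Sender side (post-patch): the channel is only dereferenced while
     * holding the per-vCPU virq_lock, and only for a non-zero port. */
    spin_lock_irqsave(&v->virq_lock, flags);
    port = v->virq_to_evtchn[virq];
    if ( port )                    /* zero => no event channel bound */
    {
        struct domain *d = v->domain;

        evtchn_port_set_pending(d, v->vcpu_id, evtchn_from_port(d, port));
    }
    spin_unlock_irqrestore(&v->virq_lock, flags);

    /* Teardown side (condensed): clear the binding first, then wait for
     * any sender still inside the locked region above, so the channel
     * cannot change state underneath evtchn_port_set_pending(). */
    v->virq_to_evtchn[virq] = 0;
    spin_barrier(&v->virq_lock);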

Comments

Julien Grall Dec. 2, 2020, 7:03 p.m. UTC | #1
Hi Jan,

On 23/11/2020 13:28, Jan Beulich wrote:
> The per-vCPU virq_lock, which is being held anyway, together with there
> not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
> is zero, provide sufficient guarantees. 

I agree that the per-vCPU virq_lock is going to be sufficient; however,
dropping the lock also means the event channel locking is more complex
to understand (the long comment that was added proves it).

In fact, the locking in the event channel code has already proven to be
quite fragile, so I don't think this patch is worth the risk.

Cheers,
Jan Beulich Dec. 3, 2020, 9:46 a.m. UTC | #2
On 02.12.2020 20:03, Julien Grall wrote:
> On 23/11/2020 13:28, Jan Beulich wrote:
>> The per-vCPU virq_lock, which is being held anyway, together with there
>> not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
>> is zero, provide sufficient guarantees. 
> 
> I agree that the per-vCPU virq_lock is going to be sufficient; however,
> dropping the lock also means the event channel locking is more complex
> to understand (the long comment that was added proves it).
> 
> In fact, the locking in the event channel code has already proven to be
> quite fragile, so I don't think this patch is worth the risk.

I agree this is a very reasonable position to take. I probably
would even have remained silent if in the meantime the
spin_lock()s there hadn't changed to read_trylock()s. I really
think we want to limit this unusual locking model to where we
strictly need it. And this change eliminates 50% of them, with
the added benefit of making the paths more lightweight.
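
(For reference, the pattern in question, as it stood in these two send
paths before this patch and as it remains elsewhere, is roughly:

    chn = evtchn_from_port(d, port);
    if ( evtchn_read_trylock(chn) )
    {
        evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
        evtchn_read_unlock(chn);
    }

i.e. acquiring the lock is allowed to fail, in which case the event is
simply not sent.)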

Jan
Julien Grall Dec. 9, 2020, 9:53 a.m. UTC | #3
Hi Jan,

On 03/12/2020 09:46, Jan Beulich wrote:
> On 02.12.2020 20:03, Julien Grall wrote:
>> On 23/11/2020 13:28, Jan Beulich wrote:
>>> The per-vCPU virq_lock, which is being held anyway, together with there
>>> not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
>>> is zero, provide sufficient guarantees.
>>
>> I agree that the per-vCPU virq_lock is going to be sufficient; however,
>> dropping the lock also means the event channel locking is more complex
>> to understand (the long comment that was added proves it).
>>
>> In fact, the locking in the event channel code has already proven to be
>> quite fragile, so I don't think this patch is worth the risk.
> 
> I agree this is a very reasonable position to take. I probably
> would even have remained silent if in the meantime the
> spin_lock()s there hadn't changed to read_trylock()s. I really
> think we want to limit this unusual locking model to where we
> strictly need it.

While I appreciate that the current locking is unusual, we should follow
the same model everywhere rather than having a dozen ways to lock the
same structure.

The rationale is quite simple: if you have one way to lock a structure,
there is less chance of screwing up.

The only reason I would be willing to diverge from that statement is if
the performance were significantly improved.

Cheers,
Jan Beulich Dec. 9, 2020, 2:24 p.m. UTC | #4
On 09.12.2020 10:53, Julien Grall wrote:
> On 03/12/2020 09:46, Jan Beulich wrote:
>> On 02.12.2020 20:03, Julien Grall wrote:
>>> On 23/11/2020 13:28, Jan Beulich wrote:
>>>> The per-vCPU virq_lock, which is being held anyway, together with there
>>>> not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
>>>> is zero, provide sufficient guarantees.
>>>
>>> I agree that the per-vCPU virq_lock is going to be sufficient, however
>>> dropping the lock also means the event channel locking is more complex
>>> to understand (the long comment that was added proves it).
>>>
>>> In fact, the locking in the event channel code was already proven to be
>>> quite fragile, therefore I think this patch is not worth the risk.
>>
>> I agree this is a very reasonable position to take. I probably
>> would even have remained silent if in the meantime the
>> spin_lock()s there hadn't changed to read_trylock()s. I really
>> think we want to limit this unusual locking model to where we
>> strictly need it.
> 
> While I appreciate that the current locking is unusual, we should follow
> the same model everywhere rather than having a dozen ways to lock the
> same structure.
> 
> The rationale is quite simple: if you have one way to lock a structure,
> there is less chance of screwing up.

If only all of this had been consistent prior to this change. It's not,
and hence I don't see how things get so much more unusual than they were
before. In fact, one "unusual" (the trylock) gets traded for another one
(the specific lock protecting the sending of VIRQ events).

Jan

Patch

--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -809,7 +809,6 @@  void send_guest_vcpu_virq(struct vcpu *v
     unsigned long flags;
     int port;
     struct domain *d;
-    struct evtchn *chn;
 
     ASSERT(!virq_is_global(virq));
 
@@ -820,12 +819,7 @@  void send_guest_vcpu_virq(struct vcpu *v
         goto out;
 
     d = v->domain;
-    chn = evtchn_from_port(d, port);
-    if ( evtchn_read_trylock(chn) )
-    {
-        evtchn_port_set_pending(d, v->vcpu_id, chn);
-        evtchn_read_unlock(chn);
-    }
+    evtchn_port_set_pending(d, v->vcpu_id, evtchn_from_port(d, port));
 
  out:
     spin_unlock_irqrestore(&v->virq_lock, flags);
@@ -854,11 +848,7 @@  void send_guest_global_virq(struct domai
         goto out;
 
     chn = evtchn_from_port(d, port);
-    if ( evtchn_read_trylock(chn) )
-    {
-        evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
-        evtchn_read_unlock(chn);
-    }
+    evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
 
  out:
     spin_unlock_irqrestore(&v->virq_lock, flags);
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -192,9 +192,16 @@  int evtchn_reset(struct domain *d, bool
  * Low-level event channel port ops.
  *
  * All hooks have to be called with a lock held which prevents the channel
- * from changing state. This may be the domain event lock, the per-channel
- * lock, or in the case of sending interdomain events also the other side's
- * per-channel lock. Exceptions apply in certain cases for the PV shim.
+ * from changing state. This may be
+ * - the domain event lock,
+ * - the per-channel lock,
+ * - in the case of sending interdomain events the other side's per-channel
+ *   lock,
+ * - in the case of sending non-global vIRQ-s the per-vCPU virq_lock (in
+ *   combination with the ordering enforced through how the vCPU's
+ *   virq_to_evtchn[] gets updated),
+ * - in the case of sending global vIRQ-s vCPU 0's virq_lock.
+ * Exceptions apply in certain cases for the PV shim.
  */
 struct evtchn_port_ops {
     void (*init)(struct domain *d, struct evtchn *evtchn);