Message ID | d709a9c3-dbe2-65c6-2c2f-6a12f486335d@suse.com (mailing list archive) |
---|---|
State | Superseded |
Series | evtchn: (not so) recent XSAs follow-on |
Hi Jan,

On 23/11/2020 13:28, Jan Beulich wrote:
> The per-vCPU virq_lock, which is being held anyway, together with there
> not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
> is zero, provide sufficient guarantees.

I agree that the per-vCPU virq_lock is going to be sufficient; however,
dropping the lock also means the event channel locking is more complex
to understand (the long comment that was added proves it).

In fact, the locking in the event channel code was already proven to be
quite fragile; therefore I think this patch is not worth the risk.

Cheers,
On 02.12.2020 20:03, Julien Grall wrote:
> On 23/11/2020 13:28, Jan Beulich wrote:
>> The per-vCPU virq_lock, which is being held anyway, together with there
>> not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
>> is zero, provide sufficient guarantees.
>
> I agree that the per-vCPU virq_lock is going to be sufficient; however,
> dropping the lock also means the event channel locking is more complex
> to understand (the long comment that was added proves it).
>
> In fact, the locking in the event channel code was already proven to be
> quite fragile; therefore I think this patch is not worth the risk.

I agree this is a very reasonable position to take. I probably would
even have remained silent if in the meantime the spin_lock()s there
hadn't changed to read_trylock()s. I really think we want to limit this
unusual locking model to where we strictly need it. And this change
eliminates 50% of them, with the added benefit of making the paths more
lightweight.

Jan
Hi Jan,

On 03/12/2020 09:46, Jan Beulich wrote:
> On 02.12.2020 20:03, Julien Grall wrote:
>> On 23/11/2020 13:28, Jan Beulich wrote:
>>> The per-vCPU virq_lock, which is being held anyway, together with there
>>> not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
>>> is zero, provide sufficient guarantees.
>>
>> I agree that the per-vCPU virq_lock is going to be sufficient; however,
>> dropping the lock also means the event channel locking is more complex
>> to understand (the long comment that was added proves it).
>>
>> In fact, the locking in the event channel code was already proven to be
>> quite fragile; therefore I think this patch is not worth the risk.
>
> I agree this is a very reasonable position to take. I probably would
> even have remained silent if in the meantime the spin_lock()s there
> hadn't changed to read_trylock()s. I really think we want to limit this
> unusual locking model to where we strictly need it.

While I appreciate that the current locking is unusual, we should follow
the same model everywhere rather than having a dozen ways to lock the
same structure.

The rationale is quite simple: if you have one way to lock a structure,
then there is less chance of screwing up.

The only reason I would be willing to diverge from this statement is if
performance is significantly improved.

Cheers,
On 09.12.2020 10:53, Julien Grall wrote:
> On 03/12/2020 09:46, Jan Beulich wrote:
>> On 02.12.2020 20:03, Julien Grall wrote:
>>> On 23/11/2020 13:28, Jan Beulich wrote:
>>>> The per-vCPU virq_lock, which is being held anyway, together with there
>>>> not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
>>>> is zero, provide sufficient guarantees.
>>>
>>> I agree that the per-vCPU virq_lock is going to be sufficient; however,
>>> dropping the lock also means the event channel locking is more complex
>>> to understand (the long comment that was added proves it).
>>>
>>> In fact, the locking in the event channel code was already proven to be
>>> quite fragile; therefore I think this patch is not worth the risk.
>>
>> I agree this is a very reasonable position to take. I probably would
>> even have remained silent if in the meantime the spin_lock()s there
>> hadn't changed to read_trylock()s. I really think we want to limit this
>> unusual locking model to where we strictly need it.
>
> While I appreciate that the current locking is unusual, we should follow
> the same model everywhere rather than having a dozen ways to lock the
> same structure.
>
> The rationale is quite simple: if you have one way to lock a structure,
> then there is less chance of screwing up.

If only this all were consistent prior to this change. It's not, and
hence I don't see how things get so much more unusual than they were
before. In fact, one "unusual" (the trylock) gets traded for another one
(the specific lock protecting the sending of VIRQ events).

Jan
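The "unusual" model being traded away here is the try-acquire of the
per-channel lock on the vIRQ send path, as visible in the removed lines
of the patch below. A condensed sketch of that pre-patch pattern follows;
everything outside the removed hunks (the virq_lock acquisition and the
virq_to_evtchn[] read) is filled in from context and should be treated as
approximate rather than as the verbatim source:

    /* Pre-patch delivery of a per-vCPU vIRQ, condensed.  The per-channel
     * lock is only try-acquired, so delivery is silently skipped if the
     * channel is concurrently being reset or closed. */
    spin_lock_irqsave(&v->virq_lock, flags);

    port = v->virq_to_evtchn[virq];
    if ( port )
    {
        struct evtchn *chn = evtchn_from_port(v->domain, port);

        if ( evtchn_read_trylock(chn) )  /* skip if reset/close in progress */
        {
            evtchn_port_set_pending(v->domain, v->vcpu_id, chn);
            evtchn_read_unlock(chn);
        }
    }

    spin_unlock_irqrestore(&v->virq_lock, flags);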
--- a/xen/common/event_channel.c
+++ b/xen/common/event_channel.c
@@ -809,7 +809,6 @@ void send_guest_vcpu_virq(struct vcpu *v
     unsigned long flags;
     int port;
     struct domain *d;
-    struct evtchn *chn;

     ASSERT(!virq_is_global(virq));

@@ -820,12 +819,7 @@ void send_guest_vcpu_virq(struct vcpu *v
         goto out;

     d = v->domain;
-    chn = evtchn_from_port(d, port);
-    if ( evtchn_read_trylock(chn) )
-    {
-        evtchn_port_set_pending(d, v->vcpu_id, chn);
-        evtchn_read_unlock(chn);
-    }
+    evtchn_port_set_pending(d, v->vcpu_id, evtchn_from_port(d, port));

  out:
     spin_unlock_irqrestore(&v->virq_lock, flags);
@@ -854,11 +848,7 @@ void send_guest_global_virq(struct domai
         goto out;

     chn = evtchn_from_port(d, port);
-    if ( evtchn_read_trylock(chn) )
-    {
-        evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);
-        evtchn_read_unlock(chn);
-    }
+    evtchn_port_set_pending(d, chn->notify_vcpu_id, chn);

  out:
     spin_unlock_irqrestore(&v->virq_lock, flags);
--- a/xen/include/xen/event.h
+++ b/xen/include/xen/event.h
@@ -192,9 +192,16 @@ int evtchn_reset(struct domain *d, bool
  * Low-level event channel port ops.
  *
  * All hooks have to be called with a lock held which prevents the channel
- * from changing state. This may be the domain event lock, the per-channel
- * lock, or in the case of sending interdomain events also the other side's
- * per-channel lock. Exceptions apply in certain cases for the PV shim.
+ * from changing state. This may be
+ * - the domain event lock,
+ * - the per-channel lock,
+ * - in the case of sending interdomain events the other side's per-channel
+ *   lock,
+ * - in the case of sending non-global vIRQ-s the per-vCPU virq_lock (in
+ *   combination with the ordering enforced through how the vCPU's
+ *   virq_to_evtchn[] gets updated),
+ * - in the case of sending global vIRQ-s vCPU 0's virq_lock.
+ * Exceptions apply in certain cases for the PV shim.
  */
 struct evtchn_port_ops {
     void (*init)(struct domain *d, struct evtchn *evtchn);
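Putting the two event_channel.c hunks together, the per-vCPU variant
reads roughly as sketched below with the patch applied. The lock
acquisition, the virq_to_evtchn[] read, and the full parameter list are
not visible in the hunks above and are reconstructed from context, so
treat the details as an approximation of the resulting function:

    void send_guest_vcpu_virq(struct vcpu *v, uint32_t virq)
    {
        unsigned long flags;
        int port;
        struct domain *d;

        ASSERT(!virq_is_global(virq));

        spin_lock_irqsave(&v->virq_lock, flags);

        port = v->virq_to_evtchn[virq];
        if ( unlikely(port == 0) )
            goto out;

        d = v->domain;
        /* No per-channel lock needed: v->virq_lock is held, and the port was
         * found non-zero under that same lock, which is exactly what the new
         * comment next to struct evtchn_port_ops relies on. */
        evtchn_port_set_pending(d, v->vcpu_id, evtchn_from_port(d, port));

     out:
        spin_unlock_irqrestore(&v->virq_lock, flags);
    }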
The per-vCPU virq_lock, which is being held anyway, together with there
not being any call to evtchn_port_set_pending() when v->virq_to_evtchn[]
is zero, provide sufficient guarantees. Undo the lock addition done for
XSA-343 (commit e045199c7c9c "evtchn: address races with
evtchn_reset()"). Update the description next to struct evtchn_port_ops
accordingly.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
---
v3: Re-base.
v2: New.
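The guarantee the description relies on has a second half on the
teardown side: the vIRQ binding is cleared and synchronised against the
same virq_lock before the channel can change state, so a sender either
sees port == 0 and bails out, or finishes evtchn_port_set_pending()
before teardown proceeds. A hypothetical, condensed sketch of that
close/unbind-side ordering follows; it is not part of this patch's hunks
and merely assumes the close path works as in the Xen tree of that time:

    /* Close/unbind side (condensed, hypothetical): clear the binding first,
     * then wait for any sender still inside its virq_lock critical section. */
    for_each_vcpu ( d, v )
    {
        if ( v->virq_to_evtchn[virq] != port )
            continue;
        v->virq_to_evtchn[virq] = 0;   /* new senders take the port == 0 exit */
        spin_barrier(&v->virq_lock);   /* drain senders already past the check */
    }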