From patchwork Tue May 28 13:48:58 2013
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Paolo Bonzini
X-Patchwork-Id: 2625051
Message-ID: <51A4B5CA.9070109@redhat.com>
Date: Tue, 28 May 2013 15:48:58 +0200
From: Paolo Bonzini
To: Gleb Natapov
CC: kvm@vger.kernel.org, Jan Kiszka
Subject: Re: [PATCH RFC] KVM: Fix race in apic->pending_events processing
References: <20130526130031.GS4725@redhat.com> <51A48D53.7070204@redhat.com> <20130528125613.GB3326@redhat.com>
In-Reply-To: <20130528125613.GB3326@redhat.com>
X-Mailing-List: kvm@vger.kernel.org

On 28/05/2013 14:56, Gleb Natapov wrote:
>> >  		else
>> >  			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
>> >  	}
>> > -	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
>> > +	/*
>> > +	 * Note that we may get another INIT+SIPI sequence right here; process
>> > +	 * the INIT first.  Assumes that there are only KVM_APIC_INIT/SIPI.
>> > +	 */
>> > +	if (cmpxchg(&apic->pending_events, KVM_APIC_SIPI, 0) == KVM_APIC_SIPI &&
>> >  	    vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
> Because pending_events can be INIT/SIPI at this point and it should be
> interpreted as: do SIPI and ignore INIT (atomically).

My patch does "do another INIT (which will have no effect) and do SIPI
after that INIT", which is different but has almost the same effect.  If
pending_events is INIT/SIPI, it ignores the SIPI for now and lets the
next iteration of kvm_apic_accept_events do both.

The difference would be that in a carefully-timed sequence of interrupts

    INIT-INIT-SIPI-INIT-SIPI

your version would do many SIPIs, while mine would do just one.

Hmm... there is a reference to this in 25.2 "Other causes of VM exits":
"If a logical processor is in the wait-for-SIPI state, INIT signals are
blocked.  They do not cause VM exits in this case."  That text is not
about the physical processor, but it makes sense to behave the same way.
Is this the reason why you did the cmpxchg at the end?

But then, there's another way to mask INITs in the wait-for-SIPI state.
Considering that KVM_MP_STATE_INIT_RECEIVED is really a wait-for-SIPI
state, you can do what the diff at the end of this message does: refuse
the INIT at delivery time while mp_state is KVM_MP_STATE_INIT_RECEIVED.

I don't have a particular preference.

Paolo

---
diff --git a/arch/x86/kvm/lapic.c b/arch/x86/kvm/lapic.c
index e1adbb4..36bc308 100644
--- a/arch/x86/kvm/lapic.c
+++ b/arch/x86/kvm/lapic.c
@@ -720,8 +720,12 @@ out:
 		break;
 
 	case APIC_DM_INIT:
-		if (!trig_mode || level) {
+		if ((!trig_mode || level) &&
+		    vcpu->arch.mp_state != KVM_MP_STATE_INIT_RECEIVED) {
 			result = 1;
+
+			/* check mp_state before writing apic->pending_events */
+			smp_mb();
 			/* assumes that there are only KVM_APIC_INIT/SIPI */
 			apic->pending_events = (1UL << KVM_APIC_INIT);
 			/* make sure pending_events is visible before sending
@@ -1865,13 +1869,17 @@ void kvm_apic_accept_events(struct kvm_vcpu *vcpu)
 	if (!kvm_vcpu_has_lapic(vcpu))
 		return;
 
-	if (test_and_clear_bit(KVM_APIC_INIT, &apic->pending_events)) {
+	if (test_bit(KVM_APIC_INIT, &apic->pending_events)) {
 		kvm_lapic_reset(vcpu);
 		kvm_vcpu_reset(vcpu);
 		if (kvm_vcpu_is_bsp(apic->vcpu))
 			vcpu->arch.mp_state = KVM_MP_STATE_RUNNABLE;
 		else
 			vcpu->arch.mp_state = KVM_MP_STATE_INIT_RECEIVED;
+
+		/* write mp_state before toggling KVM_APIC_INIT */
+		smp_mb__before_clear_bit();
+		clear_bit(KVM_APIC_INIT, &apic->pending_events);
 	}
 	if (test_and_clear_bit(KVM_APIC_SIPI, &apic->pending_events) &&
 	    vcpu->arch.mp_state == KVM_MP_STATE_INIT_RECEIVED) {
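
For anyone who wants to poke at the "ignore the SIPI for now and let the
next iteration do both" behaviour of the cmpxchg variant quoted at the
top, here is a rough userspace model using C11 atomics.  It is only a
sketch, not kernel code: every model_* and EV_* name is made up for
illustration, the accept path is split into two functions so that a
second INIT+SIPI can be injected between them, the SIPI sender is
assumed to OR its bit into pending_events (this message does not show
that path), and only the non-BSP case is modelled.

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical event masks; the kernel uses bit numbers KVM_APIC_INIT/SIPI. */
#define EV_INIT (1UL << 0)
#define EV_SIPI (1UL << 1)

/* Two-state stand-in for mp_state; WAIT_FOR_SIPI plays INIT_RECEIVED. */
enum model_mp_state { MODEL_RUNNABLE, MODEL_WAIT_FOR_SIPI };

struct model_vcpu {
	atomic_ulong pending_events;
	enum model_mp_state mp_state;
	int inits_done;
	int sipis_done;
};

void model_init(struct model_vcpu *v)
{
	atomic_init(&v->pending_events, 0);
	v->mp_state = MODEL_RUNNABLE;
	v->inits_done = 0;
	v->sipis_done = 0;
}

/* Sender side: INIT overwrites pending_events, as in the diff above. */
void model_send_init(struct model_vcpu *v)
{
	atomic_store(&v->pending_events, EV_INIT);
}

/* Sender side: SIPI is assumed to OR its bit in (assumption, see lead-in). */
void model_send_sipi(struct model_vcpu *v)
{
	atomic_fetch_or(&v->pending_events, EV_SIPI);
}

/* First half of the accept path: test-and-clear the INIT bit. */
bool model_accept_init(struct model_vcpu *v)
{
	if (!(atomic_fetch_and(&v->pending_events, ~EV_INIT) & EV_INIT))
		return false;
	v->mp_state = MODEL_WAIT_FOR_SIPI;
	v->inits_done++;
	return true;
}

/* Second half: consume SIPI only if it is the sole pending event. */
bool model_accept_sipi_cmpxchg(struct model_vcpu *v)
{
	unsigned long expected = EV_SIPI;

	if (!atomic_compare_exchange_strong(&v->pending_events, &expected, 0UL))
		return false;	/* INIT also pending (or nothing): defer */
	if (v->mp_state != MODEL_WAIT_FOR_SIPI)
		return false;	/* SIPI consumed but dropped, as in the quoted code */
	v->mp_state = MODEL_RUNNABLE;
	v->sipis_done++;
	return true;
}
```

To replay the case discussed above: send an INIT, call
model_accept_init(), then send INIT and SIPI before calling
model_accept_sipi_cmpxchg().  The compare-exchange fails because both
bits are set, so nothing is consumed; calling the two accept functions
once more handles the INIT again and then the SIPI, for two INITs and a
single SIPI in total.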