From patchwork Wed Mar 25 11:08:14 2015
X-Patchwork-Submitter: Radim Krčmář
X-Patchwork-Id: 6090131
Date: Wed, 25 Mar 2015 12:08:14 +0100
From: Radim Krčmář
To: Andy Lutomirski
Cc: Marcelo Tosatti, kvm-devel, stable, Paolo Bonzini
Subject: Re: x86: kvm: Revert "remove sched notifier for cross-cpu migrations"
Message-ID: <20150325110814.GE21522@potion.brq.redhat.com>
References: <20150323232151.GA12772@amt.cnet>
 <20150324153412.GB21710@potion.brq.redhat.com>
X-Mailing-List: kvm@vger.kernel.org

2015-03-24 15:33-0700, Andy Lutomirski:
> On Tue, Mar 24, 2015 at 8:34 AM, Radim Krčmář wrote:
> > What is the problem?
>
> The kvmclock spec says that the host will increment a version field to
> an odd number, then update stuff, then increment it to an even number.
> The host is buggy and doesn't do this, and the result is observable
> when one vcpu reads another vcpu's kvmclock data.
>
> Since there's no good way for a guest kernel to keep its vdso from
> reading a different vcpu's kvmclock data, this is a real corner-case
> bug.  This patch allows the vdso to retry when this happens.  I don't
> think it's a great solution, but it should mostly work.

Great explanation, thank you.

Reverting the patch protects us from any migration, but I don't think we
need to care about changing VCPUs as long as we read consistent data
from kvmclock.  (The VCPU can change outside of this loop too, so it
doesn't matter if we return a value not fit for this VCPU.)

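To make "read consistent data" concrete, here is a minimal userspace
sketch of the reader side of the version protocol quoted above.  The
struct, the function name and the bounded retry count are hypothetical
and only for illustration; this is not the vdso code, and the C11 fences
merely stand in for the read barriers a real reader would need.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, trimmed-down stand-in for the per-VCPU kvmclock area. */
struct demo_kvmclock {
	uint32_t version;	/* odd: update in progress, even: stable */
	uint64_t tsc_timestamp;
	uint64_t system_time;
};

/*
 * Copy a consistent snapshot of *src into *out.
 * Returns 1 on success, 0 if no stable snapshot was seen within
 * max_tries attempts (a real reader would simply keep looping).
 */
static int demo_read_consistent(const volatile struct demo_kvmclock *src,
				struct demo_kvmclock *out,
				unsigned int max_tries)
{
	assert(src && out);

	while (max_tries--) {
		uint32_t before = src->version;

		atomic_thread_fence(memory_order_acquire);
		if (before & 1)
			continue;	/* writer is in the middle of an update */

		out->tsc_timestamp = src->tsc_timestamp;
		out->system_time = src->system_time;

		atomic_thread_fence(memory_order_acquire);
		if (src->version == before) {
			out->version = before;
			return 1;	/* version unchanged: data is consistent */
		}
	}
	return 0;
}
```

Nothing in the loop depends on which VCPU the data belongs to; it only
relies on the version being odd while the data is in flux.
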
I think we could drop the second __getcpu if our kvmclock was being
handled better; maybe with a patch like the one below:

---
diff --git a/arch/x86/kvm/x86.c b/arch/x86/kvm/x86.c
index cc2c759f69a3..8658599e0024 100644
--- a/arch/x86/kvm/x86.c
+++ b/arch/x86/kvm/x86.c
@@ -1658,12 +1658,24 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 		&guest_hv_clock, sizeof(guest_hv_clock))))
 		return 0;
 
-	/*
-	 * The interface expects us to write an even number signaling that the
-	 * update is finished. Since the guest won't see the intermediate
-	 * state, we just increase by 2 at the end.
+	/* A guest can read other VCPU's kvmclock; specification says that
+	 * version is odd if data is being modified and even after it is
+	 * consistent.
+	 * We write three times to be sure.
+	 *  1) update version to odd number
+	 *  2) write modified data (version is still odd)
+	 *  3) update version to even number
+	 *
+	 * TODO: optimize
+	 *  - only two writes should be enough -- version is first
+	 *  - the second write could update just version
 	 */
-	vcpu->hv_clock.version = guest_hv_clock.version + 2;
+	guest_hv_clock.version += 1;
+	kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
+				&guest_hv_clock,
+				sizeof(guest_hv_clock));
+
+	vcpu->hv_clock.version = guest_hv_clock.version;
 
 	/* retain PVCLOCK_GUEST_STOPPED if set in guest copy */
 	pvclock_flags = (guest_hv_clock.flags & PVCLOCK_GUEST_STOPPED);
@@ -1684,6 +1696,11 @@ static int kvm_guest_time_update(struct kvm_vcpu *v)
 	kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
 				&vcpu->hv_clock,
 				sizeof(vcpu->hv_clock));
+
+	vcpu->hv_clock.version += 1;
+	kvm_write_guest_cached(v->kvm, &vcpu->pv_time,
+				&vcpu->hv_clock,
+				sizeof(vcpu->hv_clock));
 	return 0;
 }
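
The three writes in the patch can be traced with a small userspace
model.  This is only a sketch under stated assumptions: demo_clock,
demo_write_guest() (a hypothetical stand-in for
kvm_write_guest_cached()) and the trace array are made up for
illustration, and the model is single-threaded, so it says nothing about
memory ordering between VCPUs.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, trimmed-down clock area: version plus one data field. */
struct demo_clock {
	uint32_t version;
	uint64_t system_time;
};

/* Every state the guest-visible page went through, in write order. */
static struct demo_clock demo_trace[8];
static int demo_trace_len;

/* Stand-in for kvm_write_guest_cached(): copies the whole structure. */
static void demo_write_guest(struct demo_clock *guest_page,
			     const struct demo_clock *src)
{
	memcpy(guest_page, src, sizeof(*src));
	if (demo_trace_len < 8)
		demo_trace[demo_trace_len++] = *guest_page;
}

static void demo_publish(struct demo_clock *guest_page,
			 struct demo_clock *host_copy,
			 uint64_t new_system_time)
{
	struct demo_clock guest_now = *guest_page;

	assert((guest_now.version & 1) == 0);

	/* 1) update version to odd number (data is still the old one) */
	guest_now.version += 1;
	demo_write_guest(guest_page, &guest_now);

	/* 2) write modified data (version is still odd) */
	host_copy->version = guest_now.version;
	host_copy->system_time = new_system_time;
	demo_write_guest(guest_page, host_copy);

	/* 3) update version to even number */
	host_copy->version += 1;
	demo_write_guest(guest_page, host_copy);
}
```

Starting from version 4, the guest-visible page goes through
(5, old data), (5, new data) and (6, new data), so a reader that rejects
odd versions never accepts the half-updated state, provided the writes
become visible in this order.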