From: Evgeny Yakovlev
To: kvm@vger.kernel.org
Cc: wrfsh@yandex-team.ru
Subject: [kvm-unit-tests PATCH] x86/apic: wait for wrap around in lapic timer periodic test
Date: Fri, 8 Feb 2019 13:19:02 +0300
Message-Id: <1549621142-25326-1-git-send-email-wrfsh@yandex-team.ru>
X-Patchwork-Submitter: Evgeny Yakovlev
X-Patchwork-Id: 10802671

We are seeing very rare test failures in apic/split-apic with the
following message: "FAIL: TMCCT should be reset to the initial-count".
We are running 4.14 kernels, where the problem occurs sporadically. We
have not tried to reproduce it on any other stable kernel branch.

It looks like, under KVM, the lapic timer in periodic mode may return
several consecutive 0 values from apic_get_tmccr while the asynchronous
hrtimer callback is still restarting the hrtimer and recalculating the
expiration_time value, so that apic_get_tmccr sees ktime_now() already
ahead of the (still old) expiration_time value. After a couple of zeros
from TMCCR everything goes back to normal.

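To illustrate why a reader can observe zeros in this window, here is a
minimal sketch in C. It assumes the current count is derived from the
time remaining until the stored expiration and is clamped at zero once
the reader's clock has passed it. The struct, its fields and both
function names are hypothetical and are not taken from the KVM sources.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical, simplified model of a time-derived current-count read.
 * This is not the KVM code; all names here are illustrative.
 */
struct model_timer {
	int64_t expiration_ns;	/* absolute time at which the current period ends */
	int64_t period_ns;	/* length of one period */
	int64_t ns_per_tick;	/* duration of one timer tick */
};

/* Current count as seen by a reader at time now_ns. */
static uint32_t model_read_tmcct(const struct model_timer *t, int64_t now_ns)
{
	int64_t remaining = t->expiration_ns - now_ns;

	assert(t->ns_per_tick > 0);

	/* Reader is already past a stale expiration: the count clamps to 0. */
	if (remaining < 0)
		remaining = 0;

	return (uint32_t)(remaining / t->ns_per_tick);
}

/* What the timer callback eventually does: move the expiration forward. */
static void model_rearm(struct model_timer *t)
{
	t->expiration_ns += t->period_ns;
}
```

In this model, every read that lands between the old expiration and the
call to model_rearm() returns 0, and reads after the rearm return a
value close to the initial count again.
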
This change forces the test_apic_change_mode test to wait specifically
for the lapic timer to wrap around, skipping zero TMCCR values, in the
one case where we check that a timer in periodic mode correctly reloads
TMCCR from TMICR. If the timer mode change is indeed broken and TMCCR is
not reset, the apic test will time out.

Signed-off-by: Evgeny Yakovlev
---
 x86/apic.c | 23 +++++++++++++++++++++--
 1 file changed, 21 insertions(+), 2 deletions(-)

diff --git a/x86/apic.c b/x86/apic.c
index cfdbef2..51744cf 100644
--- a/x86/apic.c
+++ b/x86/apic.c
@@ -489,7 +489,7 @@ static void test_physical_broadcast(void)
 	report("APIC physical broadcast shorthand", broadcast_received(ncpus));
 }
 
-static void wait_until_tmcct_is_zero(uint32_t initial_count, bool stop_when_half)
+static void wait_until_tmcct_common(uint32_t initial_count, bool stop_when_half, bool should_wrap_around)
 {
 	uint32_t tmcct = apic_read(APIC_TMCCT);
 
@@ -503,9 +503,23 @@ static void wait_until_tmcct_is_zero(uint32_t initial_count, bool stop_when_half
 		/* Wait until the counter reach 0 or wrap-around */
 		while ( tmcct <= (initial_count / 2) && tmcct > 0 )
 			tmcct = apic_read(APIC_TMCCT);
+
+		/* Wait specifically for wrap around to skip 0 TMCCR if we were asked to */
+		while (should_wrap_around && !tmcct)
+			tmcct = apic_read(APIC_TMCCT);
 	}
 }
 
+static void wait_until_tmcct_is_zero(uint32_t initial_count, bool stop_when_half)
+{
+	return wait_until_tmcct_common(initial_count, stop_when_half, false);
+}
+
+static void wait_until_tmcct_wrap_around(uint32_t initial_count, bool stop_when_half)
+{
+	return wait_until_tmcct_common(initial_count, stop_when_half, true);
+}
+
 static inline void apic_change_mode(unsigned long new_mode)
 {
 	uint32_t lvtt;
@@ -548,7 +562,12 @@ static void test_apic_change_mode(void)
 	 * counting down from where it was
 	 */
 	report("TMCCT should not be reset to TMICT value", apic_read(APIC_TMCCT) < (tmict / 2));
-	wait_until_tmcct_is_zero(tmict, false);
+	/*
+	 * Specifically wait for timer wrap around and skip 0.
+	 * Under KVM lapic there is a possibility that a small amount of consecutive
+	 * TMCCR reads return 0 while hrtimer is reset in an async callback
+	 */
+	wait_until_tmcct_wrap_around(tmict, false);
 	report("TMCCT should be reset to the initial-count", apic_read(APIC_TMCCT) > (tmict / 2));
 
 	wait_until_tmcct_is_zero(tmict, true);
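
The effect of the extra loop can be sketched outside the test harness.
The following is a simplified model in C, not the test code itself: it
replays a fixed sequence of counter reads through a hypothetical
fake_apic stand-in and reproduces only the two loops visible in the
second hunk (wait for 0 or wrap-around, then optionally skip zeros).
The names fake_apic, fake_read_tmcct and model_wait are illustrative
assumptions.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the APIC: replays a fixed sequence of reads. */
struct fake_apic {
	const uint32_t *reads;
	size_t nreads;
	size_t pos;
};

static uint32_t fake_read_tmcct(struct fake_apic *apic)
{
	assert(apic->nreads > 0);

	/* Keep returning the last sample once the sequence is exhausted. */
	if (apic->pos < apic->nreads - 1)
		return apic->reads[apic->pos++];
	return apic->reads[apic->nreads - 1];
}

/*
 * Model of the two loops in the hunk: first wait for the counter to
 * reach 0 or wrap around, then optionally skip consecutive zero reads.
 * Returns the last value read so that the caller can inspect it.
 */
static uint32_t model_wait(struct fake_apic *apic, uint32_t initial_count,
			   bool should_wrap_around)
{
	uint32_t tmcct = fake_read_tmcct(apic);

	while (tmcct <= (initial_count / 2) && tmcct > 0)
		tmcct = fake_read_tmcct(apic);

	/* Like the patched test, this spins forever if the counter never reloads. */
	while (should_wrap_around && !tmcct)
		tmcct = fake_read_tmcct(apic);

	return tmcct;
}
```

With an initial count of 100 and the read sequence 40, 20, 5, 0, 0, 0,
98, 97, the model without the extra loop stops at the first 0, so the
next read (the one the report() check would perform) is still 0. With
should_wrap_around set, it consumes the zeros and stops at 98, so the
following read is above half of the initial count.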