From patchwork Tue Oct 15 06:15:22 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: "Zhang, Rui"
X-Patchwork-Id: 13835744
From: Zhang Rui
To: tglx@linutronix.de, mingo@redhat.com, bp@alien8.de,
 dave.hansen@linux.intel.com, rafael.j.wysocki@intel.com, x86@kernel.org,
 linux-pm@vger.kernel.org
Cc: hpa@zytor.com, peterz@infradead.org, thorsten.blum@toblux.com,
 yuntao.wang@linux.dev, tony.luck@intel.com, len.brown@intel.com,
 srinivas.pandruvada@intel.com, linux-kernel@vger.kernel.org,
 stable@vger.kernel.org
Subject: [PATCH V4] x86/apic: Always explicitly disarm TSC-deadline timer
Date: Tue, 15 Oct 2024 14:15:22 +0800
Message-Id: <20241015061522.25288-1-rui.zhang@intel.com>
X-Mailer: git-send-email 2.34.1
Precedence: bulk
X-Mailing-List: linux-pm@vger.kernel.org

New processors have become pickier about the local APIC timer state
before entering low power modes.
These low power modes are used (for example) when you close your
laptop lid and suspend. If you put your laptop in a bag in this
unnecessarily-high-power state, it is likely to get quite toasty
while it quickly sucks the battery dry.

The problem boils down to some CPUs' inability to power down until the
kernel fully disables the local APIC timer. The current kernel code
works in one-shot and periodic modes but does not work for deadline
mode. Deadline mode has been the supported and preferred mode on
Intel CPUs for over a decade and uses an MSR to drive the timer
instead of an APIC register.

Disable the TSC Deadline timer in lapic_timer_shutdown() by writing to
MSR_IA32_TSC_DEADLINE when in TSC-deadline mode. Also avoid writing
to the initial-count register (APIC_TMICT) which is ignored in
TSC-deadline mode.

Note: The APIC_LVTT|=APIC_LVT_MASKED operation should theoretically be
enough to tell the hardware that the timer will not fire in any of the
timer modes. But mitigating AMD erratum 411[1] also requires clearing
out APIC_TMICT. Solely setting APIC_LVT_MASKED is also ineffective in
practice on Intel Lunar Lake systems, which is the motivation for this
change.

1. 411 Processor May Exit Message-Triggered C1E State Without an
   Interrupt if Local APIC Timer Reaches Zero -
   https://www.amd.com/content/dam/amd/en/documents/archived-tech-docs/revision-guides/41322_10h_Rev_Gd.pdf

Cc: stable@vger.kernel.org
Fixes: 279f1461432c ("x86: apic: Use tsc deadline for oneshot when available")
Suggested-by: Dave Hansen
Signed-off-by: Zhang Rui
Reviewed-by: Rafael J. Wysocki
Tested-by: Srinivas Pandruvada
Tested-by: Todd Brandt
---
V2
- Improve changelog
V3
- Subject and changelog rewrite
- Check LAPIC Timer mode using APIC_LVTT value instead of extra CPU
  feature flag check
- Avoid APIC_TMICT write which is ignored in TSC-deadline mode
V4
- Add back Fixes tag and stable tag which was missing in V3
- Update patch recipients
---
 arch/x86/kernel/apic/apic.c | 14 +++++++++++++-
 1 file changed, 13 insertions(+), 1 deletion(-)

diff --git a/arch/x86/kernel/apic/apic.c b/arch/x86/kernel/apic/apic.c
index 6513c53c9459..5436a4083065 100644
--- a/arch/x86/kernel/apic/apic.c
+++ b/arch/x86/kernel/apic/apic.c
@@ -440,7 +440,19 @@ static int lapic_timer_shutdown(struct clock_event_device *evt)
 	v = apic_read(APIC_LVTT);
 	v |= (APIC_LVT_MASKED | LOCAL_TIMER_VECTOR);
 	apic_write(APIC_LVTT, v);
-	apic_write(APIC_TMICT, 0);
+
+	/*
+	 * Setting APIC_LVT_MASKED should be enough to tell the
+	 * hardware that this timer will never fire. But AMD
+	 * erratum 411 and some Intel CPU behavior circa 2024
+	 * say otherwise. Time for belt and suspenders programming,
+	 * mask the timer and zero the counter registers:
+	 */
+	if (v & APIC_LVT_TIMER_TSCDEADLINE)
+		wrmsrl(MSR_IA32_TSC_DEADLINE, 0);
+	else
+		apic_write(APIC_TMICT, 0);
+
 	return 0;
 }
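
For readers who want to reason about the branch outside the kernel, the
shutdown logic in the hunk above can be modeled in plain userspace C. This
is only a rough sketch under stated assumptions: struct sim_lapic,
sim_timer_shutdown() and the SIM_* constants are hypothetical names
invented for illustration, the bit positions are assumed to match the
kernel's APIC definitions, and plain struct fields stand in for
apic_read()/apic_write()/wrmsrl(). It does not touch real hardware.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit layout of the LVT timer register; illustrative only. */
#define SIM_LVT_MASKED            (1u << 16)
#define SIM_LVT_TIMER_PERIODIC    (1u << 17)
#define SIM_LVT_TIMER_TSCDEADLINE (2u << 17)
#define SIM_TIMER_VECTOR          0xecu	/* assumed vector number */

/* Hypothetical stand-in for the per-CPU local APIC timer state. */
struct sim_lapic {
	uint32_t lvtt;			/* models APIC_LVTT */
	uint32_t tmict;			/* models APIC_TMICT */
	uint64_t tsc_deadline;		/* models MSR_IA32_TSC_DEADLINE */
	unsigned int tmict_writes;	/* counts modeled register writes */
	unsigned int msr_writes;	/* counts modeled MSR writes */
};

/*
 * Mask the timer, then disarm whichever counter drives the current
 * mode: the deadline MSR in TSC-deadline mode, the initial-count
 * register in one-shot and periodic modes.
 */
static void sim_timer_shutdown(struct sim_lapic *l)
{
	uint32_t v = l->lvtt;

	v |= SIM_LVT_MASKED | SIM_TIMER_VECTOR;
	l->lvtt = v;

	if (v & SIM_LVT_TIMER_TSCDEADLINE) {
		l->tsc_deadline = 0;
		l->msr_writes++;
	} else {
		l->tmict = 0;
		l->tmict_writes++;
	}
}
```

Calling sim_timer_shutdown() on a state whose lvtt has
SIM_LVT_TIMER_TSCDEADLINE set clears only tsc_deadline and leaves the
tmict write counter untouched. A one-shot or periodic state takes the
other branch and clears tmict instead.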