From patchwork Thu Apr 10 20:01:59 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mario Limonciello X-Patchwork-Id: 14047261 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 9E2E02900BB; Thu, 10 Apr 2025 20:02:21 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744315342; cv=none; b=a78RtHa0GSGf4uLh3r+qDefiZkRSimURqdpy9dTz+U3ill24PxDmpsxaZ6URotPTxi+OrH9re2Pkp+WpgQ0gZUII4MHdRhUVw4bs82M+Tbp6NipRLyc8TEVeb+5htdO+MwIKl3VMU7yTpilB4tAwpl1MzS+V4bDFVvLmt1Y6CLQ= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744315342; c=relaxed/simple; bh=qE5KQeuTNfgty93zEjMZIPiBhtHAerTzCtIP0LC3IIY=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=b2SOR7+/mwuruMhVew2mVTJENz7IfDHByTN3pvdukB9WnXC/43qoW2IgObD/2f4xMmwF1UCEjHaVrnT5v9WqZ5kh+KNRtcIECn53D1NuaPo9nWj0QLFqUOZVibm+9J/DbpRH6UyW5vBTmeLcK2mPOdHmipQWNZEQ9oZmpCabOLI= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=lEmO/JBx; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="lEmO/JBx" Received: by smtp.kernel.org (Postfix) with ESMTPSA id E7A23C4CEEB; Thu, 10 Apr 2025 20:02:17 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744315341; bh=qE5KQeuTNfgty93zEjMZIPiBhtHAerTzCtIP0LC3IIY=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=lEmO/JBx/HHl3VWsF59TpB+l58Wuphhj0uXgfq00F+xLoSwIySf8C4bmxKG65zxqt 
guL1HPJIdlZKxdyCqp9h1mBBRTpBl5nhA/ROeW3qMUr9XGzLhik6VQxgrV7h1+f2EA hv3+gQ23ttWbklOn/04BiuI2P2o92F0x6V+jyli8m7FzhLt67zPtvGP0hGz6vcFlUQ oUFzDTsO1zHVz6pHrXPqZMR1ZiDy5NAuAYWK7etaOhFVQs/bW0lBu1+Tw1BIAk5lye /3ncl/1yCbdzIFegWX6YUo6yYFFwvfNb+BXcjg2ToEu6sfk+hqH/H2qi5buwA3Pypt UlORzL1B8G8XA== From: Mario Limonciello To: Borislav Petkov , Jean Delvare , Andi Shyti , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Cc: Jonathan Corbet , Mario Limonciello , Yazen Ghannam , Thomas Gleixner , Ingo Molnar , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H . Peter Anvin" , Shyam Sundar S K , Hans de Goede , linux-doc@vger.kernel.org (open list:DOCUMENTATION), linux-kernel@vger.kernel.org (open list), linux-i2c@vger.kernel.org (open list:I2C/SMBUS CONTROLLER DRIVERS FOR PC), platform-driver-x86@vger.kernel.org (open list:AMD PMC DRIVER), "Gautham R . Shenoy" Subject: [PATCH v3 1/4] Documentation: Add AMD Zen debugging document Date: Thu, 10 Apr 2025 15:01:59 -0500 Message-ID: <20250410200202.2974062-2-superm1@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250410200202.2974062-1-superm1@kernel.org> References: <20250410200202.2974062-1-superm1@kernel.org> Precedence: bulk X-Mailing-List: platform-driver-x86@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Mario Limonciello Debugging issues on AMD hardware can be challenging for users without proper documentation and tools. Introduce a document that includes techniques for debugging s2idle issues. It will be expanded for debugging other issues later. Reviewed-by: Gautham R. 
Shenoy Signed-off-by: Mario Limonciello --- v3: * Move debugging.rst to index.rst --- Documentation/admin-guide/amd/index.rst | 270 ++++++++++++++++++++++ Documentation/admin-guide/amd/resume.svg | 4 + Documentation/admin-guide/amd/suspend.svg | 4 + Documentation/admin-guide/index.rst | 1 + 4 files changed, 279 insertions(+) create mode 100644 Documentation/admin-guide/amd/index.rst create mode 100644 Documentation/admin-guide/amd/resume.svg create mode 100644 Documentation/admin-guide/amd/suspend.svg diff --git a/Documentation/admin-guide/amd/index.rst b/Documentation/admin-guide/amd/index.rst new file mode 100644 index 0000000000000..5a721ab4fe013 --- /dev/null +++ b/Documentation/admin-guide/amd/index.rst @@ -0,0 +1,270 @@ +.. SPDX-License-Identifier: GPL-2.0 + +Debugging AMD Zen systems ++++++++++++++++++++++++++ + +Introduction +============ + +This document describes techniques that are useful for debugging issues with +AMD Zen systems. It is intended for use by developers and technical users +to help identify and resolve issues. + +S3 vs s2idle +============ + +On AMD systems, it's not possible to simultaneously support suspend-to-RAM (S3) +and suspend-to-idle (s2idle). To confirm which mode your system supports you +can run ``cat /sys/power/mem_sleep``. If it shows ``s2idle [deep]`` then +*S3* is supported. If it shows ``[s2idle]`` then *s2idle* is +supported. + +On systems that support *S3*, the firmware is used to put all hardware into +the appropriate low power state. + +On systems that support *s2idle*, the kernel is responsible for transitioning devices +into the appropriate low power state. When all devices are in the appropriate low +power state, the hardware will transition into a hardware sleep state. + +After a suspend cycle you can tell how much time was spent in a hardware sleep +state by running ``cat /sys/power/suspend_stats/last_hw_sleep``. + +This flowchart explains how the AMD s2idle suspend flow works. + +.. 
kernel-figure:: suspend.svg + +This flowchart explains how the AMD s2idle resume flow works. + +.. kernel-figure:: resume.svg + +s2idle debugging script +======================= + +As there are many places where problems can occur, a debugging script has +been created that can help test for common problems and offer suggestions. + +https://git.kernel.org/pub/scm/linux/kernel/git/superm1/amd-debug-tools.git/tree/amd_s2idle.py + +If you have an s2idle issue, it's best to start with this and follow instructions +from its findings. If you continue to have an issue, raise a bug with the +report generated from this script. + +Spurious s2idle wakeups from an IRQ +=================================== +Spurious wakeups will generally have an IRQ reported in ``/sys/power/pm_wakeup_irq``. +This can be matched to ``/proc/interrupts`` to determine what device woke the system. + +If this isn't enough to debug the problem, then the following sysfs files +can be set to add more verbosity to the wakeup process: :: + + $ echo 1 | sudo tee /sys/power/pm_debug_messages + $ echo 1 | sudo tee /sys/power/pm_print_times + +After making those changes, the kernel will display messages that can +be traced back to kernel s2idle loop code as well as display any active +GPIO sources while waking up. + +If the wakeup is caused by the ACPI SCI, additional ACPI debugging may be +needed. These commands can enable additional trace data: :: + + $ echo enable | sudo tee /sys/module/acpi/parameters/trace_state + $ echo 1 | sudo tee /sys/module/acpi/parameters/aml_debug_output + $ echo 0x0800000f | sudo tee /sys/module/acpi/parameters/debug_level + $ echo 0xffff0000 | sudo tee /sys/module/acpi/parameters/debug_layer + +Spurious s2idle wakeups from a GPIO +=================================== + +If a GPIO is active when waking up the system, ideally you would look at the +schematic to determine what device it is associated with. 
If the schematic +is not available, another tactic is to look at the ACPI _EVT() entry +to determine what device is notified when that GPIO is active. + +For a hypothetical example, say that GPIO 59 woke up the system. You can +look at the SSDT to determine what device is notified when GPIO 59 is active. + +First convert the GPIO number into hex. :: + + $ python3 -c "print(hex(59))" + 0x3b + +Next determine which ACPI table has the ``_EVT`` entry. For example: :: + + $ sudo grep EVT /sys/firmware/acpi/tables/SSDT* + grep: /sys/firmware/acpi/tables/SSDT27: binary file matches + +Decode this table: :: + + $ sudo cp /sys/firmware/acpi/tables/SSDT27 . + $ sudo iasl -d SSDT27 + +Then look at the table and find the matching entry for GPIO 0x3b. :: + + Case (0x3B) + { + M000 (0x393B) + M460 (" Notify (\\_SB.PCI0.GP17.XHC1, 0x02)\n", Zero, Zero, Zero, Zero, Zero, Zero) + Notify (\_SB.PCI0.GP17.XHC1, 0x02) // Device Wake + } + +You can see in this case that the device ``\_SB.PCI0.GP17.XHC1`` is notified +when GPIO 59 is active. The name suggests this is an XHCI controller, but to go a +step further you can figure out which XHCI controller it is by matching it to +ACPI: :: + + $ grep "PCI0.GP17.XHC1" /sys/bus/acpi/devices/*/path + /sys/bus/acpi/devices/device:2d/path:\_SB_.PCI0.GP17.XHC1 + /sys/bus/acpi/devices/device:2e/path:\_SB_.PCI0.GP17.XHC1.RHUB + /sys/bus/acpi/devices/device:2f/path:\_SB_.PCI0.GP17.XHC1.RHUB.PRT1 + /sys/bus/acpi/devices/device:30/path:\_SB_.PCI0.GP17.XHC1.RHUB.PRT1.CAM0 + /sys/bus/acpi/devices/device:31/path:\_SB_.PCI0.GP17.XHC1.RHUB.PRT1.CAM1 + /sys/bus/acpi/devices/device:32/path:\_SB_.PCI0.GP17.XHC1.RHUB.PRT2 + /sys/bus/acpi/devices/LNXPOWER:0d/path:\_SB_.PCI0.GP17.XHC1.PWRS + +Here you can see it matches to ``device:2d``. Look at the ``physical_node`` +to determine what PCI device that actually is. 
:: + + $ ls -l /sys/bus/acpi/devices/device:2d/physical_node + lrwxrwxrwx 1 root root 0 Feb 12 13:22 /sys/bus/acpi/devices/device:2d/physical_node -> ../../../../../pci0000:00/0000:00:08.1/0000:c2:00.4 + +So there you have it: the PCI device associated with this GPIO wakeup was ``0000:c2:00.4``. + +The ``amd_s2idle.py`` script will capture most of these artifacts for you. + +s2idle PM debug messages +======================== +During the s2idle flow on AMD systems, the ACPI LPS0 driver is responsible +for checking all uPEP constraints. Failing uPEP constraints does not prevent +s0i3 entry. This means that if some constraints are not met, the kernel +may still attempt to enter s2idle even though there are known issues. + +To activate PM debugging, use the kernel command line option: ``pm_debug_messages``. + +Or enable the feature using the sysfs file: ``/sys/power/pm_debug_messages``. +Constraints that are not met will be displayed in the kernel log and can be +viewed using anything that processes the kernel ring buffer such as ``dmesg`` or +``journalctl``. + +If the system freezes on entry/exit before these messages are flushed, a +useful debugging tactic is to unbind the ``amd_pmc`` driver to prevent +notification to the platform to start s0i3 entry. This will stop the +system from freezing on entry or exit and let you view all the failed +constraints. :: + + cd /sys/bus/platform/drivers/amd_pmc + ls | grep AMD | sudo tee unbind + +After doing this, run the suspend cycle and look specifically for messages such as: :: + + ACPI: LPI: Constraint not met; min power state:%s current power state:%s + +Historical examples of s2idle issues +==================================== +To help understand the types of issues that can occur and how to debug them, +here are some historical examples of s2idle issues that have been resolved. + +Core offlining +-------------- +An end user reported that taking a core offline would prevent the system +from properly entering s0i3. 
This was debugged using internal AMD tools +to capture and display a stream of metrics from the hardware showing what changed +when a core was offlined. It was determined that the hardware was not +notified that the offline cores were in the deepest state, and so it prevented +the CPU from going into the deepest state. The issue was traced to a missing +command to put cores into C3 upon offline. + +`commit d6b88ce2eb9d2 ("ACPI: processor idle: Allow playing dead in C3 state") `_ + +Corruption after resume +----------------------- +A big problem that occurred with Rembrandt was that there was graphical +corruption after resume. This happened because of a misalignment of PSP +and driver responsibility. The PSP will save and restore DMCUB, but the +driver assumed it needed to reset DMCUB on resume. +This actually was a misalignment for earlier silicon as well, but was not +observed. + +`commit 79d6b9351f086 ("drm/amd/display: Don't reinitialize DMCUB on s0ix resume") `_ + +Back to Back suspends fail +-------------------------- +When using a wakeup source that triggers the IRQ to wake up, a bug in the +pinctrl-amd driver could capture the wrong state of the IRQ and prevent the +system from going back to sleep properly. + +`commit b8c824a869f22 ("pinctrl: amd: Don't save/restore interrupt status and wake status bits") `_ + +Spurious timer based wakeup after 5 minutes +------------------------------------------- +The HPET was being used to program the wakeup source for the system; however, +this was causing a spurious wakeup after 5 minutes. The correct alarm to use +was the ACPI alarm. + +`commit 3d762e21d5637 ("rtc: cmos: Use ACPI alarm for non-Intel x86 systems too") `_ + +Disk disappears after resume +---------------------------- +After resuming from s2idle, the NVMe disk would disappear. This was due to the +BIOS not specifying the _DSD StorageD3Enable property. This caused the NVMe +driver not to put the disk into the expected state at suspend and to fail +on resume. 
+ +`commit e79a10652bbd3 ("ACPI: x86: Force StorageD3Enable on more products") `_ + +Spurious IRQ1 +------------- +A number of Renoir, Lucienne, Cezanne, and Barcelo platforms have a +platform firmware bug where IRQ1 is triggered during s0i3 resume. + +This was fixed in the platform firmware, but a number of systems didn't +receive any more platform firmware updates. + +`commit 8e60615e89321 ("platform/x86/amd: pmc: Disable IRQ1 wakeup for RN/CZN") `_ + +Hardware timeout +---------------- +The hardware performs many actions besides accepting the values from the +amd-pmc driver. As the communication path with the hardware is a mailbox, +it's possible that it might not respond quickly enough. +This issue manifested as a failure to suspend: :: + + PM: dpm_run_callback(): acpi_subsys_suspend_noirq+0x0/0x50 returns -110 + amd_pmc AMDI0005:00: PM: failed to suspend noirq: error -110 + +The timing problem was identified by comparing the values of the idle mask. + +`commit 3c3c8e88c8712 ("platform/x86: amd-pmc: Increase the response register timeout") `_ + +Failed to reach hardware sleep state with panel on +-------------------------------------------------- +On some Strix systems certain panels were observed to block the system from +entering a hardware sleep state if the internal panel was on during the sequence. + +Even though the panel was turned off during suspend, it exposed a timing problem +where an interrupt caused the display hardware to wake up and block low power +state entry. + +`patch `_ + +Runtime power consumption issues +================================ +Runtime power consumption is influenced by many factors, including but not +limited to the configuration of the PCIe Active State Power Management (ASPM), +the display brightness, the EPP policy of the CPU, and the power management +of the devices. + +ASPM +---- +For the best runtime power consumption, ASPM should be programmed as intended +by the BIOS from the hardware vendor. 
To accomplish this the Linux kernel +should be compiled with ``CONFIG_PCIEASPM_DEFAULT`` set to ``y`` and the +sysfs file ``/sys/module/pcie_aspm/parameters/policy`` should not be modified. + +Most notably, if L1.2 is not configured properly for any devices, the SoC +will not be able to enter the deepest idle state. + +EPP Policy +---------- +The ``energy_performance_preference`` sysfs file can be used to set a bias +of efficiency or performance for a CPU. This has a direct relationship on +the battery life when more heavily biased towards performance. diff --git a/Documentation/admin-guide/amd/resume.svg b/Documentation/admin-guide/amd/resume.svg new file mode 100644 index 0000000000000..ad8839f762bfe --- /dev/null +++ b/Documentation/admin-guide/amd/resume.svg @@ -0,0 +1,4 @@ + + + +
[resume.svg flowchart labels; layout and arrows not recoverable] Wakeup event occurs; MP1 hands off control to OS; OS moves one core out of ACPI C3; MP0/MP1 boot process; OS checks all wake sources; ACPI fixed event active? (yes/no); IRQ other than ACPI SCI active? (yes/no); GPIO IRQ shared with SCI? (yes/no); Any PM wakeup event pending? (yes/no); Kernel resumes system; uPEP driver removes OS_HINT; Any GPIO w/ WAKESTS active? (yes/no); Check for ACPI Notify() events; Any GPE pending? (yes/no); OS moves active core back to ACPI C3; MP1 puts system back to sleep
\ No newline at end of file diff --git a/Documentation/admin-guide/amd/suspend.svg b/Documentation/admin-guide/amd/suspend.svg new file mode 100644 index 0000000000000..a69073c018d56 --- /dev/null +++ b/Documentation/admin-guide/amd/suspend.svg @@ -0,0 +1,4 @@ + + + +
[suspend.svg flowchart labels; layout and arrows not recoverable] SFH driver notifies MP2 to stop all sensor collection; Abort suspend, details logged in dmesg; All devices go into deepest D-state or F-state; GPIO driver suspends non-wake GPIOs; Suspend initiated from userspace; GPU driver shuts down clocks and sends SMU messages; ACPI s2idle driver notifies EC using _DSM; uPEP driver (amd-pmc) sends OS_HINT; Put all x86 CPU cores into ACPI C3; s2idle loop waiting for IRQ to wake; four "Failures?" decision nodes (yes/no)
\ No newline at end of file diff --git a/Documentation/admin-guide/index.rst b/Documentation/admin-guide/index.rst index 259d79fbeb944..440d106a17458 100644 --- a/Documentation/admin-guide/index.rst +++ b/Documentation/admin-guide/index.rst @@ -81,6 +81,7 @@ problems and bugs in particular. lockup-watchdogs RAS/index sysrq + amd/index Core-kernel subsystems From patchwork Thu Apr 10 20:02:00 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mario Limonciello X-Patchwork-Id: 14047262 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7F55119DF99; Thu, 10 Apr 2025 20:02:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744315343; cv=none; b=JhxP8YqR+ybvUV4GVeiaNh2WamsC4LXqCXAyUhsnQcRf0OMuw1Ir7+6geffcEKRhsWI0f3vO3sn88yu5gH9XCrZ4eZ9xB6x46xidX96XEOt5jSrN10kb76WesvCvuvQCaU3dYpQ7ZQzzfm2hg0Lee3Qa38Fv6ixEdSkxYngiKcs= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744315343; c=relaxed/simple; bh=ULFbnNf/hvDw+1NTZ+kLF9e+tvJn1FWc+GWv/LrHRAA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=THF5GVUrWVOSqP+THa0X+gq/hMH5c0MnXoNrcJ6ztUmsyCOIYRolEfI4IiBWHzZCRM6UbezLFrTpwQjLhEOfHeGqIHKPW2B+IlLa82hXwUsWqSYPLu/+XLf8n5vHlo19qEBmkyIpih08nSXMClTULU0hd5u+edLkDvjEJJ8LIxA= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=Yv7SIKuX; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="Yv7SIKuX" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 
5A03DC4CEEA; Thu, 10 Apr 2025 20:02:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744315342; bh=ULFbnNf/hvDw+1NTZ+kLF9e+tvJn1FWc+GWv/LrHRAA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=Yv7SIKuXR/dspukAHhBiNpVfesY2FtQldpjTrNkVG1+F4Y2A7spiNelUTbU1KwFFn uWNPi7T6ULdSQuOcEDeIphkAQ/b0GZjdsivMBm4tfd6bO2/w3Gtk9eMKhDbnW90LIw GT89MlZGxLWGvRuWf5R/7joW5ZPS5RnjLzEHZX2SUJ0GYCCRGXSS3RN1+NAEMJReZV Ml/Wvbgi0CTm0BIM3a1B5WcTAUDwXpZ+ZIm9ktbyajayhWiJbq5kkD24Xal2uRweN2 fLYB5dx4YoLvVB+3l1amlfcKJz3rSu4xmqvS0XFRpPh0WOH3Oa5geRG6NwwNwcBIcu ppjoqeZaLEpQw== From: Mario Limonciello To: Borislav Petkov , Jean Delvare , Andi Shyti , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Cc: Jonathan Corbet , Mario Limonciello , Yazen Ghannam , Thomas Gleixner , Ingo Molnar , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H . Peter Anvin" , Shyam Sundar S K , Hans de Goede , linux-doc@vger.kernel.org (open list:DOCUMENTATION), linux-kernel@vger.kernel.org (open list), linux-i2c@vger.kernel.org (open list:I2C/SMBUS CONTROLLER DRIVERS FOR PC), platform-driver-x86@vger.kernel.org (open list:AMD PMC DRIVER) Subject: [PATCH v3 2/4] i2c: piix4: Move SB800_PIIX4_FCH_PM_ADDR definition to amd_node.h Date: Thu, 10 Apr 2025 15:02:00 -0500 Message-ID: <20250410200202.2974062-3-superm1@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250410200202.2974062-1-superm1@kernel.org> References: <20250410200202.2974062-1-superm1@kernel.org> Precedence: bulk X-Mailing-List: platform-driver-x86@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Mario Limonciello SB800_PIIX4_FCH_PM_ADDR is used to indicate the base address for the FCH PM registers. Multiple drivers may need this base address, so move it to a common header location and rename accordingly. 
Signed-off-by: Mario Limonciello --- arch/x86/include/asm/amd_node.h | 2 ++ drivers/i2c/busses/i2c-piix4.c | 12 ++++++------ 2 files changed, 8 insertions(+), 6 deletions(-) diff --git a/arch/x86/include/asm/amd_node.h b/arch/x86/include/asm/amd_node.h index 23fe617898a8f..f4993201834ea 100644 --- a/arch/x86/include/asm/amd_node.h +++ b/arch/x86/include/asm/amd_node.h @@ -19,6 +19,8 @@ #include +#define FCH_PM_BASE 0xFED80300 + #define MAX_AMD_NUM_NODES 8 #define AMD_NODE0_PCI_SLOT 0x18 diff --git a/drivers/i2c/busses/i2c-piix4.c b/drivers/i2c/busses/i2c-piix4.c index dd75916157f05..7c895001c5e8f 100644 --- a/drivers/i2c/busses/i2c-piix4.c +++ b/drivers/i2c/busses/i2c-piix4.c @@ -21,6 +21,7 @@ an i2c_algorithm to access them. */ +#include #include #include #include @@ -85,7 +86,6 @@ #define SB800_PIIX4_PORT_IDX_MASK_KERNCZ 0x18 #define SB800_PIIX4_PORT_IDX_SHIFT_KERNCZ 3 -#define SB800_PIIX4_FCH_PM_ADDR 0xFED80300 #define SB800_PIIX4_FCH_PM_SIZE 8 #define SB800_ASF_ACPI_PATH "\\_SB.ASFC" @@ -162,19 +162,19 @@ int piix4_sb800_region_request(struct device *dev, struct sb800_mmio_cfg *mmio_c if (mmio_cfg->use_mmio) { void __iomem *addr; - if (!request_mem_region_muxed(SB800_PIIX4_FCH_PM_ADDR, + if (!request_mem_region_muxed(FCH_PM_BASE, SB800_PIIX4_FCH_PM_SIZE, "sb800_piix4_smb")) { dev_err(dev, "SMBus base address memory region 0x%x already in use.\n", - SB800_PIIX4_FCH_PM_ADDR); + FCH_PM_BASE); return -EBUSY; } - addr = ioremap(SB800_PIIX4_FCH_PM_ADDR, + addr = ioremap(FCH_PM_BASE, SB800_PIIX4_FCH_PM_SIZE); if (!addr) { - release_mem_region(SB800_PIIX4_FCH_PM_ADDR, + release_mem_region(FCH_PM_BASE, SB800_PIIX4_FCH_PM_SIZE); dev_err(dev, "SMBus base address mapping failed.\n"); return -ENOMEM; @@ -201,7 +201,7 @@ void piix4_sb800_region_release(struct device *dev, struct sb800_mmio_cfg *mmio_ { if (mmio_cfg->use_mmio) { iounmap(mmio_cfg->addr); - release_mem_region(SB800_PIIX4_FCH_PM_ADDR, + release_mem_region(FCH_PM_BASE, SB800_PIIX4_FCH_PM_SIZE); return; } From 
patchwork Thu Apr 10 20:02:01 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mario Limonciello X-Patchwork-Id: 14047263 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 5B78E290BC2; Thu, 10 Apr 2025 20:02:24 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744315345; cv=none; b=L41fyG9VzlHop4KgYw16dQ9dRtSaoojM5RsJnH3VFGJ8GQWZ/0uJPbxoLpFyTgZ2OdCTz5bEjeCfOoaJDL57gZ75P+W1dCLavrMEnwnUTnFZiGPyH++ABcv23FUUT8G7/+s1wd2UkyqFGYUUpK9Pk5dbqBnxrJCzav9VH7zKopY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744315345; c=relaxed/simple; bh=w/RU2A4df4biQl18BgwTSbH3JxoDffUQCgnJgrJx7gQ=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; b=bAUHW036em/sROZwQ27eFY3LSu1nBHAjnOaR6FEj/132knN4o9SHPwSfPvNslYa1XNeG0LZ24OfK/TYytjcIEjqiXBi877+MjK0CQdqNeEy6yTjzVHzGy9DW8FS1EvGInX/zaakzRCt3qBLAV3uL2vZxSNnw1IV7omu2wHCdipQ= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=u41KXSbx; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="u41KXSbx" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 32300C4CEF0; Thu, 10 Apr 2025 20:02:23 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744315344; bh=w/RU2A4df4biQl18BgwTSbH3JxoDffUQCgnJgrJx7gQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=u41KXSbxzySGPgWWMEoPvJYRRYjoqHM/CWspk3nBt1++CsW3vOdNM3D/UKeEwWJnM 
K5btupqvestckKK3A8D41pEBnANFWNDLIlyJhlvZPoPGEslJF1w8jU35mJG3J1FmwE MeIztEmilf3Gaz95pSvwBFD23rlH0X1NHEpmG5a/6yGvI1KwTaFvTht9p2DtXIvTDt xfI+P+4JlxKblXSfpgMoWwA8sxNcAJ5sr5GNsW1v2hGjwdu9zE47/hWVQd5/OPsZ/h TOnoqCJPcxTosq1X8j+NEWgQ5edov7ZBraXA6A2+lJUYpWobxrvivIc11pM0Oe7cB8 leUwmVyO2qXnQ== From: Mario Limonciello To: Borislav Petkov , Jean Delvare , Andi Shyti , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Cc: Jonathan Corbet , Mario Limonciello , Yazen Ghannam , Thomas Gleixner , Ingo Molnar , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H . Peter Anvin" , Shyam Sundar S K , Hans de Goede , linux-doc@vger.kernel.org (open list:DOCUMENTATION), linux-kernel@vger.kernel.org (open list), linux-i2c@vger.kernel.org (open list:I2C/SMBUS CONTROLLER DRIVERS FOR PC), platform-driver-x86@vger.kernel.org (open list:AMD PMC DRIVER) Subject: [PATCH v3 3/4] platform/x86/amd: pmc: use FCH_PM_BASE definition Date: Thu, 10 Apr 2025 15:02:01 -0500 Message-ID: <20250410200202.2974062-4-superm1@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250410200202.2974062-1-superm1@kernel.org> References: <20250410200202.2974062-1-superm1@kernel.org> Precedence: bulk X-Mailing-List: platform-driver-x86@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Mario Limonciello The s2idle mmio quirk uses a scratch register in the FCH. Adjust the code to clarify that. 
Signed-off-by: Mario Limonciello --- drivers/platform/x86/amd/pmc/pmc-quirks.c | 5 ++++- 1 file changed, 4 insertions(+), 1 deletion(-) diff --git a/drivers/platform/x86/amd/pmc/pmc-quirks.c b/drivers/platform/x86/amd/pmc/pmc-quirks.c index b4f49720c87f6..9d817209e407f 100644 --- a/drivers/platform/x86/amd/pmc/pmc-quirks.c +++ b/drivers/platform/x86/amd/pmc/pmc-quirks.c @@ -8,19 +8,22 @@ * Author: Mario Limonciello */ +#include #include #include #include #include "pmc.h" +#define FCH_PM_SCRATCH 0x80 + struct quirk_entry { u32 s2idle_bug_mmio; bool spurious_8042; }; static struct quirk_entry quirk_s2idle_bug = { - .s2idle_bug_mmio = 0xfed80380, + .s2idle_bug_mmio = FCH_PM_BASE + FCH_PM_SCRATCH, }; static struct quirk_entry quirk_spurious_8042 = { From patchwork Thu Apr 10 20:02:02 2025 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Mario Limonciello X-Patchwork-Id: 14047264 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 53456293453; Thu, 10 Apr 2025 20:02:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744315348; cv=none; b=lvh8DitAbAUfoi7XFrMIecLoAeJizi0S/1c3zsbTWObzUcRzwB1HktG81TDm/6mvHFjl7RPh+HrmqPUs2gpPCHnW0daIH/vMIEUaOp9XV4ThebIzj8mia+FHM4cItCR38wNTGre2vvbzL+FKy93AzBI1gnRt4T2Ykr22o6CuxnY= ARC-Message-Signature: i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1744315348; c=relaxed/simple; bh=HKokz0uKDHtOCWF8mVU98Ua+9d+BqiXJ5Xr3OER4avA=; h=From:To:Cc:Subject:Date:Message-ID:In-Reply-To:References: MIME-Version; 
b=cKtS+lbp2TqrYkPkAXl2BBD7A8uOTPqMbqnejLeyidz9LLHP+THbbLoKwzcwUn6u+CANd29X8R99LADZ8BHJEZW7sqbj2CKayONiL3Gl7Kl3rqFglzVR6a09cBvgxXuxvfapVLcij35lKa8hBGPi1S8HBoD50w9/dYhgt7JcEJo= ARC-Authentication-Results: i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=UHnCNir2; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="UHnCNir2" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0FF2CC4CEEB; Thu, 10 Apr 2025 20:02:24 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1744315346; bh=HKokz0uKDHtOCWF8mVU98Ua+9d+BqiXJ5Xr3OER4avA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=UHnCNir2w/qC/TUDrEZF6CUuXnAq7z4ttCiWJ7Zl6E4G4rJEXXpWt+vq2GKISyhsO IqItkZJyoRyEZuzecc8VL4x0QTWnbFALNEcjHwd9EvC1iyAysslm7SQsCE9dL4z6d9 0FE58oRWLowJTRzzppEA4JDmElzf1Y3GjsDbrl9ZK038s4lWkpAxmFhx86oPAjhjdU vmRUwPMkNKwJ3Fpen4X3sUgqAuK5ZXaymtAWbKUnwN8+/xDOFjQUoHuJ4XJEg3Uf7G YVbGSupTFtRkeDWSzIJg9wQucb/eEEEcLvcLB8qDASxtfXed2IQOngDwYPDTzxkP5O 5jmb6uFdZEsOw== From: Mario Limonciello To: Borislav Petkov , Jean Delvare , Andi Shyti , =?utf-8?q?Ilpo_J=C3=A4rvinen?= Cc: Jonathan Corbet , Mario Limonciello , Yazen Ghannam , Thomas Gleixner , Ingo Molnar , Dave Hansen , x86@kernel.org (maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)), "H . 
Peter Anvin" , Shyam Sundar S K , Hans de Goede , linux-doc@vger.kernel.org (open list:DOCUMENTATION), linux-kernel@vger.kernel.org (open list), linux-i2c@vger.kernel.org (open list:I2C/SMBUS CONTROLLER DRIVERS FOR PC), platform-driver-x86@vger.kernel.org (open list:AMD PMC DRIVER) Subject: [PATCH v3 4/4] x86/CPU/AMD: Print the reason for the last reset Date: Thu, 10 Apr 2025 15:02:02 -0500 Message-ID: <20250410200202.2974062-5-superm1@kernel.org> X-Mailer: git-send-email 2.43.0 In-Reply-To: <20250410200202.2974062-1-superm1@kernel.org> References: <20250410200202.2974062-1-superm1@kernel.org> Precedence: bulk X-Mailing-List: platform-driver-x86@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 From: Yazen Ghannam The following register contains bits that indicate the cause for the previous reset. PMx000000C0 (FCH::PM::S5_RESET_STATUS) This is useful for debug. The reasons for reset are broken into 6 high level categories. Decode it by category and print during boot. Specifics within a category are split off into debugging documentation. The register is accessed indirectly through a "PM" port in the FCH. Use MMIO access in order to avoid restrictions with legacy port access. Use a late_initcall() to ensure that MMIO has been set up before trying to access the register. This register was introduced with AMD Family 17h, so avoid access on older families. There is no CPUID feature bit for this register. Signed-off-by: Yazen Ghannam Co-developed-by: Mario Limonciello Signed-off-by: Mario Limonciello --- v3: * Align strings in the CSV and code. * Switch to an array of strings * Switch to looking up bit of first value * Re-order message to have number first (makes grepping easier) * Add x86/amd prefix to message v2: * Add string for each reason, but still include value in case multiple values are set. 
---
 Documentation/admin-guide/amd/index.rst | 42 ++++++++++++++++++++
 arch/x86/include/asm/amd_node.h         |  1 +
 arch/x86/kernel/cpu/amd.c               | 51 +++++++++++++++++++++++++
 3 files changed, 94 insertions(+)

diff --git a/Documentation/admin-guide/amd/index.rst b/Documentation/admin-guide/amd/index.rst
index 5a721ab4fe013..c888b192365c5 100644
--- a/Documentation/admin-guide/amd/index.rst
+++ b/Documentation/admin-guide/amd/index.rst
@@ -268,3 +268,45 @@ EPP Policy
 The ``energy_performance_preference`` sysfs file can be used to set a bias
 of efficiency or performance for a CPU. This has a direct relationship on
 the battery life when more heavily biased towards performance.
+
+Random reboot issues
+====================
+When a random reboot occurs, the high-level reason for the reboot is stored
+in a register that will persist onto the next boot.
+
+There are 6 classes of reasons for the reboot:
+ * Software induced
+ * Power state transition
+ * Pin induced
+ * Hardware induced
+ * Remote reset
+ * Internal CPU event
+
+.. csv-table::
+   :header: "Bit", "Type", "Reason"
+   :align: left
+
+   "0",  "Pin",      "thermal pin BP_THERMTRIP_L was tripped"
+   "1",  "Pin",      "power button was pressed for 4 seconds"
+   "2",  "Pin",      "shutdown pin was shorted"
+   "4",  "Remote",   "remote ASF power off command was received"
+   "9",  "Internal", "internal CPU thermal limit was tripped"
+   "16", "Pin",      "system reset pin BP_SYS_RST_L was tripped"
+   "17", "Software", "software issued PCI reset"
+   "18", "Software", "software wrote 0x4 to reset control register 0xCF9"
+   "19", "Software", "software wrote 0x6 to reset control register 0xCF9"
+   "20", "Software", "software wrote 0xE to reset control register 0xCF9"
+   "21", "Sleep",    "ACPI power state transition occurred"
+   "22", "Pin",      "keyboard reset pin KB_RST_L was asserted"
+   "23", "Internal", "internal CPU shutdown event occurred"
+   "24", "Hardware", "system failed to boot before failed boot timer expired"
+   "25", "Hardware", "hardware watchdog timer expired"
+   "26", "Remote",   "remote ASF reset command was received"
+   "27", "Internal", "an uncorrected error caused a data fabric sync flood event"
+   "29", "Internal", "FCH and MP1 failed warm reset handshake"
+   "30", "Internal", "a parity error occurred"
+   "31", "Internal", "a software sync flood event occurred"
+
+This information is read by the kernel at bootup and is saved into the
+kernel ring buffer. When a random reboot occurs this message can be helpful
+to determine the next component to debug such an issue.
diff --git a/arch/x86/include/asm/amd_node.h b/arch/x86/include/asm/amd_node.h
index f4993201834ea..a945d146f1a77 100644
--- a/arch/x86/include/asm/amd_node.h
+++ b/arch/x86/include/asm/amd_node.h
@@ -20,6 +20,7 @@
 #include
 
 #define FCH_PM_BASE		0xFED80300
+#define FCH_PM_S5_RESET_STATUS	0xC0
 
 #define MAX_AMD_NUM_NODES	8
 #define AMD_NODE0_PCI_SLOT	0x18
diff --git a/arch/x86/kernel/cpu/amd.c b/arch/x86/kernel/cpu/amd.c
index 79569f72b8ee5..ddb17f0ad580e 100644
--- a/arch/x86/kernel/cpu/amd.c
+++ b/arch/x86/kernel/cpu/amd.c
@@ -9,6 +9,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -1231,3 +1232,53 @@ void amd_check_microcode(void)
 	if (cpu_feature_enabled(X86_FEATURE_ZEN2))
 		on_each_cpu(zenbleed_check_cpu, NULL, 1);
 }
+
+static const char * const s5_reset_reason_txt[] = {
+	[0]  = "thermal pin BP_THERMTRIP_L was tripped",
+	[1]  = "power button was pressed for 4 seconds",
+	[2]  = "shutdown pin was shorted",
+	[4]  = "remote ASF power off command was received",
+	[9]  = "internal CPU thermal limit was tripped",
+	[16] = "system reset pin BP_SYS_RST_L was tripped",
+	[17] = "software issued PCI reset",
+	[18] = "software wrote 0x4 to reset control register 0xCF9",
+	[19] = "software wrote 0x6 to reset control register 0xCF9",
+	[20] = "software wrote 0xE to reset control register 0xCF9",
+	[21] = "ACPI power state transition occurred",
+	[22] = "keyboard reset pin KB_RST_L was asserted",
+	[23] = "internal CPU shutdown event occurred",
+	[24] = "system failed to boot before failed boot timer expired",
+	[25] = "hardware watchdog timer expired",
+	[26] = "remote ASF reset command was received",
+	[27] = "an uncorrected error caused a data fabric sync flood event",
+	[29] = "FCH and MP1 failed warm reset handshake",
+	[30] = "a parity error occurred",
+	[31] = "a software sync flood event occurred",
+	[32] = "unknown",
+};
+
+static __init int print_s5_reset_status_mmio(void)
+{
+	void __iomem *addr;
+	unsigned long value;
+	int bit = -1;
+
+	if (!cpu_feature_enabled(X86_FEATURE_ZEN))
+		return 0;
+
+	addr = ioremap(FCH_PM_BASE + FCH_PM_S5_RESET_STATUS, sizeof(value));
+	if (!addr)
+		return 0;
+	value = ioread32(addr);
+	iounmap(addr);
+
+	do {
+		bit = find_next_bit(&value, ARRAY_SIZE(s5_reset_reason_txt) - 1, bit + 1);
+	} while (!s5_reset_reason_txt[bit]);
+
+	pr_info("x86/amd: Previous system reset reason [0x%08lx]: %s\n",
+		value, s5_reset_reason_txt[bit]);
+
+	return 0;
+}
+late_initcall(print_s5_reset_status_mmio);
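
The boot message reports only the first set bit that has a reason string, and logs the raw register value so that any other set bits can be decoded by hand. As an illustration only, here is a minimal userspace sketch of how such a logged value might be decoded against the same table. The helper names `s5_next_reason_bit()` and `s5_print_reasons()` and the `S5_RESET_BITS` constant are hypothetical and not part of this patch; the strings are copied from the table above.

```c
#include <stdint.h>
#include <stdio.h>

#define S5_RESET_BITS 32	/* assumption: the status value is 32 bits wide */

/* Reason strings as listed in the patch; bits without an entry stay NULL. */
static const char * const s5_reset_reason_txt[S5_RESET_BITS] = {
	[0]  = "thermal pin BP_THERMTRIP_L was tripped",
	[1]  = "power button was pressed for 4 seconds",
	[2]  = "shutdown pin was shorted",
	[4]  = "remote ASF power off command was received",
	[9]  = "internal CPU thermal limit was tripped",
	[16] = "system reset pin BP_SYS_RST_L was tripped",
	[17] = "software issued PCI reset",
	[18] = "software wrote 0x4 to reset control register 0xCF9",
	[19] = "software wrote 0x6 to reset control register 0xCF9",
	[20] = "software wrote 0xE to reset control register 0xCF9",
	[21] = "ACPI power state transition occurred",
	[22] = "keyboard reset pin KB_RST_L was asserted",
	[23] = "internal CPU shutdown event occurred",
	[24] = "system failed to boot before failed boot timer expired",
	[25] = "hardware watchdog timer expired",
	[26] = "remote ASF reset command was received",
	[27] = "an uncorrected error caused a data fabric sync flood event",
	[29] = "FCH and MP1 failed warm reset handshake",
	[30] = "a parity error occurred",
	[31] = "a software sync flood event occurred",
};

/*
 * Hypothetical helper, not part of the patch: return the lowest bit at or
 * above 'start' that is set in 'value' and has a reason string, or -1 if
 * no such bit remains.
 */
static int s5_next_reason_bit(uint32_t value, int start)
{
	for (int bit = start < 0 ? 0 : start; bit < S5_RESET_BITS; bit++) {
		if ((value & (UINT32_C(1) << bit)) && s5_reset_reason_txt[bit])
			return bit;
	}
	return -1;
}

/*
 * Hypothetical helper: print every decodable reason in a logged value and
 * return how many were printed.
 */
static int s5_print_reasons(uint32_t value)
{
	int count = 0;

	for (int bit = s5_next_reason_bit(value, 0); bit >= 0;
	     bit = s5_next_reason_bit(value, bit + 1)) {
		printf("bit %2d: %s\n", bit, s5_reset_reason_txt[bit]);
		count++;
	}
	return count;
}
```

For a logged value of 0x00080800, `s5_next_reason_bit(value, 0)` returns 19: bit 11 is set but has no entry in the table, so it is skipped. Returning -1 when nothing decodable is left keeps the search inside the 32 bits of the register, which is why the sketch does not need an "unknown" sentinel entry.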