diff mbox series

[v2] PCI: Disable LTR on suspend for SSSTC 0x9100

Message ID 20220317131008.6395-1-mario.limonciello@amd.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show
Series [v2] PCI: Disable LTR on suspend for SSSTC 0x9100 | expand

Commit Message

Mario Limonciello March 17, 2022, 1:10 p.m. UTC
Some drives from SSSTC are showing stability problems after s0i3
entry when the Linux kernel is in s2idle loop if LTR has been
enabled. This leads to failures to resume.

This appears to be a firmware issue specific to SSSTC SSDs, so quirk
them to avoid LTR being enabled during the suspend process.

Link: https://lore.kernel.org/linux-nvme/20220315072233.GA2288@lst.de/T/#mb9b5782220a32e2c69fe37cf04ae1501b0f48221
Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
---
changes from v1->v2:
 * Move into a PCI quirk, handle entirely in PCI subsystem

 drivers/pci/quirks.c    | 17 +++++++++++++++++
 include/linux/pci_ids.h |  2 ++
 2 files changed, 19 insertions(+)

Comments

Mario Limonciello March 18, 2022, 2:43 a.m. UTC | #1
+Patrick Huang

On 3/17/22 08:10, Mario Limonciello wrote:
> Some drives from SSSTC are showing stability problems after s0i3
> entry when the Linux kernel is in s2idle loop if LTR has been
> enabled. This leads to failures to resume.
> 
> This appears to be a firmware issue specific to SSSTC SSDs, so quirk
> them to avoid LTR being enabled during the suspend process.
> 
> Link: https://lore.kernel.org/linux-nvme/20220315072233.GA2288@lst.de/T/#mb9b5782220a32e2c69fe37cf04ae1501b0f48221
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> changes from v1->v2:
>   * Move into a PCI quirk, handle entirely in PCI subsystem
> 

Patrick has been running stress tests with this patch and reported a 
failure.  When it's figured out we will re-submit V3.

If any concerns to this approach in V2 though, comments still welcome.

Thanks,

>   drivers/pci/quirks.c    | 17 +++++++++++++++++
>   include/linux/pci_ids.h |  2 ++
>   2 files changed, 19 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index d2dd6a6cda60..005142d574e7 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5879,3 +5879,20 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1533, rom_bar_overlap_defect);
>   DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1536, rom_bar_overlap_defect);
>   DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1537, rom_bar_overlap_defect);
>   DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1538, rom_bar_overlap_defect);
> +
> +/*
> + * SSSTC firmware will continue to send LTR requests after device has entered D3
> + *
> + * This behavior causes problems when entering/exit s2idle, so avoid letting LTR
> + * be enabled during suspend.
> + */
> +static void ssstc_disable_ltr(struct pci_dev *pdev)
> +{
> +	pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2, PCI_EXP_DEVCTL2_LTR_EN);
> +}
> +static void ssstc_enable_ltr(struct pci_dev *pdev)
> +{
> +	pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2, PCI_EXP_DEVCTL2_LTR_EN);
> +}
> +DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_SSSTC, 0x9100, ssstc_disable_ltr);
> +DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_SSSTC, 0x9100, ssstc_enable_ltr);
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
> index 130949c3b486..ab47ccdd2ece 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2568,6 +2568,8 @@
>   #define PCI_VENDOR_ID_TEKRAM		0x1de1
>   #define PCI_DEVICE_ID_TEKRAM_DC290	0xdc29
>   
> +#define PCI_VENDOR_ID_SSSTC		0x1e95
> +
>   #define PCI_VENDOR_ID_TEHUTI		0x1fc9
>   #define PCI_DEVICE_ID_TEHUTI_3009	0x3009
>   #define PCI_DEVICE_ID_TEHUTI_3010	0x3010
Huang, Patrick March 18, 2022, 11:30 a.m. UTC | #2
[AMD Official Use Only]

Hi all,
After retrying suspend with this patch, get suspend test successfully and run over 1800 cycles
I think that is test environment error before.
Thanks

Best Regard,
Patrick
-----Original Message-----
From: Limonciello, Mario <Mario.Limonciello@amd.com> 
Sent: Friday, March 18, 2022 10:43 AM
To: Bjorn Helgaas <bhelgaas@google.com>; open list:PCI SUBSYSTEM <linux-pci@vger.kernel.org>; open list <linux-kernel@vger.kernel.org>; Huang, Patrick <Patrick.Huang@amd.com>
Subject: Re: [PATCH v2] PCI: Disable LTR on suspend for SSSTC 0x9100

+Patrick Huang

On 3/17/22 08:10, Mario Limonciello wrote:
> Some drives from SSSTC are showing stability problems after s0i3 entry 
> when the Linux kernel is in s2idle loop if LTR has been enabled. This 
> leads to failures to resume.
> 
> This appears to be a firmware issue specific to SSSTC SSDs, so quirk 
> them to avoid LTR being enabled during the suspend process.
> 
> Link: 
> https://lore.kernel.org/linux-nvme/20220315072233.GA2288@lst.de/T/#mb9
> b5782220a32e2c69fe37cf04ae1501b0f48221
> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
> ---
> changes from v1->v2:
>   * Move into a PCI quirk, handle entirely in PCI subsystem
> 

Patrick has been running stress tests with this patch and reported a failure.  When it's figured out we will re-submit V3.

If any concerns to this approach in V2 though, comments still welcome.

Thanks,

>   drivers/pci/quirks.c    | 17 +++++++++++++++++
>   include/linux/pci_ids.h |  2 ++
>   2 files changed, 19 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c index 
> d2dd6a6cda60..005142d574e7 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -5879,3 +5879,20 @@ DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1533, rom_bar_overlap_defect);
>   DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1536, rom_bar_overlap_defect);
>   DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1537, rom_bar_overlap_defect);
>   DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1538, 
> rom_bar_overlap_defect);
> +
> +/*
> + * SSSTC firmware will continue to send LTR requests after device has 
> +entered D3
> + *
> + * This behavior causes problems when entering/exit s2idle, so avoid 
> +letting LTR
> + * be enabled during suspend.
> + */
> +static void ssstc_disable_ltr(struct pci_dev *pdev) {
> +	pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2, 
> +PCI_EXP_DEVCTL2_LTR_EN); } static void ssstc_enable_ltr(struct 
> +pci_dev *pdev) {
> +	pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2, 
> +PCI_EXP_DEVCTL2_LTR_EN); } 
> +DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_SSSTC, 0x9100, 
> +ssstc_disable_ltr); DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_SSSTC, 
> +0x9100, ssstc_enable_ltr);
> diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h index 
> 130949c3b486..ab47ccdd2ece 100644
> --- a/include/linux/pci_ids.h
> +++ b/include/linux/pci_ids.h
> @@ -2568,6 +2568,8 @@
>   #define PCI_VENDOR_ID_TEKRAM		0x1de1
>   #define PCI_DEVICE_ID_TEKRAM_DC290	0xdc29
>   
> +#define PCI_VENDOR_ID_SSSTC		0x1e95
> +
>   #define PCI_VENDOR_ID_TEHUTI		0x1fc9
>   #define PCI_DEVICE_ID_TEHUTI_3009	0x3009
>   #define PCI_DEVICE_ID_TEHUTI_3010	0x3010
diff mbox series

Patch

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index d2dd6a6cda60..005142d574e7 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -5879,3 +5879,20 @@  DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1533, rom_bar_overlap_defect);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1536, rom_bar_overlap_defect);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1537, rom_bar_overlap_defect);
 DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_INTEL, 0x1538, rom_bar_overlap_defect);
+
+/*
+ * SSSTC firmware will continue to send LTR requests after device has entered D3
+ *
+ * This behavior causes problems when entering/exit s2idle, so avoid letting LTR
+ * be enabled during suspend.
+ */
+static void ssstc_disable_ltr(struct pci_dev *pdev)
+{
+	pcie_capability_clear_word(pdev, PCI_EXP_DEVCTL2, PCI_EXP_DEVCTL2_LTR_EN);
+}
+static void ssstc_enable_ltr(struct pci_dev *pdev)
+{
+	pcie_capability_set_word(pdev, PCI_EXP_DEVCTL2, PCI_EXP_DEVCTL2_LTR_EN);
+}
+DECLARE_PCI_FIXUP_SUSPEND(PCI_VENDOR_ID_SSSTC, 0x9100, ssstc_disable_ltr);
+DECLARE_PCI_FIXUP_RESUME(PCI_VENDOR_ID_SSSTC, 0x9100, ssstc_enable_ltr);
diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h
index 130949c3b486..ab47ccdd2ece 100644
--- a/include/linux/pci_ids.h
+++ b/include/linux/pci_ids.h
@@ -2568,6 +2568,8 @@ 
 #define PCI_VENDOR_ID_TEKRAM		0x1de1
 #define PCI_DEVICE_ID_TEKRAM_DC290	0xdc29
 
+#define PCI_VENDOR_ID_SSSTC		0x1e95
+
 #define PCI_VENDOR_ID_TEHUTI		0x1fc9
 #define PCI_DEVICE_ID_TEHUTI_3009	0x3009
 #define PCI_DEVICE_ID_TEHUTI_3010	0x3010