PCI/PM: Print the pci config space of devices before suspend

Message ID 20200113060724.19571-1-yu.c.chen@intel.com (mailing list archive)
State Not Applicable, archived
Series PCI/PM: Print the pci config space of devices before suspend

Commit Message

Chen Yu Jan. 13, 2020, 6:07 a.m. UTC
The pci config space was found to be insane during resume
from hibernation (S4, or suspend to disk) on a VM:

 serial 0000:00:16.3: restoring config space at offset 0x14
 (was 0x9104e000, writing 0xffffffff)

Either the snapshot on the disk has been scribbled or the pci
config space becomes invalid before suspend. To narrow down
and benefit future debugging, print the pci config space
being saved before suspend, which is symmetric to the log
in pci_restore_config_dword().

Cc: Bjorn Helgaas <bhelgaas@google.com>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Len Brown <lenb@kernel.org>
Cc: linux-pci@vger.kernel.org
Cc: linux-pm@vger.kernel.org
Cc: linux-kernel@vger.kernel.org
Signed-off-by: Chen Yu <yu.c.chen@intel.com>
---
 drivers/pci/pci.c | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

Comments

Rafael J. Wysocki Jan. 13, 2020, 10:01 a.m. UTC | #1
On Mon, Jan 13, 2020 at 7:08 AM Chen Yu <yu.c.chen@intel.com> wrote:
>
> The pci config space was found to be insane during resume

I wouldn't call it "insane".

It probably means that the device was not present or not accessible
during hibernation and now it appears to be present (maybe the restore
kernel found it and configured it).

> from hibernation(S4, or suspend to disk) on a VM:
>
>  serial 0000:00:16.3: restoring config space at offset 0x14
>  (was 0x9104e000, writing 0xffffffff)
>
> Either the snapshot on the disk has been scribbled or the pci
> config space becomes invalid before suspend.

Or, most likely, the above.

> To narrow down and benefit future debugging, print the pci config space
> being saved before suspend, which is symmetric to the log
> in pci_restore_config_dword().

But the code change makes sense to me.

Bjorn Helgaas Jan. 13, 2020, 9:45 p.m. UTC | #2
On Mon, Jan 13, 2020 at 02:07:24PM +0800, Chen Yu wrote:
> The pci config space was found to be insane during resume
> from hibernation(S4, or suspend to disk) on a VM:
> 
>  serial 0000:00:16.3: restoring config space at offset 0x14
>  (was 0x9104e000, writing 0xffffffff)
> 
> Either the snapshot on the disk has been scribbled or the pci
> config space becomes invalid before suspend. To narrow down
> and benefit future debugging, print the pci config space
> being saved before suspend, which is symmetric to the log
> in pci_restore_config_dword().
> 
> Cc: Bjorn Helgaas <bhelgaas@google.com>
> Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>
> Cc: Len Brown <lenb@kernel.org>
> Cc: linux-pci@vger.kernel.org
> Cc: linux-pm@vger.kernel.org
> Cc: linux-kernel@vger.kernel.org
> Signed-off-by: Chen Yu <yu.c.chen@intel.com>

Applied to pci/pm for v5.6, thanks!

Chen Yu Jan. 14, 2020, 3:04 a.m. UTC | #3
Hi Rafael,
On Mon, Jan 13, 2020 at 11:01:28AM +0100, Rafael J. Wysocki wrote:
> On Mon, Jan 13, 2020 at 7:08 AM Chen Yu <yu.c.chen@intel.com> wrote:
> >
> > The pci config space was found to be insane during resume
> 
> I wouldn't call it "insane".
> 
> It probably means that the device was not present or not accessible
> during hibernation and now it appears to be present (maybe the restore
> kernel found it and configured it).
> 
Right, thanks for the hint. If this is the case, we should not
save any PCI config settings when the device is not accessible,
otherwise there is a risk of a hard PCI config conflict after resume.
I've applied the patch and am waiting for the issue to reproduce
(it does not happen 100% of the time); I will send the result later.

Thanks,
Chenyu
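
The idea raised in this reply, not saving config space when the device is not accessible, can be illustrated with a small user-space sketch. It is not kernel code and not part of the patch. It assumes that a read from an absent or inaccessible PCI function returns all ones, so a first dword (vendor/device ID) of 0xffffffff is treated as "not accessible". The names cfg_looks_inaccessible() and save_if_accessible() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define CFG_DWORDS 16	/* first 64 bytes of config space, as in pci_save_state() */

/*
 * Hypothetical helper: config reads from an absent or inaccessible
 * function typically come back as all ones, so an all-ones ID dword
 * is taken to mean "do not trust this read".
 */
static bool cfg_looks_inaccessible(uint32_t id_dword)
{
	return id_dword == 0xffffffffu;
}

/*
 * Hypothetical helper: copy the live dwords into the saved buffer only
 * when the device looks accessible. Returns true if the state was saved,
 * false if the save was skipped and the buffer was left untouched.
 */
static bool save_if_accessible(const uint32_t live[CFG_DWORDS],
			       uint32_t saved[CFG_DWORDS])
{
	assert(live != NULL && saved != NULL);

	if (cfg_looks_inaccessible(live[0]))
		return false;

	memcpy(saved, live, CFG_DWORDS * sizeof(uint32_t));
	return true;
}
```

Whether skipping the save is the right policy for the kernel is a separate question that this thread leaves open; the sketch only shows the check itself.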

Patch

diff --git a/drivers/pci/pci.c b/drivers/pci/pci.c
index e87196cc1a7f..34cde70440c3 100644
--- a/drivers/pci/pci.c
+++ b/drivers/pci/pci.c
@@ -1372,8 +1372,11 @@  int pci_save_state(struct pci_dev *dev)
 {
 	int i;
 	/* XXX: 100% dword access ok here? */
-	for (i = 0; i < 16; i++)
+	for (i = 0; i < 16; i++) {
 		pci_read_config_dword(dev, i * 4, &dev->saved_config_space[i]);
+		pci_dbg(dev, "saving config space at offset %#x (reading %#x)\n",
+			i * 4, dev->saved_config_space[i]);
+	}
 	dev->state_saved = true;
 
 	i = pci_save_pcie_state(dev);
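
To show how the new save-side message pairs with the existing restore-side message, here is a minimal user-space simulation. It is a sketch under stated assumptions, not the kernel implementation: struct fake_dev, fake_save_state() and fake_restore_state() are hypothetical stand-ins, fprintf() replaces pci_dbg(), and the restore message format is copied from the log line quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define CFG_DWORDS 16	/* first 64 bytes of config space, as in the patch */

/* Hypothetical stand-in for struct pci_dev; not the kernel structure. */
struct fake_dev {
	uint32_t config[CFG_DWORDS];		/* simulated live config space */
	uint32_t saved_config[CFG_DWORDS];	/* simulated saved_config_space */
};

/* Save all 16 dwords and log each value read; returns the lines logged. */
static int fake_save_state(struct fake_dev *dev, FILE *log)
{
	int i;

	assert(dev != NULL && log != NULL);
	for (i = 0; i < CFG_DWORDS; i++) {
		dev->saved_config[i] = dev->config[i];
		fprintf(log, "saving config space at offset %#x (reading %#x)\n",
			(unsigned int)(i * 4),
			(unsigned int)dev->saved_config[i]);
	}
	return CFG_DWORDS;
}

/* Write back only the dwords that differ, logging each write. */
static int fake_restore_state(struct fake_dev *dev, FILE *log)
{
	int i, writes = 0;

	assert(dev != NULL && log != NULL);
	for (i = 0; i < CFG_DWORDS; i++) {
		if (dev->config[i] == dev->saved_config[i])
			continue;
		fprintf(log,
			"restoring config space at offset %#x (was %#x, writing %#x)\n",
			(unsigned int)(i * 4),
			(unsigned int)dev->config[i],
			(unsigned int)dev->saved_config[i]);
		dev->config[i] = dev->saved_config[i];
		writes++;
	}
	return writes;
}
```

With both messages available, a bad value at resume can be traced back: if the save-side log already shows 0xffffffff at offset 0x14, the config space was unreadable before suspend; if it shows a sane value, the saved image is the more likely suspect.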