diff mbox

Subject : [ PATCH ] pci-reset-error_state-to-pci_channel_io_normal-at-report_slot_reset

Message ID F7B8FD780A346D46A0042F5C63B06AE785E3B2@SHSMSX102.ccr.corp.intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Zhang, LongX April 26, 2013, 6:28 a.m. UTC
From: Zhang Long <longx.zhang@intel.com>

Specific pci device drivers might have many functions to call
pci_channel_offline to check device states. When slot_reset happens,
drivers' slot_reset callback might call such functions and eventually
abort the reset.

The patch resets pdev->error_state to pci_channel_io_normal at
the begining of report_slot_reset.

Thank Liu Joseph for pointing it out.

Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
Signed-off-by: Zhang Long <longx.zhang@intel.com>
---
 drivers/pci/pcie/aer/aerdrv_core.c |    1 +
 drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
 2 files changed, 6 insertions(+), 7 deletions(-)

Comments

Yanmin Zhang May 2, 2013, 12:30 a.m. UTC | #1
On Fri, 2013-04-26 at 06:28 +0000, Zhang, LongX wrote:
> From: Zhang Long <longx.zhang@intel.com>
> 
> Specific pci device drivers might have many functions to call
> pci_channel_offline to check device states. When slot_reset happens,
> drivers' slot_reset callback might call such functions and eventually
> abort the reset.
> 
> The patch resets pdev->error_state to pci_channel_io_normal at
> the begining of report_slot_reset.
> 
> Thank Liu Joseph for pointing it out.
Linas, Bjorn,

Would you like to merge the patch to your testing tree?
The patch resolves one issue pointed out by Joseph.

Thanks,
Yanmin


> 
> Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
> Signed-off-by: Zhang Long <longx.zhang@intel.com>
> ---
>  drivers/pci/pcie/aer/aerdrv_core.c |    1 +
>  drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
>  2 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 564d97f..c61fd44 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>  	result_data = (struct aer_broadcast_data *) data;
>  
>  	device_lock(&dev->dev);
> +	dev->error_state = pci_channel_io_normal;
>  	if (!dev->driver ||
>  		!dev->driver->err_handler ||
>  		!dev->driver->err_handler->slot_reset)
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index ed4d094..7abefd9 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>  	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>  	int retval;
>  
> -	/* If fatal, restore cfg space for possible link reset at upstream */
> -	if (dev->error_state == pci_channel_io_frozen) {
> -		dev->state_saved = true;
> -		pci_restore_state(dev);
> -		pcie_portdrv_restore_config(dev);
> -		pci_enable_pcie_error_reporting(dev);
> -	}
> +	/* restore cfg space for possible link reset at upstream */
> +	dev->state_saved = true;
> +	pci_restore_state(dev);
> +	pcie_portdrv_restore_config(dev);
> +	pci_enable_pcie_error_reporting(dev);
>  
>  	/* get true return value from &status */
>  	retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yanmin Zhang May 3, 2013, 12:33 a.m. UTC | #2
On Wed, 2013-05-01 at 20:20 -0500, Linas Vepstas wrote:
> Hi,
> 
> On 1 May 2013 19:30, Yanmin Zhang <yanmin_zhang@linux.intel.com>
> wrote:
>         On Fri, 2013-04-26 at 06:28 +0000, Zhang, LongX wrote:
>         > From: Zhang Long <longx.zhang@intel.com>
>         >
>         > Specific pci device drivers might have many functions to
>         call
>         > pci_channel_offline to check device states. When slot_reset
>         happens,
>         > drivers' slot_reset callback might call such functions and
>         eventually
>         > abort the reset.
>         >
>         > The patch resets pdev->error_state to pci_channel_io_normal
>         at
>         > the begining of report_slot_reset.
>         >
>         > Thank Liu Joseph for pointing it out.
>         
>         Linas, Bjorn,
>         
>         Would you like to merge the patch to your testing tree?
>         The patch resolves one issue pointed out by Joseph.
> 
> 
> I'm not maintaining a tree at this time, and am not able to test.
> Sorry.
Thanks Linas.

Greg,

Would you like to merge it into your testing tree? Joseph Liu tested
the patch and it does work.

Thanks,
Yanmin



--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Greg KH May 3, 2013, 2 a.m. UTC | #3
On Fri, May 03, 2013 at 08:33:00AM +0800, Yanmin Zhang wrote:
> On Wed, 2013-05-01 at 20:20 -0500, Linas Vepstas wrote:
> > Hi,
> > 
> > On 1 May 2013 19:30, Yanmin Zhang <yanmin_zhang@linux.intel.com>
> > wrote:
> >         On Fri, 2013-04-26 at 06:28 +0000, Zhang, LongX wrote:
> >         > From: Zhang Long <longx.zhang@intel.com>
> >         >
> >         > Specific pci device drivers might have many functions to
> >         call
> >         > pci_channel_offline to check device states. When slot_reset
> >         happens,
> >         > drivers' slot_reset callback might call such functions and
> >         eventually
> >         > abort the reset.
> >         >
> >         > The patch resets pdev->error_state to pci_channel_io_normal
> >         at
> >         > the begining of report_slot_reset.
> >         >
> >         > Thank Liu Joseph for pointing it out.
> >         
> >         Linas, Bjorn,
> >         
> >         Would you like to merge the patch to your testing tree?
> >         The patch resolves one issue pointed out by Joseph.
> > 
> > 
> > I'm not maintaining a tree at this time, and am not able to test.
> > Sorry.
> Thanks Linas.
> 
> Greg,
> 
> Would you like to merge it into your testing tree? Joseph Liu tested
> the patch and it does work.

You do know about the scripts/get_maintainer.pl script, right?

Hint, try it out :)
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yanmin Zhang May 3, 2013, 3:13 a.m. UTC | #4
On Thu, 2013-05-02 at 19:00 -0700, Greg Kroah-Hartman wrote:
> On Fri, May 03, 2013 at 08:33:00AM +0800, Yanmin Zhang wrote:
> > On Wed, 2013-05-01 at 20:20 -0500, Linas Vepstas wrote:
> > > Hi,
> > > 
> > > On 1 May 2013 19:30, Yanmin Zhang <yanmin_zhang@linux.intel.com>
> > > wrote:
> > >         On Fri, 2013-04-26 at 06:28 +0000, Zhang, LongX wrote:
> > >         > From: Zhang Long <longx.zhang@intel.com>
> > >         >
> > >         > Specific pci device drivers might have many functions to
> > >         call
> > >         > pci_channel_offline to check device states. When slot_reset
> > >         happens,
> > >         > drivers' slot_reset callback might call such functions and
> > >         eventually
> > >         > abort the reset.
> > >         >
> > >         > The patch resets pdev->error_state to pci_channel_io_normal
> > >         at
> > >         > the begining of report_slot_reset.
> > >         >
> > >         > Thank Liu Joseph for pointing it out.
> > >         
> > >         Linas, Bjorn,
> > >         
> > >         Would you like to merge the patch to your testing tree?
> > >         The patch resolves one issue pointed out by Joseph.
> > > 
> > > 
> > > I'm not maintaining a tree at this time, and am not able to test.
> > > Sorry.
> > Thanks Linas.
> > 
> > Greg,
> > 
> > Would you like to merge it into your testing tree? Joseph Liu tested
> > the patch and it does work.
> 
> You do know about the scripts/get_maintainer.pl script, right?
> 
> Hint, try it out :)
Greg,

Thanks. We did send to the right people, Linas and Bjorn.

scripts/get_maintainer.pl 0001-pci-reset-error_state-to-pci_channel_io_normal-at-re.patch
Bjorn Helgaas <bhelgaas@google.com> (supporter:PCI SUBSYSTEM,commit_signer:7/8=88%,commit_signer:10/11=91%)
Linas Vepstas <linasvepstas@gmail.com> (commit_signer:2/8=25%)
Yijing Wang <wangyijing@huawei.com> (commit_signer:2/8=25%,commit_signer:2/11=18%)
Jiang Liu <jiang.liu@huawei.com> (commit_signer:2/8=25%,commit_signer:2/11=18%)
Stephen Hemminger <shemminger@vyatta.com> (commit_signer:1/8=12%)
"Rafael J. Wysocki" <rafael.j.wysocki@intel.com> (commit_signer:6/11=55%)
Huang Ying <ying.huang@intel.com> (commit_signer:5/11=45%)
linux-pci@vger.kernel.org (open list:PCI SUBSYSTEM)
linux-kernel@vger.kernel.org (open list)


I remember Jesse was maintaining PCI subsystem and he responded quickly.

Yanmin


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linas Vepstas May 3, 2013, 6 p.m. UTC | #5
Greg,

I've been receiving (and reading!) these messages; I replied that I am
not maintaining a tree, and have no way of testing these patches (no
access to hardware.)  I believe it unlikely that my situation will
change, and so I should probably be removed from the maintainers file.

No acked-by or anything; the patches are not "obviously correct" by
visual inspection; they may be right, but would require deeper
thinking about what is actually happening than I'm capable of at this
time; I'm a bit rusty with this code base, and might miss something
important.

-- Linas

p.s. you didn't see my earlier reply because I forgot to hit 'plain text reply'


On 2 May 2013 22:13, Yanmin Zhang <yanmin_zhang@linux.intel.com> wrote:
>
> On Thu, 2013-05-02 at 19:00 -0700, Greg Kroah-Hartman wrote:
> > On Fri, May 03, 2013 at 08:33:00AM +0800, Yanmin Zhang wrote:
> > > On Wed, 2013-05-01 at 20:20 -0500, Linas Vepstas wrote:
> > > > Hi,
> > > >
> > > > On 1 May 2013 19:30, Yanmin Zhang <yanmin_zhang@linux.intel.com>
> > > > wrote:
> > > >         On Fri, 2013-04-26 at 06:28 +0000, Zhang, LongX wrote:
> > > >         > From: Zhang Long <longx.zhang@intel.com>
> > > >         >
> > > >         > Specific pci device drivers might have many functions to
> > > >         call
> > > >         > pci_channel_offline to check device states. When slot_reset
> > > >         happens,
> > > >         > drivers' slot_reset callback might call such functions and
> > > >         eventually
> > > >         > abort the reset.
> > > >         >
> > > >         > The patch resets pdev->error_state to pci_channel_io_normal
> > > >         at
> > > >         > the begining of report_slot_reset.
> > > >         >
> > > >         > Thank Liu Joseph for pointing it out.
> > > >
> > > >         Linas, Bjorn,
> > > >
> > > >         Would you like to merge the patch to your testing tree?
> > > >         The patch resolves one issue pointed out by Joseph.
> > > >
> > > >
> > > > I'm not maintaining a tree at this time, and am not able to test.
> > > > Sorry.
> > > Thanks Linas.
> > >
> > > Greg,
> > >
> > > Would you like to merge it into your testing tree? Joseph Liu tested
> > > the patch and it does work.
> >
> > You do know about the scripts/get_maintainer.pl script, right?
> >
> > Hint, try it out :)
> Greg,
>
> Thanks. We did send to the right people, Linas and Bjorn.
>
> scripts/get_maintainer.pl 0001-pci-reset-error_state-to-pci_channel_io_normal-at-re.patch
> Bjorn Helgaas <bhelgaas@google.com> (supporter:PCI SUBSYSTEM,commit_signer:7/8=88%,commit_signer:10/11=91%)
> Linas Vepstas <linasvepstas@gmail.com> (commit_signer:2/8=25%)
> Yijing Wang <wangyijing@huawei.com> (commit_signer:2/8=25%,commit_signer:2/11=18%)
> Jiang Liu <jiang.liu@huawei.com> (commit_signer:2/8=25%,commit_signer:2/11=18%)
> Stephen Hemminger <shemminger@vyatta.com> (commit_signer:1/8=12%)
> "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> (commit_signer:6/11=55%)
> Huang Ying <ying.huang@intel.com> (commit_signer:5/11=45%)
> linux-pci@vger.kernel.org (open list:PCI SUBSYSTEM)
> linux-kernel@vger.kernel.org (open list)
>
>
> I remember Jesse was maintaining PCI subsystem and he responded quickly.
>
> Yanmin
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yanmin Zhang May 7, 2013, 6:01 a.m. UTC | #6
On Fri, 2013-05-03 at 13:00 -0500, Linas Vepstas wrote:
> Greg,
> 
> I've been receiving (and reading!) these messages; I replied that I am
> not maintaining a tree, and have no way of testing these patches (no
> access to hardware.)  I believe it unlikely that my situation will
> change, and so I should probably be removed from the maintainers file.
> 
> No acked-by or anything; the patches are not "obviously correct" by
> visual inspection; they may be right, but would require deeper
> thinking about what is actually happening than I'm capable of at this
> time; I'm a bit rusty with this code base, and might miss something
> important
powerpc uses the similar method. See function eeh_report_reset.
We worked out the patch after checking powerpc codes.

Joseph Liu helped test the patch and the patch does work well.

> .
> 
> -- Linas
> 
> p.s. you didn't see my earlier reply because I forgot to hit 'plain text reply'
> 
> 
> On 2 May 2013 22:13, Yanmin Zhang <yanmin_zhang@linux.intel.com> wrote:
> >
> > On Thu, 2013-05-02 at 19:00 -0700, Greg Kroah-Hartman wrote:
> > > On Fri, May 03, 2013 at 08:33:00AM +0800, Yanmin Zhang wrote:
> > > > On Wed, 2013-05-01 at 20:20 -0500, Linas Vepstas wrote:
> > > > > Hi,
> > > > >
> > > > > On 1 May 2013 19:30, Yanmin Zhang <yanmin_zhang@linux.intel.com>
> > > > > wrote:
> > > > >         On Fri, 2013-04-26 at 06:28 +0000, Zhang, LongX wrote:
> > > > >         > From: Zhang Long <longx.zhang@intel.com>
> > > > >         >
> > > > >         > Specific pci device drivers might have many functions to
> > > > >         call
> > > > >         > pci_channel_offline to check device states. When slot_reset
> > > > >         happens,
> > > > >         > drivers' slot_reset callback might call such functions and
> > > > >         eventually
> > > > >         > abort the reset.
> > > > >         >
> > > > >         > The patch resets pdev->error_state to pci_channel_io_normal
> > > > >         at
> > > > >         > the begining of report_slot_reset.
> > > > >         >
> > > > >         > Thank Liu Joseph for pointing it out.
> > > > >
> > > > >         Linas, Bjorn,
> > > > >
> > > > >         Would you like to merge the patch to your testing tree?
> > > > >         The patch resolves one issue pointed out by Joseph.
> > > > >
> > > > >
> > > > > I'm not maintaining a tree at this time, and am not able to test.
> > > > > Sorry.
> > > > Thanks Linas.
> > > >
> > > > Greg,
> > > >
> > > > Would you like to merge it into your testing tree? Joseph Liu tested
> > > > the patch and it does work.
> > >
> > > You do know about the scripts/get_maintainer.pl script, right?
> > >
> > > Hint, try it out :)
> > Greg,
> >
> > Thanks. We did send to the right people, Linas and Bjorn.
> >
> > scripts/get_maintainer.pl 0001-pci-reset-error_state-to-pci_channel_io_normal-at-re.patch
> > Bjorn Helgaas <bhelgaas@google.com> (supporter:PCI SUBSYSTEM,commit_signer:7/8=88%,commit_signer:10/11=91%)
> > Linas Vepstas <linasvepstas@gmail.com> (commit_signer:2/8=25%)
> > Yijing Wang <wangyijing@huawei.com> (commit_signer:2/8=25%,commit_signer:2/11=18%)
> > Jiang Liu <jiang.liu@huawei.com> (commit_signer:2/8=25%,commit_signer:2/11=18%)
> > Stephen Hemminger <shemminger@vyatta.com> (commit_signer:1/8=12%)
> > "Rafael J. Wysocki" <rafael.j.wysocki@intel.com> (commit_signer:6/11=55%)
> > Huang Ying <ying.huang@intel.com> (commit_signer:5/11=45%)
> > linux-pci@vger.kernel.org (open list:PCI SUBSYSTEM)
> > linux-kernel@vger.kernel.org (open list)
> >
> >
> > I remember Jesse was maintaining PCI subsystem and he responded quickly.
> >
> > Yanmin
> >
> >


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas May 17, 2013, 11:43 p.m. UTC | #7
[+cc Rafael because he knows about dev->state_saved]

Sorry, I'm not very familiar with AER, so please excuse some naive
questions below.

On Fri, Apr 26, 2013 at 12:28 AM, Zhang, LongX <longx.zhang@intel.com> wrote:
> From: Zhang Long <longx.zhang@intel.com>
>
> Specific pci device drivers might have many functions to call
> pci_channel_offline to check device states. When slot_reset happens,
> drivers' slot_reset callback might call such functions and eventually
> abort the reset.

Where does this happen?  I looked at all the references to
dev->error_state and all the callers of pci_channel_offline(), and I
didn't see any in .slot_reset() methods.

(There are *assignments* to dev->error_state in qlcnic_attach_func(),
qlge_io_slot_reset(), and qla2xxx_pci_slot_reset().  You might be able
to remove those assignments after this patch, but this patch wouldn't
really change anything for those paths.)

> The patch resets pdev->error_state to pci_channel_io_normal at
> the begining of report_slot_reset.

> Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
> Signed-off-by: Zhang Long <longx.zhang@intel.com>
> ---
>  drivers/pci/pcie/aer/aerdrv_core.c |    1 +
>  drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
>  2 files changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 564d97f..c61fd44 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>         result_data = (struct aer_broadcast_data *) data;
>
>         device_lock(&dev->dev);
> +       dev->error_state = pci_channel_io_normal;

The device's error_state might be pci_channel_io_frozen when we get
here.  We haven't touched anything in the hardware yet.  What makes
the device unfrozen now?  Did anything actually change as far as the
hardware device is concerned?

I agree it looks like report_slot_reset() should be made more like
eeh_report_reset().  I'm just wondering if the error_state should be
changed *after* calling the .slot_reset() method instead of before.

>         if (!dev->driver ||
>                 !dev->driver->err_handler ||
>                 !dev->driver->err_handler->slot_reset)
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index ed4d094..7abefd9 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>         pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>         int retval;
>
> -       /* If fatal, restore cfg space for possible link reset at upstream */
> -       if (dev->error_state == pci_channel_io_frozen) {
> -               dev->state_saved = true;
> -               pci_restore_state(dev);
> -               pcie_portdrv_restore_config(dev);
> -               pci_enable_pcie_error_reporting(dev);
> -       }

Previously we only restored state for the pci_channel_io_frozen state,
i.e., when handling an AER_FATAL error.  Now we restore it always.
Why?

> +       /* restore cfg space for possible link reset at upstream */
> +       dev->state_saved = true;

"dev->state_saved == true" means that the dev->saved_config_space
contains valid data.  Why do we know that's the case here?  I see that
pcie_portdrv_probe() calls pci_save_state() when we first claim the
port, and I guess we're assuming the state saved then is still valid.
But why do we need to actually set dev->state_saved here?  Shouldn't
it be already set to true anyway?

> +       pci_restore_state(dev);
> +       pcie_portdrv_restore_config(dev);
> +       pci_enable_pcie_error_reporting(dev);
>
>         /* get true return value from &status */
>         retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);
> --
> 1.7.4.1
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Rafael Wysocki May 17, 2013, 11:56 p.m. UTC | #8
On Friday, May 17, 2013 05:43:33 PM Bjorn Helgaas wrote:
> [+cc Rafael because he knows about dev->state_saved]
> 
> Sorry, I'm not very familiar with AER, so please excuse some naive
> questions below.
> 
> On Fri, Apr 26, 2013 at 12:28 AM, Zhang, LongX <longx.zhang@intel.com> wrote:
> > From: Zhang Long <longx.zhang@intel.com>
> >
> > Specific pci device drivers might have many functions to call
> > pci_channel_offline to check device states. When slot_reset happens,
> > drivers' slot_reset callback might call such functions and eventually
> > abort the reset.
> 
> Where does this happen?  I looked at all the references to
> dev->error_state and all the callers of pci_channel_offline(), and I
> didn't see any in .slot_reset() methods.
> 
> (There are *assignments* to dev->error_state in qlcnic_attach_func(),
> qlge_io_slot_reset(), and qla2xxx_pci_slot_reset().  You might be able
> to remove those assignments after this patch, but this patch wouldn't
> really change anything for those paths.)
> 
> > The patch resets pdev->error_state to pci_channel_io_normal at
> > the begining of report_slot_reset.
> 
> > Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
> > Signed-off-by: Zhang Long <longx.zhang@intel.com>
> > ---
> >  drivers/pci/pcie/aer/aerdrv_core.c |    1 +
> >  drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
> >  2 files changed, 6 insertions(+), 7 deletions(-)
> >
> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> > index 564d97f..c61fd44 100644
> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> > @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
> >         result_data = (struct aer_broadcast_data *) data;
> >
> >         device_lock(&dev->dev);
> > +       dev->error_state = pci_channel_io_normal;
> 
> The device's error_state might be pci_channel_io_frozen when we get
> here.  We haven't touched anything in the hardware yet.  What makes
> the device unfrozen now?  Did anything actually change as far as the
> hardware device is concerned?
> 
> I agree it looks like report_slot_reset() should be made more like
> eeh_report_reset().  I'm just wondering if the error_state should be
> changed *after* calling the .slot_reset() method instead of before.
> 
> >         if (!dev->driver ||
> >                 !dev->driver->err_handler ||
> >                 !dev->driver->err_handler->slot_reset)
> > diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> > index ed4d094..7abefd9 100644
> > --- a/drivers/pci/pcie/portdrv_pci.c
> > +++ b/drivers/pci/pcie/portdrv_pci.c
> > @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
> >         pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
> >         int retval;
> >
> > -       /* If fatal, restore cfg space for possible link reset at upstream */
> > -       if (dev->error_state == pci_channel_io_frozen) {
> > -               dev->state_saved = true;
> > -               pci_restore_state(dev);
> > -               pcie_portdrv_restore_config(dev);
> > -               pci_enable_pcie_error_reporting(dev);
> > -       }
> 
> Previously we only restored state for the pci_channel_io_frozen state,
> i.e., when handling an AER_FATAL error.  Now we restore it always.
> Why?
> 
> > +       /* restore cfg space for possible link reset at upstream */
> > +       dev->state_saved = true;
> 
> "dev->state_saved == true" means that the dev->saved_config_space
> contains valid data.  Why do we know that's the case here?  I see that
> pcie_portdrv_probe() calls pci_save_state() when we first claim the
> port, and I guess we're assuming the state saved then is still valid.
> But why do we need to actually set dev->state_saved here?  Shouldn't
> it be already set to true anyway?

This is a dirty trick to make pci_restore_state(dev) always work here
(because it checks dev->state_saved and does nothing if it isn't set).
I suppose.

> > +       pci_restore_state(dev);
> > +       pcie_portdrv_restore_config(dev);
> > +       pci_enable_pcie_error_reporting(dev);
> >
> >         /* get true return value from &status */
> >         retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);
> > --

Thanks,
Rafael
Liu, Joseph May 20, 2013, 2:38 p.m. UTC | #9
Bjorn,

I have been working with the low level device drivers for supporting both EEH and AER and were involved in working with Yanmin for verifying this AER patch. Please see some of my responses to your questions inline, prefixed with [jzl]. 


Thanks,
Joe Liu, Ph.D.
Senior Principal Engineer
Emulex Corporation

-----Original Message-----
From: Bjorn Helgaas [mailto:bhelgaas@google.com] 
Sent: Friday, May 17, 2013 7:44 PM
To: Zhang, LongX
Cc: linasvepstas@gmail.com; linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org; yanmin_zhang@linux.intel.com; Liu, Joseph; Rafael J. Wysocki
Subject: Re: Subject : [ PATCH ] pci-reset-error_state-to-pci_channel_io_normal-at-report_slot_reset

[+cc Rafael because he knows about dev->state_saved]

Sorry, I'm not very familiar with AER, so please excuse some naive
questions below.

On Fri, Apr 26, 2013 at 12:28 AM, Zhang, LongX <longx.zhang@intel.com> wrote:
> From: Zhang Long <longx.zhang@intel.com>
>
> Specific pci device drivers might have many functions to call
> pci_channel_offline to check device states. When slot_reset happens,
> drivers' slot_reset callback might call such functions and eventually
> abort the reset.

|Where does this happen?  I looked at all the references to
|dev->error_state and all the callers of pci_channel_offline(), and I
|didn't see any in .slot_reset() methods.
|
|(There are *assignments* to dev->error_state in qlcnic_attach_func(),
|qlge_io_slot_reset(), and qla2xxx_pci_slot_reset().  You might be able
|to remove those assignments after this patch, but this patch wouldn't
|really change anything for those paths.)

[jzl] Although low level driver might not call pci_channel_offline() directly in .slot_reset() method itself, it does not mean it will not call it during the device error recovery execution of .slot_reset() method. The .slot_reset() method needs to call some of its bottom layer routines interfacing with device PCI interface for preparing the PCI device from recovery of PCI error (EEH or AER). As a matter of fact, it seems that qla2xxx_pci_slot_reset() routine's assignments to dev->error was an attempt to work around this AER driver issue that this patch is trying to resolve. From upstream kernel 3.9.2's qla2xxx_pci_slot_reset(), it has the assignment and comment below:

static pci_ers_result_t
qla2xxx_pci_slot_reset(struct pci_dev *pdev)
{
        pci_ers_result_t ret = PCI_ERS_RESULT_DISCONNECT;
        scsi_qla_host_t *base_vha = pci_get_drvdata(pdev);
        struct qla_hw_data *ha = base_vha->hw;
        struct rsp_que *rsp;
        int rc, retries = 10;

        ql_dbg(ql_dbg_aer, base_vha, 0x9004,
            "Slot Reset.\n");

        /* Workaround: qla2xxx driver which access hardware earlier
         * needs error state to be pci_channel_io_online.
         * Otherwise mailbox command timesout.
         */
        pdev->error_state = pci_channel_io_normal;



> The patch resets pdev->error_state to pci_channel_io_normal at
> the begining of report_slot_reset.

> Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
> Signed-off-by: Zhang Long <longx.zhang@intel.com>
> ---
>  drivers/pci/pcie/aer/aerdrv_core.c |    1 +
>  drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
>  2 files changed, 6 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 564d97f..c61fd44 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>         result_data = (struct aer_broadcast_data *) data;
>
>         device_lock(&dev->dev);
> +       dev->error_state = pci_channel_io_normal;

|The device's error_state might be pci_channel_io_frozen when we get
|here.  We haven't touched anything in the hardware yet.  What makes
|the device unfrozen now?  Did anything actually change as far as the
|hardware device is concerned?

[jzl] When the AER driver gets to invoking report_slot_reset(), it has already performed PCI slot reset. Currently, with PCIe, it is the PCIe link reset it has been performed. But with proper AER driver implementation, it should be either PCI slot hot reset or even fundamental reset that has already been performed. As such, after platform performing the PCI slot reset, it should move the device's error_state out of pci_channel_io_fronzen before calling report_slot_reset() to low level device drivers allows them to access the corresponding device's PCI function in preparation for recovery.


|I agree it looks like report_slot_reset() should be made more like
|eeh_report_reset().  I'm just wondering if the error_state should be
|changed *after* calling the .slot_reset() method instead of before.

[jzl] No, you should not set the error_state after calling the .slot_reset() method because the AER driver calling the low level driver's .slot_reset() method is to "report" that the platform PCI slot reset has been done and "inform" the low level driver to prepare the PCI device for recovering, with which the low level driver might need to, for example, further perform reset on PCI functions with the deivice. The .slot_reset() is call by function, changing the error_state *after* calling the .slot_reset() method will be too late for the low level device to access PCI device from within the  .slot_reset() function call...

[jzl] Also, this is not just the eeh_report_reset() behavior, it is PCI error recovery behavior. In the pci-error-recovery.txt, it stated the following under "STEP 4 Slot Reset":

"...
This call gives drivers the chance to re-initialize the hardware
(re-download firmware, etc.).  At this point, the driver may assume
that the card is in a fresh state and is fully functional. The slot
is unfrozen and the driver has full access to PCI config space,
memory mapped I/O space and DMA. Interrupts (Legacy, MSI, or MSI-X)
will also be available.
..."


>         if (!dev->driver ||
>                 !dev->driver->err_handler ||
>                 !dev->driver->err_handler->slot_reset)
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index ed4d094..7abefd9 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>         pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>         int retval;
>
> -       /* If fatal, restore cfg space for possible link reset at upstream */
> -       if (dev->error_state == pci_channel_io_frozen) {
> -               dev->state_saved = true;
> -               pci_restore_state(dev);
> -               pcie_portdrv_restore_config(dev);
> -               pci_enable_pcie_error_reporting(dev);
> -       }

Previously we only restored state for the pci_channel_io_frozen state,
i.e., when handling an AER_FATAL error.  Now we restore it always.
Why?

> +       /* restore cfg space for possible link reset at upstream */
> +       dev->state_saved = true;

"dev->state_saved == true" means that the dev->saved_config_space
contains valid data.  Why do we know that's the case here?  I see that
pcie_portdrv_probe() calls pci_save_state() when we first claim the
port, and I guess we're assuming the state saved then is still valid.
But why do we need to actually set dev->state_saved here?  Shouldn't
it be already set to true anyway?

> +       pci_restore_state(dev);
> +       pcie_portdrv_restore_config(dev);
> +       pci_enable_pcie_error_reporting(dev);
>
>         /* get true return value from &status */
>         retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);
> --
> 1.7.4.1
>
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linas Vepstas May 20, 2013, 3:37 p.m. UTC | #10
I think Joe Liu has it right.  I'm going to top-post because things
are a bit tangled below.  I urge a review of
/Documentation/PCI/pci-error-recovery.txt, as that gives the details.

The intended sequence is that, after an error, the device driver gets
a shot at running some diagnostics & dumps, and then the pci
bridges/controllers/ports/links are reset (by platform code, viz. aer
in this case) to a state resembling a fresh power-on.  Then the
.slot_reset() callback is called on the device driver, to tell the
driver "hey everything upstream is now working, go set yourself up for
normal ops."  Thus, in particular, one should have pdev->error_state =
pci_channel_io_normal; before .slot_reset() is called, and the PCI
config space contents should resemble a fresh power-on state (and
**NOT** some other saved state!)

If the device driver wanted to leave the card in a dead state, it had
several opportunities to say so, earlier in the callback sequence.  If
the driver wanted to fiddle with the card with the old PCI config
space, it already had a chance to do that too, before the
bridges/controllers/ports/links were fully reset by the platform.   By
the time that .slot_reset() is being called, both the platform and the
device driver are expecting smooth sailing ahead.

So, looking at the original patch, it seems reasonable.  My impression
is that maybe the AER driver had been doing not quite the right thing
for a long time.

-- Linas Vepstas

On 20 May 2013 09:38, Liu, Joseph <Joseph.Liu@emulex.com> wrote:
> Bjorn,
>
> I have been working with the low level device drivers for supporting both EEH and AER and were involved in working with Yanmin for verifying this AER patch. Please see some of my responses to your questions inline, prefixed with [jzl].
>
>
> Thanks,
> Joe Liu, Ph.D.
> Senior Principal Engineer
> Emulex Corporation
>
> -----Original Message-----
> From: Bjorn Helgaas [mailto:bhelgaas@google.com]
> Sent: Friday, May 17, 2013 7:44 PM
> To: Zhang, LongX
> Cc: linasvepstas@gmail.com; linux-pci@vger.kernel.org; linux-kernel@vger.kernel.org; yanmin_zhang@linux.intel.com; Liu, Joseph; Rafael J. Wysocki
> Subject: Re: Subject : [ PATCH ] pci-reset-error_state-to-pci_channel_io_normal-at-report_slot_reset
>
> [+cc Rafael because he knows about dev->state_saved]
>
> Sorry, I'm not very familiar with AER, so please excuse some naive
> questions below.
>
> On Fri, Apr 26, 2013 at 12:28 AM, Zhang, LongX <longx.zhang@intel.com> wrote:
>> From: Zhang Long <longx.zhang@intel.com>
>>
>> Specific pci device drivers might have many functions to call
>> pci_channel_offline to check device states. When slot_reset happens,
>> drivers' slot_reset callback might call such functions and eventually
>> abort the reset.
>
> |Where does this happen?  I looked at all the references to
> |dev->error_state and all the callers of pci_channel_offline(), and I
> |didn't see any in .slot_reset() methods.
> |
> |(There are *assignments* to dev->error_state in qlcnic_attach_func(),
> |qlge_io_slot_reset(), and qla2xxx_pci_slot_reset().  You might be able
> |to remove those assignments after this patch, but this patch wouldn't
> |really change anything for those paths.)
>
> [jzl] Although low level driver might not call pci_channel_offline() directly in .slot_reset() method itself, it does not mean it will not call it during the device error recovery execution of .slot_reset() method. The .slot_reset() method needs to call some of its bottom layer routines interfacing with device PCI interface for preparing the PCI device from recovery of PCI error (EEH or AER). As a matter of fact, it seems that qla2xxx_pci_slot_reset() routine's assignments to dev->error was an attempt to work around this AER driver issue that this patch is trying to resolve. From upstream kernel 3.9.2's qla2xxx_pci_slot_reset(), it has the assignment and comment below:
>
> static pci_ers_result_t
> qla2xxx_pci_slot_reset(struct pci_dev *pdev)
> {
>         pci_ers_result_t ret = PCI_ERS_RESULT_DISCONNECT;
>         scsi_qla_host_t *base_vha = pci_get_drvdata(pdev);
>         struct qla_hw_data *ha = base_vha->hw;
>         struct rsp_que *rsp;
>         int rc, retries = 10;
>
>         ql_dbg(ql_dbg_aer, base_vha, 0x9004,
>             "Slot Reset.\n");
>
>         /* Workaround: qla2xxx driver which access hardware earlier
>          * needs error state to be pci_channel_io_online.
>          * Otherwise mailbox command timesout.
>          */
>         pdev->error_state = pci_channel_io_normal;
>
>
>
>> The patch resets pdev->error_state to pci_channel_io_normal at
>> the begining of report_slot_reset.
>
>> Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
>> Signed-off-by: Zhang Long <longx.zhang@intel.com>
>> ---
>>  drivers/pci/pcie/aer/aerdrv_core.c |    1 +
>>  drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
>>  2 files changed, 6 insertions(+), 7 deletions(-)
>>
>> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
>> index 564d97f..c61fd44 100644
>> --- a/drivers/pci/pcie/aer/aerdrv_core.c
>> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
>> @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>>         result_data = (struct aer_broadcast_data *) data;
>>
>>         device_lock(&dev->dev);
>> +       dev->error_state = pci_channel_io_normal;
>
> |The device's error_state might be pci_channel_io_frozen when we get
> |here.  We haven't touched anything in the hardware yet.  What makes
> |the device unfrozen now?  Did anything actually change as far as the
> |hardware device is concerned?
>
> [jzl] When the AER driver gets to invoking report_slot_reset(), it has already performed PCI slot reset. Currently, with PCIe, it is the PCIe link reset it has been performed. But with proper AER driver implementation, it should be either PCI slot hot reset or even fundamental reset that has already been performed. As such, after platform performing the PCI slot reset, it should move the device's error_state out of pci_channel_io_fronzen before calling report_slot_reset() to low level device drivers allows them to access the corresponding device's PCI function in preparation for recovery.
>
>
> |I agree it looks like report_slot_reset() should be made more like
> |eeh_report_reset().  I'm just wondering if the error_state should be
> |changed *after* calling the .slot_reset() method instead of before.
>
> [jzl] No, you should not set the error_state after calling the .slot_reset() method because the AER driver calling the low level driver's .slot_reset() method is to "report" that the platform PCI slot reset has been done and "inform" the low level driver to prepare the PCI device for recovering, with which the low level driver might need to, for example, further perform reset on PCI functions with the deivice. The .slot_reset() is call by function, changing the error_state *after* calling the .slot_reset() method will be too late for the low level device to access PCI device from within the  .slot_reset() function call...
>
> [jzl] Also, this is not just the eeh_report_reset() behavior, it is PCI error recovery behavior. In the pci-error-recovery.txt, it stated the following under "STEP 4 Slot Reset":
>
> "...
> This call gives drivers the chance to re-initialize the hardware
> (re-download firmware, etc.).  At this point, the driver may assume
> that the card is in a fresh state and is fully functional. The slot
> is unfrozen and the driver has full access to PCI config space,
> memory mapped I/O space and DMA. Interrupts (Legacy, MSI, or MSI-X)
> will also be available.
> ..."
>
>
>>         if (!dev->driver ||
>>                 !dev->driver->err_handler ||
>>                 !dev->driver->err_handler->slot_reset)
>> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
>> index ed4d094..7abefd9 100644
>> --- a/drivers/pci/pcie/portdrv_pci.c
>> +++ b/drivers/pci/pcie/portdrv_pci.c
>> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>>         pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>>         int retval;
>>
>> -       /* If fatal, restore cfg space for possible link reset at upstream */
>> -       if (dev->error_state == pci_channel_io_frozen) {
>> -               dev->state_saved = true;
>> -               pci_restore_state(dev);
>> -               pcie_portdrv_restore_config(dev);
>> -               pci_enable_pcie_error_reporting(dev);
>> -       }
>
> Previously we only restored state for the pci_channel_io_frozen state,
> i.e., when handling an AER_FATAL error.  Now we restore it always.
> Why?
>
>> +       /* restore cfg space for possible link reset at upstream */
>> +       dev->state_saved = true;
>
> "dev->state_saved == true" means that the dev->saved_config_space
> contains valid data.  Why do we know that's the case here?  I see that
> pcie_portdrv_probe() calls pci_save_state() when we first claim the
> port, and I guess we're assuming the state saved then is still valid.
> But why do we need to actually set dev->state_saved here?  Shouldn't
> it be already set to true anyway?
>
>> +       pci_restore_state(dev);
>> +       pcie_portdrv_restore_config(dev);
>> +       pci_enable_pcie_error_reporting(dev);
>>
>>         /* get true return value from &status */
>>         retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);
>> --
>> 1.7.4.1
>>
>>
>>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas May 20, 2013, 5:21 p.m. UTC | #11
On Fri, May 17, 2013 at 5:56 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
> On Friday, May 17, 2013 05:43:33 PM Bjorn Helgaas wrote:
>> On Fri, Apr 26, 2013 at 12:28 AM, Zhang, LongX <longx.zhang@intel.com> wrote:

>> > +       /* restore cfg space for possible link reset at upstream */
>> > +       dev->state_saved = true;
>>
>> "dev->state_saved == true" means that the dev->saved_config_space
>> contains valid data.  Why do we know that's the case here?  I see that
>> pcie_portdrv_probe() calls pci_save_state() when we first claim the
>> port, and I guess we're assuming the state saved then is still valid.
>> But why do we need to actually set dev->state_saved here?  Shouldn't
>> it be already set to true anyway?
>
> This is a dirty trick to make pci_restore_state(dev) always work here
> (because it checks dev->state_saved and does nothing if it isn't set).
> I suppose.

Yes, I did investigate enough to see that this is a dirty trick.  My
question is how we know it's safe to do this dirty trick.

>> > +       pci_restore_state(dev);
>> > +       pcie_portdrv_restore_config(dev);
>> > +       pci_enable_pcie_error_reporting(dev);
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linas Vepstas May 20, 2013, 5:52 p.m. UTC | #12
On 20 May 2013 12:21, Bjorn Helgaas <bhelgaas@google.com> wrote:
> On Fri, May 17, 2013 at 5:56 PM, Rafael J. Wysocki <rjw@sisk.pl> wrote:
>> On Friday, May 17, 2013 05:43:33 PM Bjorn Helgaas wrote:
>>> On Fri, Apr 26, 2013 at 12:28 AM, Zhang, LongX <longx.zhang@intel.com> wrote:
>
>>> > +       /* restore cfg space for possible link reset at upstream */
>>> > +       dev->state_saved = true;
>>>
>>> "dev->state_saved == true" means that the dev->saved_config_space
>>> contains valid data.  Why do we know that's the case here?  I see that
>>> pcie_portdrv_probe() calls pci_save_state() when we first claim the
>>> port, and I guess we're assuming the state saved then is still valid.

Yes, see below.

>>> But why do we need to actually set dev->state_saved here?  Shouldn't
>>> it be already set to true anyway?
>>
>> This is a dirty trick to make pci_restore_state(dev) always work here
>> (because it checks dev->state_saved and does nothing if it isn't set).
>> I suppose.
>
> Yes, I did investigate enough to see that this is a dirty trick.  My
> question is how we know it's safe to do this dirty trick.
>
>>> > +       pci_restore_state(dev);
>>> > +       pcie_portdrv_restore_config(dev);

Lets review what is supposed to happen:

-- poweron
-- BIOS/firmware/bootloader maybe fiddles with PCI config space
-- kernel fiddles with PCI config space during boot
-- device driver sets up PCI config space correctly for normal i/o
-- PCI error occurs
-- PCI port/link is reset

At this point, we want to set the PCI config space to something
resembling what it was just before the device driver first touched it
after poweron.  Why?  Because the device driver will typically set up
DMA segments, and tweak other settings as it initializes the card.  It
needs to do almost exactly the same things again, when re-initializing
the device after the error reset.  Thus, the PCI config needs to be
appropriate for a fresh initialization of the card.

If  some other variant of PCI config was loaded here, it might contain
incorrect DMA mappings or other settings.  In particular, while the
driver is initializing the card, you risk having the card run away and
start doing things based on these incorrect settings -- provoking
another error or maybe just silently corrupting data.   The config
that you want to load should be more-or-less the same one that was
there during kernel boot, before the device-driver started touching
things.

The "dirty hack" is weird:  either there's valid data, and the flag is
set, or the data is garbage and the flag isn't set ... how could it be
otherwise?

-- Linas
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas May 20, 2013, 10:48 p.m. UTC | #13
On Fri, Apr 26, 2013 at 06:28:59AM +0000, Zhang, LongX wrote:
> From: Zhang Long <longx.zhang@intel.com>
> 
> Specific pci device drivers might have many functions to call
> pci_channel_offline to check device states. When slot_reset happens,
> drivers' slot_reset callback might call such functions and eventually
> abort the reset.
> 
> The patch resets pdev->error_state to pci_channel_io_normal at
> the begining of report_slot_reset.
> 
> Thank Liu Joseph for pointing it out.
> 
> Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
> Signed-off-by: Zhang Long <longx.zhang@intel.com>
> ---
>  drivers/pci/pcie/aer/aerdrv_core.c |    1 +
>  drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
>  2 files changed, 6 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> index 564d97f..c61fd44 100644
> --- a/drivers/pci/pcie/aer/aerdrv_core.c
> +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>  	result_data = (struct aer_broadcast_data *) data;
>  
>  	device_lock(&dev->dev);
> +	dev->error_state = pci_channel_io_normal;
>  	if (!dev->driver ||
>  		!dev->driver->err_handler ||
>  		!dev->driver->err_handler->slot_reset)
> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> index ed4d094..7abefd9 100644
> --- a/drivers/pci/pcie/portdrv_pci.c
> +++ b/drivers/pci/pcie/portdrv_pci.c
> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>  	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>  	int retval;
>  
> -	/* If fatal, restore cfg space for possible link reset at upstream */
> -	if (dev->error_state == pci_channel_io_frozen) {
> -		dev->state_saved = true;
> -		pci_restore_state(dev);
> -		pcie_portdrv_restore_config(dev);
> -		pci_enable_pcie_error_reporting(dev);
> -	}
> +	/* restore cfg space for possible link reset at upstream */
> +	dev->state_saved = true;
> +	pci_restore_state(dev);
> +	pcie_portdrv_restore_config(dev);
> +	pci_enable_pcie_error_reporting(dev);
>  
>  	/* get true return value from &status */
>  	retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);

I think this patch changes the behavior in the case of a non-fatal error
where one of the .error_detected() methods returned
PCI_ERS_RESULT_NEED_RESET.  In that case, pcie_portdrv_slot_reset()
previously did not restore config space, but after your patch, it *will*
restore it.  We need an explanation of why this is safe.

I think you should split this into two patches: the first would remove the
"if (dev->error_state == pci_channel_io_frozen)" test from portdrv_pci.c
and explain the reason, and the second would make the aerdrv_core.c change.

I'm also concerned that in that same case (a non-fatal error where one of
the .error_detected() methods returned PCI_ERS_RESULT_NEED_RESET), I don't
think we actually *do* any kind of device reset.  This isn't related to
your patch, of course, so if you resolve the config space restore question,
we can deal with the reset question later.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yanmin Zhang May 21, 2013, 7:40 a.m. UTC | #14
On Mon, 2013-05-20 at 16:48 -0600, Bjorn Helgaas wrote:
> On Fri, Apr 26, 2013 at 06:28:59AM +0000, Zhang, LongX wrote:
> > From: Zhang Long <longx.zhang@intel.com>
> > 
> > Specific pci device drivers might have many functions to call
> > pci_channel_offline to check device states. When slot_reset happens,
> > drivers' slot_reset callback might call such functions and eventually
> > abort the reset.
> > 
> > The patch resets pdev->error_state to pci_channel_io_normal at
> > the begining of report_slot_reset.
> > 
> > Thank Liu Joseph for pointing it out.
> > 
> > Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
> > Signed-off-by: Zhang Long <longx.zhang@intel.com>
> > ---
> >  drivers/pci/pcie/aer/aerdrv_core.c |    1 +
> >  drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
> >  2 files changed, 6 insertions(+), 7 deletions(-)
Thank all for the kind comments.

> > 
> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
> > index 564d97f..c61fd44 100644
> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
> > @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
> >  	result_data = (struct aer_broadcast_data *) data;
> >  
> >  	device_lock(&dev->dev);
> > +	dev->error_state = pci_channel_io_normal;
> >  	if (!dev->driver ||
> >  		!dev->driver->err_handler ||
> >  		!dev->driver->err_handler->slot_reset)
> > diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
> > index ed4d094..7abefd9 100644
> > --- a/drivers/pci/pcie/portdrv_pci.c
> > +++ b/drivers/pci/pcie/portdrv_pci.c
> > @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
> >  	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
> >  	int retval;
> >  
> > -	/* If fatal, restore cfg space for possible link reset at upstream */
> > -	if (dev->error_state == pci_channel_io_frozen) {
> > -		dev->state_saved = true;
> > -		pci_restore_state(dev);
> > -		pcie_portdrv_restore_config(dev);
> > -		pci_enable_pcie_error_reporting(dev);
> > -	}
> > +	/* restore cfg space for possible link reset at upstream */
> > +	dev->state_saved = true;
It's indeed a dirty trick to change it to true. The reason is suspend. The port would
suspend/resume at suspend-to-ram. When the port is suspended, PCI power framework calls
pci_save_state. When the port is resumed, PCI framework calls pci_restore_state and
dev->state_saved is set to false.
A better solution is to add double space in pci_dev->saved_config_space/saved_cap_space,
which seems consume unnecessary resource.

> > +	pci_restore_state(dev);
> > +	pcie_portdrv_restore_config(dev);
> > +	pci_enable_pcie_error_reporting(dev);
> >  
> >  	/* get true return value from &status */
> >  	retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);
> I think this patch changes the behavior in the case of a non-fatal error
> where one of the .error_detected() methods returned
> PCI_ERS_RESULT_NEED_RESET.  In that case, pcie_portdrv_slot_reset()
> previously did not restore config space, but after your patch, it *will*
> restore it.  We need an explanation of why this is safe.
AER standard doesn't define such behavior like if we need reset a slot.
When we implemented the feature in kernel, we assumed at non-fatal error,
config space is still good.

With the new patch, we change the behavior a little.

> 
> I think you should split this into two patches: the first would remove the
> "if (dev->error_state == pci_channel_io_frozen)" test from portdrv_pci.c
The first patch would depend on the 2nd patch as 2nd patch resets error_state 
to pci_channel_io_normal.

> and explain the reason, and the second would make the aerdrv_core.c change.
> 
> I'm also concerned that in that same case (a non-fatal error where one of
> the .error_detected() methods returned PCI_ERS_RESULT_NEED_RESET), I don't
> think we actually *do* any kind of device reset.  This isn't related to
> your patch, of course, so if you resolve the config space restore question,
> we can deal with the reset question later.
It's a good question. At the beginning when we enabled AER feature in kernel,
we didn't really implement the real device reset. It's in the TODO list.

Yanmin


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yanmin Zhang May 21, 2013, 7:49 a.m. UTC | #15
On Mon, 2013-05-20 at 10:37 -0500, Linas Vepstas wrote:
> I think Joe Liu has it right.  I'm going to top-post because things
> are a bit tangled below.  I urge a review of
> /Documentation/PCI/pci-error-recovery.txt, as that gives the details.
> 
> The intended sequence is that, after an error, the device driver gets
> a shot at running some diagnostics & dumps, and then the pci
> bridges/controllers/ports/links are reset (by platform code, viz. aer
> in this case) to a state resembling a fresh power-on.  Then the
> .slot_reset() callback is called on the device driver, to tell the
> driver "hey everything upstream is now working, go set yourself up for
> normal ops."  Thus, in particular, one should have pdev->error_state =
> pci_channel_io_normal; before .slot_reset() is called, and the PCI
> config space contents should resemble a fresh power-on state (and
> **NOT** some other saved state!)
> 
> If the device driver wanted to leave the card in a dead state, it had
> several opportunities to say so, earlier in the callback sequence.  If
> the driver wanted to fiddle with the card with the old PCI config
> space, it already had a chance to do that too, before the
> bridges/controllers/ports/links were fully reset by the platform.   By
> the time that .slot_reset() is being called, both the platform and the
> device driver are expecting smooth sailing ahead.
Yes. It's flexible for drivers to do that in many callbacks. AER framework
provides such flexibility.

> 
> So, looking at the original patch, it seems reasonable. 
I agree.

>  My impression
> is that maybe the AER driver had been doing not quite the right thing
> for a long time.
Pls. provide evidence/facts. The new patch is to facilitate device driver
implementation. It doesn't mean current AER driver is incorrect. We need
a tradeoff.

Just like what Bjoin says, we shouldn't change error_state to pci_channel_io_normal
before we really recover the hardware. The patch changes it just because
drivers might call some functions to recover the devices, while such functions
need (error_state==pci_channel_io_normal).

> 
> -- Linas Vepstas


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linas Vepstas May 21, 2013, 1:38 p.m. UTC | #16
Hi,

On 21 May 2013 02:49, Yanmin Zhang <yanmin_zhang@linux.intel.com> wrote:
> On Mon, 2013-05-20 at 10:37 -0500, Linas Vepstas wrote:

>>  My impression
>> is that maybe the AER driver had been doing not quite the right thing
>> for a long time.
> Pls. provide evidence/facts. The new patch is to facilitate device driver
> implementation. It doesn't mean current AER driver is incorrect. We need
> a tradeoff.
>
> Just like what Bjoin says, we shouldn't change error_state to pci_channel_io_normal
> before we really recover the hardware. The patch changes it just because
> drivers might call some functions to recover the devices, while such functions
> need (error_state==pci_channel_io_normal).

Perhaps we are talking past each other.  One needs to set error_state
== pci_channel_io_normal before calling slot_reset().  If the aer
driver wasn't doing this all along, then it seems like an old bug to
me.   The error_state flag indicates the status of the PCI channel,
and not the status of the attached device.  Once the channel has been
reset, the error state is "normal" even if the card itself hasn't yet
been brought up.

Whatever it is that the aer driver is doing, it should be doing
something similar to what the eeh driver is doing, in
arch/powerpc/platforms/pseries/eeh_driver.c  -- This is the "reference
implementation" -- Its known right, it was and continues to be heavily
tested, has found its way into sriov, etc.

The one thing that eeh does NOT do is to handle suspend/sleep states.
The basic design assumption back then was that no one would ever
suspend/sleep their server.  Since suspend/sleep messes with PCI
config space, then, yes, one would need to somehow save a second,
pristine copy of config space for device recovery.

-- Linas
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Liu, Joseph May 21, 2013, 3:41 p.m. UTC | #17
Bjorn,

>> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
>> index ed4d094..7abefd9 100644
>> --- a/drivers/pci/pcie/portdrv_pci.c
>> +++ b/drivers/pci/pcie/portdrv_pci.c
>> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>>  	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>>  	int retval;
>>  
>> -	/* If fatal, restore cfg space for possible link reset at upstream */
>> -	if (dev->error_state == pci_channel_io_frozen) {
>> -		dev->state_saved = true;
>> -		pci_restore_state(dev);
>> -		pcie_portdrv_restore_config(dev);
>> -		pci_enable_pcie_error_reporting(dev);
>> -	}
>> +	/* restore cfg space for possible link reset at upstream */
>> +	dev->state_saved = true;
>> +	pci_restore_state(dev);
>> +	pcie_portdrv_restore_config(dev);
>> +	pci_enable_pcie_error_reporting(dev);
>>  
>>  	/* get true return value from &status */
>>  	retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);
>
>I think this patch changes the behavior in the case of a non-fatal error
>where one of the .error_detected() methods returned
>PCI_ERS_RESULT_NEED_RESET.  In that case, pcie_portdrv_slot_reset()
>previously did not restore config space, but after your patch, it *will*
>restore it.  We need an explanation of why this is safe. 

Here is my understanding of this part of the patch:

I think your concern here should not be an issue. Whether it is a non-fatal error or a fatal error, as long as one of the .error_detected() method from the downstream drivers involved returns a PCI_ERS_RESULT_NEED_RESET, it should trump all others and a slot reset should be performed, even though it was originally due to a non-fatal error reported or only one of the downstream drivers requests it. In the case of AER driver, this should be implemented in the broadcast_error_message() with pci_walk_bus() by passing in the report_error_detected() function, and merging the results into the result_data->result...

In any case, the fact that this pcie_portdrv_slot_reset() being invoked should be considered as a aftermath of a slot reset has been performed, thus the restore of config space should be performed after the reset. I suppose the restore should be to the same state as fresh power-on states, right?

Thanks,
Joe Liu, Ph.D.
Senior Principal Engineer
Emulex Corporation

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas May 21, 2013, 4:17 p.m. UTC | #18
On Tue, May 21, 2013 at 1:40 AM, Yanmin Zhang
<yanmin_zhang@linux.intel.com> wrote:
> On Mon, 2013-05-20 at 16:48 -0600, Bjorn Helgaas wrote:
>> On Fri, Apr 26, 2013 at 06:28:59AM +0000, Zhang, LongX wrote:
>> > From: Zhang Long <longx.zhang@intel.com>
>> >
>> > Specific pci device drivers might have many functions to call
>> > pci_channel_offline to check device states. When slot_reset happens,
>> > drivers' slot_reset callback might call such functions and eventually
>> > abort the reset.
>> >
>> > The patch resets pdev->error_state to pci_channel_io_normal at
>> > the begining of report_slot_reset.
>> >
>> > Thank Liu Joseph for pointing it out.
>> >
>> > Signed-off-by: Zhang Yanmin <yanmin_zhang@linux.intel.com>
>> > Signed-off-by: Zhang Long <longx.zhang@intel.com>
>> > ---
>> >  drivers/pci/pcie/aer/aerdrv_core.c |    1 +
>> >  drivers/pci/pcie/portdrv_pci.c     |   12 +++++-------
>> >  2 files changed, 6 insertions(+), 7 deletions(-)
> Thank all for the kind comments.
>
>> >
>> > diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
>> > index 564d97f..c61fd44 100644
>> > --- a/drivers/pci/pcie/aer/aerdrv_core.c
>> > +++ b/drivers/pci/pcie/aer/aerdrv_core.c
>> > @@ -286,6 +286,7 @@ static int report_slot_reset(struct pci_dev *dev, void *data)
>> >     result_data = (struct aer_broadcast_data *) data;
>> >
>> >     device_lock(&dev->dev);
>> > +   dev->error_state = pci_channel_io_normal;
>> >     if (!dev->driver ||
>> >             !dev->driver->err_handler ||
>> >             !dev->driver->err_handler->slot_reset)
>> > diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
>> > index ed4d094..7abefd9 100644
>> > --- a/drivers/pci/pcie/portdrv_pci.c
>> > +++ b/drivers/pci/pcie/portdrv_pci.c
>> > @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>> >     pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>> >     int retval;
>> >
>> > -   /* If fatal, restore cfg space for possible link reset at upstream */
>> > -   if (dev->error_state == pci_channel_io_frozen) {
>> > -           dev->state_saved = true;
>> > -           pci_restore_state(dev);
>> > -           pcie_portdrv_restore_config(dev);
>> > -           pci_enable_pcie_error_reporting(dev);
>> > -   }
>> > +   /* restore cfg space for possible link reset at upstream */
>> > +   dev->state_saved = true;
> It's indeed a dirty trick to change it to true. The reason is suspend. The port would
> suspend/resume at suspend-to-ram. When the port is suspended, PCI power framework calls
> pci_save_state. When the port is resumed, PCI framework calls pci_restore_state and
> dev->state_saved is set to false.

I can read the code as well as you can.  The above does nothing to
explain why dev->saved_config_space is still valid even when
state_saved is false.

But I want to drop this issue because it was there before your patch,
and we're clearly not going to resolve it here.

> A better solution is to add double space in pci_dev->saved_config_space/saved_cap_space,
> which seems consume unnecessary resource.
>
>> > +   pci_restore_state(dev);
>> > +   pcie_portdrv_restore_config(dev);
>> > +   pci_enable_pcie_error_reporting(dev);
>> >
>> >     /* get true return value from &status */
>> >     retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);
>> I think this patch changes the behavior in the case of a non-fatal error
>> where one of the .error_detected() methods returned
>> PCI_ERS_RESULT_NEED_RESET.  In that case, pcie_portdrv_slot_reset()
>> previously did not restore config space, but after your patch, it *will*
>> restore it.  We need an explanation of why this is safe.
> AER standard doesn't define such behavior like if we need reset a slot.
> When we implemented the feature in kernel, we assumed at non-fatal error,
> config space is still good.
>
> With the new patch, we change the behavior a little.
>
>>
>> I think you should split this into two patches: the first would remove the
>> "if (dev->error_state == pci_channel_io_frozen)" test from portdrv_pci.c
> The first patch would depend on the 2nd patch as 2nd patch resets error_state
> to pci_channel_io_normal.

Not true.  The pcie_portdrv_slot_reset() change does not depend on the
report_slot_reset() change.  If the first patch makes
pcie_portdrv_slot_reset() do the restore without even looking at
dev->error_state, there's no dependency on the report_slot_reset()
change.

>> and explain the reason, and the second would make the aerdrv_core.c change.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Linas Vepstas May 21, 2013, 4:26 p.m. UTC | #19
Zhang, Bjorn,

On 21 May 2013 10:41, Liu, Joseph <Joseph.Liu@emulex.com> wrote:
> Bjorn,
>
>>> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
>>> index ed4d094..7abefd9 100644
>>> --- a/drivers/pci/pcie/portdrv_pci.c
>>> +++ b/drivers/pci/pcie/portdrv_pci.c
>>> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>>>      pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>>>      int retval;
>>>
>>> -    /* If fatal, restore cfg space for possible link reset at upstream */
>>> -    if (dev->error_state == pci_channel_io_frozen) {
>>> -            dev->state_saved = true;
>>> -            pci_restore_state(dev);
>>> -            pcie_portdrv_restore_config(dev);
>>> -            pci_enable_pcie_error_reporting(dev);
>>> -    }
>>> +    /* restore cfg space for possible link reset at upstream */
>>> +    dev->state_saved = true;
>>> +    pci_restore_state(dev);
>>> +    pcie_portdrv_restore_config(dev);
>>> +    pci_enable_pcie_error_reporting(dev);
>>>
>>>      /* get true return value from &status */
>>>      retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);
>>
>>I think this patch changes the behavior in the case of a non-fatal error
>>where one of the .error_detected() methods returned
>>PCI_ERS_RESULT_NEED_RESET.  In that case, pcie_portdrv_slot_reset()
>>previously did not restore config space, but after your patch, it *will*
>>restore it.  We need an explanation of why this is safe.
>
> Here is my understanding of this part of the patch:
>
> I think your concern here should not be an issue. Whether it is a non-fatal error or a fatal error, as long as one of the .error_detected() method from the downstream drivers involved returns a PCI_ERS_RESULT_NEED_RESET, it should trump all others and a slot reset should be performed, even though it was originally due to a non-fatal error reported or only one of the downstream drivers requests it. In the case of AER driver, this should be implemented in the broadcast_error_message() with pci_walk_bus() by passing in the report_error_detected() function, and merging the results into the result_data->result...
>
> In any case, the fact that this pcie_portdrv_slot_reset() being invoked should be considered as a aftermath of a slot reset has been performed, thus the restore of config space should be performed after the reset. I suppose the restore should be to the same state as fresh power-on states, right?


Again, I think Joe has it exactly right.   The patch, I'm not so sure.
 In my earlier emails, I was assuming that the device has just gotten
either a hard reset (power has been turned off-n-on, e.g. using
pci-hotplug, or a 'soft reset' (whatever that means :-)).   If the
adapter has been reset, then it is appropriate to restore pci config
space to something resembling a fresh power-on.

If the adapter has NOT been reset, then, ummm .. changing
('restoring') the config space would be wrong ... if I recall
correctly, a pci link reset does not whack the config space, and if
the device itself has not been whacked, then all the contents of the
config space (dma mappings, etc) are all still valid, and should not
be changed!

So, the restore of the config space should be conditional, depending
on whether the device has been reset or not.

-- Linas
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas May 21, 2013, 4:51 p.m. UTC | #20
On Tue, May 21, 2013 at 9:41 AM, Liu, Joseph <Joseph.Liu@emulex.com> wrote:
> Bjorn,
>
>>> diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
>>> index ed4d094..7abefd9 100644
>>> --- a/drivers/pci/pcie/portdrv_pci.c
>>> +++ b/drivers/pci/pcie/portdrv_pci.c
>>> @@ -332,13 +332,11 @@ static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
>>>      pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
>>>      int retval;
>>>
>>> -    /* If fatal, restore cfg space for possible link reset at upstream */
>>> -    if (dev->error_state == pci_channel_io_frozen) {
>>> -            dev->state_saved = true;
>>> -            pci_restore_state(dev);
>>> -            pcie_portdrv_restore_config(dev);
>>> -            pci_enable_pcie_error_reporting(dev);
>>> -    }
>>> +    /* restore cfg space for possible link reset at upstream */
>>> +    dev->state_saved = true;
>>> +    pci_restore_state(dev);
>>> +    pcie_portdrv_restore_config(dev);
>>> +    pci_enable_pcie_error_reporting(dev);
>>>
>>>      /* get true return value from &status */
>>>      retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);
>>
>>I think this patch changes the behavior in the case of a non-fatal error
>>where one of the .error_detected() methods returned
>>PCI_ERS_RESULT_NEED_RESET.  In that case, pcie_portdrv_slot_reset()
>>previously did not restore config space, but after your patch, it *will*
>>restore it.  We need an explanation of why this is safe.
>
> Here is my understanding of this part of the patch:
>
> I think your concern here should not be an issue. Whether it is a non-fatal error or a fatal error, as long as one of the .error_detected() method from the downstream drivers involved returns a PCI_ERS_RESULT_NEED_RESET, it should trump all others and a slot reset should be performed, even though it was originally due to a non-fatal error reported or only one of the downstream drivers requests it. In the case of AER driver, this should be implemented in the broadcast_error_message() with pci_walk_bus() by passing in the report_error_detected() function, and merging the results into the result_data->result...

Yes, I understand all this.  (Though as I pointed out, the current AER
code does not actually perform a reset based on
PCI_ERS_RESULT_NEED_RESET being returned.  The only time we currently
do a reset is for AER_FATAL errors.)

> In any case, the fact that this pcie_portdrv_slot_reset() being invoked should be considered as a aftermath of a slot reset has been performed, thus the restore of config space should be performed after the reset.

The intention has always been that .slot_reset() would be called to
inform the driver that a reset has been performed.  That was the case
even when Yanmin added pcie_portdrv_slot_reset() with commit
4bf3392e0.  Yet in that commit, pcie_portdrv_slot_reset() only does
the restore when the channel is frozen, i.e., when we're handling an
AER_FATAL error.

So far, I haven't seen any explanation of what changed to make us want
to do the restore always, even for non-fatal errors.  Maybe the
original test for the channel being frozen was just a mistake, but it
would have been much simpler to omit the test to begin with, so
obviously Yanmin thought it was necessary at the time.

Maybe the "error_state == pci_channel_io_frozen" test was a back-door
way to express "we only want to restore state when we've actually done
a reset."  That would sort of make sense, although there's no
documented connection between the frozen state and a device reset, and
no driver should rely on one.

If the idea of "only do a restore after a reset" is still valid, we
don't want to make this change to pcie_portdrv_slot_reset() because it
IS called in some cases when a reset has not been performed, namely
non-fatal errors when an .error_detected() method returns
PCI_ERS_RESULT_NEED_RESET.

> I suppose the restore should be to the same state as fresh power-on states, right?

I *think* that in pcie_portdrv_slot_reset(), we're probably restoring
the state saved by pcie_portdrv_probe(), i.e., the state saved by the
PCIe port driver after it has claimed and initialized the port.  This
is not fresh power-on state; a fresh power-on state would have BARs
and other config registers cleared, and we'd have to assign resources
to the device just like we do to a hot-added device.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 4, 2013, 6:04 p.m. UTC | #21
I'm not sure where we are with this patch.  I think Joseph initially
reported a problem (though I haven't actually seen that), and this
patch fixed it, so it seems like there's something we want to do here.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yanmin Zhang June 5, 2013, 12:38 a.m. UTC | #22
On Tue, 2013-06-04 at 12:04 -0600, Bjorn Helgaas wrote:
> I'm not sure where we are with this patch.  I think Joseph initially
> reported a problem (though I haven't actually seen that), and this
> patch fixed it, so it seems like there's something we want to do here.
Yes, indeed. We checked Powerpc error handling and plan to use the
similar method to deal with the issue.

Sorry for replying very late. I move to Android smartphone area and am
very busy on many top issues.


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 5, 2013, 1:30 p.m. UTC | #23
On Tue, Jun 4, 2013 at 6:38 PM, Yanmin Zhang
<yanmin_zhang@linux.intel.com> wrote:
> On Tue, 2013-06-04 at 12:04 -0600, Bjorn Helgaas wrote:
>> I'm not sure where we are with this patch.  I think Joseph initially
>> reported a problem (though I haven't actually seen that), and this
>> patch fixed it, so it seems like there's something we want to do here.
> Yes, indeed. We checked Powerpc error handling and plan to use the
> similar method to deal with the issue.
>
> Sorry for replying very late. I move to Android smartphone area and am
> very busy on many top issues.

OK.  Can you point us at a bugzilla or email archive of Joseph's
original problem report, or post it to this thread?  Then maybe
somebody else can pick it up and make progress on this.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yanmin Zhang June 6, 2013, 6:29 a.m. UTC | #24
On Wed, 2013-06-05 at 07:30 -0600, Bjorn Helgaas wrote:
> On Tue, Jun 4, 2013 at 6:38 PM, Yanmin Zhang
> <yanmin_zhang@linux.intel.com> wrote:
> > On Tue, 2013-06-04 at 12:04 -0600, Bjorn Helgaas wrote:
> >> I'm not sure where we are with this patch.  I think Joseph initially
> >> reported a problem (though I haven't actually seen that), and this
> >> patch fixed it, so it seems like there's something we want to do here.
> > Yes, indeed. We checked Powerpc error handling and plan to use the
> > similar method to deal with the issue.
> >
> > Sorry for replying very late. I move to Android smartphone area and am
> > very busy on many top issues.
> 
> OK.  Can you point us at a bugzilla or email archive of Joseph's
> original problem report, or post it to this thread?  Then maybe
> somebody else can pick it up and make progress on this.

Joseph sent email to my outlook emailbox. I changed it a little to delete company
sensitive product name.

===========================Joseph's original email to me================
Hi Tom and Yanmin,
 
Hope this email can reach you well. I am working on a driver's PCI error recovery with AER. I have a question with one of my observations from platform AER driver's behavior. As your names and emails are listed to the PCIe AER driver coming with the kernel, I send this question to you:
 
During injecting AER Non-Correctable/Fatal error and PCIe error recovery process, I observed the following from our Low Level Driver (LLD):
 
1. The LLD's error_detected() callback got called and the PCI channel state passed in as "pci_channel_io_frozen", as expected;
 
2. The LLD's error_detected() callback function returned with PCI_ERS_RESULT_NEED_RESET, requesting a PCI slot reset;
 
3. The LLD's slot_reset() callback got called and it attempted to re-initialize the device hardware for the recovery. However, the PCI slot state was still in "pci_channel_io_frozen" and the pci_channel_offline() helper routine returned 1. This is not expected, and it actually prevented LLD in performing needed access to memory mapped I/O space for preparing the device for recovery;
 
4. Later, the LLD's resume() callback got called and the PCI slot state was set to "pci_channel_io_normal";
 
In the upstream Linux kernel 3.7.0's pci-error-recovery.txt, "STEP 4 Slot Reset", it stated after platform has performed PCI slot reset and then calls LLD's slot_reset() callback:
 
"This call gives drivers the chance to re-initialize the hardware (re-download firmware, etc.).  At this point, the driver may assume that the card is in a fresh state and is fully functional. The slot is unfrozen and the driver has full access to PCI config space, memory mapped I/O space and DMA. Interrupts (Legacy, MSI, or MSI-X) will also be available."
 
I looked into the kernel 3.7.0's AER driver's aerdrv_core.c and see that the PCI slot state was set to "pci_channel_io_frozen" on AER_FATAL serverity, and only set back to "pci_channel_io_normal" in report_resume() function. The PCI slot state was not set to "pci_channel_io_normal" when invoking report_slot_reset().
 
As a comparison, I also perform the EEH error recovery with the LLD driver on a PPC platform, which indeed set the PCI slot state to "pci_channel_io_normal" when calling LLD's slot_reset() callback function.
 
Can you please verify this platform AER driver's behavior is intended and will stay with the AER platform support, or it should be changed to be consistent with the PCI error recovery procedure described in the pci-error-recovery.txt documentation? I also noticed that before invoking the broadcast_error_message() function with "slot_reset", there is a comment in the 3.7.0 kernel source:
 
/*
 * TODO: Should call platform-specific
 * functions to reset slot before calling
 * drivers' slot_reset callbacks?
 */
And I do see that only the PCIe Link_Reset was performed, no PCI fundamental reset was performed even the LLD set the PCIe device's "pdev->needs_freset = 1", is this the area that later will be added for performing needed PCI hot reset or fundamental reset at this stage?
 
Your timely response is very appreciated. Thanks in advance and please let me know if you think I should redirect the question to someone else.
 
Best Regards,
Joseph Liu, Ph.D.
Senior Principal Engineer
Emulex Corporation


--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Bjorn Helgaas June 10, 2013, 5:24 p.m. UTC | #25
I opened https://bugzilla.kernel.org/show_bug.cgi?id=59531 for this issue.

Unfortunately, I don't have an army of minions to assign things like this to.

It's easy to change the dev state from frozen to normal before calling
the slot_reset() callback, but we first have to unravel what that
means for pcie_portdrv_slot_reset(), which restores the device state
if it is frozen.

Bjorn

On Thu, Jun 6, 2013 at 12:29 AM, Yanmin Zhang
<yanmin_zhang@linux.intel.com> wrote:
> On Wed, 2013-06-05 at 07:30 -0600, Bjorn Helgaas wrote:
>> On Tue, Jun 4, 2013 at 6:38 PM, Yanmin Zhang
>> <yanmin_zhang@linux.intel.com> wrote:
>> > On Tue, 2013-06-04 at 12:04 -0600, Bjorn Helgaas wrote:
>> >> I'm not sure where we are with this patch.  I think Joseph initially
>> >> reported a problem (though I haven't actually seen that), and this
>> >> patch fixed it, so it seems like there's something we want to do here.
>> > Yes, indeed. We checked Powerpc error handling and plan to use the
>> > similar method to deal with the issue.
>> >
>> > Sorry for replying very late. I move to Android smartphone area and am
>> > very busy on many top issues.
>>
>> OK.  Can you point us at a bugzilla or email archive of Joseph's
>> original problem report, or post it to this thread?  Then maybe
>> somebody else can pick it up and make progress on this.
>
> Joseph sent email to my outlook emailbox. I changed it a little to delete company
> sensitive product name.
>
> ===========================Joseph's original email to me================
> Hi Tom and Yanmin,
>
> Hope this email can reach you well. I am working on a driver's PCI error recovery with AER. I have a question with one of my observations from platform AER driver's behavior. As your names and emails are listed to the PCIe AER driver coming with the kernel, I send this question to you:
>
> During injecting AER Non-Correctable/Fatal error and PCIe error recovery process, I observed the following from our Low Level Driver (LLD):
>
> 1. The LLD's error_detected() callback got called and the PCI channel state passed in as "pci_channel_io_frozen", as expected;
>
> 2. The LLD's error_detected() callback function returned with PCI_ERS_RESULT_NEED_RESET, requesting a PCI slot reset;
>
> 3. The LLD's slot_reset() callback got called and it attempted to re-initialize the device hardware for the recovery. However, the PCI slot state was still in "pci_channel_io_frozen" and the pci_channel_offline() helper routine returned 1. This is not expected, and it actually prevented LLD in performing needed access to memory mapped I/O space for preparing the device for recovery;
>
> 4. Later, the LLD's resume() callback got called and the PCI slot state was set to "pci_channel_io_normal";
>
> In the upstream Linux kernel 3.7.0's pci-error-recovery.txt, "STEP 4 Slot Reset", it stated after platform has performed PCI slot reset and then calls LLD's slot_reset() callback:
>
> "This call gives drivers the chance to re-initialize the hardware (re-download firmware, etc.).  At this point, the driver may assume that the card is in a fresh state and is fully functional. The slot is unfrozen and the driver has full access to PCI config space, memory mapped I/O space and DMA. Interrupts (Legacy, MSI, or MSI-X) will also be available."
>
> I looked into the kernel 3.7.0's AER driver's aerdrv_core.c and see that the PCI slot state was set to "pci_channel_io_frozen" on AER_FATAL serverity, and only set back to "pci_channel_io_normal" in report_resume() function. The PCI slot state was not set to "pci_channel_io_normal" when invoking report_slot_reset().
>
> As a comparison, I also perform the EEH error recovery with the LLD driver on a PPC platform, which indeed set the PCI slot state to "pci_channel_io_normal" when calling LLD's slot_reset() callback function.
>
> Can you please verify this platform AER driver's behavior is intended and will stay with the AER platform support, or it should be changed to be consistent with the PCI error recovery procedure described in the pci-error-recovery.txt documentation? I also noticed that before invoking the broadcast_error_message() function with "slot_reset", there is a comment in the 3.7.0 kernel source:
>
> /*
>  * TODO: Should call platform-specific
>  * functions to reset slot before calling
>  * drivers' slot_reset callbacks?
>  */
> And I do see that only the PCIe Link_Reset was performed, no PCI fundamental reset was performed even the LLD set the PCIe device's "pdev->needs_freset = 1", is this the area that later will be added for performing needed PCI hot reset or fundamental reset at this stage?
>
> Your timely response is very appreciated. Thanks in advance and please let me know if you think I should redirect the question to someone else.
>
> Best Regards,
> Joseph Liu, Ph.D.
> Senior Principal Engineer
> Emulex Corporation
>
>
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/drivers/pci/pcie/aer/aerdrv_core.c b/drivers/pci/pcie/aer/aerdrv_core.c
index 564d97f..c61fd44 100644
--- a/drivers/pci/pcie/aer/aerdrv_core.c
+++ b/drivers/pci/pcie/aer/aerdrv_core.c
@@ -286,6 +286,7 @@  static int report_slot_reset(struct pci_dev *dev, void *data)
 	result_data = (struct aer_broadcast_data *) data;
 
 	device_lock(&dev->dev);
+	dev->error_state = pci_channel_io_normal;
 	if (!dev->driver ||
 		!dev->driver->err_handler ||
 		!dev->driver->err_handler->slot_reset)
diff --git a/drivers/pci/pcie/portdrv_pci.c b/drivers/pci/pcie/portdrv_pci.c
index ed4d094..7abefd9 100644
--- a/drivers/pci/pcie/portdrv_pci.c
+++ b/drivers/pci/pcie/portdrv_pci.c
@@ -332,13 +332,11 @@  static pci_ers_result_t pcie_portdrv_slot_reset(struct pci_dev *dev)
 	pci_ers_result_t status = PCI_ERS_RESULT_RECOVERED;
 	int retval;
 
-	/* If fatal, restore cfg space for possible link reset at upstream */
-	if (dev->error_state == pci_channel_io_frozen) {
-		dev->state_saved = true;
-		pci_restore_state(dev);
-		pcie_portdrv_restore_config(dev);
-		pci_enable_pcie_error_reporting(dev);
-	}
+	/* restore cfg space for possible link reset at upstream */
+	dev->state_saved = true;
+	pci_restore_state(dev);
+	pcie_portdrv_restore_config(dev);
+	pci_enable_pcie_error_reporting(dev);
 
 	/* get true return value from &status */
 	retval = device_for_each_child(&dev->dev, &status, slot_reset_iter);