diff mbox series

[v4,1/6] mpt3sas: Introduce mpt3sas_base_pci_device_is_available

Message ID 1537935759-14754-2-git-send-email-suganath-prabu.subramani@broadcom.com (mailing list archive)
State Changes Requested
Headers show
Series mpt3sas: Hot-Plug Surprise removal support on IOC. | expand

Commit Message

Suganath Prabu S Sept. 26, 2018, 4:22 a.m. UTC
* Driver uses "pci_device_is_present" to check whether
If Hot unplugged:
the outstanding IOs with 'DID_NO_CONNECT' before removing the drives
attached to the HBA.
"DID_NO_CONNECT" status and free the smid, if driver detects that
HBA is hot unplugged.

* In the hard reset flush out all the outstanding IOs even if diag reset
fails and also if driver detects that HBA is hot unplugged.

v1 change set:

Comments

Bjorn Helgaas Sept. 26, 2018, 9:32 p.m. UTC | #1
[+cc LKML]

On Wed, Sep 26, 2018 at 09:52:34AM +0530, Suganath Prabu S wrote:
> * Driver uses "pci_device_is_present" to check whether
> If Hot unplugged:
> the outstanding IOs with 'DID_NO_CONNECT' before removing the drives
> attached to the HBA.

This sentence needs a verb.

> "DID_NO_CONNECT" status and free the smid, if driver detects that
> HBA is hot unplugged.

This sentence also needs a verb.

> * In the hard reset flush out all the outstanding IOs even if diag reset
> fails and also if driver detects that HBA is hot unplugged.
> 
> v1 change set:
> ==============
> unlock mutex before goto "out_unlocked",
> if active reset is in progress.
> 
> v2 change set:
> ==============
> 1) Use pci_device_is_present instead of
> mpt3sas_base_pci_device_is_unplugged.
> 2) As suggested by Lukas, removed using
> watchdog thread for checking hba hot unplug(Patch 02 of V1).
> Added Hot unplug checks in scan finish and reset paths.
> 
> v3 Change Set:
> =============
> Simplified function "mpt3sas_base_pci_device_is_available" and
> made inline.
> 
> v4 Changes:
> ===========
> Dont split strings in print statement.
> 
> Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
> ---
>  drivers/scsi/mpt3sas/mpt3sas_base.c  | 39 ++++++++++++++++++++++++++++
>  drivers/scsi/mpt3sas/mpt3sas_base.h  |  3 ++-
>  drivers/scsi/mpt3sas/mpt3sas_scsih.c | 50 ++++++++++++++++++++++++++++++++----
>  3 files changed, 86 insertions(+), 6 deletions(-)
> 
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
> index 59d7844..c880e72 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
> @@ -543,6 +543,20 @@ static int mpt3sas_remove_dead_ioc_func(void *arg)
>  }
>  
>  /**
> + * mpt3sas_base_pci_device_is_available - check whether pci device is
> + *			available for any transactions with FW
> + *
> + * @ioc: per adapter object
> + *
> + * Return 1 if pci device state is up and running else return 0.
> + */
> +inline bool
> +mpt3sas_base_pci_device_is_available(struct MPT3SAS_ADAPTER *ioc)
> +{
> +	return !ioc->pci_error_recovery && pci_device_is_present(ioc->pdev);
> +}
> +
> +/**
>   * _base_fault_reset_work - workq handling ioc fault conditions
>   * @work: input argument, used to derive ioc
>   *
> @@ -6122,6 +6136,11 @@ _base_diag_reset(struct MPT3SAS_ADAPTER *ioc)
>  
>  	count = 0;
>  	do {
> +		if (!pci_device_is_present(ioc->pdev)) {
> +			ioc->remove_host = 1;
> +			pr_err(MPT3SAS_FMT "Hba Hot unplugged\n", ioc->name);

You capitalized as "HBA" above.

> +			goto out;
> +		}
>  		/* Write magic sequence to WriteSequence register
>  		 * Loop until in diagnostic mode
>  		 */
> @@ -6853,6 +6872,13 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
>  
>  	ioc->pending_io_count = 0;
>  
> +	if (!mpt3sas_base_pci_device_is_available(ioc)) {
> +		pr_err(MPT3SAS_FMT
> +		    "%s: pci error recovery reset or pci device unplug occurred\n",
> +		    ioc->name, __func__);
> +		return;
> +	}
> +
>  	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);

This is a good example of why I don't like pci_device_is_present(): it
is fundamentally racy and gives a false sense of security.  Here we
*think* we're making the code safer, but in fact we could have this
sequence:

  mpt3sas_base_pci_device_is_available()    # returns true
  # device is removed
  ioc_state = mpt3sas_base_get_iocstate()

In this case the readl() inside mpt3sas_base_get_iocstate() will
probably return 0xffffffff data, and we assume that's valid and
continue on our merry way, pretending that "ioc_state" makes sense
when it really doesn't.

>  	if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
>  		return;

Bjorn
Lukas Wunner Sept. 27, 2018, 7:03 a.m. UTC | #2
On Wed, Sep 26, 2018 at 04:32:41PM -0500, Bjorn Helgaas wrote:
> On Wed, Sep 26, 2018 at 09:52:34AM +0530, Suganath Prabu S wrote:
> > @@ -6853,6 +6872,13 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
> >  
> >  	ioc->pending_io_count = 0;
> >  
> > +	if (!mpt3sas_base_pci_device_is_available(ioc)) {
> > +		pr_err(MPT3SAS_FMT
> > +		    "%s: pci error recovery reset or pci device unplug occurred\n",
> > +		    ioc->name, __func__);
> > +		return;
> > +	}
> > +
> >  	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
> 
> This is a good example of why I don't like pci_device_is_present(): it
> is fundamentally racy and gives a false sense of security.  Here we
> *think* we're making the code safer, but in fact we could have this
> sequence:
> 
>   mpt3sas_base_pci_device_is_available()    # returns true
>   # device is removed
>   ioc_state = mpt3sas_base_get_iocstate()
> 
> In this case the readl() inside mpt3sas_base_get_iocstate() will
> probably return 0xffffffff data, and we assume that's valid and
> continue on our merry way, pretending that "ioc_state" makes sense
> when it really doesn't.

The function does the following:

	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
	if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
		return;

where MPI2_IOC_STATE_MASK is 0xF0000000 and MPI2_IOC_STATE_OPERATIONAL
is 0x20000000.  If the device is removed after the call to
mpt3sas_base_pci_device_is_available(), the result of the bitwise "and"
operation would be 0xF0000000, which is unequal to 0x20000000.
Hence this looks safe.

I agree that pci_device_is_present() (and the pci_dev_is_disconnected()
it calls) must be used judiciously, but here it seems to have been done
correctly.

One thing to be aware of is that a return value of "true" from
pci_dev_is_disconnected() is definitive and can be trusted.
On the other hand a return value of "false" is more like a fuzzy
"likely not disconnected, but can't give any guarantees".
So the boolean return value is kind of the problem here.
Boolean logic doesn't really fit these "definitive if true,
not definitive if false" semantics.

However being able to get the definitive answer in the disconnected
case is valuable:  pciehp is the only entity that can determine
surprise removal authoritatively and unambiguously (albeit with
a latency).  All the other tools that we have at our disposal don't
have that quality:  E.g. checking the Vendor ID is ambiguous because
it returns a valid value if a device was quickly replaced with another
one.  Also, all ones may be returned in the case of an Uncorrectable
Error, but the device may revert to valid responses if the error can
be recovered.  (Please correct me if I'm wrong.)

Thanks,

Lukas
Bjorn Helgaas Sept. 27, 2018, 6:47 p.m. UTC | #3
[+cc Ben, Russell, Sam, Oliver in case they have pertinent experience from
powerpc error handling; thread begins at
https://lore.kernel.org/linux-pci/1537935759-14754-1-git-send-email-suganath-prabu.subramani@broadcom.com/]

On Thu, Sep 27, 2018 at 09:03:27AM +0200, Lukas Wunner wrote:
> On Wed, Sep 26, 2018 at 04:32:41PM -0500, Bjorn Helgaas wrote:
> > On Wed, Sep 26, 2018 at 09:52:34AM +0530, Suganath Prabu S wrote:
> > > @@ -6853,6 +6872,13 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
> > >  
> > >  	ioc->pending_io_count = 0;
> > >  
> > > +	if (!mpt3sas_base_pci_device_is_available(ioc)) {
> > > +		pr_err(MPT3SAS_FMT
> > > +		    "%s: pci error recovery reset or pci device unplug occurred\n",
> > > +		    ioc->name, __func__);
> > > +		return;
> > > +	}
> > > +
> > >  	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
> > 
> > This is a good example of why I don't like pci_device_is_present(): it
> > is fundamentally racy and gives a false sense of security.  Here we
> > *think* we're making the code safer, but in fact we could have this
> > sequence:
> > 
> >   mpt3sas_base_pci_device_is_available()    # returns true
> >   # device is removed
> >   ioc_state = mpt3sas_base_get_iocstate()
> > 
> > In this case the readl() inside mpt3sas_base_get_iocstate() will
> > probably return 0xffffffff data, and we assume that's valid and
> > continue on our merry way, pretending that "ioc_state" makes sense
> > when it really doesn't.
> 
> The function does the following:
> 
> 	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
> 	if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
> 		return;
> 
> where MPI2_IOC_STATE_MASK is 0xF0000000 and MPI2_IOC_STATE_OPERATIONAL
> is 0x20000000.  If the device is removed after the call to
> mpt3sas_base_pci_device_is_available(), the result of the bitwise "and"
> operation would be 0xF0000000, which is unequal to 0x20000000.
> Hence this looks safe.

I agree this particular case is technically safe, but figuring that
out requires an unreasonable amount of analysis.  And there's no hint
in the code that we need to be concerned about whether the readl()
returns valid data, so the need for the analysis won't even occur to
most readers.

I don't feel good about encouraging this style of adding an explicit
test for whether the device is available, followed by a completely
implicit test that accidentally happens to correctly handle a device
that was removed after the explicit test.

If we instead added a test for ~0 after the readl(), we would avoid
the race and give the reader a clue that *any* read from the device
can potentially fail without advance warning.

> I agree that pci_device_is_present() (and the pci_dev_is_disconnected()
> it calls) must be used judiciously, but here it seems to have been done
> correctly.
> 
> One thing to be aware of is that a return value of "true" from
> pci_dev_is_disconnected() is definitive and can be trusted.
> On the other hand a return value of "false" is more like a fuzzy
> "likely not disconnected, but can't give any guarantees".
> So the boolean return value is kind of the problem here.
> Boolean logic doesn't really fit these "definitive if true,
> not definitive if false" semantics.
> 
> However being able to get the definitive answer in the disconnected
> case is valuable:  pciehp is the only entity that can determine
> surprise removal authoritatively and unambiguously (albeit with
> a latency).  All the other tools that we have at our disposal don't
> have that quality:  E.g. checking the Vendor ID is ambiguous because
> it returns a valid value if a device was quickly replaced with another
> one.  Also, all ones may be returned in the case of an Uncorrectable
> Error, but the device may revert to valid responses if the error can
> be recovered.  (Please correct me if I'm wrong.)

I think everything you said above is true, but I'm not yet convinced
that it's being applied usefully in mpt3sas.

  bool pci_dev_is_disconnected(pdev)       # "true" is definitive
  {
    return test_bit(PCI_DEV_DISCONNECTED, &dev->priv_flags);
  }

  bool pci_device_is_present(pdev)         # "false" is definitive
  {
    if (pci_dev_is_disconnected(pdev))
      return false;
    return pci_bus_read_dev_vendor_id(...);
  }

  mpt3sas_base_pci_device_is_available(ioc)  # "false" is definitive
  {
    return !ioc->pci_error_recovery && pci_device_is_present(ioc->pdev);
  }

  mpt3sas_wait_for_commands_to_complete(...)
  {
    ...
 +  if (!mpt3sas_base_pci_device_is_available(ioc))
 +    return;

    # In the definitive case, we returned above.  If we get here, we
    # think the device *might* be present.  The readl() inside
    # mpt3sas_base_get_iocstate() might fail (returning ~0 data).
    # It happens that (~0 & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL
    # so we will return below if the device was removed after the
    # check above.

    ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
    if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
      return;
    ...


I think this code is unreasonably complicated.  The multiple layers
and negations make it very difficult to figure out what's definitive.

I'm not sure how mpt3sas benefits from adding
mpt3sas_base_pci_device_is_available() here, other than the fact that
it avoids an MMIO read to the device in most cases.  I think it would
be simpler and better to make mpt3sas_base_get_iocstate() smarter
about handling ~0 values from the readl().

Bjorn
Lukas Wunner Sept. 27, 2018, 7:09 p.m. UTC | #4
On Thu, Sep 27, 2018 at 01:47:46PM -0500, Bjorn Helgaas wrote:
>   mpt3sas_wait_for_commands_to_complete(...)
>   {
>     ...
>  +  if (!mpt3sas_base_pci_device_is_available(ioc))
>  +    return;
> 
>     # In the definitive case, we returned above.  If we get here, we
>     # think the device *might* be present.  The readl() inside
>     # mpt3sas_base_get_iocstate() might fail (returning ~0 data).
>     # It happens that (~0 & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL
>     # so we will return below if the device was removed after the
>     # check above.
> 
>     ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
>     if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
>       return;
>     ...
> 
> 
> I think this code is unreasonably complicated.  The multiple layers
> and negations make it very difficult to figure out what's definitive.

I agree there's room for improvement.


> I'm not sure how mpt3sas benefits from adding
> mpt3sas_base_pci_device_is_available() here, other than the fact that
> it avoids an MMIO read to the device in most cases.  I think it would
> be simpler and better to make mpt3sas_base_get_iocstate() smarter
> about handling ~0 values from the readl().

Avoiding an MMIO read when it's known to run into a Completion Timeout
buys about 17 ms.  If the function is executed many times (I don't know
if that's the case here, I'm referring to the general case), or if bailing
out of it early avoids significant amounts of further I/O, then checking
for disconnectedness makes sense.

The 17 ms number is from this talk:
https://www.youtube.com/watch?v=GJ6B0xzgvlM&t=832

It's the typical Completion Timeout on an Intel chip, but it can be up to
50 ms in the default setting or up to 64 s if so configured in the Device
Control 2 register (PCIe r4.0 sec 7.8.16).

Thanks,

Lukas
Suganath Prabu S Oct. 1, 2018, 6:57 a.m. UTC | #5
Hi Lucas and Bjorn,
Thanks for reviewing and providing useful comments.

On Fri, Sep 28, 2018 at 12:40 AM Lukas Wunner <lukas@wunner.de> wrote:
>
> On Thu, Sep 27, 2018 at 01:47:46PM -0500, Bjorn Helgaas wrote:
> >   mpt3sas_wait_for_commands_to_complete(...)
> >   {
> >     ...
> >  +  if (!mpt3sas_base_pci_device_is_available(ioc))
> >  +    return;
> >
> >     # In the definitive case, we returned above.  If we get here, we
> >     # think the device *might* be present.  The readl() inside
> >     # mpt3sas_base_get_iocstate() might fail (returning ~0 data).
> >     # It happens that (~0 & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL
> >     # so we will return below if the device was removed after the
> >     # check above.
> >
> >     ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
> >     if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
> >       return;
> >     ...
> >
> >
> > I think this code is unreasonably complicated.  The multiple layers
> > and negations make it very difficult to figure out what's definitive.
>
> I agree there's room for improvement.
>
>
> > I'm not sure how mpt3sas benefits from adding
> > mpt3sas_base_pci_device_is_available() here, other than the fact that
> > it avoids an MMIO read to the device in most cases.  I think it would
> > be simpler and better to make mpt3sas_base_get_iocstate() smarter
> > about handling ~0 values from the readl().
>
> Avoiding an MMIO read when it's known to run into a Completion Timeout
> buys about 17 ms.  If the function is executed many times (I don't know
> if that's the case here, I'm referring to the general case), or if bailing
> out of it early avoids significant amounts of further I/O, then checking
> for disconnectedness makes sense.
>
> The 17 ms number is from this talk:
> https://www.youtube.com/watch?v=GJ6B0xzgvlM&t=832
>
> It's the typical Completion Timeout on an Intel chip, but it can be up to
> 50 ms in the default setting or up to 64 s if so configured in the Device
> Control 2 register (PCIe r4.0 sec 7.8.16).
>

This function is called only during controller reset, system shutdown
and driver unload operations.
For this case we can remove mpt3sas_base_pci_device_is_available() check,
since we can make the device removal from readl in
mpt3sas_base_get_iocstate() as you suggested.
But we need mpt3sas_base_pci_device_is_available() in other places
like while submitting the
IO/TMs to the HBA firmware etc where driver is not checking for the
IOC state (through readl() operation)
 to gracefully handle the HBA hot-unplug operation.

> Thanks,
>
> Lukas
Bjorn Helgaas Oct. 1, 2018, 8:40 p.m. UTC | #6
On Mon, Oct 01, 2018 at 12:27:28PM +0530, Suganath Prabu Subramani wrote:
> On Fri, Sep 28, 2018 at 12:40 AM Lukas Wunner <lukas@wunner.de> wrote:
> > On Thu, Sep 27, 2018 at 01:47:46PM -0500, Bjorn Helgaas wrote:

> > > I'm not sure how mpt3sas benefits from adding
> > > mpt3sas_base_pci_device_is_available() here, other than the fact that
> > > it avoids an MMIO read to the device in most cases.  I think it would
> > > be simpler and better to make mpt3sas_base_get_iocstate() smarter
> > > about handling ~0 values from the readl().
> >
> > Avoiding an MMIO read when it's known to run into a Completion Timeout
> > buys about 17 ms.  If the function is executed many times (I don't know
> > if that's the case here, I'm referring to the general case), or if bailing
> > out of it early avoids significant amounts of further I/O, then checking
> > for disconnectedness makes sense.

I agree that if we know the device is gone, it's helpful to avoid
further I/O to it.  The misleading pattern used in this patch is:

  R1) Check for device presence
  R2) Read from device
  R3) Consume data from device

That promotes the misconception that "the device is present, so we got
valid data from it."  But in fact the device may disappear between R1
and R2, so the following pattern is better:

  A) Read from device
  B) Check data for validity, e.g., look for ~0
  C) Consume data if valid

If we're writing to the device, we don't expect any response, and it
makes sense to do:

  W1) Check for device presence
  W2) If device present, write to device

Obviously the device can disappear between W1 and W2, so this can't
eliminate *all* writes to non-existent devices, but if we want to
reduce them and we're only doing writes to the device (with no reads),
this is the best we can do.

We can learn that the device is gone in several ways: pciehp might
notice the link is down, the driver might notice a ~0 return, etc.

The driver writer knows the structure of communication with the
device, so he/she should know the appropriate places to check for
valid data from reads and where to check to minimize the number of
writes to a non-existent device.

> This function is called only during controller reset, system shutdown
> and driver unload operations.
> For this case we can remove mpt3sas_base_pci_device_is_available() check,
> since we can make the device removal from readl in
> mpt3sas_base_get_iocstate() as you suggested.

> But we need mpt3sas_base_pci_device_is_available() in other places
> like while submitting the
> IO/TMs to the HBA firmware etc where driver is not checking for the
> IOC state (through readl() operation)
> to gracefully handle the HBA hot-unplug operation.

This is the W1/W2 case above, and what you can do is constrained by
the device model, i.e., where it requires reads from the device.  I
agree you may want to check whether the device is absent to minimize
writes after a device has been removed.

I think the names "pci_device_is_present()" and
"mpt3sas_base_pci_device_is_available()" contribute to the problem
because they make promises that can't be kept -- all we can say is
that the device *was* present, but we know whether it is *still*
present.  I think it would be better if the interfaces were something
like "pci_device_is_absent()" because that gives a result we can rely
on.  If that returns true, we know the device is definitely gone.

Bjorn
Bjorn Helgaas Oct. 2, 2018, 2:04 p.m. UTC | #7
On Mon, Oct 01, 2018 at 03:40:51PM -0500, Bjorn Helgaas wrote:
> I think the names "pci_device_is_present()" and
> "mpt3sas_base_pci_device_is_available()" contribute to the problem
> because they make promises that can't be kept -- all we can say is
> that the device *was* present, but we know whether it is *still*
> present.

Oops, I meant "we DON'T know whether it is still present."

> I think it would be better if the interfaces were something
> like "pci_device_is_absent()" because that gives a result we can rely
> on.  If that returns true, we know the device is definitely gone.
> 
> Bjorn
Suganath Prabu S Oct. 8, 2018, 6:44 a.m. UTC | #8
On Tue, Oct 2, 2018 at 7:34 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Mon, Oct 01, 2018 at 03:40:51PM -0500, Bjorn Helgaas wrote:
> > I think the names "pci_device_is_present()" and
> > "mpt3sas_base_pci_device_is_available()" contribute to the problem
> > because they make promises that can't be kept -- all we can say is
> > that the device *was* present, but we know whether it is *still*
> > present.

Bjorn,

In the patch we are using '!' (i.e. not operation) of pci_device_is_present(),
which is logically same as pci_device_is absent, and it is
same for mpt3sas_base_pci_device_is_available().
My understanding is that, you want us to rename these functions for
better readability
Is that correct ?


> Oops, I meant "we DON'T know whether it is still present."
>
> > I think it would be better if the interfaces were something
> > like "pci_device_is_absent()" because that gives a result we can rely
> > on.  If that returns true, we know the device is definitely gone.
> >
> > Bjorn
Sreekanth Reddy Oct. 12, 2018, 5:47 a.m. UTC | #9
HI Bjorn,

Please provide your valuable suggestion/reply here.

Thank you,
Sreekanth

-----Original Message-----
From: Suganath Prabu Subramani
[mailto:suganath-prabu.subramani@broadcom.com]
Sent: Monday, October 8, 2018 12:15 PM
To: helgaas@kernel.org
Cc: lukas@wunner.de; linux-scsi@vger.kernel.org; linux-pci@vger.kernel.org;
Andy Shevchenko; Sathya Prakash; Sreekanth Reddy;
linux-kernel@vger.kernel.org; benh@kernel.crashing.org; ruscur@russell.cc;
sbobroff@linux.ibm.com; Oliver
Subject: Re: [PATCH v4 1/6] mpt3sas: Introduce
mpt3sas_base_pci_device_is_available

On Tue, Oct 2, 2018 at 7:34 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
>
> On Mon, Oct 01, 2018 at 03:40:51PM -0500, Bjorn Helgaas wrote:
> > I think the names "pci_device_is_present()" and
> > "mpt3sas_base_pci_device_is_available()" contribute to the problem
> > because they make promises that can't be kept -- all we can say is
> > that the device *was* present, but we know whether it is *still*
> > present.

Bjorn,

In the patch we are using '!' (i.e. not operation) of
pci_device_is_present(),
which is logically same as pci_device_is absent, and it is
same for mpt3sas_base_pci_device_is_available().
My understanding is that, you want us to rename these functions for
better readability
Is that correct ?


> Oops, I meant "we DON'T know whether it is still present."
>
> > I think it would be better if the interfaces were something
> > like "pci_device_is_absent()" because that gives a result we can rely
> > on.  If that returns true, we know the device is definitely gone.
> >
> > Bjorn
Bjorn Helgaas Oct. 12, 2018, 11:43 p.m. UTC | #10
On Mon, Oct 08, 2018 at 12:14:40PM +0530, Suganath Prabu Subramani wrote:
> On Tue, Oct 2, 2018 at 7:34 PM Bjorn Helgaas <helgaas@kernel.org> wrote:
> > On Mon, Oct 01, 2018 at 03:40:51PM -0500, Bjorn Helgaas wrote:
> > > I think the names "pci_device_is_present()" and
> > > "mpt3sas_base_pci_device_is_available()" contribute to the
> > > problem because they make promises that can't be kept -- all we
> > > can say is that the device *was* present, but we don't know
> > > whether it is *still* present.
> 
> In the patch we are using '!' (i.e. not operation) of
> pci_device_is_present(), which is logically same as pci_device_is
> absent, and it is same for mpt3sas_base_pci_device_is_available().
>
> My understanding is that, you want us to rename these functions for
> better readability.  Is that correct ?

I think "pci_device_is_present()" is a poor name, but I'm not
suggesting you fix it in this patch.  Changing that would be a PCI
core change that would also touch the tg3, igb, nvme, and now mpt3sas
drivers that use it.

My personal opinion is that you should do something like the patch
below.  The main point is that you should for the device being
unreachable *at the places where you might learn that*.

If you WRITE to a device that has been removed, the write will mostly
likely get dropped and you won't learn anything.  But when you READ
from a device that has been removed, you'll most likely get ~0 data
back, and you can check for that.

Of course, it's also possible that pci_device_is_present() can tell
you something.  So you *could* sprinkle those all over, as in your
patch.  But you end up with code like this:

  if (!pci_device_is_present())
    return;

  writel();
  readl();

Here, the device might be removed right after pci_device_is_present()
returns true.  You do the writel() and the readl() anyway, so you
really haven't gained anything.  And if the readl() returned ~0, you
assume it's valid data when it's not.

It's better to do this:

  writel();
  val = readl();
  if (val == ~0) {
    /* device is unreachable, clean things up */
    ...
  }

(Obviously it's possible that ~0 is a valid value for whatever you
read from the device.  That depends on the device and you have to use
your knowledge of the hardware.  If you read ~0 from a register where
that might be a valid value, you can read from a different register
for with ~0 is *not* a valid value.)

I don't claim the patch below is complete because I don't know
anything about your hardware and what sort of registers or ring
buffers it uses.  You still have to arrange to flush or complete
whatever is outstanding when you learn the device is gone.

Bjorn


diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 59d7844ee022..37b09362b31a 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -6145,6 +6145,11 @@ _base_diag_reset(struct MPT3SAS_ADAPTER *ioc)
 		drsprintk(ioc, pr_info(MPT3SAS_FMT
 			"wrote magic sequence: count(%d), host_diagnostic(0x%08x)\n",
 		    ioc->name, count, host_diagnostic));
+		if (host_diagnostic == ~0) {
+			ioc->remove_host = 1;
+			pr_err(MPT3SAS_FMT "HBA unreachable\n", ioc->name);
+			goto out;
+		}
 
 	} while ((host_diagnostic & MPI2_DIAG_DIAG_WRITE_ENABLE) == 0);
 
@@ -6237,6 +6242,11 @@ _base_make_ioc_ready(struct MPT3SAS_ADAPTER *ioc, enum reset_type type)
 	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
 	dhsprintk(ioc, pr_info(MPT3SAS_FMT "%s: ioc_state(0x%08x)\n",
 	    ioc->name, __func__, ioc_state));
+	if (ioc_state == ~0) {
+		pr_err(MPT3SAS_FMT "%s: HBA unreachable (ioc_state=0x%x)\n",
+			ioc->name, __func__, ioc_state);
+		return -EFAULT;
+	}
 
 	/* if in RESET state, it should move to READY state shortly */
 	count = 0;
@@ -6251,6 +6261,11 @@ _base_make_ioc_ready(struct MPT3SAS_ADAPTER *ioc, enum reset_type type)
 			}
 			ssleep(1);
 			ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
+			if (ioc_state == ~0) {
+				pr_err(MPT3SAS_FMT "%s: HBA unreachable (ioc_state=0x%x)\n",
+					ioc->name, __func__, ioc_state);
+				return -EFAULT;
+			}
 		}
 	}
 
@@ -6854,6 +6869,11 @@ mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
 	ioc->pending_io_count = 0;
 
 	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
+	if (ioc_state == ~0) {
+		pr_err(MPT3SAS_FMT "%s: HBA unreachable\n",
+		       ioc->name, __func__);
+		return;
+	}
 	if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
 		return;
 
@@ -6909,6 +6929,14 @@ mpt3sas_base_hard_reset_handler(struct MPT3SAS_ADAPTER *ioc,
 	    MPT3_DIAG_BUFFER_IS_RELEASED))) {
 		is_trigger = 1;
 		ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
+		if (ioc_state == ~0) {
+			pr_err(MPT3SAS_FMT "%s: HBA unreachable\n",
+			       ioc->name, __func__);
+			ioc->schedule_dead_ioc_flush_running_cmds(ioc);
+			r = 0;
+			mutex_unlock(&ioc->reset_in_progress_mutex);
+			goto out_unlocked;
+		}
 		if ((ioc_state & MPI2_IOC_STATE_MASK) == MPI2_IOC_STATE_FAULT)
 			is_fault = 1;
 	}
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 53133cfd420f..d0d4c8d94a10 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2653,6 +2653,11 @@ mpt3sas_scsih_issue_tm(struct MPT3SAS_ADAPTER *ioc, u16 handle, u64 lun,
 	}
 
 	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
+	if (ioc_state == ~0) {
+		pr_info(MPT3SAS_FMT "%s: HBA unreachable\n",
+                    __func__, ioc->name);
+		return FAILED;
+	}
 	if (ioc_state & MPI2_DOORBELL_USED) {
 		dhsprintk(ioc, pr_info(MPT3SAS_FMT
 			"unexpected doorbell active!\n", ioc->name));
diff mbox series

Patch

==============
unlock mutex before goto "out_unlocked",
if active reset is in progress.

v2 change set:
==============
1) Use pci_device_is_present instead of
mpt3sas_base_pci_device_is_unplugged.
2) As suggested by Lukas, removed using
watchdog thread for checking hba hot unplug(Patch 02 of V1).
Added Hot unplug checks in scan finish and reset paths.

v3 Change Set:
=============
Simplified function "mpt3sas_base_pci_device_is_available" and
made inline.

v4 Changes:
===========
Dont split strings in print statement.

Signed-off-by: Suganath Prabu S <suganath-prabu.subramani@broadcom.com>
---
 drivers/scsi/mpt3sas/mpt3sas_base.c  | 39 ++++++++++++++++++++++++++++
 drivers/scsi/mpt3sas/mpt3sas_base.h  |  3 ++-
 drivers/scsi/mpt3sas/mpt3sas_scsih.c | 50 ++++++++++++++++++++++++++++++++----
 3 files changed, 86 insertions(+), 6 deletions(-)

diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 59d7844..c880e72 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -543,6 +543,20 @@  static int mpt3sas_remove_dead_ioc_func(void *arg)
 }
 
 /**
+ * mpt3sas_base_pci_device_is_available - check whether pci device is
+ *			available for any transactions with FW
+ *
+ * @ioc: per adapter object
+ *
+ * Return 1 if pci device state is up and running else return 0.
+ */
+inline bool
+mpt3sas_base_pci_device_is_available(struct MPT3SAS_ADAPTER *ioc)
+{
+	return !ioc->pci_error_recovery && pci_device_is_present(ioc->pdev);
+}
+
+/**
  * _base_fault_reset_work - workq handling ioc fault conditions
  * @work: input argument, used to derive ioc
  *
@@ -6122,6 +6136,11 @@  _base_diag_reset(struct MPT3SAS_ADAPTER *ioc)
 
 	count = 0;
 	do {
+		if (!pci_device_is_present(ioc->pdev)) {
+			ioc->remove_host = 1;
+			pr_err(MPT3SAS_FMT "Hba Hot unplugged\n", ioc->name);
+			goto out;
+		}
 		/* Write magic sequence to WriteSequence register
 		 * Loop until in diagnostic mode
 		 */
@@ -6853,6 +6872,13 @@  mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc)
 
 	ioc->pending_io_count = 0;
 
+	if (!mpt3sas_base_pci_device_is_available(ioc)) {
+		pr_err(MPT3SAS_FMT
+		    "%s: pci error recovery reset or pci device unplug occurred\n",
+		    ioc->name, __func__);
+		return;
+	}
+
 	ioc_state = mpt3sas_base_get_iocstate(ioc, 0);
 	if ((ioc_state & MPI2_IOC_STATE_MASK) != MPI2_IOC_STATE_OPERATIONAL)
 		return;
@@ -6899,6 +6925,19 @@  mpt3sas_base_hard_reset_handler(struct MPT3SAS_ADAPTER *ioc,
 	/* wait for an active reset in progress to complete */
 	mutex_lock(&ioc->reset_in_progress_mutex);
 
+	if (!mpt3sas_base_pci_device_is_available(ioc)) {
+		pr_err(MPT3SAS_FMT
+		    "%s: pci error recovery reset or pci device unplug occurred\n",
+		    ioc->name, __func__);
+		if (!pci_device_is_present(ioc->pdev))
+			ioc->schedule_dead_ioc_flush_running_cmds(ioc);
+		r = 0;
+		mutex_unlock(&ioc->reset_in_progress_mutex);
+		goto out_unlocked;
+	}
+
+	mpt3sas_halt_firmware(ioc);
+
 	spin_lock_irqsave(&ioc->ioc_reset_in_progress_lock, flags);
 	ioc->shost_recovery = 1;
 	spin_unlock_irqrestore(&ioc->ioc_reset_in_progress_lock, flags);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.h b/drivers/scsi/mpt3sas/mpt3sas_base.h
index 96dc15e..a802ad4 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.h
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.h
@@ -1474,7 +1474,8 @@  void mpt3sas_base_update_missing_delay(struct MPT3SAS_ADAPTER *ioc,
 	u16 device_missing_delay, u8 io_missing_delay);
 
 int mpt3sas_port_enable(struct MPT3SAS_ADAPTER *ioc);
-
+inline bool  mpt3sas_base_pci_device_is_available(
+	struct MPT3SAS_ADAPTER *ioc);
 void
 mpt3sas_wait_for_commands_to_complete(struct MPT3SAS_ADAPTER *ioc);
 
diff --git a/drivers/scsi/mpt3sas/mpt3sas_scsih.c b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
index 53133cf..566a550 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_scsih.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_scsih.c
@@ -2846,9 +2846,19 @@  scsih_abort(struct scsi_cmnd *scmd)
 		"attempting task abort! scmd(%p)\n", scmd);
 	_scsih_tm_display_info(ioc, scmd);
 
+	if (!pci_device_is_present(ioc->pdev) || ioc->remove_host) {
+		sdev_printk(KERN_INFO, scmd->device, "%s scmd(%p)\n",
+		    ((ioc->remove_host) ? ("shost is getting removed!") :
+		    ("pci device been removed!")), scmd);
+		if (st && st->smid)
+			mpt3sas_base_free_smid(ioc, st->smid);
+		scmd->result = DID_NO_CONNECT << 16;
+		r = FAST_IO_FAIL;
+		goto out;
+	}
+
 	sas_device_priv_data = scmd->device->hostdata;
-	if (!sas_device_priv_data || !sas_device_priv_data->sas_target ||
-	    ioc->remove_host) {
+	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
 		sdev_printk(KERN_INFO, scmd->device,
 			"device been deleted! scmd(%p)\n", scmd);
 		scmd->result = DID_NO_CONNECT << 16;
@@ -2918,6 +2928,15 @@  scsih_dev_reset(struct scsi_cmnd *scmd)
 		"attempting device reset! scmd(%p)\n", scmd);
 	_scsih_tm_display_info(ioc, scmd);
 
+	if (!pci_device_is_present(ioc->pdev) || ioc->remove_host) {
+		sdev_printk(KERN_INFO, scmd->device, "%s scmd(%p)\n",
+		    ((ioc->remove_host) ? ("shost is getting removed!") :
+		    ("pci device been removed!")), scmd);
+		scmd->result = DID_NO_CONNECT << 16;
+		r = FAST_IO_FAIL;
+		goto out;
+	}
+
 	sas_device_priv_data = scmd->device->hostdata;
 	if (!sas_device_priv_data || !sas_device_priv_data->sas_target ||
 	    ioc->remove_host) {
@@ -2995,9 +3014,17 @@  scsih_target_reset(struct scsi_cmnd *scmd)
 		scmd);
 	_scsih_tm_display_info(ioc, scmd);
 
+	if ((!pci_device_is_present(ioc->pdev)) || ioc->remove_host) {
+		sdev_printk(KERN_INFO, scmd->device, "%s scmd(%p)\n",
+		    ((ioc->remove_host) ? ("shost is getting removed!") :
+		    ("pci device been removed!")), scmd);
+		scmd->result = DID_NO_CONNECT << 16;
+		r = FAST_IO_FAIL;
+		goto out;
+	}
+
 	sas_device_priv_data = scmd->device->hostdata;
-	if (!sas_device_priv_data || !sas_device_priv_data->sas_target ||
-	    ioc->remove_host) {
+	if (!sas_device_priv_data || !sas_device_priv_data->sas_target) {
 		starget_printk(KERN_INFO, starget, "target been deleted! scmd(%p)\n",
 			scmd);
 		scmd->result = DID_NO_CONNECT << 16;
@@ -4474,7 +4501,9 @@  _scsih_flush_running_cmds(struct MPT3SAS_ADAPTER *ioc)
 		st = scsi_cmd_priv(scmd);
 		mpt3sas_base_clear_st(ioc, st);
 		scsi_dma_unmap(scmd);
-		if (ioc->pci_error_recovery || ioc->remove_host)
+
+		if ((!mpt3sas_base_pci_device_is_available(ioc))
+				|| ioc->remove_host)
 			scmd->result = DID_NO_CONNECT << 16;
 		else
 			scmd->result = DID_RESET << 16;
@@ -9726,6 +9755,9 @@  _scsih_ir_shutdown(struct MPT3SAS_ADAPTER *ioc)
 	if (list_empty(&ioc->raid_device_list))
 		return;
 
+	if (!pci_device_is_present(ioc->pdev))
+		return;
+
 	mutex_lock(&ioc->scsih_cmds.mutex);
 
 	if (ioc->scsih_cmds.status != MPT3_CMD_NOT_USED) {
@@ -10247,6 +10279,14 @@  scsih_scan_finished(struct Scsi_Host *shost, unsigned long time)
 {
 	struct MPT3SAS_ADAPTER *ioc = shost_priv(shost);
 
+	if (!pci_device_is_present(ioc->pdev)) {
+		complete(&ioc->port_enable_cmds.done);
+		ioc->port_enable_cmds.status = MPT3_CMD_NOT_USED;
+		ioc->is_driver_loading = 0;
+		ioc->remove_host = 1;
+		return 1;
+	}
+
 	if (disable_discovery > 0) {
 		ioc->is_driver_loading = 0;
 		ioc->wait_for_discovery_to_complete = 0;