diff mbox

[v3,SCSI] mpt2sas, mpt3sas: Abort initialization if no memory I/O resources detected

Message ID 1436935796-28995-1-git-send-email-Sreekanth.Reddy@avagotech.com (mailing list archive)
State New, archived
Headers show

Commit Message

Sreekanth Reddy July 15, 2015, 4:49 a.m. UTC
Driver crashes if the BIOS do not set up at least one
memory I/O resource. This failure can happen if the device is too
slow to respond during POST and is missed by the BIOS, but Linux
then detects the device later in the boot process.

Changes in v3:
   Rearranged the code to remove the code redundancy

Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
---
 drivers/scsi/mpt2sas/mpt2sas_base.c | 16 +++++++++-------
 drivers/scsi/mpt3sas/mpt3sas_base.c | 16 +++++++++-------
 2 files changed, 18 insertions(+), 14 deletions(-)

Comments

Yinghai Lu July 15, 2015, 5:36 a.m. UTC | #1
On Tue, Jul 14, 2015 at 9:49 PM, Sreekanth Reddy
<sreekanth.reddy@avagotech.com> wrote:
> Driver crashes if the BIOS do not set up at least one
> memory I/O resource. This failure can happen if the device is too
> slow to respond during POST and is missed by the BIOS, but Linux
> then detects the device later in the boot process.

But pci subsystem should assign resources to those unassigned BAR.

Do you mean even kernel can not assign resource to them? or it takes so long for
mpt FW to get ready?

Thanks

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Sreekanth Reddy July 15, 2015, 6:24 a.m. UTC | #2
On Tue, Jul 14, 2015 at 10:36:58PM -0700, Yinghai Lu wrote:
> On Tue, Jul 14, 2015 at 9:49 PM, Sreekanth Reddy
> <sreekanth.reddy@avagotech.com> wrote:
> > Driver crashes if the BIOS do not set up at least one
> > memory I/O resource. This failure can happen if the device is too
> > slow to respond during POST and is missed by the BIOS, but Linux
> > then detects the device later in the boot process.
> 
> But pci subsystem should assign resources to those unassigned BAR.
> 
> Do you mean even kernel can not assign resource to them? or it takes so long for
> mpt FW to get ready?

This is not an issue from mpt FW.

I have just kept the same description provide by Timothy in his
initial patch.

But I observe that their may be chance of getting "unable to handle
kernel NULL pointer dereference" kernel panic if no Memory Resource
available in the PCI subsystem. So agreed to the Timothy proposal of
aborting the driver initialization if it doesn't detect any Memory
resource instead of whole system get into panic state.

> 
> Thanks
> 
> Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Timothy Pearson July 15, 2015, 1:52 p.m. UTC | #3
On 07/15/2015 01:24 AM, Sreekanth Reddy wrote:
> On Tue, Jul 14, 2015 at 10:36:58PM -0700, Yinghai Lu wrote:
>> On Tue, Jul 14, 2015 at 9:49 PM, Sreekanth Reddy
>> <sreekanth.reddy@avagotech.com> wrote:
>>> Driver crashes if the BIOS do not set up at least one
>>> memory I/O resource. This failure can happen if the device is too
>>> slow to respond during POST and is missed by the BIOS, but Linux
>>> then detects the device later in the boot process.
>>
>> But pci subsystem should assign resources to those unassigned BAR.
>>
>> Do you mean even kernel can not assign resource to them? or it takes so long for
>> mpt FW to get ready?
> 
> This is not an issue from mpt FW.
> 
> I have just kept the same description provide by Timothy in his
> initial patch.
> 
> But I observe that their may be chance of getting "unable to handle
> kernel NULL pointer dereference" kernel panic if no Memory Resource
> available in the PCI subsystem. So agreed to the Timothy proposal of
> aborting the driver initialization if it doesn't detect any Memory
> resource instead of whole system get into panic state.
> 
>>
>> Thanks
>>
>> Yinghai

On some systems Linux is unable / unwilling to assign a BAR if the BIOS
does not assign one at startup.  I didn't look into the Linux allocator
side of things in much detail, but it is quite possible that Linux is
unaware the device only has partial resources assigned.
Bjorn Helgaas July 15, 2015, 2:59 p.m. UTC | #4
On Wed, Jul 15, 2015 at 08:52:13AM -0500, Timothy Pearson wrote:
> On 07/15/2015 01:24 AM, Sreekanth Reddy wrote:
> > On Tue, Jul 14, 2015 at 10:36:58PM -0700, Yinghai Lu wrote:
> >> On Tue, Jul 14, 2015 at 9:49 PM, Sreekanth Reddy
> >> <sreekanth.reddy@avagotech.com> wrote:
> >>> Driver crashes if the BIOS do not set up at least one
> >>> memory I/O resource. This failure can happen if the device is too
> >>> slow to respond during POST and is missed by the BIOS, but Linux
> >>> then detects the device later in the boot process.
> >>
> >> But pci subsystem should assign resources to those unassigned BAR.
> >>
> >> Do you mean even kernel can not assign resource to them? or it takes so long for
> >> mpt FW to get ready?
> > 
> > This is not an issue from mpt FW.
> > 
> > I have just kept the same description provide by Timothy in his
> > initial patch.
> > 
> > But I observe that their may be chance of getting "unable to handle
> > kernel NULL pointer dereference" kernel panic if no Memory Resource
> > available in the PCI subsystem. So agreed to the Timothy proposal of
> > aborting the driver initialization if it doesn't detect any Memory
> > resource instead of whole system get into panic state.
> 
> On some systems Linux is unable / unwilling to assign a BAR if the BIOS
> does not assign one at startup.  I didn't look into the Linux allocator
> side of things in much detail, but it is quite possible that Linux is
> unaware the device only has partial resources assigned.

There might be a Linux PCI core bug if we don't assign a BAR, although
resource assignment can always fail, even in the absence of bugs.

But there is definitely a driver bug if it uses ioc->chip when it hasn't
been initialized.  That's what this patch seems to fix.

Bjorn
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Yinghai Lu July 15, 2015, 6:28 p.m. UTC | #5
On Wed, Jul 15, 2015 at 6:52 AM, Timothy Pearson
<tpearson@raptorengineeringinc.com> wrote:
>> I have just kept the same description provide by Timothy in his
>> initial patch.
>>
>> But I observe that their may be chance of getting "unable to handle
>> kernel NULL pointer dereference" kernel panic if no Memory Resource
>> available in the PCI subsystem. So agreed to the Timothy proposal of
>> aborting the driver initialization if it doesn't detect any Memory
>> resource instead of whole system get into panic state.
>>
> On some systems Linux is unable / unwilling to assign a BAR if the BIOS
> does not assign one at startup.  I didn't look into the Linux allocator
> side of things in much detail, but it is quite possible that Linux is
> unaware the device only has partial resources assigned.
>

Would be great if you can post boot log so we can figure about why those
BARs are not assigned.

Yinghai
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Timothy Pearson July 15, 2015, 6:46 p.m. UTC | #6
On 07/15/2015 01:28 PM, Yinghai Lu wrote:
> On Wed, Jul 15, 2015 at 6:52 AM, Timothy Pearson
> <tpearson@raptorengineeringinc.com> wrote:
>>> I have just kept the same description provide by Timothy in his
>>> initial patch.
>>>
>>> But I observe that their may be chance of getting "unable to handle
>>> kernel NULL pointer dereference" kernel panic if no Memory Resource
>>> available in the PCI subsystem. So agreed to the Timothy proposal of
>>> aborting the driver initialization if it doesn't detect any Memory
>>> resource instead of whole system get into panic state.
>>>
>> On some systems Linux is unable / unwilling to assign a BAR if the BIOS
>> does not assign one at startup.  I didn't look into the Linux allocator
>> side of things in much detail, but it is quite possible that Linux is
>> unaware the device only has partial resources assigned.
>>
> 
> Would be great if you can post boot log so we can figure about why those
> BARs are not assigned.
> 
> Yinghai

Unfortunately the systems exhibiting the issue were upgraded to a later
BIOS that does not have this problem; it might be possible to set up a
test system in the future but that probably won't happen for some time.
Bjorn Helgaas July 15, 2015, 8:59 p.m. UTC | #7
On Wed, Jul 15, 2015 at 10:19:56AM +0530, Sreekanth Reddy wrote:
> Driver crashes if the BIOS do not set up at least one
> memory I/O resource. This failure can happen if the device is too
> slow to respond during POST and is missed by the BIOS, but Linux
> then detects the device later in the boot process.
> 
> Changes in v3:
>    Rearranged the code to remove the code redundancy
> 
> Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
> ---
>  drivers/scsi/mpt2sas/mpt2sas_base.c | 16 +++++++++-------
>  drivers/scsi/mpt3sas/mpt3sas_base.c | 16 +++++++++-------
>  2 files changed, 18 insertions(+), 14 deletions(-)
> 
> diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
> index 11248de..6dec7cf 100644
> --- a/drivers/scsi/mpt2sas/mpt2sas_base.c
> +++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
> @@ -1557,7 +1557,8 @@ mpt2sas_base_map_resources(struct MPT2SAS_ADAPTER *ioc)
>  		goto out_fail;
>  	}
>  
> -	for (i = 0, memap_sz = 0, pio_sz = 0 ; i < DEVICE_COUNT_RESOURCE; i++) {
> +	for (i = 0, memap_sz = 0, pio_sz = 0; (i < DEVICE_COUNT_RESOURCE) &&
> +	     (!memap_sz || !pio_sz); i++) {
>  		if (pci_resource_flags(pdev, i) & IORESOURCE_IO) {
>  			if (pio_sz)
>  				continue;
> @@ -1572,16 +1573,17 @@ mpt2sas_base_map_resources(struct MPT2SAS_ADAPTER *ioc)
>  				chip_phys = (u64)ioc->chip_phys;
>  				memap_sz = pci_resource_len(pdev, i);
>  				ioc->chip = ioremap(ioc->chip_phys, memap_sz);
> -				if (ioc->chip == NULL) {
> -					printk(MPT2SAS_ERR_FMT "unable to map "
> -					    "adapter memory!\n", ioc->name);
> -					r = -EINVAL;
> -					goto out_fail;
> -				}
>  			}
>  		}
>  	}
>  
> +	if (ioc->chip == NULL) {
> +		printk(MPT2SAS_ERR_FMT "unable to map adapter memory! "
> +		       "or resource not found\n", ioc->name);
> +		r = -EINVAL;
> +		goto out_fail;
> +	}

  - Before this patch, the driver probe succeeds even if it can't find an
    MMIO resource.  That means we would eventually dereference ioc->chip,
    which is zero.  This patch definitely fixes that bug.

  - I raise my eyebrows a little at the "let's use the first MMIO resource
    we find" idea.  A driver should know exactly which BAR it needs, so it
    seems sloppy to search this way.  And it's possible that BIOS or the
    PCI core set up *one* MMIO BAR, but not the one the driver needs.  Then
    the search will find the wrong BAR and the driver won't work.

  - These drivers both do:

      ioc->chip_phys = pci_resource_start(pdev, i);
      ioc->chip = ioremap(ioc->chip_phys, memap_sz);

    Saving ioc->chip_phys is pointless because it's only used to guard
    iounmap(ioc->chip), and the driver probe should fail if it can't find
    an MMIO resource, so the places that do the iounmap can assume
    ioc->chip is always valid.

  - These drivers also search for an I/O resource, but they don't actually
    use it, except to print it out.  I guess that's harmless, but it seems
    sloppy and will print junk on machines where there's no I/O space
    available.  They do use pci_enable_device_mem(), so at least the device
    should work even if no I/O space is available.

  - It'd be nice if they used dev_printk() and %pR instead of their ad hoc
    formatting.

>  	_base_mask_interrupts(ioc);
>  
>  	r = _base_get_ioc_facts(ioc, CAN_SLEEP);
> diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
> index 14a781b..43f87e9 100644
> --- a/drivers/scsi/mpt3sas/mpt3sas_base.c
> +++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
> @@ -1843,7 +1843,8 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc)
>  		goto out_fail;
>  	}
>  
> -	for (i = 0, memap_sz = 0, pio_sz = 0 ; i < DEVICE_COUNT_RESOURCE; i++) {
> +	for (i = 0, memap_sz = 0, pio_sz = 0; (i < DEVICE_COUNT_RESOURCE) &&
> +	     (!memap_sz || !pio_sz); i++) {
>  		if (pci_resource_flags(pdev, i) & IORESOURCE_IO) {
>  			if (pio_sz)
>  				continue;
> @@ -1856,15 +1857,16 @@ mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc)
>  			chip_phys = (u64)ioc->chip_phys;
>  			memap_sz = pci_resource_len(pdev, i);
>  			ioc->chip = ioremap(ioc->chip_phys, memap_sz);
> -			if (ioc->chip == NULL) {
> -				pr_err(MPT3SAS_FMT "unable to map adapter memory!\n",
> -					ioc->name);
> -				r = -EINVAL;
> -				goto out_fail;
> -			}
>  		}
>  	}
>  
> +	if (ioc->chip == NULL) {
> +		pr_err(MPT3SAS_FMT "unable to map adapter memory! "
> +			" or resource not found\n", ioc->name);
> +		r = -EINVAL;
> +		goto out_fail;
> +	}
> +
>  	_base_mask_interrupts(ioc);
>  
>  	r = _base_get_ioc_facts(ioc, CAN_SLEEP);
> -- 
> 2.0.2
> 
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at  http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at  http://www.tux.org/lkml/
--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hannes Reinecke July 27, 2015, 8:19 a.m. UTC | #8
On 07/15/2015 06:49 AM, Sreekanth Reddy wrote:
> Driver crashes if the BIOS do not set up at least one
> memory I/O resource. This failure can happen if the device is too
> slow to respond during POST and is missed by the BIOS, but Linux
> then detects the device later in the boot process.
> 
> Changes in v3:
>    Rearranged the code to remove the code redundancy
> 
> Signed-off-by: Sreekanth Reddy <Sreekanth.Reddy@avagotech.com>
> ---
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
diff mbox

Patch

diff --git a/drivers/scsi/mpt2sas/mpt2sas_base.c b/drivers/scsi/mpt2sas/mpt2sas_base.c
index 11248de..6dec7cf 100644
--- a/drivers/scsi/mpt2sas/mpt2sas_base.c
+++ b/drivers/scsi/mpt2sas/mpt2sas_base.c
@@ -1557,7 +1557,8 @@  mpt2sas_base_map_resources(struct MPT2SAS_ADAPTER *ioc)
 		goto out_fail;
 	}
 
-	for (i = 0, memap_sz = 0, pio_sz = 0 ; i < DEVICE_COUNT_RESOURCE; i++) {
+	for (i = 0, memap_sz = 0, pio_sz = 0; (i < DEVICE_COUNT_RESOURCE) &&
+	     (!memap_sz || !pio_sz); i++) {
 		if (pci_resource_flags(pdev, i) & IORESOURCE_IO) {
 			if (pio_sz)
 				continue;
@@ -1572,16 +1573,17 @@  mpt2sas_base_map_resources(struct MPT2SAS_ADAPTER *ioc)
 				chip_phys = (u64)ioc->chip_phys;
 				memap_sz = pci_resource_len(pdev, i);
 				ioc->chip = ioremap(ioc->chip_phys, memap_sz);
-				if (ioc->chip == NULL) {
-					printk(MPT2SAS_ERR_FMT "unable to map "
-					    "adapter memory!\n", ioc->name);
-					r = -EINVAL;
-					goto out_fail;
-				}
 			}
 		}
 	}
 
+	if (ioc->chip == NULL) {
+		printk(MPT2SAS_ERR_FMT "unable to map adapter memory! "
+		       "or resource not found\n", ioc->name);
+		r = -EINVAL;
+		goto out_fail;
+	}
+
 	_base_mask_interrupts(ioc);
 
 	r = _base_get_ioc_facts(ioc, CAN_SLEEP);
diff --git a/drivers/scsi/mpt3sas/mpt3sas_base.c b/drivers/scsi/mpt3sas/mpt3sas_base.c
index 14a781b..43f87e9 100644
--- a/drivers/scsi/mpt3sas/mpt3sas_base.c
+++ b/drivers/scsi/mpt3sas/mpt3sas_base.c
@@ -1843,7 +1843,8 @@  mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc)
 		goto out_fail;
 	}
 
-	for (i = 0, memap_sz = 0, pio_sz = 0 ; i < DEVICE_COUNT_RESOURCE; i++) {
+	for (i = 0, memap_sz = 0, pio_sz = 0; (i < DEVICE_COUNT_RESOURCE) &&
+	     (!memap_sz || !pio_sz); i++) {
 		if (pci_resource_flags(pdev, i) & IORESOURCE_IO) {
 			if (pio_sz)
 				continue;
@@ -1856,15 +1857,16 @@  mpt3sas_base_map_resources(struct MPT3SAS_ADAPTER *ioc)
 			chip_phys = (u64)ioc->chip_phys;
 			memap_sz = pci_resource_len(pdev, i);
 			ioc->chip = ioremap(ioc->chip_phys, memap_sz);
-			if (ioc->chip == NULL) {
-				pr_err(MPT3SAS_FMT "unable to map adapter memory!\n",
-					ioc->name);
-				r = -EINVAL;
-				goto out_fail;
-			}
 		}
 	}
 
+	if (ioc->chip == NULL) {
+		pr_err(MPT3SAS_FMT "unable to map adapter memory! "
+			" or resource not found\n", ioc->name);
+		r = -EINVAL;
+		goto out_fail;
+	}
+
 	_base_mask_interrupts(ioc);
 
 	r = _base_get_ioc_facts(ioc, CAN_SLEEP);