mbox series

[0/4] scsi: fixup dma_set_mask_and_coherent() calls

Message ID 20190213114234.67275-1-hare@suse.de (mailing list archive)
Headers show
Series scsi: fixup dma_set_mask_and_coherent() calls | expand

Message

Hannes Reinecke Feb. 13, 2019, 11:42 a.m. UTC
The recent patchset to use dma_set_mask_and_coherent() introduced
a regression where a call to set a 64-bit DMA mask was followed
by a call to set a 32-bit DMA mask, leading to I/O errors and
data corruption.

Patchset is based on a suggestions from Ewan Milne.

Hannes Reinecke (4):
  lpfc: fix calls to dma_set_mask_and_coherent()
  hptiop: fix calls to dma_set_mask_and_coherent()
  bfa: fix calls to dma_set_mask_and_coherent()
  hisi_sas: fix calls to dma_set_mask_and_coherent()

 drivers/scsi/bfa/bfad.c               | 10 +++++++---
 drivers/scsi/hisi_sas/hisi_sas_main.c |  8 ++++++--
 drivers/scsi/hptiop.c                 | 10 +++++++---
 drivers/scsi/lpfc/lpfc_init.c         |  9 ++++++---
 4 files changed, 26 insertions(+), 11 deletions(-)

Comments

Ewan Milne Feb. 13, 2019, 6:51 p.m. UTC | #1
On Wed, 2019-02-13 at 12:42 +0100, Hannes Reinecke wrote:
> The recent patchset to use dma_set_mask_and_coherent() introduced
> a regression where a call to set a 64-bit DMA mask was followed
> by a call to set a 32-bit DMA mask, leading to I/O errors and
> data corruption.
> 
> Patchset is based on a suggestions from Ewan Milne.
> 
> Hannes Reinecke (4):
>   lpfc: fix calls to dma_set_mask_and_coherent()
>   hptiop: fix calls to dma_set_mask_and_coherent()
>   bfa: fix calls to dma_set_mask_and_coherent()
>   hisi_sas: fix calls to dma_set_mask_and_coherent()
> 
>  drivers/scsi/bfa/bfad.c               | 10 +++++++---
>  drivers/scsi/hisi_sas/hisi_sas_main.c |  8 ++++++--
>  drivers/scsi/hptiop.c                 | 10 +++++++---
>  drivers/scsi/lpfc/lpfc_init.c         |  9 ++++++---
>  4 files changed, 26 insertions(+), 11 deletions(-)
> 

Thanks for posting this set, Hannes.  I've got some individual
comments on each...

-Ewan
John Garry Feb. 14, 2019, 9:29 a.m. UTC | #2
On 13/02/2019 18:51, Ewan D. Milne wrote:
> On Wed, 2019-02-13 at 12:42 +0100, Hannes Reinecke wrote:
>> The recent patchset to use dma_set_mask_and_coherent() introduced
>> a regression where a call to set a 64-bit DMA mask was followed
>> by a call to set a 32-bit DMA mask, leading to I/O errors and
>> data corruption.
>>
>> Patchset is based on a suggestions from Ewan Milne.
>>
>> Hannes Reinecke (4):
>>   lpfc: fix calls to dma_set_mask_and_coherent()
>>   hptiop: fix calls to dma_set_mask_and_coherent()
>>   bfa: fix calls to dma_set_mask_and_coherent()
>>   hisi_sas: fix calls to dma_set_mask_and_coherent()
>>
>>  drivers/scsi/bfa/bfad.c               | 10 +++++++---
>>  drivers/scsi/hisi_sas/hisi_sas_main.c |  8 ++++++--
>>  drivers/scsi/hptiop.c                 | 10 +++++++---
>>  drivers/scsi/lpfc/lpfc_init.c         |  9 ++++++---
>>  4 files changed, 26 insertions(+), 11 deletions(-)

Isn't there a few more to fix up, like:
drivers/scsi/3w-9xxx.c
drivers/scsi/3w-sas.c
drivers/scsi/csiostor/csio_init.c

Thanks,
John

>>
>
> Thanks for posting this set, Hannes.  I've got some individual
> comments on each...
>
> -Ewan
>
>
hch@lst.de Feb. 14, 2019, 5:31 p.m. UTC | #3
On Thu, Feb 14, 2019 at 09:29:05AM +0000, John Garry wrote:
> On 13/02/2019 18:51, Ewan D. Milne wrote:
>> On Wed, 2019-02-13 at 12:42 +0100, Hannes Reinecke wrote:
>>> The recent patchset to use dma_set_mask_and_coherent() introduced
>>> a regression where a call to set a 64-bit DMA mask was followed
>>> by a call to set a 32-bit DMA mask, leading to I/O errors and
>>> data corruption.
>>>
>>> Patchset is based on a suggestions from Ewan Milne.
>>>
>>> Hannes Reinecke (4):
>>>   lpfc: fix calls to dma_set_mask_and_coherent()
>>>   hptiop: fix calls to dma_set_mask_and_coherent()
>>>   bfa: fix calls to dma_set_mask_and_coherent()
>>>   hisi_sas: fix calls to dma_set_mask_and_coherent()
>>>
>>>  drivers/scsi/bfa/bfad.c               | 10 +++++++---
>>>  drivers/scsi/hisi_sas/hisi_sas_main.c |  8 ++++++--
>>>  drivers/scsi/hptiop.c                 | 10 +++++++---
>>>  drivers/scsi/lpfc/lpfc_init.c         |  9 ++++++---
>>>  4 files changed, 26 insertions(+), 11 deletions(-)
>
> Isn't there a few more to fix up, like:
> drivers/scsi/3w-9xxx.c
> drivers/scsi/3w-sas.c
> drivers/scsi/csiostor/csio_init.c

Yeah, there is a few more.  And the sad part is as of a few kernel
release ago we shouldn't even need the fallback 32-bit dma mask
anymore - we've cleaned up all the mess that required it.
Hannes Reinecke Feb. 15, 2019, 6:55 a.m. UTC | #4
On 2/14/19 6:31 PM, Christoph Hellwig wrote:
> On Thu, Feb 14, 2019 at 09:29:05AM +0000, John Garry wrote:
>> On 13/02/2019 18:51, Ewan D. Milne wrote:
>>> On Wed, 2019-02-13 at 12:42 +0100, Hannes Reinecke wrote:
>>>> The recent patchset to use dma_set_mask_and_coherent() introduced
>>>> a regression where a call to set a 64-bit DMA mask was followed
>>>> by a call to set a 32-bit DMA mask, leading to I/O errors and
>>>> data corruption.
>>>>
>>>> Patchset is based on a suggestions from Ewan Milne.
>>>>
>>>> Hannes Reinecke (4):
>>>>    lpfc: fix calls to dma_set_mask_and_coherent()
>>>>    hptiop: fix calls to dma_set_mask_and_coherent()
>>>>    bfa: fix calls to dma_set_mask_and_coherent()
>>>>    hisi_sas: fix calls to dma_set_mask_and_coherent()
>>>>
>>>>   drivers/scsi/bfa/bfad.c               | 10 +++++++---
>>>>   drivers/scsi/hisi_sas/hisi_sas_main.c |  8 ++++++--
>>>>   drivers/scsi/hptiop.c                 | 10 +++++++---
>>>>   drivers/scsi/lpfc/lpfc_init.c         |  9 ++++++---
>>>>   4 files changed, 26 insertions(+), 11 deletions(-)
>>
>> Isn't there a few more to fix up, like:
>> drivers/scsi/3w-9xxx.c
>> drivers/scsi/3w-sas.c
>> drivers/scsi/csiostor/csio_init.c
> 
> Yeah, there is a few more.  And the sad part is as of a few kernel
> release ago we shouldn't even need the fallback 32-bit dma mask
> anymore - we've cleaned up all the mess that required it.
> 
Care to elaborate?
Can you point me to the respective commits facilitating that?

I'd gladly update the drivers...

Cheers,

Hannes
hch@lst.de Feb. 15, 2019, 7:43 a.m. UTC | #5
On Fri, Feb 15, 2019 at 07:55:39AM +0100, Hannes Reinecke wrote:
>> Yeah, there is a few more.  And the sad part is as of a few kernel
>> release ago we shouldn't even need the fallback 32-bit dma mask
>> anymore - we've cleaned up all the mess that required it.
>>
> Care to elaborate?
> Can you point me to the respective commits facilitating that?

It mostly has been that way for a long time, but we still had a few
oddball architectures checking for an exact 32-bit match in their
arch specific direct mapping routines.  With the move to the generic
dma-direct implementation all those got fixed.
hch@lst.de Feb. 15, 2019, 8:08 a.m. UTC | #6
On Fri, Feb 15, 2019 at 08:43:55AM +0100, Christoph Hellwig wrote:
> On Fri, Feb 15, 2019 at 07:55:39AM +0100, Hannes Reinecke wrote:
> >> Yeah, there is a few more.  And the sad part is as of a few kernel
> >> release ago we shouldn't even need the fallback 32-bit dma mask
> >> anymore - we've cleaned up all the mess that required it.
> >>
> > Care to elaborate?
> > Can you point me to the respective commits facilitating that?
> 
> It mostly has been that way for a long time, but we still had a few
> oddball architectures checking for an exact 32-bit match in their
> arch specific direct mapping routines.  With the move to the generic
> dma-direct implementation all those got fixed.

Actually, sparc64 is still doing odd things.  I guess I'll need to
get the fixup series for that into 5.1 first before we can clean up
the mess in the drivers.
John Garry Feb. 15, 2019, 9:16 a.m. UTC | #7
On 15/02/2019 08:08, Christoph Hellwig wrote:
> On Fri, Feb 15, 2019 at 08:43:55AM +0100, Christoph Hellwig wrote:
>> On Fri, Feb 15, 2019 at 07:55:39AM +0100, Hannes Reinecke wrote:
>>>> Yeah, there is a few more.  And the sad part is as of a few kernel
>>>> release ago we shouldn't even need the fallback 32-bit dma mask
>>>> anymore - we've cleaned up all the mess that required it.
>>>>
>>> Care to elaborate?
>>> Can you point me to the respective commits facilitating that?
>>
>> It mostly has been that way for a long time, but we still had a few
>> oddball architectures checking for an exact 32-bit match in their
>> arch specific direct mapping routines.  With the move to the generic
>> dma-direct implementation all those got fixed.

I thought that many SCSI drivers were changed over to stop using the PCI 
DMA APIs at the end of last year, like this:

commit c22b332d811b90448e090c7fb487448afb039fcc
Author: Christoph Hellwig <hch@lst.de>
Date:   Wed Oct 10 18:34:51 2018 +0200

     scsi: csiostor: switch to generic DMA API

     Switch from the legacy PCI DMA API to the generic DMA API.

     Signed-off-by: Christoph Hellwig <hch@lst.de>
     Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
     Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>

diff --git a/drivers/scsi/csiostor/csio_init.c 
b/drivers/scsi/csiostor/csio_init.c
index ed2dae657964..aa04e4a7aed5 100644
--- a/drivers/scsi/csiostor/csio_init.c
+++ b/drivers/scsi/csiostor/csio_init.c
@@ -210,11 +210,8 @@ csio_pci_init(struct pci_dev *pdev, int *bars)
         pci_set_master(pdev);
         pci_try_set_mwi(pdev);

-       if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
-               pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
-       } else if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(32))) {
-               pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
-       } else {
+       if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)) ||
+           dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
                 dev_err(&pdev->dev, "No suitable DMA available.\n");
                 goto err_release_regions;
         }

>
> Actually, sparc64 is still doing odd things.  I guess I'll need to
> get the fixup series for that into 5.1 first before we can clean up
> the mess in the drivers.
>
> .
>
Hannes Reinecke Feb. 18, 2019, 7:21 a.m. UTC | #8
On 2/15/19 10:16 AM, John Garry wrote:
> On 15/02/2019 08:08, Christoph Hellwig wrote:
>> On Fri, Feb 15, 2019 at 08:43:55AM +0100, Christoph Hellwig wrote:
>>> On Fri, Feb 15, 2019 at 07:55:39AM +0100, Hannes Reinecke wrote:
>>>>> Yeah, there is a few more.  And the sad part is as of a few kernel
>>>>> release ago we shouldn't even need the fallback 32-bit dma mask
>>>>> anymore - we've cleaned up all the mess that required it.
>>>>>
>>>> Care to elaborate?
>>>> Can you point me to the respective commits facilitating that?
>>>
>>> It mostly has been that way for a long time, but we still had a few
>>> oddball architectures checking for an exact 32-bit match in their
>>> arch specific direct mapping routines.  With the move to the generic
>>> dma-direct implementation all those got fixed.
> 
> I thought that many SCSI drivers were changed over to stop using the PCI 
> DMA APIs at the end of last year, like this:
> 
> commit c22b332d811b90448e090c7fb487448afb039fcc
> Author: Christoph Hellwig <hch@lst.de>
> Date:   Wed Oct 10 18:34:51 2018 +0200
> 
>      scsi: csiostor: switch to generic DMA API
> 
>      Switch from the legacy PCI DMA API to the generic DMA API.
> 
>      Signed-off-by: Christoph Hellwig <hch@lst.de>
>      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
>      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
> 
> diff --git a/drivers/scsi/csiostor/csio_init.c 
> b/drivers/scsi/csiostor/csio_init.c
> index ed2dae657964..aa04e4a7aed5 100644
> --- a/drivers/scsi/csiostor/csio_init.c
> +++ b/drivers/scsi/csiostor/csio_init.c
> @@ -210,11 +210,8 @@ csio_pci_init(struct pci_dev *pdev, int *bars)
>          pci_set_master(pdev);
>          pci_try_set_mwi(pdev);
> 
> -       if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
> -               pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
> -       } else if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(32))) {
> -               pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
> -       } else {
> +       if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)) ||
> +           dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
>                  dev_err(&pdev->dev, "No suitable DMA available.\n");
>                  goto err_release_regions;
>          }
> 
... which is precisely the error we're encountering.
If the first call to dma_set_mask_and_coherent() succeeds it will return 
'0', and the next call to dma_set_mask_and_coherent() will be executed.

Cheers,

Hannes
John Garry Feb. 18, 2019, 9:02 a.m. UTC | #9
On 18/02/2019 07:21, Hannes Reinecke wrote:
> On 2/15/19 10:16 AM, John Garry wrote:
>> On 15/02/2019 08:08, Christoph Hellwig wrote:
>>> On Fri, Feb 15, 2019 at 08:43:55AM +0100, Christoph Hellwig wrote:
>>>> On Fri, Feb 15, 2019 at 07:55:39AM +0100, Hannes Reinecke wrote:
>>>>>> Yeah, there is a few more.  And the sad part is as of a few kernel
>>>>>> release ago we shouldn't even need the fallback 32-bit dma mask
>>>>>> anymore - we've cleaned up all the mess that required it.
>>>>>>
>>>>> Care to elaborate?
>>>>> Can you point me to the respective commits facilitating that?
>>>>
>>>> It mostly has been that way for a long time, but we still had a few
>>>> oddball architectures checking for an exact 32-bit match in their
>>>> arch specific direct mapping routines.  With the move to the generic
>>>> dma-direct implementation all those got fixed.
>>
>> I thought that many SCSI drivers were changed over to stop using the
>> PCI DMA APIs at the end of last year, like this:
>>
>> commit c22b332d811b90448e090c7fb487448afb039fcc
>> Author: Christoph Hellwig <hch@lst.de>
>> Date:   Wed Oct 10 18:34:51 2018 +0200
>>
>>      scsi: csiostor: switch to generic DMA API
>>
>>      Switch from the legacy PCI DMA API to the generic DMA API.
>>
>>      Signed-off-by: Christoph Hellwig <hch@lst.de>
>>      Reviewed-by: Johannes Thumshirn <jthumshirn@suse.de>
>>      Signed-off-by: Martin K. Petersen <martin.petersen@oracle.com>
>>
>> diff --git a/drivers/scsi/csiostor/csio_init.c
>> b/drivers/scsi/csiostor/csio_init.c
>> index ed2dae657964..aa04e4a7aed5 100644
>> --- a/drivers/scsi/csiostor/csio_init.c
>> +++ b/drivers/scsi/csiostor/csio_init.c
>> @@ -210,11 +210,8 @@ csio_pci_init(struct pci_dev *pdev, int *bars)
>>          pci_set_master(pdev);
>>          pci_try_set_mwi(pdev);
>>
>> -       if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(64))) {
>> -               pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(64));
>> -       } else if (!pci_set_dma_mask(pdev, DMA_BIT_MASK(32))) {
>> -               pci_set_consistent_dma_mask(pdev, DMA_BIT_MASK(32));
>> -       } else {
>> +       if (dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(64)) ||
>> +           dma_set_mask_and_coherent(&pdev->dev, DMA_BIT_MASK(32))) {
>>                  dev_err(&pdev->dev, "No suitable DMA available.\n");
>>                  goto err_release_regions;
>>          }
>>
> ... which is precisely the error we're encountering.
> If the first call to dma_set_mask_and_coherent() succeeds it will return
> '0', and the next call to dma_set_mask_and_coherent() will be executed.
>

Sure, I know this. What's more, if the first fails, then we bail out and 
won't even try the second.

Thanks,
John


> Cheers,
>
> Hannes