diff mbox

Fix MSI-X with NIU cards

Message ID 20090508131333.GV8112@parisc-linux.org (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Matthew Wilcox May 8, 2009, 1:13 p.m. UTC
The NIU device refuses to allow accesses to MSI-X registers before MSI-X
is enabled.  This patch fixes the problem by moving the read of the mask
register to after MSI-X is enabled.

Reported-by: David S. Miller <davem@davemloft.net>
Tested-by: David S. Miller <davem@davemloft.net>
Reviewed-by: David S. Miller <davem@davemloft.net>
Signed-off-by: Matthew Wilcox <willy@linux.intel.com>

Comments

Michael Ellerman May 11, 2009, 1:21 a.m. UTC | #1
On Fri, 2009-05-08 at 07:13 -0600, Matthew Wilcox wrote:
> The NIU device refuses to allow accesses to MSI-X registers before MSI-X
> is enabled.  This patch fixes the problem by moving the read of the mask
> register to after MSI-X is enabled.
> 
> Reported-by: David S. Miller <davem@davemloft.net>
> Tested-by: David S. Miller <davem@davemloft.net>
> Reviewed-by: David S. Miller <davem@davemloft.net>
> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
> 
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 6f2e629..3627732 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -455,8 +455,6 @@ static int msix_capability_init(struct pci_dev *dev,
>  		entry->msi_attrib.default_irq = dev->irq;
>  		entry->msi_attrib.pos = pos;
>  		entry->mask_base = base;
> -		entry->masked = readl(base + j * PCI_MSIX_ENTRY_SIZE +
> -					PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
>  		msix_mask_irq(entry, 1);

158 static void msix_mask_irq(struct msi_desc *desc, u32 flag)                  
159 {
160         u32 mask_bits = desc->masked;
...
165         writel(mask_bits, desc->mask_base + offset);


So I guess this device is just silently ignoring that write?

And aren't we violating the spec by writing 0x1 into the device there
(assuming desc->masked is 0x0 from the kzalloc), ie. we're supposed to
read and write back the reserved bits unchanged. (§ 6.8.2.9?)


> @@ -493,6 +491,12 @@ static int msix_capability_init(struct pci_dev *dev,
>  	msix_set_enable(dev, 1);
>  	dev->msix_enabled = 1;

Are we safe if we take an interrupt here?

> +	list_for_each_entry(entry, &dev->msi_list, list) {
> +		int vector = entry->msi_attrib.entry_nr;
> +		entry->masked = readl(base + vector * PCI_MSIX_ENTRY_SIZE +
> +					PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
> +	}
> +
>  	return 0;
>  }
>  

cheers
David Miller May 11, 2009, 5:36 a.m. UTC | #2
From: Michael Ellerman <michael@ellerman.id.au>
Date: Mon, 11 May 2009 11:21:51 +1000

> So I guess this device is just silently ignoring that write?

No, it signals a fault on the PCI bus, which is how we noticed
this problem in the first place.
--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hidetoshi Seto May 13, 2009, 4:06 a.m. UTC | #3
Michael Ellerman wrote:
> On Fri, 2009-05-08 at 07:13 -0600, Matthew Wilcox wrote:
>> The NIU device refuses to allow accesses to MSI-X registers before MSI-X
>> is enabled.  This patch fixes the problem by moving the read of the mask
>> register to after MSI-X is enabled.
>>
>> Reported-by: David S. Miller <davem@davemloft.net>
>> Tested-by: David S. Miller <davem@davemloft.net>
>> Reviewed-by: David S. Miller <davem@davemloft.net>
>> Signed-off-by: Matthew Wilcox <willy@linux.intel.com>
>>
>> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
>> index 6f2e629..3627732 100644
>> --- a/drivers/pci/msi.c
>> +++ b/drivers/pci/msi.c
>> @@ -455,8 +455,6 @@ static int msix_capability_init(struct pci_dev *dev,
>>  		entry->msi_attrib.default_irq = dev->irq;
>>  		entry->msi_attrib.pos = pos;
>>  		entry->mask_base = base;
>> -		entry->masked = readl(base + j * PCI_MSIX_ENTRY_SIZE +
>> -					PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
>>  		msix_mask_irq(entry, 1);
> 
> 158 static void msix_mask_irq(struct msi_desc *desc, u32 flag)                  
> 159 {
> 160         u32 mask_bits = desc->masked;
> ...
> 165         writel(mask_bits, desc->mask_base + offset);
> 
> 
> So I guess this device is just silently ignoring that write?

I guess that write can be ignored while reads are not.

It seems that this issue was introduced by Matthew's commit:
commit f2440d9acbe866b917b16cc0f927366341ce9215
<quote>
@@ -435,11 +432,12 @@ static int msix_capability_init(struct pci_dev *dev,
                entry->msi_attrib.is_msix = 1;
                entry->msi_attrib.is_64 = 1;
                entry->msi_attrib.entry_nr = j;
-               entry->msi_attrib.maskbit = 1;
-               entry->msi_attrib.masked = 1;
                entry->msi_attrib.default_irq = dev->irq;
                entry->msi_attrib.pos = pos;
                entry->mask_base = base;
+               entry->masked = readl(base + j * PCI_MSIX_ENTRY_SIZE +
+                                       PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
+               msix_mask_irq(entry, 1);

                list_add_tail(&entry->list, &dev->msi_list);
        }
</quote>

I'm not sure why Matthew changed it to read/write...

> And aren't we violating the spec by writing 0x1 into the device there
> (assuming desc->masked is 0x0 from the kzalloc), ie. we're supposed to
> read and write back the reserved bits unchanged. (§ 6.8.2.9?)

Spec says:
 "This bit's state after reset is 1 (entry is masked).
  This bit is read/write."

>> @@ -493,6 +491,12 @@ static int msix_capability_init(struct pci_dev *dev,
>>  	msix_set_enable(dev, 1);
>>  	dev->msix_enabled = 1;
> 
> Are we safe if we take an interrupt here?

If reset was properly done on the device, we will be safe since entries are
masked.  Otherwise, ... I'm not sure.

>> +	list_for_each_entry(entry, &dev->msi_list, list) {
>> +		int vector = entry->msi_attrib.entry_nr;
>> +		entry->masked = readl(base + vector * PCI_MSIX_ENTRY_SIZE +
>> +					PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
>> +	}
>> +
>>  	return 0;
>>  }

I think we can remove read/write here because it was same as before the
Matthew's change above.

If we really need read/write here, then I believe Matthew can provide
an incremental patch for that.


Thanks,
H.Seto

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Hidetoshi Seto May 13, 2009, 4:40 a.m. UTC | #4
Hidetoshi Seto wrote:
> It seems that this issue was introduced by Matthew's commit:
> commit f2440d9acbe866b917b16cc0f927366341ce9215
> <quote>
> @@ -435,11 +432,12 @@ static int msix_capability_init(struct pci_dev *dev,
>                 entry->msi_attrib.is_msix = 1;
>                 entry->msi_attrib.is_64 = 1;
>                 entry->msi_attrib.entry_nr = j;
> -               entry->msi_attrib.maskbit = 1;
> -               entry->msi_attrib.masked = 1;
>                 entry->msi_attrib.default_irq = dev->irq;
>                 entry->msi_attrib.pos = pos;
>                 entry->mask_base = base;
> +               entry->masked = readl(base + j * PCI_MSIX_ENTRY_SIZE +
> +                                       PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
> +               msix_mask_irq(entry, 1);
> 
>                 list_add_tail(&entry->list, &dev->msi_list);
>         }
> </quote>
> 
> I'm not sure why Matthew changed it to read/write...

Sorry, I got it.

The problem is not the bit[0] for mask, but the reserved bits[31::01].
According to the Spec:
"31::01 Reserved 
  After reset, the state of these bits must be 0.
  However, for potential future use, software must preserve the value of
  these reserved bits when modifying the value of other Vector Control bits.
  If software modifies the value of these reserved bits, the result is
  undefined."

I'll post a revised fix soon.

Thanks,
H.Seto

--
To unsubscribe from this list: send the line "unsubscribe linux-pci" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael Ellerman May 13, 2009, 4:49 a.m. UTC | #5
On Wed, 2009-05-13 at 13:40 +0900, Hidetoshi Seto wrote:
> Hidetoshi Seto wrote:
> > It seems that this issue was introduced by Matthew's commit:
> > commit f2440d9acbe866b917b16cc0f927366341ce9215
> > <quote>
> > @@ -435,11 +432,12 @@ static int msix_capability_init(struct pci_dev *dev,
> >                 entry->msi_attrib.is_msix = 1;
> >                 entry->msi_attrib.is_64 = 1;
> >                 entry->msi_attrib.entry_nr = j;
> > -               entry->msi_attrib.maskbit = 1;
> > -               entry->msi_attrib.masked = 1;
> >                 entry->msi_attrib.default_irq = dev->irq;
> >                 entry->msi_attrib.pos = pos;
> >                 entry->mask_base = base;
> > +               entry->masked = readl(base + j * PCI_MSIX_ENTRY_SIZE +
> > +                                       PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
> > +               msix_mask_irq(entry, 1);
> > 
> >                 list_add_tail(&entry->list, &dev->msi_list);
> >         }
> > </quote>
> > 
> > I'm not sure why Matthew changed it to read/write...
> 
> Sorry, I got it.
> 
> The problem is not the bit[0] for mask, but the reserved bits[31::01].
> According to the Spec:
> "31::01 Reserved 
>   After reset, the state of these bits must be 0.
>   However, for potential future use, software must preserve the value of
>   these reserved bits when modifying the value of other Vector Control bits.
>   If software modifies the value of these reserved bits, the result is
>   undefined."

Yes that's what I was referring to about it being out of spec.

So to work around the NIU bug and be in spec I think we need to enable
MSI, then read the mask bits, then write them back with the mask bit
set.

cheers
diff mbox

Patch

diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 6f2e629..3627732 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -455,8 +455,6 @@  static int msix_capability_init(struct pci_dev *dev,
 		entry->msi_attrib.default_irq = dev->irq;
 		entry->msi_attrib.pos = pos;
 		entry->mask_base = base;
-		entry->masked = readl(base + j * PCI_MSIX_ENTRY_SIZE +
-					PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
 		msix_mask_irq(entry, 1);
 
 		list_add_tail(&entry->list, &dev->msi_list);
@@ -493,6 +491,12 @@  static int msix_capability_init(struct pci_dev *dev,
 	msix_set_enable(dev, 1);
 	dev->msix_enabled = 1;
 
+	list_for_each_entry(entry, &dev->msi_list, list) {
+		int vector = entry->msi_attrib.entry_nr;
+		entry->masked = readl(base + vector * PCI_MSIX_ENTRY_SIZE +
+					PCI_MSIX_ENTRY_VECTOR_CTRL_OFFSET);
+	}
+
 	return 0;
 }