Message ID: 20211025012503.33172-1-jandryuk@gmail.com
State:      New, archived
Series:     PCI/MSI: Fix masking MSI/MSI-X on Xen PV
On Sun, 2021-10-24 at 21:25 -0400, Jason Andryuk wrote:
> commit fcacdfbef5a1 ("PCI/MSI: Provide a new set of mask and unmask
> functions") introduced functions pci_msi_update_mask() and
> pci_msix_write_vector_ctrl() that are missing the checks for
> pci_msi_ignore_mask that exist in commit 446a98b19fd6 ("PCI/MSI: Use
> new mask/unmask functions"). The checks are in place at the high-level
> __pci_msi_mask_desc()/__pci_msi_unmask_desc(), but some functions call
> the helpers directly.
>
> Push the pci_msi_ignore_mask check down to the functions that make
> the actual writes. This keeps the logic local to the writes that need
> to be bypassed.
>
> With Xen PV, the hypervisor is responsible for masking and unmasking the
> interrupts, which pci_msi_ignore_mask is used to indicate.

This isn't just for Xen PV; Xen HVM guests let Xen unmask the MSI for
them too.
On Mon, Oct 25, 2021 at 08:44:52AM +0100, David Woodhouse wrote:
> On Sun, 2021-10-24 at 21:25 -0400, Jason Andryuk wrote:
> > [...]
> > With Xen PV, the hypervisor is responsible for masking and unmasking the
> > interrupts, which pci_msi_ignore_mask is used to indicate.
>
> This isn't just for Xen PV; Xen HVM guests let Xen unmask the MSI for
> them too.

It's kind of optional for HVM guests, as it depends on
XENFEAT_hvm_pirqs, which sadly gets unconditionally set for HVM
guests, thus dropping any benefits from having hardware-assisted APIC
virtualization or posted interrupts support.

AFAICT pci_msi_ignore_mask is not (correctly) set for HVM guests when
MSI interrupts are routed over event channels.

Regards, Roger.
On Mon, 2021-10-25 at 13:43 +0200, Roger Pau Monné wrote:
> It's kind of optional for HVM guests, as it depends on
> XENFEAT_hvm_pirqs, which sadly gets unconditionally set for HVM
> guests, thus dropping any benefits from having hardware assisted APIC
> virtualization or posted interrupts support.

Indeed. After implementing PIRQ support for Xen guests running under
KVM, I spent a "happy" couple of days swearing at it because it
actually *worked* if something would just *unmask* the f***ing MSI, but
the guest inexplicably (to me) didn't do that.

Took me a while to work out that Xen itself is *snooping* on the MSI
table writes even while they are *masked*, to capture the magic MSI
message (with vector==0) which means it's actually a PIRQ# in the
destination ID bits, and then magically unmask the MSI when the guest
binds that PIRQ to an event channel.

I did not enjoy implementing that part.

FWIW the *guest* could potentially be smarter here and elect not to use
PIRQs when hardware-assisted vAPIC is present. Aren't there some bits
in the CPUID that Xen advertises, which indicate that?
On Sun, Oct 24, 2021 at 9:26 PM Jason Andryuk <jandryuk@gmail.com> wrote:
> [...]
> This change avoids lockups in amdgpu drivers under Xen during boot.
>
> Fixes: commit 446a98b19fd6 ("PCI/MSI: Use new mask/unmask functions")
> Reported-by: Josef Johansson <josef@oderland.se>
> Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
> ---

I should have written that this is untested. If this is the desired
approach, Josef should test that it solves his boot hangs.

Regards,
Jason
On Mon, Oct 25, 2021 at 3:44 AM David Woodhouse <dwmw2@infradead.org> wrote:
> On Sun, 2021-10-24 at 21:25 -0400, Jason Andryuk wrote:
> > [...]
> > With Xen PV, the hypervisor is responsible for masking and unmasking the
> > interrupts, which pci_msi_ignore_mask is used to indicate.
>
> This isn't just for Xen PV; Xen HVM guests let Xen unmask the MSI for
> them too.

Ah, that makes sense that Xen handles both. I was repeating another
commit message's statement.

Oh, it looks like pci_msi_ignore_mask is PV-specific.

Regards,
Jason
On Mon, Oct 25, 2021 at 12:53:31PM +0100, David Woodhouse wrote:
> On Mon, 2021-10-25 at 13:43 +0200, Roger Pau Monné wrote:
> > [...]
>
> Indeed. After implementing PIRQ support for Xen guests running under
> KVM, I spent a "happy" couple of days swearing at it because it
> actually *worked* if something would just *unmask* the f***ing MSI, but
> the guest inexplicably (to me) didn't do that.
>
> Took me a while to work out that Xen itself is *snooping* on the MSI
> table writes even while they are *masked*, to capture the magic MSI
> message (with vector==0) which means it's actually a PIRQ# in the
> destination ID bits, and then magically unmask the MSI when the guest
> binds that PIRQ to an event channel.
>
> I did not enjoy implementing that part.

I can see that. It's even better because none of this is actually
documented.

> FWIW the *guest* could potentially be smarter here and elect not to use
> PIRQs when hardware-assisted vAPIC is present. Aren't there some bits
> in the CPUID that Xen advertises, which indicate that?

Yes, it's in leaf 0x40000x04. FWIW, I would also be fine with removing
XENFEAT_hvm_pirqs, as I don't think diverging from the hardware
specifications gives us much benefit. We avoid a couple of vm exits
for sure, but the cost of having all those modifications in guest
OSes is not worth it.

Roger.
On Mon, 2021-10-25 at 14:58 +0200, Roger Pau Monné wrote:
> [...]
> I can see that. It's even better because none of this is actually
> documented.

Indeed. I still haven't worked out if/how Xen actually *masks* the
corresponding MSI-X again. It can't do so when the evtchn is masked,
since that's just a bit in the shinfo page. So while the evtchn is
masked, the MSI can still be screaming into the void?

Perhaps it does so when the PIRQ is unbound from the evtchn?

> Yes, it's in leaf 0x40000x04. FWIW, I would also be fine with removing
> XENFEAT_hvm_pirqs, as I don't think diverging from the hardware
> specifications gives us much benefit. We avoid a couple of vm exits
> for sure, but the cost of having all those modifications in guest
> OSes is not worth it.

These days with posted interrupts, it doesn't even save us any vmexits;
it's all that additional guest complexity just to give us *more*
vmexits than we would have had :)
On Mon, Oct 25, 2021 at 02:02:38PM +0100, David Woodhouse wrote:
> On Mon, 2021-10-25 at 14:58 +0200, Roger Pau Monné wrote:
> > [...]
>
> Indeed. I still haven't worked out if/how Xen actually *masks* the
> corresponding MSI-X again. It can't do so when the evtchn is masked,
> since that's just a bit in the shinfo page. So while the evtchn is
> masked, the MSI can still be screaming into the void?

I think so. It's quite weird, because as a side effect of this mangling
Xen is transforming an edge-triggered interrupt into a level-triggered
one, as masked event channels belonging to MSI vectors will get set to
pending. So it's not entirely screaming into the void, because it will
(wrongly) get set as pending when masked.

> Perhaps it does so when the PIRQ is unbound from the evtchn?

> These days with posted interrupts, it doesn't even save us any vmexits;
> it's all that additional guest complexity just to give us *more*
> vmexits than we would have had :)

Oh indeed. I was thinking about hardware without APIC hw virtualization
or posted interrupts.

Roger.
On 10/25/21 14:27, Jason Andryuk wrote:
> On Sun, Oct 24, 2021 at 9:26 PM Jason Andryuk <jandryuk@gmail.com> wrote:
>> [...]
> I should have written that this is untested. If this is the desired
> approach, Josef should test that it solves his boot hangs.
>
> Regards,
> Jason

I've tested this today: both the above patch, and my own below, where
I'm patching inside __pci_write_msi_msg, which is the outcome of the
patch above.

I found that both solve the boot hang, but both have severe effects on
suspend/resume later on. The patch below, without the change inside
__pci_write_msi_msg, seems to be ideal: it solves the boot problems
without causing too many others.

It seems to me that there are undocumented dragons here. Doing more
tests later today.
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 4b4792940e86..e97eea1bc93a 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -148,6 +148,9 @@ static noinline void pci_msi_update_mask(struct msi_desc *desc, u32 clear, u32 s
 	raw_spinlock_t *lock = &desc->dev->msi_lock;
 	unsigned long flags;
 
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+		return;
+
 	raw_spin_lock_irqsave(lock, flags);
 	desc->msi_mask &= ~clear;
 	desc->msi_mask |= set;
@@ -186,6 +189,9 @@ static void pci_msix_write_vector_ctrl(struct msi_desc *desc, u32 ctrl)
 
 static inline void pci_msix_mask(struct msi_desc *desc)
 {
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+		return;
+
 	desc->msix_ctrl |= PCI_MSIX_ENTRY_CTRL_MASKBIT;
 	pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
 	/* Flush write to device */
@@ -194,15 +200,15 @@ static inline void pci_msix_mask(struct msi_desc *desc)
 
 static inline void pci_msix_unmask(struct msi_desc *desc)
 {
+	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+		return;
+
 	desc->msix_ctrl &= ~PCI_MSIX_ENTRY_CTRL_MASKBIT;
 	pci_msix_write_vector_ctrl(desc, desc->msix_ctrl);
 }
 
 static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
 {
-	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
-		return;
-
 	if (desc->msi_attrib.is_msix)
 		pci_msix_mask(desc);
 	else if (desc->msi_attrib.maskbit)
@@ -211,9 +217,6 @@ static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
 
 static void __pci_msi_unmask_desc(struct msi_desc *desc, u32 mask)
 {
-	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
-		return;
-
 	if (desc->msi_attrib.is_msix)
 		pci_msix_unmask(desc);
 	else if (desc->msi_attrib.maskbit)
@@ -307,7 +310,7 @@ void __pci_write_msi_msg(struct msi_desc *entry, struct msi_msg *msg)
 	 * entry while the entry is unmasked, the result is
 	 * undefined."
 	 */
-	if (unmasked)
+	if (unmasked && !pci_msi_ignore_mask)
 		pci_msix_write_vector_ctrl(entry, ctrl | PCI_MSIX_ENTRY_CTRL_MASKBIT);
 
 	writel(msg->address_lo, base + PCI_MSIX_ENTRY_LOWER_ADDR);
@@ -450,8 +453,9 @@ static void __pci_restore_msix_state(struct pci_dev *dev)
 				PCI_MSIX_FLAGS_ENABLE | PCI_MSIX_FLAGS_MASKALL);
 	arch_restore_msi_irqs(dev);
-	for_each_pci_msi_entry(entry, dev)
-		pci_msix_write_vector_ctrl(entry, entry->msix_ctrl);
+	if (!(pci_msi_ignore_mask || entry->msi_attrib.is_virtual))
+		for_each_pci_msi_entry(entry, dev)
+			pci_msix_write_vector_ctrl(entry, entry->msix_ctrl);
 
 	pci_msix_clear_and_set_ctrl(dev, PCI_MSIX_FLAGS_MASKALL, 0);
 }
On 10/25/21 18:46, Josef Johansson wrote:
> On 10/25/21 14:27, Jason Andryuk wrote:
>> [...]
> I've tested this today, both the above patch, but also my own below
> where I'm patching inside __pci_write_msi_msg,
> which is the outcome of the patch above.

I tested a lot of kernels today. To create a good baseline I compiled
without any of our patches here, and with my config flags set:

CONFIG_AMD_PMC=y
# CONFIG_HSA_AMD is not set
# CONFIG_CRYPTO_DEV_CCP is not set

The kernel stopped as before, and hung.

Test number 2 was to boot with amdgpu.msi=0. This still resulted in a
bad boot, since all the xhci drivers complained. We can be sure that
it's not amdgpu per se.

Test number 3 was with Jason's patch. It worked, but suspend/resume is
not working well. Generally it's not behaving like other kernels do,
which means it actually changes the behavior. Now with test 4 I tried
that thought: maybe this is still a good change? I'm deprived of a good
baseline in all this, so it's very hard to navigate between all the
variables.

Test number 4 was with Jason's patch plus the amdgpu patch below. It
worked, even suspend/resume, two times, but then it all crashed and
burned with quite interesting stack traces. Is amdgpu doing it wrong
here, or is it just me nitpicking?

index cc2e0c9cfe0a..f125597eb991 100644
--- a/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
+++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_irq.c
@@ -279,17 +279,8 @@ static bool amdgpu_msi_ok(struct amdgpu_device *adev)
 
 static void amdgpu_restore_msix(struct amdgpu_device *adev)
 {
-	u16 ctrl;
-
-	pci_read_config_word(adev->pdev, adev->pdev->msix_cap + PCI_MSIX_FLAGS, &ctrl);
-	if (!(ctrl & PCI_MSIX_FLAGS_ENABLE))
-		return;
-
-	/* VF FLR */
-	ctrl &= ~PCI_MSIX_FLAGS_ENABLE;
-	pci_write_config_word(adev->pdev, adev->pdev->msix_cap + PCI_MSIX_FLAGS, ctrl);
-	ctrl |= PCI_MSIX_FLAGS_ENABLE;
-	pci_write_config_word(adev->pdev, adev->pdev->msix_cap + PCI_MSIX_FLAGS, ctrl);
+	// checks if msix is enabled also
+	pci_restore_msi_state(adev->pdev);
 }
 
 /**

During the tests I fiddled with dpm settings, and they do have an
effect on the graphical output during suspend/resume. So maybe there
are hardware problems at play here as well.

I also looked through the code before and after Thomas' changes, and I
can't see that this patch should make any functional difference
compared to before the MSI series of patches. It's even such that
is_virtual should be checked within vector_ctrl.

I find Jason's patch quite nice, since it really places the checks in
few places, making it easier not to slip up. Compared to my attempt,
which even failed because I forgot one more place to put the checks.

With that said, I would really like some more tests on this with
different chipsets, on Xen. Any takers?

What I'm seeing is that there's no readl() in pci_msix_unmask(); there
was one in the code path before. I'm very much unsure whether there
should be one there, though.

We could really do a better job at the documentation for
pci_msi_ignore_mask, at least in msi.c. Maybe that should be a
different patch, adding some comments so that driver folks really see
the benefits of using the built-in restore functions, e.g.

This became a much bigger project than I thought; thanks all for
chiming in on it.
On Sun, Oct 24 2021 at 21:25, Jason Andryuk wrote:
> diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
> index 4b4792940e86..478536bafc39 100644
> --- a/drivers/pci/msi.c
> +++ b/drivers/pci/msi.c
> @@ -148,6 +148,9 @@ static noinline void pci_msi_update_mask(struct msi_desc *desc, u32 clear, u32 s
>  	raw_spinlock_t *lock = &desc->dev->msi_lock;
>  	unsigned long flags;
>  
> +	if (pci_msi_ignore_mask)
> +		return;
> +
>  	raw_spin_lock_irqsave(lock, flags);
>  	desc->msi_mask &= ~clear;
>  	desc->msi_mask |= set;
> @@ -181,6 +184,9 @@ static void pci_msix_write_vector_ctrl(struct msi_desc *desc, u32 ctrl)
>  {
>  	void __iomem *desc_addr = pci_msix_desc_addr(desc);
>  
> +	if (pci_msi_ignore_mask)
> +		return;
> +
>  	writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
>  }
>  
> @@ -200,7 +206,7 @@ static inline void pci_msix_unmask(struct msi_desc *desc)
>  
>  static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
>  {
> -	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
> +	if (desc->msi_attrib.is_virtual)
>  		return;
>  
>  	if (desc->msi_attrib.is_msix)
> @@ -211,7 +217,7 @@ static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
>  
>  static void __pci_msi_unmask_desc(struct msi_desc *desc, u32 mask)
>  {
> -	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
> +	if (desc->msi_attrib.is_virtual)
>  		return;
>  
>  	if (desc->msi_attrib.is_msix)

No, really. This is horrible and incomplete.

The right thing to do is to move the check back into the low-level
accessors and remove it from the call sites, simply because the
low-level accessors can be reached not only from the mask/unmask
functions.

But the above also fails to respect msi_attrib.maskbit...

I'll send out a proper fix in a few.

Thanks,

        tglx
diff --git a/drivers/pci/msi.c b/drivers/pci/msi.c
index 4b4792940e86..478536bafc39 100644
--- a/drivers/pci/msi.c
+++ b/drivers/pci/msi.c
@@ -148,6 +148,9 @@ static noinline void pci_msi_update_mask(struct msi_desc *desc, u32 clear, u32 s
 	raw_spinlock_t *lock = &desc->dev->msi_lock;
 	unsigned long flags;
 
+	if (pci_msi_ignore_mask)
+		return;
+
 	raw_spin_lock_irqsave(lock, flags);
 	desc->msi_mask &= ~clear;
 	desc->msi_mask |= set;
@@ -181,6 +184,9 @@ static void pci_msix_write_vector_ctrl(struct msi_desc *desc, u32 ctrl)
 {
 	void __iomem *desc_addr = pci_msix_desc_addr(desc);
 
+	if (pci_msi_ignore_mask)
+		return;
+
 	writel(ctrl, desc_addr + PCI_MSIX_ENTRY_VECTOR_CTRL);
 }
 
@@ -200,7 +206,7 @@ static inline void pci_msix_unmask(struct msi_desc *desc)
 
 static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
 {
-	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+	if (desc->msi_attrib.is_virtual)
 		return;
 
 	if (desc->msi_attrib.is_msix)
@@ -211,7 +217,7 @@ static void __pci_msi_mask_desc(struct msi_desc *desc, u32 mask)
 
 static void __pci_msi_unmask_desc(struct msi_desc *desc, u32 mask)
 {
-	if (pci_msi_ignore_mask || desc->msi_attrib.is_virtual)
+	if (desc->msi_attrib.is_virtual)
 		return;
 
 	if (desc->msi_attrib.is_msix)
commit fcacdfbef5a1 ("PCI/MSI: Provide a new set of mask and unmask
functions") introduced functions pci_msi_update_mask() and
pci_msix_write_vector_ctrl() that are missing the checks for
pci_msi_ignore_mask that exist in commit 446a98b19fd6 ("PCI/MSI: Use
new mask/unmask functions"). The checks are in place at the high-level
__pci_msi_mask_desc()/__pci_msi_unmask_desc(), but some functions call
the helpers directly.

Push the pci_msi_ignore_mask check down to the functions that make
the actual writes. This keeps the logic local to the writes that need
to be bypassed.

With Xen PV, the hypervisor is responsible for masking and unmasking
the interrupts, which pci_msi_ignore_mask is used to indicate.

This change avoids lockups in amdgpu drivers under Xen during boot.

Fixes: 446a98b19fd6 ("PCI/MSI: Use new mask/unmask functions")
Reported-by: Josef Johansson <josef@oderland.se>
Signed-off-by: Jason Andryuk <jandryuk@gmail.com>
---
 drivers/pci/msi.c | 10 ++++++++--
 1 file changed, 8 insertions(+), 2 deletions(-)