Message ID | 20230420024037.5921-1-decui@microsoft.com (mailing list archive) |
---|---|
Series | pci-hyper: Fix race condition bugs for fast device hotplug |
> From: Dexuan Cui <decui@microsoft.com>
> Sent: Wednesday, April 19, 2023 7:41 PM
> ...
> Before the guest finishes probing a device, the host may be already starting
> to remove the device. Currently there are multiple race condition bugs in the
> pci-hyperv driver, which can cause the guest to panic. The patchset fixes
> the crashes.
>
> The patchset also does some cleanup work: patch 3 removes the useless
> hv_pcichild_state, and patch 4 reverts an old patch which is not really
> useful (without patch 4, it would be hard to make patch 5 clean).
>
> Patch 6 removes the use of a global mutex lock, and enables async-probing
> to allow concurrent device probing for faster boot.
>
> v3 is based on v6.3-rc5. No code change since v2. I just added Michael's
> and Long Li's Reviewed-by.
> ...
>
> Dexuan Cui (6):
>   PCI: hv: Fix a race condition bug in hv_pci_query_relations()
>   PCI: hv: Fix a race condition in hv_irq_unmask() that can cause panic
>   PCI: hv: Remove the useless hv_pcichild_state from struct hv_pci_dev
>   Revert "PCI: hv: Fix a timing issue which causes kdump to fail
>     occasionally"
>   PCI: hv: Add a per-bus mutex state_lock
>   PCI: hv: Use async probing to reduce boot time
>
>  drivers/pci/controller/pci-hyperv.c | 145 +++++++++++++++++-----------
>  1 file changed, 86 insertions(+), 59 deletions(-)

Hi Bjorn, Lorenzo, since basically this patchset is Hyper-V stuff, I would
like it to go through the hyper-v tree if you have no objection.

The hyper-v tree already has one big PCI patch from Michael:
https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/commit/?h=hyperv-next&id=2c6ba4216844ca7918289b49ed5f3f7138ee2402

Thanks,
Dexuan

> From: Dexuan Cui
> Sent: Thursday, April 20, 2023 7:04 PM
> > ...
> >
> > Dexuan Cui (6):
> >   PCI: hv: Fix a race condition bug in hv_pci_query_relations()
> >   PCI: hv: Fix a race condition in hv_irq_unmask() that can cause panic
> >   PCI: hv: Remove the useless hv_pcichild_state from struct hv_pci_dev
> >   Revert "PCI: hv: Fix a timing issue which causes kdump to fail
> >     occasionally"
> >   PCI: hv: Add a per-bus mutex state_lock
> >   PCI: hv: Use async probing to reduce boot time
> >
> >  drivers/pci/controller/pci-hyperv.c | 145 +++++++++++++++++-----------
> >  1 file changed, 86 insertions(+), 59 deletions(-)
>
> Hi Bjorn, Lorenzo, since basically this patchset is Hyper-V stuff, I would
> like it to go through the hyper-v tree if you have no objection.
>
> The hyper-v tree already has one big PCI patch from Michael:
> https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/commit/?h=hyperv-next&id=2c6ba4216844ca7918289b49ed5f3f7138ee2402
>
> Thanks,
> Dexuan

Hi Lorenzo, thanks for Ack'ing the patch:
Re: [PATCH v2] PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg

It would be great if you and/or Bjorn can Ack this patchset as well :-)

v1 of this patchset was posted on 3/28:
https://lwn.net/ml/linux-kernel/20230328045122.25850-1-decui%40microsoft.com/
and v3 got Michael Kelley's and Long Li's Reviewed-by.

I have done a long-haul testing against the patchset and it worked
reliably without causing any issue: without the patchset, usually the VM
can crash within 1~2 days; with the patchset, the VM is still running fine
after 2 weeks.

Thanks,
Dexuan

On Fri, Apr 21, 2023 at 10:23:03PM +0000, Dexuan Cui wrote:
> > From: Dexuan Cui
> > Sent: Thursday, April 20, 2023 7:04 PM
> > > ...
> > >
> > > Dexuan Cui (6):
> > >   PCI: hv: Fix a race condition bug in hv_pci_query_relations()
> > >   PCI: hv: Fix a race condition in hv_irq_unmask() that can cause panic
> > >   PCI: hv: Remove the useless hv_pcichild_state from struct hv_pci_dev
> > >   Revert "PCI: hv: Fix a timing issue which causes kdump to fail
> > >     occasionally"
> > >   PCI: hv: Add a per-bus mutex state_lock
> > >   PCI: hv: Use async probing to reduce boot time
> > >
> > >  drivers/pci/controller/pci-hyperv.c | 145 +++++++++++++++++-----------
> > >  1 file changed, 86 insertions(+), 59 deletions(-)
> >
> > Hi Bjorn, Lorenzo, since basically this patchset is Hyper-V stuff, I would
> > like it to go through the hyper-v tree if you have no objection.
> >
> > The hyper-v tree already has one big PCI patch from Michael:
> > https://git.kernel.org/pub/scm/linux/kernel/git/hyperv/linux.git/commit/?h=hyperv-next&id=2c6ba4216844ca7918289b49ed5f3f7138ee2402
> >
> > Thanks,
> > Dexuan
>
> Hi Lorenzo, thanks for Ack'ing the patch:
> Re: [PATCH v2] PCI: hv: Replace retarget_msi_interrupt_params with hyperv_pcpu_input_arg
>
> It would be great if you and/or Bjorn can Ack this patchset as well :-)
>

Lorenzo and Bjorn, are you happy with these patches?

I can collect them via the hyperv-fixes tree.

Thanks,
Wei.

> v1 of this patchset was posted on 3/28:
> https://lwn.net/ml/linux-kernel/20230328045122.25850-1-decui%40microsoft.com/
> and v3 got Michael Kelley's and Long Li's Reviewed-by.
>
> I have done a long-haul testing against the patchset and it worked
> reliably without causing any issue: without the patchset, usually the VM
> can crash within 1~2 days; with the patchset, the VM is still running fine
> after 2 weeks.
>
> Thanks,
> Dexuan