Message ID | 20230125180411.4742f159.olaf@aepfle.de (mailing list archive) |
---|---|
State | Not Applicable |
Headers | show |
Series | unlocked access to struct irq_data->chip_data in pci_hyperv | expand |
> From: Olaf Hering <olaf@aepfle.de> > Sent: Wednesday, January 25, 2023 9:04 AM > To: linux-hyperv@vger.kernel.org; Haiyang Zhang <haiyangz@microsoft.com> > Cc: linux-pci@vger.kernel.org; Dexuan Cui <decui@microsoft.com> > Subject: unlocked access to struct irq_data->chip_data in pci_hyperv > > Hello, > > there are several crash reports due to struct irq_data->chip_data being NULL. > > I was under the impression all the "recent changes" to pci-hyperv.c > would fix them. But apparently this specific issue is still there. Hi Olaf, thanks for debugging the issue! AFAIK, the last batch of patches for the vPCI driver were from Andrea Parri, who made the patches in April 2022. We hoped that all the VM crash issues could be fixed by Andrea's patches, but it looks like some recent reports of call-traces imply that there are still some race condition bugs to be investigated. We're working on this and hopefully we'll get to the bottom of this. > What does serialize read and write access to struct irq_data->chip_data? > It seems hv_msi_free can run while other code paths still access at least > ->chip_data. I see this comment in hv_compose_msi_msg(): /* * Record the assignment so that this can be unwound later. Using * irq_set_chip_data() here would be appropriate, but the lock it takes * is already held. */ So it looks like to me that hv_compose_msi_msg() may not buggy, but I'll double check the locking here. I suspect some of the crashes happen because the host starts to remove the VF device before the VF device is fully initialized by the pci-hyperv driver and/or the Mellanox VF driver, i.e. I suspect the race conditon(s) may be between hv_eject_device_work()/hv_pci_remove() and the async-probing Mellanox VF driver's probe() function. > The change below may reduce the window, but I'm not confident this would > actually resolve the concurrency issues. > > > Olaf > > --- a/drivers/pci/controller/pci-hyperv.c > +++ b/drivers/pci/controller/pci-hyperv.c > @@ -1760,8 +1760,9 @@ static void hv_compose_msi_msg(struct irq_data > *data, struct msi_msg *msg) > msi_desc->nvec_used > 1; > > /* Reuse the previous allocation */ > - if (data->chip_data && multi_msi) { > - int_desc = data->chip_data; > + virt_rmb(); > + int_desc = READ_ONCE(data->chip_data); > + if (int_desc && multi_msi) { > msg->address_hi = int_desc->address >> 32; > msg->address_lo = int_desc->address & 0xffffffff; > msg->data = int_desc->data; > @@ -1778,8 +1779,9 @@ static void hv_compose_msi_msg(struct irq_data > *data, struct msi_msg *msg) > goto return_null_message; > > /* Free any previous message that might have already been composed. */ > - if (data->chip_data && !multi_msi) { > - int_desc = data->chip_data; > + virt_rmb(); > + int_desc = READ_ONCE(data->chip_data); > + if (int_desc && !multi_msi) { > data->chip_data = NULL; > hv_int_desc_free(hpdev, int_desc); > }
--- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -1760,8 +1760,9 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) msi_desc->nvec_used > 1; /* Reuse the previous allocation */ - if (data->chip_data && multi_msi) { - int_desc = data->chip_data; + virt_rmb(); + int_desc = READ_ONCE(data->chip_data); + if (int_desc && multi_msi) { msg->address_hi = int_desc->address >> 32; msg->address_lo = int_desc->address & 0xffffffff; msg->data = int_desc->data; @@ -1778,8 +1779,9 @@ static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg) goto return_null_message; /* Free any previous message that might have already been composed. */ - if (data->chip_data && !multi_msi) { - int_desc = data->chip_data; + virt_rmb(); + int_desc = READ_ONCE(data->chip_data); + if (int_desc && !multi_msi) { data->chip_data = NULL; hv_int_desc_free(hpdev, int_desc); }