diff mbox series

unlocked access to struct irq_data->chip_data in pci_hyperv

Message ID 20230125180411.4742f159.olaf@aepfle.de (mailing list archive)
State Not Applicable
Headers show
Series unlocked access to struct irq_data->chip_data in pci_hyperv | expand

Commit Message

Olaf Hering Jan. 25, 2023, 5:04 p.m. UTC
Hello,

there are several crash reports due to struct irq_data->chip_data being NULL.

I was under the impression all the "recent changes" to pci-hyperv.c
would fix them. But apparently this specific issue is still there.

What does serialize read and write access to struct irq_data->chip_data?
It seems hv_msi_free can run while other code paths still access at least
->chip_data.

The change below may reduce the window, but I'm not confident this would
actually resolve the concurrency issues.


Olaf

Comments

Dexuan Cui Jan. 30, 2023, 9:47 a.m. UTC | #1
> From: Olaf Hering <olaf@aepfle.de>
> Sent: Wednesday, January 25, 2023 9:04 AM
> To: linux-hyperv@vger.kernel.org; Haiyang Zhang <haiyangz@microsoft.com>
> Cc: linux-pci@vger.kernel.org; Dexuan Cui <decui@microsoft.com>
> Subject: unlocked access to struct irq_data->chip_data in pci_hyperv
> 
> Hello,
> 
> there are several crash reports due to struct irq_data->chip_data being NULL.
> 
> I was under the impression all the "recent changes" to pci-hyperv.c
> would fix them. But apparently this specific issue is still there.

Hi Olaf, thanks for debugging the issue! AFAIK, the last batch of patches for
the vPCI driver were from Andrea Parri, who made the patches in April 2022.

We hoped that all the VM crash issues could be fixed by Andrea's patches,
but it looks like some recent reports of call-traces imply that there are still
some race condition bugs to be investigated. We're working on this and
hopefully we'll get to the bottom of this.

> What does serialize read and write access to struct irq_data->chip_data?
> It seems hv_msi_free can run while other code paths still access at least
> ->chip_data.

I see this comment in hv_compose_msi_msg():
        /*
         * Record the assignment so that this can be unwound later. Using
         * irq_set_chip_data() here would be appropriate, but the lock it takes
         * is already held.
         */
So it looks like to me that hv_compose_msi_msg() may not buggy, but I'll
double check the locking here.

I suspect some of the crashes happen because the host starts to remove the
VF device before the VF device is fully initialized by the pci-hyperv driver and/or
the Mellanox VF driver, i.e. I suspect the race conditon(s) may be between
hv_eject_device_work()/hv_pci_remove() and the async-probing Mellanox
VF driver's probe() function.

> The change below may reduce the window, but I'm not confident this would
> actually resolve the concurrency issues.
> 
> 
> Olaf
> 
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -1760,8 +1760,9 @@ static void hv_compose_msi_msg(struct irq_data
> *data, struct msi_msg *msg)
>  		    msi_desc->nvec_used > 1;
> 
>  	/* Reuse the previous allocation */
> -	if (data->chip_data && multi_msi) {
> -		int_desc = data->chip_data;
> +	virt_rmb();
> +	int_desc = READ_ONCE(data->chip_data);
> +	if (int_desc && multi_msi) {
>  		msg->address_hi = int_desc->address >> 32;
>  		msg->address_lo = int_desc->address & 0xffffffff;
>  		msg->data = int_desc->data;
> @@ -1778,8 +1779,9 @@ static void hv_compose_msi_msg(struct irq_data
> *data, struct msi_msg *msg)
>  		goto return_null_message;
> 
>  	/* Free any previous message that might have already been composed. */
> -	if (data->chip_data && !multi_msi) {
> -		int_desc = data->chip_data;
> +	virt_rmb();
> +	int_desc = READ_ONCE(data->chip_data);
> +	if (int_desc && !multi_msi) {
>  		data->chip_data = NULL;
>  		hv_int_desc_free(hpdev, int_desc);
>  	}
diff mbox series

Patch

--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -1760,8 +1760,9 @@  static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 		    msi_desc->nvec_used > 1;
 
 	/* Reuse the previous allocation */
-	if (data->chip_data && multi_msi) {
-		int_desc = data->chip_data;
+	virt_rmb();
+	int_desc = READ_ONCE(data->chip_data);
+	if (int_desc && multi_msi) {
 		msg->address_hi = int_desc->address >> 32;
 		msg->address_lo = int_desc->address & 0xffffffff;
 		msg->data = int_desc->data;
@@ -1778,8 +1779,9 @@  static void hv_compose_msi_msg(struct irq_data *data, struct msi_msg *msg)
 		goto return_null_message;
 
 	/* Free any previous message that might have already been composed. */
-	if (data->chip_data && !multi_msi) {
-		int_desc = data->chip_data;
+	virt_rmb();
+	int_desc = READ_ONCE(data->chip_data);
+	if (int_desc && !multi_msi) {
 		data->chip_data = NULL;
 		hv_int_desc_free(hpdev, int_desc);
 	}