diff mbox series

[v2,1/2] PCI: hv: Fix a race condition when removing the device

Message ID 1619070346-21557-1-git-send-email-longli@linuxonhyperv.com (mailing list archive)
State Superseded
Series [v2,1/2] PCI: hv: Fix a race condition when removing the device

Commit Message

Long Li April 22, 2021, 5:45 a.m. UTC
From: Long Li <longli@microsoft.com>

On removing the device, any work item (hv_pci_devices_present() or
hv_pci_eject_device()) scheduled on workqueue hbus->wq may still be running
and race with hv_pci_remove().

This can happen because the host may send PCI_EJECT or PCI_BUS_RELATIONS(2)
and decide to rescind the channel immediately after that.

Fix this by flushing/stopping the workqueue of hbus before doing hbus remove.

Signed-off-by: Long Li <longli@microsoft.com>
---

Change in v2: Remove unused bus state hv_pcibus_removed

 drivers/pci/controller/pci-hyperv.c | 13 ++++++++++---
 1 file changed, 10 insertions(+), 3 deletions(-)

Comments

Dexuan Cui April 23, 2021, 7:09 a.m. UTC | #1
> From: longli@linuxonhyperv.com <longli@linuxonhyperv.com>
> Sent: Wednesday, April 21, 2021 10:46 PM
> ...
> diff --git a/drivers/pci/controller/pci-hyperv.c
> b/drivers/pci/controller/pci-hyperv.c
> index 27a17a1e4a7c..fc948a2ed703 100644
> --- a/drivers/pci/controller/pci-hyperv.c
> +++ b/drivers/pci/controller/pci-hyperv.c
> @@ -444,7 +444,6 @@ enum hv_pcibus_state {
>  	hv_pcibus_probed,
>  	hv_pcibus_installed,
>  	hv_pcibus_removing,
> -	hv_pcibus_removed,
>  	hv_pcibus_maximum
>  };
> 
> @@ -3305,13 +3304,22 @@ static int hv_pci_remove(struct hv_device *hdev)
> 
>  	hbus = hv_get_drvdata(hdev);
>  	if (hbus->state == hv_pcibus_installed) {
> +		tasklet_disable(&hdev->channel->callback_event);
> +		hbus->state = hv_pcibus_removing;
> +		tasklet_enable(&hdev->channel->callback_event);
> +		destroy_workqueue(hbus->wq);

If we test "rmmod pci-hyperv", I suspect the warning will be printed:
hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_start_relations_work():

        if (hbus->state == hv_pcibus_removing) {
                dev_info(&hbus->hdev->device,
                         "PCI VMBus BUS_RELATIONS: ignored\n");
                return -ENOENT;
        }

Ideally we'd like to avoid the warning in the driver unloading case.

BTW, can you please add "hbus->wq = NULL;" after the line
"destroy_workqueue(hbus->wq);"? In case some other function
tries to use hbus->wq by accident in the future, the error would be
easier to understand.
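
The reasoning behind clearing the pointer can be illustrated with a small
userspace model. This is only a sketch under stated assumptions, not driver
code: struct model_wq, struct model_bus, model_create_wq(), model_destroy_wq()
and model_queue_work() are hypothetical names, and the model reports an error
where the kernel would more likely oops on a NULL pointer dereference.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-ins for the workqueue and the hbus structure. */
struct model_wq {
	int pending;	/* number of queued work items */
};

struct model_bus {
	struct model_wq *wq;
};

/* Allocate the model workqueue; returns 0 on success. */
static int model_create_wq(struct model_bus *bus)
{
	bus->wq = calloc(1, sizeof(*bus->wq));
	return bus->wq ? 0 : -ENOMEM;
}

/* Destroy the workqueue and clear the pointer, as suggested above. */
static void model_destroy_wq(struct model_bus *bus)
{
	free(bus->wq);
	bus->wq = NULL;	/* a stale user now sees NULL, not freed memory */
}

/* Queue one work item; a cleared pointer is reported as -ENOENT. */
static int model_queue_work(struct model_bus *bus)
{
	if (!bus->wq)
		return -ENOENT;
	bus->wq->pending++;
	return 0;
}
```

Without the "bus->wq = NULL;" line, a late model_queue_work() call would
touch freed memory, which tends to fail in a much less obvious way.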

Long Li April 23, 2021, 6:32 p.m. UTC | #2
> Subject: RE: [Patch v2 1/2] PCI: hv: Fix a race condition when removing the
> device
> 
> > From: longli@linuxonhyperv.com <longli@linuxonhyperv.com>
> > Sent: Wednesday, April 21, 2021 10:46 PM ...
> > diff --git a/drivers/pci/controller/pci-hyperv.c
> > b/drivers/pci/controller/pci-hyperv.c
> > index 27a17a1e4a7c..fc948a2ed703 100644
> > --- a/drivers/pci/controller/pci-hyperv.c
> > +++ b/drivers/pci/controller/pci-hyperv.c
> > @@ -444,7 +444,6 @@ enum hv_pcibus_state {
> >  	hv_pcibus_probed,
> >  	hv_pcibus_installed,
> >  	hv_pcibus_removing,
> > -	hv_pcibus_removed,
> >  	hv_pcibus_maximum
> >  };
> >
> > @@ -3305,13 +3304,22 @@ static int hv_pci_remove(struct hv_device
> > *hdev)
> >
> >  	hbus = hv_get_drvdata(hdev);
> >  	if (hbus->state == hv_pcibus_installed) {
> > +		tasklet_disable(&hdev->channel->callback_event);
> > +		hbus->state = hv_pcibus_removing;
> > +		tasklet_enable(&hdev->channel->callback_event);
> > +		destroy_workqueue(hbus->wq);
> 
> If we test "rmmod pci-hyperv", I suspect the warning will be printed:
> hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_start_relations_work():

In most cases, it will not print anything.

It will print something if there is a PCI_BUS_RELATIONS work item pending at
the time of remove. The same goes for PCI_EJECT. In those cases, the message
is valuable for troubleshooting.

> 
>         if (hbus->state == hv_pcibus_removing) {
>                 dev_info(&hbus->hdev->device,
>                          "PCI VMBus BUS_RELATIONS: ignored\n");
>                 return -ENOENT;
>         }
> 
> Ideally we'd like to avoid the warning in the driver unloading case.
> 
> BTW, can you please add "hbus->wq = NULL;" after the line
> "destroy_workqueue(hbus->wq);"? In case some other function could still
> try to use hbus->wq by accident in the future, the error would be easier to
> be understood.

I will send v3 to add =NULL.

Dexuan Cui April 23, 2021, 6:41 p.m. UTC | #3
> From: Long Li <longli@microsoft.com>
> Sent: Friday, April 23, 2021 11:32 AM
> > ...
> > If we test "rmmod pci-hyperv", I suspect the warning will be printed:
> > hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_start_relations_work():
> 
> In most case, it will not print anything.

If I read the code correctly, I think the warning is printed _every time_ we
unload pci-hyperv.

> It will print something if there is a PCI_BUS_RELATION work pending at the time
> of remove. The same goes to PCI_EJECT. In those cases, the message is valuable
> to troubleshooting.

Long Li April 23, 2021, 6:49 p.m. UTC | #4
> Subject: RE: [Patch v2 1/2] PCI: hv: Fix a race condition when removing the
> device
> 
> > From: Long Li <longli@microsoft.com>
> > Sent: Friday, April 23, 2021 11:32 AM
> > > ...
> > > If we test "rmmod pci-hyperv", I suspect the warning will be printed:
> > > hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_start_relations_work():
> >
> > In most case, it will not print anything.
> 
> If I read the code correctly, I think the warning is printed _every time_ we
> unload pci-hyperv.

Okay I see what you mean. I'll remove this message.

> 
> > It will print something if there is a PCI_BUS_RELATION work pending at
> > the time of remove. The same goes to PCI_EJECT. In those cases, the
> > message is valuable to troubleshooting.

Dexuan Cui April 25, 2021, 2:24 a.m. UTC | #5
> From: Long Li <longli@microsoft.com>
> Sent: Friday, April 23, 2021 11:49 AM
> To: Dexuan Cui <decui@microsoft.com>; longli@linuxonhyperv.com; KY
> 
> > Subject: RE: [Patch v2 1/2] PCI: hv: Fix a race condition when removing the
> > device
> >
> > > From: Long Li <longli@microsoft.com>
> > > Sent: Friday, April 23, 2021 11:32 AM
> > > > ...
> > > > If we test "rmmod pci-hyperv", I suspect the warning will be printed:
> > > > hv_pci_remove() -> hv_pci_bus_exit() -> hv_pci_start_relations_work():
> > >
> > > In most case, it will not print anything.
> >
> > If I read the code correctly, I think the warning is printed _every time_ we
> > unload pci-hyperv.
> 
> Okay I see what you mean. I'll remove this message.

Here we just want to avoid the message every time the pci-hyperv driver is
unloaded. We might want to see the possible message when the PCI device
is removed, but it's OK with me if the message is unconditionally removed.

The real issue with the patch is that the 'hpdev' struct is never freed when
the driver is unloaded: if we print out the value of the ref counter in
put_pcichild(), we would notice that the ref counter is still two when the
driver is unloaded, i.e. a memory leak occurs.

Before the patch, hv_pci_remove() calls hv_pci_bus_exit() ->
hv_pci_start_relations_work(), and the ref counter drops to zero in
pci_devices_present_work() due to the two calls of put_pcichild().

With the patch, when the driver is unloaded, pci_devices_present_work() 
is not scheduled, hence the ref counter doesn't drop to zero.
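
To make the accounting concrete, here is a minimal userspace model of the
ref counter. It is a sketch only: struct model_hpdev, model_put(),
model_devices_present_work() and model_unload() are hypothetical names that
mimic the behavior described above, not the driver's real functions.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the hpdev struct; only the ref counter is modeled. */
struct model_hpdev {
	int refs;
	bool freed;
};

/* Drop one reference; the last put marks the struct as freed. */
static void model_put(struct model_hpdev *hpdev)
{
	if (--hpdev->refs == 0)
		hpdev->freed = true;	/* the real driver would free here */
}

/* Models the work item that drops the two remaining references. */
static void model_devices_present_work(struct model_hpdev *hpdev)
{
	model_put(hpdev);
	model_put(hpdev);
}

/*
 * Models driver unload and returns the ref counter left afterwards.
 * work_scheduled == true is the behavior before the patch, false is
 * the behavior with the patch applied.
 */
static int model_unload(bool work_scheduled)
{
	struct model_hpdev hpdev = { .refs = 2, .freed = false };

	if (work_scheduled)
		model_devices_present_work(&hpdev);
	return hpdev.refs;
}
```

In this model, model_unload(true) leaves the counter at zero, while
model_unload(false) leaves it at two, which corresponds to the leak.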

Long Li April 25, 2021, 4:53 a.m. UTC | #6
> Subject: RE: [Patch v2 1/2] PCI: hv: Fix a race condition when removing the
> device
> 
> > From: Long Li <longli@microsoft.com>
> > Sent: Friday, April 23, 2021 11:49 AM
> > To: Dexuan Cui <decui@microsoft.com>; longli@linuxonhyperv.com; KY
> >
> > > Subject: RE: [Patch v2 1/2] PCI: hv: Fix a race condition when
> > > removing the device
> > >
> > > > From: Long Li <longli@microsoft.com>
> > > > Sent: Friday, April 23, 2021 11:32 AM
> > > > > ...
> > > > > If we test "rmmod pci-hyperv", I suspect the warning will be printed:
> > > > > hv_pci_remove() -> hv_pci_bus_exit() ->
> hv_pci_start_relations_work():
> > > >
> > > > In most case, it will not print anything.
> > >
> > > If I read the code correctly, I think the warning is printed _every
> > > time_ we unload pci-hyperv.
> >
> > Okay I see what you mean. I'll remove this message.
> 
> Here we just want to avoid the message every time the pci-hyperv driver is
> unloaded. We might want to see the possible message when the PCI device
> is removed, but it's ok to me if the message is unconditionally removed.
> 
> The real issus with the patch is that the 'hpdev' struct is never freed when
> the driver is unloaded: if we print out the value of the ref counter in
> put_pcichild(), we would notice that the ref counter is still two when the
> driver is unloaded, i.e. memory leak occurs.
> 
> Before the patch, hv_pci_remove() calls hv_pci_bus_exit() ->
> hv_pci_start_relations_work(), and the ref counter drops to zero in
> pci_devices_present_work() due to the two calls of put_pcichild().
> 
> With the patch, when the driver is unloaded, pci_devices_present_work() is
> not scheduled, hence the ref counter doesn't drop to zero.

Yes, I also see the leak, thanks to this warning message. Those will get fixed in v3.

Patch

diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c
index 27a17a1e4a7c..fc948a2ed703 100644
--- a/drivers/pci/controller/pci-hyperv.c
+++ b/drivers/pci/controller/pci-hyperv.c
@@ -444,7 +444,6 @@  enum hv_pcibus_state {
 	hv_pcibus_probed,
 	hv_pcibus_installed,
 	hv_pcibus_removing,
-	hv_pcibus_removed,
 	hv_pcibus_maximum
 };
 
@@ -3305,13 +3304,22 @@  static int hv_pci_remove(struct hv_device *hdev)
 
 	hbus = hv_get_drvdata(hdev);
 	if (hbus->state == hv_pcibus_installed) {
+		tasklet_disable(&hdev->channel->callback_event);
+		hbus->state = hv_pcibus_removing;
+		tasklet_enable(&hdev->channel->callback_event);
+		destroy_workqueue(hbus->wq);
+		/*
+		 * At this point, no work is running or can be scheduled
+		 * on hbus->wq. We can't race with hv_pci_devices_present()
+		 * or hv_pci_eject_device(), so it's safe to proceed.
+		 */
+
 		/* Remove the bus from PCI's point of view. */
 		pci_lock_rescan_remove();
 		pci_stop_root_bus(hbus->pci_bus);
 		hv_pci_remove_slots(hbus);
 		pci_remove_root_bus(hbus->pci_bus);
 		pci_unlock_rescan_remove();
-		hbus->state = hv_pcibus_removed;
 	}
 
 	ret = hv_pci_bus_exit(hdev, false);
@@ -3326,7 +3334,6 @@  static int hv_pci_remove(struct hv_device *hdev)
 	irq_domain_free_fwnode(hbus->sysdata.fwnode);
 	put_hvpcibus(hbus);
 	wait_for_completion(&hbus->remove_event);
-	destroy_workqueue(hbus->wq);
 
 	hv_put_dom_num(hbus->sysdata.domain);