Message ID | 1629789620-11049-1-git-send-email-longli@linuxonhyperv.com (mailing list archive) |
---|---|
State | Changes Requested |
Delegated to: | Lorenzo Pieralisi |
Headers | show |
Series | PCI: hv: Fix a bug on removing child devices on the bus | expand |
On Tue, Aug 24, 2021 at 12:20:20AM -0700, longli@linuxonhyperv.com wrote: > From: Long Li <longli@microsoft.com> > > In hv_pci_bus_exit, the code is holding a spinlock while calling > pci_destroy_slot(), which takes a mutex. > > This is not safe for spinlock. Fix this by moving the children to be > deleted to a list on the stack, and removing them after spinlock is > released. > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the device") > > Cc: "K. Y. Srinivasan" <kys@microsoft.com> > Cc: Haiyang Zhang <haiyangz@microsoft.com> > Cc: Stephen Hemminger <sthemmin@microsoft.com> > Cc: Wei Liu <wei.liu@kernel.org> > Cc: Dexuan Cui <decui@microsoft.com> > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > Cc: Rob Herring <robh@kernel.org> > Cc: "Krzysztof Wilczyński" <kw@linux.com> > Cc: Bjorn Helgaas <bhelgaas@google.com> > Cc: Michael Kelley <mikelley@microsoft.com> > Cc: Dan Carpenter <dan.carpenter@oracle.com> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > Signed-off-by: Long Li <longli@microsoft.com> > --- > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++--- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c > index a53bd8728d0d..d4f3cce18957 100644 > --- a/drivers/pci/controller/pci-hyperv.c > +++ b/drivers/pci/controller/pci-hyperv.c > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) > struct hv_pci_dev *hpdev, *tmp; > unsigned long flags; > int ret; > + struct list_head removed; This can be moved to where it is needed -- the if(!keep_dev) branch -- to limit its scope. > > /* > * After the host sends the RESCIND_CHANNEL message, it doesn't > @@ -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) > return 0; > > if (!keep_devs) { > - /* Delete any children which might still exist. */ > + INIT_LIST_HEAD(&removed); > + > + /* Move all present children to the list on stack */ > spin_lock_irqsave(&hbus->device_list_lock, flags); > - list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) { > + list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) > + list_move_tail(&hpdev->list_entry, &removed); > + spin_unlock_irqrestore(&hbus->device_list_lock, flags); > + > + /* Remove all children in the list */ > + while (!list_empty(&removed)) { > + hpdev = list_first_entry(&removed, struct hv_pci_dev, > + list_entry); list_for_each_entry_safe can also be used here, right? Wei. > list_del(&hpdev->list_entry); > if (hpdev->pci_slot) > pci_destroy_slot(hpdev->pci_slot); > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) > put_pcichild(hpdev); > put_pcichild(hpdev); > } > - spin_unlock_irqrestore(&hbus->device_list_lock, flags); > } > > ret = hv_send_resources_released(hdev); > -- > 2.25.1 >
"Fix a bug ..." is not a very useful subject line. It doesn't say anything about what the patch *does*. It doesn't hint at a locking change. On Tue, Aug 24, 2021 at 12:20:20AM -0700, longli@linuxonhyperv.com wrote: > From: Long Li <longli@microsoft.com> > > In hv_pci_bus_exit, the code is holding a spinlock while calling > pci_destroy_slot(), which takes a mutex. It's unfortunate that slots are not better integrated into the PCI core. I'm sorry your driver even has to worry about this. > > This is not safe for spinlock. Fix this by moving the children to be > deleted to a list on the stack, and removing them after spinlock is > released. > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the device") > > Cc: "K. Y. Srinivasan" <kys@microsoft.com> > Cc: Haiyang Zhang <haiyangz@microsoft.com> > Cc: Stephen Hemminger <sthemmin@microsoft.com> > Cc: Wei Liu <wei.liu@kernel.org> > Cc: Dexuan Cui <decui@microsoft.com> > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > Cc: Rob Herring <robh@kernel.org> > Cc: "Krzysztof Wilczyński" <kw@linux.com> > Cc: Bjorn Helgaas <bhelgaas@google.com> > Cc: Michael Kelley <mikelley@microsoft.com> > Cc: Dan Carpenter <dan.carpenter@oracle.com> > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> A lore link to Dan's report would be useful here. > Signed-off-by: Long Li <longli@microsoft.com> > --- > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++--- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c > index a53bd8728d0d..d4f3cce18957 100644 > --- a/drivers/pci/controller/pci-hyperv.c > +++ b/drivers/pci/controller/pci-hyperv.c > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) > struct hv_pci_dev *hpdev, *tmp; > unsigned long flags; > int ret; > + struct list_head removed; > > /* > * After the host sends the RESCIND_CHANNEL message, it doesn't > @@ -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) > return 0; > > if (!keep_devs) { > - /* Delete any children which might still exist. */ > + INIT_LIST_HEAD(&removed); > + > + /* Move all present children to the list on stack */ > spin_lock_irqsave(&hbus->device_list_lock, flags); > - list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) { > + list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) > + list_move_tail(&hpdev->list_entry, &removed); > + spin_unlock_irqrestore(&hbus->device_list_lock, flags); > + > + /* Remove all children in the list */ > + while (!list_empty(&removed)) { > + hpdev = list_first_entry(&removed, struct hv_pci_dev, > + list_entry); > list_del(&hpdev->list_entry); > if (hpdev->pci_slot) > pci_destroy_slot(hpdev->pci_slot); > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) > put_pcichild(hpdev); > put_pcichild(hpdev); > } > - spin_unlock_irqrestore(&hbus->device_list_lock, flags); > } > > ret = hv_send_resources_released(hdev); > -- > 2.25.1 >
> Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus > > On Tue, Aug 24, 2021 at 12:20:20AM -0700, longli@linuxonhyperv.com wrote: > > From: Long Li <longli@microsoft.com> > > > > In hv_pci_bus_exit, the code is holding a spinlock while calling > > pci_destroy_slot(), which takes a mutex. > > > > This is not safe for spinlock. Fix this by moving the children to be > > deleted to a list on the stack, and removing them after spinlock is > > released. > > > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the > > device") > > > > Cc: "K. Y. Srinivasan" <kys@microsoft.com> > > Cc: Haiyang Zhang <haiyangz@microsoft.com> > > Cc: Stephen Hemminger <sthemmin@microsoft.com> > > Cc: Wei Liu <wei.liu@kernel.org> > > Cc: Dexuan Cui <decui@microsoft.com> > > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > > Cc: Rob Herring <robh@kernel.org> > > Cc: "Krzysztof Wilczyński" <kw@linux.com> > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > Cc: Michael Kelley <mikelley@microsoft.com> > > Cc: Dan Carpenter <dan.carpenter@oracle.com> > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > > Signed-off-by: Long Li <longli@microsoft.com> > > --- > > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++--- > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/pci/controller/pci-hyperv.c > > b/drivers/pci/controller/pci-hyperv.c > > index a53bd8728d0d..d4f3cce18957 100644 > > --- a/drivers/pci/controller/pci-hyperv.c > > +++ b/drivers/pci/controller/pci-hyperv.c > > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, > bool keep_devs) > > struct hv_pci_dev *hpdev, *tmp; > > unsigned long flags; > > int ret; > > + struct list_head removed; > > This can be moved to where it is needed -- the if(!keep_dev) branch -- to limit its > scope. > > > > > /* > > * After the host sends the RESCIND_CHANNEL message, it doesn't @@ > > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool > keep_devs) > > return 0; > > > > if (!keep_devs) { > > - /* Delete any children which might still exist. */ > > + INIT_LIST_HEAD(&removed); > > + > > + /* Move all present children to the list on stack */ > > spin_lock_irqsave(&hbus->device_list_lock, flags); > > - list_for_each_entry_safe(hpdev, tmp, &hbus->children, > list_entry) { > > + list_for_each_entry_safe(hpdev, tmp, &hbus->children, > list_entry) > > + list_move_tail(&hpdev->list_entry, &removed); > > + spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > + > > + /* Remove all children in the list */ > > + while (!list_empty(&removed)) { > > + hpdev = list_first_entry(&removed, struct hv_pci_dev, > > + list_entry); > > list_for_each_entry_safe can also be used here, right? > > Wei. I will address your comments. Long > > > list_del(&hpdev->list_entry); > > if (hpdev->pci_slot) > > pci_destroy_slot(hpdev->pci_slot); > > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev, > bool keep_devs) > > put_pcichild(hpdev); > > put_pcichild(hpdev); > > } > > - spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > } > > > > ret = hv_send_resources_released(hdev); > > -- > > 2.25.1 > >
> Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus > > "Fix a bug ..." is not a very useful subject line. It doesn't say anything about what > the patch *does*. It doesn't hint at a locking change. > > On Tue, Aug 24, 2021 at 12:20:20AM -0700, longli@linuxonhyperv.com wrote: > > From: Long Li <longli@microsoft.com> > > > > In hv_pci_bus_exit, the code is holding a spinlock while calling > > pci_destroy_slot(), which takes a mutex. > > It's unfortunate that slots are not better integrated into the PCI core. I'm sorry > your driver even has to worry about this. > > > > This is not safe for spinlock. Fix this by moving the children to be > > deleted to a list on the stack, and removing them after spinlock is > > released. > > > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the > > device") > > > > Cc: "K. Y. Srinivasan" <kys@microsoft.com> > > Cc: Haiyang Zhang <haiyangz@microsoft.com> > > Cc: Stephen Hemminger <sthemmin@microsoft.com> > > Cc: Wei Liu <wei.liu@kernel.org> > > Cc: Dexuan Cui <decui@microsoft.com> > > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > > Cc: Rob Herring <robh@kernel.org> > > Cc: "Krzysztof Wilczyński" <kw@linux.com> > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > Cc: Michael Kelley <mikelley@microsoft.com> > > Cc: Dan Carpenter <dan.carpenter@oracle.com> > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > > A lore link to Dan's report would be useful here. I will address your comments and send v2. Long > > > Signed-off-by: Long Li <longli@microsoft.com> > > --- > > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++--- > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > diff --git a/drivers/pci/controller/pci-hyperv.c > > b/drivers/pci/controller/pci-hyperv.c > > index a53bd8728d0d..d4f3cce18957 100644 > > --- a/drivers/pci/controller/pci-hyperv.c > > +++ b/drivers/pci/controller/pci-hyperv.c > > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, > bool keep_devs) > > struct hv_pci_dev *hpdev, *tmp; > > unsigned long flags; > > int ret; > > + struct list_head removed; > > > > /* > > * After the host sends the RESCIND_CHANNEL message, it doesn't @@ > > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool > keep_devs) > > return 0; > > > > if (!keep_devs) { > > - /* Delete any children which might still exist. */ > > + INIT_LIST_HEAD(&removed); > > + > > + /* Move all present children to the list on stack */ > > spin_lock_irqsave(&hbus->device_list_lock, flags); > > - list_for_each_entry_safe(hpdev, tmp, &hbus->children, > list_entry) { > > + list_for_each_entry_safe(hpdev, tmp, &hbus->children, > list_entry) > > + list_move_tail(&hpdev->list_entry, &removed); > > + spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > + > > + /* Remove all children in the list */ > > + while (!list_empty(&removed)) { > > + hpdev = list_first_entry(&removed, struct hv_pci_dev, > > + list_entry); > > list_del(&hpdev->list_entry); > > if (hpdev->pci_slot) > > pci_destroy_slot(hpdev->pci_slot); > > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev, > bool keep_devs) > > put_pcichild(hpdev); > > put_pcichild(hpdev); > > } > > - spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > } > > > > ret = hv_send_resources_released(hdev); > > -- > > 2.25.1 > >
From: Long Li <longli@microsoft.com> Sent: Tuesday, August 24, 2021 10:28 AM > > > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus > > > > On Tue, Aug 24, 2021 at 12:20:20AM -0700, longli@linuxonhyperv.com wrote: > > > From: Long Li <longli@microsoft.com> > > > > > > In hv_pci_bus_exit, the code is holding a spinlock while calling > > > pci_destroy_slot(), which takes a mutex. > > > > > > This is not safe for spinlock. Fix this by moving the children to be > > > deleted to a list on the stack, and removing them after spinlock is > > > released. > > > > > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the > > > device") > > > > > > Cc: "K. Y. Srinivasan" <kys@microsoft.com> > > > Cc: Haiyang Zhang <haiyangz@microsoft.com> > > > Cc: Stephen Hemminger <sthemmin@microsoft.com> > > > Cc: Wei Liu <wei.liu@kernel.org> > > > Cc: Dexuan Cui <decui@microsoft.com> > > > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > > > Cc: Rob Herring <robh@kernel.org> > > > Cc: "Krzysztof Wilczyński" <kw@linux.com> > > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > > Cc: Michael Kelley <mikelley@microsoft.com> > > > Cc: Dan Carpenter <dan.carpenter@oracle.com> > > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > > > Signed-off-by: Long Li <longli@microsoft.com> > > > --- > > > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++--- > > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > > > diff --git a/drivers/pci/controller/pci-hyperv.c > > > b/drivers/pci/controller/pci-hyperv.c > > > index a53bd8728d0d..d4f3cce18957 100644 > > > --- a/drivers/pci/controller/pci-hyperv.c > > > +++ b/drivers/pci/controller/pci-hyperv.c > > > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, > > bool keep_devs) > > > struct hv_pci_dev *hpdev, *tmp; > > > unsigned long flags; > > > int ret; > > > + struct list_head removed; > > > > This can be moved to where it is needed -- the if(!keep_dev) branch -- to limit its > > scope. > > > > > > > > /* > > > * After the host sends the RESCIND_CHANNEL message, it doesn't @@ > > > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool > > keep_devs) > > > return 0; > > > > > > if (!keep_devs) { > > > - /* Delete any children which might still exist. */ > > > + INIT_LIST_HEAD(&removed); > > > + > > > + /* Move all present children to the list on stack */ > > > spin_lock_irqsave(&hbus->device_list_lock, flags); > > > - list_for_each_entry_safe(hpdev, tmp, &hbus->children, > > list_entry) { > > > + list_for_each_entry_safe(hpdev, tmp, &hbus->children, > > list_entry) > > > + list_move_tail(&hpdev->list_entry, &removed); > > > + spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > > + > > > + /* Remove all children in the list */ > > > + while (!list_empty(&removed)) { > > > + hpdev = list_first_entry(&removed, struct hv_pci_dev, > > > + list_entry); > > > > list_for_each_entry_safe can also be used here, right? > > > > Wei. > > I will address your comments. > > Long I thought list_for_each_entry_safe() is for use when list manipulation is *not* protected by a lock and you want to safely walk the list even if an entry gets removed. If the list is protected by a lock or not subject to contention (as is the case here), then list_for_each_entry() is the simpler implementation. The original implementation didn't need to use the _safe version because of the spin lock. Or do I have it backwards? Michael > > > > > > list_del(&hpdev->list_entry); > > > if (hpdev->pci_slot) > > > pci_destroy_slot(hpdev->pci_slot); > > > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev, > > bool keep_devs) > > > put_pcichild(hpdev); > > > put_pcichild(hpdev); > > > } > > > - spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > > } > > > > > > ret = hv_send_resources_released(hdev); > > > -- > > > 2.25.1 > > >
> Subject: RE: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus > > From: Long Li <longli@microsoft.com> Sent: Tuesday, August 24, 2021 10:28 > AM > > > > > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on > > > the bus > > > > > > On Tue, Aug 24, 2021 at 12:20:20AM -0700, longli@linuxonhyperv.com > wrote: > > > > From: Long Li <longli@microsoft.com> > > > > > > > > In hv_pci_bus_exit, the code is holding a spinlock while calling > > > > pci_destroy_slot(), which takes a mutex. > > > > > > > > This is not safe for spinlock. Fix this by moving the children to > > > > be deleted to a list on the stack, and removing them after > > > > spinlock is released. > > > > > > > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing > > > > the > > > > device") > > > > > > > > Cc: "K. Y. Srinivasan" <kys@microsoft.com> > > > > Cc: Haiyang Zhang <haiyangz@microsoft.com> > > > > Cc: Stephen Hemminger <sthemmin@microsoft.com> > > > > Cc: Wei Liu <wei.liu@kernel.org> > > > > Cc: Dexuan Cui <decui@microsoft.com> > > > > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > > > > Cc: Rob Herring <robh@kernel.org> > > > > Cc: "Krzysztof Wilczyński" <kw@linux.com> > > > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > > > Cc: Michael Kelley <mikelley@microsoft.com> > > > > Cc: Dan Carpenter <dan.carpenter@oracle.com> > > > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > > > > Signed-off-by: Long Li <longli@microsoft.com> > > > > --- > > > > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++--- > > > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/drivers/pci/controller/pci-hyperv.c > > > > b/drivers/pci/controller/pci-hyperv.c > > > > index a53bd8728d0d..d4f3cce18957 100644 > > > > --- a/drivers/pci/controller/pci-hyperv.c > > > > +++ b/drivers/pci/controller/pci-hyperv.c > > > > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device > > > > *hdev, > > > bool keep_devs) > > > > struct hv_pci_dev *hpdev, *tmp; > > > > unsigned long flags; > > > > int ret; > > > > + struct list_head removed; > > > > > > This can be moved to where it is needed -- the if(!keep_dev) branch > > > -- to limit its scope. > > > > > > > > > > > /* > > > > * After the host sends the RESCIND_CHANNEL message, it doesn't > > > > @@ > > > > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device > > > > *hdev, bool > > > keep_devs) > > > > return 0; > > > > > > > > if (!keep_devs) { > > > > - /* Delete any children which might still exist. */ > > > > + INIT_LIST_HEAD(&removed); > > > > + > > > > + /* Move all present children to the list on stack */ > > > > spin_lock_irqsave(&hbus->device_list_lock, flags); > > > > - list_for_each_entry_safe(hpdev, tmp, &hbus->children, > > > list_entry) { > > > > + list_for_each_entry_safe(hpdev, tmp, &hbus->children, > > > list_entry) > > > > + list_move_tail(&hpdev->list_entry, &removed); > > > > + spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > > > + > > > > + /* Remove all children in the list */ > > > > + while (!list_empty(&removed)) { > > > > + hpdev = list_first_entry(&removed, struct hv_pci_dev, > > > > + list_entry); > > > > > > list_for_each_entry_safe can also be used here, right? > > > > > > Wei. > > > > I will address your comments. > > > > Long > > I thought list_for_each_entry_safe() is for use when list manipulation is *not* > protected by a lock and you want to safely walk the list even if an entry gets > removed. If the list is protected by a lock or not subject to contention (as is the > case here), then > list_for_each_entry() is the simpler implementation. The original > implementation didn't need to use the _safe version because of the spin lock. > > Or do I have it backwards? > > Michael I think we need list_for_each_entry_safe() because we delete the list elements while going through them: Here is the comment on list_for_each_entry_safe(): /** * Loop through the list, keeping a backup pointer to the element. This * macro allows for the deletion of a list element while looping through the * list. * * See list_for_each_entry for more details. */ > > > > > > > > > > list_del(&hpdev->list_entry); > > > > if (hpdev->pci_slot) > > > > pci_destroy_slot(hpdev->pci_slot); > > > > @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device > > > > *hdev, > > > bool keep_devs) > > > > put_pcichild(hpdev); > > > > put_pcichild(hpdev); > > > > } > > > > - spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > > > } > > > > > > > > ret = hv_send_resources_released(hdev); > > > > -- > > > > 2.25.1 > > > >
On Wed, Aug 25, 2021 at 2:11 PM Michael Kelley <mikelley@microsoft.com> wrote: > > From: Long Li <longli@microsoft.com> Sent: Tuesday, August 24, 2021 10:28 AM > > > > > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus > > > > > > On Tue, Aug 24, 2021 at 12:20:20AM -0700, longli@linuxonhyperv.com wrote: > > > > From: Long Li <longli@microsoft.com> > > > > > > > > In hv_pci_bus_exit, the code is holding a spinlock while calling > > > > pci_destroy_slot(), which takes a mutex. > > > > > > > > This is not safe for spinlock. Fix this by moving the children to be > > > > deleted to a list on the stack, and removing them after spinlock is > > > > released. > > > > > > > > Fixes: 94d22763207a ("PCI: hv: Fix a race condition when removing the > > > > device") > > > > > > > > Cc: "K. Y. Srinivasan" <kys@microsoft.com> > > > > Cc: Haiyang Zhang <haiyangz@microsoft.com> > > > > Cc: Stephen Hemminger <sthemmin@microsoft.com> > > > > Cc: Wei Liu <wei.liu@kernel.org> > > > > Cc: Dexuan Cui <decui@microsoft.com> > > > > Cc: Lorenzo Pieralisi <lorenzo.pieralisi@arm.com> > > > > Cc: Rob Herring <robh@kernel.org> > > > > Cc: "Krzysztof Wilczyński" <kw@linux.com> > > > > Cc: Bjorn Helgaas <bhelgaas@google.com> > > > > Cc: Michael Kelley <mikelley@microsoft.com> > > > > Cc: Dan Carpenter <dan.carpenter@oracle.com> > > > > Reported-by: Dan Carpenter <dan.carpenter@oracle.com> > > > > Signed-off-by: Long Li <longli@microsoft.com> > > > > --- > > > > drivers/pci/controller/pci-hyperv.c | 15 ++++++++++++--- > > > > 1 file changed, 12 insertions(+), 3 deletions(-) > > > > > > > > diff --git a/drivers/pci/controller/pci-hyperv.c > > > > b/drivers/pci/controller/pci-hyperv.c > > > > index a53bd8728d0d..d4f3cce18957 100644 > > > > --- a/drivers/pci/controller/pci-hyperv.c > > > > +++ b/drivers/pci/controller/pci-hyperv.c > > > > @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, > > > bool keep_devs) > > > > struct hv_pci_dev *hpdev, *tmp; > > > > unsigned long flags; > > > > int ret; > > > > + struct list_head removed; > > > > > > This can be moved to where it is needed -- the if(!keep_dev) branch -- to limit its > > > scope. > > > > > > > > > > > /* > > > > * After the host sends the RESCIND_CHANNEL message, it doesn't @@ > > > > -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool > > > keep_devs) > > > > return 0; > > > > > > > > if (!keep_devs) { > > > > - /* Delete any children which might still exist. */ > > > > + INIT_LIST_HEAD(&removed); > > > > + > > > > + /* Move all present children to the list on stack */ > > > > spin_lock_irqsave(&hbus->device_list_lock, flags); > > > > - list_for_each_entry_safe(hpdev, tmp, &hbus->children, > > > list_entry) { > > > > + list_for_each_entry_safe(hpdev, tmp, &hbus->children, > > > list_entry) > > > > + list_move_tail(&hpdev->list_entry, &removed); > > > > + spin_unlock_irqrestore(&hbus->device_list_lock, flags); > > > > + > > > > + /* Remove all children in the list */ > > > > + while (!list_empty(&removed)) { > > > > + hpdev = list_first_entry(&removed, struct hv_pci_dev, > > > > + list_entry); > > > > > > list_for_each_entry_safe can also be used here, right? > > > > > > Wei. > > > > I will address your comments. > > > > Long > > I thought list_for_each_entry_safe() is for use when list manipulation > is *not* protected by a lock and you want to safely walk the list > even if an entry gets removed. If the list is protected by a lock or > not subject to contention (as is the case here), then > list_for_each_entry() is the simpler implementation. The original > implementation didn't need to use the _safe version because of > the spin lock. > > Or do I have it backwards? "_safe" only means "safe against removal of list entry" as the kerneldoc says. But that means removal within the loop iteration, not any writer. A lock is needed in either case if there's another writer. Don't ask me about the RCU variant though... Rob
From: Long Li <longli@microsoft.com> Sent: Wednesday, August 25, 2021 1:25 PM > > > > I thought list_for_each_entry_safe() is for use when list manipulation is *not* > > protected by a lock and you want to safely walk the list even if an entry gets > > removed. If the list is protected by a lock or not subject to contention (as is the > > case here), then > > list_for_each_entry() is the simpler implementation. The original > > implementation didn't need to use the _safe version because of the spin lock. > > > > Or do I have it backwards? > > > > Michael > > I think we need list_for_each_entry_safe() because we delete the list elements while going through them: > > Here is the comment on list_for_each_entry_safe(): > /** > * Loop through the list, keeping a backup pointer to the element. This > * macro allows for the deletion of a list element while looping through the > * list. > * > * See list_for_each_entry for more details. > */ > Got it. Thanks (and to Rob Herring). I read that comment but with the wrong assumptions and didn't understand it correctly. Interestingly, pci-hyperv.c has another case of looping through this list and removing items where the _safe version is not used. See pci_devices_present_work() where the missing children are moved to a list on the stack. Michael
On Thu, Aug 26, 2021 at 04:50:28PM +0000, Michael Kelley wrote: > From: Long Li <longli@microsoft.com> Sent: Wednesday, August 25, 2021 1:25 PM > > > > > > > I thought list_for_each_entry_safe() is for use when list manipulation is *not* > > > protected by a lock and you want to safely walk the list even if an entry gets > > > removed. If the list is protected by a lock or not subject to contention (as is the > > > case here), then > > > list_for_each_entry() is the simpler implementation. The original > > > implementation didn't need to use the _safe version because of the spin lock. > > > > > > Or do I have it backwards? > > > > > > Michael > > > > I think we need list_for_each_entry_safe() because we delete the list elements while going through them: > > > > Here is the comment on list_for_each_entry_safe(): > > /** > > * Loop through the list, keeping a backup pointer to the element. This > > * macro allows for the deletion of a list element while looping through the > > * list. > > * > > * See list_for_each_entry for more details. > > */ > > > > Got it. Thanks (and to Rob Herring). I read that comment but > with the wrong assumptions and didn't understand it correctly. > > Interestingly, pci-hyperv.c has another case of looping through > this list and removing items where the _safe version is not used. > See pci_devices_present_work() where the missing children are > moved to a list on the stack. That can be converted too, I think. The original code is not wrong per-se. It is just not as concise as using list_for_each_entry_safe. Wei. > > Michael
> Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus > > On Thu, Aug 26, 2021 at 04:50:28PM +0000, Michael Kelley wrote: > > From: Long Li <longli@microsoft.com> Sent: Wednesday, August 25, 2021 > > 1:25 PM > > > > > > > > > > I thought list_for_each_entry_safe() is for use when list > > > > manipulation is *not* protected by a lock and you want to safely > > > > walk the list even if an entry gets removed. If the list is > > > > protected by a lock or not subject to contention (as is the case > > > > here), then > > > > list_for_each_entry() is the simpler implementation. The original > > > > implementation didn't need to use the _safe version because of the spin > lock. > > > > > > > > Or do I have it backwards? > > > > > > > > Michael > > > > > > I think we need list_for_each_entry_safe() because we delete the list > elements while going through them: > > > > > > Here is the comment on list_for_each_entry_safe(): > > > /** > > > * Loop through the list, keeping a backup pointer to the element. > > > This > > > * macro allows for the deletion of a list element while looping > > > through the > > > * list. > > > * > > > * See list_for_each_entry for more details. > > > */ > > > > > > > Got it. Thanks (and to Rob Herring). I read that comment but > > with the wrong assumptions and didn't understand it correctly. > > > > Interestingly, pci-hyperv.c has another case of looping through this > > list and removing items where the _safe version is not used. > > See pci_devices_present_work() where the missing children are moved to > > a list on the stack. > > That can be converted too, I think. > > The original code is not wrong per-se. It is just not as concise as using > list_for_each_entry_safe. > > Wei. I assume we are talking about the following code in pci_devices_present_work(): list_for_each_entry(hpdev, &hbus->children, list_entry) { if (hpdev->reported_missing) { found = true; put_pcichild(hpdev); list_move_tail(&hpdev->list_entry, &removed); break; } } This code is correct as there is a "break" after a list entry is removed from the list. So there is no need to use the _safe version. This code can be converted to use the _safe version.
On Thu, Aug 26, 2021 at 08:09:19PM +0000, Long Li wrote: > > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus > > > > On Thu, Aug 26, 2021 at 04:50:28PM +0000, Michael Kelley wrote: > > > From: Long Li <longli@microsoft.com> Sent: Wednesday, August 25, 2021 > > > 1:25 PM > > > > > > > > > > > > > I thought list_for_each_entry_safe() is for use when list > > > > > manipulation is *not* protected by a lock and you want to safely > > > > > walk the list even if an entry gets removed. If the list is > > > > > protected by a lock or not subject to contention (as is the case > > > > > here), then > > > > > list_for_each_entry() is the simpler implementation. The original > > > > > implementation didn't need to use the _safe version because of the spin > > lock. > > > > > > > > > > Or do I have it backwards? > > > > > > > > > > Michael > > > > > > > > I think we need list_for_each_entry_safe() because we delete the list > > elements while going through them: > > > > > > > > Here is the comment on list_for_each_entry_safe(): > > > > /** > > > > * Loop through the list, keeping a backup pointer to the element. > > > > This > > > > * macro allows for the deletion of a list element while looping > > > > through the > > > > * list. > > > > * > > > > * See list_for_each_entry for more details. > > > > */ > > > > > > > > > > Got it. Thanks (and to Rob Herring). I read that comment but > > > with the wrong assumptions and didn't understand it correctly. > > > > > > Interestingly, pci-hyperv.c has another case of looping through this > > > list and removing items where the _safe version is not used. > > > See pci_devices_present_work() where the missing children are moved to > > > a list on the stack. > > > > That can be converted too, I think. > > > > The original code is not wrong per-se. It is just not as concise as using > > list_for_each_entry_safe. > > > > Wei. > > I assume we are talking about the following code in pci_devices_present_work(): > > list_for_each_entry(hpdev, &hbus->children, list_entry) { > if (hpdev->reported_missing) { > found = true; > put_pcichild(hpdev); > list_move_tail(&hpdev->list_entry, &removed); > break; > } > } > > This code is correct as there is a "break" after a list entry is > removed from the list. So there is no need to use the _safe version. > This code can be converted to use the _safe version. After this block there is another block like while (!list_empty(removed)) { ... list_del(...) } I assumed Michael was referring to that block. :-) Wei.
> Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on the bus > > On Thu, Aug 26, 2021 at 08:09:19PM +0000, Long Li wrote: > > > Subject: Re: [PATCH] PCI: hv: Fix a bug on removing child devices on > > > the bus > > > > > > On Thu, Aug 26, 2021 at 04:50:28PM +0000, Michael Kelley wrote: > > > > From: Long Li <longli@microsoft.com> Sent: Wednesday, August 25, > > > > 2021 > > > > 1:25 PM > > > > > > > > > > > > > > > > I thought list_for_each_entry_safe() is for use when list > > > > > > manipulation is *not* protected by a lock and you want to > > > > > > safely walk the list even if an entry gets removed. If the > > > > > > list is protected by a lock or not subject to contention (as > > > > > > is the case here), then > > > > > > list_for_each_entry() is the simpler implementation. The > > > > > > original implementation didn't need to use the _safe version > > > > > > because of the spin > > > lock. > > > > > > > > > > > > Or do I have it backwards? > > > > > > > > > > > > Michael > > > > > > > > > > I think we need list_for_each_entry_safe() because we delete the > > > > > list > > > elements while going through them: > > > > > > > > > > Here is the comment on list_for_each_entry_safe(): > > > > > /** > > > > > * Loop through the list, keeping a backup pointer to the element. > > > > > This > > > > > * macro allows for the deletion of a list element while looping > > > > > through the > > > > > * list. > > > > > * > > > > > * See list_for_each_entry for more details. > > > > > */ > > > > > > > > > > > > > Got it. Thanks (and to Rob Herring). I read that comment but > > > > with the wrong assumptions and didn't understand it correctly. > > > > > > > > Interestingly, pci-hyperv.c has another case of looping through > > > > this list and removing items where the _safe version is not used. > > > > See pci_devices_present_work() where the missing children are > > > > moved to a list on the stack. > > > > > > That can be converted too, I think. > > > > > > The original code is not wrong per-se. It is just not as concise as > > > using list_for_each_entry_safe. > > > > > > Wei. > > > > I assume we are talking about the following code in > pci_devices_present_work(): > > > > list_for_each_entry(hpdev, &hbus->children, list_entry) { > > if (hpdev->reported_missing) { > > found = true; > > put_pcichild(hpdev); > > list_move_tail(&hpdev->list_entry, &removed); > > break; > > } > > } > > > > This code is correct as there is a "break" after a list entry is > > removed from the list. So there is no need to use the _safe version. > > This code can be converted to use the _safe version. > > After this block there is another block like > > while (!list_empty(removed)) { > ... > list_del(...) > > } > > I assumed Michael was referring to that block. :-) > > Wei. This block is also correct. We don't have a bug here but there is a better way to code it. Long
diff --git a/drivers/pci/controller/pci-hyperv.c b/drivers/pci/controller/pci-hyperv.c index a53bd8728d0d..d4f3cce18957 100644 --- a/drivers/pci/controller/pci-hyperv.c +++ b/drivers/pci/controller/pci-hyperv.c @@ -3220,6 +3220,7 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) struct hv_pci_dev *hpdev, *tmp; unsigned long flags; int ret; + struct list_head removed; /* * After the host sends the RESCIND_CHANNEL message, it doesn't @@ -3229,9 +3230,18 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) return 0; if (!keep_devs) { - /* Delete any children which might still exist. */ + INIT_LIST_HEAD(&removed); + + /* Move all present children to the list on stack */ spin_lock_irqsave(&hbus->device_list_lock, flags); - list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) { + list_for_each_entry_safe(hpdev, tmp, &hbus->children, list_entry) + list_move_tail(&hpdev->list_entry, &removed); + spin_unlock_irqrestore(&hbus->device_list_lock, flags); + + /* Remove all children in the list */ + while (!list_empty(&removed)) { + hpdev = list_first_entry(&removed, struct hv_pci_dev, + list_entry); list_del(&hpdev->list_entry); if (hpdev->pci_slot) pci_destroy_slot(hpdev->pci_slot); @@ -3239,7 +3249,6 @@ static int hv_pci_bus_exit(struct hv_device *hdev, bool keep_devs) put_pcichild(hpdev); put_pcichild(hpdev); } - spin_unlock_irqrestore(&hbus->device_list_lock, flags); } ret = hv_send_resources_released(hdev);