Patchwork libsas: flush pending destruct work in sas_unregister_domain_devices()

Submitter Cong Wang
Date Dec. 8, 2017, 12:40 a.m.
Message ID <CAM_iQpWL=JQFLV3uuQ1zupZgv=9oGKG4aBBsqrXnmy2ToX=PtQ@mail.gmail.com>
Permalink /patch/10101293/
State New

Comments

Cong Wang - Dec. 8, 2017, 12:40 a.m.
On Thu, Dec 7, 2017 at 2:57 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> On Thu, Dec 7, 2017 at 5:37 AM, John Garry <john.garry@huawei.com> wrote:
>> On 28/11/2017 17:04, Cong Wang wrote:
>>>
>>> I don't understand, the only caller of sas_unregister_domain_devices()
>>> is sas_deform_port().
>>>
>>
>> And sas_deform_port() may be called from another worker on the same queue,
>> right? As in sas_phye_loss_of_signal()->sas_deform_port()
>
> Oh, good catch! I didn't notice this subtle call path.
>
> Do you have any better idea to fix this? We saw this on 4.9 too.
>

I think we can just cancel the pending destruct work before calling
sas_port_delete(). This should work even when sas_deform_port() is
called from another work item.

So does the attached (untested) patch make any sense now?
Cong Wang - Dec. 8, 2017, 1:04 a.m.
On Thu, Dec 7, 2017 at 4:40 PM, Cong Wang <xiyou.wangcong@gmail.com> wrote:
> I think we can just cancel the pending destruct work before calling
> sas_port_delete(). This should work even when sas_deform_port() is
> called from another work item.
>

This assumes sas_port_delete() releases resources recursively down
the hierarchy. That is true for sysfs, but perhaps not for other
resources...
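
To make the ordering argument concrete, here is a minimal userspace sketch
of "cancel the pending destruct work, then delete the port". It is a toy
model under stated assumptions, not libsas code: every toy_* name is
hypothetical, the "workqueue" is a single pending flag, and
toy_cancel_work() only clears that flag, whereas the real
cancel_work_sync() also waits for an already-running work item to finish.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-ins only; these are NOT the real libsas or workqueue types. */
struct toy_port {
	bool deleted;          /* set by toy_port_delete() */
	int  use_after_delete; /* destruct runs seen after the port was deleted */
};

struct toy_work {
	bool pending;
	struct toy_port *port;
};

/* Models the destruct work being queued while devices are unregistered. */
static void toy_queue_destruct(struct toy_work *w, struct toy_port *p)
{
	w->port = p;
	w->pending = true;
}

/* Models the cancel step: drop the item if it has not run yet.
 * Returns true if the work was still pending. */
static bool toy_cancel_work(struct toy_work *w)
{
	bool was_pending = w->pending;

	w->pending = false;
	return was_pending;
}

/* Models sas_port_delete(): after this the port must not be touched. */
static void toy_port_delete(struct toy_port *p)
{
	p->deleted = true;
}

/* Models the workqueue getting around to the queued item later. */
static void toy_run_pending(struct toy_work *w)
{
	if (!w->pending)
		return;
	assert(w->port != NULL);
	w->pending = false;
	if (w->port->deleted)
		w->port->use_after_delete++; /* the bug being discussed */
}

/* Tears down one port and returns how many late destruct runs hit it. */
static int toy_deform_port(bool cancel_first)
{
	struct toy_port port = { false, 0 };
	struct toy_work destruct = { false, NULL };

	toy_queue_destruct(&destruct, &port);
	if (cancel_first)
		toy_cancel_work(&destruct);
	toy_port_delete(&port);
	toy_run_pending(&destruct);
	return port.use_after_delete;
}
```

In this model toy_deform_port(false) lets the stale work item run against
a deleted port, while toy_deform_port(true) does not. What the model does
not capture is whatever the cancelled destruct work would have released
itself, which is exactly the open question raised above.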

Patch

diff --git a/drivers/scsi/libsas/sas_discover.c b/drivers/scsi/libsas/sas_discover.c
index 60de66252fa2..bc512d65e2ca 100644
--- a/drivers/scsi/libsas/sas_discover.c
+++ b/drivers/scsi/libsas/sas_discover.c
@@ -565,6 +565,21 @@  int sas_discover_event(struct asd_sas_port *port, enum discover_event ev)
 	return 0;
 }
 
+static void sas_cancel_work(struct sas_work *sw)
+{
+	cancel_work_sync(&sw->work);
+}
+
+void sas_cancel_event(struct asd_sas_port *port, enum discover_event ev)
+{
+	struct sas_discovery *disc;
+
+	if (!port)
+		return;
+	disc = &port->disc;
+	sas_cancel_work(&disc->disc_work[ev].work);
+}
+
 /**
  * sas_init_disc -- initialize the discovery struct in the port
  * @port: pointer to struct port
diff --git a/drivers/scsi/libsas/sas_port.c b/drivers/scsi/libsas/sas_port.c
index d3c5297c6c89..89e37640e26c 100644
--- a/drivers/scsi/libsas/sas_port.c
+++ b/drivers/scsi/libsas/sas_port.c
@@ -219,6 +219,7 @@  void sas_deform_port(struct asd_sas_phy *phy, int gone)
 
 	if (port->num_phys == 1) {
 		sas_unregister_domain_devices(port, gone);
+		sas_cancel_event(port, DISCE_DESTRUCT);
 		sas_port_delete(port->port);
 		port->port = NULL;
 	} else {
diff --git a/include/scsi/libsas.h b/include/scsi/libsas.h
index 6df6fe0c2198..5b8a7fadd9b4 100644
--- a/include/scsi/libsas.h
+++ b/include/scsi/libsas.h
@@ -680,6 +680,7 @@  int  sas_ex_revalidate_domain(struct domain_device *);
 void sas_unregister_domain_devices(struct asd_sas_port *port, int gone);
 void sas_init_disc(struct sas_discovery *disc, struct asd_sas_port *);
 int  sas_discover_event(struct asd_sas_port *, enum discover_event ev);
+void sas_cancel_event(struct asd_sas_port *port, enum discover_event ev);
 
 int  sas_discover_sata(struct domain_device *);
 int  sas_discover_end_dev(struct domain_device *);