Message ID | 4912ec551a8ec01181cc3e7ad1e01d3d36758810.1463170976.git.calvinowens@fb.com (mailing list archive) |
---|---|
State | Accepted, archived |
On 05/13/2016 01:28 PM, Calvin Owens wrote:
> Currently we free the resources backing the enclosure device before we
> call device_unregister(). This is racy: during rmmod of low-level SCSI
> drivers that hook into enclosure, we end up with a small window of time
> during which writing to /sys can OOPS. Example trace with mpt3sas:

Ping?

> general protection fault: 0000 [#1] SMP KASAN
> Modules linked in: mpt3sas(-) <...>
> RIP: [<ffffffffa0388a98>] ses_get_page2_descriptor.isra.6+0x38/0x220 [ses]
> Call Trace:
> [<ffffffffa0389d14>] ses_set_fault+0xf4/0x400 [ses]
> [<ffffffffa0361069>] set_component_fault+0xa9/0xf0 [enclosure]
> [<ffffffff8205bffc>] dev_attr_store+0x3c/0x70
> [<ffffffff81677df5>] sysfs_kf_write+0x115/0x180
> [<ffffffff81675725>] kernfs_fop_write+0x275/0x3a0
> [<ffffffff8151f810>] __vfs_write+0xe0/0x3e0
> [<ffffffff8152281f>] vfs_write+0x13f/0x4a0
> [<ffffffff81526731>] SyS_write+0x111/0x230
> [<ffffffff828b401b>] entry_SYSCALL_64_fastpath+0x13/0x94
>
> Fortunately the solution is extremely simple: call device_unregister()
> before we free the resources, and the race no longer exists. The driver
> core holds a reference over ->remove_dev(), so AFAICT this is safe.
>
> Signed-off-by: Calvin Owens <calvinowens@fb.com>
> ---
>  drivers/scsi/ses.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
> index 53ef1cb..0e8601a 100644
> --- a/drivers/scsi/ses.c
> +++ b/drivers/scsi/ses.c
> @@ -778,6 +778,8 @@ static void ses_intf_remove_enclosure(struct scsi_device *sdev)
>  	if (!edev)
>  		return;
>
> +	enclosure_unregister(edev);
> +
>  	ses_dev = edev->scratch;
>  	edev->scratch = NULL;
>
> @@ -789,7 +791,6 @@ static void ses_intf_remove_enclosure(struct scsi_device *sdev)
>  	kfree(edev->component[0].scratch);
>
>  	put_device(&edev->edev);
> -	enclosure_unregister(edev);
>  }
>
>  static void ses_intf_remove(struct device *cdev,
>

--
To unsubscribe from this list: send the line "unsubscribe linux-scsi" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thursday 06/02 at 15:50 -0700, Calvin Owens wrote:
> On 05/13/2016 01:28 PM, Calvin Owens wrote:
> > Currently we free the resources backing the enclosure device before we
> > call device_unregister(). This is racy: during rmmod of low-level SCSI
> > drivers that hook into enclosure, we end up with a small window of time
> > during which writing to /sys can OOPS. Example trace with mpt3sas:
>
> Ping?

Any thoughts? Squinting at this more it still seems racy, but a narrow race
is surely better than just blatantly freeing everything while the file is
still exposed in /sys? Is there a better way you'd prefer I accomplish this?

(I have boxes that OOPS all the time from monitoring code reading the /sys
files, with this patch I haven't seen a single one.)

Thanks,
Calvin

> > general protection fault: 0000 [#1] SMP KASAN
> > Modules linked in: mpt3sas(-) <...>
> > RIP: [<ffffffffa0388a98>] ses_get_page2_descriptor.isra.6+0x38/0x220 [ses]
> > Call Trace:
> > [<ffffffffa0389d14>] ses_set_fault+0xf4/0x400 [ses]
> > [<ffffffffa0361069>] set_component_fault+0xa9/0xf0 [enclosure]
> > [<ffffffff8205bffc>] dev_attr_store+0x3c/0x70
> > [<ffffffff81677df5>] sysfs_kf_write+0x115/0x180
> > [<ffffffff81675725>] kernfs_fop_write+0x275/0x3a0
> > [<ffffffff8151f810>] __vfs_write+0xe0/0x3e0
> > [<ffffffff8152281f>] vfs_write+0x13f/0x4a0
> > [<ffffffff81526731>] SyS_write+0x111/0x230
> > [<ffffffff828b401b>] entry_SYSCALL_64_fastpath+0x13/0x94
> >
> > Fortunately the solution is extremely simple: call device_unregister()
> > before we free the resources, and the race no longer exists. The driver
> > core holds a reference over ->remove_dev(), so AFAICT this is safe.
> >
> > Signed-off-by: Calvin Owens <calvinowens@fb.com>
> > ---
> >  drivers/scsi/ses.c | 3 ++-
> >  1 file changed, 2 insertions(+), 1 deletion(-)
> >
> > diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
> > index 53ef1cb..0e8601a 100644
> > --- a/drivers/scsi/ses.c
> > +++ b/drivers/scsi/ses.c
> > @@ -778,6 +778,8 @@ static void ses_intf_remove_enclosure(struct scsi_device *sdev)
> >  	if (!edev)
> >  		return;
> >
> > +	enclosure_unregister(edev);
> > +
> >  	ses_dev = edev->scratch;
> >  	edev->scratch = NULL;
> >
> > @@ -789,7 +791,6 @@ static void ses_intf_remove_enclosure(struct scsi_device *sdev)
> >  	kfree(edev->component[0].scratch);
> >
> >  	put_device(&edev->edev);
> > -	enclosure_unregister(edev);
> >  }
> >
> >  static void ses_intf_remove(struct device *cdev,
> >
On 06/15/2016 01:24 PM, Calvin Owens wrote:
> On Thursday 06/02 at 15:50 -0700, Calvin Owens wrote:
>> On 05/13/2016 01:28 PM, Calvin Owens wrote:
>>> Currently we free the resources backing the enclosure device before we
>>> call device_unregister(). This is racy: during rmmod of low-level SCSI
>>> drivers that hook into enclosure, we end up with a small window of time
>>> during which writing to /sys can OOPS. Example trace with mpt3sas:
>>
>> Ping?
>
> Any thoughts? Squinting at this more it still seems racy, but a narrow race
> is surely better than just blatantly freeing everything while the file is
> still exposed in /sys? Is there a better way you'd prefer I accomplish this?
>
> (I have boxes that OOPS all the time from monitoring code reading the /sys
> files, with this patch I haven't seen a single one.)
>
> Thanks,
> Calvin

Ping? Thoughts, comments?

>>> general protection fault: 0000 [#1] SMP KASAN
>>> Modules linked in: mpt3sas(-) <...>
>>> RIP: [<ffffffffa0388a98>] ses_get_page2_descriptor.isra.6+0x38/0x220 [ses]
>>> Call Trace:
>>> [<ffffffffa0389d14>] ses_set_fault+0xf4/0x400 [ses]
>>> [<ffffffffa0361069>] set_component_fault+0xa9/0xf0 [enclosure]
>>> [<ffffffff8205bffc>] dev_attr_store+0x3c/0x70
>>> [<ffffffff81677df5>] sysfs_kf_write+0x115/0x180
>>> [<ffffffff81675725>] kernfs_fop_write+0x275/0x3a0
>>> [<ffffffff8151f810>] __vfs_write+0xe0/0x3e0
>>> [<ffffffff8152281f>] vfs_write+0x13f/0x4a0
>>> [<ffffffff81526731>] SyS_write+0x111/0x230
>>> [<ffffffff828b401b>] entry_SYSCALL_64_fastpath+0x13/0x94
>>>
>>> Fortunately the solution is extremely simple: call device_unregister()
>>> before we free the resources, and the race no longer exists. The driver
>>> core holds a reference over ->remove_dev(), so AFAICT this is safe.
>>>
>>> Signed-off-by: Calvin Owens <calvinowens@fb.com>
>>> ---
>>>  drivers/scsi/ses.c | 3 ++-
>>>  1 file changed, 2 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
>>> index 53ef1cb..0e8601a 100644
>>> --- a/drivers/scsi/ses.c
>>> +++ b/drivers/scsi/ses.c
>>> @@ -778,6 +778,8 @@ static void ses_intf_remove_enclosure(struct scsi_device *sdev)
>>>  	if (!edev)
>>>  		return;
>>>
>>> +	enclosure_unregister(edev);
>>> +
>>>  	ses_dev = edev->scratch;
>>>  	edev->scratch = NULL;
>>>
>>> @@ -789,7 +791,6 @@ static void ses_intf_remove_enclosure(struct scsi_device *sdev)
>>>  	kfree(edev->component[0].scratch);
>>>
>>>  	put_device(&edev->edev);
>>> -	enclosure_unregister(edev);
>>>  }
>>>
>>>  static void ses_intf_remove(struct device *cdev,
>>>
>>>>> "Calvin" == Calvin Owens <calvinowens@fb.com> writes:

>> Any thoughts? Squinting at this more it still seems racy, but a
>> narrow race is surely better than just blatantly freeing everything
>> while the file is still exposed in /sys? Is there a better way you'd
>> prefer I accomplish this?
>>
>> (I have boxes that OOPS all the time from monitoring code reading the
>> /sys files, with this patch I haven't seen a single one.)

Calvin> Ping? Thoughts, comments?

James: This is your puppy...
On Thu, 2016-07-28 at 21:23 -0400, Martin K. Petersen wrote:
> > > > > > "Calvin" == Calvin Owens <calvinowens@fb.com> writes:
>
> > > Any thoughts? Squinting at this more it still seems racy, but a
> > > narrow race is surely better than just blatantly freeing
> > > everything
> > > while the file is still exposed in /sys? Is there a better way
> > > you'd
> > > prefer I accomplish this?
> > >
> > > (I have boxes that OOPS all the time from monitoring code reading
> > > the
> > > /sys files, with this patch I haven't seen a single one.)
>
> Calvin> Ping? Thoughts, comments?
>
> James: This is your puppy...

I thought it would be bigger by now going by the early paw size
indicator ...

Anyway

Reviewed-by: James Bottomley <jejb@linux.vnet.ibm.com>

James
diff --git a/drivers/scsi/ses.c b/drivers/scsi/ses.c
index 53ef1cb..0e8601a 100644
--- a/drivers/scsi/ses.c
+++ b/drivers/scsi/ses.c
@@ -778,6 +778,8 @@ static void ses_intf_remove_enclosure(struct scsi_device *sdev)
 	if (!edev)
 		return;
 
+	enclosure_unregister(edev);
+
 	ses_dev = edev->scratch;
 	edev->scratch = NULL;
 
@@ -789,7 +791,6 @@ static void ses_intf_remove_enclosure(struct scsi_device *sdev)
 	kfree(edev->component[0].scratch);
 
 	put_device(&edev->edev);
-	enclosure_unregister(edev);
 }
 
 static void ses_intf_remove(struct device *cdev,
Currently we free the resources backing the enclosure device before we
call device_unregister(). This is racy: during rmmod of low-level SCSI
drivers that hook into enclosure, we end up with a small window of time
during which writing to /sys can OOPS. Example trace with mpt3sas:

  general protection fault: 0000 [#1] SMP KASAN
  Modules linked in: mpt3sas(-) <...>
  RIP: [<ffffffffa0388a98>] ses_get_page2_descriptor.isra.6+0x38/0x220 [ses]
  Call Trace:
    [<ffffffffa0389d14>] ses_set_fault+0xf4/0x400 [ses]
    [<ffffffffa0361069>] set_component_fault+0xa9/0xf0 [enclosure]
    [<ffffffff8205bffc>] dev_attr_store+0x3c/0x70
    [<ffffffff81677df5>] sysfs_kf_write+0x115/0x180
    [<ffffffff81675725>] kernfs_fop_write+0x275/0x3a0
    [<ffffffff8151f810>] __vfs_write+0xe0/0x3e0
    [<ffffffff8152281f>] vfs_write+0x13f/0x4a0
    [<ffffffff81526731>] SyS_write+0x111/0x230
    [<ffffffff828b401b>] entry_SYSCALL_64_fastpath+0x13/0x94

Fortunately the solution is extremely simple: call device_unregister()
before we free the resources, and the race no longer exists. The driver
core holds a reference over ->remove_dev(), so AFAICT this is safe.

Signed-off-by: Calvin Owens <calvinowens@fb.com>
---
 drivers/scsi/ses.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
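The ordering argument in the commit message can be sketched as a small userspace model. Everything in it is hypothetical and none of it is kernel API: `toy_enclosure`, `toy_store_fault()`, `toy_remove_old()`, `toy_remove_new()` and the `TOY_*` result codes are names made up for illustration. A `registered` flag stands in for the attribute being reachable through /sys, and the concurrent write is simulated by calling the handler between the two teardown steps. The model does not cover a writer that is already inside the handler when teardown starts, which is the narrower race raised in the follow-up discussion.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical userspace model; none of these names are kernel APIs. */
enum toy_result { TOY_OK = 0, TOY_GONE = -1, TOY_WOULD_OOPS = -2 };

struct toy_enclosure {
	bool registered;  /* stands in for "attribute still visible in /sys" */
	int *scratch;     /* stands in for the driver data behind the device */
};

static struct toy_enclosure toy_make(void)
{
	struct toy_enclosure e = { .registered = true,
				   .scratch = calloc(1, sizeof(int)) };

	assert(e.scratch != NULL);
	return e;
}

/* Models a store handler reached through a sysfs write. */
static enum toy_result toy_store_fault(struct toy_enclosure *e, int value)
{
	if (!e->registered)
		return TOY_GONE;        /* attribute removed: write is rejected */
	if (e->scratch == NULL)
		return TOY_WOULD_OOPS;  /* real code would touch freed memory */
	*e->scratch = value;
	return TOY_OK;
}

/* Old order: free first, unregister last. Returns what a write that
 * arrives between the two steps would observe. */
static enum toy_result toy_remove_old(struct toy_enclosure *e)
{
	enum toy_result seen;

	free(e->scratch);
	e->scratch = NULL;
	seen = toy_store_fault(e, 1);  /* simulated write in the window */
	e->registered = false;
	return seen;
}

/* Patched order: unregister first, then free. */
static enum toy_result toy_remove_new(struct toy_enclosure *e)
{
	enum toy_result seen;

	e->registered = false;
	seen = toy_store_fault(e, 1);  /* same simulated write */
	free(e->scratch);
	e->scratch = NULL;
	return seen;
}
```

In this model a write that lands in the window reaches the handler with its backing data already gone under the old order, and is turned away under the patched order because the attribute is no longer reachable.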