Message ID | 20190320113941.62117-2-hare@suse.de (mailing list archive) |
---|---|
State | Changes Requested |
Headers | show |
Series | ch: fixup refcounting imbalance for SCSI devices | expand |
On Wed, 2019-03-20 at 12:39 +0100, Hannes Reinecke wrote: > The 'ch_mutex" is meant to guard against modifications of > file->private_data, so we need to take in in ch_release() as > well as in ch_open(). > > Signed-off-by: Hannes Reinecke <hare@suse.de> > --- > drivers/scsi/ch.c | 2 ﭗ橸ṷ梧뇪觬() > > diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c > index 1c5051b1c125..8f426903d7e4 100644 > --- a/drivers/scsi/ch.c > +++ b/drivers/scsi/ch.c > @@ -577,9 +577,11 @@ ch_release(struct inode *inode, struct file *file) > { > scsi_changer *ch = file->private_data; > > + mutex_lock(&ch_mutex); > scsi_device_put(ch->device); > ch->device = NULL; > file->private_data = NULL; > + mutex_unlock(&ch_mutex); > kref_put(&ch->ref, ch_destroy); > return 0; > } Hi Hannes, My interpretation of the open() syscall code (do_sys_open()) is that a new struct file is allocated every time the open() is called. Does that mean that the lock and unlock calls for ch_mutex can be removed from ch_open()? Regarding the above patch: why do you think that file->private_data changes need to be serialized? I don't know any other file_operations::open / close callback implementations that implement similar serialization. Thanks, Bart.
On 2019-03-21 5:36 p.m., Bart Van Assche wrote: > On Wed, 2019-03-20 at 12:39 +0100, Hannes Reinecke wrote: >> The 'ch_mutex" is meant to guard against modifications of >> file->private_data, so we need to take in in ch_release() as >> well as in ch_open(). >> >> Signed-off-by: Hannes Reinecke <hare@suse.de> >> --- >> drivers/scsi/ch.c | 2 ﭗ橸ṷ梧뇪觬() >> >> diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c >> index 1c5051b1c125..8f426903d7e4 100644 >> --- a/drivers/scsi/ch.c >> +++ b/drivers/scsi/ch.c >> @@ -577,9 +577,11 @@ ch_release(struct inode *inode, struct file *file) >> { >> scsi_changer *ch = file->private_data; >> >> + mutex_lock(&ch_mutex); >> scsi_device_put(ch->device); >> ch->device = NULL; >> file->private_data = NULL; >> + mutex_unlock(&ch_mutex); >> kref_put(&ch->ref, ch_destroy); >> return 0; >> } > > Hi Hannes, > > My interpretation of the open() syscall code (do_sys_open()) is that a new > struct file is allocated every time the open() is called. Does that mean that > the lock and unlock calls for ch_mutex can be removed from ch_open()? > > Regarding the above patch: why do you think that file->private_data changes > need to be serialized? I don't know any other file_operations::open / close > callback implementations that implement similar serialization. Bart, That doesn't sound right. If it was correct then sg_open() and sg_release() have mutex overkill (and I don't think that is caused by the complexity of adding O_EXCL which is damn hard to implement correctly). Example with existing ch driver code, two threads T1 and T2: T1 T2 ======================================== f1 = open("/dev/ch1") .... close(f1) f2 = open("dev/sg1") So if f2=open() gets ch (a pointer) but _before_ it does kref_get(), close(f1) gets in and does kref_put(&ch->ref, ch_destroy), ref goes to 0 and ch_destroy() gets called and now ch is dangling .... Right solution, perhaps Bart could fix the explanation on the patch :-) Doug Gilbert P.S. Bart, if your last statement is correct, then they are probably all broken!
On 2019-03-21 7:11 p.m., Douglas Gilbert wrote: > On 2019-03-21 5:36 p.m., Bart Van Assche wrote: >> On Wed, 2019-03-20 at 12:39 +0100, Hannes Reinecke wrote: >>> The 'ch_mutex" is meant to guard against modifications of >>> file->private_data, so we need to take in in ch_release() as >>> well as in ch_open(). >>> >>> Signed-off-by: Hannes Reinecke <hare@suse.de> >>> --- >>> drivers/scsi/ch.c | 2 ﭗ橸ṷ梧뇪觬() >>> >>> diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c >>> index 1c5051b1c125..8f426903d7e4 100644 >>> --- a/drivers/scsi/ch.c >>> +++ b/drivers/scsi/ch.c >>> @@ -577,9 +577,11 @@ ch_release(struct inode *inode, struct file *file) >>> { >>> scsi_changer *ch = file->private_data; >>> + mutex_lock(&ch_mutex); >>> scsi_device_put(ch->device); >>> ch->device = NULL; >>> file->private_data = NULL; >>> + mutex_unlock(&ch_mutex); >>> kref_put(&ch->ref, ch_destroy); >>> return 0; >>> } >> >> Hi Hannes, >> >> My interpretation of the open() syscall code (do_sys_open()) is that a new >> struct file is allocated every time the open() is called. Does that mean that >> the lock and unlock calls for ch_mutex can be removed from ch_open()? >> >> Regarding the above patch: why do you think that file->private_data changes >> need to be serialized? I don't know any other file_operations::open / close >> callback implementations that implement similar serialization. > Bart, > That doesn't sound right. If it was correct then sg_open() and sg_release() > have mutex overkill (and I don't think that is caused by the complexity of > adding O_EXCL which is damn hard to implement correctly). > > Example with existing ch driver code, two threads T1 and T2: > > T1 T2 > ======================================== > f1 = open("/dev/ch1") > .... > close(f1) f2 = open("dev/sg1") That should be close(f1) colliding with f2=open("dev/ch1") > > So if f2=open() gets ch (a pointer) but _before_ it does kref_get(), > close(f1) gets in and does kref_put(&ch->ref, ch_destroy), ref goes > to 0 and ch_destroy() gets called and now ch is dangling .... > > Right solution, perhaps Bart could fix the explanation on the patch :-) > > Doug Gilbert > > > P.S. Bart, if your last statement is correct, then they are probably all > broken! > > >
On Thu, 2019-03-21 at 19:11 -0400, Douglas Gilbert wrote: > That doesn't sound right. If it was correct then sg_open() and sg_release() > have mutex overkill (and I don't think that is caused by the complexity of > adding O_EXCL which is damn hard to implement correctly). > > Example with existing ch driver code, two threads T1 and T2: > > T1 T2 > ======================================== > f1 = open("/dev/ch1") > .... > close(f1) f2 = open("dev/sg1") > > > So if f2=open() gets ch (a pointer) but _before_ it does kref_get(), > close(f1) gets in and does kref_put(&ch->ref, ch_destroy), ref goes > to 0 and ch_destroy() gets called and now ch is dangling .... Hi Doug, I don't think that what you described can happen. The kref_put() call in ch_release() can only drop the final reference after ch_remove() has been called. Before ch_remove() calls kref_put() it removes the index from the idr so ch_open() won't find that index in the idr anymore. In other words, ch_open() can never encounter a zero refcount for an index that it found in the idr. Bart.
On 2019-03-21 7:33 p.m., Bart Van Assche wrote: > On Thu, 2019-03-21 at 19:11 -0400, Douglas Gilbert wrote: >> That doesn't sound right. If it was correct then sg_open() and sg_release() >> have mutex overkill (and I don't think that is caused by the complexity of >> adding O_EXCL which is damn hard to implement correctly). >> >> Example with existing ch driver code, two threads T1 and T2: >> >> T1 T2 >> ======================================== >> f1 = open("/dev/ch1") >> .... >> close(f1) f2 = open("dev/sg1") >> >> >> So if f2=open() gets ch (a pointer) but _before_ it does kref_get(), >> close(f1) gets in and does kref_put(&ch->ref, ch_destroy), ref goes >> to 0 and ch_destroy() gets called and now ch is dangling .... > > Hi Doug, > > I don't think that what you described can happen. The kref_put() call in > ch_release() can only drop the final reference after ch_remove() has been > called. Before ch_remove() calls kref_put() it removes the index from the > idr so ch_open() won't find that index in the idr anymore. In other words, > ch_open() can never encounter a zero refcount for an index that it found > in the idr. Well its another "no file scope" char driver. Open() twice on the same sch device, close() once and static int ch_release(struct inode *inode, struct file *file) { scsi_changer *ch = file->private_data; scsi_device_put(ch->device); ch->device = NULL; // <<===== WTF? file->private_data = NULL; kref_put(&ch->ref, ch_destroy); return 0; } So the next SCSI command sent on the remaining file descriptor should oops due to the line shown above. ch->device is set in the ch_probe() function! And the next open() on that device will oops due to scsi_device_get(NULL) ... Doug Gilbert
On 3/22/19 12:33 AM, Bart Van Assche wrote: > On Thu, 2019-03-21 at 19:11 -0400, Douglas Gilbert wrote: >> That doesn't sound right. If it was correct then sg_open() and sg_release() >> have mutex overkill (and I don't think that is caused by the complexity of >> adding O_EXCL which is damn hard to implement correctly). >> >> Example with existing ch driver code, two threads T1 and T2: >> >> T1 T2 >> ======================================== >> f1 = open("/dev/ch1") >> .... >> close(f1) f2 = open("dev/sg1") >> >> >> So if f2=open() gets ch (a pointer) but _before_ it does kref_get(), >> close(f1) gets in and does kref_put(&ch->ref, ch_destroy), ref goes >> to 0 and ch_destroy() gets called and now ch is dangling .... > > Hi Doug, > > I don't think that what you described can happen. The kref_put() call in > ch_release() can only drop the final reference after ch_remove() has been > called. Before ch_remove() calls kref_put() it removes the index from the > idr so ch_open() won't find that index in the idr anymore. In other words, > ch_open() can never encounter a zero refcount for an index that it found > in the idr. > The original issue leading to this patchset was this crash: [159135.508116] Pid: 2638, comm: ssea Tainted: G W X 3.0.101-0.40-default #1 HP ProLiant BL460c Gen9 [159135.508119] RIP: 0010:[<ffffffffa00bb5d1>] [<ffffffffa00bb5d1>] scsi_device_get+0x11/0xb0 [scsi_mod] [159135.508126] RSP: 0018:ffff88100fdf5c88 EFLAGS: 00010296 [159135.508128] RAX: ffff88101b31d5c0 RBX: 0000000000000000 RCX: ffff88101b31d5c0 [159135.508130] RDX: 0000000000000000 RSI: 0000000000000002 RDI: 0000000000000000 [159135.508132] RBP: ffff88101c1c4780 R08: 0000000000000000 R09: ffff88201f387af0 [159135.508134] R10: ffff88100fdf5e68 R11: ffffffff811eee70 R12: ffffffffa06ea120 [159135.508136] R13: ffff88201ef903c0 R14: ffff881007bdda00 R15: ffff88101c1c4780 [159135.508139] FS: 00007faae06d2700(0000) GS:ffff88107fc00000(0000) knlGS:0000000000000000 [159135.508141] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 [159135.508143] CR2: 0000000000000650 CR3: 0000001018d0f000 CR4: 00000000001407f0 [159135.508145] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000 [159135.508148] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400 [159135.508150] Process ssea (pid: 2638, threadinfo ffff88100fdf4000, task ffff881012080140) [159135.508152] Stack: [159135.508153] ffff88201ef903c0 ffff88101b31d5c0 ffff88101c1c4780 ffffffffa06e767c [159135.508160] 0000000000000000 0000000000000000 0000000000000000 ffffffff8116119e [159135.508163] 0000000000000000 0000000000000000 0000000000000000 0000000000000000 [159135.508167] Call Trace: [159135.508177] [<ffffffffa06e767c>] ch_open+0x4c/0xa0 [ch] [159135.508189] [<ffffffff8116119e>] chrdev_open+0x13e/0x200 [159135.508196] [<ffffffff8115ade8>] __dentry_open+0x198/0x310 [159135.508201] [<ffffffff8116a432>] do_last+0x1f2/0x800 [159135.508206] [<ffffffff8116b6a9>] path_openat+0xd9/0x420 [159135.508210] [<ffffffff8116bb2c>] do_filp_open+0x4c/0xc0 [159135.508214] [<ffffffff8115c7cf>] do_sys_open+0x17f/0x250 [159135.508219] [<ffffffff8146c292>] system_call_fastpath+0x16/0x1b [159135.508225] [<00007faadfa2a040>] 0x7faadfa2a03f [159135.508227] Code: 56 27 e1 0f 1f 80 00 00 00 00 48 89 df e8 98 47 fe e0 eb d5 66 0f 1f 44 00 00 48 83 ec 18 48 89 5c 24 08 48 89 6c 24 10 48 89 fb 83>[159135.508241] bf 50 06 00 00 04 75 16 b8 fa ff ff ff 48 8b 5c 24 08 48 8b [159135.508248] RIP [<ffffffffa00bb5d1>] scsi_device_get+0x11/0xb0 [scsi_mod] [159135.508254] RSP <ffff88100fdf5c88> [159135.508256] CR2: 0000000000000650 And we had been crashing because 'ch->device' was NULL in ch_open(). This patch is to guarantee atomicity on 'scsi_device_put()' and 'ch->device = NULL'; otherwise we'd be having a race window between those calls, allowing another thread to find a 'ch' device with an invalid but non-NULL ch->device pointer. Cheers, Hannes
On Mon, 2019-03-25 at 10:26 +0100, Hannes Reinecke wrote: > The original issue leading to this patchset was this crash: > > > [159135.508116] Pid: 2638, comm: ssea Tainted: G W X > 3.0.101-0.40-default #1 HP ProLiant BL460c Gen9 > [159135.508119] RIP: 0010:[<ffffffffa00bb5d1>] [<ffffffffa00bb5d1>] > scsi_device_get+0x11/0xb0 [scsi_mod] > [159135.508126] RSP: 0018:ffff88100fdf5c88 EFLAGS: 00010296 > [159135.508128] RAX: ffff88101b31d5c0 RBX: 0000000000000000 RCX: > ffff88101b31d5c0 > [159135.508130] RDX: 0000000000000000 RSI: 0000000000000002 RDI: > 0000000000000000 > [159135.508132] RBP: ffff88101c1c4780 R08: 0000000000000000 R09: > ffff88201f387af0 > [159135.508134] R10: ffff88100fdf5e68 R11: ffffffff811eee70 R12: > ffffffffa06ea120 > [159135.508136] R13: ffff88201ef903c0 R14: ffff881007bdda00 R15: > ffff88101c1c4780 > [159135.508139] FS: 00007faae06d2700(0000) GS:ffff88107fc00000(0000) > knlGS:0000000000000000 > [159135.508141] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [159135.508143] CR2: 0000000000000650 CR3: 0000001018d0f000 CR4: > 00000000001407f0 > [159135.508145] DR0: 0000000000000000 DR1: 0000000000000000 DR2: > 0000000000000000 > [159135.508148] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: > 0000000000000400 > [159135.508150] Process ssea (pid: 2638, threadinfo ffff88100fdf4000, > task ffff881012080140) > [159135.508152] Stack: > [159135.508153] ffff88201ef903c0 ffff88101b31d5c0 ffff88101c1c4780 > ffffffffa06e767c > [159135.508160] 0000000000000000 0000000000000000 0000000000000000 > ffffffff8116119e > [159135.508163] 0000000000000000 0000000000000000 0000000000000000 > 0000000000000000 > [159135.508167] Call Trace: > [159135.508177] [<ffffffffa06e767c>] ch_open+0x4c/0xa0 [ch] > [159135.508189] [<ffffffff8116119e>] chrdev_open+0x13e/0x200 > [159135.508196] [<ffffffff8115ade8>] __dentry_open+0x198/0x310 > [159135.508201] [<ffffffff8116a432>] do_last+0x1f2/0x800 > [159135.508206] [<ffffffff8116b6a9>] path_openat+0xd9/0x420 > [159135.508210] [<ffffffff8116bb2c>] do_filp_open+0x4c/0xc0 > [159135.508214] [<ffffffff8115c7cf>] do_sys_open+0x17f/0x250 > [159135.508219] [<ffffffff8146c292>] system_call_fastpath+0x16/0x1b > [159135.508225] [<00007faadfa2a040>] 0x7faadfa2a03f > [159135.508227] Code: 56 27 e1 0f 1f 80 00 00 00 00 48 89 df e8 98 47 fe > e0 eb d5 66 0f 1f 44 00 00 48 83 ec 18 48 89 5c 24 08 48 89 6c 24 10 48 > 89 fb > 83>[159135.508241] bf 50 06 00 00 04 75 16 b8 fa ff ff ff 48 8b 5c 24 > 08 48 8b > [159135.508248] RIP [<ffffffffa00bb5d1>] scsi_device_get+0x11/0xb0 > [scsi_mod] > [159135.508254] RSP <ffff88100fdf5c88> > [159135.508256] CR2: 0000000000000650 > > And we had been crashing because 'ch->device' was NULL in ch_open(). > This patch is to guarantee atomicity on 'scsi_device_put()' and > 'ch->device = NULL'; otherwise we'd be having a race window between > those calls, allowing another thread to find a 'ch' device with an > invalid but non-NULL ch->device pointer. Hi Hannes, Thank you for having shared this call trace. Do you agree that moving the ch->device = NULL assignment from ch_release() into ch_destroy() is sufficient to fix this crash? Bart.
On 3/25/19 4:32 PM, Bart Van Assche wrote: > On Mon, 2019-03-25 at 10:26 +0100, Hannes Reinecke wrote: >> The original issue leading to this patchset was this crash: >> >> >> [159135.508116] Pid: 2638, comm: ssea Tainted: G W X >> 3.0.101-0.40-default #1 HP ProLiant BL460c Gen9 >> [159135.508119] RIP: 0010:[<ffffffffa00bb5d1>] [<ffffffffa00bb5d1>] >> scsi_device_get+0x11/0xb0 [scsi_mod] >> [159135.508126] RSP: 0018:ffff88100fdf5c88 EFLAGS: 00010296 >> [159135.508128] RAX: ffff88101b31d5c0 RBX: 0000000000000000 RCX: >> ffff88101b31d5c0 >> [159135.508130] RDX: 0000000000000000 RSI: 0000000000000002 RDI: >> 0000000000000000 >> [159135.508132] RBP: ffff88101c1c4780 R08: 0000000000000000 R09: >> ffff88201f387af0 >> [159135.508134] R10: ffff88100fdf5e68 R11: ffffffff811eee70 R12: >> ffffffffa06ea120 >> [159135.508136] R13: ffff88201ef903c0 R14: ffff881007bdda00 R15: >> ffff88101c1c4780 >> [159135.508139] FS: 00007faae06d2700(0000) GS:ffff88107fc00000(0000) >> knlGS:0000000000000000 >> [159135.508141] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 >> [159135.508143] CR2: 0000000000000650 CR3: 0000001018d0f000 CR4: >> 00000000001407f0 >> [159135.508145] DR0: 0000000000000000 DR1: 0000000000000000 DR2: >> 0000000000000000 >> [159135.508148] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: >> 0000000000000400 >> [159135.508150] Process ssea (pid: 2638, threadinfo ffff88100fdf4000, >> task ffff881012080140) >> [159135.508152] Stack: >> [159135.508153] ffff88201ef903c0 ffff88101b31d5c0 ffff88101c1c4780 >> ffffffffa06e767c >> [159135.508160] 0000000000000000 0000000000000000 0000000000000000 >> ffffffff8116119e >> [159135.508163] 0000000000000000 0000000000000000 0000000000000000 >> 0000000000000000 >> [159135.508167] Call Trace: >> [159135.508177] [<ffffffffa06e767c>] ch_open+0x4c/0xa0 [ch] >> [159135.508189] [<ffffffff8116119e>] chrdev_open+0x13e/0x200 >> [159135.508196] [<ffffffff8115ade8>] __dentry_open+0x198/0x310 >> [159135.508201] [<ffffffff8116a432>] do_last+0x1f2/0x800 >> [159135.508206] [<ffffffff8116b6a9>] path_openat+0xd9/0x420 >> [159135.508210] [<ffffffff8116bb2c>] do_filp_open+0x4c/0xc0 >> [159135.508214] [<ffffffff8115c7cf>] do_sys_open+0x17f/0x250 >> [159135.508219] [<ffffffff8146c292>] system_call_fastpath+0x16/0x1b >> [159135.508225] [<00007faadfa2a040>] 0x7faadfa2a03f >> [159135.508227] Code: 56 27 e1 0f 1f 80 00 00 00 00 48 89 df e8 98 47 fe >> e0 eb d5 66 0f 1f 44 00 00 48 83 ec 18 48 89 5c 24 08 48 89 6c 24 10 48 >> 89 fb >> 83>[159135.508241] bf 50 06 00 00 04 75 16 b8 fa ff ff ff 48 8b 5c 24 >> 08 48 8b >> [159135.508248] RIP [<ffffffffa00bb5d1>] scsi_device_get+0x11/0xb0 >> [scsi_mod] >> [159135.508254] RSP <ffff88100fdf5c88> >> [159135.508256] CR2: 0000000000000650 >> >> And we had been crashing because 'ch->device' was NULL in ch_open(). >> This patch is to guarantee atomicity on 'scsi_device_put()' and >> 'ch->device = NULL'; otherwise we'd be having a race window between >> those calls, allowing another thread to find a 'ch' device with an >> invalid but non-NULL ch->device pointer. > > Hi Hannes, > > Thank you for having shared this call trace. Do you agree that moving > the ch->device = NULL assignment from ch_release() into ch_destroy() is > sufficient to fix this crash? > That's what I'm doing the the second patch. But yeah, we can possible omit this one after the second is applied; protecting a simple assignment via a mutex is pretty pointless. I'll be resending the second one. Cheers, Hannes
diff --git a/drivers/scsi/ch.c b/drivers/scsi/ch.c index 1c5051b1c125..8f426903d7e4 100644 --- a/drivers/scsi/ch.c +++ b/drivers/scsi/ch.c @@ -577,9 +577,11 @@ ch_release(struct inode *inode, struct file *file) { scsi_changer *ch = file->private_data; + mutex_lock(&ch_mutex); scsi_device_put(ch->device); ch->device = NULL; file->private_data = NULL; + mutex_unlock(&ch_mutex); kref_put(&ch->ref, ch_destroy); return 0; }
The 'ch_mutex" is meant to guard against modifications of file->private_data, so we need to take in in ch_release() as well as in ch_open(). Signed-off-by: Hannes Reinecke <hare@suse.de> --- drivers/scsi/ch.c | 2 ++ 1 file changed, 2 insertions(+)