Message ID | 20190318140703.28963-2-yuyufen@huawei.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | fix use-after-free of gendisk | expand |
This patch fix use-after-free on gendisk when open the disk partition. Ping and Cc Cc: Al Viro <viro@zeniv.linux.org.uk> Cc: Bart Van Assche <bart.vanassche@wdc.com> Cc: Jan Kara <jack@suse.cz> Yufen Thanks On 2019/3/18 22:07, Yufen Yu wrote: > In part_release(), it will remove devt from ext_devt_idr and > get_gendisk cannot find it. But, if disk_release() works before > part_release, open device partition may cause use-after-free of > disk in get_gendisk(). We use md device as example, the race sence: > > Process1 Worker Process2 > md_free > blkdev_open > del_gendisk > add delete_partition_work_fn() to wq > __blkdev_get > get_gendisk > put_disk > disk_release > kfree(disk) > find part from ext_devt_idr > get_disk_and_module(disk) > cause use after free > > delete_partition_work_fn > put_device(part) > part_release > remove part from ext_devt_idr > > Before Woker thread removes part from ext_devt_idr, Process2 can find > the part and access the disk, resulting use-after-free. > > We fix this by removing the devt from ext_devt_idr when delete partition. > > Signed-off-by: Yufen Yu <yuyufen@huawei.com> > --- > block/partition-generic.c | 5 +++++ > 1 file changed, 5 insertions(+) > > diff --git a/block/partition-generic.c b/block/partition-generic.c > index 1ee3e1d1bc2a..30d1039d5e8d 100644 > --- a/block/partition-generic.c > +++ b/block/partition-generic.c > @@ -288,6 +288,11 @@ void delete_partition(struct gendisk *disk, int partno) > kobject_put(part->holder_dir); > device_del(part_to_dev(part)); > > + /* > + * We should ensuere to delete part from idr before kfree(disk), > + * avoiding use-after-free of disk. > + */ > + blk_free_devt(part_devt(part)); > hd_struct_kill(part); > } >
On Sat 23-03-19 20:51:45, yuyufen wrote: > This patch fix use-after-free on gendisk when open the disk partition. > > Ping and Cc > > Cc: Al Viro <viro@zeniv.linux.org.uk> > Cc: Bart Van Assche <bart.vanassche@wdc.com> > Cc: Jan Kara <jack@suse.cz> > > Yufen > Thanks > > On 2019/3/18 22:07, Yufen Yu wrote: > > In part_release(), it will remove devt from ext_devt_idr and > > get_gendisk cannot find it. But, if disk_release() works before > > part_release, open device partition may cause use-after-free of > > disk in get_gendisk(). We use md device as example, the race sence: > > > > Process1 Worker Process2 > > md_free > > blkdev_open > > del_gendisk > > add delete_partition_work_fn() to wq > > __blkdev_get > > get_gendisk > > put_disk > > disk_release > > kfree(disk) > > find part from ext_devt_idr > > get_disk_and_module(disk) > > cause use after free > > > > delete_partition_work_fn > > put_device(part) > > part_release > > remove part from ext_devt_idr > > > > Before Woker thread removes part from ext_devt_idr, Process2 can find > > the part and access the disk, resulting use-after-free. > > > > We fix this by removing the devt from ext_devt_idr when delete partition. > > > > Signed-off-by: Yufen Yu <yuyufen@huawei.com> Thanks for your analysis and the patch! I agree with our analysis of the problem but I'm afraid the fix won't be so straightforward. The problem is that once you remove device number from idr, it can get reallocated for another device. So if nothing else, the idr_remove() call you've left in part_release() could delete *newly created device* from idr which is not what you want. And even if this is fixed, there are other possible issues with reallocating device number before the device is fully shutdown - commit 2da78092dda "block: Fix dev_t minor allocation lifetime" specifically moved blk_free_devt(dev->devt) call to part_release() to avoid such issues (sadly without explaining more details). Adding Keith to CC just in case he remembers. Anyway, you are right that it is fundamentally wrong to expose a structure (hd_struct in our case) for lookup in idr when it is essentially scheduled for RCU-delayed freeing. So what we can do is to replace hd_struct pointer in idr with some placeholder value - NULL seems to be OK with idr - and delete the entry from idr in part_release() as we do now. We can call this new helper blk_invalidate_devt() or something like that. Honza > > --- > > block/partition-generic.c | 5 +++++ > > 1 file changed, 5 insertions(+) > > > > diff --git a/block/partition-generic.c b/block/partition-generic.c > > index 1ee3e1d1bc2a..30d1039d5e8d 100644 > > --- a/block/partition-generic.c > > +++ b/block/partition-generic.c > > @@ -288,6 +288,11 @@ void delete_partition(struct gendisk *disk, int partno) > > kobject_put(part->holder_dir); > > device_del(part_to_dev(part)); > > + /* > > + * We should ensuere to delete part from idr before kfree(disk), > > + * avoiding use-after-free of disk. > > + */ > > + blk_free_devt(part_devt(part)); > > hd_struct_kill(part); > > } > >
Hi On 2019/3/30 6:42, Jan Kara wrote: > On Sat 23-03-19 20:51:45, yuyufen wrote: >> This patch fix use-after-free on gendisk when open the disk partition. >> >> Ping and Cc >> >> Cc: Al Viro <viro@zeniv.linux.org.uk> >> Cc: Bart Van Assche <bart.vanassche@wdc.com> >> Cc: Jan Kara <jack@suse.cz> >> >> Yufen >> Thanks >> >> On 2019/3/18 22:07, Yufen Yu wrote: >>> In part_release(), it will remove devt from ext_devt_idr and >>> get_gendisk cannot find it. But, if disk_release() works before >>> part_release, open device partition may cause use-after-free of >>> disk in get_gendisk(). We use md device as example, the race sence: >>> >>> Process1 Worker Process2 >>> md_free >>> blkdev_open >>> del_gendisk >>> add delete_partition_work_fn() to wq >>> __blkdev_get >>> get_gendisk >>> put_disk >>> disk_release >>> kfree(disk) >>> find part from ext_devt_idr >>> get_disk_and_module(disk) >>> cause use after free >>> >>> delete_partition_work_fn >>> put_device(part) >>> part_release >>> remove part from ext_devt_idr >>> >>> Before Woker thread removes part from ext_devt_idr, Process2 can find >>> the part and access the disk, resulting use-after-free. >>> >>> We fix this by removing the devt from ext_devt_idr when delete partition. >>> >>> Signed-off-by: Yufen Yu <yuyufen@huawei.com> > Thanks for your analysis and the patch! I agree with our analysis of the > problem but I'm afraid the fix won't be so straightforward. The problem is > that once you remove device number from idr, it can get reallocated for > another device. So if nothing else, the idr_remove() call you've left in > part_release() could delete *newly created device* from idr which is not > what you want. And even if this is fixed, there are other possible issues > with reallocating device number before the device is fully shutdown - > commit 2da78092dda "block: Fix dev_t minor allocation lifetime" > specifically moved blk_free_devt(dev->devt) call to part_release() to avoid > such issues (sadly without explaining more details). Adding Keith to CC > just in case he remembers. > > Anyway, you are right that it is fundamentally wrong to expose a structure > (hd_struct in our case) for lookup in idr when it is essentially scheduled > for RCU-delayed freeing. So what we can do is to replace hd_struct pointer > in idr with some placeholder value - NULL seems to be OK with idr - and > delete the entry from idr in part_release() as we do now. We can call this > new helper blk_invalidate_devt() or something like that. > > Honza > Thanks a lot for your detailed explanation:-). And replacing hd_struct pointer by NULL in idr is a good idea. I will send v2 for this problem. Yufen Thanks >>> --- >>> block/partition-generic.c | 5 +++++ >>> 1 file changed, 5 insertions(+) >>> >>> diff --git a/block/partition-generic.c b/block/partition-generic.c >>> index 1ee3e1d1bc2a..30d1039d5e8d 100644 >>> --- a/block/partition-generic.c >>> +++ b/block/partition-generic.c >>> @@ -288,6 +288,11 @@ void delete_partition(struct gendisk *disk, int partno) >>> kobject_put(part->holder_dir); >>> device_del(part_to_dev(part)); >>> + /* >>> + * We should ensuere to delete part from idr before kfree(disk), >>> + * avoiding use-after-free of disk. >>> + */ >>> + blk_free_devt(part_devt(part)); >>> hd_struct_kill(part); >>> } >>
diff --git a/block/partition-generic.c b/block/partition-generic.c index 1ee3e1d1bc2a..30d1039d5e8d 100644 --- a/block/partition-generic.c +++ b/block/partition-generic.c @@ -288,6 +288,11 @@ void delete_partition(struct gendisk *disk, int partno) kobject_put(part->holder_dir); device_del(part_to_dev(part)); + /* + * We should ensuere to delete part from idr before kfree(disk), + * avoiding use-after-free of disk. + */ + blk_free_devt(part_devt(part)); hd_struct_kill(part); }
In part_release(), it will remove devt from ext_devt_idr and get_gendisk cannot find it. But, if disk_release() works before part_release, open device partition may cause use-after-free of disk in get_gendisk(). We use md device as example, the race sence: Process1 Worker Process2 md_free blkdev_open del_gendisk add delete_partition_work_fn() to wq __blkdev_get get_gendisk put_disk disk_release kfree(disk) find part from ext_devt_idr get_disk_and_module(disk) cause use after free delete_partition_work_fn put_device(part) part_release remove part from ext_devt_idr Before Woker thread removes part from ext_devt_idr, Process2 can find the part and access the disk, resulting use-after-free. We fix this by removing the devt from ext_devt_idr when delete partition. Signed-off-by: Yufen Yu <yuyufen@huawei.com> --- block/partition-generic.c | 5 +++++ 1 file changed, 5 insertions(+)