Message ID | 20200329140459.18155-1-maco@android.com (mailing list archive) |
---|---|
State | New, archived |
Series | loop: Add LOOP_SET_FD_WITH_OFFSET ioctl. |
On 03/29/2020 07:05 AM, Martijn Coenen wrote:
> Configuring a loop device for a filesystem that is located at an offset
> currently requires calling LOOP_SET_FD and LOOP_SET_STATUS(64)
> consecutively. This has some downsides.
>
> The most important downside is that it can be slow. Here's setting
> up ~70 regular loop devices on an x86 Android device:
>
> vsoc_x86:/system/apex # time for i in `seq 30 100`;
> do losetup -r /dev/block/loop$i com.android.adbd.apex; done
> 0m01.85s real 0m00.01s user 0m00.01s system
>
> Here's configuring ~70 devices in the same way, but with an offset:
>
> vsoc_x86:/system/apex # time for i in `seq 30 100`;
> do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
> 0m03.40s real 0m00.02s user 0m00.03s system
>
> This is almost twice as slow; the main reason for this slowness is that
> LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to freeze the associated
> queue; this requires waiting for RCU synchronization, which I've
> measured can take about 15-20ms on this device on average.
>
> A more minor downside of having to do two ioctls is that on devices with
> max_part > 0, the kernel will initiate a partition scan, which is
> needless work if the image is at an offset.
>
> This change introduces a new ioctl to combine setting the backing file
> together with the offset, which avoids the above problems. Adding more
> parameters could be a consideration, but offset appears to be the only
> commonly used parameter that is required for accessing the device
> safely.
>
> Signed-off-by: Martijn Coenen<maco@android.com>

This patch seems to solve the problem. Can you please make sure to add a
blktests [1] test for it, since it is a new ioctl?

[1] https://github.com/osandov/blktests

On 2020-03-29 07:04, Martijn Coenen wrote:
> -static int loop_set_fd(struct loop_device *lo, fmode_t mode,
> -                       struct block_device *bdev, unsigned int arg)
> +static int loop_set_fd_with_offset(struct loop_device *lo, fmode_t mode,
> +                       struct block_device *bdev, unsigned int arg, loff_t offset)

Since this function has to be modified, please add an additional patch
to rename 'arg' into 'fd'. Additionally, how about renaming
"loop_set_fd_with_offset" into "loop_set_fd_and_offset"? I think the
latter name reflects more clearly the purpose of this function.

> @@ -1624,6 +1625,17 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
>                  break;
>          case LOOP_GET_STATUS64:
>                  return loop_get_status64(lo, (struct loop_info64 __user *) arg);
> +        case LOOP_SET_FD_WITH_OFFSET: {
> +                struct loop_fd_with_offset fdwo;
> +
> +                if (copy_from_user(&fdwo,
> +                                   (struct loop_fd_with_offset __user *) arg,
> +                                   sizeof(struct loop_fd_with_offset)))
                                      ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The kernel code that I'm familiar with uses sizeof(<variable name>)
instead of sizeof(<struct name>). That makes it less likely that
changing the type of the variable will introduce a mismatch between the
sizeof() expression and the size of the variable.

Thanks,

Bart.

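The sizeof() point can be illustrated outside the kernel. The following is a minimal userspace sketch, not kernel code: memcpy() stands in for copy_from_user(), the struct is a local copy of the one proposed in the patch, and read_request() is a hypothetical helper name. Because both size expressions name the variable, they keep matching if the type of fdwo is ever changed.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Userspace copy of the struct proposed in the patch (illustration only). */
struct loop_fd_with_offset {
	uint64_t lo_offset;
	uint32_t fd;
};

/*
 * Hypothetical helper: copy a request out of a raw buffer.
 * memcpy() stands in for copy_from_user() in this sketch.
 * Returns 0 on success, -1 if the buffer is too short.
 */
static int read_request(struct loop_fd_with_offset *out,
			const void *buf, size_t len)
{
	struct loop_fd_with_offset fdwo;

	/* sizeof(fdwo) follows the variable, not a repeated type name. */
	if (len < sizeof(fdwo))
		return -1;
	memcpy(&fdwo, buf, sizeof(fdwo));

	*out = fdwo;
	return 0;
}
```
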
On Sun, Mar 29, 2020 at 04:04:59PM +0200, Martijn Coenen wrote:
> Configuring a loop device for a filesystem that is located at an offset
> currently requires calling LOOP_SET_FD and LOOP_SET_STATUS(64)
> consecutively. This has some downsides.
>
> The most important downside is that it can be slow. Here's setting
> up ~70 regular loop devices on an x86 Android device:
>
> vsoc_x86:/system/apex # time for i in `seq 30 100`;
> do losetup -r /dev/block/loop$i com.android.adbd.apex; done
> 0m01.85s real 0m00.01s user 0m00.01s system
>
> Here's configuring ~70 devices in the same way, but with an offset:
>
> vsoc_x86:/system/apex # time for i in `seq 30 100`;
> do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
> 0m03.40s real 0m00.02s user 0m00.03s system
>
> This is almost twice as slow; the main reason for this slowness is that
> LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to freeze the associated
> queue; this requires waiting for RCU synchronization, which I've
> measured can take about 15-20ms on this device on average.
>
> A more minor downside of having to do two ioctls is that on devices with
> max_part > 0, the kernel will initiate a partition scan, which is
> needless work if the image is at an offset.
>
> This change introduces a new ioctl to combine setting the backing file
> together with the offset, which avoids the above problems. Adding more
> parameters could be a consideration, but offset appears to be the only
> commonly used parameter that is required for accessing the device
> safely.

The new ioctl LOOP_SET_FD_WITH_OFFSET looks not generic enough, could
you consider to add one ioctl LOOP_SET_FD_AND_STATUS to cover both
SET_FD and SET_STATUS so that using two ioctl() to setup loop can become
deprecated finally?

Thanks,
Ming

Hi Ming,

On Mon, Mar 30, 2020 at 3:00 AM Ming Lei <ming.lei@redhat.com> wrote:
> The new ioctl LOOP_SET_FD_WITH_OFFSET looks not generic enough, could
> you consider to add one ioctl LOOP_SET_FD_AND_STATUS to cover both
> SET_FD and SET_STATUS so that using two ioctl() to setup loop can become
> deprecated finally?

I originally started out doing that. However, it is a significantly
larger refactoring of the loop driver, and it makes things like error
handling more complex. I thought configuring loop with an offset is
the most common case. But if there's a preference to do an ioctl that
takes the full status, I can work on that.

Best,
Martijn

>
> Thanks,
> Ming

On Mon, Mar 30, 2020 at 10:06:41AM +0200, Martijn Coenen wrote:
> Hi Ming,
>
> On Mon, Mar 30, 2020 at 3:00 AM Ming Lei <ming.lei@redhat.com> wrote:
> > The new ioctl LOOP_SET_FD_WITH_OFFSET looks not generic enough, could
> > you consider to add one ioctl LOOP_SET_FD_AND_STATUS to cover both
> > SET_FD and SET_STATUS so that using two ioctl() to setup loop can become
> > deprecated finally?
>
> I originally started out doing that. However, it is a significantly
> larger refactoring of the loop driver, and it makes things like error
> handling more complex. I thought configuring loop with an offset is
> the most common case. But if there's a preference to do an ioctl that
> takes the full status, I can work on that.

I think the full blown set fd and status would seem a lot more useful,
or even better a LOOP_CTL_ADD variant that sets up everything important
on the character device so that we avoid the half set up block devices
entirely.

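For context on the "half set up" devices mentioned above: with the existing /dev/loop-control interface, LOOP_CTL_ADD and LOOP_CTL_GET_FREE only allocate a loop device; the backing file and status still have to be set afterwards through ioctls on the block device itself. A minimal sketch of the allocation step, assuming a standard Linux userspace with <linux/loop.h>; the helper name loop_get_free_index() is hypothetical:

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/loop.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: ask the loop control device at ctl_path
 * (normally "/dev/loop-control") for the index of a free loop device.
 * Returns the index (>= 0), or -1 with errno set.
 */
static int loop_get_free_index(const char *ctl_path)
{
	int ctl_fd, index, saved_errno;

	ctl_fd = open(ctl_path, O_RDWR);
	if (ctl_fd < 0)
		return -1;

	/*
	 * The device found or allocated here has no backing file yet;
	 * it only becomes usable after further ioctls on /dev/loopN.
	 */
	index = ioctl(ctl_fd, LOOP_CTL_GET_FREE);

	saved_errno = errno;
	close(ctl_fd);
	errno = saved_errno;
	return index;
}
```

A caller would then open the loop device with the returned index and configure it, which is the window in which the block device exists but is not yet fully set up.
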
On Tue, Mar 31, 2020 at 9:48 AM Christoph Hellwig <hch@lst.de> wrote:
> I think the full blown set fd and status would seem a lot more useful,
> or even better a LOOP_CTL_ADD variant that sets up everything important
> on the character device so that we avoid the half set up block devices
> entirely.

Thanks for the feedback, I will work on that then. I think I could do
both - LOOP_SET_FD_AND_STATUS and a new variant of LOOP_CTL_ADD that
calls it - the former could still be useful if the kernel pre-created a
large number of loop devices.

Martijn

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index a42c49e04954..517031e1d10c 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -932,8 +932,8 @@ static void loop_update_rotational(struct loop_device *lo)
 		blk_queue_flag_clear(QUEUE_FLAG_NONROT, q);
 }
 
-static int loop_set_fd(struct loop_device *lo, fmode_t mode,
-		       struct block_device *bdev, unsigned int arg)
+static int loop_set_fd_with_offset(struct loop_device *lo, fmode_t mode,
+		       struct block_device *bdev, unsigned int arg, loff_t offset)
 {
 	struct file	*file;
 	struct inode	*inode;
@@ -957,7 +957,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	 * here to avoid changing device under exclusive owner.
 	 */
 	if (!(mode & FMODE_EXCL)) {
-		claimed_bdev = bd_start_claiming(bdev, loop_set_fd);
+		claimed_bdev = bd_start_claiming(bdev, loop_set_fd_with_offset);
 		if (IS_ERR(claimed_bdev)) {
 			error = PTR_ERR(claimed_bdev);
 			goto out_putf;
@@ -1002,6 +1002,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	lo->transfer = NULL;
 	lo->ioctl = NULL;
 	lo->lo_sizelimit = 0;
+	lo->lo_offset = offset;
 	lo->old_gfp_mask = mapping_gfp_mask(mapping);
 	mapping_set_gfp_mask(mapping, lo->old_gfp_mask & ~(__GFP_IO|__GFP_FS));
 
@@ -1042,14 +1043,14 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
 	if (partscan)
 		loop_reread_partitions(lo, bdev);
 	if (claimed_bdev)
-		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd);
+		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd_with_offset);
 	return 0;
 
 out_unlock:
 	mutex_unlock(&loop_ctl_mutex);
 out_bdev:
 	if (claimed_bdev)
-		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd);
+		bd_abort_claiming(bdev, claimed_bdev, loop_set_fd_with_offset);
 out_putf:
 	fput(file);
 out:
@@ -1601,7 +1602,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
 
 	switch (cmd) {
 	case LOOP_SET_FD:
-		return loop_set_fd(lo, mode, bdev, arg);
+		return loop_set_fd_with_offset(lo, mode, bdev, arg, 0);
 	case LOOP_CHANGE_FD:
 		return loop_change_fd(lo, bdev, arg);
 	case LOOP_CLR_FD:
@@ -1624,6 +1625,17 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
 		break;
 	case LOOP_GET_STATUS64:
 		return loop_get_status64(lo, (struct loop_info64 __user *) arg);
+	case LOOP_SET_FD_WITH_OFFSET: {
+		struct loop_fd_with_offset fdwo;
+
+		if (copy_from_user(&fdwo,
+				   (struct loop_fd_with_offset __user *) arg,
+				   sizeof(struct loop_fd_with_offset)))
+			return -EFAULT;
+
+		return loop_set_fd_with_offset(lo, mode, bdev, fdwo.fd,
+				fdwo.lo_offset);
+	}
 	case LOOP_SET_CAPACITY:
 	case LOOP_SET_DIRECT_IO:
 	case LOOP_SET_BLOCK_SIZE:
@@ -1774,6 +1786,7 @@ static int lo_compat_ioctl(struct block_device *bdev, fmode_t mode,
 	case LOOP_SET_CAPACITY:
 	case LOOP_CLR_FD:
 	case LOOP_GET_STATUS64:
+	case LOOP_SET_FD_WITH_OFFSET:
 	case LOOP_SET_STATUS64:
 		arg = (unsigned long) compat_ptr(arg);
 		/* fall through */
diff --git a/include/uapi/linux/loop.h b/include/uapi/linux/loop.h
index 080a8df134ef..289829bc5abd 100644
--- a/include/uapi/linux/loop.h
+++ b/include/uapi/linux/loop.h
@@ -60,6 +60,11 @@ struct loop_info64 {
 	__u64		   lo_init[2];
 };
 
+struct loop_fd_with_offset {
+	__u64 lo_offset;
+	__u32 fd;
+};
+
 /*
  * Loop filter types
  */
@@ -90,6 +95,7 @@ struct loop_info64 {
 #define LOOP_SET_CAPACITY	0x4C07
 #define LOOP_SET_DIRECT_IO	0x4C08
 #define LOOP_SET_BLOCK_SIZE	0x4C09
+#define LOOP_SET_FD_WITH_OFFSET	0x4C0A
 
 /* /dev/loop-control interface */
 #define LOOP_CTL_ADD		0x4C80
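From userspace, the ioctl proposed in this diff might be called roughly as follows. This is only a sketch under stated assumptions: the struct layout and request number are copied from the patch into locally named definitions (the `proposed_`/`PROPOSED_` prefixes and both helper names are hypothetical), and the call is only meaningful on a kernel that carries this exact patch. On any other kernel the request number is unknown or may mean something else, so it should not be issued against a real loop device there.

```c
#include <stdint.h>
#include <sys/ioctl.h>

/* Local copy of the struct from the patch; not from a released header. */
struct proposed_loop_fd_with_offset {
	uint64_t lo_offset;
	uint32_t fd;
};

/* Request number as proposed in the patch (assumption: patch applied). */
#define PROPOSED_LOOP_SET_FD_WITH_OFFSET 0x4C0A

/* Hypothetical helper: build the argument for the proposed ioctl. */
static struct proposed_loop_fd_with_offset
make_request(int backing_fd, uint64_t offset)
{
	struct proposed_loop_fd_with_offset req = {
		.lo_offset = offset,
		.fd = (uint32_t)backing_fd,
	};

	return req;
}

/*
 * Hypothetical helper: bind backing_fd to the loop device behind loop_fd
 * at the given byte offset with a single ioctl.
 * Returns the ioctl() result: 0 on success, -1 with errno set on failure.
 */
static int loop_attach_one_call(int loop_fd, int backing_fd, uint64_t offset)
{
	struct proposed_loop_fd_with_offset req =
		make_request(backing_fd, offset);

	return ioctl(loop_fd, PROPOSED_LOOP_SET_FD_WITH_OFFSET, &req);
}
```
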
Configuring a loop device for a filesystem that is located at an offset
currently requires calling LOOP_SET_FD and LOOP_SET_STATUS(64)
consecutively. This has some downsides.

The most important downside is that it can be slow. Here's setting
up ~70 regular loop devices on an x86 Android device:

vsoc_x86:/system/apex # time for i in `seq 30 100`;
do losetup -r /dev/block/loop$i com.android.adbd.apex; done
0m01.85s real 0m00.01s user 0m00.01s system

Here's configuring ~70 devices in the same way, but with an offset:

vsoc_x86:/system/apex # time for i in `seq 30 100`;
do losetup -r -o 4096 /dev/block/loop$i com.android.adbd.apex; done
0m03.40s real 0m00.02s user 0m00.03s system

This is almost twice as slow; the main reason for this slowness is that
LOOP_SET_STATUS(64) calls blk_mq_freeze_queue() to freeze the associated
queue; this requires waiting for RCU synchronization, which I've
measured can take about 15-20ms on this device on average.

A more minor downside of having to do two ioctls is that on devices with
max_part > 0, the kernel will initiate a partition scan, which is
needless work if the image is at an offset.

This change introduces a new ioctl to combine setting the backing file
together with the offset, which avoids the above problems. Adding more
parameters could be a consideration, but offset appears to be the only
commonly used parameter that is required for accessing the device
safely.

Signed-off-by: Martijn Coenen <maco@android.com>
---
 drivers/block/loop.c      | 25 +++++++++++++++++++------
 include/uapi/linux/loop.h |  6 ++++++
 2 files changed, 25 insertions(+), 6 deletions(-)
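For reference, the existing two-ioctl sequence that the commit message describes looks roughly like this from userspace. It is a minimal sketch, assuming a Linux system with <linux/loop.h>; the helper name loop_attach_two_calls() is hypothetical, and both files are opened read-only to mirror `losetup -r`. It is not how losetup itself is implemented.

```c
#include <errno.h>
#include <fcntl.h>
#include <linux/loop.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>

/*
 * Hypothetical helper: bind backing_path to the loop device at loop_path,
 * starting at the given byte offset, using the existing two ioctls.
 * Returns 0 on success, -1 with errno set on failure.
 */
static int loop_attach_two_calls(const char *loop_path,
				 const char *backing_path, uint64_t offset)
{
	struct loop_info64 info;
	int loop_fd, backing_fd, saved_errno, ret = -1;

	backing_fd = open(backing_path, O_RDONLY);
	if (backing_fd < 0)
		return -1;

	loop_fd = open(loop_path, O_RDONLY);
	if (loop_fd < 0)
		goto out_backing;

	/* Step 1: attach the backing file; the offset is still 0 here. */
	if (ioctl(loop_fd, LOOP_SET_FD, backing_fd) < 0)
		goto out_loop;

	/*
	 * Step 2: set the offset. Per the commit message, this is the
	 * call that freezes the queue and accounts for the extra time.
	 */
	memset(&info, 0, sizeof(info));
	info.lo_offset = offset;
	if (ioctl(loop_fd, LOOP_SET_STATUS64, &info) < 0) {
		saved_errno = errno;
		ioctl(loop_fd, LOOP_CLR_FD, 0);	/* undo step 1 */
		errno = saved_errno;
		goto out_loop;
	}
	ret = 0;

out_loop:
	saved_errno = errno;
	close(loop_fd);
	errno = saved_errno;
out_backing:
	saved_errno = errno;
	close(backing_fd);
	errno = saved_errno;
	return ret;
}
```

Between the two steps the device is attached but still maps the file from offset 0, which is the window the single combined ioctl is meant to close.
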