Message ID | 20220308152105.309618-1-joshi.k@samsung.com (mailing list archive) |
---|---|
Headers | show |
Series | io_uring passthru over nvme | expand |
What branch is this against? Do you have a git tree available? On Tue, Mar 08, 2022 at 08:50:48PM +0530, Kanchan Joshi wrote: > This is a streamlined series with new way of doing uring-cmd, and connects > nvme-passthrough (over char device /dev/ngX) to it. > uring-cmd enables using io_uring for any arbitrary command (ioctl, > fsctl etc.) exposed by the command provider (e.g. driver, fs etc.). > > To store the command inline within the sqe, Jens added an option to setup > the ring with 128-byte SQEs.This gives 80 bytes of space (16 bytes at > the end of the first sqe + 64 bytes in the second sqe). With inline > command in sqe, the application avoids explicit allocation and, in the > kernel, we avoid doing copy_from_user. Command-opcode, length etc. > are stored in per-op fields of io_uring_sqe. > > Non-inline submission (when command is a user-space pointer rather than > housed inside sqe) is also supported. > > io_uring sends this command down by newly introduced ->async_cmd() > handler in file_operations. The handler does what is required to > submit, and indicates queued completion.The infra has been added to > process the completion when it arrives. > > Overall the patches wire up the following capabilities for this path: > - async > - fixed-buffer > - plugging > - bio-cache > - sync and async polling. > > This scales well. 512b randread perf (KIOPS) comparing > uring-passthru-over-char (/dev/ng0n1) to > uring-over-block(/dev/nvme0n1) > > QD uring pt uring-poll pt-poll > 8 538 589 831 902 > 64 967 1131 1351 1378 > 256 1043 1230 1376 1429 > > Testing/perf is done with this custom fio that turnes regular-io into > passthru-io on supplying "uring_cmd=1" option. > https://github.com/joshkan/fio/tree/big-sqe-pt.v1 > > Example command-line: > fio -iodepth=256 -rw=randread -ioengine=io_uring -bs=512 -numjobs=1 > -runtime=60 -group_reporting -iodepth_batch_submit=64 > -iodepth_batch_complete_min=1 -iodepth_batch_complete_max=64 > -fixedbufs=1 -hipri=1 -sqthread_poll=0 -filename=/dev/ng0n1 > -name=io_uring_256 -uring_cmd=1 > > > Anuj Gupta (3): > io_uring: prep for fixed-buffer enabled uring-cmd > nvme: enable passthrough with fixed-buffer > nvme: enable non-inline passthru commands > > Jens Axboe (5): > io_uring: add support for 128-byte SQEs > fs: add file_operations->async_cmd() > io_uring: add infra and support for IORING_OP_URING_CMD > io_uring: plug for async bypass > block: wire-up support for plugging > > Kanchan Joshi (5): > nvme: wire-up support for async-passthru on char-device. > io_uring: add support for uring_cmd with fixed-buffer > block: factor out helper for bio allocation from cache > nvme: enable bio-cache for fixed-buffer passthru > io_uring: add support for non-inline uring-cmd > > Keith Busch (2): > nvme: modify nvme_alloc_request to take an additional parameter > nvme: allow user passthrough commands to poll > > Pankaj Raghav (2): > io_uring: add polling support for uring-cmd > nvme: wire-up polling for uring-passthru > > block/bio.c | 43 ++-- > block/blk-map.c | 45 +++++ > block/blk-mq.c | 93 ++++----- > drivers/nvme/host/core.c | 21 +- > drivers/nvme/host/ioctl.c | 336 +++++++++++++++++++++++++++----- > drivers/nvme/host/multipath.c | 2 + > drivers/nvme/host/nvme.h | 11 +- > drivers/nvme/host/pci.c | 4 +- > drivers/nvme/target/passthru.c | 2 +- > fs/io_uring.c | 188 ++++++++++++++++-- > include/linux/bio.h | 1 + > include/linux/blk-mq.h | 4 + > include/linux/fs.h | 2 + > include/linux/io_uring.h | 43 ++++ > include/uapi/linux/io_uring.h | 21 +- > include/uapi/linux/nvme_ioctl.h | 4 + > 16 files changed, 689 insertions(+), 131 deletions(-) > > -- > 2.25.1 ---end quoted text---
On Thu, Mar 10, 2022 at 1:59 PM Christoph Hellwig <hch@lst.de> wrote: > > What branch is this against? Sorry I missed that in the cover. Two options - (a) https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-big-sqe first patch ("128 byte sqe support") is already there. (b) for-next (linux-block), series will fit on top of commit 9e9d83faa ("io_uring: Remove unneeded test in io_run_task_work_sig") > Do you have a git tree available? Not at the moment. @Jens: Please see if it is possible to move patches to your io_uring-big-sqe branch (and maybe rename that to big-sqe-pt.v1). Thanks.
On Thu, Mar 10, 2022 at 03:35:02PM +0530, Kanchan Joshi wrote: > On Thu, Mar 10, 2022 at 1:59 PM Christoph Hellwig <hch@lst.de> wrote: > > > > What branch is this against? > Sorry I missed that in the cover. > Two options - > (a) https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-big-sqe > first patch ("128 byte sqe support") is already there. > (b) for-next (linux-block), series will fit on top of commit 9e9d83faa > ("io_uring: Remove unneeded test in io_run_task_work_sig") > > > Do you have a git tree available? > Not at the moment. > > @Jens: Please see if it is possible to move patches to your > io_uring-big-sqe branch (and maybe rename that to big-sqe-pt.v1). Since Jens might be busy, I've put up a tree with all this stuff: https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20220311-io-uring-cmd It is based on option (b) mentioned above, I took linux-block for-next and reset the tree to commit "io_uring: Remove unneeded test in io_run_task_work_sig" before applying the series. Luis
On Fri, Mar 11, 2022 at 08:43:24AM -0800, Luis Chamberlain wrote: > On Thu, Mar 10, 2022 at 03:35:02PM +0530, Kanchan Joshi wrote: > > On Thu, Mar 10, 2022 at 1:59 PM Christoph Hellwig <hch@lst.de> wrote: > > > > > > What branch is this against? > > Sorry I missed that in the cover. > > Two options - > > (a) https://urldefense.com/v3/__https://protect2.fireeye.com/v1/url?k=03500d22-5ccb341f-0351866d-0cc47a31309a-6f95e6932e414a1d&q=1&e=4ca7b05e-2fe6-40d9-bbcf-a4ed687eca9f&u=https*3A*2F*2Fgit.kernel.dk*2Fcgit*2Flinux-block*2Flog*2F*3Fh*3Dio_uring-big-sqe__;JSUlJSUlJSUl!!EwVzqGoTKBqv-0DWAJBm!FujuZ927K3fuIklgYjkWtodmdQnQyBqOw4Ge4M08DU_0oD5tPm0-wS2SZg0MDh8_2-U9$ > > first patch ("128 byte sqe support") is already there. > > (b) for-next (linux-block), series will fit on top of commit 9e9d83faa > > ("io_uring: Remove unneeded test in io_run_task_work_sig") > > > > > Do you have a git tree available? > > Not at the moment. > > > > @Jens: Please see if it is possible to move patches to your > > io_uring-big-sqe branch (and maybe rename that to big-sqe-pt.v1). > > Since Jens might be busy, I've put up a tree with all this stuff: > > https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20220311-io-uring-cmd__;!!EwVzqGoTKBqv-0DWAJBm!FujuZ927K3fuIklgYjkWtodmdQnQyBqOw4Ge4M08DU_0oD5tPm0-wS2SZg0MDiTF0Q7F$ > > It is based on option (b) mentioned above, I took linux-block for-next > and reset the tree to commit "io_uring: Remove unneeded test in > io_run_task_work_sig" before applying the series. FYI I can be involved in testing this and have added some colleagues that can help in this regard. We have been using some form of this work for several months now and haven't had any issues. That being said some simple tests I have are not currently working with the above git tree :). I will work to get this resolved and post an update here. > > Luis
On Fri, Mar 11, 2022 at 03:35:04PM -0800, Adam Manzanares wrote: > On Fri, Mar 11, 2022 at 08:43:24AM -0800, Luis Chamberlain wrote: > > On Thu, Mar 10, 2022 at 03:35:02PM +0530, Kanchan Joshi wrote: > > > On Thu, Mar 10, 2022 at 1:59 PM Christoph Hellwig <hch@lst.de> wrote: > > > > > > > > What branch is this against? > > > Sorry I missed that in the cover. > > > Two options - > > > (a) https://urldefense.com/v3/__https://protect2.fireeye.com/v1/url?k=03500d22-5ccb341f-0351866d-0cc47a31309a-6f95e6932e414a1d&q=1&e=4ca7b05e-2fe6-40d9-bbcf-a4ed687eca9f&u=https*3A*2F*2Fgit.kernel.dk*2Fcgit*2Flinux-block*2Flog*2F*3Fh*3Dio_uring-big-sqe__;JSUlJSUlJSUl!!EwVzqGoTKBqv-0DWAJBm!FujuZ927K3fuIklgYjkWtodmdQnQyBqOw4Ge4M08DU_0oD5tPm0-wS2SZg0MDh8_2-U9$ > > > first patch ("128 byte sqe support") is already there. > > > (b) for-next (linux-block), series will fit on top of commit 9e9d83faa > > > ("io_uring: Remove unneeded test in io_run_task_work_sig") > > > > > > > Do you have a git tree available? > > > Not at the moment. > > > > > > @Jens: Please see if it is possible to move patches to your > > > io_uring-big-sqe branch (and maybe rename that to big-sqe-pt.v1). > > > > Since Jens might be busy, I've put up a tree with all this stuff: > > > > https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20220311-io-uring-cmd__;!!EwVzqGoTKBqv-0DWAJBm!FujuZ927K3fuIklgYjkWtodmdQnQyBqOw4Ge4M08DU_0oD5tPm0-wS2SZg0MDiTF0Q7F$ > > > > It is based on option (b) mentioned above, I took linux-block for-next > > and reset the tree to commit "io_uring: Remove unneeded test in > > io_run_task_work_sig" before applying the series. > > FYI I can be involved in testing this and have added some colleagues that can > help in this regard. We have been using some form of this work for several > months now and haven't had any issues. That being said some simple tests I have > are not currently working with the above git tree :). I will work to get this > resolved and post an update here. Sorry for the noise, I jumped up the stack too quickly with my tests. The "simple test" actually depends on several pieces of SW not related to the kernel. > > > > > Luis
On Sat, Mar 12, 2022 at 7:57 AM Adam Manzanares <a.manzanares@samsung.com> wrote: > > On Fri, Mar 11, 2022 at 03:35:04PM -0800, Adam Manzanares wrote: > > On Fri, Mar 11, 2022 at 08:43:24AM -0800, Luis Chamberlain wrote: > > > On Thu, Mar 10, 2022 at 03:35:02PM +0530, Kanchan Joshi wrote: > > > > On Thu, Mar 10, 2022 at 1:59 PM Christoph Hellwig <hch@lst.de> wrote: > > > > > > > > > > What branch is this against? > > > > Sorry I missed that in the cover. > > > > Two options - > > > > (a) https://urldefense.com/v3/__https://protect2.fireeye.com/v1/url?k=03500d22-5ccb341f-0351866d-0cc47a31309a-6f95e6932e414a1d&q=1&e=4ca7b05e-2fe6-40d9-bbcf-a4ed687eca9f&u=https*3A*2F*2Fgit.kernel.dk*2Fcgit*2Flinux-block*2Flog*2F*3Fh*3Dio_uring-big-sqe__;JSUlJSUlJSUl!!EwVzqGoTKBqv-0DWAJBm!FujuZ927K3fuIklgYjkWtodmdQnQyBqOw4Ge4M08DU_0oD5tPm0-wS2SZg0MDh8_2-U9$ > > > > first patch ("128 byte sqe support") is already there. > > > > (b) for-next (linux-block), series will fit on top of commit 9e9d83faa > > > > ("io_uring: Remove unneeded test in io_run_task_work_sig") > > > > > > > > > Do you have a git tree available? > > > > Not at the moment. > > > > > > > > @Jens: Please see if it is possible to move patches to your > > > > io_uring-big-sqe branch (and maybe rename that to big-sqe-pt.v1). > > > > > > Since Jens might be busy, I've put up a tree with all this stuff: > > > > > > https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20220311-io-uring-cmd__;!!EwVzqGoTKBqv-0DWAJBm!FujuZ927K3fuIklgYjkWtodmdQnQyBqOw4Ge4M08DU_0oD5tPm0-wS2SZg0MDiTF0Q7F$ > > > > > > It is based on option (b) mentioned above, I took linux-block for-next > > > and reset the tree to commit "io_uring: Remove unneeded test in > > > io_run_task_work_sig" before applying the series. > > > > FYI I can be involved in testing this and have added some colleagues that can > > help in this regard. We have been using some form of this work for several > > months now and haven't had any issues. That being said some simple tests I have > > are not currently working with the above git tree :). I will work to get this > > resolved and post an update here. > > Sorry for the noise, I jumped up the stack too quickly with my tests. The > "simple test" actually depends on several pieces of SW not related to the > kernel. Did you read the cover letter? It's not the same *user-interface* as the previous series. If you did not modify those out-of-kernel layers for the new interface, you're bound to see what you saw. If you did, please specify what the simple test was. I'll fix that in v2. Otherwise, the throwaway remark "simple tests not working" - only infers this series is untested. Nothing could be further from the truth. Rather this series is more robust than the previous one. Let me expand bit more on testing part that's already there in cover: fio -iodepth=256 -rw=randread -ioengine=io_uring -bs=512 -numjobs=1 -runtime=60 -group_reporting -iodepth_batch_submit=64 -iodepth_batch_complete_min=1 -iodepth_batch_complete_max=64 -fixedbufs=1 -hipri=1 -sqthread_poll=0 -filename=/dev/ng0n1 -name=io_uring_256 -uring_cmd=1 When I reduce the above command-line to do single IO, I call that a simple test. Simple test that touches almost *everything* that patches build (i.e async, fixed-buffer, plugging, fixed-buffer, bio-cache, polling). And larger tests combine these knobs in various ways, QD ranging from 1, 2, 4...upto 256; on general and perf-optimized kernel config; with big-sqe and normal-sqe (pointer one). And all this is repeated on the block interface (regular io) too, which covers the regression part. Sure, I can add more tests for checking regression. But no, I don't expect any simple test to fail. And that applies to Luis' tree as well. Tried that too again.
On Fri, Mar 11, 2022 at 10:13 PM Luis Chamberlain <mcgrof@kernel.org> wrote: > > On Thu, Mar 10, 2022 at 03:35:02PM +0530, Kanchan Joshi wrote: > > On Thu, Mar 10, 2022 at 1:59 PM Christoph Hellwig <hch@lst.de> wrote: > > > > > > What branch is this against? > > Sorry I missed that in the cover. > > Two options - > > (a) https://git.kernel.dk/cgit/linux-block/log/?h=io_uring-big-sqe > > first patch ("128 byte sqe support") is already there. > > (b) for-next (linux-block), series will fit on top of commit 9e9d83faa > > ("io_uring: Remove unneeded test in io_run_task_work_sig") > > > > > Do you have a git tree available? > > Not at the moment. > > > > @Jens: Please see if it is possible to move patches to your > > io_uring-big-sqe branch (and maybe rename that to big-sqe-pt.v1). > > Since Jens might be busy, I've put up a tree with all this stuff: > > https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20220311-io-uring-cmd > > It is based on option (b) mentioned above, I took linux-block for-next > and reset the tree to commit "io_uring: Remove unneeded test in > io_run_task_work_sig" before applying the series. Thanks for putting this up.
On Sun, Mar 13, 2022 at 10:37:53AM +0530, Kanchan Joshi wrote: > On Sat, Mar 12, 2022 at 7:57 AM Adam Manzanares > <a.manzanares@samsung.com> wrote: > > > > On Fri, Mar 11, 2022 at 03:35:04PM -0800, Adam Manzanares wrote: > > > On Fri, Mar 11, 2022 at 08:43:24AM -0800, Luis Chamberlain wrote: > > > > On Thu, Mar 10, 2022 at 03:35:02PM +0530, Kanchan Joshi wrote: > > > > > On Thu, Mar 10, 2022 at 1:59 PM Christoph Hellwig <hch@lst.de> wrote: > > > > > > > > > > > > What branch is this against? > > > > > Sorry I missed that in the cover. > > > > > Two options - > > > > > (a) https://urldefense.com/v3/__https://protect2.fireeye.com/v1/url?k=03500d22-5ccb341f-0351866d-0cc47a31309a-6f95e6932e414a1d&q=1&e=4ca7b05e-2fe6-40d9-bbcf-a4ed687eca9f&u=https*3A*2F*2Fgit.kernel.dk*2Fcgit*2Flinux-block*2Flog*2F*3Fh*3Dio_uring-big-sqe__;JSUlJSUlJSUl!!EwVzqGoTKBqv-0DWAJBm!FujuZ927K3fuIklgYjkWtodmdQnQyBqOw4Ge4M08DU_0oD5tPm0-wS2SZg0MDh8_2-U9$ > > > > > first patch ("128 byte sqe support") is already there. > > > > > (b) for-next (linux-block), series will fit on top of commit 9e9d83faa > > > > > ("io_uring: Remove unneeded test in io_run_task_work_sig") > > > > > > > > > > > Do you have a git tree available? > > > > > Not at the moment. > > > > > > > > > > @Jens: Please see if it is possible to move patches to your > > > > > io_uring-big-sqe branch (and maybe rename that to big-sqe-pt.v1). > > > > > > > > Since Jens might be busy, I've put up a tree with all this stuff: > > > > > > > > https://urldefense.com/v3/__https://git.kernel.org/pub/scm/linux/kernel/git/mcgrof/linux-next.git/log/?h=20220311-io-uring-cmd__;!!EwVzqGoTKBqv-0DWAJBm!FujuZ927K3fuIklgYjkWtodmdQnQyBqOw4Ge4M08DU_0oD5tPm0-wS2SZg0MDiTF0Q7F$ > > > > > > > > It is based on option (b) mentioned above, I took linux-block for-next > > > > and reset the tree to commit "io_uring: Remove unneeded test in > > > > io_run_task_work_sig" before applying the series. > > > > > > FYI I can be involved in testing this and have added some colleagues that can > > > help in this regard. We have been using some form of this work for several > > > months now and haven't had any issues. That being said some simple tests I have > > > are not currently working with the above git tree :). I will work to get this > > > resolved and post an update here. > > > > Sorry for the noise, I jumped up the stack too quickly with my tests. The > > "simple test" actually depends on several pieces of SW not related to the > > kernel. > > Did you read the cover letter? It's not the same *user-interface* as > the previous series. > If you did not modify those out-of-kernel layers for the new > interface, you're bound to see what you saw. > If you did, please specify what the simple test was. I'll fix that in v2. I got a little ahead of myself. Looking forward to leveraging this work in the near future ;) > > Otherwise, the throwaway remark "simple tests not working" - only > infers this series is untested. Nothing could be further from the > truth. > Rather this series is more robust than the previous one. Excellent, to hear that this series is robust. My intent was not to claim this series was untested. It is clear I need to do more hw before making an attempt to help with testing. > > Let me expand bit more on testing part that's already there in cover: > > fio -iodepth=256 -rw=randread -ioengine=io_uring -bs=512 -numjobs=1 > -runtime=60 -group_reporting -iodepth_batch_submit=64 > -iodepth_batch_complete_min=1 -iodepth_batch_complete_max=64 > -fixedbufs=1 -hipri=1 -sqthread_poll=0 -filename=/dev/ng0n1 > -name=io_uring_256 -uring_cmd=1 > > When I reduce the above command-line to do single IO, I call that a simple test. > Simple test that touches almost *everything* that patches build (i.e > async, fixed-buffer, plugging, fixed-buffer, bio-cache, polling). > And larger tests combine these knobs in various ways, QD ranging from > 1, 2, 4...upto 256; on general and perf-optimized kernel config; with > big-sqe and normal-sqe (pointer one). And all this is repeated on the > block interface (regular io) too, which covers the regression part. > Sure, I can add more tests for checking regression. But no, I don't > expect any simple test to fail. And that applies to Luis' tree as > well. Tried that too again. Looks like you have all of your based covered. Keep up the good work. > > > -- > Joshi