From patchwork Fri Oct 27 18:19:26 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13438853 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A6818C25B47 for ; Fri, 27 Oct 2023 18:20:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232608AbjJ0SUP (ORCPT ); Fri, 27 Oct 2023 14:20:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:48006 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231594AbjJ0SUO (ORCPT ); Fri, 27 Oct 2023 14:20:14 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3AB1CAC for ; Fri, 27 Oct 2023 11:20:11 -0700 (PDT) Received: from pps.filterd (m0109333.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39RE5mPB006170 for ; Fri, 27 Oct 2023 11:20:11 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=s2048-2021-q4; bh=PeYXdDD9BDy0e7ssYNWuuSlwKOHdUA7B3s3SZXstLZY=; b=k2984Iyczg58uKzsKECSyrffc0HQFOwZWyt7tuvyBhhpolnpp4JTmLB4+JY5I7u2P7v1 OQk8DtfrJlpQ5Xe5WDZ50V7F+EwdcCTJ/6jlBX2cLSEIiCW+LCr5eeL5WzVmm1lTmyBu nimJJaVB8yttwM3v6Sl66PBx2+6dONYftN+QjZSr2mcDr9JRtxcgdfyhZpcgbfUJXvTA WaHdQGyqdzoyXJolIRrGVCcZaK+qKT1ArhH5BLnPaXqGR+3lf/E8aRbu5RMx7/fgFQF/ avfN9pPIKvwZbU7pVhCUZI1BJJ3YvDdENteK1DfloUYKoO++3My6rtks2qhbC7mlCZy5 eA== Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tywry0fu4-20 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Fri, 27 Oct 2023 11:20:10 -0700 Received: from twshared15991.38.frc1.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Fri, 27 Oct 2023 11:20:04 -0700 Received: by devbig007.nao1.facebook.com (Postfix, from userid 544533) id 4268A20D093C1; Fri, 27 Oct 2023 11:19:50 -0700 (PDT) From: Keith Busch To: , , CC: , , , , Keith Busch Subject: [PATCHv2 1/4] block: bio-integrity: directly map user buffers Date: Fri, 27 Oct 2023 11:19:26 -0700 Message-ID: <20231027181929.2589937-2-kbusch@meta.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231027181929.2589937-1-kbusch@meta.com> References: <20231027181929.2589937-1-kbusch@meta.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: nQ_Si2SFuWJzEvUz-TV5i3bHPN_Yg_nC X-Proofpoint-GUID: nQ_Si2SFuWJzEvUz-TV5i3bHPN_Yg_nC X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.987,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-27_17,2023-10-27_01,2023-05-22_02 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Keith Busch Passthrough commands that utilize metadata currently need to bounce the user space buffer through the kernel. Add support for mapping user space directly so that we can avoid this costly overhead. This is similiar to how the normal bio data payload utilizes user addresses with bio_map_user_iov(). If the user address can't directly be used for reasons like too many segments or address unalignement, fallback to a copy of the user vec while keeping the user address pinned for the IO duration so that it can safely be copied on completion in any process context. Signed-off-by: Keith Busch --- block/bio-integrity.c | 195 ++++++++++++++++++++++++++++++++++++++++++ include/linux/bio.h | 9 ++ 2 files changed, 204 insertions(+) diff --git a/block/bio-integrity.c b/block/bio-integrity.c index ec8ac8cf6e1b9..7f9d242ad79df 100644 --- a/block/bio-integrity.c +++ b/block/bio-integrity.c @@ -91,6 +91,37 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio, } EXPORT_SYMBOL(bio_integrity_alloc); +static void bio_integrity_unmap_user(struct bio_integrity_payload *bip) +{ + bool dirty = bio_data_dir(bip->bip_bio) == READ; + struct bio_vec *copy = bip->copy_vec; + struct bvec_iter iter; + struct bio_vec bv; + + if (copy) { + unsigned short nr_vecs = bip->bip_max_vcnt; + size_t bytes = bip->bip_iter.bi_size; + void *buf = bvec_virt(bip->bip_vec); + + if (dirty) { + struct iov_iter iter; + + iov_iter_bvec(&iter, ITER_DEST, copy, nr_vecs, bytes); + WARN_ON(copy_to_iter(buf, bytes, &iter) != bytes); + } + + memcpy(bip->bip_vec, copy, nr_vecs * sizeof(*copy)); + kfree(copy); + kfree(buf); + } + + bip_for_each_vec(bv, bip, iter) { + if (dirty && !PageCompound(bv.bv_page)) + set_page_dirty_lock(bv.bv_page); + unpin_user_page(bv.bv_page); + } +} + /** * bio_integrity_free - Free bio integrity payload * @bio: bio containing bip to be freed @@ -105,6 +136,8 @@ void bio_integrity_free(struct bio *bio) if (bip->bip_flags & BIP_BLOCK_INTEGRITY) kfree(bvec_virt(bip->bip_vec)); + else if (bip->bip_flags & BIP_INTEGRITY_USER) + bio_integrity_unmap_user(bip); __bio_integrity_free(bs, bip); bio->bi_integrity = NULL; @@ -160,6 +193,168 @@ int bio_integrity_add_page(struct bio *bio, struct page *page, } EXPORT_SYMBOL(bio_integrity_add_page); +static int bio_integrity_copy_user(struct bio *bio, struct bio_vec *bvec, + int nr_vecs, unsigned int len, + unsigned int direction, u32 seed) +{ + struct bio_integrity_payload *bip; + struct bio_vec *copy_vec = NULL; + struct iov_iter iter; + void *buf; + int ret; + + /* if bvec is on the stack, we need to allocate a copy for the completion */ + if (nr_vecs <= UIO_FASTIOV) { + copy_vec = kcalloc(sizeof(*bvec), nr_vecs, GFP_KERNEL); + if (!copy_vec) + return -ENOMEM; + memcpy(copy_vec, bvec, nr_vecs * sizeof(*bvec)); + } + + buf = kmalloc(len, GFP_KERNEL); + if (!buf) + goto free_copy; + + if (direction == ITER_SOURCE) { + iov_iter_bvec(&iter, direction, bvec, nr_vecs, len); + if (!copy_from_iter_full(buf, len, &iter)) { + ret = -EFAULT; + goto free_buf; + } + } else { + memset(buf, 0, len); + } + + /* + * We just need one vec for this bip, but we need to preserve the + * number of vecs in the user bvec for the completion handling, so use + * nr_vecs. + */ + bip = bio_integrity_alloc(bio, GFP_KERNEL, nr_vecs); + if (IS_ERR(bip)) { + ret = PTR_ERR(bip); + goto free_buf; + } + + ret = bio_integrity_add_page(bio, virt_to_page(buf), len, + offset_in_page(buf)); + if (ret != len) { + ret = -ENOMEM; + goto free_bip; + } + + bip->bip_flags |= BIP_INTEGRITY_USER; + bip->copy_vec = copy_vec ?: bvec; + return 0; +free_bip: + bio_integrity_free(bio); +free_buf: + kfree(buf); +free_copy: + kfree(copy_vec); + return ret; +} + +int bio_integrity_map_user(struct bio *bio, void __user *ubuf, unsigned int len, + u32 seed) +{ + struct request_queue *q = bdev_get_queue(bio->bi_bdev); + unsigned long offs, align = q->dma_pad_mask | queue_dma_alignment(q); + int ret, direction, nr_vecs, i, j, folios = 0; + struct bio_vec stack_vec[UIO_FASTIOV]; + struct bio_vec bv, *bvec = stack_vec; + struct page *stack_pages[UIO_FASTIOV]; + struct page **pages = stack_pages; + struct bio_integrity_payload *bip; + struct iov_iter iter; + struct bvec_iter bi; + u32 bytes; + + if (bio_integrity(bio)) + return -EINVAL; + if (len >> SECTOR_SHIFT > queue_max_hw_sectors(q)) + return -E2BIG; + + if (bio_data_dir(bio) == READ) + direction = ITER_DEST; + else + direction = ITER_SOURCE; + + iov_iter_ubuf(&iter, direction, ubuf, len); + nr_vecs = iov_iter_npages(&iter, BIO_MAX_VECS + 1); + if (nr_vecs > BIO_MAX_VECS) + return -E2BIG; + if (nr_vecs > UIO_FASTIOV) { + bvec = kcalloc(sizeof(*bvec), nr_vecs, GFP_KERNEL); + if (!bvec) + return -ENOMEM; + pages = NULL; + } + + bytes = iov_iter_extract_pages(&iter, &pages, len, nr_vecs, 0, &offs); + if (unlikely(bytes < 0)) { + ret = bytes; + goto free_bvec; + } + + for (i = 0; i < nr_vecs; i = j) { + size_t size = min_t(size_t, bytes, PAGE_SIZE - offs); + struct folio *folio = page_folio(pages[i]); + + bytes -= size; + for (j = i + 1; j < nr_vecs; j++) { + size_t next = min_t(size_t, PAGE_SIZE, bytes); + + if (page_folio(pages[j]) != folio || + pages[j] != pages[j - 1] + 1) + break; + unpin_user_page(pages[j]); + size += next; + bytes -= next; + } + + bvec_set_page(&bvec[folios], pages[i], size, offs); + offs = 0; + folios++; + } + + if (pages != stack_pages) + kvfree(pages); + + if (folios > queue_max_integrity_segments(q) || + !iov_iter_is_aligned(&iter, align, align)) { + ret = bio_integrity_copy_user(bio, bvec, folios, len, + direction, seed); + if (ret) + goto release_pages; + return 0; + } + + bip = bio_integrity_alloc(bio, GFP_KERNEL, folios); + if (IS_ERR(bip)) { + ret = PTR_ERR(bip); + goto release_pages; + } + + memcpy(bip->bip_vec, bvec, folios * sizeof(*bvec)); + if (bvec != stack_vec) + kfree(bvec); + + bip->bip_flags |= BIP_INTEGRITY_USER; + bip->copy_vec = NULL; + return 0; + +release_pages: + bi.bi_size = len; + for_each_bvec(bv, bvec, bi, bi) + unpin_user_page(bv.bv_page); +free_bvec: + if (bvec != stack_vec) + kfree(bvec); + return ret; +} +EXPORT_SYMBOL_GPL(bio_integrity_map_user); + /** * bio_integrity_process - Process integrity metadata for a bio * @bio: bio to generate/verify integrity metadata for diff --git a/include/linux/bio.h b/include/linux/bio.h index 41d417ee13499..2b4a0de838ed1 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -324,6 +324,7 @@ enum bip_flags { BIP_CTRL_NOCHECK = 1 << 2, /* disable HBA integrity checking */ BIP_DISK_NOCHECK = 1 << 3, /* disable disk integrity checking */ BIP_IP_CHECKSUM = 1 << 4, /* IP checksum */ + BIP_INTEGRITY_USER = 1 << 5, /* Integrity payload is user address */ }; /* @@ -342,6 +343,7 @@ struct bio_integrity_payload { struct work_struct bip_work; /* I/O completion */ + struct bio_vec *copy_vec; /* for bounce buffering */ struct bio_vec *bip_vec; struct bio_vec bip_inline_vecs[];/* embedded bvec array */ }; @@ -720,6 +722,7 @@ static inline bool bioset_initialized(struct bio_set *bs) extern struct bio_integrity_payload *bio_integrity_alloc(struct bio *, gfp_t, unsigned int); extern int bio_integrity_add_page(struct bio *, struct page *, unsigned int, unsigned int); +extern int bio_integrity_map_user(struct bio *, void __user *, unsigned int, u32); extern bool bio_integrity_prep(struct bio *); extern void bio_integrity_advance(struct bio *, unsigned int); extern void bio_integrity_trim(struct bio *); @@ -789,6 +792,12 @@ static inline int bio_integrity_add_page(struct bio *bio, struct page *page, return 0; } +static inline int bio_integrity_map_user(struct bio *bio, void __user *ubuf, + unsigned int len, u32 seed) +{ + return -EINVAL +} + #endif /* CONFIG_BLK_DEV_INTEGRITY */ /* From patchwork Fri Oct 27 18:19:27 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13438852 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id AEDADC25B48 for ; Fri, 27 Oct 2023 18:20:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232577AbjJ0SUM (ORCPT ); Fri, 27 Oct 2023 14:20:12 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:47968 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231560AbjJ0SUL (ORCPT ); Fri, 27 Oct 2023 14:20:11 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B999AC for ; Fri, 27 Oct 2023 11:20:09 -0700 (PDT) Received: from pps.filterd (m0109333.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39RE5mP5006170 for ; Fri, 27 Oct 2023 11:20:08 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=s2048-2021-q4; bh=CSq7z/DeGgbL8Eb6mDF6n+DrMWa/wzdsV8/4jUrnd9s=; b=XjlBP+86OG63lMz9BOstbk3Ry2YMRlJLPuyq9i0UnHoIsXRxkFGOl+9NIOvWGgetAXK1 fzjBZdS1DC26up9drK5tZV+b6nDFMHFU5DlckytDPMx1+sW0N7zXCUdOYrJAifDOthJ2 MFatJTijkIqYMFB2epiLqlpxLZGicpSUjdGrQKf3KGjsYu3QqgstopFZE1plL86vfESV 4K/udkjqwlEfCzeoHALxeV0EqPObmt+8x6IV7ZmrOOfB9j35kowfxo5RAMqRCRHA0k1e 3QFHFCfQYkfkimU4V/WXE7oHQ1Wnm2Oz5dvm+AYFl6zFTle56em1fEQ5JSRxB66UNqi1 6g== Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3tywry0fu4-14 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Fri, 27 Oct 2023 11:20:08 -0700 Received: from twshared17786.35.frc1.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Fri, 27 Oct 2023 11:19:59 -0700 Received: by devbig007.nao1.facebook.com (Postfix, from userid 544533) id 5CD6920D093C4; Fri, 27 Oct 2023 11:19:50 -0700 (PDT) From: Keith Busch To: , , CC: , , , , Keith Busch Subject: [PATCHv2 2/4] nvme: use bio_integrity_map_user Date: Fri, 27 Oct 2023 11:19:27 -0700 Message-ID: <20231027181929.2589937-3-kbusch@meta.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231027181929.2589937-1-kbusch@meta.com> References: <20231027181929.2589937-1-kbusch@meta.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: mLt81BOf0YXuwNPgqw_D9KyeU9SkqPYV X-Proofpoint-GUID: mLt81BOf0YXuwNPgqw_D9KyeU9SkqPYV X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.987,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-27_17,2023-10-27_01,2023-05-22_02 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Keith Busch Map user metadata buffers directly. Now that the bio tracks the metadata, nvme doesn't need special metadata handling or additional fields in the pdu. Signed-off-by: Keith Busch --- drivers/nvme/host/ioctl.c | 174 ++++++-------------------------------- 1 file changed, 27 insertions(+), 147 deletions(-) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index d8ff796fd5f21..fec64bc14cfea 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -96,52 +96,17 @@ static void __user *nvme_to_user_ptr(uintptr_t ptrval) return (void __user *)ptrval; } -static void *nvme_add_user_metadata(struct request *req, void __user *ubuf, +static int nvme_add_user_metadata(struct request *req, void __user *ubuf, unsigned len, u32 seed) { - struct bio_integrity_payload *bip; - int ret = -ENOMEM; - void *buf; - struct bio *bio = req->bio; - - buf = kmalloc(len, GFP_KERNEL); - if (!buf) - goto out; - - ret = -EFAULT; - if ((req_op(req) == REQ_OP_DRV_OUT) && copy_from_user(buf, ubuf, len)) - goto out_free_meta; - - bip = bio_integrity_alloc(bio, GFP_KERNEL, 1); - if (IS_ERR(bip)) { - ret = PTR_ERR(bip); - goto out_free_meta; - } + int ret; - bip->bip_iter.bi_sector = seed; - ret = bio_integrity_add_page(bio, virt_to_page(buf), len, - offset_in_page(buf)); - if (ret != len) { - ret = -ENOMEM; - goto out_free_meta; - } + ret = bio_integrity_map_user(req->bio, ubuf, len, seed); + if (ret) + return ret; req->cmd_flags |= REQ_INTEGRITY; - return buf; -out_free_meta: - kfree(buf); -out: - return ERR_PTR(ret); -} - -static int nvme_finish_user_metadata(struct request *req, void __user *ubuf, - void *meta, unsigned len, int ret) -{ - if (!ret && req_op(req) == REQ_OP_DRV_IN && - copy_to_user(ubuf, meta, len)) - ret = -EFAULT; - kfree(meta); - return ret; + return 0; } static struct request *nvme_alloc_user_request(struct request_queue *q, @@ -160,14 +125,12 @@ static struct request *nvme_alloc_user_request(struct request_queue *q, static int nvme_map_user_request(struct request *req, u64 ubuffer, unsigned bufflen, void __user *meta_buffer, unsigned meta_len, - u32 meta_seed, void **metap, struct io_uring_cmd *ioucmd, - unsigned int flags) + u32 meta_seed, struct io_uring_cmd *ioucmd, unsigned int flags) { struct request_queue *q = req->q; struct nvme_ns *ns = q->queuedata; struct block_device *bdev = ns ? ns->disk->part0 : NULL; struct bio *bio = NULL; - void *meta = NULL; int ret; if (ioucmd && (ioucmd->flags & IORING_URING_CMD_FIXED)) { @@ -194,13 +157,10 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer, bio_set_dev(bio, bdev); if (bdev && meta_buffer && meta_len) { - meta = nvme_add_user_metadata(req, meta_buffer, meta_len, + ret = nvme_add_user_metadata(req, meta_buffer, meta_len, meta_seed); - if (IS_ERR(meta)) { - ret = PTR_ERR(meta); + if (ret) goto out_unmap; - } - *metap = meta; } return ret; @@ -221,7 +181,6 @@ static int nvme_submit_user_cmd(struct request_queue *q, struct nvme_ns *ns = q->queuedata; struct nvme_ctrl *ctrl; struct request *req; - void *meta = NULL; struct bio *bio; u32 effects; int ret; @@ -233,7 +192,7 @@ static int nvme_submit_user_cmd(struct request_queue *q, req->timeout = timeout; if (ubuffer && bufflen) { ret = nvme_map_user_request(req, ubuffer, bufflen, meta_buffer, - meta_len, meta_seed, &meta, NULL, flags); + meta_len, meta_seed, NULL, flags); if (ret) return ret; } @@ -245,9 +204,6 @@ static int nvme_submit_user_cmd(struct request_queue *q, ret = nvme_execute_rq(req, false); if (result) *result = le64_to_cpu(nvme_req(req)->result.u64); - if (meta) - ret = nvme_finish_user_metadata(req, meta_buffer, meta, - meta_len, ret); if (bio) blk_rq_unmap_user(bio); blk_mq_free_request(req); @@ -442,19 +398,10 @@ struct nvme_uring_data { * Expect build errors if this grows larger than that. */ struct nvme_uring_cmd_pdu { - union { - struct bio *bio; - struct request *req; - }; - u32 meta_len; - u32 nvme_status; - union { - struct { - void *meta; /* kernel-resident buffer */ - void __user *meta_buffer; - }; - u64 result; - } u; + struct request *req; + struct bio *bio; + u64 result; + int status; }; static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( @@ -463,31 +410,6 @@ static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( return (struct nvme_uring_cmd_pdu *)&ioucmd->pdu; } -static void nvme_uring_task_meta_cb(struct io_uring_cmd *ioucmd, - unsigned issue_flags) -{ - struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); - struct request *req = pdu->req; - int status; - u64 result; - - if (nvme_req(req)->flags & NVME_REQ_CANCELLED) - status = -EINTR; - else - status = nvme_req(req)->status; - - result = le64_to_cpu(nvme_req(req)->result.u64); - - if (pdu->meta_len) - status = nvme_finish_user_metadata(req, pdu->u.meta_buffer, - pdu->u.meta, pdu->meta_len, status); - if (req->bio) - blk_rq_unmap_user(req->bio); - blk_mq_free_request(req); - - io_uring_cmd_done(ioucmd, status, result, issue_flags); -} - static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd, unsigned issue_flags) { @@ -495,8 +417,7 @@ static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd, if (pdu->bio) blk_rq_unmap_user(pdu->bio); - - io_uring_cmd_done(ioucmd, pdu->nvme_status, pdu->u.result, issue_flags); + io_uring_cmd_done(ioucmd, pdu->status, pdu->result, issue_flags); } static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, @@ -505,50 +426,24 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, struct io_uring_cmd *ioucmd = req->end_io_data; struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); - req->bio = pdu->bio; if (nvme_req(req)->flags & NVME_REQ_CANCELLED) - pdu->nvme_status = -EINTR; + pdu->status = -EINTR; else - pdu->nvme_status = nvme_req(req)->status; - pdu->u.result = le64_to_cpu(nvme_req(req)->result.u64); + pdu->status = nvme_req(req)->status; + pdu->result = le64_to_cpu(nvme_req(req)->result.u64); /* * For iopoll, complete it directly. * Otherwise, move the completion to task work. */ - if (blk_rq_is_poll(req)) { - WRITE_ONCE(ioucmd->cookie, NULL); + if (blk_rq_is_poll(req)) nvme_uring_task_cb(ioucmd, IO_URING_F_UNLOCKED); - } else { + else io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb); - } return RQ_END_IO_FREE; } -static enum rq_end_io_ret nvme_uring_cmd_end_io_meta(struct request *req, - blk_status_t err) -{ - struct io_uring_cmd *ioucmd = req->end_io_data; - struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); - - req->bio = pdu->bio; - pdu->req = req; - - /* - * For iopoll, complete it directly. - * Otherwise, move the completion to task work. - */ - if (blk_rq_is_poll(req)) { - WRITE_ONCE(ioucmd->cookie, NULL); - nvme_uring_task_meta_cb(ioucmd, IO_URING_F_UNLOCKED); - } else { - io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_meta_cb); - } - - return RQ_END_IO_NONE; -} - static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, struct io_uring_cmd *ioucmd, unsigned int issue_flags, bool vec) { @@ -560,7 +455,6 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, struct request *req; blk_opf_t rq_flags = REQ_ALLOC_CACHE; blk_mq_req_flags_t blk_flags = 0; - void *meta = NULL; int ret; c.common.opcode = READ_ONCE(cmd->opcode); @@ -608,27 +502,17 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, if (d.addr && d.data_len) { ret = nvme_map_user_request(req, d.addr, d.data_len, nvme_to_user_ptr(d.metadata), - d.metadata_len, 0, &meta, ioucmd, vec); + d.metadata_len, 0, ioucmd, vec); if (ret) return ret; } - if (blk_rq_is_poll(req)) { - ioucmd->flags |= IORING_URING_CMD_POLLED; - WRITE_ONCE(ioucmd->cookie, req); - } /* to free bio on completion, as req->bio will be null at that time */ pdu->bio = req->bio; - pdu->meta_len = d.metadata_len; + pdu->req = req; req->end_io_data = ioucmd; - if (pdu->meta_len) { - pdu->u.meta = meta; - pdu->u.meta_buffer = nvme_to_user_ptr(d.metadata); - req->end_io = nvme_uring_cmd_end_io_meta; - } else { - req->end_io = nvme_uring_cmd_end_io; - } + req->end_io = nvme_uring_cmd_end_io; blk_execute_rq_nowait(req, false); return -EIOCBQUEUED; } @@ -779,16 +663,12 @@ int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd, struct io_comp_batch *iob, unsigned int poll_flags) { - struct request *req; - int ret = 0; - - if (!(ioucmd->flags & IORING_URING_CMD_POLLED)) - return 0; + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); + struct request *req = pdu->req; - req = READ_ONCE(ioucmd->cookie); if (req && blk_rq_is_poll(req)) - ret = blk_rq_poll(req, iob, poll_flags); - return ret; + return blk_rq_poll(req, iob, poll_flags); + return 0; } #ifdef CONFIG_NVME_MULTIPATH static int nvme_ns_head_ctrl_ioctl(struct nvme_ns *ns, unsigned int cmd, From patchwork Fri Oct 27 18:19:28 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13438851 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 22EC8C25B6F for ; Fri, 27 Oct 2023 18:20:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232199AbjJ0SUH (ORCPT ); Fri, 27 Oct 2023 14:20:07 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:52694 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232550AbjJ0SUG (ORCPT ); Fri, 27 Oct 2023 14:20:06 -0400 Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0C55218F for ; Fri, 27 Oct 2023 11:20:04 -0700 (PDT) Received: from pps.filterd (m0109331.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39RE5Y2O006162 for ; Fri, 27 Oct 2023 11:20:04 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=s2048-2021-q4; bh=jAGKyWeA8QTcQMk/vfy8VOIt2qAQYrVxiznwPWy89lU=; b=muaMpsn5JwFC8QzqI6QwjpDHnwsPDP3VcazR0bJrhYgTvVu/Y4KL3B4u5y9Np88Xzo55 VFpcmtq5RKRGhn1kA3RIjd7yOF9lPqxN2ckEB5YH++RXDxG6LtUFggYfyCQkZZJtyYW7 T4IXiL2H64qdPfENsG96qWNc9I1UNtfjvtkcF3ObyKoz5LtyKN5XqbzuSbEUKs8ge0Oc WqcteQj1S72NijhUQpyx2hFTjtZZ5jl6U/EJyX3HtflLrigEBvCNKyfFphkO9PDGGBRo QJu+ewZG4Pgq1SUtKblUchWwPjVEyUe/KhDwrTIsNgXvVu5xS1SVFKnfNHWjZIkCZMRD Cw== Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3u0c4pu41c-8 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Fri, 27 Oct 2023 11:20:04 -0700 Received: from twshared10830.02.ash9.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Fri, 27 Oct 2023 11:19:59 -0700 Received: by devbig007.nao1.facebook.com (Postfix, from userid 544533) id 6F25520D093C6; Fri, 27 Oct 2023 11:19:50 -0700 (PDT) From: Keith Busch To: , , CC: , , , , Keith Busch Subject: [PATCHv2 3/4] iouring: remove IORING_URING_CMD_POLLED Date: Fri, 27 Oct 2023 11:19:28 -0700 Message-ID: <20231027181929.2589937-4-kbusch@meta.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231027181929.2589937-1-kbusch@meta.com> References: <20231027181929.2589937-1-kbusch@meta.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: iYNZy2NE31A37c8QR0TP0K7r9Zw9XEVJ X-Proofpoint-ORIG-GUID: iYNZy2NE31A37c8QR0TP0K7r9Zw9XEVJ X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.987,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-27_17,2023-10-27_01,2023-05-22_02 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Keith Busch No more users of this flag. Signed-off-by: Keith Busch --- include/linux/io_uring.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index aefb73eeeebff..fe23bf88f86fa 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -28,7 +28,6 @@ enum io_uring_cmd_flags { /* only top 8 bits of sqe->uring_cmd_flags for kernel internal use */ #define IORING_URING_CMD_CANCELABLE (1U << 30) -#define IORING_URING_CMD_POLLED (1U << 31) struct io_uring_cmd { struct file *file; From patchwork Fri Oct 27 18:19:29 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13438854 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8C401C25B70 for ; Fri, 27 Oct 2023 18:20:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S232615AbjJ0SUR (ORCPT ); Fri, 27 Oct 2023 14:20:17 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:45888 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232600AbjJ0SUQ (ORCPT ); Fri, 27 Oct 2023 14:20:16 -0400 Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 51CAC12A for ; Fri, 27 Oct 2023 11:20:15 -0700 (PDT) Received: from pps.filterd (m0044012.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 39REBfOB006902 for ; Fri, 27 Oct 2023 11:20:15 -0700 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=s2048-2021-q4; bh=2MEE7cRzsGfeMnNn2eY3zHEqVKk2sJ/zEGUF3QOFPSY=; b=KWJuyjy+x/LAlqbRsk/xsVlv18Yc6lOYqXbkvmdkogqh3LaxyJvQCkcCMpkkAL2J+w2q Y9iFu8NYxIpFxKbdlN/zh1dOvERifnzPnRiOfuvrmfvz/LvBbdIoOZWI1YPtk0e7tvEb UMjsjsYWwfD2ByiMzztD6IGABzDeDLrDuddVlCMGKP0OGEuZ69bdc+74ngUA+guITMC+ zUNsvzjzeVvz0gKLGDdPjV4fpUAJdRkUc32RM3g97Jq0JlI16DxZ55mBfY+mjpsjyyBB 4F/QKsDzPFX2BAA2+NGNPcy89nQ7dvnmivsrIgRoNRMl6sQCwHz5ESLi12llCWAB5Dwp TA== Received: from maileast.thefacebook.com ([163.114.130.16]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3u0erd21xs-1 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Fri, 27 Oct 2023 11:20:14 -0700 Received: from twshared9012.02.ash9.facebook.com (2620:10d:c0a8:1c::11) by mail.thefacebook.com (2620:10d:c0a8:83::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Fri, 27 Oct 2023 11:20:02 -0700 Received: by devbig007.nao1.facebook.com (Postfix, from userid 544533) id 837F820D093C8; Fri, 27 Oct 2023 11:19:50 -0700 (PDT) From: Keith Busch To: , , CC: , , , , Keith Busch Subject: [PATCHv2 4/4] io_uring: remove uring_cmd cookie Date: Fri, 27 Oct 2023 11:19:29 -0700 Message-ID: <20231027181929.2589937-5-kbusch@meta.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231027181929.2589937-1-kbusch@meta.com> References: <20231027181929.2589937-1-kbusch@meta.com> MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-ORIG-GUID: z-y7AJETJ8TK-oMoBnPTU8weT6X434oG X-Proofpoint-GUID: z-y7AJETJ8TK-oMoBnPTU8weT6X434oG X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.987,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-10-27_17,2023-10-27_01,2023-05-22_02 Precedence: bulk List-ID: X-Mailing-List: io-uring@vger.kernel.org From: Keith Busch No more users of this field. Signed-off-by: Keith Busch --- include/linux/io_uring.h | 8 ++------ io_uring/uring_cmd.c | 1 - 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index fe23bf88f86fa..9e6ce6d4ab51f 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -32,12 +32,8 @@ enum io_uring_cmd_flags { struct io_uring_cmd { struct file *file; const struct io_uring_sqe *sqe; - union { - /* callback to defer completions to task context */ - void (*task_work_cb)(struct io_uring_cmd *cmd, unsigned); - /* used for polled completion */ - void *cookie; - }; + /* callback to defer completions to task context */ + void (*task_work_cb)(struct io_uring_cmd *cmd, unsigned); u32 cmd_op; u32 flags; u8 pdu[32]; /* available inline for free use */ diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index acbc2924ecd21..b39ec25c36bc3 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -182,7 +182,6 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags) return -EOPNOTSUPP; issue_flags |= IO_URING_F_IOPOLL; req->iopoll_completed = 0; - WRITE_ONCE(ioucmd->cookie, NULL); } ret = file->f_op->uring_cmd(ioucmd, issue_flags);