From patchwork Tue Nov 28 22:27:49 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13471917 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="mi8pGqQj" Received: from mx0b-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8C6B585 for ; Tue, 28 Nov 2023 14:28:17 -0800 (PST) Received: from pps.filterd (m0109332.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3ASLBWeR004321 for ; Tue, 28 Nov 2023 14:28:16 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=s2048-2021-q4; bh=PxbgotqlGCxcrBU94lycmI1HXSfuZutz1KUGtphi19w=; b=mi8pGqQj2iFTFlcKxbwasWP+8O8oI13Nwn4DhBbmBw6o4D8Ko1hb1UUbOMY8JDs3E2AQ ZoyV7ngPftn1oNKOyBuH0haa/BhmGAeGlkUVXJtbI55xjhUIUE58+2W7lX17e+eBKCWL vxtolFuJn+YokWuj6ki7VbewyzN5RSeTYRXVnWnCce2tTV7QmYWlcTxAo9gLEQRpyfVx bBamWHvViAp7huBiihY/lETl42pz5f+3PlnMf8oPcJLPQHpDFtJ15mC351dT6kXLuINf CSDLHYzwaR8BLMIrJKU6LC6iFg07QO9mxp8srVF5Y4ZjvIrFBMztUOhRO+oSrvWM1+sQ Xg== Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3unjktb1xs-20 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 28 Nov 2023 14:28:16 -0800 Received: from twshared15991.38.frc1.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Tue, 28 Nov 2023 14:28:07 -0800 Received: by devbig007.nao1.facebook.com (Postfix, from userid 544533) id 548232252F0C3; Tue, 28 Nov 2023 14:27:53 -0800 (PST) From: Keith Busch To: , , CC: , , , , , Keith Busch Subject: [PATCHv4 1/4] block: bio-integrity: directly map user buffers Date: Tue, 28 Nov 2023 14:27:49 -0800 Message-ID: <20231128222752.1767344-2-kbusch@meta.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231128222752.1767344-1-kbusch@meta.com> References: <20231128222752.1767344-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: Xj5cIiTF_Cv_EsxICWdYWVBKAv2HAF2S X-Proofpoint-ORIG-GUID: Xj5cIiTF_Cv_EsxICWdYWVBKAv2HAF2S X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-11-28_24,2023-11-27_01,2023-05-22_02 From: Keith Busch Passthrough commands that utilize metadata currently need to bounce the user space buffer through the kernel. Add support for mapping user space directly so that we can avoid this costly overhead. This is similar to how the normal bio data payload utilizes user addresses with bio_map_user_iov(). If the user address can't directly be used for reason, like too many segments or address unalignement, fallback to a copy of the user vec while keeping the user address pinned for the IO duration so that it can safely be copied on completion in any process context. Signed-off-by: Keith Busch --- block/bio-integrity.c | 203 ++++++++++++++++++++++++++++++++++++++++++ include/linux/bio.h | 9 ++ 2 files changed, 212 insertions(+) diff --git a/block/bio-integrity.c b/block/bio-integrity.c index ec8ac8cf6e1b9..60dfc0ecf2cf0 100644 --- a/block/bio-integrity.c +++ b/block/bio-integrity.c @@ -91,6 +91,44 @@ struct bio_integrity_payload *bio_integrity_alloc(struct bio *bio, } EXPORT_SYMBOL(bio_integrity_alloc); +static void bio_integrity_uncopy_user(struct bio_integrity_payload *bip, + bool dirty) +{ + unsigned short nr_vecs = --bip->bip_max_vcnt; + struct bio_vec *copy = &bip->bip_vec[1]; + void *buf = bvec_virt(bip->bip_vec); + + if (dirty) { + size_t bytes = bip->bip_iter.bi_size; + struct iov_iter iter; + int ret; + + iov_iter_bvec(&iter, ITER_DEST, copy, nr_vecs, bytes); + ret = copy_to_iter(buf, bytes, &iter); + WARN_ON_ONCE(ret != bytes); + } + + memmove(bip->bip_vec, copy, nr_vecs * sizeof(*copy)); + kfree(buf); +} + +static void bio_integrity_unpin_bvec(struct bio_vec *bv, int nr_vecs) +{ + int i; + + for (i = 0; i < nr_vecs; i++) + unpin_user_page(bv[i].bv_page); +} + +static void bio_integrity_unmap_user(struct bio_integrity_payload *bip) +{ + bool dirty = bio_data_dir(bip->bip_bio) == READ; + + if (bip->bip_flags & BIP_COPY_USER) + bio_integrity_uncopy_user(bip, dirty); + bio_integrity_unpin_bvec(bip->bip_vec, bip->bip_max_vcnt); +} + /** * bio_integrity_free - Free bio integrity payload * @bio: bio containing bip to be freed @@ -105,6 +143,8 @@ void bio_integrity_free(struct bio *bio) if (bip->bip_flags & BIP_BLOCK_INTEGRITY) kfree(bvec_virt(bip->bip_vec)); + else if (bip->bip_flags & BIP_INTEGRITY_USER) + bio_integrity_unmap_user(bip); __bio_integrity_free(bs, bip); bio->bi_integrity = NULL; @@ -160,6 +200,169 @@ int bio_integrity_add_page(struct bio *bio, struct page *page, } EXPORT_SYMBOL(bio_integrity_add_page); +static int bio_integrity_copy_user(struct bio *bio, struct bio_vec *bvec, + int nr_vecs, unsigned int len, + unsigned int direction, u32 seed) +{ + struct bio_integrity_payload *bip; + struct iov_iter iter; + void *buf; + int ret; + + buf = kmalloc(len, GFP_KERNEL); + if (!buf) + return -ENOMEM; + + if (direction == ITER_SOURCE) { + iov_iter_bvec(&iter, direction, bvec, nr_vecs, len); + if (!copy_from_iter_full(buf, len, &iter)) { + ret = -EFAULT; + goto free_buf; + } + } else { + memset(buf, 0, len); + } + + /* + * We need just one vec for this bip, but we also need to preserve the + * original bvec and the number of vecs in it for completion handling + */ + bip = bio_integrity_alloc(bio, GFP_KERNEL, nr_vecs + 1); + if (IS_ERR(bip)) { + ret = PTR_ERR(bip); + goto free_buf; + } + + ret = bio_integrity_add_page(bio, virt_to_page(buf), len, + offset_in_page(buf)); + if (ret != len) { + ret = -ENOMEM; + goto free_bip; + } + + memcpy(&bip->bip_vec[1], bvec, nr_vecs * sizeof(*bvec)); + bip->bip_flags |= BIP_INTEGRITY_USER | BIP_COPY_USER; + bip->bip_iter.bi_sector = seed; + return 0; +free_bip: + bio_integrity_free(bio); +free_buf: + kfree(buf); + return ret; +} + +static int bio_integrity_init_user(struct bio *bio, struct bio_vec *bvec, + int nr_vecs, unsigned int len, u32 seed) +{ + struct bio_integrity_payload *bip; + + bip = bio_integrity_alloc(bio, GFP_KERNEL, nr_vecs); + if (IS_ERR(bip)) + return PTR_ERR(bip); + + memcpy(bip->bip_vec, bvec, nr_vecs * sizeof(*bvec)); + bip->bip_flags |= BIP_INTEGRITY_USER; + bip->bip_iter.bi_sector = seed; + bip->bip_iter.bi_size = len; + return 0; +} + +static unsigned int bvec_from_pages(struct bio_vec *bvec, struct page **pages, + int nr_vecs, ssize_t bytes, ssize_t offset) +{ + unsigned int nr_bvecs = 0; + int i, j; + + for (i = 0; i < nr_vecs; i = j) { + size_t size = min_t(size_t, bytes, PAGE_SIZE - offset); + struct folio *folio = page_folio(pages[i]); + + bytes -= size; + for (j = i + 1; j < nr_vecs; j++) { + size_t next = min_t(size_t, PAGE_SIZE, bytes); + + if (page_folio(pages[j]) != folio || + pages[j] != pages[j - 1] + 1) + break; + unpin_user_page(pages[j]); + size += next; + bytes -= next; + } + + bvec_set_page(&bvec[nr_bvecs], pages[i], size, offset); + offset = 0; + nr_bvecs++; + } + + return nr_bvecs; +} + +int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t bytes, + u32 seed) +{ + struct request_queue *q = bdev_get_queue(bio->bi_bdev); + unsigned int align = q->dma_pad_mask | queue_dma_alignment(q); + struct page *stack_pages[UIO_FASTIOV], **pages = stack_pages; + struct bio_vec stack_vec[UIO_FASTIOV], *bvec = stack_vec; + unsigned int direction, nr_bvecs; + struct iov_iter iter; + int ret, nr_vecs; + size_t offset; + bool copy; + + if (bio_integrity(bio)) + return -EINVAL; + if (bytes >> SECTOR_SHIFT > queue_max_hw_sectors(q)) + return -E2BIG; + + if (bio_data_dir(bio) == READ) + direction = ITER_DEST; + else + direction = ITER_SOURCE; + + iov_iter_ubuf(&iter, direction, ubuf, bytes); + nr_vecs = iov_iter_npages(&iter, BIO_MAX_VECS + 1); + if (nr_vecs > BIO_MAX_VECS) + return -E2BIG; + if (nr_vecs > UIO_FASTIOV) { + bvec = kcalloc(sizeof(*bvec), nr_vecs, GFP_KERNEL); + if (!bvec) + return -ENOMEM; + pages = NULL; + } + + copy = !iov_iter_is_aligned(&iter, align, align); + ret = iov_iter_extract_pages(&iter, &pages, bytes, nr_vecs, 0, &offset); + if (unlikely(ret < 0)) + goto free_bvec; + + nr_bvecs = bvec_from_pages(bvec, pages, nr_vecs, bytes, offset); + if (pages != stack_pages) + kvfree(pages); + if (nr_bvecs > queue_max_integrity_segments(q)) + copy = true; + + if (copy) + ret = bio_integrity_copy_user(bio, bvec, nr_bvecs, bytes, + direction, seed); + else + ret = bio_integrity_init_user(bio, bvec, nr_bvecs, bytes, seed); + if (ret) + goto release_pages; + if (bvec != stack_vec) + kfree(bvec); + + return 0; + +release_pages: + bio_integrity_unpin_bvec(bvec, nr_bvecs); +free_bvec: + if (bvec != stack_vec) + kfree(bvec); + return ret; +} +EXPORT_SYMBOL_GPL(bio_integrity_map_user); + /** * bio_integrity_process - Process integrity metadata for a bio * @bio: bio to generate/verify integrity metadata for diff --git a/include/linux/bio.h b/include/linux/bio.h index 41d417ee13499..ec4db73e5f4ec 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -324,6 +324,8 @@ enum bip_flags { BIP_CTRL_NOCHECK = 1 << 2, /* disable HBA integrity checking */ BIP_DISK_NOCHECK = 1 << 3, /* disable disk integrity checking */ BIP_IP_CHECKSUM = 1 << 4, /* IP checksum */ + BIP_INTEGRITY_USER = 1 << 5, /* Integrity payload is user address */ + BIP_COPY_USER = 1 << 6, /* Kernel bounce buffer in use */ }; /* @@ -718,6 +720,7 @@ static inline bool bioset_initialized(struct bio_set *bs) for_each_bio(_bio) \ bip_for_each_vec(_bvl, _bio->bi_integrity, _iter) +int bio_integrity_map_user(struct bio *bio, void __user *ubuf, ssize_t len, u32 seed); extern struct bio_integrity_payload *bio_integrity_alloc(struct bio *, gfp_t, unsigned int); extern int bio_integrity_add_page(struct bio *, struct page *, unsigned int, unsigned int); extern bool bio_integrity_prep(struct bio *); @@ -789,6 +792,12 @@ static inline int bio_integrity_add_page(struct bio *bio, struct page *page, return 0; } +static inline int bio_integrity_map_user(struct bio *bio, void __user *ubuf, + ssize_t len, u32 seed) +{ + return -EINVAL; +} + #endif /* CONFIG_BLK_DEV_INTEGRITY */ /* From patchwork Tue Nov 28 22:27:50 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13471916 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="mXPaWhGv" Received: from mx0a-00082601.pphosted.com (mx0a-00082601.pphosted.com [67.231.145.42]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 189D410DA for ; Tue, 28 Nov 2023 14:28:19 -0800 (PST) Received: from pps.filterd (m0148461.ppops.net [127.0.0.1]) by mx0a-00082601.pphosted.com (8.17.1.19/8.17.1.19) with ESMTP id 3ASGXAar028213 for ; Tue, 28 Nov 2023 14:28:18 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=s2048-2021-q4; bh=vFnUfQv67t7xbN71IcefHQtSZrJP7dhwSFWfi58914A=; b=mXPaWhGvmt2q0tqVzFmX8cv550MXSSNVLcOpjBN8HUmfqliQoWNVmhWfkr3X/fqiHDu4 gmR84TKQ78jyQ9erB+MFlQZRho9MVcSR7nemp+wRYyLIvv1ZaaURv4m6TYT5yo1LF1Ek f5rH5p7jXrfiwZnH5FQjRuiJ3dWsw+73Qm37FLmLI0KL2KBckNFMwr+IOnpq3WxhNOvi 62L5LMg6/LER/Wl7gaCRIVh7aR26z6SELgwN++9ZN0nq9fDhlrNYgvKWvRFvLoU8Ayn+ O21ngcKdAzxF99SFjrl0zoiAxEpbp4Z3bDzb3rfW0mBa5xpqY6svpTgSutZy7hf1hhBS /w== Received: from mail.thefacebook.com ([163.114.132.120]) by mx0a-00082601.pphosted.com (PPS) with ESMTPS id 3un6j7q3ps-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 28 Nov 2023 14:28:18 -0800 Received: from twshared15991.38.frc1.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:21d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Tue, 28 Nov 2023 14:28:07 -0800 Received: by devbig007.nao1.facebook.com (Postfix, from userid 544533) id 9BC6C2252F0C5; Tue, 28 Nov 2023 14:27:53 -0800 (PST) From: Keith Busch To: , , CC: , , , , , Keith Busch Subject: [PATCHv4 2/4] nvme: use bio_integrity_map_user Date: Tue, 28 Nov 2023 14:27:50 -0800 Message-ID: <20231128222752.1767344-3-kbusch@meta.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231128222752.1767344-1-kbusch@meta.com> References: <20231128222752.1767344-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: WqyrfUK2We45I46tIbcKw4KLBD-7QExA X-Proofpoint-ORIG-GUID: WqyrfUK2We45I46tIbcKw4KLBD-7QExA X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-11-28_24,2023-11-27_01,2023-05-22_02 From: Keith Busch Map user metadata buffers directly. Now that the bio tracks the metadata, nvme doesn't need special metadata handling and tracking with callbacks and additional fields in the pdu. Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch --- drivers/nvme/host/ioctl.c | 197 ++++++-------------------------------- 1 file changed, 29 insertions(+), 168 deletions(-) diff --git a/drivers/nvme/host/ioctl.c b/drivers/nvme/host/ioctl.c index 529b9954d2b8c..32c9bcf491a33 100644 --- a/drivers/nvme/host/ioctl.c +++ b/drivers/nvme/host/ioctl.c @@ -96,58 +96,6 @@ static void __user *nvme_to_user_ptr(uintptr_t ptrval) return (void __user *)ptrval; } -static void *nvme_add_user_metadata(struct request *req, void __user *ubuf, - unsigned len, u32 seed) -{ - struct bio_integrity_payload *bip; - int ret = -ENOMEM; - void *buf; - struct bio *bio = req->bio; - - buf = kmalloc(len, GFP_KERNEL); - if (!buf) - goto out; - - if (req_op(req) == REQ_OP_DRV_OUT) { - ret = -EFAULT; - if (copy_from_user(buf, ubuf, len)) - goto out_free_meta; - } else { - memset(buf, 0, len); - } - - bip = bio_integrity_alloc(bio, GFP_KERNEL, 1); - if (IS_ERR(bip)) { - ret = PTR_ERR(bip); - goto out_free_meta; - } - - bip->bip_iter.bi_sector = seed; - ret = bio_integrity_add_page(bio, virt_to_page(buf), len, - offset_in_page(buf)); - if (ret != len) { - ret = -ENOMEM; - goto out_free_meta; - } - - req->cmd_flags |= REQ_INTEGRITY; - return buf; -out_free_meta: - kfree(buf); -out: - return ERR_PTR(ret); -} - -static int nvme_finish_user_metadata(struct request *req, void __user *ubuf, - void *meta, unsigned len, int ret) -{ - if (!ret && req_op(req) == REQ_OP_DRV_IN && - copy_to_user(ubuf, meta, len)) - ret = -EFAULT; - kfree(meta); - return ret; -} - static struct request *nvme_alloc_user_request(struct request_queue *q, struct nvme_command *cmd, blk_opf_t rq_flags, blk_mq_req_flags_t blk_flags) @@ -164,14 +112,12 @@ static struct request *nvme_alloc_user_request(struct request_queue *q, static int nvme_map_user_request(struct request *req, u64 ubuffer, unsigned bufflen, void __user *meta_buffer, unsigned meta_len, - u32 meta_seed, void **metap, struct io_uring_cmd *ioucmd, - unsigned int flags) + u32 meta_seed, struct io_uring_cmd *ioucmd, unsigned int flags) { struct request_queue *q = req->q; struct nvme_ns *ns = q->queuedata; struct block_device *bdev = ns ? ns->disk->part0 : NULL; struct bio *bio = NULL; - void *meta = NULL; int ret; if (ioucmd && (ioucmd->flags & IORING_URING_CMD_FIXED)) { @@ -193,18 +139,17 @@ static int nvme_map_user_request(struct request *req, u64 ubuffer, if (ret) goto out; + bio = req->bio; - if (bdev) + if (bdev) { bio_set_dev(bio, bdev); - - if (bdev && meta_buffer && meta_len) { - meta = nvme_add_user_metadata(req, meta_buffer, meta_len, - meta_seed); - if (IS_ERR(meta)) { - ret = PTR_ERR(meta); - goto out_unmap; + if (meta_buffer && meta_len) { + ret = bio_integrity_map_user(bio, meta_buffer, meta_len, + meta_seed); + if (ret) + goto out_unmap; + req->cmd_flags |= REQ_INTEGRITY; } - *metap = meta; } return ret; @@ -225,7 +170,6 @@ static int nvme_submit_user_cmd(struct request_queue *q, struct nvme_ns *ns = q->queuedata; struct nvme_ctrl *ctrl; struct request *req; - void *meta = NULL; struct bio *bio; u32 effects; int ret; @@ -237,7 +181,7 @@ static int nvme_submit_user_cmd(struct request_queue *q, req->timeout = timeout; if (ubuffer && bufflen) { ret = nvme_map_user_request(req, ubuffer, bufflen, meta_buffer, - meta_len, meta_seed, &meta, NULL, flags); + meta_len, meta_seed, NULL, flags); if (ret) return ret; } @@ -249,9 +193,6 @@ static int nvme_submit_user_cmd(struct request_queue *q, ret = nvme_execute_rq(req, false); if (result) *result = le64_to_cpu(nvme_req(req)->result.u64); - if (meta) - ret = nvme_finish_user_metadata(req, meta_buffer, meta, - meta_len, ret); if (bio) blk_rq_unmap_user(bio); blk_mq_free_request(req); @@ -446,19 +387,10 @@ struct nvme_uring_data { * Expect build errors if this grows larger than that. */ struct nvme_uring_cmd_pdu { - union { - struct bio *bio; - struct request *req; - }; - u32 meta_len; - u32 nvme_status; - union { - struct { - void *meta; /* kernel-resident buffer */ - void __user *meta_buffer; - }; - u64 result; - } u; + struct request *req; + struct bio *bio; + u64 result; + int status; }; static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( @@ -467,31 +399,6 @@ static inline struct nvme_uring_cmd_pdu *nvme_uring_cmd_pdu( return (struct nvme_uring_cmd_pdu *)&ioucmd->pdu; } -static void nvme_uring_task_meta_cb(struct io_uring_cmd *ioucmd, - unsigned issue_flags) -{ - struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); - struct request *req = pdu->req; - int status; - u64 result; - - if (nvme_req(req)->flags & NVME_REQ_CANCELLED) - status = -EINTR; - else - status = nvme_req(req)->status; - - result = le64_to_cpu(nvme_req(req)->result.u64); - - if (pdu->meta_len) - status = nvme_finish_user_metadata(req, pdu->u.meta_buffer, - pdu->u.meta, pdu->meta_len, status); - if (req->bio) - blk_rq_unmap_user(req->bio); - blk_mq_free_request(req); - - io_uring_cmd_done(ioucmd, status, result, issue_flags); -} - static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd, unsigned issue_flags) { @@ -499,8 +406,7 @@ static void nvme_uring_task_cb(struct io_uring_cmd *ioucmd, if (pdu->bio) blk_rq_unmap_user(pdu->bio); - - io_uring_cmd_done(ioucmd, pdu->nvme_status, pdu->u.result, issue_flags); + io_uring_cmd_done(ioucmd, pdu->status, pdu->result, issue_flags); } static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, @@ -509,53 +415,24 @@ static enum rq_end_io_ret nvme_uring_cmd_end_io(struct request *req, struct io_uring_cmd *ioucmd = req->end_io_data; struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); - req->bio = pdu->bio; - if (nvme_req(req)->flags & NVME_REQ_CANCELLED) { - pdu->nvme_status = -EINTR; - } else { - pdu->nvme_status = nvme_req(req)->status; - if (!pdu->nvme_status) - pdu->nvme_status = blk_status_to_errno(err); - } - pdu->u.result = le64_to_cpu(nvme_req(req)->result.u64); + if (nvme_req(req)->flags & NVME_REQ_CANCELLED) + pdu->status = -EINTR; + else + pdu->status = nvme_req(req)->status; + pdu->result = le64_to_cpu(nvme_req(req)->result.u64); /* * For iopoll, complete it directly. * Otherwise, move the completion to task work. */ - if (blk_rq_is_poll(req)) { - WRITE_ONCE(ioucmd->cookie, NULL); + if (blk_rq_is_poll(req)) nvme_uring_task_cb(ioucmd, IO_URING_F_UNLOCKED); - } else { + else io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_cb); - } return RQ_END_IO_FREE; } -static enum rq_end_io_ret nvme_uring_cmd_end_io_meta(struct request *req, - blk_status_t err) -{ - struct io_uring_cmd *ioucmd = req->end_io_data; - struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); - - req->bio = pdu->bio; - pdu->req = req; - - /* - * For iopoll, complete it directly. - * Otherwise, move the completion to task work. - */ - if (blk_rq_is_poll(req)) { - WRITE_ONCE(ioucmd->cookie, NULL); - nvme_uring_task_meta_cb(ioucmd, IO_URING_F_UNLOCKED); - } else { - io_uring_cmd_do_in_task_lazy(ioucmd, nvme_uring_task_meta_cb); - } - - return RQ_END_IO_NONE; -} - static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, struct io_uring_cmd *ioucmd, unsigned int issue_flags, bool vec) { @@ -567,7 +444,6 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, struct request *req; blk_opf_t rq_flags = REQ_ALLOC_CACHE; blk_mq_req_flags_t blk_flags = 0; - void *meta = NULL; int ret; c.common.opcode = READ_ONCE(cmd->opcode); @@ -615,27 +491,16 @@ static int nvme_uring_cmd_io(struct nvme_ctrl *ctrl, struct nvme_ns *ns, if (d.addr && d.data_len) { ret = nvme_map_user_request(req, d.addr, d.data_len, nvme_to_user_ptr(d.metadata), - d.metadata_len, 0, &meta, ioucmd, vec); + d.metadata_len, 0, ioucmd, vec); if (ret) return ret; } - if (blk_rq_is_poll(req)) { - ioucmd->flags |= IORING_URING_CMD_POLLED; - WRITE_ONCE(ioucmd->cookie, req); - } - /* to free bio on completion, as req->bio will be null at that time */ pdu->bio = req->bio; - pdu->meta_len = d.metadata_len; + pdu->req = req; req->end_io_data = ioucmd; - if (pdu->meta_len) { - pdu->u.meta = meta; - pdu->u.meta_buffer = nvme_to_user_ptr(d.metadata); - req->end_io = nvme_uring_cmd_end_io_meta; - } else { - req->end_io = nvme_uring_cmd_end_io; - } + req->end_io = nvme_uring_cmd_end_io; blk_execute_rq_nowait(req, false); return -EIOCBQUEUED; } @@ -786,16 +651,12 @@ int nvme_ns_chr_uring_cmd_iopoll(struct io_uring_cmd *ioucmd, struct io_comp_batch *iob, unsigned int poll_flags) { - struct request *req; - int ret = 0; - - if (!(ioucmd->flags & IORING_URING_CMD_POLLED)) - return 0; + struct nvme_uring_cmd_pdu *pdu = nvme_uring_cmd_pdu(ioucmd); + struct request *req = pdu->req; - req = READ_ONCE(ioucmd->cookie); if (req && blk_rq_is_poll(req)) - ret = blk_rq_poll(req, iob, poll_flags); - return ret; + return blk_rq_poll(req, iob, poll_flags); + return 0; } #ifdef CONFIG_NVME_MULTIPATH static int nvme_ns_head_ctrl_ioctl(struct nvme_ns *ns, unsigned int cmd, From patchwork Tue Nov 28 22:27:51 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13471913 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="e78lj2Ad" Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AC41A1BF for ; Tue, 28 Nov 2023 14:28:07 -0800 (PST) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.17.1.19/8.17.1.19) with ESMTP id 3ASMGMfl014696 for ; Tue, 28 Nov 2023 14:28:07 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=s2048-2021-q4; bh=gz46DOJDVW8DfYBnHMREJeF0H1ddTB4qso7LZwreJ6o=; b=e78lj2AdR8+aKuGsH8bAe1QDYY7SRTJXK+h4dI89Y7FFvhMmWB2sT7O2It7xNVNTqbac uD0APeHUs/irvjVRBEMwSsTl01RNJM/AA4kJCce/M1OKAV+wuA1IH4weeQIqOUObED5j ujxB6RHOV5Jv4b+gkb97B+9q6Czv4t3UVNXQJEnHRUANa9XOVmMnxEFcL7j3ImAXMD2s E8iQOYp7ol1CeBdrn7EkZVOYknDf8K77WNfLiU2D5neyDE3gIoq29kaC1TXZriQib2gG uBvHP8Z4VX3k8B4SpoEJP1BeVWRx97arZtdc0km5VvF/7CFvmFJNtKCdNzRvc9Mf0LW5 wg== Received: from mail.thefacebook.com ([163.114.132.120]) by m0089730.ppops.net (PPS) with ESMTPS id 3unf81cty6-4 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 28 Nov 2023 14:28:06 -0800 Received: from twshared4634.37.frc1.facebook.com (2620:10d:c085:108::4) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Tue, 28 Nov 2023 14:28:04 -0800 Received: by devbig007.nao1.facebook.com (Postfix, from userid 544533) id 62B152252F0CA; Tue, 28 Nov 2023 14:27:54 -0800 (PST) From: Keith Busch To: , , CC: , , , , , Keith Busch Subject: [PATCHv4 3/4] iouring: remove IORING_URING_CMD_POLLED Date: Tue, 28 Nov 2023 14:27:51 -0800 Message-ID: <20231128222752.1767344-4-kbusch@meta.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231128222752.1767344-1-kbusch@meta.com> References: <20231128222752.1767344-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: DXjeafOrUbioUMpVU_JTB8dM3lkz-eCl X-Proofpoint-ORIG-GUID: DXjeafOrUbioUMpVU_JTB8dM3lkz-eCl X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-11-28_24,2023-11-27_01,2023-05-22_02 From: Keith Busch No more users of this flag. Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch --- include/linux/io_uring.h | 1 - 1 file changed, 1 deletion(-) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index aefb73eeeebff..fe23bf88f86fa 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -28,7 +28,6 @@ enum io_uring_cmd_flags { /* only top 8 bits of sqe->uring_cmd_flags for kernel internal use */ #define IORING_URING_CMD_CANCELABLE (1U << 30) -#define IORING_URING_CMD_POLLED (1U << 31) struct io_uring_cmd { struct file *file; From patchwork Tue Nov 28 22:27:52 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Keith Busch X-Patchwork-Id: 13471915 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=meta.com header.i=@meta.com header.b="kFSWOWNN" Received: from mx0a-00082601.pphosted.com (mx0b-00082601.pphosted.com [67.231.153.30]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0E43D85 for ; Tue, 28 Nov 2023 14:28:11 -0800 (PST) Received: from pps.filterd (m0089730.ppops.net [127.0.0.1]) by m0089730.ppops.net (8.17.1.19/8.17.1.19) with ESMTP id 3ASMGMfu014696 for ; Tue, 28 Nov 2023 14:28:11 -0800 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=meta.com; h=from : to : cc : subject : date : message-id : in-reply-to : references : mime-version : content-transfer-encoding : content-type; s=s2048-2021-q4; bh=CtCSQu7SBwAU5RA4TEPV6ClvLhwUfelUCahho9yo6No=; b=kFSWOWNNg1c2o2i3LDqJ/HRlQmRdwjTjFZqxvar/dMbXbEJfwKcMCi5R1+9Piq9tM6BR ltUQZ/r2SKuLUG7BOT4yhbTSSphf0U2DEIWAu/z1pr+2j5FPpIdrHSJV1PQKo2Mvopsj 6FYcVsfOndIb1BJdPRs4hdNoaIWWCoFIVj/p8npN/u9s2w/wrCKzuCeWFriitqdy67Vh FKRE8sYp/JaxC6moIxgQr5NRF8xnENL/YScEhCyyOHW6EDMg9UhYWQMKT/o0KHsOvFz4 xhq4RWLGmFrVgVCAdKpjYxvCYHNBczGPCIkARNFSIPQequuvuU8vUCZhz5LlRPQ1mlyg xQ== Received: from mail.thefacebook.com ([163.114.132.120]) by m0089730.ppops.net (PPS) with ESMTPS id 3unf81cty6-13 (version=TLSv1.2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128 verify=NOT) for ; Tue, 28 Nov 2023 14:28:11 -0800 Received: from twshared34392.14.frc2.facebook.com (2620:10d:c085:108::8) by mail.thefacebook.com (2620:10d:c085:11d::8) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_128_GCM_SHA256) id 15.1.2507.34; Tue, 28 Nov 2023 14:28:06 -0800 Received: by devbig007.nao1.facebook.com (Postfix, from userid 544533) id 7316C2252F0CC; Tue, 28 Nov 2023 14:27:54 -0800 (PST) From: Keith Busch To: , , CC: , , , , , Keith Busch Subject: [PATCHv4 4/4] io_uring: remove uring_cmd cookie Date: Tue, 28 Nov 2023 14:27:52 -0800 Message-ID: <20231128222752.1767344-5-kbusch@meta.com> X-Mailer: git-send-email 2.34.1 In-Reply-To: <20231128222752.1767344-1-kbusch@meta.com> References: <20231128222752.1767344-1-kbusch@meta.com> Precedence: bulk X-Mailing-List: linux-block@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 X-FB-Internal: Safe X-Proofpoint-GUID: F5zg8FrSDNtPhL7RvQrWbaH03TTszWG6 X-Proofpoint-ORIG-GUID: F5zg8FrSDNtPhL7RvQrWbaH03TTszWG6 X-Proofpoint-Virus-Version: vendor=baseguard engine=ICAP:2.0.272,Aquarius:18.0.997,Hydra:6.0.619,FMLib:17.11.176.26 definitions=2023-11-28_24,2023-11-27_01,2023-05-22_02 From: Keith Busch No more users of this field. Reviewed-by: Christoph Hellwig Signed-off-by: Keith Busch --- include/linux/io_uring.h | 8 ++------ io_uring/uring_cmd.c | 1 - 2 files changed, 2 insertions(+), 7 deletions(-) diff --git a/include/linux/io_uring.h b/include/linux/io_uring.h index fe23bf88f86fa..9e6ce6d4ab51f 100644 --- a/include/linux/io_uring.h +++ b/include/linux/io_uring.h @@ -32,12 +32,8 @@ enum io_uring_cmd_flags { struct io_uring_cmd { struct file *file; const struct io_uring_sqe *sqe; - union { - /* callback to defer completions to task context */ - void (*task_work_cb)(struct io_uring_cmd *cmd, unsigned); - /* used for polled completion */ - void *cookie; - }; + /* callback to defer completions to task context */ + void (*task_work_cb)(struct io_uring_cmd *cmd, unsigned); u32 cmd_op; u32 flags; u8 pdu[32]; /* available inline for free use */ diff --git a/io_uring/uring_cmd.c b/io_uring/uring_cmd.c index acbc2924ecd21..b39ec25c36bc3 100644 --- a/io_uring/uring_cmd.c +++ b/io_uring/uring_cmd.c @@ -182,7 +182,6 @@ int io_uring_cmd(struct io_kiocb *req, unsigned int issue_flags) return -EOPNOTSUPP; issue_flags |= IO_URING_F_IOPOLL; req->iopoll_completed = 0; - WRITE_ONCE(ioucmd->cookie, NULL); } ret = file->f_op->uring_cmd(ioucmd, issue_flags);