From patchwork Wed Aug 14 10:45:54 2024
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Pavel Begunkov
X-Patchwork-Id: 13763319
From: Pavel Begunkov
To: io-uring@vger.kernel.org
Cc: Jens Axboe, asml.silence@gmail.com, Conrad Meyer,
 linux-block@vger.kernel.org, linux-mm@kvack.org
Subject: [RFC 5/5] block: implement io_uring discard cmd
Date: Wed, 14 Aug 2024 11:45:54 +0100
Message-ID: <6ecd7ab3386f63f1656dc766c1b5b038ff5353c2.1723601134.git.asml.silence@gmail.com>
X-Mailer: git-send-email 2.45.2

Add ->uring_cmd callback for block device files and use it to implement
asynchronous discard.
Normally, it first tries to execute the command from a non-blocking
context, which we limit to a single bio because otherwise one of the
sub-bios may need to wait for other bios, and we don't want to deal with
partial IO. If the non-blocking attempt fails, we'll retry it in a
blocking context.

Suggested-by: Conrad Meyer
Signed-off-by: Pavel Begunkov
---
 block/blk.h             |  1 +
 block/fops.c            |  2 +
 block/ioctl.c           | 94 +++++++++++++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h |  2 +
 4 files changed, 99 insertions(+)

diff --git a/block/blk.h b/block/blk.h
index e180863f918b..5178c5ba6852 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -571,6 +571,7 @@ blk_mode_t file_to_blk_mode(struct file *file);
 int truncate_bdev_range(struct block_device *bdev, blk_mode_t mode,
 		loff_t lstart, loff_t lend);
 long blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
+int blkdev_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags);
 long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg);
 
 extern const struct address_space_operations def_blk_aops;
diff --git a/block/fops.c b/block/fops.c
index 9825c1713a49..8154b10b5abf 100644
--- a/block/fops.c
+++ b/block/fops.c
@@ -17,6 +17,7 @@
 #include
 #include
 #include
+#include
 #include "blk.h"
 
 static inline struct inode *bdev_file_inode(struct file *file)
@@ -873,6 +874,7 @@ const struct file_operations def_blk_fops = {
 	.splice_read	= filemap_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.fallocate	= blkdev_fallocate,
+	.uring_cmd	= blkdev_uring_cmd,
 	.fop_flags	= FOP_BUFFER_RASYNC,
 };
 
diff --git a/block/ioctl.c b/block/ioctl.c
index c7a3e6c6f5fa..f7f9c4c6d6b5 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -11,6 +11,8 @@
 #include
 #include
 #include
+#include
+#include
 #include "blk.h"
 
 static int blkpg_do_ioctl(struct block_device *bdev,
@@ -744,4 +746,96 @@ long compat_blkdev_ioctl(struct file *file, unsigned cmd, unsigned long arg)
 
 	return ret;
 }
+
+struct blk_cmd {
+	blk_status_t status;
+	bool nowait;
+};
+
+static void blk_cmd_complete(struct io_uring_cmd *cmd, unsigned int issue_flags)
+{
+	struct blk_cmd *bc = io_uring_cmd_to_pdu(cmd, struct blk_cmd);
+	int res = blk_status_to_errno(bc->status);
+
+	if (res == -EAGAIN && bc->nowait)
+		io_uring_cmd_issue_blocking(cmd);
+	else
+		io_uring_cmd_done(cmd, res, 0, issue_flags);
+}
+
+static void bio_cmd_end(struct bio *bio)
+{
+	struct io_uring_cmd *cmd = bio->bi_private;
+	struct blk_cmd *bc = io_uring_cmd_to_pdu(cmd, struct blk_cmd);
+
+	if (unlikely(bio->bi_status) && !bc->status)
+		bc->status = bio->bi_status;
+
+	io_uring_cmd_do_in_task_lazy(cmd, blk_cmd_complete);
+	bio_put(bio);
+}
+
+static int blkdev_cmd_discard(struct io_uring_cmd *cmd,
+			      struct block_device *bdev,
+			      uint64_t start, uint64_t len, bool nowait)
+{
+	sector_t sector = start >> SECTOR_SHIFT;
+	sector_t nr_sects = len >> SECTOR_SHIFT;
+	struct bio *prev = NULL, *bio;
+	int err;
+
+	err = blk_validate_discard(bdev, file_to_blk_mode(cmd->file),
+				   start, len);
+	if (err)
+		return err;
+	err = filemap_invalidate_pages(bdev->bd_mapping, start,
+					start + len - 1, nowait);
+	if (err)
+		return err;
+
+	while ((bio = blk_alloc_discard_bio(bdev, &sector, &nr_sects,
+					    GFP_KERNEL))) {
+		if (nowait) {
+			if (unlikely(nr_sects)) {
+				bio_put(bio);
+				return -EAGAIN;
+			}
+			bio->bi_opf |= REQ_NOWAIT;
+		}
+		prev = bio_chain_and_submit(prev, bio);
+	}
+	if (!prev)
+		return -EFAULT;
+
+	prev->bi_private = cmd;
+	prev->bi_end_io = bio_cmd_end;
+	submit_bio(prev);
+	return -EIOCBQUEUED;
+}
+
+int blkdev_uring_cmd(struct io_uring_cmd *cmd, unsigned int issue_flags)
+{
+	struct block_device *bdev = I_BDEV(cmd->file->f_mapping->host);
+	struct blk_cmd *bc = io_uring_cmd_to_pdu(cmd, struct blk_cmd);
+	const struct io_uring_sqe *sqe = cmd->sqe;
+	u32 cmd_op = cmd->cmd_op;
+	uint64_t start, len;
+
+	if (unlikely(sqe->ioprio || sqe->__pad1 || sqe->len ||
+		     sqe->rw_flags || sqe->file_index))
+		return -EINVAL;
+
+	bc->status = BLK_STS_OK;
+	bc->nowait = issue_flags & IO_URING_F_NONBLOCK;
+
+	start = READ_ONCE(sqe->addr);
+	len = READ_ONCE(sqe->addr3);
+
+	switch (cmd_op) {
+	case BLOCK_URING_CMD_DISCARD:
+		return blkdev_cmd_discard(cmd, bdev, start, len, bc->nowait);
+	}
+	return -EINVAL;
+}
+
 #endif
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 753971770733..0016e38ed33c 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -208,6 +208,8 @@ struct fsxattr {
  * (see uapi/linux/blkzoned.h)
  */
 
+#define BLOCK_URING_CMD_DISCARD			0
+
 #define BMAP_IOCTL 1		/* obsolete - kept for compatibility */
 #define FIBMAP	   _IO(0x00,1)	/* bmap access */
 #define FIGETBSZ   _IO(0x00,2)	/* get the block size used for bmap */
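
For illustration only, the single-bio rule from the commit message can be
modelled in plain userspace C. This is a rough sketch, not kernel code:
model_discard(), model_issue(), MODEL_SECTOR_SHIFT and the max_bio_sectors
parameter are hypothetical names invented here, and the model assumes a
fixed per-bio sector limit, whereas the real blk_alloc_discard_bio() also
takes discard granularity and alignment into account. It returns 0 where the
patch returns -EIOCBQUEUED.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define MODEL_SECTOR_SHIFT 9	/* 512-byte sectors, like the kernel's SECTOR_SHIFT */

/*
 * Hypothetical userspace model of the bio loop in blkdev_cmd_discard().
 * max_bio_sectors is an assumed per-bio discard limit.
 * Returns 0 when the range was "submitted", -EAGAIN when a non-blocking
 * attempt would need more than one bio, and -EFAULT when no bio was
 * produced at all.
 */
static int model_discard(uint64_t len, uint64_t max_bio_sectors,
			 bool nowait, unsigned int *nr_bios)
{
	uint64_t nr_sects = len >> MODEL_SECTOR_SHIFT;

	assert(max_bio_sectors > 0);
	*nr_bios = 0;

	while (nr_sects) {
		uint64_t chunk = nr_sects < max_bio_sectors ?
				 nr_sects : max_bio_sectors;

		nr_sects -= chunk;
		/* Sectors left over after this bio: bail out when nowait. */
		if (nowait && nr_sects)
			return -EAGAIN;
		(*nr_bios)++;
	}
	return *nr_bios ? 0 : -EFAULT;
}

/* Try the non-blocking path first, then retry once as "blocking". */
static int model_issue(uint64_t len, uint64_t max_bio_sectors,
		       unsigned int *nr_bios, bool *retried)
{
	int ret = model_discard(len, max_bio_sectors, true, nr_bios);

	*retried = (ret == -EAGAIN);
	if (*retried)
		ret = model_discard(len, max_bio_sectors, false, nr_bios);
	return ret;
}
```

With an assumed limit of 2048 sectors per bio, a 1 MiB range fits in one bio
and completes without a retry, while a 4 MiB range fails the non-blocking
attempt and is split into four bios on the blocking retry.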