From patchwork Thu Dec 29 08:12:46 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sarthak Kukreti X-Patchwork-Id: 13083338 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 85C31C4332F for ; Thu, 29 Dec 2022 08:13:20 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233056AbiL2INT (ORCPT ); Thu, 29 Dec 2022 03:13:19 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37712 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231264AbiL2INI (ORCPT ); Thu, 29 Dec 2022 03:13:08 -0500 Received: from mail-pl1-x631.google.com (mail-pl1-x631.google.com [IPv6:2607:f8b0:4864:20::631]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CDEE910550 for ; Thu, 29 Dec 2022 00:13:01 -0800 (PST) Received: by mail-pl1-x631.google.com with SMTP id jn22so18252559plb.13 for ; Thu, 29 Dec 2022 00:13:01 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=LPgg4/E3xwiWPCkuKUsuYriDQInX30Nw0OCXNJZ+bXk=; b=JRQzQRQ/h7n4/IZEyBjWvWRtcDuyVX+Iug4JMnDafdf2iXKCL2eZDsBxuEa5yueYbz GBtzujYQ+rtNV3fFWhrS01wFk9UXHIUawqQpFmfKsUPE+V+W140LW12HtP23wsQOc9TO 1safTRCxSdYVsi2hSVsSmGRIpKPCrzb51h++E= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=LPgg4/E3xwiWPCkuKUsuYriDQInX30Nw0OCXNJZ+bXk=; b=zG8QY5IpcuRCskIeuNKVWcD/drUcdecbavnFOthxr37ydRasAVYtteEn+HaVzv2neM QWEc9xOX5dX5deebrv2Iri6crSo3QCOIW/Mx+cZhqgz24mbjkvcvIoUYRXzp9t7wRP1g bOD065F30ZRmelwpTtWle53pR62BisJ9c3Lgo7+pwtoEtHDje2s1R6y8CyXrFktfBGY9 gmbKdyuW+K/qr7ClOShMCJffjRf6KQIJHzOEXY/ZBYrRnERTDBYSuDrhd+UcklQRqzVA 4h5077xPBFGIU8pAJLkzQn8KVrvI0bNoLa5cI6SIMFca/jqlaQcud/guyYbDrzGF9cwE wJnQ== X-Gm-Message-State: AFqh2kpz5I4p3HtjJsK6gNARec8B5FMfOqCzAvP45tjP2mhn1gz1zELV Lhw+HPCg7OThUbcQUMFEbbgtog== X-Google-Smtp-Source: AMrXdXsfqqYCIpyBiZr/cTv9/CTKLr3m3pQrIjpFfBa2/Eo9DJ18jgamvtyxXSOS/6bH1HpMdN9jKg== X-Received: by 2002:a17:902:b60e:b0:189:89a4:3954 with SMTP id b14-20020a170902b60e00b0018989a43954mr30107986pls.41.1672301581332; Thu, 29 Dec 2022 00:13:01 -0800 (PST) Received: from sarthakkukreti-glaptop.hsd1.ca.comcast.net ([2601:647:4200:b5b0:75ff:1277:3d7b:d67a]) by smtp.gmail.com with ESMTPSA id 12-20020a170902e9cc00b00192820d00d0sm6496325plk.120.2022.12.29.00.12.59 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Dec 2022 00:13:00 -0800 (PST) From: Sarthak Kukreti To: sarthakkukreti@google.com, dm-devel@redhat.com, linux-block@vger.kernel.org, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe , "Michael S. Tsirkin" , Jason Wang , Stefan Hajnoczi , Alasdair Kergon , Mike Snitzer , Christoph Hellwig , Brian Foster , Theodore Ts'o , Andreas Dilger , Bart Van Assche , Daniil Lunev , "Darrick J. Wong" Subject: [PATCH v2 1/7] block: Introduce provisioning primitives Date: Thu, 29 Dec 2022 00:12:46 -0800 Message-Id: <20221229081252.452240-2-sarthakkukreti@chromium.org> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org> References: <20221229081252.452240-1-sarthakkukreti@chromium.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Introduce block request REQ_OP_PROVISION. The intent of this request is to request underlying storage to preallocate disk space for the given block range. Block device that support this capability will export a provision limit within their request queues. Signed-off-by: Sarthak Kukreti --- block/blk-core.c | 5 ++++ block/blk-lib.c | 53 +++++++++++++++++++++++++++++++++++++++ block/blk-merge.c | 18 +++++++++++++ block/blk-settings.c | 19 ++++++++++++++ block/blk-sysfs.c | 8 ++++++ block/bounce.c | 1 + include/linux/bio.h | 6 +++-- include/linux/blk_types.h | 5 +++- include/linux/blkdev.h | 16 ++++++++++++ 9 files changed, 128 insertions(+), 3 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 9321767470dc..30bcabc7dc01 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -123,6 +123,7 @@ static const char *const blk_op_name[] = { REQ_OP_NAME(WRITE_ZEROES), REQ_OP_NAME(DRV_IN), REQ_OP_NAME(DRV_OUT), + REQ_OP_NAME(PROVISION) }; #undef REQ_OP_NAME @@ -785,6 +786,10 @@ void submit_bio_noacct(struct bio *bio) if (!q->limits.max_write_zeroes_sectors) goto not_supported; break; + case REQ_OP_PROVISION: + if (!q->limits.max_provision_sectors) + goto not_supported; + break; default: break; } diff --git a/block/blk-lib.c b/block/blk-lib.c index e59c3069e835..647b6451660b 100644 --- a/block/blk-lib.c +++ b/block/blk-lib.c @@ -343,3 +343,56 @@ int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector, return ret; } EXPORT_SYMBOL(blkdev_issue_secure_erase); + +/** + * blkdev_issue_provision - provision a block range + * @bdev: blockdev to write + * @sector: start sector + * @nr_sects: number of sectors to provision + * @gfp_mask: memory allocation flags (for bio_alloc) + * + * Description: + * Issues a provision request to the block device for the range of sectors. + * For thinly provisioned block devices, this acts as a signal for the + * underlying storage pool to allocate space for this block range. + */ +int blkdev_issue_provision(struct block_device *bdev, sector_t sector, + sector_t nr_sects, gfp_t gfp) +{ + sector_t bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1; + unsigned int max_sectors = bdev_max_provision_sectors(bdev); + struct bio *bio = NULL; + struct blk_plug plug; + int ret = 0; + + if (max_sectors == 0) + return -EOPNOTSUPP; + if ((sector | nr_sects) & bs_mask) + return -EINVAL; + if (bdev_read_only(bdev)) + return -EPERM; + + blk_start_plug(&plug); + for (;;) { + unsigned int req_sects = min_t(sector_t, nr_sects, max_sectors); + + bio = blk_next_bio(bio, bdev, 0, REQ_OP_PROVISION, gfp); + bio->bi_iter.bi_sector = sector; + bio->bi_iter.bi_size = req_sects << SECTOR_SHIFT; + + sector += req_sects; + nr_sects -= req_sects; + if (!nr_sects) { + ret = submit_bio_wait(bio); + if (ret == -EOPNOTSUPP) + ret = 0; + bio_put(bio); + break; + } + cond_resched(); + } + blk_finish_plug(&plug); + + return ret; +} +EXPORT_SYMBOL(blkdev_issue_provision); diff --git a/block/blk-merge.c b/block/blk-merge.c index 35a8f75cc45d..3ab35bb2a333 100644 --- a/block/blk-merge.c +++ b/block/blk-merge.c @@ -158,6 +158,21 @@ static struct bio *bio_split_write_zeroes(struct bio *bio, return bio_split(bio, lim->max_write_zeroes_sectors, GFP_NOIO, bs); } +static struct bio *bio_split_provision(struct bio *bio, + const struct queue_limits *lim, + unsigned *nsegs, struct bio_set *bs) +{ + *nsegs = 0; + + if (!lim->max_provision_sectors) + return NULL; + + if (bio_sectors(bio) <= lim->max_provision_sectors) + return NULL; + + return bio_split(bio, lim->max_provision_sectors, GFP_NOIO, bs); +} + /* * Return the maximum number of sectors from the start of a bio that may be * submitted as a single request to a block device. If enough sectors remain, @@ -355,6 +370,9 @@ struct bio *__bio_split_to_limits(struct bio *bio, case REQ_OP_WRITE_ZEROES: split = bio_split_write_zeroes(bio, lim, nr_segs, bs); break; + case REQ_OP_PROVISION: + split = bio_split_provision(bio, lim, nr_segs, bs); + break; default: split = bio_split_rw(bio, lim, nr_segs, bs, get_max_io_size(bio, lim) << SECTOR_SHIFT); diff --git a/block/blk-settings.c b/block/blk-settings.c index 0477c4d527fe..57d88204fbbe 100644 --- a/block/blk-settings.c +++ b/block/blk-settings.c @@ -58,6 +58,7 @@ void blk_set_default_limits(struct queue_limits *lim) lim->zoned = BLK_ZONED_NONE; lim->zone_write_granularity = 0; lim->dma_alignment = 511; + lim->max_provision_sectors = 0; } /** @@ -81,6 +82,7 @@ void blk_set_stacking_limits(struct queue_limits *lim) lim->max_dev_sectors = UINT_MAX; lim->max_write_zeroes_sectors = UINT_MAX; lim->max_zone_append_sectors = UINT_MAX; + lim->max_provision_sectors = UINT_MAX; } EXPORT_SYMBOL(blk_set_stacking_limits); @@ -202,6 +204,20 @@ void blk_queue_max_write_zeroes_sectors(struct request_queue *q, } EXPORT_SYMBOL(blk_queue_max_write_zeroes_sectors); +/** + * blk_queue_max_provision_sectors - set max sectors for a single provision + * + * @q: the request queue for the device + * @max_provision_sectors: maximum number of sectors to provision per command + **/ + +void blk_queue_max_provision_sectors(struct request_queue *q, + unsigned int max_provision_sectors) +{ + q->limits.max_provision_sectors = max_provision_sectors; +} +EXPORT_SYMBOL(blk_queue_max_provision_sectors); + /** * blk_queue_max_zone_append_sectors - set max sectors for a single zone append * @q: the request queue for the device @@ -572,6 +588,9 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b, t->max_segment_size = min_not_zero(t->max_segment_size, b->max_segment_size); + t->max_provision_sectors = min_not_zero(t->max_provision_sectors, + b->max_provision_sectors); + t->misaligned |= b->misaligned; alignment = queue_limit_alignment_offset(b, start); diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c index 93d9e9c9a6ea..2e678417b302 100644 --- a/block/blk-sysfs.c +++ b/block/blk-sysfs.c @@ -131,6 +131,12 @@ static ssize_t queue_max_discard_segments_show(struct request_queue *q, return queue_var_show(queue_max_discard_segments(q), page); } +static ssize_t queue_max_provision_sectors_show(struct request_queue *q, + char *page) +{ + return queue_var_show(queue_max_provision_sectors(q), (page)); +} + static ssize_t queue_max_integrity_segments_show(struct request_queue *q, char *page) { return queue_var_show(q->limits.max_integrity_segments, page); @@ -589,6 +595,7 @@ QUEUE_RO_ENTRY(queue_io_min, "minimum_io_size"); QUEUE_RO_ENTRY(queue_io_opt, "optimal_io_size"); QUEUE_RO_ENTRY(queue_max_discard_segments, "max_discard_segments"); +QUEUE_RO_ENTRY(queue_max_provision_sectors, "max_provision_sectors"); QUEUE_RO_ENTRY(queue_discard_granularity, "discard_granularity"); QUEUE_RO_ENTRY(queue_discard_max_hw, "discard_max_hw_bytes"); QUEUE_RW_ENTRY(queue_discard_max, "discard_max_bytes"); @@ -638,6 +645,7 @@ static struct attribute *queue_attrs[] = { &queue_max_sectors_entry.attr, &queue_max_segments_entry.attr, &queue_max_discard_segments_entry.attr, + &queue_max_provision_sectors_entry.attr, &queue_max_integrity_segments_entry.attr, &queue_max_segment_size_entry.attr, &elv_iosched_entry.attr, diff --git a/block/bounce.c b/block/bounce.c index 7cfcb242f9a1..ab9d8723ae64 100644 --- a/block/bounce.c +++ b/block/bounce.c @@ -176,6 +176,7 @@ static struct bio *bounce_clone_bio(struct bio *bio_src) case REQ_OP_DISCARD: case REQ_OP_SECURE_ERASE: case REQ_OP_WRITE_ZEROES: + case REQ_OP_PROVISION: break; default: bio_for_each_segment(bv, bio_src, iter) diff --git a/include/linux/bio.h b/include/linux/bio.h index 22078a28d7cb..5025af105b7c 100644 --- a/include/linux/bio.h +++ b/include/linux/bio.h @@ -55,7 +55,8 @@ static inline bool bio_has_data(struct bio *bio) bio->bi_iter.bi_size && bio_op(bio) != REQ_OP_DISCARD && bio_op(bio) != REQ_OP_SECURE_ERASE && - bio_op(bio) != REQ_OP_WRITE_ZEROES) + bio_op(bio) != REQ_OP_WRITE_ZEROES && + bio_op(bio) != REQ_OP_PROVISION) return true; return false; @@ -65,7 +66,8 @@ static inline bool bio_no_advance_iter(const struct bio *bio) { return bio_op(bio) == REQ_OP_DISCARD || bio_op(bio) == REQ_OP_SECURE_ERASE || - bio_op(bio) == REQ_OP_WRITE_ZEROES; + bio_op(bio) == REQ_OP_WRITE_ZEROES || + bio_op(bio) == REQ_OP_PROVISION; } static inline void *bio_data(struct bio *bio) diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h index 99be590f952f..27bdf88f541c 100644 --- a/include/linux/blk_types.h +++ b/include/linux/blk_types.h @@ -385,7 +385,10 @@ enum req_op { REQ_OP_DRV_IN = (__force blk_opf_t)34, REQ_OP_DRV_OUT = (__force blk_opf_t)35, - REQ_OP_LAST = (__force blk_opf_t)36, + /* request device to provision block */ + REQ_OP_PROVISION = (__force blk_opf_t)37, + + REQ_OP_LAST = (__force blk_opf_t)38, }; enum req_flag_bits { diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index 301cf1cf4f2f..f1abc7b43e25 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -302,6 +302,7 @@ struct queue_limits { unsigned int discard_granularity; unsigned int discard_alignment; unsigned int zone_write_granularity; + unsigned int max_provision_sectors; unsigned short max_segments; unsigned short max_integrity_segments; @@ -918,6 +919,8 @@ extern void blk_queue_max_discard_sectors(struct request_queue *q, unsigned int max_discard_sectors); extern void blk_queue_max_write_zeroes_sectors(struct request_queue *q, unsigned int max_write_same_sectors); +extern void blk_queue_max_provision_sectors(struct request_queue *q, + unsigned int max_provision_sectors); extern void blk_queue_logical_block_size(struct request_queue *, unsigned int); extern void blk_queue_max_zone_append_sectors(struct request_queue *q, unsigned int max_zone_append_sectors); @@ -1057,6 +1060,9 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector, int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector, sector_t nr_sects, gfp_t gfp); +extern int blkdev_issue_provision(struct block_device *bdev, sector_t sector, + sector_t nr_sects, gfp_t gfp_mask); + #define BLKDEV_ZERO_NOUNMAP (1 << 0) /* do not free blocks */ #define BLKDEV_ZERO_NOFALLBACK (1 << 1) /* don't write explicit zeroes */ @@ -1135,6 +1141,11 @@ static inline unsigned short queue_max_discard_segments(const struct request_que return q->limits.max_discard_segments; } +static inline unsigned short queue_max_provision_sectors(const struct request_queue *q) +{ + return q->limits.max_provision_sectors; +} + static inline unsigned int queue_max_segment_size(const struct request_queue *q) { return q->limits.max_segment_size; @@ -1271,6 +1282,11 @@ static inline bool bdev_nowait(struct block_device *bdev) return test_bit(QUEUE_FLAG_NOWAIT, &bdev_get_queue(bdev)->queue_flags); } +static inline unsigned int bdev_max_provision_sectors(struct block_device *bdev) +{ + return bdev_get_queue(bdev)->limits.max_provision_sectors; +} + static inline enum blk_zoned_model bdev_zoned_model(struct block_device *bdev) { struct request_queue *q = bdev_get_queue(bdev); From patchwork Thu Dec 29 08:12:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sarthak Kukreti X-Patchwork-Id: 13083340 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 64702C4332F for ; Thu, 29 Dec 2022 08:13:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233125AbiL2INb (ORCPT ); Thu, 29 Dec 2022 03:13:31 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37928 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233043AbiL2INT (ORCPT ); Thu, 29 Dec 2022 03:13:19 -0500 Received: from mail-pj1-x102c.google.com (mail-pj1-x102c.google.com [IPv6:2607:f8b0:4864:20::102c]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 703A4B48F for ; Thu, 29 Dec 2022 00:13:04 -0800 (PST) Received: by mail-pj1-x102c.google.com with SMTP id o31-20020a17090a0a2200b00223fedffb30so18285398pjo.3 for ; Thu, 29 Dec 2022 00:13:04 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=ZGTS43jY2w8jm4acf0IFMpzxibmXYw0wZeBsvWcE3NI=; b=ESIIlie7g6zZ/ES0wwvkxs/yh+YarKylqT2dcIkLBDXtSdlLAhLNtDyTEOYOzqRvTt yvleBhwtt6piRkZ1AbUwgXVyMufcEl99TQn0HFQM5kyqn7av3cLbbu+S49Ahav6yLl0l 33VPN8E+sneh7QckXjjA0GLOp5T4gOlao+cPk= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=ZGTS43jY2w8jm4acf0IFMpzxibmXYw0wZeBsvWcE3NI=; b=fvU1qvS7nunkX4y6uhP2dGnClw5VW44gJ61TnmnQQYK2YdH40sZQxUAySOTKlwueyB vkYqPuXbCiKTkn8ZGNh7wFKJiEAEgmY7hS9marKh2c16V8bHqCuc52HLFnusNCQgSfRa khMn5b50CpmluVmzwFBUzoDKiNWu+B2PqdTicCAZPhFhsaFKgdFzQI0O+v4wtvahXYa1 selea3brl+wj/Y1KhgcS2MIur9m4z8m/XksfxiKhq6Ix8uJSvwQF8/DoPdxnQQQS7tDP 3Fgrk1O5a6dJfz/vmOb4udIqPRnYZt84Io4TZI8NsFQA8wc8C9k7DKgM0JsmW09JJcns 5nAQ== X-Gm-Message-State: AFqh2kp++ppVkpaWjhnAHiK1xglLJB3gZ3quRLp8ZyFbWuQ+GPdz2hIz pfGhKfR21XGmO6CJ8CJAaLchsw== X-Google-Smtp-Source: AMrXdXtCqORihZF9SwuWUqetXWe9iVm3jlR8L/IVs3risC7ggBVbqn40CWo00UuzV3AXBqtcxJtPjg== X-Received: by 2002:a17:902:f08a:b0:189:c19a:2cd9 with SMTP id p10-20020a170902f08a00b00189c19a2cd9mr27993581pla.25.1672301583922; Thu, 29 Dec 2022 00:13:03 -0800 (PST) Received: from sarthakkukreti-glaptop.hsd1.ca.comcast.net ([2601:647:4200:b5b0:75ff:1277:3d7b:d67a]) by smtp.gmail.com with ESMTPSA id 12-20020a170902e9cc00b00192820d00d0sm6496325plk.120.2022.12.29.00.13.01 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Dec 2022 00:13:03 -0800 (PST) From: Sarthak Kukreti To: sarthakkukreti@google.com, dm-devel@redhat.com, linux-block@vger.kernel.org, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe , "Michael S. Tsirkin" , Jason Wang , Stefan Hajnoczi , Alasdair Kergon , Mike Snitzer , Christoph Hellwig , Brian Foster , Theodore Ts'o , Andreas Dilger , Bart Van Assche , Daniil Lunev , "Darrick J. Wong" Subject: [PATCH v2 2/7] dm: Add support for block provisioning Date: Thu, 29 Dec 2022 00:12:47 -0800 Message-Id: <20221229081252.452240-3-sarthakkukreti@chromium.org> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org> References: <20221229081252.452240-1-sarthakkukreti@chromium.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add support to dm devices for REQ_OP_PROVISION. The default mode is to pass through the request and dm-thin will utilize it to provision blocks. Signed-off-by: Sarthak Kukreti --- drivers/md/dm-crypt.c | 4 +- drivers/md/dm-linear.c | 1 + drivers/md/dm-snap.c | 7 +++ drivers/md/dm-table.c | 25 ++++++++++ drivers/md/dm-thin.c | 90 ++++++++++++++++++++++++++++++++++- drivers/md/dm.c | 4 ++ include/linux/device-mapper.h | 11 +++++ 7 files changed, 139 insertions(+), 3 deletions(-) diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c index 2653516bcdef..7089a414c3d1 100644 --- a/drivers/md/dm-crypt.c +++ b/drivers/md/dm-crypt.c @@ -3081,6 +3081,8 @@ static int crypt_ctr_optional(struct dm_target *ti, unsigned int argc, char **ar if (ret) return ret; + ti->num_provision_bios = 1; + while (opt_params--) { opt_string = dm_shift_arg(&as); if (!opt_string) { @@ -3384,7 +3386,7 @@ static int crypt_map(struct dm_target *ti, struct bio *bio) * - for REQ_OP_DISCARD caller must use flush if IO ordering matters */ if (unlikely(bio->bi_opf & REQ_PREFLUSH || - bio_op(bio) == REQ_OP_DISCARD)) { + bio_op(bio) == REQ_OP_DISCARD || bio_op(bio) == REQ_OP_PROVISION)) { bio_set_dev(bio, cc->dev->bdev); if (bio_sectors(bio)) bio->bi_iter.bi_sector = cc->start + diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c index 3212ef6aa81b..1aa782149428 100644 --- a/drivers/md/dm-linear.c +++ b/drivers/md/dm-linear.c @@ -61,6 +61,7 @@ static int linear_ctr(struct dm_target *ti, unsigned int argc, char **argv) ti->num_discard_bios = 1; ti->num_secure_erase_bios = 1; ti->num_write_zeroes_bios = 1; + ti->num_provision_bios = 1; ti->private = lc; return 0; diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c index d1c2f84d27e3..d4d2599e3620 100644 --- a/drivers/md/dm-snap.c +++ b/drivers/md/dm-snap.c @@ -1357,6 +1357,7 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv) if (s->discard_zeroes_cow) ti->num_discard_bios = (s->discard_passdown_origin ? 2 : 1); ti->per_io_data_size = sizeof(struct dm_snap_tracked_chunk); + ti->num_provision_bios = 1; /* Add snapshot to the list of snapshots for this origin */ /* Exceptions aren't triggered till snapshot_resume() is called */ @@ -2001,6 +2002,11 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio) /* If the block is already remapped - use that, else remap it */ e = dm_lookup_exception(&s->complete, chunk); if (e) { + if (unlikely(bio_op(bio) == REQ_OP_PROVISION)) { + bio_endio(bio); + r = DM_MAPIO_SUBMITTED; + goto out_unlock; + } remap_exception(s, e, bio, chunk); if (unlikely(bio_op(bio) == REQ_OP_DISCARD) && io_overlaps_chunk(s, bio)) { @@ -2414,6 +2420,7 @@ static void snapshot_io_hints(struct dm_target *ti, struct queue_limits *limits) /* All discards are split on chunk_size boundary */ limits->discard_granularity = snap->store->chunk_size; limits->max_discard_sectors = snap->store->chunk_size; + limits->max_provision_sectors = snap->store->chunk_size; up_read(&_origins_lock); } diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c index 8541d5688f3a..35f8d670935e 100644 --- a/drivers/md/dm-table.c +++ b/drivers/md/dm-table.c @@ -1853,6 +1853,26 @@ static bool dm_table_supports_write_zeroes(struct dm_table *t) return true; } +static int device_provision_capable(struct dm_target *ti, struct dm_dev *dev, + sector_t start, sector_t len, void *data) +{ + return !bdev_max_provision_sectors(dev->bdev); +} + +static bool dm_table_supports_provision(struct dm_table *t) +{ + for (unsigned int i = 0; i < t->num_targets; i++) { + struct dm_target *ti = dm_table_get_target(t, i); + + if (ti->provision_supported || + (ti->type->iterate_devices && + ti->type->iterate_devices(ti, device_provision_capable, NULL))) + return true; + } + + return false; +} + static int device_not_nowait_capable(struct dm_target *ti, struct dm_dev *dev, sector_t start, sector_t len, void *data) { @@ -1987,6 +2007,11 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q, if (!dm_table_supports_write_zeroes(t)) q->limits.max_write_zeroes_sectors = 0; + if (dm_table_supports_provision(t)) + blk_queue_max_provision_sectors(q, UINT_MAX >> 9); + else + q->limits.max_provision_sectors = 0; + dm_table_verify_integrity(t); /* diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c index 64cfcf46881d..ab3f1abfabaf 100644 --- a/drivers/md/dm-thin.c +++ b/drivers/md/dm-thin.c @@ -1012,6 +1012,14 @@ static void process_prepared_mapping(struct dm_thin_new_mapping *m) goto out; } + /* For provision requests, return once the prepared block has been inserted + * into the mapping btree. + */ + if (bio && bio_op(bio) == REQ_OP_PROVISION) { + bio_endio(bio); + goto out; + } + /* * Release any bios held while the block was being provisioned. * If we are processing a write bio that completely covers the block, @@ -1239,7 +1247,7 @@ static int io_overlaps_block(struct pool *pool, struct bio *bio) static int io_overwrites_block(struct pool *pool, struct bio *bio) { - return (bio_data_dir(bio) == WRITE) && + return (bio_data_dir(bio) == WRITE) && bio_op(bio) != REQ_OP_PROVISION && io_overlaps_block(pool, bio); } @@ -1388,6 +1396,10 @@ static void schedule_zero(struct thin_c *tc, dm_block_t virt_block, m->data_block = data_block; m->cell = cell; + /* Provision requests are chained on the original bio. */ + if (bio && bio_op(bio) == REQ_OP_PROVISION) + m->bio = bio; + /* * If the whole block of data is being overwritten or we are not * zeroing pre-existing data, we can issue the bio immediately. @@ -1980,6 +1992,70 @@ static void process_cell(struct thin_c *tc, struct dm_bio_prison_cell *cell) } } +static void process_provision_cell(struct thin_c *tc, struct dm_bio_prison_cell *cell) +{ + int r; + struct pool *pool = tc->pool; + struct bio *bio = cell->holder; + dm_block_t begin, end; + struct dm_thin_lookup_result lookup_result; + + if (tc->requeue_mode) { + cell_requeue(pool, cell); + return; + } + + get_bio_block_range(tc, bio, &begin, &end); + + while (begin != end) { + r = ensure_next_mapping(pool); + if (r) + /* we did our best */ + return; + + r = dm_thin_find_block(tc->td, begin, 1, &lookup_result); + switch (r) { + case 0: + begin++; + break; + case -ENODATA: + bio_inc_remaining(bio); + provision_block(tc, bio, begin, cell); + begin++; + break; + default: + DMERR_LIMIT( + "%s: dm_thin_find_block() failed: error = %d", + __func__, r); + cell_defer_no_holder(tc, cell); + bio_io_error(bio); + begin++; + break; + } + } + bio_endio(bio); + cell_defer_no_holder(tc, cell); +} + +static void process_provision_bio(struct thin_c *tc, struct bio *bio) +{ + dm_block_t begin, end; + struct dm_cell_key virt_key; + struct dm_bio_prison_cell *virt_cell; + + get_bio_block_range(tc, bio, &begin, &end); + if (begin == end) { + bio_endio(bio); + return; + } + + build_key(tc->td, VIRTUAL, begin, end, &virt_key); + if (bio_detain(tc->pool, &virt_key, bio, &virt_cell)) + return; + + process_provision_cell(tc, virt_cell); +} + static void process_bio(struct thin_c *tc, struct bio *bio) { struct pool *pool = tc->pool; @@ -2200,6 +2276,8 @@ static void process_thin_deferred_bios(struct thin_c *tc) if (bio_op(bio) == REQ_OP_DISCARD) pool->process_discard(tc, bio); + else if (bio_op(bio) == REQ_OP_PROVISION) + process_provision_bio(tc, bio); else pool->process_bio(tc, bio); @@ -2716,7 +2794,8 @@ static int thin_bio_map(struct dm_target *ti, struct bio *bio) return DM_MAPIO_SUBMITTED; } - if (op_is_flush(bio->bi_opf) || bio_op(bio) == REQ_OP_DISCARD) { + if (op_is_flush(bio->bi_opf) || bio_op(bio) == REQ_OP_DISCARD || + bio_op(bio) == REQ_OP_PROVISION) { thin_defer_bio_with_throttle(tc, bio); return DM_MAPIO_SUBMITTED; } @@ -3355,6 +3434,8 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv) pt->low_water_blocks = low_water_blocks; pt->adjusted_pf = pt->requested_pf = pf; ti->num_flush_bios = 1; + ti->num_provision_bios = 1; + ti->provision_supported = true; /* * Only need to enable discards if the pool should pass @@ -4053,6 +4134,7 @@ static void pool_io_hints(struct dm_target *ti, struct queue_limits *limits) blk_limits_io_opt(limits, pool->sectors_per_block << SECTOR_SHIFT); } + /* * pt->adjusted_pf is a staging area for the actual features to use. * They get transferred to the live pool in bind_control_target() @@ -4243,6 +4325,9 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv) ti->num_discard_bios = 1; } + ti->num_provision_bios = 1; + ti->provision_supported = true; + mutex_unlock(&dm_thin_pool_table.mutex); spin_lock_irq(&tc->pool->lock); @@ -4457,6 +4542,7 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits) limits->discard_granularity = pool->sectors_per_block << SECTOR_SHIFT; limits->max_discard_sectors = 2048 * 1024 * 16; /* 16G */ + limits->max_provision_sectors = 2048 * 1024 * 16; /* 16G */ } static struct target_type thin_target = { diff --git a/drivers/md/dm.c b/drivers/md/dm.c index e1ea3a7bd9d9..4d19bae9da4a 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -1587,6 +1587,7 @@ static bool is_abnormal_io(struct bio *bio) case REQ_OP_DISCARD: case REQ_OP_SECURE_ERASE: case REQ_OP_WRITE_ZEROES: + case REQ_OP_PROVISION: return true; default: break; @@ -1611,6 +1612,9 @@ static blk_status_t __process_abnormal_io(struct clone_info *ci, case REQ_OP_WRITE_ZEROES: num_bios = ti->num_write_zeroes_bios; break; + case REQ_OP_PROVISION: + num_bios = ti->num_provision_bios; + break; default: break; } diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h index 04c6acf7faaa..b4d97d5d75b8 100644 --- a/include/linux/device-mapper.h +++ b/include/linux/device-mapper.h @@ -333,6 +333,12 @@ struct dm_target { */ unsigned num_write_zeroes_bios; + /* + * The number of PROVISION bios that will be submitted to the target. + * The bio number can be accessed with dm_bio_get_target_bio_nr. + */ + unsigned num_provision_bios; + /* * The minimum number of extra bytes allocated in each io for the * target to use. @@ -357,6 +363,11 @@ struct dm_target { */ bool discards_supported:1; + /* Set if this target needs to receive provision requests regardless of + * whether or not its underlying devices have support. + */ + bool provision_supported:1; + /* * Set if we need to limit the number of in-flight bios when swapping. */ From patchwork Thu Dec 29 08:12:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sarthak Kukreti X-Patchwork-Id: 13083339 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A348AC4708D for ; Thu, 29 Dec 2022 08:13:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233117AbiL2INV (ORCPT ); Thu, 29 Dec 2022 03:13:21 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37922 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S232995AbiL2INT (ORCPT ); Thu, 29 Dec 2022 03:13:19 -0500 Received: from mail-pj1-x102a.google.com (mail-pj1-x102a.google.com [IPv6:2607:f8b0:4864:20::102a]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B353111A10 for ; Thu, 29 Dec 2022 00:13:06 -0800 (PST) Received: by mail-pj1-x102a.google.com with SMTP id gv5-20020a17090b11c500b00223f01c73c3so17979529pjb.0 for ; Thu, 29 Dec 2022 00:13:06 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=97i6P/h7b+1gvIVRYO+DX34CDfR4Yqv0wh8jpsVN6uA=; b=J3UxtnO0lRODuALIZIbXFjggUs9csqzYnLWy6fCUhVcEssVoXOgvKDKzSKMFzlyrjx IAcfKrnfFQdpsDZbxkBzRb3HsHRv4Zx0R+LRjDqk5C1MorbFDuEtB4dQo6e2wnJcboJU lOoPqaeq28xjwQUyT3StdCsVbKKsSKUNoIs8A= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=97i6P/h7b+1gvIVRYO+DX34CDfR4Yqv0wh8jpsVN6uA=; b=1gRjt3Xs0VC5nI3vmMmlPT361TGMLcAUc8FeTLU7uGY5jvqldC413TDk5awGI77SLH J/zaEa/cYhCvpPjOU2YUPjMciaFnRRSXUVRq94SCiLZOoDSzu5jU1bFDrwdr/WIxOviO I0UQfiq16hkfDdSpw+nZTUVK01rj5ySmQaH8CztqdPVHG6vrZbxj4HRTxCETgdeyD9Ld Jj5R3tVGbi9DKyzNN2/dRq1+JGSQzpAe2trjeTOzNiCL3+DMZv8UgP5I0/zYIXVjd0Jd OJiOMvDvjUanKvsLvvaPe5MoGZaDlDIPs8cBZ4+5G5WRW0VnuSSEmPHPJlKNbKIPB+2/ Yy/g== X-Gm-Message-State: AFqh2kpn75yS0zuOAYGxr4BQkrlywSBNYFe+DgTX9vqeEcwRKPfZ0I+8 fvwkAAwb3s+lrD1QpmzDwude3g== X-Google-Smtp-Source: AMrXdXuG8djGml0FmIoeHtIQvCDtglHyJ1RnAJ0w5NvdgLWhnB8c6w6P+S+xrI2iEBZecvFqN4EojA== X-Received: by 2002:a17:902:9687:b0:192:9ab2:fd1c with SMTP id n7-20020a170902968700b001929ab2fd1cmr2910136plp.26.1672301586179; Thu, 29 Dec 2022 00:13:06 -0800 (PST) Received: from sarthakkukreti-glaptop.hsd1.ca.comcast.net ([2601:647:4200:b5b0:75ff:1277:3d7b:d67a]) by smtp.gmail.com with ESMTPSA id 12-20020a170902e9cc00b00192820d00d0sm6496325plk.120.2022.12.29.00.13.04 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Dec 2022 00:13:05 -0800 (PST) From: Sarthak Kukreti To: sarthakkukreti@google.com, dm-devel@redhat.com, linux-block@vger.kernel.org, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe , "Michael S. Tsirkin" , Jason Wang , Stefan Hajnoczi , Alasdair Kergon , Mike Snitzer , Christoph Hellwig , Brian Foster , Theodore Ts'o , Andreas Dilger , Bart Van Assche , Daniil Lunev , "Darrick J. Wong" Subject: [PATCH v2 3/7] fs: Introduce FALLOC_FL_PROVISION Date: Thu, 29 Dec 2022 00:12:48 -0800 Message-Id: <20221229081252.452240-4-sarthakkukreti@chromium.org> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org> References: <20221229081252.452240-1-sarthakkukreti@chromium.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org FALLOC_FL_PROVISION is a new fallocate() allocation mode that sends a hint to (supported) thinly provisioned block devices to allocate space for the given range of sectors via REQ_OP_PROVISION. The man pages for both fallocate(2) and posix_fallocate(3) describe the default allocation mode as: ``` The default operation (i.e., mode is zero) of fallocate() allocates the disk space within the range specified by offset and len. ... subsequent writes to bytes in the specified range are guaranteed not to fail because of lack of disk space. ``` For thinly provisioned storage constructs (dm-thin, filesystems on sparse files), the term 'disk space' is overloaded and can either mean the apparent disk space in the filesystem/thin logical volume or the true disk space that will be utilized on the underlying non-sparse allocation layer. The use of a separate mode allows us to cleanly disambiguate whether fallocate() causes allocation only at the current layer (default mode) or whether it propagates allocations to underlying layers (provision mode) for thinly provisioned filesystems/ block devices. For devices that do not support REQ_OP_PROVISION, both these allocation modes will be equivalent. Given the performance cost of sending provision requests to the underlying layers, keeping the default mode as-is allows users to preserve existing behavior. Signed-off-by: Sarthak Kukreti --- block/fops.c | 15 +++++++++++---- include/linux/falloc.h | 3 ++- include/uapi/linux/falloc.h | 8 ++++++++ 3 files changed, 21 insertions(+), 5 deletions(-) diff --git a/block/fops.c b/block/fops.c index 50d245e8c913..01bde561e1e2 100644 --- a/block/fops.c +++ b/block/fops.c @@ -598,7 +598,8 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to) #define BLKDEV_FALLOC_FL_SUPPORTED \ (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE | \ - FALLOC_FL_ZERO_RANGE | FALLOC_FL_NO_HIDE_STALE) + FALLOC_FL_ZERO_RANGE | FALLOC_FL_NO_HIDE_STALE | \ + FALLOC_FL_PROVISION) static long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len) @@ -634,9 +635,11 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, filemap_invalidate_lock(inode->i_mapping); /* Invalidate the page cache, including dirty pages. */ - error = truncate_bdev_range(bdev, file->f_mode, start, end); - if (error) - goto fail; + if (mode != FALLOC_FL_PROVISION) { + error = truncate_bdev_range(bdev, file->f_mode, start, end); + if (error) + goto fail; + } switch (mode) { case FALLOC_FL_ZERO_RANGE: @@ -654,6 +657,10 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, error = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT, len >> SECTOR_SHIFT, GFP_KERNEL); break; + case FALLOC_FL_PROVISION: + error = blkdev_issue_provision(bdev, start >> SECTOR_SHIFT, + len >> SECTOR_SHIFT, GFP_KERNEL); + break; default: error = -EOPNOTSUPP; } diff --git a/include/linux/falloc.h b/include/linux/falloc.h index f3f0b97b1675..b9a40a61a59b 100644 --- a/include/linux/falloc.h +++ b/include/linux/falloc.h @@ -30,7 +30,8 @@ struct space_resv { FALLOC_FL_COLLAPSE_RANGE | \ FALLOC_FL_ZERO_RANGE | \ FALLOC_FL_INSERT_RANGE | \ - FALLOC_FL_UNSHARE_RANGE) + FALLOC_FL_UNSHARE_RANGE | \ + FALLOC_FL_PROVISION) /* on ia32 l_start is on a 32-bit boundary */ #if defined(CONFIG_X86_64) diff --git a/include/uapi/linux/falloc.h b/include/uapi/linux/falloc.h index 51398fa57f6c..2d323d113eed 100644 --- a/include/uapi/linux/falloc.h +++ b/include/uapi/linux/falloc.h @@ -77,4 +77,12 @@ */ #define FALLOC_FL_UNSHARE_RANGE 0x40 +/* + * FALLOC_FL_PROVISION acts as a hint for thinly provisioned devices to allocate + * blocks for the range/EOF. + * + * FALLOC_FL_PROVISION can only be used with allocate-mode fallocate. + */ +#define FALLOC_FL_PROVISION 0x80 + #endif /* _UAPI_FALLOC_H_ */ From patchwork Thu Dec 29 08:12:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sarthak Kukreti X-Patchwork-Id: 13083341 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id B6032C3DA79 for ; Thu, 29 Dec 2022 08:13:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233058AbiL2INe (ORCPT ); Thu, 29 Dec 2022 03:13:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37934 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233052AbiL2INT (ORCPT ); Thu, 29 Dec 2022 03:13:19 -0500 Received: from mail-pj1-x1029.google.com (mail-pj1-x1029.google.com [IPv6:2607:f8b0:4864:20::1029]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F0C6D11C2E for ; Thu, 29 Dec 2022 00:13:08 -0800 (PST) Received: by mail-pj1-x1029.google.com with SMTP id o2so12993814pjh.4 for ; Thu, 29 Dec 2022 00:13:08 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=fhwgSmWbUilpxRzNOrlzxGBN+MAQ1cWKROitwl2Rvss=; b=bepAnhHSak5DeT1Zsd8c8KCFewRJzufY+EE08OPJE7L1K7tKI0YHbxGeuiXkJ9d24R jMqCVu4dGIG+bmz+f7xpeMxZHGq4LMC2iLPaP3PZTpbdTFlP2lV69MQ4+mx00dOR2dJZ QBrNT195qu3kGtPChfT42OjLmjv/EVWUgBRuU= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=fhwgSmWbUilpxRzNOrlzxGBN+MAQ1cWKROitwl2Rvss=; b=wmMEOqeQ/euZ0Kkn6YZyov0aHiadU5Pv3msyM5SYdlOtQmfFPrAG4S32la9bVZQZ2x qkqo+606LreOPycyfCZzac/pulyNbUaRuMGg1mnNAJhqDou6y/nGb4h05ygWG0A7d6lD ECX1597yeaJcbs3W+2iS+mFNALO8k2Q0D2RXcA1bK2cuhQ7nVFLSf+3XYH7DxEBQkelh +gd2zhaMwsR1U7buaM3MsLvUKp/Tw75BhNwpKCsHsqhW5t3Zz2XTk6oxk0AH9ItuLJJ6 1tdnxb1MSqWeqXtzMxNGjXEFcq6XKCPZapaHQC0pOPkIRzGD6m6fr2wBh9clh3zz8qna eH/w== X-Gm-Message-State: AFqh2kr8+N6DPvDvP0tdXbpmSwtR1HDrnYLFZWmhTitVgWL4oJ78B+mp vIfbtivRf6Z6EiIwGSFnnRKE2Q== X-Google-Smtp-Source: AMrXdXsfxrE5D4/Xt700H4PYzZ8Ma4wkTdnT8ETSbdWhxq69HhaH3ZOTjdO9/P9vNXRG8/2E7BEx7w== X-Received: by 2002:a17:902:d192:b0:192:77df:45d9 with SMTP id m18-20020a170902d19200b0019277df45d9mr12575286plb.17.1672301588655; Thu, 29 Dec 2022 00:13:08 -0800 (PST) Received: from sarthakkukreti-glaptop.hsd1.ca.comcast.net ([2601:647:4200:b5b0:75ff:1277:3d7b:d67a]) by smtp.gmail.com with ESMTPSA id 12-20020a170902e9cc00b00192820d00d0sm6496325plk.120.2022.12.29.00.13.06 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Dec 2022 00:13:07 -0800 (PST) From: Sarthak Kukreti To: sarthakkukreti@google.com, dm-devel@redhat.com, linux-block@vger.kernel.org, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe , "Michael S. Tsirkin" , Jason Wang , Stefan Hajnoczi , Alasdair Kergon , Mike Snitzer , Christoph Hellwig , Brian Foster , Theodore Ts'o , Andreas Dilger , Bart Van Assche , Daniil Lunev , "Darrick J. Wong" Subject: [PATCH v2 4/7] loop: Add support for provision requests Date: Thu, 29 Dec 2022 00:12:49 -0800 Message-Id: <20221229081252.452240-5-sarthakkukreti@chromium.org> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org> References: <20221229081252.452240-1-sarthakkukreti@chromium.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add support for provision requests to loopback devices. Loop devices will configure provision support based on whether the underlying block device/file can support the provision request and upon receiving a provision bio, will map it to the backing device/storage. Signed-off-by: Sarthak Kukreti --- drivers/block/loop.c | 42 ++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 42 insertions(+) diff --git a/drivers/block/loop.c b/drivers/block/loop.c index 1518a6423279..64ebb0d60c0e 100644 --- a/drivers/block/loop.c +++ b/drivers/block/loop.c @@ -327,6 +327,24 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos, return ret; } +static int lo_req_provision(struct loop_device *lo, struct request *rq, loff_t pos) +{ + struct file *file = lo->lo_backing_file; + struct request_queue *q = lo->lo_queue; + int ret; + + if (!q->limits.max_provision_sectors) { + ret = -EOPNOTSUPP; + goto out; + } + + ret = file->f_op->fallocate(file, FALLOC_FL_PROVISION, pos, blk_rq_bytes(rq)); + if (unlikely(ret && ret != -EINVAL && ret != -EOPNOTSUPP)) + ret = -EIO; + out: + return ret; +} + static int lo_req_flush(struct loop_device *lo, struct request *rq) { int ret = vfs_fsync(lo->lo_backing_file, 0); @@ -488,6 +506,8 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq) FALLOC_FL_PUNCH_HOLE); case REQ_OP_DISCARD: return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE); + case REQ_OP_PROVISION: + return lo_req_provision(lo, rq, pos); case REQ_OP_WRITE: if (cmd->use_aio) return lo_rw_aio(lo, cmd, pos, ITER_SOURCE); @@ -754,6 +774,25 @@ static void loop_sysfs_exit(struct loop_device *lo) &loop_attribute_group); } +static void loop_config_provision(struct loop_device *lo) +{ + struct file *file = lo->lo_backing_file; + struct inode *inode = file->f_mapping->host; + + /* + * If the backing device is a block device, mirror its provisioning + * capability. + */ + if (S_ISBLK(inode->i_mode)) { + blk_queue_max_provision_sectors(lo->lo_queue, + bdev_max_provision_sectors(I_BDEV(inode))); + } else if (file->f_op->fallocate) { + blk_queue_max_provision_sectors(lo->lo_queue, UINT_MAX >> 9); + } else { + blk_queue_max_provision_sectors(lo->lo_queue, 0); + } +} + static void loop_config_discard(struct loop_device *lo) { struct file *file = lo->lo_backing_file; @@ -1092,6 +1131,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode, blk_queue_io_min(lo->lo_queue, bsize); loop_config_discard(lo); + loop_config_provision(lo); loop_update_rotational(lo); loop_update_dio(lo); loop_sysfs_init(lo); @@ -1304,6 +1344,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info) } loop_config_discard(lo); + loop_config_provision(lo); /* update dio if lo_offset or transfer is changed */ __loop_update_dio(lo, lo->use_dio); @@ -1824,6 +1865,7 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx, case REQ_OP_FLUSH: case REQ_OP_DISCARD: case REQ_OP_WRITE_ZEROES: + case REQ_OP_PROVISION: cmd->use_aio = false; break; default: From patchwork Thu Dec 29 08:12:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sarthak Kukreti X-Patchwork-Id: 13083342 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1A086C4708D for ; Thu, 29 Dec 2022 08:14:28 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233150AbiL2IOF (ORCPT ); Thu, 29 Dec 2022 03:14:05 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37950 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233095AbiL2INU (ORCPT ); Thu, 29 Dec 2022 03:13:20 -0500 Received: from mail-pl1-x634.google.com (mail-pl1-x634.google.com [IPv6:2607:f8b0:4864:20::634]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8265E10060 for ; Thu, 29 Dec 2022 00:13:11 -0800 (PST) Received: by mail-pl1-x634.google.com with SMTP id d9so1537459pll.9 for ; Thu, 29 Dec 2022 00:13:11 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=tknTAYGKiWbOelfCTGAIM4O7q8g3LJuucBsJtaWsm7Y=; b=od6vH7qllnsXef6M4SnGIOrNBwpv4p//RiTi9UhMGK3lxP1a2K3Nm1uXz7ssRR+Smi wUthhRevFt0NOvSr6D9fSLI7kuMeJffcI62x3q63JgaiPRtoRmo7cT9cQcEV1/CPgE3M gaDQE3uH/ZWruLOhIAciPanxs5Ham7RNijdIg= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=tknTAYGKiWbOelfCTGAIM4O7q8g3LJuucBsJtaWsm7Y=; b=FFfZoqVCBK7imEYvWCqoH9x+ZRh//+Wxa6nnQx9u2tRFOPOTrMolsSiUIvb3fj4vXk 0F0sTtB+oqzPqDxs0vKz5W3fNiB3XT8+ZDCWyluzqlrz9wiMl2awuneL6hkvrSbYjGIh +U9aXCShI3S38SsVvn49OiPbRT+He7w2yc4R3+XhTNNnxmsH6zgPpzHfNrqc+iQmTFVO lqiUaa0l/T8ck1rywzd7jjKRLZXHbgK7cEVhaPkfqV5B+xP9+Q1ayZWA7iF/OGMdYqZo RLe4evK1WVblwqd6Q2QUZKYbYN9977aUKVVCeqe1nNHxxnMQCLsdwJ20+zZUf4AiZMhm Y9lg== X-Gm-Message-State: AFqh2kqEapcXYG0nulO4fuFAn/1309Q+tu0Gcun3Giu/imN5sLsLb8LD ya6gjqdIRBQathpDFEH3ATSRag== X-Google-Smtp-Source: AMrXdXtym0rxjHIaY1zQhiKiElSe51N0oXHJB0JFxm27wv3KTRxbF3edYhG2fNv1/xvRbhTy3gkDbA== X-Received: by 2002:a17:902:c94f:b0:189:9e91:762e with SMTP id i15-20020a170902c94f00b001899e91762emr45014676pla.57.1672301590739; Thu, 29 Dec 2022 00:13:10 -0800 (PST) Received: from sarthakkukreti-glaptop.hsd1.ca.comcast.net ([2601:647:4200:b5b0:75ff:1277:3d7b:d67a]) by smtp.gmail.com with ESMTPSA id 12-20020a170902e9cc00b00192820d00d0sm6496325plk.120.2022.12.29.00.13.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Dec 2022 00:13:10 -0800 (PST) From: Sarthak Kukreti To: sarthakkukreti@google.com, dm-devel@redhat.com, linux-block@vger.kernel.org, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe , "Michael S. Tsirkin" , Jason Wang , Stefan Hajnoczi , Alasdair Kergon , Mike Snitzer , Christoph Hellwig , Brian Foster , Theodore Ts'o , Andreas Dilger , Bart Van Assche , Daniil Lunev , "Darrick J. Wong" Subject: [PATCH v2 5/7] ext4: Add support for FALLOC_FL_PROVISION Date: Thu, 29 Dec 2022 00:12:50 -0800 Message-Id: <20221229081252.452240-6-sarthakkukreti@chromium.org> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org> References: <20221229081252.452240-1-sarthakkukreti@chromium.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Once ext4 is done mapping blocks for an fallocate() request, send out an FALLOC_FL_PROVISION request to the underlying layer to ensure that the space is provisioned for the newly allocated extent or indirect blocks. There is an expected performance degradation with fallocate() calls made with this flag due to the extra REQ_OP_PROVISIONs sent to the underlying storage. Signed-off-by: Sarthak Kukreti --- fs/ext4/ext4.h | 2 ++ fs/ext4/extents.c | 15 ++++++++++++++- fs/ext4/indirect.c | 9 +++++++++ include/linux/blkdev.h | 11 +++++++++++ 4 files changed, 36 insertions(+), 1 deletion(-) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 140e1eb300d1..49832e90b62f 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -673,6 +673,8 @@ enum { #define EXT4_GET_BLOCKS_IO_SUBMIT 0x0400 /* Caller is in the atomic contex, find extent if it has been cached */ #define EXT4_GET_BLOCKS_CACHED_NOWAIT 0x0800 + /* Provision blocks on underlying storage */ +#define EXT4_GET_BLOCKS_PROVISION 0x1000 /* * The bit position of these flags must not overlap with any of the diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 9de1c9d1a13d..2e64a9211792 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4361,6 +4361,13 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode, } } + /* Attempt to provision blocks on underlying storage */ + if (flags & EXT4_GET_BLOCKS_PROVISION) { + err = sb_issue_provision(inode->i_sb, pblk, ar.len, GFP_NOFS); + if (err) + goto out; + } + /* * Cache the extent and update transaction to commit on fdatasync only * when it is _not_ an unwritten extent. @@ -4694,7 +4701,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) /* Return error if mode is not supported */ if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE | FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE | - FALLOC_FL_INSERT_RANGE)) + FALLOC_FL_INSERT_RANGE | FALLOC_FL_PROVISION)) return -EOPNOTSUPP; inode_lock(inode); @@ -4754,6 +4761,12 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len) if (ret) goto out; + /* Ensure that preallocation provisions the blocks on the underlying + * storage device. + */ + if (mode & FALLOC_FL_PROVISION) + flags |= EXT4_GET_BLOCKS_PROVISION; + ret = ext4_alloc_file_blocks(file, lblk, max_blocks, new_size, flags); if (ret) goto out; diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c index c68bebe7ff4b..a8065aae7563 100644 --- a/fs/ext4/indirect.c +++ b/fs/ext4/indirect.c @@ -647,6 +647,15 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode, if (err) goto cleanup; + /* Attempt to provision blocks on underlying storage */ + if (flags & EXT4_GET_BLOCKS_PROVISION) { + err = sb_issue_provision(inode->i_sb, + le32_to_cpu(chain[depth-1].key), + ar.len, GFP_NOFS); + if (err) + goto out; + } + map->m_flags |= EXT4_MAP_NEW; ext4_update_inode_fsync_trans(handle, inode, 1); diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index f1abc7b43e25..b2e3244e9f3d 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -1093,6 +1093,17 @@ static inline int sb_issue_zeroout(struct super_block *sb, sector_t block, gfp_mask, 0); } +static inline int sb_issue_provision(struct super_block *sb, sector_t block, + sector_t nr_blocks, gfp_t gfp_mask) +{ + return blkdev_issue_provision(sb->s_bdev, + block << (sb->s_blocksize_bits - + SECTOR_SHIFT), + nr_blocks << (sb->s_blocksize_bits - + SECTOR_SHIFT), + gfp_mask); +} + static inline bool bdev_is_partition(struct block_device *bdev) { return bdev->bd_partno; From patchwork Thu Dec 29 08:12:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sarthak Kukreti X-Patchwork-Id: 13083343 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1E90DC4708E for ; Thu, 29 Dec 2022 08:14:30 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233057AbiL2IO2 (ORCPT ); Thu, 29 Dec 2022 03:14:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38098 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233044AbiL2INa (ORCPT ); Thu, 29 Dec 2022 03:13:30 -0500 Received: from mail-pl1-x62e.google.com (mail-pl1-x62e.google.com [IPv6:2607:f8b0:4864:20::62e]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5BC2313D48 for ; Thu, 29 Dec 2022 00:13:13 -0800 (PST) Received: by mail-pl1-x62e.google.com with SMTP id n4so18307722plp.1 for ; Thu, 29 Dec 2022 00:13:13 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=Vv5x1KdmMeVtIhlp3zKOKF7sZVF7KRZLRJQbPD3QKT4=; b=T05xc0ET+ovYWIL5pcnCM2gTG/W0v/0OqHvEwFkZjoBf10OqF0ArlagciQllVNzMtn GIrm+w7+DEYMtu0OMfltv5BLAP1xPtYnbtZ4WfOOVblA64a9OsheTZbsH52zOo8QN0Sp hWwPftBw+HDztRSuSQCbaldwNxaNhj1AdbITs= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=Vv5x1KdmMeVtIhlp3zKOKF7sZVF7KRZLRJQbPD3QKT4=; b=iz3XC4nvhv4NpRno3LylZWdMx7a5qzM4+rjH8+sdggJyp3nXR+rD1IIiNdyKIub+HQ kRDzOxtqZsOzKY+DDQSmqQAj9GzNl4VrqMcCo64+WQlvCB496MIaJ2ec08DbdHgBg4ZJ JyosPy/m08RBWvHHoa0fMm6tvhFZyGHCoJXkcuP2HWJzCUPhxyiBybENilkpBMSunzJV Rl+DMWd9EKaO9FJmUpDbZCCMfUBLhjGodMQbg/effO/hfWu8LdiT07JBPMoxXdT7tHTS XwnlB9XbaZ9pSifHXLo+y9IeH6MpHPJOoqB1D1ZaQXbsOXzWhLiEkvfNehYjE4ztmL/d pY8Q== X-Gm-Message-State: AFqh2krNHanTxaRkRj6rihj3k6lzXv0e8SUKdc/fu5OYm41btCQCd1fN Elv/mvxJdkckE7A137ca9t35Dg== X-Google-Smtp-Source: AMrXdXvrTl4SO8Hr42En8z77cNqZKNC6eJ1qGCmNwbui5VQ51uN9dxzll3B3KUp7znSacUHIozLKhA== X-Received: by 2002:a17:902:a70c:b0:189:dcc3:e4a1 with SMTP id w12-20020a170902a70c00b00189dcc3e4a1mr28589601plq.9.1672301592863; Thu, 29 Dec 2022 00:13:12 -0800 (PST) Received: from sarthakkukreti-glaptop.hsd1.ca.comcast.net ([2601:647:4200:b5b0:75ff:1277:3d7b:d67a]) by smtp.gmail.com with ESMTPSA id 12-20020a170902e9cc00b00192820d00d0sm6496325plk.120.2022.12.29.00.13.10 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Dec 2022 00:13:12 -0800 (PST) From: Sarthak Kukreti To: sarthakkukreti@google.com, dm-devel@redhat.com, linux-block@vger.kernel.org, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe , "Michael S. Tsirkin" , Jason Wang , Stefan Hajnoczi , Alasdair Kergon , Mike Snitzer , Christoph Hellwig , Brian Foster , Theodore Ts'o , Andreas Dilger , Bart Van Assche , Daniil Lunev , "Darrick J. Wong" Subject: [PATCH v2 6/7] ext4: Add mount option for provisioning blocks during allocations Date: Thu, 29 Dec 2022 00:12:51 -0800 Message-Id: <20221229081252.452240-7-sarthakkukreti@chromium.org> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org> References: <20221229081252.452240-1-sarthakkukreti@chromium.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Add a mount option that sets the default provisioning mode for all files within the filesystem. Signed-off-by: Sarthak Kukreti --- fs/ext4/ext4.h | 1 + fs/ext4/extents.c | 7 +++++++ fs/ext4/super.c | 7 +++++++ 3 files changed, 15 insertions(+) diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h index 49832e90b62f..29cab2e2ea20 100644 --- a/fs/ext4/ext4.h +++ b/fs/ext4/ext4.h @@ -1269,6 +1269,7 @@ struct ext4_inode_info { #define EXT4_MOUNT2_MB_OPTIMIZE_SCAN 0x00000080 /* Optimize group * scanning in mballoc */ +#define EXT4_MOUNT2_PROVISION 0x00000100 /* Provision while allocating file blocks */ #define clear_opt(sb, opt) EXT4_SB(sb)->s_mount_opt &= \ ~EXT4_MOUNT_##opt diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index 2e64a9211792..a73f44264fe2 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4441,6 +4441,13 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, unsigned int credits; loff_t epos; + /* + * Attempt to provision file blocks if the mount is mounted with + * provision. + */ + if (test_opt2(inode->i_sb, PROVISION)) + flags |= EXT4_GET_BLOCKS_PROVISION; + BUG_ON(!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)); map.m_lblk = offset; map.m_len = len; diff --git a/fs/ext4/super.c b/fs/ext4/super.c index 260c1b3e3ef2..5bc376f6a6f0 100644 --- a/fs/ext4/super.c +++ b/fs/ext4/super.c @@ -1591,6 +1591,7 @@ enum { Opt_max_dir_size_kb, Opt_nojournal_checksum, Opt_nombcache, Opt_no_prefetch_block_bitmaps, Opt_mb_optimize_scan, Opt_errors, Opt_data, Opt_data_err, Opt_jqfmt, Opt_dax_type, + Opt_provision, Opt_noprovision, #ifdef CONFIG_EXT4_DEBUG Opt_fc_debug_max_replay, Opt_fc_debug_force #endif @@ -1737,6 +1738,8 @@ static const struct fs_parameter_spec ext4_param_specs[] = { fsparam_flag ("reservation", Opt_removed), /* mount option from ext2/3 */ fsparam_flag ("noreservation", Opt_removed), /* mount option from ext2/3 */ fsparam_u32 ("journal", Opt_removed), /* mount option from ext2/3 */ + fsparam_flag ("provision", Opt_provision), + fsparam_flag ("noprovision", Opt_noprovision), {} }; @@ -1826,6 +1829,8 @@ static const struct mount_opts { {Opt_nombcache, EXT4_MOUNT_NO_MBCACHE, MOPT_SET}, {Opt_no_prefetch_block_bitmaps, EXT4_MOUNT_NO_PREFETCH_BLOCK_BITMAPS, MOPT_SET}, + {Opt_provision, EXT4_MOUNT2_PROVISION, MOPT_SET | MOPT_2}, + {Opt_noprovision, EXT4_MOUNT2_PROVISION, MOPT_CLEAR | MOPT_2}, #ifdef CONFIG_EXT4_DEBUG {Opt_fc_debug_force, EXT4_MOUNT2_JOURNAL_FAST_COMMIT, MOPT_SET | MOPT_2 | MOPT_EXT4_ONLY}, @@ -2977,6 +2982,8 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb, SEQ_OPTS_PUTS("dax=never"); } else if (test_opt2(sb, DAX_INODE)) { SEQ_OPTS_PUTS("dax=inode"); + } else if (test_opt2(sb, PROVISION)) { + SEQ_OPTS_PUTS("provision"); } if (sbi->s_groups_count >= MB_DEFAULT_LINEAR_SCAN_THRESHOLD && From patchwork Thu Dec 29 08:12:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Sarthak Kukreti X-Patchwork-Id: 13083344 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A2615C4332F for ; Thu, 29 Dec 2022 08:14:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233134AbiL2IO3 (ORCPT ); Thu, 29 Dec 2022 03:14:29 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38714 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S233027AbiL2IOA (ORCPT ); Thu, 29 Dec 2022 03:14:00 -0500 Received: from mail-pj1-x1031.google.com (mail-pj1-x1031.google.com [IPv6:2607:f8b0:4864:20::1031]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 97E5413D56 for ; Thu, 29 Dec 2022 00:13:15 -0800 (PST) Received: by mail-pj1-x1031.google.com with SMTP id c8-20020a17090a4d0800b00225c3614161so15120188pjg.5 for ; Thu, 29 Dec 2022 00:13:15 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=chromium.org; s=google; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:from:to:cc:subject:date :message-id:reply-to; bh=/8DbNiGbTyMPWSOnHF8Kc/zLYgsGw2pv5SS5s5v0R+s=; b=g9XepsvU/uDTIkRU/0RfRztiaH4ysj5lPHnKvHquo4r2R81/t9Ab0n0lHqtHLvd81L Q99LTPfepq7s8nl2fawi9Q1Hj0cYT+r116Jc2iu54KTeJljXqfbrFWykA1a3MtFEjxbc IILAMKAGR6lSnvUPpxgAk36YR2QLLOe+csueQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20210112; h=content-transfer-encoding:mime-version:references:in-reply-to :message-id:date:subject:cc:to:from:x-gm-message-state:from:to:cc :subject:date:message-id:reply-to; bh=/8DbNiGbTyMPWSOnHF8Kc/zLYgsGw2pv5SS5s5v0R+s=; b=jA1zp7/Dp+ut7PTmnQ21XRlGUuM0ZDSMkBKPKDlQvR3/NkKHRawZmqf4mScT1w4yu2 jU03k8cbc9AvtRjYvEH2PyRZDQFXfBcJhnMOyX/8xC4kpSpOKoScV3V9k9b/sxzEFOac SDI3RbArglFUBFGAAq+tSJNb+E6uy2S3Rnxf830tF9Ms8+p5YPuuA741JtNLh3fbf+kO NuY4kU8Iq0C6QY/kDE1N0pU43ByQDAITD9IWQGCTGW9QKG1aoS2RdOfNVSeQCZtPAgm4 CcMLgSiBHicMz/ojenD01iId9GyUGylnf5j5wfLbm7ft4kgjeC3Nmto/OR5WcWL/Mc/N dMsw== X-Gm-Message-State: AFqh2krvdDgO+0KntySYkBNYw3jcp5CgsTRkawlD4GaRZPQQJMNR+awI ykrDz8fu6HhTwX3S9Ec2Q0MxQlpoxuig6KlC X-Google-Smtp-Source: AMrXdXtgQYYpwQ6pz9x9/465K8tsLdsmt3F7AV27atGdHHls0t4amkRI67RTybj/weosU20glWNaiw== X-Received: by 2002:a17:903:228a:b0:191:217f:b2ea with SMTP id b10-20020a170903228a00b00191217fb2eamr42495271plh.40.1672301595208; Thu, 29 Dec 2022 00:13:15 -0800 (PST) Received: from sarthakkukreti-glaptop.hsd1.ca.comcast.net ([2601:647:4200:b5b0:75ff:1277:3d7b:d67a]) by smtp.gmail.com with ESMTPSA id 12-20020a170902e9cc00b00192820d00d0sm6496325plk.120.2022.12.29.00.13.13 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 29 Dec 2022 00:13:14 -0800 (PST) From: Sarthak Kukreti To: sarthakkukreti@google.com, dm-devel@redhat.com, linux-block@vger.kernel.org, linux-ext4@vger.kernel.org, linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org Cc: Jens Axboe , "Michael S. Tsirkin" , Jason Wang , Stefan Hajnoczi , Alasdair Kergon , Mike Snitzer , Christoph Hellwig , Brian Foster , Theodore Ts'o , Andreas Dilger , Bart Van Assche , Daniil Lunev , "Darrick J. Wong" Subject: [PATCH v2 7/7] ext4: Add a per-file provision override xattr Date: Thu, 29 Dec 2022 00:12:52 -0800 Message-Id: <20221229081252.452240-8-sarthakkukreti@chromium.org> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org> References: <20221229081252.452240-1-sarthakkukreti@chromium.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org Adds a per-file provision override that allows select files to override the per-mount setting for provisioning blocks on allocation. This acts as a mechanism to allow mounts using provision to replicate the current behavior for fallocate() and only preserve space at the filesystem level. Signed-off-by: Sarthak Kukreti --- fs/ext4/extents.c | 32 ++++++++++++++++++++++++++++++++ fs/ext4/xattr.h | 1 + 2 files changed, 33 insertions(+) diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c index a73f44264fe2..9861115681b3 100644 --- a/fs/ext4/extents.c +++ b/fs/ext4/extents.c @@ -4428,6 +4428,26 @@ int ext4_ext_truncate(handle_t *handle, struct inode *inode) return err; } +static int ext4_file_provision_support(struct inode *inode) +{ + char provision; + int ret = + ext4_xattr_get(inode, EXT4_XATTR_INDEX_TRUSTED, + EXT4_XATTR_NAME_PROVISION_POLICY, &provision, 1); + + if (ret < 0) + return ret; + + switch (provision) { + case 'y': + return 1; + case 'n': + return 0; + default: + return -EINVAL; + } +} + static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, ext4_lblk_t len, loff_t new_size, int flags) @@ -4440,12 +4460,24 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset, struct ext4_map_blocks map; unsigned int credits; loff_t epos; + bool provision = false; + int file_provision_override = -1; /* * Attempt to provision file blocks if the mount is mounted with * provision. */ if (test_opt2(inode->i_sb, PROVISION)) + provision = true; + + /* + * Use file-specific override, if available. + */ + file_provision_override = ext4_file_provision_support(inode); + if (file_provision_override >= 0) + provision &= file_provision_override; + + if (provision) flags |= EXT4_GET_BLOCKS_PROVISION; BUG_ON(!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS)); diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h index 824faf0b15a8..69e97f853b0c 100644 --- a/fs/ext4/xattr.h +++ b/fs/ext4/xattr.h @@ -140,6 +140,7 @@ extern const struct xattr_handler ext4_xattr_security_handler; extern const struct xattr_handler ext4_xattr_hurd_handler; #define EXT4_XATTR_NAME_ENCRYPTION_CONTEXT "c" +#define EXT4_XATTR_NAME_PROVISION_POLICY "provision" /* * The EXT4_STATE_NO_EXPAND is overloaded and used for two purposes.