From patchwork Thu Dec 29 08:12:46 2022
X-Patchwork-Submitter: Sarthak Kukreti
X-Patchwork-Id: 13083346
From: Sarthak Kukreti
To: sarthakkukreti@google.com, dm-devel@redhat.com,
 linux-block@vger.kernel.org, linux-ext4@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org
Cc: Jens Axboe, "Michael S. Tsirkin", Jason Wang, Stefan Hajnoczi,
 Alasdair Kergon, Mike Snitzer, Christoph Hellwig, Brian Foster,
 Theodore Ts'o, Andreas Dilger, Bart Van Assche, Daniil Lunev,
 "Darrick J. Wong"
Wong" Subject: [PATCH v2 1/7] block: Introduce provisioning primitives Date: Thu, 29 Dec 2022 00:12:46 -0800 Message-Id: <20221229081252.452240-2-sarthakkukreti@chromium.org> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org> References: <20221229081252.452240-1-sarthakkukreti@chromium.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org Introduce block request REQ_OP_PROVISION. The intent of this request is to request underlying storage to preallocate disk space for the given block range. Block device that support this capability will export a provision limit within their request queues. Signed-off-by: Sarthak Kukreti --- block/blk-core.c | 5 ++++ block/blk-lib.c | 53 +++++++++++++++++++++++++++++++++++++++ block/blk-merge.c | 18 +++++++++++++ block/blk-settings.c | 19 ++++++++++++++ block/blk-sysfs.c | 8 ++++++ block/bounce.c | 1 + include/linux/bio.h | 6 +++-- include/linux/blk_types.h | 5 +++- include/linux/blkdev.h | 16 ++++++++++++ 9 files changed, 128 insertions(+), 3 deletions(-) diff --git a/block/blk-core.c b/block/blk-core.c index 9321767470dc..30bcabc7dc01 100644 --- a/block/blk-core.c +++ b/block/blk-core.c @@ -123,6 +123,7 @@ static const char *const blk_op_name[] = { REQ_OP_NAME(WRITE_ZEROES), REQ_OP_NAME(DRV_IN), REQ_OP_NAME(DRV_OUT), + REQ_OP_NAME(PROVISION) }; #undef REQ_OP_NAME @@ -785,6 +786,10 @@ void submit_bio_noacct(struct bio *bio) if (!q->limits.max_write_zeroes_sectors) goto not_supported; break; + case REQ_OP_PROVISION: + if (!q->limits.max_provision_sectors) + goto not_supported; + break; default: break; } diff --git a/block/blk-lib.c b/block/blk-lib.c index e59c3069e835..647b6451660b 100644 --- a/block/blk-lib.c +++ b/block/blk-lib.c @@ -343,3 +343,56 @@ int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector, return ret; } EXPORT_SYMBOL(blkdev_issue_secure_erase); + +/** + * blkdev_issue_provision - provision a block range + * @bdev: blockdev to write + * @sector: start sector + * @nr_sects: number of sectors to provision + * @gfp_mask: memory allocation flags (for bio_alloc) + * + * Description: + * Issues a provision request to the block device for the range of sectors. + * For thinly provisioned block devices, this acts as a signal for the + * underlying storage pool to allocate space for this block range. 
+ */
+int blkdev_issue_provision(struct block_device *bdev, sector_t sector,
+		sector_t nr_sects, gfp_t gfp)
+{
+	sector_t bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
+	unsigned int max_sectors = bdev_max_provision_sectors(bdev);
+	struct bio *bio = NULL;
+	struct blk_plug plug;
+	int ret = 0;
+
+	if (max_sectors == 0)
+		return -EOPNOTSUPP;
+	if ((sector | nr_sects) & bs_mask)
+		return -EINVAL;
+	if (bdev_read_only(bdev))
+		return -EPERM;
+
+	blk_start_plug(&plug);
+	for (;;) {
+		unsigned int req_sects = min_t(sector_t, nr_sects, max_sectors);
+
+		bio = blk_next_bio(bio, bdev, 0, REQ_OP_PROVISION, gfp);
+		bio->bi_iter.bi_sector = sector;
+		bio->bi_iter.bi_size = req_sects << SECTOR_SHIFT;
+
+		sector += req_sects;
+		nr_sects -= req_sects;
+		if (!nr_sects) {
+			ret = submit_bio_wait(bio);
+			if (ret == -EOPNOTSUPP)
+				ret = 0;
+			bio_put(bio);
+			break;
+		}
+		cond_resched();
+	}
+	blk_finish_plug(&plug);
+
+	return ret;
+}
+EXPORT_SYMBOL(blkdev_issue_provision);

diff --git a/block/blk-merge.c b/block/blk-merge.c
index 35a8f75cc45d..3ab35bb2a333 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -158,6 +158,21 @@ static struct bio *bio_split_write_zeroes(struct bio *bio,
 	return bio_split(bio, lim->max_write_zeroes_sectors, GFP_NOIO, bs);
 }

+static struct bio *bio_split_provision(struct bio *bio,
+					const struct queue_limits *lim,
+					unsigned *nsegs, struct bio_set *bs)
+{
+	*nsegs = 0;
+
+	if (!lim->max_provision_sectors)
+		return NULL;
+
+	if (bio_sectors(bio) <= lim->max_provision_sectors)
+		return NULL;
+
+	return bio_split(bio, lim->max_provision_sectors, GFP_NOIO, bs);
+}
+
 /*
  * Return the maximum number of sectors from the start of a bio that may be
  * submitted as a single request to a block device. If enough sectors remain,
@@ -355,6 +370,9 @@ struct bio *__bio_split_to_limits(struct bio *bio,
 	case REQ_OP_WRITE_ZEROES:
 		split = bio_split_write_zeroes(bio, lim, nr_segs, bs);
 		break;
+	case REQ_OP_PROVISION:
+		split = bio_split_provision(bio, lim, nr_segs, bs);
+		break;
 	default:
 		split = bio_split_rw(bio, lim, nr_segs, bs,
 				get_max_io_size(bio, lim) << SECTOR_SHIFT);

diff --git a/block/blk-settings.c b/block/blk-settings.c
index 0477c4d527fe..57d88204fbbe 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -58,6 +58,7 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->zoned = BLK_ZONED_NONE;
 	lim->zone_write_granularity = 0;
 	lim->dma_alignment = 511;
+	lim->max_provision_sectors = 0;
 }

 /**
@@ -81,6 +82,7 @@ void blk_set_stacking_limits(struct queue_limits *lim)
 {
 	lim->max_dev_sectors = UINT_MAX;
 	lim->max_write_zeroes_sectors = UINT_MAX;
 	lim->max_zone_append_sectors = UINT_MAX;
+	lim->max_provision_sectors = UINT_MAX;
 }
 EXPORT_SYMBOL(blk_set_stacking_limits);

@@ -202,6 +204,20 @@ void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
 }
 EXPORT_SYMBOL(blk_queue_max_write_zeroes_sectors);

+/**
+ * blk_queue_max_provision_sectors - set max sectors for a single provision
+ *
+ * @q:  the request queue for the device
+ * @max_provision_sectors: maximum number of sectors to provision per command
+ **/
+
+void blk_queue_max_provision_sectors(struct request_queue *q,
+		unsigned int max_provision_sectors)
+{
+	q->limits.max_provision_sectors = max_provision_sectors;
+}
+EXPORT_SYMBOL(blk_queue_max_provision_sectors);
+
 /**
  * blk_queue_max_zone_append_sectors - set max sectors for a single zone append
  * @q:  the request queue for the device
@@ -572,6 +588,9 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 	t->max_segment_size = min_not_zero(t->max_segment_size,
 					   b->max_segment_size);

+	t->max_provision_sectors = min_not_zero(t->max_provision_sectors,
+						b->max_provision_sectors);
+
 	t->misaligned |= b->misaligned;

 	alignment = queue_limit_alignment_offset(b, start);

diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 93d9e9c9a6ea..2e678417b302 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -131,6 +131,12 @@ static ssize_t queue_max_discard_segments_show(struct request_queue *q,
 	return queue_var_show(queue_max_discard_segments(q), page);
 }

+static ssize_t queue_max_provision_sectors_show(struct request_queue *q,
+		char *page)
+{
+	return queue_var_show(queue_max_provision_sectors(q), (page));
+}
+
 static ssize_t queue_max_integrity_segments_show(struct request_queue *q, char *page)
 {
 	return queue_var_show(q->limits.max_integrity_segments, page);
@@ -589,6 +595,7 @@ QUEUE_RO_ENTRY(queue_io_min, "minimum_io_size");
 QUEUE_RO_ENTRY(queue_io_opt, "optimal_io_size");

 QUEUE_RO_ENTRY(queue_max_discard_segments, "max_discard_segments");
+QUEUE_RO_ENTRY(queue_max_provision_sectors, "max_provision_sectors");
 QUEUE_RO_ENTRY(queue_discard_granularity, "discard_granularity");
 QUEUE_RO_ENTRY(queue_discard_max_hw, "discard_max_hw_bytes");
 QUEUE_RW_ENTRY(queue_discard_max, "discard_max_bytes");
@@ -638,6 +645,7 @@ static struct attribute *queue_attrs[] = {
 	&queue_max_sectors_entry.attr,
 	&queue_max_segments_entry.attr,
 	&queue_max_discard_segments_entry.attr,
+	&queue_max_provision_sectors_entry.attr,
 	&queue_max_integrity_segments_entry.attr,
 	&queue_max_segment_size_entry.attr,
 	&elv_iosched_entry.attr,

diff --git a/block/bounce.c b/block/bounce.c
index 7cfcb242f9a1..ab9d8723ae64 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -176,6 +176,7 @@ static struct bio *bounce_clone_bio(struct bio *bio_src)
 	case REQ_OP_DISCARD:
 	case REQ_OP_SECURE_ERASE:
 	case REQ_OP_WRITE_ZEROES:
+	case REQ_OP_PROVISION:
 		break;
 	default:
 		bio_for_each_segment(bv, bio_src, iter)

diff --git a/include/linux/bio.h b/include/linux/bio.h
index 22078a28d7cb..5025af105b7c 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -55,7 +55,8 @@ static inline bool bio_has_data(struct bio *bio)
 	    bio->bi_iter.bi_size &&
 	    bio_op(bio) != REQ_OP_DISCARD &&
 	    bio_op(bio) != REQ_OP_SECURE_ERASE &&
-	    bio_op(bio) != REQ_OP_WRITE_ZEROES)
+	    bio_op(bio) != REQ_OP_WRITE_ZEROES &&
+	    bio_op(bio) != REQ_OP_PROVISION)
 		return true;

 	return false;
@@ -65,7 +66,8 @@ static inline bool bio_no_advance_iter(const struct bio *bio)
 {
 	return bio_op(bio) == REQ_OP_DISCARD ||
 	       bio_op(bio) == REQ_OP_SECURE_ERASE ||
-	       bio_op(bio) == REQ_OP_WRITE_ZEROES;
+	       bio_op(bio) == REQ_OP_WRITE_ZEROES ||
+	       bio_op(bio) == REQ_OP_PROVISION;
 }

 static inline void *bio_data(struct bio *bio)

diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 99be590f952f..27bdf88f541c 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -385,7 +385,10 @@ enum req_op {
 	REQ_OP_DRV_IN		= (__force blk_opf_t)34,
 	REQ_OP_DRV_OUT		= (__force blk_opf_t)35,

-	REQ_OP_LAST		= (__force blk_opf_t)36,
+	/* request device to provision block */
+	REQ_OP_PROVISION	= (__force blk_opf_t)37,
+
+	REQ_OP_LAST		= (__force blk_opf_t)38,
 };

 enum req_flag_bits {

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 301cf1cf4f2f..f1abc7b43e25 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -302,6 +302,7 @@ struct queue_limits {
 	unsigned int		discard_granularity;
 	unsigned int		discard_alignment;
 	unsigned int		zone_write_granularity;
+	unsigned int		max_provision_sectors;

 	unsigned short		max_segments;
 	unsigned short		max_integrity_segments;
@@ -918,6 +919,8 @@ extern void blk_queue_max_discard_sectors(struct request_queue *q,
 		unsigned int max_discard_sectors);
 extern void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
 		unsigned int max_write_same_sectors);
+extern void blk_queue_max_provision_sectors(struct request_queue *q,
+		unsigned int max_provision_sectors);
 extern void blk_queue_logical_block_size(struct request_queue *, unsigned int);
 extern void blk_queue_max_zone_append_sectors(struct request_queue *q,
 		unsigned int max_zone_append_sectors);
@@ -1057,6 +1060,9 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp);

+extern int blkdev_issue_provision(struct block_device *bdev, sector_t sector,
+		sector_t nr_sects, gfp_t gfp_mask);
+
 #define BLKDEV_ZERO_NOUNMAP	(1 << 0)  /* do not free blocks */
 #define BLKDEV_ZERO_NOFALLBACK	(1 << 1)  /* don't write explicit zeroes */

@@ -1135,6 +1141,11 @@ static inline unsigned short queue_max_discard_segments(const struct request_queue *q)
 	return q->limits.max_discard_segments;
 }

+static inline unsigned int queue_max_provision_sectors(const struct request_queue *q)
+{
+	return q->limits.max_provision_sectors;
+}
+
 static inline unsigned int queue_max_segment_size(const struct request_queue *q)
 {
 	return q->limits.max_segment_size;
@@ -1271,6 +1282,11 @@ static inline bool bdev_nowait(struct block_device *bdev)
 	return test_bit(QUEUE_FLAG_NOWAIT, &bdev_get_queue(bdev)->queue_flags);
 }

+static inline unsigned int bdev_max_provision_sectors(struct block_device *bdev)
+{
+	return bdev_get_queue(bdev)->limits.max_provision_sectors;
+}
+
 static inline enum blk_zoned_model bdev_zoned_model(struct block_device *bdev)
 {
 	struct request_queue *q = bdev_get_queue(bdev);
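For illustration, a minimal kernel-side sketch of how the new primitives fit
together, using only the interfaces added in this patch; the mydrv_* wrappers
are hypothetical:

```c
#include <linux/blkdev.h>

/* Driver side: advertise a per-request provisioning limit. */
static void mydrv_setup_limits(struct request_queue *q)
{
	/* accept up to 32 MiB (65536 x 512-byte sectors) per REQ_OP_PROVISION */
	blk_queue_max_provision_sectors(q, 65536);
}

/* Caller side: preallocate 1 MiB (2048 sectors) starting at sector 0. */
static int mydrv_preallocate(struct block_device *bdev)
{
	if (!bdev_max_provision_sectors(bdev))
		return -EOPNOTSUPP;	/* device did not opt in */

	return blkdev_issue_provision(bdev, 0, 2048, GFP_KERNEL);
}
```

Note that blkdev_issue_provision() itself splits the range into
max_provision_sectors-sized bios, so callers do not have to chunk requests to
respect the limit.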
From patchwork Thu Dec 29 08:12:47 2022
X-Patchwork-Submitter: Sarthak Kukreti
X-Patchwork-Id: 13083349
From: Sarthak Kukreti
Subject: [PATCH v2 2/7] dm: Add support for block provisioning
Date: Thu, 29 Dec 2022 00:12:47 -0800
Message-Id: <20221229081252.452240-3-sarthakkukreti@chromium.org>
In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org>

Add support to dm devices for REQ_OP_PROVISION. The default mode is to
pass the request through to the underlying devices; dm-thin uses it to
provision blocks.
Signed-off-by: Sarthak Kukreti
---
 drivers/md/dm-crypt.c         |  4 +-
 drivers/md/dm-linear.c        |  1 +
 drivers/md/dm-snap.c          |  7 +++
 drivers/md/dm-table.c         | 25 ++++++++++
 drivers/md/dm-thin.c          | 90 ++++++++++++++++++++++++++++++++++-
 drivers/md/dm.c               |  4 ++
 include/linux/device-mapper.h | 11 +++++
 7 files changed, 139 insertions(+), 3 deletions(-)

diff --git a/drivers/md/dm-crypt.c b/drivers/md/dm-crypt.c
index 2653516bcdef..7089a414c3d1 100644
--- a/drivers/md/dm-crypt.c
+++ b/drivers/md/dm-crypt.c
@@ -3081,6 +3081,8 @@ static int crypt_ctr_optional(struct dm_target *ti, unsigned int argc, char **argv)
 	if (ret)
 		return ret;

+	ti->num_provision_bios = 1;
+
 	while (opt_params--) {
 		opt_string = dm_shift_arg(&as);
 		if (!opt_string) {
@@ -3384,7 +3386,7 @@ static int crypt_map(struct dm_target *ti, struct bio *bio)
 	 * - for REQ_OP_DISCARD caller must use flush if IO ordering matters
 	 */
 	if (unlikely(bio->bi_opf & REQ_PREFLUSH ||
-	    bio_op(bio) == REQ_OP_DISCARD)) {
+	    bio_op(bio) == REQ_OP_DISCARD || bio_op(bio) == REQ_OP_PROVISION)) {
 		bio_set_dev(bio, cc->dev->bdev);
 		if (bio_sectors(bio))
 			bio->bi_iter.bi_sector = cc->start +

diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 3212ef6aa81b..1aa782149428 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -61,6 +61,7 @@ static int linear_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	ti->num_discard_bios = 1;
 	ti->num_secure_erase_bios = 1;
 	ti->num_write_zeroes_bios = 1;
+	ti->num_provision_bios = 1;
 	ti->private = lc;
 	return 0;

diff --git a/drivers/md/dm-snap.c b/drivers/md/dm-snap.c
index d1c2f84d27e3..d4d2599e3620 100644
--- a/drivers/md/dm-snap.c
+++ b/drivers/md/dm-snap.c
@@ -1357,6 +1357,7 @@ static int snapshot_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	if (s->discard_zeroes_cow)
 		ti->num_discard_bios = (s->discard_passdown_origin ? 2 : 1);
 	ti->per_io_data_size = sizeof(struct dm_snap_tracked_chunk);
+	ti->num_provision_bios = 1;

 	/* Add snapshot to the list of snapshots for this origin */
 	/* Exceptions aren't triggered till snapshot_resume() is called */
@@ -2001,6 +2002,11 @@ static int snapshot_map(struct dm_target *ti, struct bio *bio)
 	/* If the block is already remapped - use that, else remap it */
 	e = dm_lookup_exception(&s->complete, chunk);
 	if (e) {
+		if (unlikely(bio_op(bio) == REQ_OP_PROVISION)) {
+			bio_endio(bio);
+			r = DM_MAPIO_SUBMITTED;
+			goto out_unlock;
+		}
 		remap_exception(s, e, bio, chunk);
 		if (unlikely(bio_op(bio) == REQ_OP_DISCARD) &&
 		    io_overlaps_chunk(s, bio)) {
@@ -2414,6 +2420,7 @@ static void snapshot_io_hints(struct dm_target *ti, struct queue_limits *limits)
 	/* All discards are split on chunk_size boundary */
 	limits->discard_granularity = snap->store->chunk_size;
 	limits->max_discard_sectors = snap->store->chunk_size;
+	limits->max_provision_sectors = snap->store->chunk_size;

 	up_read(&_origins_lock);
 }

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 8541d5688f3a..35f8d670935e 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1853,6 +1853,26 @@ static bool dm_table_supports_write_zeroes(struct dm_table *t)
 	return true;
 }

+static int device_provision_capable(struct dm_target *ti, struct dm_dev *dev,
+				    sector_t start, sector_t len, void *data)
+{
+	return !bdev_max_provision_sectors(dev->bdev);
+}
+
+static bool dm_table_supports_provision(struct dm_table *t)
+{
+	for (unsigned int i = 0; i < t->num_targets; i++) {
+		struct dm_target *ti = dm_table_get_target(t, i);
+
+		if (ti->provision_supported ||
+		    (ti->type->iterate_devices &&
+		     ti->type->iterate_devices(ti, device_provision_capable, NULL)))
+			return true;
+	}
+
+	return false;
+}
+
 static int device_not_nowait_capable(struct dm_target *ti, struct dm_dev *dev,
 				     sector_t start, sector_t len, void *data)
 {
@@ -1987,6 +2007,11 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 	if (!dm_table_supports_write_zeroes(t))
 		q->limits.max_write_zeroes_sectors = 0;

+	if (dm_table_supports_provision(t))
+		blk_queue_max_provision_sectors(q, UINT_MAX >> 9);
+	else
+		q->limits.max_provision_sectors = 0;
+
 	dm_table_verify_integrity(t);

 	/*

diff --git a/drivers/md/dm-thin.c b/drivers/md/dm-thin.c
index 64cfcf46881d..ab3f1abfabaf 100644
--- a/drivers/md/dm-thin.c
+++ b/drivers/md/dm-thin.c
@@ -1012,6 +1012,14 @@ static void process_prepared_mapping(struct dm_thin_new_mapping *m)
 		goto out;
 	}

+	/*
+	 * For provision requests, return once the prepared block has been
+	 * inserted into the mapping btree.
+	 */
+	if (bio && bio_op(bio) == REQ_OP_PROVISION) {
+		bio_endio(bio);
+		goto out;
+	}
+
 	/*
 	 * Release any bios held while the block was being provisioned.
 	 * If we are processing a write bio that completely covers the block,
@@ -1239,7 +1247,7 @@ static int io_overlaps_block(struct pool *pool, struct bio *bio)

 static int io_overwrites_block(struct pool *pool, struct bio *bio)
 {
-	return (bio_data_dir(bio) == WRITE) &&
+	return (bio_data_dir(bio) == WRITE) && bio_op(bio) != REQ_OP_PROVISION &&
 		io_overlaps_block(pool, bio);
 }
@@ -1388,6 +1396,10 @@ static void schedule_zero(struct thin_c *tc, dm_block_t virt_block,
 	m->data_block = data_block;
 	m->cell = cell;

+	/* Provision requests are chained on the original bio. */
+	if (bio && bio_op(bio) == REQ_OP_PROVISION)
+		m->bio = bio;
+
 	/*
 	 * If the whole block of data is being overwritten or we are not
 	 * zeroing pre-existing data, we can issue the bio immediately.
@@ -1980,6 +1992,70 @@ static void process_cell(struct thin_c *tc, struct dm_bio_prison_cell *cell)
 	}
 }

+static void process_provision_cell(struct thin_c *tc,
+				   struct dm_bio_prison_cell *cell)
+{
+	int r;
+	struct pool *pool = tc->pool;
+	struct bio *bio = cell->holder;
+	dm_block_t begin, end;
+	struct dm_thin_lookup_result lookup_result;
+
+	if (tc->requeue_mode) {
+		cell_requeue(pool, cell);
+		return;
+	}
+
+	get_bio_block_range(tc, bio, &begin, &end);
+
+	while (begin != end) {
+		r = ensure_next_mapping(pool);
+		if (r)
+			/* we did our best */
+			return;
+
+		r = dm_thin_find_block(tc->td, begin, 1, &lookup_result);
+		switch (r) {
+		case 0:
+			begin++;
+			break;
+		case -ENODATA:
+			bio_inc_remaining(bio);
+			provision_block(tc, bio, begin, cell);
+			begin++;
+			break;
+		default:
+			DMERR_LIMIT(
+				"%s: dm_thin_find_block() failed: error = %d",
+				__func__, r);
+			cell_defer_no_holder(tc, cell);
+			bio_io_error(bio);
+			begin++;
+			break;
+		}
+	}
+	bio_endio(bio);
+	cell_defer_no_holder(tc, cell);
+}
+
+static void process_provision_bio(struct thin_c *tc, struct bio *bio)
+{
+	dm_block_t begin, end;
+	struct dm_cell_key virt_key;
+	struct dm_bio_prison_cell *virt_cell;
+
+	get_bio_block_range(tc, bio, &begin, &end);
+	if (begin == end) {
+		bio_endio(bio);
+		return;
+	}
+
+	build_key(tc->td, VIRTUAL, begin, end, &virt_key);
+	if (bio_detain(tc->pool, &virt_key, bio, &virt_cell))
+		return;
+
+	process_provision_cell(tc, virt_cell);
+}
+
 static void process_bio(struct thin_c *tc, struct bio *bio)
 {
 	struct pool *pool = tc->pool;
@@ -2200,6 +2276,8 @@ static void process_thin_deferred_bios(struct thin_c *tc)

 		if (bio_op(bio) == REQ_OP_DISCARD)
 			pool->process_discard(tc, bio);
+		else if (bio_op(bio) == REQ_OP_PROVISION)
+			process_provision_bio(tc, bio);
 		else
 			pool->process_bio(tc, bio);

@@ -2716,7 +2794,8 @@ static int thin_bio_map(struct dm_target *ti, struct bio *bio)
 		return DM_MAPIO_SUBMITTED;
 	}

-	if (op_is_flush(bio->bi_opf) || bio_op(bio) == REQ_OP_DISCARD) {
+	if (op_is_flush(bio->bi_opf) || bio_op(bio) == REQ_OP_DISCARD ||
+	    bio_op(bio) == REQ_OP_PROVISION) {
 		thin_defer_bio_with_throttle(tc, bio);
 		return DM_MAPIO_SUBMITTED;
 	}
@@ -3355,6 +3434,8 @@ static int pool_ctr(struct dm_target *ti, unsigned argc, char **argv)
 	pt->low_water_blocks = low_water_blocks;
 	pt->adjusted_pf = pt->requested_pf = pf;
 	ti->num_flush_bios = 1;
+	ti->num_provision_bios = 1;
+	ti->provision_supported = true;

 	/*
 	 * Only need to enable discards if the pool should pass
@@ -4053,6 +4134,7 @@ static void pool_io_hints(struct dm_target *ti, struct queue_limits *limits)
 		blk_limits_io_opt(limits, pool->sectors_per_block << SECTOR_SHIFT);
 	}

+
 	/*
 	 * pt->adjusted_pf is a staging area for the actual features to use.
 	 * They get transferred to the live pool in bind_control_target()
@@ -4243,6 +4325,9 @@ static int thin_ctr(struct dm_target *ti, unsigned argc, char **argv)
 		ti->num_discard_bios = 1;
 	}

+	ti->num_provision_bios = 1;
+	ti->provision_supported = true;
+
 	mutex_unlock(&dm_thin_pool_table.mutex);

 	spin_lock_irq(&tc->pool->lock);
@@ -4457,6 +4542,7 @@ static void thin_io_hints(struct dm_target *ti, struct queue_limits *limits)

 	limits->discard_granularity = pool->sectors_per_block << SECTOR_SHIFT;
 	limits->max_discard_sectors = 2048 * 1024 * 16; /* 16G */
+	limits->max_provision_sectors = 2048 * 1024 * 16; /* 16G */
 }

 static struct target_type thin_target = {

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index e1ea3a7bd9d9..4d19bae9da4a 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1587,6 +1587,7 @@ static bool is_abnormal_io(struct bio *bio)
 	case REQ_OP_DISCARD:
 	case REQ_OP_SECURE_ERASE:
 	case REQ_OP_WRITE_ZEROES:
+	case REQ_OP_PROVISION:
 		return true;
 	default:
 		break;
@@ -1611,6 +1612,9 @@ static blk_status_t __process_abnormal_io(struct clone_info *ci,
 	case REQ_OP_WRITE_ZEROES:
 		num_bios = ti->num_write_zeroes_bios;
 		break;
+	case REQ_OP_PROVISION:
+		num_bios = ti->num_provision_bios;
+		break;
 	default:
 		break;
 	}

diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 04c6acf7faaa..b4d97d5d75b8 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -333,6 +333,12 @@ struct dm_target {
 	 */
 	unsigned num_write_zeroes_bios;

+	/*
+	 * The number of PROVISION bios that will be submitted to the target.
+	 * The bio number can be accessed with dm_bio_get_target_bio_nr.
+	 */
+	unsigned num_provision_bios;
+
 	/*
 	 * The minimum number of extra bytes allocated in each io for the
 	 * target to use.
@@ -357,6 +363,11 @@ struct dm_target {
 	 */
 	bool discards_supported:1;

+	/* Set if this target needs to receive provision requests regardless
+	 * of whether or not its underlying devices have support.
+	 */
+	bool provision_supported:1;
+
 	/*
 	 * Set if we need to limit the number of in-flight bios when swapping.
 	 */
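For illustration, a sketch of how a stacked target opts in, mirroring the
dm-linear and dm-thin hunks above; mytgt_ctr() is a hypothetical constructor:

```c
static int mytgt_ctr(struct dm_target *ti, unsigned int argc, char **argv)
{
	/* forward one REQ_OP_PROVISION clone per target to the map callback */
	ti->num_provision_bios = 1;

	/*
	 * Ask for provision bios even if the underlying devices lack
	 * REQ_OP_PROVISION support (as dm-thin does). Without this, support
	 * is derived from bdev_max_provision_sectors() on the table's
	 * devices via iterate_devices.
	 */
	ti->provision_supported = true;

	return 0;
}
```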
Wong" Subject: [PATCH v2 3/7] fs: Introduce FALLOC_FL_PROVISION Date: Thu, 29 Dec 2022 00:12:48 -0800 Message-Id: <20221229081252.452240-4-sarthakkukreti@chromium.org> X-Mailer: git-send-email 2.39.0.314.g84b9a713c41-goog In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org> References: <20221229081252.452240-1-sarthakkukreti@chromium.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: linux-block@vger.kernel.org FALLOC_FL_PROVISION is a new fallocate() allocation mode that sends a hint to (supported) thinly provisioned block devices to allocate space for the given range of sectors via REQ_OP_PROVISION. The man pages for both fallocate(2) and posix_fallocate(3) describe the default allocation mode as: ``` The default operation (i.e., mode is zero) of fallocate() allocates the disk space within the range specified by offset and len. ... subsequent writes to bytes in the specified range are guaranteed not to fail because of lack of disk space. ``` For thinly provisioned storage constructs (dm-thin, filesystems on sparse files), the term 'disk space' is overloaded and can either mean the apparent disk space in the filesystem/thin logical volume or the true disk space that will be utilized on the underlying non-sparse allocation layer. The use of a separate mode allows us to cleanly disambiguate whether fallocate() causes allocation only at the current layer (default mode) or whether it propagates allocations to underlying layers (provision mode) for thinly provisioned filesystems/ block devices. For devices that do not support REQ_OP_PROVISION, both these allocation modes will be equivalent. Given the performance cost of sending provision requests to the underlying layers, keeping the default mode as-is allows users to preserve existing behavior. Signed-off-by: Sarthak Kukreti --- block/fops.c | 15 +++++++++++---- include/linux/falloc.h | 3 ++- include/uapi/linux/falloc.h | 8 ++++++++ 3 files changed, 21 insertions(+), 5 deletions(-) diff --git a/block/fops.c b/block/fops.c index 50d245e8c913..01bde561e1e2 100644 --- a/block/fops.c +++ b/block/fops.c @@ -598,7 +598,8 @@ static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to) #define BLKDEV_FALLOC_FL_SUPPORTED \ (FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE | \ - FALLOC_FL_ZERO_RANGE | FALLOC_FL_NO_HIDE_STALE) + FALLOC_FL_ZERO_RANGE | FALLOC_FL_NO_HIDE_STALE | \ + FALLOC_FL_PROVISION) static long blkdev_fallocate(struct file *file, int mode, loff_t start, loff_t len) @@ -634,9 +635,11 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start, filemap_invalidate_lock(inode->i_mapping); /* Invalidate the page cache, including dirty pages. 
 	 */
-	error = truncate_bdev_range(bdev, file->f_mode, start, end);
-	if (error)
-		goto fail;
+	if (mode != FALLOC_FL_PROVISION) {
+		error = truncate_bdev_range(bdev, file->f_mode, start, end);
+		if (error)
+			goto fail;
+	}

 	switch (mode) {
 	case FALLOC_FL_ZERO_RANGE:
@@ -654,6 +657,10 @@ static long blkdev_fallocate(struct file *file, int mode, loff_t start,
 		error = blkdev_issue_discard(bdev, start >> SECTOR_SHIFT,
 					     len >> SECTOR_SHIFT, GFP_KERNEL);
 		break;
+	case FALLOC_FL_PROVISION:
+		error = blkdev_issue_provision(bdev, start >> SECTOR_SHIFT,
+					       len >> SECTOR_SHIFT, GFP_KERNEL);
+		break;
 	default:
 		error = -EOPNOTSUPP;
 	}

diff --git a/include/linux/falloc.h b/include/linux/falloc.h
index f3f0b97b1675..b9a40a61a59b 100644
--- a/include/linux/falloc.h
+++ b/include/linux/falloc.h
@@ -30,7 +30,8 @@ struct space_resv {
 					 FALLOC_FL_COLLAPSE_RANGE |	\
 					 FALLOC_FL_ZERO_RANGE |		\
 					 FALLOC_FL_INSERT_RANGE |	\
-					 FALLOC_FL_UNSHARE_RANGE)
+					 FALLOC_FL_UNSHARE_RANGE |	\
+					 FALLOC_FL_PROVISION)

 /* on ia32 l_start is on a 32-bit boundary */
 #if defined(CONFIG_X86_64)

diff --git a/include/uapi/linux/falloc.h b/include/uapi/linux/falloc.h
index 51398fa57f6c..2d323d113eed 100644
--- a/include/uapi/linux/falloc.h
+++ b/include/uapi/linux/falloc.h
@@ -77,4 +77,12 @@
  */
 #define FALLOC_FL_UNSHARE_RANGE		0x40

+/*
+ * FALLOC_FL_PROVISION acts as a hint for thinly provisioned devices to
+ * allocate blocks for the range/EOF.
+ *
+ * FALLOC_FL_PROVISION can only be used with allocate-mode fallocate.
+ */
+#define FALLOC_FL_PROVISION		0x80
+
 #endif /* _UAPI_FALLOC_H_ */
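For illustration, a userspace sketch of the new mode against a thinly
provisioned block device; the device path is hypothetical, and the fallback
define covers headers that predate this patch:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/falloc.h>

#ifndef FALLOC_FL_PROVISION
#define FALLOC_FL_PROVISION 0x80	/* value from the uapi hunk above */
#endif

int main(void)
{
	int fd = open("/dev/dm-0", O_RDWR);	/* hypothetical dm-thin volume */

	if (fd < 0)
		return 1;

	/* ask the device to back the first 16 MiB with allocated space */
	if (fallocate(fd, FALLOC_FL_PROVISION, 0, 16 << 20) < 0)
		perror("fallocate(FALLOC_FL_PROVISION)");

	close(fd);
	return 0;
}
```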
From patchwork Thu Dec 29 08:12:49 2022
X-Patchwork-Submitter: Sarthak Kukreti
X-Patchwork-Id: 13083348
From: Sarthak Kukreti
Subject: [PATCH v2 4/7] loop: Add support for provision requests
Date: Thu, 29 Dec 2022 00:12:49 -0800
Message-Id: <20221229081252.452240-5-sarthakkukreti@chromium.org>
In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org>

Add support for provision requests to loopback devices. A loop device
configures provision support based on whether its backing block
device/file can support the provision request; upon receiving a
provision bio, it maps the request onto the backing device/storage.
Signed-off-by: Sarthak Kukreti
---
 drivers/block/loop.c | 42 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 42 insertions(+)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 1518a6423279..64ebb0d60c0e 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -327,6 +327,24 @@ static int lo_fallocate(struct loop_device *lo, struct request *rq, loff_t pos,
 	return ret;
 }

+static int lo_req_provision(struct loop_device *lo, struct request *rq, loff_t pos)
+{
+	struct file *file = lo->lo_backing_file;
+	struct request_queue *q = lo->lo_queue;
+	int ret;
+
+	if (!q->limits.max_provision_sectors) {
+		ret = -EOPNOTSUPP;
+		goto out;
+	}
+
+	ret = file->f_op->fallocate(file, FALLOC_FL_PROVISION, pos, blk_rq_bytes(rq));
+	if (unlikely(ret && ret != -EINVAL && ret != -EOPNOTSUPP))
+		ret = -EIO;
+ out:
+	return ret;
+}
+
 static int lo_req_flush(struct loop_device *lo, struct request *rq)
 {
 	int ret = vfs_fsync(lo->lo_backing_file, 0);
@@ -488,6 +506,8 @@ static int do_req_filebacked(struct loop_device *lo, struct request *rq)
 				FALLOC_FL_PUNCH_HOLE);
 	case REQ_OP_DISCARD:
 		return lo_fallocate(lo, rq, pos, FALLOC_FL_PUNCH_HOLE);
+	case REQ_OP_PROVISION:
+		return lo_req_provision(lo, rq, pos);
 	case REQ_OP_WRITE:
 		if (cmd->use_aio)
 			return lo_rw_aio(lo, cmd, pos, ITER_SOURCE);
@@ -754,6 +774,25 @@ static void loop_sysfs_exit(struct loop_device *lo)
 				   &loop_attribute_group);
 }

+static void loop_config_provision(struct loop_device *lo)
+{
+	struct file *file = lo->lo_backing_file;
+	struct inode *inode = file->f_mapping->host;
+
+	/*
+	 * If the backing device is a block device, mirror its provisioning
+	 * capability.
+	 */
+	if (S_ISBLK(inode->i_mode)) {
+		blk_queue_max_provision_sectors(lo->lo_queue,
+			bdev_max_provision_sectors(I_BDEV(inode)));
+	} else if (file->f_op->fallocate) {
+		blk_queue_max_provision_sectors(lo->lo_queue, UINT_MAX >> 9);
+	} else {
+		blk_queue_max_provision_sectors(lo->lo_queue, 0);
+	}
+}
+
 static void loop_config_discard(struct loop_device *lo)
 {
 	struct file *file = lo->lo_backing_file;
@@ -1092,6 +1131,7 @@ static int loop_configure(struct loop_device *lo, fmode_t mode,
 	blk_queue_io_min(lo->lo_queue, bsize);

 	loop_config_discard(lo);
+	loop_config_provision(lo);
 	loop_update_rotational(lo);
 	loop_update_dio(lo);
 	loop_sysfs_init(lo);
@@ -1304,6 +1344,7 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
 	}

 	loop_config_discard(lo);
+	loop_config_provision(lo);

 	/* update dio if lo_offset or transfer is changed */
 	__loop_update_dio(lo, lo->use_dio);
@@ -1824,6 +1865,7 @@ static blk_status_t loop_queue_rq(struct blk_mq_hw_ctx *hctx,
 	case REQ_OP_FLUSH:
 	case REQ_OP_DISCARD:
 	case REQ_OP_WRITE_ZEROES:
+	case REQ_OP_PROVISION:
 		cmd->use_aio = false;
 		break;
 	default:
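For illustration, a sketch of the loop path end to end: attach a loop device
to a sparse backing file, then provision through it. Paths are hypothetical
and error handling is omitted for brevity:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/falloc.h>
#include <linux/loop.h>

#ifndef FALLOC_FL_PROVISION
#define FALLOC_FL_PROVISION 0x80
#endif

int main(void)
{
	int backing = open("/var/tmp/sparse.img", O_RDWR);  /* hypothetical */
	int ctl = open("/dev/loop-control", O_RDWR);
	int nr = ioctl(ctl, LOOP_CTL_GET_FREE);
	char path[32];
	int loopfd;

	snprintf(path, sizeof(path), "/dev/loop%d", nr);
	loopfd = open(path, O_RDWR);
	ioctl(loopfd, LOOP_SET_FD, backing);	/* attach the backing file */

	/*
	 * loop_config_provision() sees a regular file with ->fallocate, so
	 * this request is forwarded by lo_req_provision() as
	 * fallocate(FALLOC_FL_PROVISION) on the backing file.
	 */
	if (fallocate(loopfd, FALLOC_FL_PROVISION, 0, 4 << 20) < 0)
		perror("fallocate");

	ioctl(loopfd, LOOP_CLR_FD, 0);
	return 0;
}
```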
From patchwork Thu Dec 29 08:12:50 2022
X-Patchwork-Submitter: Sarthak Kukreti
X-Patchwork-Id: 13083350
From: Sarthak Kukreti
Subject: [PATCH v2 5/7] ext4: Add support for FALLOC_FL_PROVISION
Date: Thu, 29 Dec 2022 00:12:50 -0800
Message-Id: <20221229081252.452240-6-sarthakkukreti@chromium.org>
In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org>

Once ext4 has finished mapping blocks for an fallocate() request, send
an FALLOC_FL_PROVISION request to the underlying layer to ensure that
space is provisioned for the newly allocated extent or indirect blocks.

There is an expected performance degradation for fallocate() calls made
with this flag, due to the extra REQ_OP_PROVISION requests sent to the
underlying storage.
Signed-off-by: Sarthak Kukreti
---
 fs/ext4/ext4.h         |  2 ++
 fs/ext4/extents.c      | 15 ++++++++++++++-
 fs/ext4/indirect.c     |  9 +++++++++
 include/linux/blkdev.h | 11 +++++++++++
 4 files changed, 36 insertions(+), 1 deletion(-)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 140e1eb300d1..49832e90b62f 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -673,6 +673,8 @@ enum {
 #define EXT4_GET_BLOCKS_IO_SUBMIT		0x0400
 	/* Caller is in the atomic contex, find extent if it has been cached */
 #define EXT4_GET_BLOCKS_CACHED_NOWAIT		0x0800
+	/* Provision blocks on underlying storage */
+#define EXT4_GET_BLOCKS_PROVISION		0x1000

 /*
  * The bit position of these flags must not overlap with any of the

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 9de1c9d1a13d..2e64a9211792 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4361,6 +4361,13 @@ int ext4_ext_map_blocks(handle_t *handle, struct inode *inode,
 		}
 	}

+	/* Attempt to provision blocks on underlying storage */
+	if (flags & EXT4_GET_BLOCKS_PROVISION) {
+		err = sb_issue_provision(inode->i_sb, pblk, ar.len, GFP_NOFS);
+		if (err)
+			goto out;
+	}
+
 	/*
 	 * Cache the extent and update transaction to commit on fdatasync only
 	 * when it is _not_ an unwritten extent.
@@ -4694,7 +4701,7 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	/* Return error if mode is not supported */
 	if (mode & ~(FALLOC_FL_KEEP_SIZE | FALLOC_FL_PUNCH_HOLE |
 		     FALLOC_FL_COLLAPSE_RANGE | FALLOC_FL_ZERO_RANGE |
-		     FALLOC_FL_INSERT_RANGE))
+		     FALLOC_FL_INSERT_RANGE | FALLOC_FL_PROVISION))
 		return -EOPNOTSUPP;

 	inode_lock(inode);
@@ -4754,6 +4761,12 @@ long ext4_fallocate(struct file *file, int mode, loff_t offset, loff_t len)
 	if (ret)
 		goto out;

+	/*
+	 * Ensure that preallocation provisions the blocks on the underlying
+	 * storage device.
+	 */
+	if (mode & FALLOC_FL_PROVISION)
+		flags |= EXT4_GET_BLOCKS_PROVISION;
+
 	ret = ext4_alloc_file_blocks(file, lblk, max_blocks, new_size, flags);
 	if (ret)
 		goto out;

diff --git a/fs/ext4/indirect.c b/fs/ext4/indirect.c
index c68bebe7ff4b..a8065aae7563 100644
--- a/fs/ext4/indirect.c
+++ b/fs/ext4/indirect.c
@@ -647,6 +647,15 @@ int ext4_ind_map_blocks(handle_t *handle, struct inode *inode,
 	if (err)
 		goto cleanup;

+	/* Attempt to provision blocks on underlying storage */
+	if (flags & EXT4_GET_BLOCKS_PROVISION) {
+		err = sb_issue_provision(inode->i_sb,
+					 le32_to_cpu(chain[depth-1].key),
+					 ar.len, GFP_NOFS);
+		if (err)
+			goto out;
+	}
+
 	map->m_flags |= EXT4_MAP_NEW;

 	ext4_update_inode_fsync_trans(handle, inode, 1);

diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index f1abc7b43e25..b2e3244e9f3d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1093,6 +1093,17 @@ static inline int sb_issue_zeroout(struct super_block *sb, sector_t block,
 				    gfp_mask, 0);
 }

+static inline int sb_issue_provision(struct super_block *sb, sector_t block,
+		sector_t nr_blocks, gfp_t gfp_mask)
+{
+	return blkdev_issue_provision(sb->s_bdev,
+				      block << (sb->s_blocksize_bits -
+						SECTOR_SHIFT),
+				      nr_blocks << (sb->s_blocksize_bits -
+						    SECTOR_SHIFT),
+				      gfp_mask);
+}
+
 static inline bool bdev_is_partition(struct block_device *bdev)
 {
 	return bdev->bd_partno;
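For illustration, the file-level counterpart on ext4; the path is hypothetical
and assumes the filesystem sits on a provision-capable device such as dm-thin:

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <linux/falloc.h>

#ifndef FALLOC_FL_PROVISION
#define FALLOC_FL_PROVISION 0x80
#endif

int main(void)
{
	int fd = open("/mnt/thin/db.img", O_RDWR | O_CREAT, 0644);

	if (fd < 0)
		return 1;

	/*
	 * Allocate 1 GiB; with this patch ext4 also issues
	 * REQ_OP_PROVISION for the new extents, so the layer below
	 * backs them with real space.
	 */
	if (fallocate(fd, FALLOC_FL_PROVISION, 0, 1L << 30) < 0)
		perror("fallocate(FALLOC_FL_PROVISION)");

	close(fd);
	return 0;
}
```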
From patchwork Thu Dec 29 08:12:51 2022
X-Patchwork-Submitter: Sarthak Kukreti
X-Patchwork-Id: 13083352
From: Sarthak Kukreti
Subject: [PATCH v2 6/7] ext4: Add mount option for provisioning blocks during allocations
Date: Thu, 29 Dec 2022 00:12:51 -0800
Message-Id: <20221229081252.452240-7-sarthakkukreti@chromium.org>
In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org>

Add a mount option that sets the default provisioning mode for all
files within the filesystem.

Signed-off-by: Sarthak Kukreti
---
 fs/ext4/ext4.h    | 1 +
 fs/ext4/extents.c | 7 +++++++
 fs/ext4/super.c   | 7 +++++++
 3 files changed, 15 insertions(+)

diff --git a/fs/ext4/ext4.h b/fs/ext4/ext4.h
index 49832e90b62f..29cab2e2ea20 100644
--- a/fs/ext4/ext4.h
+++ b/fs/ext4/ext4.h
@@ -1269,6 +1269,7 @@ struct ext4_inode_info {
 #define EXT4_MOUNT2_MB_OPTIMIZE_SCAN	0x00000080 /* Optimize group
 						    * scanning in mballoc
 						    */
+#define EXT4_MOUNT2_PROVISION		0x00000100 /* Provision while allocating file blocks */

 #define clear_opt(sb, opt)		EXT4_SB(sb)->s_mount_opt &= \
 						~EXT4_MOUNT_##opt

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index 2e64a9211792..a73f44264fe2 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4441,6 +4441,13 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset,
 	unsigned int credits;
 	loff_t epos;

+	/*
+	 * Attempt to provision file blocks if the filesystem is mounted
+	 * with the provision option.
+	 */
+	if (test_opt2(inode->i_sb, PROVISION))
+		flags |= EXT4_GET_BLOCKS_PROVISION;
+
 	BUG_ON(!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS));
 	map.m_lblk = offset;
 	map.m_len = len;

diff --git a/fs/ext4/super.c b/fs/ext4/super.c
index 260c1b3e3ef2..5bc376f6a6f0 100644
--- a/fs/ext4/super.c
+++ b/fs/ext4/super.c
@@ -1591,6 +1591,7 @@ enum {
 	Opt_max_dir_size_kb, Opt_nojournal_checksum, Opt_nombcache,
 	Opt_no_prefetch_block_bitmaps, Opt_mb_optimize_scan,
 	Opt_errors, Opt_data, Opt_data_err, Opt_jqfmt, Opt_dax_type,
+	Opt_provision, Opt_noprovision,
 #ifdef CONFIG_EXT4_DEBUG
 	Opt_fc_debug_max_replay, Opt_fc_debug_force
 #endif
@@ -1737,6 +1738,8 @@ static const struct fs_parameter_spec ext4_param_specs[] = {
 	fsparam_flag	("reservation",		Opt_removed),	/* mount option from ext2/3 */
 	fsparam_flag	("noreservation",	Opt_removed),	/* mount option from ext2/3 */
 	fsparam_u32	("journal",		Opt_removed),	/* mount option from ext2/3 */
+	fsparam_flag	("provision",		Opt_provision),
+	fsparam_flag	("noprovision",		Opt_noprovision),
 	{}
 };

@@ -1826,6 +1829,8 @@ static const struct mount_opts {
 	{Opt_nombcache, EXT4_MOUNT_NO_MBCACHE, MOPT_SET},
 	{Opt_no_prefetch_block_bitmaps, EXT4_MOUNT_NO_PREFETCH_BLOCK_BITMAPS,
 	 MOPT_SET},
+	{Opt_provision, EXT4_MOUNT2_PROVISION, MOPT_SET | MOPT_2},
+	{Opt_noprovision, EXT4_MOUNT2_PROVISION, MOPT_CLEAR | MOPT_2},
 #ifdef CONFIG_EXT4_DEBUG
 	{Opt_fc_debug_force, EXT4_MOUNT2_JOURNAL_FAST_COMMIT,
 	 MOPT_SET | MOPT_2 | MOPT_EXT4_ONLY},
@@ -2977,6 +2982,8 @@ static int _ext4_show_options(struct seq_file *seq, struct super_block *sb,
 			SEQ_OPTS_PUTS("dax=never");
 		} else if (test_opt2(sb, DAX_INODE)) {
 			SEQ_OPTS_PUTS("dax=inode");
+		} else if (test_opt2(sb, PROVISION)) {
+			SEQ_OPTS_PUTS("provision");
 		}

 	if (sbi->s_groups_count >= MB_DEFAULT_LINEAR_SCAN_THRESHOLD &&
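For illustration, enabling the mode for a whole filesystem via mount(2),
equivalent to `mount -o provision`; the device and mountpoint are
hypothetical:

```c
#include <stdio.h>
#include <sys/mount.h>

int main(void)
{
	/* subsequent fallocate() calls on this mount provision by default */
	if (mount("/dev/mapper/thin0", "/mnt/thin", "ext4", 0, "provision"))
		perror("mount");
	return 0;
}
```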
From patchwork Thu Dec 29 08:12:52 2022
X-Patchwork-Submitter: Sarthak Kukreti
X-Patchwork-Id: 13083351
From: Sarthak Kukreti
Subject: [PATCH v2 7/7] ext4: Add a per-file provision override xattr
Date: Thu, 29 Dec 2022 00:12:52 -0800
Message-Id: <20221229081252.452240-8-sarthakkukreti@chromium.org>
In-Reply-To: <20221229081252.452240-1-sarthakkukreti@chromium.org>

Add a per-file provision override that allows selected files to
override the per-mount setting for provisioning blocks on allocation.
This provides a mechanism for mounts that use provision to replicate
the current fallocate() behavior and only reserve space at the
filesystem level.

Signed-off-by: Sarthak Kukreti
---
 fs/ext4/extents.c | 32 ++++++++++++++++++++++++++++++++
 fs/ext4/xattr.h   |  1 +
 2 files changed, 33 insertions(+)

diff --git a/fs/ext4/extents.c b/fs/ext4/extents.c
index a73f44264fe2..9861115681b3 100644
--- a/fs/ext4/extents.c
+++ b/fs/ext4/extents.c
@@ -4428,6 +4428,26 @@ int ext4_ext_truncate(handle_t *handle, struct inode *inode)
 	return err;
 }

+static int ext4_file_provision_support(struct inode *inode)
+{
+	char provision;
+	int ret = ext4_xattr_get(inode, EXT4_XATTR_INDEX_TRUSTED,
+				 EXT4_XATTR_NAME_PROVISION_POLICY,
+				 &provision, 1);
+
+	if (ret < 0)
+		return ret;
+
+	switch (provision) {
+	case 'y':
+		return 1;
+	case 'n':
+		return 0;
+	default:
+		return -EINVAL;
+	}
+}
+
 static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset,
 				  ext4_lblk_t len, loff_t new_size,
 				  int flags)
@@ -4440,12 +4460,24 @@ static int ext4_alloc_file_blocks(struct file *file, ext4_lblk_t offset,
 	struct ext4_map_blocks map;
 	unsigned int credits;
 	loff_t epos;
+	bool provision = false;
+	int file_provision_override = -1;

 	/*
 	 * Attempt to provision file blocks if the filesystem is mounted
 	 * with the provision option.
 	 */
 	if (test_opt2(inode->i_sb, PROVISION))
+		provision = true;
+
+	/*
+	 * Use the file-specific override, if available.
+	 */
+	file_provision_override = ext4_file_provision_support(inode);
+	if (file_provision_override >= 0)
+		provision &= file_provision_override;
+
+	if (provision)
 		flags |= EXT4_GET_BLOCKS_PROVISION;

 	BUG_ON(!ext4_test_inode_flag(inode, EXT4_INODE_EXTENTS));

diff --git a/fs/ext4/xattr.h b/fs/ext4/xattr.h
index 824faf0b15a8..69e97f853b0c 100644
--- a/fs/ext4/xattr.h
+++ b/fs/ext4/xattr.h
@@ -140,6 +140,7 @@ extern const struct xattr_handler ext4_xattr_security_handler;
 extern const struct xattr_handler ext4_xattr_hurd_handler;

 #define EXT4_XATTR_NAME_ENCRYPTION_CONTEXT "c"
+#define EXT4_XATTR_NAME_PROVISION_POLICY "provision"

 /*
  * The EXT4_STATE_NO_EXPAND is overloaded and used for two purposes.
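For illustration, setting the override from userspace: EXT4_XATTR_INDEX_TRUSTED
plus the name above resolves to trusted.provision, the value is a single 'y'
or 'n' byte, and writing trusted.* xattrs requires CAP_SYS_ADMIN; the file
path is hypothetical:

```c
#include <stdio.h>
#include <sys/xattr.h>

int main(void)
{
	/* opt this file out of provisioning on a provision-mounted fs */
	if (setxattr("/mnt/thin/scratch.img", "trusted.provision", "n", 1, 0))
		perror("setxattr");
	return 0;
}
```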