From patchwork Wed Nov 23 05:58:18 2022
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13053098
From: Nitesh Shetty
To: axboe@kernel.dk, agk@redhat.com, snitzer@kernel.org, dm-devel@redhat.com,
 kbusch@kernel.org, hch@lst.de, sagi@grimberg.me, james.smart@broadcom.com,
 kch@nvidia.com, damien.lemoal@opensource.wdc.com, naohiro.aota@wdc.com,
 jth@kernel.org, viro@zeniv.linux.org.uk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com,
 nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty,
 Hannes Reinecke
Subject: [PATCH v5 01/10] block: Introduce queue limits for copy-offload support
Date: Wed, 23 Nov 2022 11:28:18 +0530
Message-Id: <20221123055827.26996-2-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>
References: <20221123055827.26996-1-nj.shetty@samsung.com>
Add device limits as sysfs entries:
 - copy_offload (RW)
 - copy_max_bytes (RW)
 - copy_max_bytes_hw (RO)

The above limits help to split the copy payload in the block layer.
copy_offload: selects copy offload (1) or emulation (0).
copy_max_bytes: maximum total length of a copy in a single payload.
copy_max_bytes_hw: reflects the maximum limit supported by the device.

Reviewed-by: Hannes Reinecke
Signed-off-by: Nitesh Shetty
Signed-off-by: Kanchan Joshi
Signed-off-by: Anuj Gupta
---
 Documentation/ABI/stable/sysfs-block | 36 ++++++++++++++++
 block/blk-settings.c                 | 24 +++++++++++
 block/blk-sysfs.c                    | 64 ++++++++++++++++++++++++++++
 include/linux/blkdev.h               | 12 ++++++
 include/uapi/linux/fs.h              |  3 ++
 5 files changed, 139 insertions(+)

diff --git a/Documentation/ABI/stable/sysfs-block b/Documentation/ABI/stable/sysfs-block
index cd14ecb3c9a5..e0c9be009706 100644
--- a/Documentation/ABI/stable/sysfs-block
+++ b/Documentation/ABI/stable/sysfs-block
@@ -155,6 +155,42 @@ Description:
 		last zone of the device which may be smaller.
 
 
+What:		/sys/block/<disk>/queue/copy_offload
+Date:		November 2022
+Contact:	linux-block@vger.kernel.org
+Description:
+		[RW] When read, this file shows whether offloading copy to the
+		device is enabled (1) or disabled (0). Writing '0' to this
+		file will disable offloading copies for this device.
+		Writing any '1' value will enable this feature. If the device
+		does not support offloading, then writing 1 will result in an
+		error.
+
+
+What:		/sys/block/<disk>/queue/copy_max_bytes
+Date:		November 2022
+Contact:	linux-block@vger.kernel.org
+Description:
+		[RW] While 'copy_max_bytes_hw' is the hardware limit for the
+		device, the 'copy_max_bytes' setting is the software limit.
+		Setting this value lower will make Linux issue smaller
+		copies from the block layer.
+
+
+What:		/sys/block/<disk>/queue/copy_max_bytes_hw
+Date:		November 2022
+Contact:	linux-block@vger.kernel.org
+Description:
+		[RO] Devices that support offloading copy functionality may have
+		internal limits on the number of bytes that can be offloaded
+		in a single operation. The `copy_max_bytes_hw`
+		parameter is set by the device driver to the maximum number of
+		bytes that can be copied in a single operation. Copy
+		requests issued to the device must not exceed this limit.
+		A value of 0 means that the device does not
+		support copy offload.
+
+
 What:		/sys/block/<disk>/queue/crypto/
 Date:		February 2022
 Contact:	linux-block@vger.kernel.org
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 0477c4d527fe..ca6f15a70fdc 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -58,6 +58,8 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->zoned = BLK_ZONED_NONE;
 	lim->zone_write_granularity = 0;
 	lim->dma_alignment = 511;
+	lim->max_copy_sectors_hw = 0;
+	lim->max_copy_sectors = 0;
 }
 
 /**
@@ -81,6 +83,8 @@ void blk_set_stacking_limits(struct queue_limits *lim)
 	lim->max_dev_sectors = UINT_MAX;
 	lim->max_write_zeroes_sectors = UINT_MAX;
 	lim->max_zone_append_sectors = UINT_MAX;
+	lim->max_copy_sectors_hw = ULONG_MAX;
+	lim->max_copy_sectors = ULONG_MAX;
 }
 EXPORT_SYMBOL(blk_set_stacking_limits);
 
@@ -177,6 +181,22 @@ void blk_queue_max_discard_sectors(struct request_queue *q,
 }
 EXPORT_SYMBOL(blk_queue_max_discard_sectors);
 
+/**
+ * blk_queue_max_copy_sectors_hw - set max sectors for a single copy payload
+ * @q:  the request queue for the device
+ * @max_copy_sectors: maximum number of sectors to copy
+ **/
+void blk_queue_max_copy_sectors_hw(struct request_queue *q,
+		unsigned int max_copy_sectors)
+{
+	if (max_copy_sectors >= MAX_COPY_TOTAL_LENGTH)
+		max_copy_sectors = MAX_COPY_TOTAL_LENGTH;
+
+	q->limits.max_copy_sectors_hw = max_copy_sectors;
+	q->limits.max_copy_sectors = max_copy_sectors;
+}
+EXPORT_SYMBOL_GPL(blk_queue_max_copy_sectors_hw);
+
 /**
  * blk_queue_max_secure_erase_sectors - set max sectors for a secure erase
  * @q:  the request queue for the device
@@ -572,6 +592,10 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 	t->max_segment_size = min_not_zero(t->max_segment_size,
 					   b->max_segment_size);
 
+	t->max_copy_sectors = min(t->max_copy_sectors, b->max_copy_sectors);
+	t->max_copy_sectors_hw = min(t->max_copy_sectors_hw,
+						b->max_copy_sectors_hw);
+
 	t->misaligned |= b->misaligned;
 
 	alignment = queue_limit_alignment_offset(b, start);
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index 02e94c4beff1..903285b04029 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -212,6 +212,63 @@ static ssize_t queue_discard_zeroes_data_show(struct request_queue *q, char *page)
 	return queue_var_show(0, page);
 }
 
+static ssize_t queue_copy_offload_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(blk_queue_copy(q), page);
+}
+
+static ssize_t queue_copy_offload_store(struct request_queue *q,
+				       const char *page, size_t count)
+{
+	s64 copy_offload;
+	ssize_t ret = queue_var_store64(&copy_offload, page);
+
+	if (ret < 0)
+		return ret;
+
+	if (copy_offload && !q->limits.max_copy_sectors_hw)
+		return -EINVAL;
+
+	if (copy_offload)
+		blk_queue_flag_set(QUEUE_FLAG_COPY, q);
+	else
+		blk_queue_flag_clear(QUEUE_FLAG_COPY, q);
+
+	return count;
+}
+
+static ssize_t queue_copy_max_hw_show(struct request_queue *q, char *page)
+{
+	return sprintf(page, "%llu\n", (unsigned long long)
+			q->limits.max_copy_sectors_hw << SECTOR_SHIFT);
+}
+
+static ssize_t queue_copy_max_show(struct request_queue *q, char *page)
+{
+	return sprintf(page, "%llu\n", (unsigned long long)
+			q->limits.max_copy_sectors << SECTOR_SHIFT);
+}
+
+static ssize_t queue_copy_max_store(struct request_queue *q,
+				   const char *page, size_t count)
+{
+	s64 max_copy;
+	ssize_t ret = queue_var_store64(&max_copy, page);
+
+	if (ret < 0)
+		return ret;
+
+	if (max_copy & (queue_logical_block_size(q) - 1))
+		return -EINVAL;
+
+	max_copy >>= SECTOR_SHIFT;
+	if (max_copy > q->limits.max_copy_sectors_hw)
+		max_copy = q->limits.max_copy_sectors_hw;
+
+	q->limits.max_copy_sectors = max_copy;
+	return count;
+}
+
 static ssize_t queue_write_same_max_show(struct request_queue *q, char *page)
 {
 	return queue_var_show(0, page);
@@ -604,6 +661,10 @@ QUEUE_RO_ENTRY(queue_nr_zones, "nr_zones");
 QUEUE_RO_ENTRY(queue_max_open_zones, "max_open_zones");
 QUEUE_RO_ENTRY(queue_max_active_zones, "max_active_zones");
 
+QUEUE_RW_ENTRY(queue_copy_offload, "copy_offload");
+QUEUE_RO_ENTRY(queue_copy_max_hw, "copy_max_bytes_hw");
+QUEUE_RW_ENTRY(queue_copy_max, "copy_max_bytes");
+
 QUEUE_RW_ENTRY(queue_nomerges, "nomerges");
 QUEUE_RW_ENTRY(queue_rq_affinity, "rq_affinity");
 QUEUE_RW_ENTRY(queue_poll, "io_poll");
@@ -651,6 +712,9 @@ static struct attribute *queue_attrs[] = {
 	&queue_discard_max_entry.attr,
 	&queue_discard_max_hw_entry.attr,
 	&queue_discard_zeroes_data_entry.attr,
+	&queue_copy_offload_entry.attr,
+	&queue_copy_max_hw_entry.attr,
+	&queue_copy_max_entry.attr,
 	&queue_write_same_max_entry.attr,
 	&queue_write_zeroes_max_entry.attr,
 	&queue_zone_append_max_entry.attr,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a0452ba08e9a..3ac324208f2f 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -302,6 +302,9 @@ struct queue_limits {
 	unsigned int		discard_alignment;
 	unsigned int		zone_write_granularity;
 
+	unsigned long		max_copy_sectors_hw;
+	unsigned long		max_copy_sectors;
+
 	unsigned short		max_segments;
 	unsigned short		max_integrity_segments;
 	unsigned short		max_discard_segments;
@@ -573,6 +576,7 @@ struct request_queue {
 #define QUEUE_FLAG_NOWAIT       29	/* device supports NOWAIT */
 #define QUEUE_FLAG_SQ_SCHED     30	/* single queue style io dispatch */
 #define QUEUE_FLAG_SKIP_TAGSET_QUIESCE	31 /* quiesce_tagset skip the queue*/
+#define QUEUE_FLAG_COPY		32	/* supports copy offload */
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1UL << QUEUE_FLAG_IO_STAT) |		\
 				 (1UL << QUEUE_FLAG_SAME_COMP) |	\
@@ -593,6 +597,7 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 	test_bit(QUEUE_FLAG_STABLE_WRITES, &(q)->queue_flags)
#define blk_queue_io_stat(q)	test_bit(QUEUE_FLAG_IO_STAT, &(q)->queue_flags)
 #define blk_queue_add_random(q)	test_bit(QUEUE_FLAG_ADD_RANDOM, &(q)->queue_flags)
+#define blk_queue_copy(q)	test_bit(QUEUE_FLAG_COPY, &(q)->queue_flags)
 #define blk_queue_zone_resetall(q)	\
 	test_bit(QUEUE_FLAG_ZONE_RESETALL, &(q)->queue_flags)
 #define blk_queue_dax(q)	test_bit(QUEUE_FLAG_DAX, &(q)->queue_flags)
@@ -913,6 +918,8 @@ extern void blk_queue_chunk_sectors(struct request_queue *, unsigned int);
 extern void blk_queue_max_segments(struct request_queue *, unsigned short);
 extern void blk_queue_max_discard_segments(struct request_queue *,
 		unsigned short);
+extern void blk_queue_max_copy_sectors_hw(struct request_queue *q,
+		unsigned int max_copy_sectors);
 void blk_queue_max_secure_erase_sectors(struct request_queue *q,
 		unsigned int max_sectors);
 extern void blk_queue_max_segment_size(struct request_queue *, unsigned int);
@@ -1231,6 +1238,11 @@ static inline unsigned int bdev_discard_granularity(struct block_device *bdev)
 	return bdev_get_queue(bdev)->limits.discard_granularity;
 }
 
+static inline unsigned int bdev_max_copy_sectors(struct block_device *bdev)
+{
+	return bdev_get_queue(bdev)->limits.max_copy_sectors;
+}
+
 static inline unsigned int
 bdev_max_secure_erase_sectors(struct block_device *bdev)
 {
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index b7b56871029c..b3ad173f619c 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -64,6 +64,9 @@ struct fstrim_range {
 	__u64 minlen;
 };
 
+/* maximum total copy length */
+#define MAX_COPY_TOTAL_LENGTH	(1 << 27)
+
 /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */
 #define FILE_DEDUPE_RANGE_SAME		0
 #define FILE_DEDUPE_RANGE_DIFFERS	1
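For illustration, a minimal userspace sketch of driving these sysfs knobs;
the disk name nvme0n1 is an assumption for the example, not part of the
series. Per queue_copy_offload_store() above, writing '1' while
copy_max_bytes_hw is 0 fails with -EINVAL:

#include <fcntl.h>
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	char buf[32] = { 0 };
	int fd;

	/* Read the device-reported hardware limit; 0 means no offload. */
	fd = open("/sys/block/nvme0n1/queue/copy_max_bytes_hw", O_RDONLY);
	if (fd < 0 || read(fd, buf, sizeof(buf) - 1) <= 0)
		return 1;
	close(fd);
	if (strtoull(buf, NULL, 10) == 0) {
		/* Writing '1' to copy_offload would fail with EINVAL. */
		printf("device does not support copy offload\n");
		return 1;
	}

	/* Switch from emulation (0) to offload (1). */
	fd = open("/sys/block/nvme0n1/queue/copy_offload", O_WRONLY);
	if (fd < 0 || write(fd, "1", 1) != 1)
		return 1;
	close(fd);
	return 0;
}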
From patchwork Wed Nov 23 05:58:19 2022
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13053099
From: Nitesh Shetty
To: axboe@kernel.dk, agk@redhat.com, snitzer@kernel.org, dm-devel@redhat.com,
 kbusch@kernel.org, hch@lst.de, sagi@grimberg.me, james.smart@broadcom.com,
 kch@nvidia.com, damien.lemoal@opensource.wdc.com, naohiro.aota@wdc.com,
 jth@kernel.org, viro@zeniv.linux.org.uk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com,
 nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty
Subject: [PATCH v5 02/10] block: Add copy offload support infrastructure
Date: Wed, 23 Nov 2022 11:28:19 +0530
Message-Id: <20221123055827.26996-3-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>
References: <20221123055827.26996-1-nj.shetty@samsung.com>

Introduce blkdev_issue_copy, which takes source and destination bdevs and
an array of (source, destination, copy length) tuples. Introduce the
REQ_COPY copy offload operation flag. A read-write bio pair is created,
with a token as payload, and submitted to the device in order. The read
request populates the token with source-specific information, which is
then passed along with the write request. This design is courtesy of
Mikulas Patocka's token-based copy.

A larger copy is divided, based on the max_copy_sectors limit.

Signed-off-by: Nitesh Shetty
Signed-off-by: Anuj Gupta
---
 block/blk-lib.c           | 358 ++++++++++++++++++++++++++++++++++++++
 block/blk.h               |   2 +
 include/linux/blk_types.h |  44 +++++
 include/linux/blkdev.h    |   3 +
 include/uapi/linux/fs.h   |  15 ++
 5 files changed, 422 insertions(+)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index e59c3069e835..2ce3c872ca49 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -115,6 +115,364 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 }
 EXPORT_SYMBOL(blkdev_issue_discard);
 
+/*
+ * For synchronous copy offload/emulation, wait and process all in-flight BIOs.
+ * This must only be called once all bios have been issued so that the refcount
+ * can only decrease. This just waits for all bios to make it through
+ * bio_copy_*_write_end_io. IO errors are propagated through cio->io_error.
+ */
+static int cio_await_completion(struct cio *cio)
+{
+	int ret = 0;
+
+	atomic_dec(&cio->refcount);
+
+	if (cio->endio)
+		return 0;
+
+	if (atomic_read(&cio->refcount)) {
+		__set_current_state(TASK_UNINTERRUPTIBLE);
+		blk_io_schedule();
+	}
+
+	ret = cio->io_err;
+	kfree(cio);
+
+	return ret;
+}
+
+static void blk_copy_offload_write_end_io(struct bio *bio)
+{
+	struct copy_ctx *ctx = bio->bi_private;
+	struct cio *cio = ctx->cio;
+	sector_t clen;
+	int ri = ctx->range_idx;
+
+	if (bio->bi_status) {
+		cio->io_err = blk_status_to_errno(bio->bi_status);
+		clen = (bio->bi_iter.bi_sector << SECTOR_SHIFT) -
+			cio->ranges[ri].dst;
+		cio->ranges[ri].comp_len = min_t(sector_t, clen,
+				cio->ranges[ri].comp_len);
+	}
+	__free_page(bio->bi_io_vec[0].bv_page);
+	bio_put(bio);
+
+	if (atomic_dec_and_test(&ctx->refcount))
+		kfree(ctx);
+	if (atomic_dec_and_test(&cio->refcount)) {
+		if (cio->endio) {
+			cio->endio(cio->private, cio->io_err);
+			kfree(cio);
+		} else
+			blk_wake_io_task(cio->waiter);
+	}
+}
+
+static void blk_copy_offload_read_end_io(struct bio *read_bio)
+{
+	struct copy_ctx *ctx = read_bio->bi_private;
+	struct cio *cio = ctx->cio;
+	sector_t clen;
+	int ri = ctx->range_idx;
+	unsigned long flags;
+
+	if (read_bio->bi_status) {
+		cio->io_err = blk_status_to_errno(read_bio->bi_status);
+		goto err_rw_bio;
+	}
+
+	/* For a zoned device, we check if the completed bio is the first
+	 * entry in the linked list.
+	 * If yes, we start the worker to submit write bios.
+	 * If not, we just update the status of the bio in the ctx;
+	 * once the worker gets scheduled, it will submit writes for all
+	 * the consecutive REQ_COPY_READ_COMPLETE bios.
+	 */
+	if (bdev_is_zoned(ctx->write_bio->bi_bdev)) {
+		spin_lock_irqsave(&cio->list_lock, flags);
+		ctx->status = REQ_COPY_READ_COMPLETE;
+		if (ctx == list_first_entry(&cio->list,
+					struct copy_ctx, list)) {
+			spin_unlock_irqrestore(&cio->list_lock, flags);
+			schedule_work(&ctx->dispatch_work);
+			goto free_read_bio;
+		}
+		spin_unlock_irqrestore(&cio->list_lock, flags);
+	} else
+		schedule_work(&ctx->dispatch_work);
+
+free_read_bio:
+	bio_put(read_bio);
+
+	return;
+
+err_rw_bio:
+	clen = (read_bio->bi_iter.bi_sector << SECTOR_SHIFT) -
+		cio->ranges[ri].src;
+	cio->ranges[ri].comp_len = min_t(sector_t, clen,
+			cio->ranges[ri].comp_len);
+	__free_page(read_bio->bi_io_vec[0].bv_page);
+	bio_put(ctx->write_bio);
+	bio_put(read_bio);
+	if (atomic_dec_and_test(&ctx->refcount))
+		kfree(ctx);
+	if (atomic_dec_and_test(&cio->refcount)) {
+		if (cio->endio) {
+			cio->endio(cio->private, cio->io_err);
+			kfree(cio);
+		} else
+			blk_wake_io_task(cio->waiter);
+	}
+}
+
+static void blk_copy_dispatch_work_fn(struct work_struct *work)
+{
+	struct copy_ctx *ctx = container_of(work, struct copy_ctx,
+			dispatch_work);
+
+	submit_bio(ctx->write_bio);
+}
+
+static void blk_zoned_copy_dispatch_work_fn(struct work_struct *work)
+{
+	struct copy_ctx *ctx = container_of(work, struct copy_ctx,
+			dispatch_work);
+	struct cio *cio = ctx->cio;
+	unsigned long flags = 0;
+
+	atomic_inc(&cio->refcount);
+	spin_lock_irqsave(&cio->list_lock, flags);
+
+	while (!list_empty(&cio->list)) {
+		ctx = list_first_entry(&cio->list, struct copy_ctx, list);
+
+		if (ctx->status == REQ_COPY_READ_PROGRESS)
+			break;
+
+		atomic_inc(&ctx->refcount);
+		ctx->status = REQ_COPY_WRITE_PROGRESS;
+		spin_unlock_irqrestore(&cio->list_lock, flags);
+		submit_bio(ctx->write_bio);
+		spin_lock_irqsave(&cio->list_lock, flags);
+
+		list_del(&ctx->list);
+		if (atomic_dec_and_test(&ctx->refcount))
+			kfree(ctx);
+	}
+
+	spin_unlock_irqrestore(&cio->list_lock, flags);
+	if (atomic_dec_and_test(&cio->refcount))
+		blk_wake_io_task(cio->waiter);
+}
+
+/*
+ * blk_copy_offload - Use the device's native copy offload feature.
+ * We perform the copy operation by sending 2 bios:
+ * 1. First we send a read bio with the REQ_COPY flag along with a token and
+ * the source and length. Once the read bio reaches the driver layer, the
+ * device driver adds all the source info to the token and does a fake
+ * completion.
+ * 2. Once the read operation completes, we issue a write with the REQ_COPY
+ * flag and the same token. In the driver layer, the token info is used to
+ * form a copy offload command.
+ *
+ * For conventional devices we submit the write bio independently once the
+ * read completes. For zoned devices, reads can complete out of order, so we
+ * maintain a linked list and submit writes in the order the reads were
+ * submitted.
+ */
+static int blk_copy_offload(struct block_device *src_bdev,
+		struct block_device *dst_bdev, struct range_entry *ranges,
+		int nr, cio_iodone_t end_io, void *private, gfp_t gfp_mask)
+{
+	struct cio *cio;
+	struct copy_ctx *ctx;
+	struct bio *read_bio, *write_bio;
+	struct page *token;
+	sector_t src_blk, copy_len, dst_blk;
+	sector_t rem, max_copy_len;
+	int ri = 0, ret = 0;
+	unsigned long flags;
+
+	cio = kzalloc(sizeof(struct cio), GFP_KERNEL);
+	if (!cio)
+		return -ENOMEM;
+	cio->ranges = ranges;
+	atomic_set(&cio->refcount, 1);
+	cio->waiter = current;
+	cio->endio = end_io;
+	cio->private = private;
+	if (bdev_is_zoned(dst_bdev)) {
+		INIT_LIST_HEAD(&cio->list);
+		spin_lock_init(&cio->list_lock);
+	}
+
+	max_copy_len = min(bdev_max_copy_sectors(src_bdev),
+			bdev_max_copy_sectors(dst_bdev)) << SECTOR_SHIFT;
+
+	for (ri = 0; ri < nr; ri++) {
+		cio->ranges[ri].comp_len = ranges[ri].len;
+		src_blk = ranges[ri].src;
+		dst_blk = ranges[ri].dst;
+		for (rem = ranges[ri].len; rem > 0; rem -= copy_len) {
+			copy_len = min(rem, max_copy_len);
+
+			token = alloc_page(gfp_mask);
+			if (unlikely(!token)) {
+				ret = -ENOMEM;
+				goto err_token;
+			}
+
+			ctx = kzalloc(sizeof(struct copy_ctx), gfp_mask);
+			if (!ctx) {
+				ret = -ENOMEM;
+				goto err_ctx;
+			}
+			read_bio = bio_alloc(src_bdev, 1, REQ_OP_READ | REQ_COPY
+					| REQ_SYNC | REQ_NOMERGE, gfp_mask);
+			if (!read_bio) {
+				ret = -ENOMEM;
+				goto err_read_bio;
+			}
+			write_bio = bio_alloc(dst_bdev, 1, REQ_OP_WRITE
+					| REQ_COPY | REQ_SYNC | REQ_NOMERGE,
+					gfp_mask);
+			if (!write_bio) {
+				cio->io_err = -ENOMEM;
+				goto err_write_bio;
+			}
+
+			ctx->cio = cio;
+			ctx->range_idx = ri;
+			ctx->write_bio = write_bio;
+			atomic_set(&ctx->refcount, 1);
+
+			if (bdev_is_zoned(dst_bdev)) {
+				INIT_WORK(&ctx->dispatch_work,
+					blk_zoned_copy_dispatch_work_fn);
+				INIT_LIST_HEAD(&ctx->list);
+				spin_lock_irqsave(&cio->list_lock, flags);
+				ctx->status = REQ_COPY_READ_PROGRESS;
+				list_add_tail(&ctx->list, &cio->list);
+				spin_unlock_irqrestore(&cio->list_lock, flags);
+			} else
+				INIT_WORK(&ctx->dispatch_work,
+					blk_copy_dispatch_work_fn);
+
+			__bio_add_page(read_bio, token, PAGE_SIZE, 0);
+			read_bio->bi_iter.bi_size = copy_len;
+			read_bio->bi_iter.bi_sector = src_blk >> SECTOR_SHIFT;
+			read_bio->bi_end_io = blk_copy_offload_read_end_io;
+			read_bio->bi_private = ctx;
+
+			__bio_add_page(write_bio, token, PAGE_SIZE, 0);
+			write_bio->bi_iter.bi_size = copy_len;
+			write_bio->bi_end_io = blk_copy_offload_write_end_io;
+			write_bio->bi_iter.bi_sector = dst_blk >> SECTOR_SHIFT;
+			write_bio->bi_private = ctx;
+
+			atomic_inc(&cio->refcount);
+			submit_bio(read_bio);
+			src_blk += copy_len;
+			dst_blk += copy_len;
+		}
+	}
+
+	/* Wait for completion of all IO's*/
+	return cio_await_completion(cio);
+
+err_write_bio:
+	bio_put(read_bio);
+err_read_bio:
+	kfree(ctx);
+err_ctx:
+	__free_page(token);
+err_token:
+	ranges[ri].comp_len = min_t(sector_t,
+			ranges[ri].comp_len, (ranges[ri].len - rem));
+
+	cio->io_err = ret;
+	return cio_await_completion(cio);
+}
+
+static inline int blk_copy_sanity_check(struct block_device *src_bdev,
+	struct block_device *dst_bdev, struct range_entry *ranges, int nr)
+{
+	unsigned int align_mask = max(bdev_logical_block_size(dst_bdev),
+					bdev_logical_block_size(src_bdev)) - 1;
+	sector_t len = 0;
+	int i;
+
+	if (!nr)
+		return -EINVAL;
+
+	if (nr >= MAX_COPY_NR_RANGE)
+		return -EINVAL;
+
+	if (bdev_read_only(dst_bdev))
+		return -EPERM;
+
+	for (i = 0; i < nr; i++) {
+		if (!ranges[i].len)
+			return -EINVAL;
+
+		len += ranges[i].len;
+		if ((ranges[i].dst & align_mask) ||
+				(ranges[i].src & align_mask) ||
+				(ranges[i].len & align_mask))
+			return -EINVAL;
+		ranges[i].comp_len = 0;
+	}
+
+	if (len && len >= MAX_COPY_TOTAL_LENGTH)
+		return -EINVAL;
+
+	return 0;
+}
+
+static inline bool blk_check_copy_offload(struct request_queue *src_q,
+		struct request_queue *dst_q)
+{
+	return blk_queue_copy(dst_q) && blk_queue_copy(src_q);
+}
+
+/*
+ * blkdev_issue_copy - queue a copy
+ * @src_bdev:	source block device
+ * @dst_bdev:	destination block device
+ * @ranges:	array of source/dest/len,
+ *		ranges are expected to be allocated/freed by caller
+ * @nr:		number of source ranges to copy
+ * @end_io:	end_io function to be called on completion of copy operation,
+ *		for synchronous operation this should be NULL
+ * @private:	end_io function will be called with this private data, should be
+ *		NULL, if operation is synchronous in nature
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ *
+ * Description:
+ *	Copy source ranges from source block device to destination block
+ *	device. The length of a source range cannot be zero. The max total
+ *	length of a copy is limited to MAX_COPY_TOTAL_LENGTH and the maximum
+ *	number of entries is limited to MAX_COPY_NR_RANGE.
+ */
+int blkdev_issue_copy(struct block_device *src_bdev,
+	struct block_device *dst_bdev, struct range_entry *ranges, int nr,
+	cio_iodone_t end_io, void *private, gfp_t gfp_mask)
+{
+	struct request_queue *src_q = bdev_get_queue(src_bdev);
+	struct request_queue *dst_q = bdev_get_queue(dst_bdev);
+	int ret = -EINVAL;
+
+	ret = blk_copy_sanity_check(src_bdev, dst_bdev, ranges, nr);
+	if (ret)
+		return ret;
+
+	if (blk_check_copy_offload(src_q, dst_q))
+		ret = blk_copy_offload(src_bdev, dst_bdev, ranges, nr,
+				end_io, private, gfp_mask);
+
+	return ret;
+}
+EXPORT_SYMBOL_GPL(blkdev_issue_copy);
+
 static int __blkdev_issue_write_zeroes(struct block_device *bdev,
 		sector_t sector, sector_t nr_sects, gfp_t gfp_mask,
 		struct bio **biop, unsigned flags)
diff --git a/block/blk.h b/block/blk.h
index 5929559acd71..6d534047f20d 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -308,6 +308,8 @@ static inline bool bio_may_exceed_limits(struct bio *bio,
 		break;
 	}
 
+	if (unlikely(op_is_copy(bio->bi_opf)))
+		return false;
 	/*
 	 * All drivers must accept single-segments bios that are <= PAGE_SIZE.
 	 * This is a quick and dirty check that relies on the fact that
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index e0b098089ef2..71278c862bba 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -422,6 +422,7 @@ enum req_flag_bits {
 	 */
 	/* for REQ_OP_WRITE_ZEROES: */
 	__REQ_NOUNMAP,		/* do not free blocks when zeroing */
+	__REQ_COPY,		/* copy request */
 
 	__REQ_NR_BITS,		/* stops here */
 };
@@ -451,6 +452,7 @@ enum req_flag_bits {
 
 #define REQ_DRV		(__force blk_opf_t)(1ULL << __REQ_DRV)
 #define REQ_SWAP	(__force blk_opf_t)(1ULL << __REQ_SWAP)
+#define REQ_COPY	((__force blk_opf_t)(1ULL << __REQ_COPY))
 
 #define REQ_FAILFAST_MASK \
 	(REQ_FAILFAST_DEV | REQ_FAILFAST_TRANSPORT | REQ_FAILFAST_DRIVER)
@@ -484,6 +486,11 @@ static inline bool op_is_write(blk_opf_t op)
 	return !!(op & (__force blk_opf_t)1);
 }
 
+static inline bool op_is_copy(blk_opf_t op)
+{
+	return (op & REQ_COPY);
+}
+
 /*
  * Check if the bio or request is one that needs special treatment in the
  * flush state machine.
@@ -543,4 +550,41 @@ struct blk_rq_stat {
 	u64 batch;
 };
 
+typedef void (cio_iodone_t)(void *private, int status);
+
+struct cio {
+	struct range_entry *ranges;
+	struct task_struct *waiter;	/* waiting task (NULL if none) */
+	atomic_t refcount;
+	int io_err;
+	cio_iodone_t *endio;		/* applicable for async operation */
+	void *private;			/* applicable for async operation */
+
+	/* For a zoned device we maintain a linked list of IO submissions.
+	 * This is to make sure we maintain the order of submissions.
+	 * Otherwise some reads completing out of order will submit writes
+	 * not aligned with the zone write pointer.
+	 */
+	struct list_head list;
+	spinlock_t list_lock;
+};
+
+enum copy_io_status {
+	REQ_COPY_READ_PROGRESS,
+	REQ_COPY_READ_COMPLETE,
+	REQ_COPY_WRITE_PROGRESS,
+};
+
+struct copy_ctx {
+	struct cio *cio;
+	struct work_struct dispatch_work;
+	struct bio *write_bio;
+	atomic_t refcount;
+	int range_idx;		/* used in error/partial completion */
+
+	/* For a zoned device a linked list is maintained, along with the
+	 * state of the IO. */
+	struct list_head list;
+	enum copy_io_status status;
+};
+
 #endif /* __LINUX_BLK_TYPES_H */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 3ac324208f2f..a3b12ad42ed7 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1065,6 +1065,9 @@ int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, struct bio **biop);
 int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp);
+int blkdev_issue_copy(struct block_device *src_bdev,
+		struct block_device *dst_bdev, struct range_entry *ranges,
+		int nr, cio_iodone_t end_io, void *private, gfp_t gfp_mask);
 
 #define BLKDEV_ZERO_NOUNMAP	(1 << 0)  /* do not free blocks */
 #define BLKDEV_ZERO_NOFALLBACK	(1 << 1)  /* don't write explicit zeroes */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index b3ad173f619c..9248b6d259de 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -67,6 +67,21 @@ struct fstrim_range {
 /* maximum total copy length */
 #define MAX_COPY_TOTAL_LENGTH	(1 << 27)
 
+/* Maximum no of entries supported */
+#define MAX_COPY_NR_RANGE	(1 << 12)
+
+/* range entry for copy offload, all fields should be byte addressed */
+struct range_entry {
+	__u64 src;		/* source to be copied */
+	__u64 dst;		/* destination */
+	__u64 len;		/* length in bytes to be copied */
+
+	/* length of data copy actually completed. This will be filled by
+	 * kernel, once copy completes
+	 */
+	__u64 comp_len;
+};
+
 /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */
 #define FILE_DEDUPE_RANGE_SAME		0
 #define FILE_DEDUPE_RANGE_DIFFERS	1
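For illustration, a minimal sketch of a synchronous in-kernel caller of the
API added above; src_bdev/dst_bdev are assumed to be valid, referenced block
devices obtained elsewhere, and copy_one_range is a hypothetical helper.
Note that with only this patch applied the call returns -EINVAL unless both
queues advertise copy offload; the emulation fallback arrives in the next
patch:

static int copy_one_range(struct block_device *src_bdev,
			  struct block_device *dst_bdev)
{
	/* Copy 1 MiB from byte offset 0 to byte offset 1 MiB; offsets and
	 * length must be logical-block aligned per blk_copy_sanity_check().
	 */
	struct range_entry range = {
		.src = 0,
		.dst = 1UL << 20,
		.len = 1UL << 20,
	};
	int ret;

	/* end_io == NULL and private == NULL make the call synchronous. */
	ret = blkdev_issue_copy(src_bdev, dst_bdev, &range, 1,
				NULL, NULL, GFP_KERNEL);
	if (range.comp_len != range.len)
		pr_warn("partial copy: %llu of %llu bytes\n",
			range.comp_len, range.len);
	return ret;
}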
From patchwork Wed Nov 23 05:58:20 2022
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13053100
From: Nitesh Shetty
To: axboe@kernel.dk, agk@redhat.com, snitzer@kernel.org, dm-devel@redhat.com,
 kbusch@kernel.org, hch@lst.de, sagi@grimberg.me, james.smart@broadcom.com,
 kch@nvidia.com, damien.lemoal@opensource.wdc.com, naohiro.aota@wdc.com,
 jth@kernel.org, viro@zeniv.linux.org.uk
Cc: linux-block@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-nvme@lists.infradead.org, linux-fsdevel@vger.kernel.org,
 anuj20.g@samsung.com, joshi.k@samsung.com, p.raghav@samsung.com,
 nitheshshetty@gmail.com, gost.dev@samsung.com, Nitesh Shetty, Vincent Fu
Subject: [PATCH v5 03/10] block: add emulation for copy
Date: Wed, 23 Nov 2022 11:28:20 +0530
Message-Id: <20221123055827.26996-4-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>
References: <20221123055827.26996-1-nj.shetty@samsung.com>

For devices which do not support copy offload, copy emulation is added.
Copy emulation is implemented by reading from the source ranges into memory
and writing to the corresponding destinations asynchronously.
For zoned devices we maintain a linked list of read submissions and try to
submit the corresponding writes in the same order.
Emulation is also used if copy offload fails or completes partially.

Signed-off-by: Nitesh Shetty
Signed-off-by: Vincent Fu
Signed-off-by: Anuj Gupta
---
 block/blk-lib.c        | 241 ++++++++++++++++++++++++++++++++++++++++-
 block/blk-map.c        |   4 +-
 include/linux/blkdev.h |   3 +
 3 files changed, 245 insertions(+), 3 deletions(-)

diff --git a/block/blk-lib.c b/block/blk-lib.c
index 2ce3c872ca49..43b1d0ef5732 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -428,6 +428,239 @@ static inline int blk_copy_sanity_check(struct block_device *src_bdev,
 	return 0;
 }
 
+static void *blk_alloc_buf(sector_t req_size, sector_t *alloc_size,
+		gfp_t gfp_mask)
+{
+	int min_size = PAGE_SIZE;
+	void *buf;
+
+	while (req_size >= min_size) {
+		buf = kvmalloc(req_size, gfp_mask);
+		if (buf) {
+			*alloc_size = req_size;
+			return buf;
+		}
+		/* retry half the requested size */
+		req_size >>= 1;
+	}
+
+	return NULL;
+}
+
+static void blk_copy_emulate_write_end_io(struct bio *bio)
+{
+	struct copy_ctx *ctx = bio->bi_private;
+	struct cio *cio = ctx->cio;
+	sector_t clen;
+	int ri = ctx->range_idx;
+
+	if (bio->bi_status) {
+		cio->io_err = blk_status_to_errno(bio->bi_status);
+		clen = (bio->bi_iter.bi_sector << SECTOR_SHIFT) -
+			cio->ranges[ri].dst;
+		cio->ranges[ri].comp_len = min_t(sector_t, clen,
+				cio->ranges[ri].comp_len);
+	}
+	kvfree(page_address(bio->bi_io_vec[0].bv_page));
+	bio_map_kern_endio(bio);
+	if (atomic_dec_and_test(&ctx->refcount))
+		kfree(ctx);
+	if (atomic_dec_and_test(&cio->refcount)) {
+		if (cio->endio) {
+			cio->endio(cio->private, cio->io_err);
+			kfree(cio);
+		} else
+			blk_wake_io_task(cio->waiter);
+	}
+}
+
+static void blk_copy_emulate_read_end_io(struct bio *read_bio)
+{
+	struct copy_ctx *ctx = read_bio->bi_private;
+	struct cio *cio = ctx->cio;
+	sector_t clen;
+	int ri = ctx->range_idx;
+	unsigned long flags;
+
+	if (read_bio->bi_status) {
+		cio->io_err = blk_status_to_errno(read_bio->bi_status);
+		goto err_rw_bio;
+	}
+
+	/* For a zoned device, we check if the completed bio is the first
+	 * entry in the linked list.
+	 * If yes, we start the worker to submit write bios.
+	 * If not, we just update the status of the bio in the ctx;
+	 * once the worker gets scheduled, it will submit writes for all
+	 * the consecutive REQ_COPY_READ_COMPLETE bios.
+	 */
+	if (bdev_is_zoned(ctx->write_bio->bi_bdev)) {
+		spin_lock_irqsave(&cio->list_lock, flags);
+		ctx->status = REQ_COPY_READ_COMPLETE;
+		if (ctx == list_first_entry(&cio->list,
+					struct copy_ctx, list)) {
+			spin_unlock_irqrestore(&cio->list_lock, flags);
+			schedule_work(&ctx->dispatch_work);
+			goto free_read_bio;
+		}
+		spin_unlock_irqrestore(&cio->list_lock, flags);
+	} else
+		schedule_work(&ctx->dispatch_work);
+
+free_read_bio:
+	kfree(read_bio);
+
+	return;
+
+err_rw_bio:
+	clen = (read_bio->bi_iter.bi_sector << SECTOR_SHIFT) -
+		cio->ranges[ri].src;
+	cio->ranges[ri].comp_len = min_t(sector_t, clen,
+			cio->ranges[ri].comp_len);
+	__free_page(read_bio->bi_io_vec[0].bv_page);
+	bio_map_kern_endio(read_bio);
+	if (atomic_dec_and_test(&ctx->refcount))
+		kfree(ctx);
+	if (atomic_dec_and_test(&cio->refcount)) {
+		if (cio->endio) {
+			cio->endio(cio->private, cio->io_err);
+			kfree(cio);
+		} else
+			blk_wake_io_task(cio->waiter);
+	}
+}
+
+/*
+ * If the native copy offload feature is absent, this function tries to
+ * emulate it by copying data from the source to a temporary buffer and from
+ * the buffer to the destination device.
+ */
+static int blk_copy_emulate(struct block_device *src_bdev,
+		struct block_device *dst_bdev, struct range_entry *ranges,
+		int nr, cio_iodone_t end_io, void *private, gfp_t gfp_mask)
+{
+	struct request_queue *sq = bdev_get_queue(src_bdev);
+	struct request_queue *dq = bdev_get_queue(dst_bdev);
+	struct bio *read_bio, *write_bio;
+	void *buf = NULL;
+	struct copy_ctx *ctx;
+	struct cio *cio;
+	sector_t src, dst, offset, buf_len, req_len, rem = 0;
+	int ri = 0, ret = 0;
+	unsigned long flags;
+	sector_t max_src_hw_len = min_t(unsigned int, queue_max_hw_sectors(sq),
+			queue_max_segments(sq) << (PAGE_SHIFT - SECTOR_SHIFT))
+			<< SECTOR_SHIFT;
+	sector_t max_dst_hw_len = min_t(unsigned int, queue_max_hw_sectors(dq),
+			queue_max_segments(dq) << (PAGE_SHIFT - SECTOR_SHIFT))
+			<< SECTOR_SHIFT;
+	sector_t max_hw_len = min_t(unsigned int,
+			max_src_hw_len, max_dst_hw_len);
+
+	cio = kzalloc(sizeof(struct cio), GFP_KERNEL);
+	if (!cio)
+		return -ENOMEM;
+	cio->ranges = ranges;
+	atomic_set(&cio->refcount, 1);
+	cio->waiter = current;
+	cio->endio = end_io;
+	cio->private = private;
+
+	if (bdev_is_zoned(dst_bdev)) {
+		INIT_LIST_HEAD(&cio->list);
+		spin_lock_init(&cio->list_lock);
+	}
+
+	for (ri = 0; ri < nr; ri++) {
+		offset = ranges[ri].comp_len;
+		src = ranges[ri].src + offset;
+		dst = ranges[ri].dst + offset;
+		/* If IO fails, we truncate comp_len */
+		ranges[ri].comp_len = ranges[ri].len;
+
+		for (rem = ranges[ri].len - offset; rem > 0; rem -= buf_len) {
+			req_len = min_t(int, max_hw_len, rem);
+
+			buf = blk_alloc_buf(req_len, &buf_len, gfp_mask);
+			if (!buf) {
+				ret = -ENOMEM;
+				goto err_alloc_buf;
+			}
+
+			ctx = kzalloc(sizeof(struct copy_ctx), gfp_mask);
+			if (!ctx) {
+				ret = -ENOMEM;
+				goto err_ctx;
+			}
+
+			read_bio = bio_map_kern(sq, buf, buf_len, gfp_mask);
+			if (IS_ERR(read_bio)) {
+				ret = PTR_ERR(read_bio);
+				goto err_read_bio;
+			}
+
+			write_bio = bio_map_kern(dq, buf, buf_len, gfp_mask);
+			if (IS_ERR(write_bio)) {
+				ret = PTR_ERR(write_bio);
+				goto err_write_bio;
+			}
+
+			ctx->cio = cio;
+			ctx->range_idx = ri;
+			ctx->write_bio = write_bio;
+			atomic_set(&ctx->refcount, 1);
+
+			read_bio->bi_iter.bi_sector = src >> SECTOR_SHIFT;
+			read_bio->bi_iter.bi_size = buf_len;
+			read_bio->bi_opf = REQ_OP_READ | REQ_SYNC;
+			bio_set_dev(read_bio, src_bdev);
+			read_bio->bi_end_io = blk_copy_emulate_read_end_io;
+			read_bio->bi_private = ctx;
+
+			write_bio->bi_iter.bi_size = buf_len;
+			write_bio->bi_opf = REQ_OP_WRITE | REQ_SYNC;
+			bio_set_dev(write_bio, dst_bdev);
+			write_bio->bi_end_io = blk_copy_emulate_write_end_io;
+			write_bio->bi_iter.bi_sector = dst >> SECTOR_SHIFT;
+			write_bio->bi_private = ctx;
+
+			if (bdev_is_zoned(dst_bdev)) {
+				INIT_WORK(&ctx->dispatch_work,
+					blk_zoned_copy_dispatch_work_fn);
+				INIT_LIST_HEAD(&ctx->list);
+				spin_lock_irqsave(&cio->list_lock, flags);
+				ctx->status = REQ_COPY_READ_PROGRESS;
+				list_add_tail(&ctx->list, &cio->list);
+				spin_unlock_irqrestore(&cio->list_lock, flags);
+			} else
+				INIT_WORK(&ctx->dispatch_work,
+					blk_copy_dispatch_work_fn);
+
+			atomic_inc(&cio->refcount);
+			submit_bio(read_bio);
+
+			src += buf_len;
+			dst += buf_len;
+		}
+	}
+
+	/* Wait for completion of all IO's*/
+	return cio_await_completion(cio);
+
+err_write_bio:
+	bio_put(read_bio);
+err_read_bio:
+	kfree(ctx);
+err_ctx:
+	kvfree(buf);
+err_alloc_buf:
+	ranges[ri].comp_len -= min_t(sector_t,
+			ranges[ri].comp_len, (ranges[ri].len - rem));
+
+	cio->io_err = ret;
+	return cio_await_completion(cio);
+}
+
 static inline bool blk_check_copy_offload(struct request_queue *src_q,
 		struct request_queue *dst_q)
 {
@@ -460,15 +693,21 @@ int blkdev_issue_copy(struct block_device *src_bdev,
 	struct request_queue *src_q = bdev_get_queue(src_bdev);
 	struct request_queue *dst_q = bdev_get_queue(dst_bdev);
 	int ret = -EINVAL;
+	bool offload = false;
 
 	ret = blk_copy_sanity_check(src_bdev, dst_bdev, ranges, nr);
 	if (ret)
 		return ret;
 
-	if (blk_check_copy_offload(src_q, dst_q))
+	offload = blk_check_copy_offload(src_q, dst_q);
+	if (offload)
 		ret = blk_copy_offload(src_bdev, dst_bdev, ranges, nr,
 				end_io, private, gfp_mask);
 
+	if (ret || !offload)
+		ret = blk_copy_emulate(src_bdev, dst_bdev, ranges, nr,
+				end_io, private, gfp_mask);
+
 	return ret;
 }
 EXPORT_SYMBOL_GPL(blkdev_issue_copy);
diff --git a/block/blk-map.c b/block/blk-map.c
index 19940c978c73..bcf8db2b75f1 100644
--- a/block/blk-map.c
+++ b/block/blk-map.c
@@ -363,7 +363,7 @@ static void bio_invalidate_vmalloc_pages(struct bio *bio)
 #endif
 }
 
-static void bio_map_kern_endio(struct bio *bio)
+void bio_map_kern_endio(struct bio *bio)
 {
 	bio_invalidate_vmalloc_pages(bio);
 	bio_uninit(bio);
@@ -380,7 +380,7 @@ static void bio_map_kern_endio(struct bio *bio)
  *	Map the kernel address into a bio suitable for io to a block
  *	device. Returns an error pointer in case of error.
  */
-static struct bio *bio_map_kern(struct request_queue *q, void *data,
+struct bio *bio_map_kern(struct request_queue *q, void *data,
 		unsigned int len, gfp_t gfp_mask)
 {
 	unsigned long kaddr = (unsigned long)data;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index a3b12ad42ed7..b0b18c30a60b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1068,6 +1068,9 @@ int blkdev_issue_secure_erase(struct block_device *bdev, sector_t sector,
 int blkdev_issue_copy(struct block_device *src_bdev,
 		struct block_device *dst_bdev, struct range_entry *ranges,
 		int nr, cio_iodone_t end_io, void *private, gfp_t gfp_mask);
+struct bio *bio_map_kern(struct request_queue *q, void *data, unsigned int len,
+		gfp_t gfp_mask);
+void bio_map_kern_endio(struct bio *bio);
 
 #define BLKDEV_ZERO_NOUNMAP	(1 << 0)  /* do not free blocks */
 #define BLKDEV_ZERO_NOFALLBACK	(1 << 1)  /* don't write explicit zeroes */
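The emulation path sizes its bounce buffer with a back-off: blk_alloc_buf()
retries at half the requested size until kvmalloc() succeeds or the size
drops below one page. For illustration, a standalone userspace analogue of
the same pattern (the 4096-byte minimum stands in for PAGE_SIZE and is an
assumption of the sketch):

#include <stdlib.h>

/* Try the full size first, then halve on failure, giving up below
 * a page-sized minimum.
 */
static void *alloc_backoff(size_t req_size, size_t *alloc_size)
{
	const size_t min_size = 4096;	/* assumed page size */
	void *buf;

	while (req_size >= min_size) {
		buf = malloc(req_size);
		if (buf) {
			*alloc_size = req_size;
			return buf;
		}
		req_size >>= 1;	/* retry half the requested size */
	}
	return NULL;
}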
From patchwork Wed Nov 23 05:58:21 2022
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13053101
From: Nitesh Shetty
Subject: [PATCH v5 04/10] block: Introduce a new ioctl for copy
Date: Wed, 23 Nov 2022 11:28:21 +0530
Message-Id: <20221123055827.26996-5-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>

Add a new BLKCOPY ioctl that offloads copying of one or more source
ranges to one or more destination ranges on a device.

The ioctl accepts a 'copy_range' structure that contains the number of
ranges and a reserved field, followed by an array of ranges. Each range
is represented by a 'range_entry' that contains the source start offset,
the destination start offset and the length of the range (all in bytes).
MAX_COPY_NR_RANGE limits the number of entries the ioctl accepts, and
MAX_COPY_TOTAL_LENGTH limits the total copy length it can handle.

Example code to issue BLKCOPY:

/* Sample program to copy three entries with [dest, src, len]:
 * [32768, 0, 4096] [36864, 4096, 4096] [40960, 8192, 4096]
 * on the same device.
 */
int main(void)
{
	int i, ret, fd;
	unsigned long src = 0, dst = 32768, len = 4096;
	struct copy_range *cr;

	cr = (struct copy_range *)malloc(sizeof(*cr) +
			(sizeof(struct range_entry) * 3));
	cr->nr_range = 3;
	cr->reserved = 0;
	for (i = 0; i < cr->nr_range; i++, src += len, dst += len) {
		cr->ranges[i].dst = dst;
		cr->ranges[i].src = src;
		cr->ranges[i].len = len;
		cr->ranges[i].comp_len = 0;
	}

	fd = open("/dev/nvme0n1", O_RDWR);
	if (fd < 0)
		return 1;

	ret = ioctl(fd, BLKCOPY, cr);
	if (ret != 0)
		printf("copy failed, ret = %d\n", ret);

	for (i = 0; i < cr->nr_range; i++)
		if (cr->ranges[i].len != cr->ranges[i].comp_len)
			printf("Partial copy for entry %d: requested %llu, completed %llu\n",
					i, cr->ranges[i].len,
					cr->ranges[i].comp_len);

	close(fd);
	free(cr);
	return ret;
}

Reviewed-by: Hannes Reinecke
Signed-off-by: Nitesh Shetty
Signed-off-by: Javier González
Signed-off-by: Anuj Gupta
---
 block/ioctl.c           | 36 ++++++++++++++++++++++++++++++++++++
 include/uapi/linux/fs.h |  9 +++++++++
 2 files changed, 45 insertions(+)

diff --git a/block/ioctl.c b/block/ioctl.c
index 60121e89052b..7daf76199161 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -120,6 +120,40 @@ static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
 	return err;
 }
 
+static int blk_ioctl_copy(struct block_device *bdev, fmode_t mode,
+		unsigned long arg)
+{
+	struct copy_range ucopy_range, *kcopy_range = NULL;
+	size_t payload_size = 0;
+	int ret;
+
+	if (!(mode & FMODE_WRITE))
+		return -EBADF;
+
+	if (copy_from_user(&ucopy_range, (void __user *)arg,
+				sizeof(ucopy_range)))
+		return -EFAULT;
+
+	if (unlikely(!ucopy_range.nr_range || ucopy_range.reserved ||
+				ucopy_range.nr_range >= MAX_COPY_NR_RANGE))
+		return -EINVAL;
+
+	payload_size = (ucopy_range.nr_range * sizeof(struct range_entry)) +
+			sizeof(ucopy_range);
+
+	kcopy_range = memdup_user((void __user *)arg, payload_size);
+	if (IS_ERR(kcopy_range))
+		return PTR_ERR(kcopy_range);
+
+	ret = blkdev_issue_copy(bdev, bdev, kcopy_range->ranges,
+			kcopy_range->nr_range, NULL, NULL, GFP_KERNEL);
+	if (copy_to_user((void __user *)arg, kcopy_range, payload_size))
+		ret = -EFAULT;
+
+	kfree(kcopy_range);
+	return ret;
+}
+
 static int blk_ioctl_secure_erase(struct block_device *bdev, fmode_t mode,
 		void __user *argp)
 {
@@ -481,6 +515,8 @@ static int blkdev_common_ioctl(struct block_device *bdev, fmode_t mode,
 		return blk_ioctl_discard(bdev, mode, arg);
 	case BLKSECDISCARD:
 		return blk_ioctl_secure_erase(bdev, mode, argp);
+	case BLKCOPY:
+		return blk_ioctl_copy(bdev, mode, arg);
 	case BLKZEROOUT:
 		return blk_ioctl_zeroout(bdev, mode, arg);
 	case BLKGETDISKSEQ:
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 9248b6d259de..8af10b926a6f 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -82,6 +82,14 @@ struct range_entry {
 	__u64 comp_len;
 };
 
+struct copy_range {
+	__u64 nr_range;
+	__u64 reserved;
+
+	/* Ranges always must be at the end */
+	struct range_entry ranges[];
+};
+
 /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */
 #define FILE_DEDUPE_RANGE_SAME		0
 #define FILE_DEDUPE_RANGE_DIFFERS	1
@@ -203,6 +211,7 @@ struct fsxattr {
 #define BLKROTATIONAL	_IO(0x12,126)
 #define BLKZEROOUT	_IO(0x12,127)
 #define BLKGETDISKSEQ	_IOR(0x12,128,__u64)
+#define BLKCOPY	_IOWR(0x12, 129, struct copy_range)
 /*
 * A jump here: 130-136 are reserved for zoned block devices
 * (see uapi/linux/blkzoned.h)
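[Editorial note: the payload sizing in blk_ioctl_copy() above is plain
flexible-array arithmetic, one fixed header plus nr_range trailing
entries. A small userspace check of the same computation; the structs
mirror the uapi hunk above, and range_entry's field order here is an
assumption, only the sizes matter for the arithmetic.]

#include <stdio.h>
#include <stdlib.h>
#include <linux/types.h>

/* Local mirrors of the uapi structs; field order is illustrative. */
struct range_entry {
	__u64 src;
	__u64 dst;
	__u64 len;
	__u64 comp_len;
};

struct copy_range {
	__u64 nr_range;
	__u64 reserved;
	struct range_entry ranges[];	/* flexible array member */
};

int main(void)
{
	__u64 nr_range = 3;
	/* same computation as blk_ioctl_copy(): header + nr_range entries */
	size_t payload_size = nr_range * sizeof(struct range_entry) +
			sizeof(struct copy_range);
	struct copy_range *cr = calloc(1, payload_size);

	if (!cr)
		return 1;
	cr->nr_range = nr_range;
	printf("header %zu + %llu x %zu = %zu bytes\n",
			sizeof(struct copy_range),
			(unsigned long long)nr_range,
			sizeof(struct range_entry), payload_size);
	free(cr);
	return 0;
}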
From patchwork Wed Nov 23 05:58:22 2022
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13053103
From: Nitesh Shetty
Subject: [PATCH v5 05/10] nvme: add copy offload support
Date: Wed, 23 Nov 2022 11:28:22 +0530
Message-Id: <20221123055827.26996-6-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>

For a device supporting native copy, the nvme driver receives read and
write requests with the BLK_COPY op flag. For the read request the nvme
driver populates the payload with the source information. For the write
request the driver converts it to an nvme copy command using the source
information in the payload and submits it to the device. The current
design only supports a single source range. This design is courtesy of
Mikulas Patocka's token based copy.

Add trace event support for nvme_copy_cmd.
Set the device copy limits to the queue limits.
Signed-off-by: Kanchan Joshi
Signed-off-by: Nitesh Shetty
Signed-off-by: Javier González
Signed-off-by: Anuj Gupta
---
 drivers/nvme/host/core.c  | 106 +++++++++++++++++++++++++++++++++++++-
 drivers/nvme/host/fc.c    |   5 ++
 drivers/nvme/host/nvme.h  |   7 +++
 drivers/nvme/host/pci.c   |  28 ++++++++--
 drivers/nvme/host/rdma.c  |   7 +++
 drivers/nvme/host/tcp.c   |  16 ++++++
 drivers/nvme/host/trace.c |  19 +++++++
 include/linux/nvme.h      |  43 ++++++++++++++--
 8 files changed, 223 insertions(+), 8 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 4423ccd0b0b1..26ce482ac112 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -751,6 +751,80 @@ static inline void nvme_setup_flush(struct nvme_ns *ns,
 	cmnd->common.nsid = cpu_to_le32(ns->head->ns_id);
 }
 
+static inline blk_status_t nvme_setup_copy_read(struct nvme_ns *ns,
+		struct request *req)
+{
+	struct bio *bio = req->bio;
+	struct nvme_copy_token *token = bvec_kmap_local(&bio->bi_io_vec[0]);
+
+	memcpy(token->subsys, "nvme", 4);
+	token->ns = ns;
+	token->src_sector = bio->bi_iter.bi_sector;
+	token->sectors = bio->bi_iter.bi_size >> 9;
+
+	return BLK_STS_OK;
+}
+
+static inline blk_status_t nvme_setup_copy_write(struct nvme_ns *ns,
+		struct request *req, struct nvme_command *cmnd)
+{
+	struct nvme_copy_range *range = NULL;
+	struct bio *bio = req->bio;
+	struct nvme_copy_token *token = bvec_kmap_local(&bio->bi_io_vec[0]);
+	sector_t src_sector, dst_sector, n_sectors;
+	u64 src_lba, dst_lba, n_lba;
+	unsigned short nr_range = 1;
+	u16 control = 0;
+
+	if (unlikely(memcmp(token->subsys, "nvme", 4)))
+		return BLK_STS_NOTSUPP;
+	if (unlikely(token->ns != ns))
+		return BLK_STS_NOTSUPP;
+
+	src_sector = token->src_sector;
+	dst_sector = bio->bi_iter.bi_sector;
+	n_sectors = token->sectors;
+	if (WARN_ON(n_sectors != bio->bi_iter.bi_size >> 9))
+		return BLK_STS_NOTSUPP;
+
+	src_lba = nvme_sect_to_lba(ns, src_sector);
+	dst_lba = nvme_sect_to_lba(ns, dst_sector);
+	n_lba = nvme_sect_to_lba(ns, n_sectors);
+
+	if (WARN_ON(!n_lba))
+		return BLK_STS_NOTSUPP;
+
+	if (req->cmd_flags & REQ_FUA)
+		control |= NVME_RW_FUA;
+
+	if (req->cmd_flags & REQ_FAILFAST_DEV)
+		control |= NVME_RW_LR;
+
+	memset(cmnd, 0, sizeof(*cmnd));
+	cmnd->copy.opcode = nvme_cmd_copy;
+	cmnd->copy.nsid = cpu_to_le32(ns->head->ns_id);
+	cmnd->copy.sdlba = cpu_to_le64(dst_lba);
+
+	range = kmalloc_array(nr_range, sizeof(*range),
+			GFP_ATOMIC | __GFP_NOWARN);
+	if (!range)
+		return BLK_STS_RESOURCE;
+
+	range[0].slba = cpu_to_le64(src_lba);
+	range[0].nlb = cpu_to_le16(n_lba - 1);
+
+	cmnd->copy.nr_range = 0;
+
+	req->special_vec.bv_page = virt_to_page(range);
+	req->special_vec.bv_offset = offset_in_page(range);
+	req->special_vec.bv_len = sizeof(*range) * nr_range;
+	req->rq_flags |= RQF_SPECIAL_PAYLOAD;
+
+	cmnd->copy.control = cpu_to_le16(control);
+
+	return BLK_STS_OK;
+}
+
 static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 		struct nvme_command *cmnd)
 {
@@ -974,10 +1048,16 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req)
 		ret = nvme_setup_discard(ns, req, cmd);
 		break;
 	case REQ_OP_READ:
-		ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_read);
+		if (unlikely(req->cmd_flags & REQ_COPY))
+			ret = nvme_setup_copy_read(ns, req);
+		else
+			ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_read);
 		break;
 	case REQ_OP_WRITE:
-		ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_write);
+		if (unlikely(req->cmd_flags & REQ_COPY))
+			ret = nvme_setup_copy_write(ns, req, cmd);
+		else
+			ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_write);
 		break;
 	case REQ_OP_ZONE_APPEND:
 		ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_zone_append);
@@ -1704,6 +1784,26 @@ static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
 	blk_queue_max_write_zeroes_sectors(queue, UINT_MAX);
 }
 
+static void nvme_config_copy(struct gendisk *disk, struct nvme_ns *ns,
+		struct nvme_id_ns *id)
+{
+	struct nvme_ctrl *ctrl = ns->ctrl;
+	struct request_queue *q = disk->queue;
+
+	if (!(ctrl->oncs & NVME_CTRL_ONCS_COPY)) {
+		blk_queue_max_copy_sectors_hw(q, 0);
+		blk_queue_flag_clear(QUEUE_FLAG_COPY, q);
+		return;
+	}
+
+	/* setting copy limits */
+	if (blk_queue_flag_test_and_set(QUEUE_FLAG_COPY, q))
+		return;
+
+	blk_queue_max_copy_sectors_hw(q,
+			nvme_lba_to_sect(ns, le16_to_cpu(id->mssrl)));
+}
+
 static bool nvme_ns_ids_equal(struct nvme_ns_ids *a, struct nvme_ns_ids *b)
 {
 	return uuid_equal(&a->uuid, &b->uuid) &&
@@ -1903,6 +2003,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	set_capacity_and_notify(disk, capacity);
 
 	nvme_config_discard(disk, ns);
+	nvme_config_copy(disk, ns, id);
 	blk_queue_max_write_zeroes_sectors(disk->queue,
 					   ns->ctrl->max_zeroes_sectors);
 }
@@ -5228,6 +5329,7 @@ static inline void _nvme_check_size(void)
 	BUILD_BUG_ON(sizeof(struct nvme_download_firmware) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_format_cmd) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_dsm_cmd) != 64);
+	BUILD_BUG_ON(sizeof(struct nvme_copy_command) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_write_zeroes_cmd) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_abort_cmd) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_get_log_page_command) != 64);
diff --git a/drivers/nvme/host/fc.c b/drivers/nvme/host/fc.c
index 5d57a042dbca..b2a1cf37cd92 100644
--- a/drivers/nvme/host/fc.c
+++ b/drivers/nvme/host/fc.c
@@ -2794,6 +2794,11 @@ nvme_fc_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (ret)
 		return ret;
 
+	if (unlikely((rq->cmd_flags & REQ_COPY) &&
+				(req_op(rq) == REQ_OP_READ))) {
+		blk_mq_end_request(rq, BLK_STS_OK);
+		return BLK_STS_OK;
+	}
 	/*
 	 * nvme core doesn't quite treat the rq opaquely.
	 * Commands such as WRITE ZEROES will return a non-zero rq payload_bytes yet
diff --git a/drivers/nvme/host/nvme.h b/drivers/nvme/host/nvme.h
index f9df10653f3c..17cfcfc58346 100644
--- a/drivers/nvme/host/nvme.h
+++ b/drivers/nvme/host/nvme.h
@@ -495,6 +495,13 @@ struct nvme_ns {
 
 };
 
+struct nvme_copy_token {
+	char subsys[4];
+	struct nvme_ns *ns;
+	u64 src_sector;
+	u64 sectors;
+};
+
 /* NVMe ns supports metadata actions by the controller (generate/strip) */
 static inline bool nvme_ns_has_pi(struct nvme_ns *ns)
 {
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index 0163bfa925aa..eb1ed2c8b3a2 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -503,16 +503,19 @@ static inline void nvme_sq_copy_cmd(struct nvme_queue *nvmeq,
 		nvmeq->sq_tail = 0;
 }
 
-static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx)
+static inline void nvme_commit_sq_db(struct nvme_queue *nvmeq)
 {
-	struct nvme_queue *nvmeq = hctx->driver_data;
-
 	spin_lock(&nvmeq->sq_lock);
 	if (nvmeq->sq_tail != nvmeq->last_sq_tail)
 		nvme_write_sq_db(nvmeq, true);
 	spin_unlock(&nvmeq->sq_lock);
 }
 
+static void nvme_commit_rqs(struct blk_mq_hw_ctx *hctx)
+{
+	nvme_commit_sq_db(hctx->driver_data);
+}
+
 static void **nvme_pci_iod_list(struct request *req)
 {
 	struct nvme_iod *iod = blk_mq_rq_to_pdu(req);
@@ -900,6 +903,12 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req)
 	if (ret)
 		return ret;
 
+	if (unlikely((req->cmd_flags & REQ_COPY) &&
+				(req_op(req) == REQ_OP_READ))) {
+		blk_mq_start_request(req);
+		return BLK_STS_OK;
+	}
+
 	if (blk_rq_nr_phys_segments(req)) {
 		ret = nvme_map_data(dev, req, &iod->cmd);
 		if (ret)
@@ -913,6 +922,7 @@ static blk_status_t nvme_prep_rq(struct nvme_dev *dev, struct request *req)
 	}
 
 	blk_mq_start_request(req);
+
 	return BLK_STS_OK;
 out_unmap_data:
 	nvme_unmap_data(dev, req);
@@ -946,6 +956,18 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	ret = nvme_prep_rq(dev, req);
 	if (unlikely(ret))
 		return ret;
+	if (unlikely((req->cmd_flags & REQ_COPY) &&
+				(req_op(req) == REQ_OP_READ))) {
+		blk_mq_set_request_complete(req);
+		blk_mq_end_request(req, BLK_STS_OK);
+		/* Commit the sq if copy read was the last req in the list,
+		 * as copy read doesn't update sq db
+		 */
+		if (bd->last)
+			nvme_commit_sq_db(nvmeq);
+		return ret;
+	}
+
 	spin_lock(&nvmeq->sq_lock);
 	nvme_sq_copy_cmd(nvmeq, &iod->cmd);
 	nvme_write_sq_db(nvmeq, bd->last);
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 6e079abb22ee..693865139e3c 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -2040,6 +2040,13 @@ static blk_status_t nvme_rdma_queue_rq(struct blk_mq_hw_ctx *hctx,
 	if (ret)
 		goto unmap_qe;
 
+	if (unlikely((rq->cmd_flags & REQ_COPY) &&
+				(req_op(rq) == REQ_OP_READ))) {
+		blk_mq_end_request(rq, BLK_STS_OK);
+		ret = BLK_STS_OK;
+		goto unmap_qe;
+	}
+
 	blk_mq_start_request(rq);
 
 	if (IS_ENABLED(CONFIG_BLK_DEV_INTEGRITY) &&
diff --git a/drivers/nvme/host/tcp.c b/drivers/nvme/host/tcp.c
index 9b47dcb2a7d9..e42fb53e9dc2 100644
--- a/drivers/nvme/host/tcp.c
+++ b/drivers/nvme/host/tcp.c
@@ -2348,6 +2348,11 @@ static blk_status_t nvme_tcp_setup_cmd_pdu(struct nvme_ns *ns,
 	if (ret)
 		return ret;
 
+	if (unlikely((rq->cmd_flags & REQ_COPY) &&
+				(req_op(rq) == REQ_OP_READ))) {
+		return BLK_STS_OK;
+	}
+
 	req->state = NVME_TCP_SEND_CMD_PDU;
 	req->status = cpu_to_le16(NVME_SC_SUCCESS);
 	req->offset = 0;
@@ -2416,6 +2421,17 @@ static blk_status_t nvme_tcp_queue_rq(struct blk_mq_hw_ctx *hctx,
 
 	blk_mq_start_request(rq);
 
+	if (unlikely((rq->cmd_flags & REQ_COPY) &&
+				(req_op(rq) == REQ_OP_READ))) {
+		blk_mq_set_request_complete(rq);
+		blk_mq_end_request(rq, BLK_STS_OK);
+		/* if copy read is the last req queue tcp reqs */
+		if (bd->last && nvme_tcp_queue_more(queue))
+			queue_work_on(queue->io_cpu, nvme_tcp_wq,
+					&queue->io_work);
+		return ret;
+	}
+
 	nvme_tcp_queue_request(req, true, bd->last);
 
 	return BLK_STS_OK;
diff --git a/drivers/nvme/host/trace.c b/drivers/nvme/host/trace.c
index 1c36fcedea20..da4a7494e5a7 100644
--- a/drivers/nvme/host/trace.c
+++ b/drivers/nvme/host/trace.c
@@ -150,6 +150,23 @@ static const char *nvme_trace_read_write(struct trace_seq *p, u8 *cdw10)
 	return ret;
 }
 
+static const char *nvme_trace_copy(struct trace_seq *p, u8 *cdw10)
+{
+	const char *ret = trace_seq_buffer_ptr(p);
+	u64 slba = get_unaligned_le64(cdw10);
+	u8 nr_range = get_unaligned_le16(cdw10 + 8);
+	u16 control = get_unaligned_le16(cdw10 + 10);
+	u32 dsmgmt = get_unaligned_le32(cdw10 + 12);
+	u32 reftag = get_unaligned_le32(cdw10 + 16);
+
+	trace_seq_printf(p,
+			"slba=%llu, nr_range=%u, ctrl=0x%x, dsmgmt=%u, reftag=%u",
+			slba, nr_range, control, dsmgmt, reftag);
+	trace_seq_putc(p, 0);
+
+	return ret;
+}
+
 static const char *nvme_trace_dsm(struct trace_seq *p, u8 *cdw10)
 {
 	const char *ret = trace_seq_buffer_ptr(p);
@@ -243,6 +260,8 @@ const char *nvme_trace_parse_nvm_cmd(struct trace_seq *p,
 		return nvme_trace_zone_mgmt_send(p, cdw10);
 	case nvme_cmd_zone_mgmt_recv:
 		return nvme_trace_zone_mgmt_recv(p, cdw10);
+	case nvme_cmd_copy:
+		return nvme_trace_copy(p, cdw10);
 	default:
 		return nvme_trace_common(p, cdw10);
 	}
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index 050d7d0cd81b..41349d78d410 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -336,7 +336,7 @@ struct nvme_id_ctrl {
 	__u8			nvscc;
 	__u8			nwpc;
 	__le16			acwu;
-	__u8			rsvd534[2];
+	__le16			ocfs;
 	__le32			sgls;
 	__le32			mnan;
 	__u8			rsvd544[224];
@@ -364,6 +364,7 @@ enum {
 	NVME_CTRL_ONCS_WRITE_ZEROES		= 1 << 3,
 	NVME_CTRL_ONCS_RESERVATIONS		= 1 << 5,
 	NVME_CTRL_ONCS_TIMESTAMP		= 1 << 6,
+	NVME_CTRL_ONCS_COPY			= 1 << 8,
 	NVME_CTRL_VWC_PRESENT			= 1 << 0,
 	NVME_CTRL_OACS_SEC_SUPP			= 1 << 0,
 	NVME_CTRL_OACS_NS_MNGT_SUPP		= 1 << 3,
@@ -413,7 +414,10 @@ struct nvme_id_ns {
 	__le16			npdg;
 	__le16			npda;
 	__le16			nows;
-	__u8			rsvd74[18];
+	__le16			mssrl;
+	__le32			mcl;
+	__u8			msrc;
+	__u8			rsvd91[11];
 	__le32			anagrpid;
 	__u8			rsvd96[3];
 	__u8			nsattr;
@@ -794,6 +798,7 @@ enum nvme_opcode {
 	nvme_cmd_resv_report	= 0x0e,
 	nvme_cmd_resv_acquire	= 0x11,
 	nvme_cmd_resv_release	= 0x15,
+	nvme_cmd_copy		= 0x19,
 	nvme_cmd_zone_mgmt_send	= 0x79,
 	nvme_cmd_zone_mgmt_recv	= 0x7a,
 	nvme_cmd_zone_append	= 0x7d,
@@ -815,7 +820,8 @@ enum nvme_opcode {
 		nvme_opcode_name(nvme_cmd_resv_release),	\
 		nvme_opcode_name(nvme_cmd_zone_mgmt_send),	\
 		nvme_opcode_name(nvme_cmd_zone_mgmt_recv),	\
-		nvme_opcode_name(nvme_cmd_zone_append))
+		nvme_opcode_name(nvme_cmd_zone_append),		\
+		nvme_opcode_name(nvme_cmd_copy))
 
@@ -991,6 +997,36 @@ struct nvme_dsm_range {
 	__le64			slba;
};
 
+struct nvme_copy_command {
+	__u8			opcode;
+	__u8			flags;
+	__u16			command_id;
+	__le32			nsid;
+	__u64			rsvd2;
+	__le64			metadata;
+	union nvme_data_ptr	dptr;
+	__le64			sdlba;
+	__u8			nr_range;
+	__u8			rsvd12;
+	__le16			control;
+	__le16			rsvd13;
+	__le16			dspec;
+	__le32			ilbrt;
+	__le16			lbat;
+	__le16			lbatm;
+};
+
+struct nvme_copy_range {
+	__le64			rsvd0;
+	__le64			slba;
+	__le16			nlb;
+	__le16			rsvd18;
+	__le32			rsvd20;
+	__le32			eilbrt;
+	__le16			elbat;
+	__le16			elbatm;
+};
+
 struct nvme_write_zeroes_cmd {
 	__u8			opcode;
 	__u8			flags;
@@ -1748,6 +1784,7 @@ struct nvme_command {
 		struct nvme_download_firmware dlfw;
 		struct nvme_format_cmd format;
 		struct nvme_dsm_cmd dsm;
+		struct nvme_copy_command copy;
 		struct nvme_write_zeroes_cmd write_zeroes;
 		struct nvme_zone_mgmt_send_cmd zms;
 		struct nvme_zone_mgmt_recv_cmd zmr;
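[Editorial note: tying the identify fields back to queue limits, mssrl
is expressed in logical blocks and nvme_config_copy() above converts it
to 512-byte sectors before calling blk_queue_max_copy_sectors_hw(). A
standalone sketch of that conversion; lba_to_sect is an illustrative
stand-in for nvme_lba_to_sect() and the shift value is assumed.]

#include <stdio.h>
#include <stdint.h>

/* Illustrative stand-in for nvme_lba_to_sect(): LBAs to 512-byte sectors. */
static uint64_t lba_to_sect(uint64_t lba, int lba_shift)
{
	return lba << (lba_shift - 9);
}

int main(void)
{
	int lba_shift = 12;	/* assumed 4K-LBA namespace */
	uint16_t mssrl = 128;	/* max single source range length, in LBAs */

	/* the value blk_queue_max_copy_sectors_hw() would receive */
	printf("max_copy_sectors_hw = %llu\n",
			(unsigned long long)lba_to_sect(mssrl, lba_shift));
	return 0;
}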
From patchwork Wed Nov 23 05:58:23 2022
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13053102
From: Nitesh Shetty
Subject: [PATCH v5 06/10] nvmet: add copy command support for bdev and file ns
Date: Wed, 23 Nov 2022 11:28:23 +0530
Message-Id: <20221123055827.26996-7-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>
Add support for handling the copy command on the target. For a bdev
namespace we call into blkdev_issue_copy, which the block layer
completes either by an offloaded copy request to the backend bdev or by
emulating the request. For a file namespace we call vfs_copy_file_range
to service the request.

Currently the target always advertises copy capability by setting
NVME_CTRL_ONCS_COPY in the controller ONCS.

Signed-off-by: Nitesh Shetty
Signed-off-by: Anuj Gupta
---
 drivers/nvme/target/admin-cmd.c   |  9 +++-
 drivers/nvme/target/io-cmd-bdev.c | 79 +++++++++++++++++++++++++++++++
 drivers/nvme/target/io-cmd-file.c | 51 ++++++++++++++++++++
 drivers/nvme/target/loop.c        |  6 +++
 drivers/nvme/target/nvmet.h       |  2 +
 5 files changed, 145 insertions(+), 2 deletions(-)

diff --git a/drivers/nvme/target/admin-cmd.c b/drivers/nvme/target/admin-cmd.c
index c8a061ce3ee5..5ae509ff4b19 100644
--- a/drivers/nvme/target/admin-cmd.c
+++ b/drivers/nvme/target/admin-cmd.c
@@ -431,8 +431,7 @@ static void nvmet_execute_identify_ctrl(struct nvmet_req *req)
 	id->nn = cpu_to_le32(NVMET_MAX_NAMESPACES);
 	id->mnan = cpu_to_le32(NVMET_MAX_NAMESPACES);
 	id->oncs = cpu_to_le16(NVME_CTRL_ONCS_DSM |
-			NVME_CTRL_ONCS_WRITE_ZEROES);
-
+			NVME_CTRL_ONCS_WRITE_ZEROES | NVME_CTRL_ONCS_COPY);
 	/* XXX: don't report vwc if the underlying device is write through */
 	id->vwc = NVME_CTRL_VWC_PRESENT;
 
@@ -534,6 +533,12 @@ static void nvmet_execute_identify_ns(struct nvmet_req *req)
 
 	if (req->ns->bdev)
 		nvmet_bdev_set_limits(req->ns->bdev, id);
+	else {
+		id->msrc = (u8)to0based(BIO_MAX_VECS - 1);
+		id->mssrl = cpu_to_le16(BIO_MAX_VECS <<
+				(PAGE_SHIFT - SECTOR_SHIFT));
+		id->mcl = cpu_to_le32(le16_to_cpu(id->mssrl));
+	}
 
 	/*
 	 * We just provide a single LBA format that matches what the
diff --git a/drivers/nvme/target/io-cmd-bdev.c b/drivers/nvme/target/io-cmd-bdev.c
index c2d6cea0236b..01f0160125fb 100644
--- a/drivers/nvme/target/io-cmd-bdev.c
+++ b/drivers/nvme/target/io-cmd-bdev.c
@@ -46,6 +46,19 @@ void nvmet_bdev_set_limits(struct block_device *bdev, struct nvme_id_ns *id)
 	id->npda = id->npdg;
 	/* NOWS = Namespace Optimal Write Size */
 	id->nows = to0based(bdev_io_opt(bdev) / bdev_logical_block_size(bdev));
+
+	/* Copy limits */
+	if (bdev_max_copy_sectors(bdev)) {
+		id->msrc = id->msrc;
+		id->mssrl = cpu_to_le16((bdev_max_copy_sectors(bdev) <<
+				SECTOR_SHIFT) / bdev_logical_block_size(bdev));
+		id->mcl = cpu_to_le32(id->mssrl);
+	} else {
+		id->msrc = (u8)to0based(BIO_MAX_VECS - 1);
+		id->mssrl = cpu_to_le16((BIO_MAX_VECS << PAGE_SHIFT) /
+				bdev_logical_block_size(bdev));
+		id->mcl = cpu_to_le32(id->mssrl);
+	}
 }
 
 void nvmet_bdev_ns_disable(struct nvmet_ns *ns)
@@ -184,6 +197,23 @@ static void nvmet_bio_done(struct bio *bio)
 	nvmet_req_bio_put(req, bio);
 }
 
+static void nvmet_bdev_copy_end_io(void *private, int status)
+{
+	struct nvmet_req *req = (struct nvmet_req *)private;
+	int id;
+
+	if (status) {
+		for (id = 0 ; id < req->nr_range; id++) {
+			if (req->ranges[id].len != req->ranges[id].comp_len) {
+				req->cqe->result.u32 = cpu_to_le32(id);
+				break;
+			}
+		}
+	}
+	kfree(req->ranges);
+	nvmet_req_complete(req, errno_to_nvme_status(req, status));
+}
+
 #ifdef CONFIG_BLK_DEV_INTEGRITY
 static int nvmet_bdev_alloc_bip(struct nvmet_req *req, struct bio *bio,
 		struct sg_mapping_iter *miter)
@@ -450,6 +480,51 @@ static void nvmet_bdev_execute_write_zeroes(struct nvmet_req *req)
 	}
 }
 
+static void nvmet_bdev_execute_copy(struct nvmet_req *req)
+{
+	struct nvme_copy_range range;
+	struct range_entry *ranges;
+	struct nvme_command *cmnd = req->cmd;
+	sector_t dest, dest_off = 0;
+	int ret, id, nr_range;
+
+	nr_range = cmnd->copy.nr_range + 1;
+	dest = le64_to_cpu(cmnd->copy.sdlba) << req->ns->blksize_shift;
+	ranges = kmalloc_array(nr_range, sizeof(*ranges), GFP_KERNEL);
+
+	for (id = 0 ; id < nr_range; id++) {
+		ret = nvmet_copy_from_sgl(req, id * sizeof(range),
+				&range, sizeof(range));
+		if (ret)
+			goto out;
+
+		ranges[id].dst = dest + dest_off;
+		ranges[id].src = le64_to_cpu(range.slba) <<
+				req->ns->blksize_shift;
+		ranges[id].len = (le16_to_cpu(range.nlb) + 1) <<
+				req->ns->blksize_shift;
+		ranges[id].comp_len = 0;
+		dest_off += ranges[id].len;
+	}
+	req->ranges = ranges;
+	req->nr_range = nr_range;
+	ret = blkdev_issue_copy(req->ns->bdev, req->ns->bdev, ranges, nr_range,
+			nvmet_bdev_copy_end_io, (void *)req, GFP_KERNEL);
+	if (ret) {
+		for (id = 0 ; id < nr_range; id++) {
+			if (ranges[id].len != ranges[id].comp_len) {
+				req->cqe->result.u32 = cpu_to_le32(id);
+				break;
+			}
+		}
+		goto out;
+	} else
+		return;
+out:
+	kfree(ranges);
+	nvmet_req_complete(req, errno_to_nvme_status(req, ret));
+}
+
 u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req)
 {
 	switch (req->cmd->common.opcode) {
@@ -468,6 +543,10 @@ u16 nvmet_bdev_parse_io_cmd(struct nvmet_req *req)
 	case nvme_cmd_write_zeroes:
 		req->execute = nvmet_bdev_execute_write_zeroes;
 		return 0;
+	case nvme_cmd_copy:
+		req->execute = nvmet_bdev_execute_copy;
+		return 0;
+
 	default:
 		return nvmet_report_invalid_opcode(req);
 	}
diff --git a/drivers/nvme/target/io-cmd-file.c b/drivers/nvme/target/io-cmd-file.c
index 64b47e2a4633..a81d38796e17 100644
--- a/drivers/nvme/target/io-cmd-file.c
+++ b/drivers/nvme/target/io-cmd-file.c
@@ -338,6 +338,48 @@ static void nvmet_file_dsm_work(struct work_struct *w)
 	}
 }
 
+static void nvmet_file_copy_work(struct work_struct *w)
+{
+	struct nvmet_req *req = container_of(w, struct nvmet_req, f.work);
+	int nr_range;
+	loff_t pos;
+	struct nvme_command *cmnd = req->cmd;
+	int ret = 0, len = 0, src, id;
+
+	nr_range = cmnd->copy.nr_range + 1;
+	pos = le64_to_cpu(req->cmd->copy.sdlba) << req->ns->blksize_shift;
+	if (unlikely(pos + req->transfer_len > req->ns->size)) {
+		nvmet_req_complete(req, errno_to_nvme_status(req, -ENOSPC));
+		return;
+	}
+
+	for (id = 0 ; id < nr_range; id++) {
+		struct nvme_copy_range range;
+
+		ret = nvmet_copy_from_sgl(req, id * sizeof(range), &range,
+				sizeof(range));
+		if (ret)
+			goto out;
+
+		len = (le16_to_cpu(range.nlb) + 1) << (req->ns->blksize_shift);
+		src = (le64_to_cpu(range.slba) << (req->ns->blksize_shift));
+		ret = vfs_copy_file_range(req->ns->file, src, req->ns->file,
+				pos, len, 0);
+out:
+		if (ret != len) {
+			pos += ret;
+			req->cqe->result.u32 = cpu_to_le32(id);
+			nvmet_req_complete(req, ret < 0 ?
+					errno_to_nvme_status(req, ret) :
+					errno_to_nvme_status(req, -EIO));
+			return;
+
+		} else
+			pos += len;
+	}
+	nvmet_req_complete(req, ret);
+
+}
+
 static void nvmet_file_execute_dsm(struct nvmet_req *req)
 {
 	if (!nvmet_check_data_len_lte(req, nvmet_dsm_len(req)))
@@ -346,6 +388,12 @@ static void nvmet_file_execute_dsm(struct nvmet_req *req)
 	queue_work(nvmet_wq, &req->f.work);
 }
 
+static void nvmet_file_execute_copy(struct nvmet_req *req)
+{
+	INIT_WORK(&req->f.work, nvmet_file_copy_work);
+	queue_work(nvmet_wq, &req->f.work);
+}
+
 static void nvmet_file_write_zeroes_work(struct work_struct *w)
 {
 	struct nvmet_req *req = container_of(w, struct nvmet_req, f.work);
@@ -392,6 +440,9 @@ u16 nvmet_file_parse_io_cmd(struct nvmet_req *req)
 	case nvme_cmd_write_zeroes:
 		req->execute = nvmet_file_execute_write_zeroes;
 		return 0;
+	case nvme_cmd_copy:
+		req->execute = nvmet_file_execute_copy;
+		return 0;
 	default:
 		return nvmet_report_invalid_opcode(req);
 	}
diff --git a/drivers/nvme/target/loop.c b/drivers/nvme/target/loop.c
index b45fe3adf015..55802632b407 100644
--- a/drivers/nvme/target/loop.c
+++ b/drivers/nvme/target/loop.c
@@ -146,6 +146,12 @@ static blk_status_t nvme_loop_queue_rq(struct blk_mq_hw_ctx *hctx,
 		return ret;
 
 	blk_mq_start_request(req);
+	if (unlikely((req->cmd_flags & REQ_COPY) &&
+				(req_op(req) == REQ_OP_READ))) {
+		blk_mq_set_request_complete(req);
+		blk_mq_end_request(req, BLK_STS_OK);
+		return BLK_STS_OK;
+	}
 	iod->cmd.common.flags |= NVME_CMD_SGL_METABUF;
 	iod->req.port = queue->ctrl->port;
 	if (!nvmet_req_init(&iod->req, &queue->nvme_cq,
diff --git a/drivers/nvme/target/nvmet.h b/drivers/nvme/target/nvmet.h
index dfe3894205aa..3b4c7d2ee45d 100644
--- a/drivers/nvme/target/nvmet.h
+++ b/drivers/nvme/target/nvmet.h
@@ -391,6 +391,8 @@ struct nvmet_req {
 	struct device *p2p_client;
 	u16 error_loc;
 	u64 error_slba;
+	struct range_entry *ranges;
+	unsigned int nr_range;
 };
 
 extern struct workqueue_struct *buffered_io_wq;
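[Editorial note: the file-ns path above is essentially one
copy_file_range() call per source range with partial-completion
accounting. A userspace analogue using the same syscall that
vfs_copy_file_range() backs; the file path and offsets are
hypothetical.]

#define _GNU_SOURCE
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int fd = open("/tmp/copy-demo.img", O_RDWR);	/* hypothetical file */
	off_t src = 0, dst = 1 << 20;	/* one range: offset 0 to 1 MiB */
	size_t len = 4096;
	ssize_t ret;

	if (fd < 0)
		return 1;

	/* Same pattern as nvmet_file_copy_work(): one call per range;
	 * a short return value means a partial copy for that entry. */
	ret = copy_file_range(fd, &src, fd, &dst, len, 0);
	if (ret < 0)
		perror("copy_file_range");
	else if ((size_t)ret != len)
		fprintf(stderr, "partial copy: %zd of %zu bytes\n", ret, len);

	close(fd);
	return 0;
}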
From patchwork Wed Nov 23 05:58:24 2022
X-Patchwork-Submitter: Nitesh Shetty
X-Patchwork-Id: 13053105
From: Nitesh Shetty
Subject: [PATCH v5 07/10] dm: Add support for copy offload
Date: Wed, 23 Nov 2022 11:28:24 +0530
Message-Id: <20221123055827.26996-8-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>

Before enabling copy for a dm target, check that the underlying devices
and the dm target support copy. Avoid splits happening inside the dm
target: fail early if the request would need a split, since splitting a
copy request is currently not supported.
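[Editorial note: the table-level check described above is an AND across
every target and every underlying data device; a single non-capable
device disables copy offload for the whole table. A compact userspace
sketch of that reduction; the dev struct and predicate are illustrative
stand-ins for iterate_devices and blk_queue_copy().]

#include <stdbool.h>
#include <stdio.h>

/* Illustrative stand-in for one underlying device's queue capability. */
struct dev {
	const char *name;
	bool supports_copy;	/* would be blk_queue_copy(q) in the kernel */
};

/* AND-reduction mirroring dm_table_supports_copy(): the target must opt
 * in, and every data device beneath it must support copy. */
static bool table_supports_copy(const struct dev *devs, int n,
		bool target_opted_in)
{
	int i;

	if (!target_opted_in)
		return false;
	for (i = 0; i < n; i++)
		if (!devs[i].supports_copy)
			return false;
	return true;
}

int main(void)
{
	struct dev devs[] = {
		{ "nvme0n1", true },
		{ "nvme1n1", false },	/* one non-capable device disables copy */
	};

	printf("copy offload: %s\n",
			table_supports_copy(devs, 2, true) ?
			"enabled" : "disabled");
	return 0;
}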
Signed-off-by: Nitesh Shetty
---
 drivers/md/dm-table.c         | 42 +++++++++++++++++++++++++++++++++++
 drivers/md/dm.c               |  7 ++++++
 include/linux/device-mapper.h |  5 +++++
 3 files changed, 54 insertions(+)

diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index 078da18bb86d..b2073e857a74 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1875,6 +1875,39 @@ static bool dm_table_supports_nowait(struct dm_table *t)
 	return true;
 }
 
+static int device_not_copy_capable(struct dm_target *ti, struct dm_dev *dev,
+		sector_t start, sector_t len, void *data)
+{
+	struct request_queue *q = bdev_get_queue(dev->bdev);
+
+	return !blk_queue_copy(q);
+}
+
+static bool dm_table_supports_copy(struct dm_table *t)
+{
+	struct dm_target *ti;
+	unsigned int i;
+
+	for (i = 0; i < t->num_targets; i++) {
+		ti = dm_table_get_target(t, i);
+
+		if (!ti->copy_offload_supported)
+			return false;
+
+		/*
+		 * target provides copy support (as implied by setting
+		 * 'copy_offload_supported')
+		 * and it relies on _all_ data devices having copy support.
+		 */
+		if (!ti->type->iterate_devices ||
+				ti->type->iterate_devices(ti,
+					device_not_copy_capable, NULL))
+			return false;
+	}
+
+	return true;
+}
+
 static int device_not_discard_capable(struct dm_target *ti, struct dm_dev *dev,
 		sector_t start, sector_t len, void *data)
 {
@@ -1957,6 +1990,15 @@ int dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 		q->limits.discard_misaligned = 0;
 	}
 
+	if (!dm_table_supports_copy(t)) {
+		blk_queue_flag_clear(QUEUE_FLAG_COPY, q);
+		/* Must also clear copy limits... */
+		q->limits.max_copy_sectors = 0;
+		q->limits.max_copy_sectors_hw = 0;
+	} else {
+		blk_queue_flag_set(QUEUE_FLAG_COPY, q);
+	}
+
 	if (!dm_table_supports_secure_erase(t))
 		q->limits.max_secure_erase_sectors = 0;
 
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index e1ea3a7bd9d9..713335995290 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1690,6 +1690,13 @@ static blk_status_t __split_and_process_bio(struct clone_info *ci)
 	if (unlikely(ci->is_abnormal_io))
 		return __process_abnormal_io(ci, ti);
 
+	if ((unlikely(op_is_copy(ci->bio->bi_opf)) &&
+			max_io_len(ti, ci->sector) < ci->sector_count)) {
+		DMERR("Error, IO size(%u) > max target size(%llu)\n",
+				ci->sector_count, max_io_len(ti, ci->sector));
+		return BLK_STS_IOERR;
+	}
+
 	/*
 	 * Only support bio polling for normal IO, and the target io is
 	 * exactly inside the dm_io instance (verified in dm_poll_dm_io)
diff --git a/include/linux/device-mapper.h b/include/linux/device-mapper.h
index 04c6acf7faaa..da4e77e81011 100644
--- a/include/linux/device-mapper.h
+++ b/include/linux/device-mapper.h
@@ -379,6 +379,11 @@ struct dm_target {
 	 * bio_set_dev(). NOTE: ideally a target should _not_ need this.
 	 */
 	bool needs_bio_set_dev:1;
+
+	/*
+	 * copy offload is supported
+	 */
+	bool copy_offload_supported:1;
 };
 
 void *dm_per_bio_data(struct bio *bio, size_t data_size);
From patchwork Wed Nov 23 05:58:25 2022
From: Nitesh Shetty
Subject: [PATCH v5 08/10] dm: Enable copy offload for dm-linear target
Date: Wed, 23 Nov 2022 11:28:25 +0530
Message-Id: <20221123055827.26996-9-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>
Set the copy_offload_supported flag to enable copy offload for the
dm-linear target.

Signed-off-by: Nitesh Shetty
---
 drivers/md/dm-linear.c | 1 +
 1 file changed, 1 insertion(+)

diff --git a/drivers/md/dm-linear.c b/drivers/md/dm-linear.c
index 3212ef6aa81b..b4b57bead495 100644
--- a/drivers/md/dm-linear.c
+++ b/drivers/md/dm-linear.c
@@ -61,6 +61,7 @@ static int linear_ctr(struct dm_target *ti, unsigned int argc, char **argv)
 	ti->num_discard_bios = 1;
 	ti->num_secure_erase_bios = 1;
 	ti->num_write_zeroes_bios = 1;
+	ti->copy_offload_supported = 1;
 	ti->private = lc;
 
 	return 0;
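[Editor's note] Once a linear table is loaded over a copy-capable device,
the queue flag kept by dm_table_set_restrictions() can be spot-checked
from userspace. A hedged sketch: the sysfs attribute name copy_max_bytes
is an assumption based on the block-layer patches earlier in this series,
and the dm-0 device path is illustrative:

	#include <stdio.h>

	int main(void)
	{
		/* Attribute name and device node are assumptions */
		FILE *f = fopen("/sys/block/dm-0/queue/copy_max_bytes", "r");
		char buf[32];

		if (f && fgets(buf, sizeof(buf), f))
			/* A non-zero value means copy offload is enabled */
			printf("copy_max_bytes: %s", buf);
		if (f)
			fclose(f);
		return 0;
	}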
From patchwork Wed Nov 23 05:58:26 2022
From: Nitesh Shetty
Subject: [PATCH v5 09/10] dm kcopyd: use copy offload support
Date: Wed, 23 Nov 2022 11:28:26 +0530
Message-Id: <20221123055827.26996-10-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>

Introduce a copy_jobs list so that kcopyd uses copy offload when the
underlying devices support it, falling back to the existing read/write
method otherwise.

run_copy_job() calls the block layer copy offload API when the source
and destination share the same request queue and that queue supports
copy offload. On successful completion, the copied destination region's
count is set to zero; failed regions are processed via the existing
method.

Signed-off-by: Nitesh Shetty
Signed-off-by: Anuj Gupta
---
 drivers/md/dm-kcopyd.c | 56 +++++++++++++++++++++++++++++++++++++-----
 1 file changed, 50 insertions(+), 6 deletions(-)

diff --git a/drivers/md/dm-kcopyd.c b/drivers/md/dm-kcopyd.c
index 4d3bbbea2e9a..2f9985f671ac 100644
--- a/drivers/md/dm-kcopyd.c
+++ b/drivers/md/dm-kcopyd.c
@@ -74,18 +74,20 @@ struct dm_kcopyd_client {
 	atomic_t nr_jobs;
 
 	/*
-	 * We maintain four lists of jobs:
+	 * We maintain five lists of jobs:
 	 *
-	 * i) jobs waiting for pages
-	 * ii) jobs that have pages, and are waiting for the io to be issued.
-	 * iii) jobs that don't need to do any IO and just run a callback
-	 * iv) jobs that have completed.
+	 * i) jobs waiting to try copy offload
+	 * ii) jobs waiting for pages
+	 * iii) jobs that have pages, and are waiting for the io to be issued.
+	 * iv) jobs that don't need to do any IO and just run a callback
+	 * v) jobs that have completed.
 	 *
-	 * All four of these are protected by job_lock.
+	 * All five of these are protected by job_lock.
 	 */
 	spinlock_t job_lock;
 	struct list_head callback_jobs;
 	struct list_head complete_jobs;
+	struct list_head copy_jobs;
 	struct list_head io_jobs;
 	struct list_head pages_jobs;
 };
@@ -579,6 +581,43 @@ static int run_io_job(struct kcopyd_job *job)
 	return r;
 }
 
+static int run_copy_job(struct kcopyd_job *job)
+{
+	int r = 0, i, count = 0;
+	struct range_entry range;
+	struct request_queue *src_q, *dest_q;
+
+	for (i = 0; i < job->num_dests; i++) {
+		range.dst = job->dests[i].sector << SECTOR_SHIFT;
+		range.src = job->source.sector << SECTOR_SHIFT;
+		range.len = job->source.count << SECTOR_SHIFT;
+
+		src_q = bdev_get_queue(job->source.bdev);
+		dest_q = bdev_get_queue(job->dests[i].bdev);
+
+		if (src_q != dest_q || !blk_queue_copy(src_q))
+			break;
+
+		r = blkdev_issue_copy(job->source.bdev, job->dests[i].bdev,
+				      &range, 1, NULL, NULL, GFP_KERNEL);
+		if (r)
+			break;
+
+		job->dests[i].count = 0;
+		count++;
+	}
+
+	if (count == job->num_dests) {
+		push(&job->kc->complete_jobs, job);
+	} else {
+		push(&job->kc->pages_jobs, job);
+		r = 0;
+	}
+
+	return r;
+}
+
 static int run_pages_job(struct kcopyd_job *job)
 {
 	int r;
@@ -659,6 +698,7 @@ static void do_work(struct work_struct *work)
 	spin_unlock_irq(&kc->job_lock);
 
 	blk_start_plug(&plug);
+	process_jobs(&kc->copy_jobs, kc, run_copy_job);
 	process_jobs(&kc->complete_jobs, kc, run_complete_job);
 	process_jobs(&kc->pages_jobs, kc, run_pages_job);
 	process_jobs(&kc->io_jobs, kc, run_io_job);
@@ -676,6 +716,8 @@ static void dispatch_job(struct kcopyd_job *job)
 	atomic_inc(&kc->nr_jobs);
 	if (unlikely(!job->source.count))
 		push(&kc->callback_jobs, job);
+	else if (job->source.bdev->bd_disk == job->dests[0].bdev->bd_disk)
+		push(&kc->copy_jobs, job);
 	else if (job->pages == &zero_page_list)
 		push(&kc->io_jobs, job);
 	else
@@ -916,6 +958,7 @@ struct dm_kcopyd_client *dm_kcopyd_client_create(struct dm_kcopyd_throttle *throttle)
 	spin_lock_init(&kc->job_lock);
 	INIT_LIST_HEAD(&kc->callback_jobs);
 	INIT_LIST_HEAD(&kc->complete_jobs);
+	INIT_LIST_HEAD(&kc->copy_jobs);
 	INIT_LIST_HEAD(&kc->io_jobs);
 	INIT_LIST_HEAD(&kc->pages_jobs);
 	kc->throttle = throttle;
@@ -971,6 +1014,7 @@ void dm_kcopyd_client_destroy(struct dm_kcopyd_client *kc)
 	BUG_ON(!list_empty(&kc->callback_jobs));
 	BUG_ON(!list_empty(&kc->complete_jobs));
+	WARN_ON(!list_empty(&kc->copy_jobs));
 	BUG_ON(!list_empty(&kc->io_jobs));
 	BUG_ON(!list_empty(&kc->pages_jobs));
 	destroy_workqueue(kc->kcopyd_wq);
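[Editor's note] For reference, the offload submission used by
run_copy_job() above operates on byte-granularity ranges. A minimal
sketch of a single-range, synchronous submission, mirroring the call
sites in this series; src_bdev, dst_bdev and the sector variables are
placeholders, while blkdev_issue_copy() and struct range_entry come from
the block-layer patches of this series:

	struct range_entry range = {
		.src = src_sector << SECTOR_SHIFT,	/* byte offsets */
		.dst = dst_sector << SECTOR_SHIFT,
		.len = nr_sectors << SECTOR_SHIFT,
		.comp_len = 0,
	};
	int ret;

	/* NULL end_io callback and private data, as in run_copy_job() */
	ret = blkdev_issue_copy(src_bdev, dst_bdev, &range, 1, NULL, NULL,
				GFP_KERNEL);
	/* On partial failure, range.comp_len holds the bytes copied */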
From patchwork Wed Nov 23 05:58:27 2022
From: Nitesh Shetty
Subject: [PATCH v5 10/10] fs: add support for copy file range in zonefs
Date: Wed, 23 Nov 2022 11:28:27 +0530
Message-Id: <20221123055827.26996-11-nj.shetty@samsung.com>
In-Reply-To: <20221123055827.26996-1-nj.shetty@samsung.com>

copy_file_range is implemented using copy offload; copy offloading to
the device is enabled by default. To disable copy offloading, mount
with the "no_copy_offload" mount option. At present copy offload is
used only if the source and destination files are on the same block
device; otherwise copy_file_range is completed by the generic copy
file range path.

copy_file_range is implemented as follows:
 - flush pending writes on the src and dest files
 - drop the page cache for the dest file if it is a conventional zone
 - copy the range using offload
 - update the dest file metadata

For all failure cases we fall back to the generic copy file range
path. At present this implementation does not support conventional
zone aggregation.
Signed-off-by: Nitesh Shetty
Signed-off-by: Anuj Gupta
---
 fs/zonefs/super.c | 179 ++++++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 179 insertions(+)

diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index abc9a85106f2..15613433d4ae 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -1223,6 +1223,183 @@ static int zonefs_file_release(struct inode *inode, struct file *file)
 	return 0;
 }
 
+static int zonefs_is_file_copy_offset_ok(struct inode *src_inode,
+		struct inode *dst_inode, loff_t src_off, loff_t dst_off,
+		size_t *len)
+{
+	loff_t size, endoff;
+	struct zonefs_inode_info *dst_zi = ZONEFS_I(dst_inode);
+
+	inode_lock(src_inode);
+	size = i_size_read(src_inode);
+	inode_unlock(src_inode);
+	/* Don't copy beyond source file EOF. */
+	if (src_off < size) {
+		if (src_off + *len > size)
+			*len = size - src_off;
+	} else
+		*len = 0;
+
+	mutex_lock(&dst_zi->i_truncate_mutex);
+	if (dst_zi->i_ztype == ZONEFS_ZTYPE_SEQ) {
+		/* Clamp to the remaining space in the sequential zone */
+		if (*len > dst_zi->i_max_size - dst_zi->i_wpoffset)
+			*len = dst_zi->i_max_size - dst_zi->i_wpoffset;
+
+		if (dst_off != dst_zi->i_wpoffset)
+			goto err;
+	}
+	mutex_unlock(&dst_zi->i_truncate_mutex);
+
+	endoff = dst_off + *len;
+	inode_lock(dst_inode);
+	if (endoff > dst_zi->i_max_size ||
+	    inode_newsize_ok(dst_inode, endoff)) {
+		inode_unlock(dst_inode);
+		return -EINVAL;
+	}
+	inode_unlock(dst_inode);
+
+	return 0;
+err:
+	mutex_unlock(&dst_zi->i_truncate_mutex);
+	return -EINVAL;
+}
+
+static ssize_t zonefs_issue_copy(struct zonefs_inode_info *src_zi,
+		loff_t src_off, struct zonefs_inode_info *dst_zi,
+		loff_t dst_off, size_t len)
+{
+	struct block_device *src_bdev = src_zi->i_vnode.i_sb->s_bdev;
+	struct block_device *dst_bdev = dst_zi->i_vnode.i_sb->s_bdev;
+	struct range_entry *rlist = NULL;
+	ssize_t ret = len;
+
+	rlist = kmalloc(sizeof(*rlist), GFP_KERNEL);
+	if (!rlist)
+		return -ENOMEM;
+
+	rlist[0].dst = (dst_zi->i_zsector << SECTOR_SHIFT) + dst_off;
+	rlist[0].src = (src_zi->i_zsector << SECTOR_SHIFT) + src_off;
+	rlist[0].len = len;
+	rlist[0].comp_len = 0;
+	ret = blkdev_issue_copy(src_bdev, dst_bdev, rlist, 1, NULL, NULL,
+				GFP_KERNEL);
+	if (rlist[0].comp_len > 0)
+		ret = rlist[0].comp_len;
+	kfree(rlist);
+
+	return ret;
+}
+
+/* Returns length of possible copy, else returns error */
+static ssize_t zonefs_copy_file_checks(struct file *src_file, loff_t src_off,
+				       struct file *dst_file, loff_t dst_off,
+				       size_t *len, unsigned int flags)
+{
+	struct inode *src_inode = file_inode(src_file);
+	struct inode *dst_inode = file_inode(dst_file);
+	struct zonefs_inode_info *src_zi = ZONEFS_I(src_inode);
+	struct zonefs_inode_info *dst_zi = ZONEFS_I(dst_inode);
+	ssize_t ret;
+
+	if (src_inode->i_sb != dst_inode->i_sb)
+		return -EXDEV;
+
+	/* Start by sync'ing the source file if it is a conv zone */
+	if (src_zi->i_ztype == ZONEFS_ZTYPE_CNV) {
+		ret = file_write_and_wait_range(src_file, src_off,
+						(src_off + *len));
+		if (ret < 0)
+			goto io_error;
+	}
+	inode_dio_wait(src_inode);
+
+	/* Do the same for the destination file */
+	if (dst_zi->i_ztype == ZONEFS_ZTYPE_CNV) {
+		ret = file_write_and_wait_range(dst_file, dst_off,
+						(dst_off + *len));
+		if (ret < 0)
+			goto io_error;
+	}
+	inode_dio_wait(dst_inode);
+
+	/* Drop dst file cached pages for a conv zone */
+	if (dst_zi->i_ztype == ZONEFS_ZTYPE_CNV) {
+		ret = invalidate_inode_pages2_range(dst_inode->i_mapping,
+						    dst_off >> PAGE_SHIFT,
+						    (dst_off + *len) >> PAGE_SHIFT);
+		if (ret < 0)
+			goto io_error;
+	}
+
+	ret = zonefs_is_file_copy_offset_ok(src_inode, dst_inode, src_off,
+					    dst_off, len);
+	if (ret < 0)
+		return ret;
+
+	return *len;
+
+io_error:
+	zonefs_io_error(dst_inode, true);
+	return ret;
+}
+
+static ssize_t zonefs_copy_file(struct file *src_file, loff_t src_off,
+				struct file *dst_file, loff_t dst_off,
+				size_t len, unsigned int flags)
+{
+	struct inode *src_inode = file_inode(src_file);
+	struct inode *dst_inode = file_inode(dst_file);
+	struct zonefs_inode_info *src_zi = ZONEFS_I(src_inode);
+	struct zonefs_inode_info *dst_zi = ZONEFS_I(dst_inode);
+	ssize_t ret = 0, bytes;
+
+	inode_lock(src_inode);
+	inode_lock(dst_inode);
+	bytes = zonefs_issue_copy(src_zi, src_off, dst_zi, dst_off, len);
+	if (bytes < 0)
+		goto unlock_exit;
+
+	ret += bytes;
+
+	file_update_time(dst_file);
+	mutex_lock(&dst_zi->i_truncate_mutex);
+	zonefs_update_stats(dst_inode, dst_off + bytes);
+	zonefs_i_size_write(dst_inode, dst_off + bytes);
+	dst_zi->i_wpoffset += bytes;
+	mutex_unlock(&dst_zi->i_truncate_mutex);
+	/* if we still have some bytes left, do splice copy */
+	if (bytes && (bytes < len)) {
+		bytes = do_splice_direct(src_file, &src_off, dst_file,
+					 &dst_off, len, flags);
+		if (bytes > 0)
+			ret += bytes;
+	}
+unlock_exit:
+	if (ret < 0)
+		zonefs_io_error(dst_inode, true);
+	inode_unlock(src_inode);
+	inode_unlock(dst_inode);
+	return ret;
+}
+
+static ssize_t zonefs_copy_file_range(struct file *src_file, loff_t src_off,
+				      struct file *dst_file, loff_t dst_off,
+				      size_t len, unsigned int flags)
+{
+	ssize_t ret;
+
+	ret = zonefs_copy_file_checks(src_file, src_off, dst_file, dst_off,
+				      &len, flags);
+	if (ret > 0)
+		ret = zonefs_copy_file(src_file, src_off, dst_file, dst_off,
+				       len, flags);
+	else if (ret == -EXDEV)
+		ret = generic_copy_file_range(src_file, src_off, dst_file,
+					      dst_off, len, flags);
+	return ret;
+}
+
 static const struct file_operations zonefs_file_operations = {
 	.open		= zonefs_file_open,
 	.release	= zonefs_file_release,
@@ -1234,6 +1411,7 @@ static const struct file_operations zonefs_file_operations = {
 	.splice_read	= generic_file_splice_read,
 	.splice_write	= iter_file_splice_write,
 	.iopoll		= iocb_bio_iopoll,
+	.copy_file_range = zonefs_copy_file_range,
 };
 
 static struct kmem_cache *zonefs_inode_cachep;
@@ -1804,6 +1982,7 @@ static int zonefs_fill_super(struct super_block *sb, void *data, int silent)
 	atomic_set(&sbi->s_active_seq_files, 0);
 	sbi->s_max_active_seq_files = bdev_max_active_zones(sb->s_bdev);
 
+	/* set copy support by default */
 	ret = zonefs_read_super(sb);
 	if (ret)
 		return ret;
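[Editor's note] A minimal userspace sketch of exercising the path added
above via the copy_file_range(2) syscall. The file paths are
illustrative; both files must live on the same zonefs mount, otherwise
zonefs_copy_file_checks() returns -EXDEV and the generic path is used:

	#define _GNU_SOURCE
	#include <fcntl.h>
	#include <stdio.h>
	#include <unistd.h>

	int main(void)
	{
		int in = open("/mnt/zonefs/cnv/0", O_RDONLY);
		int out = open("/mnt/zonefs/seq/0", O_WRONLY);
		loff_t off_in = 0, off_out = 0;
		ssize_t n;

		if (in < 0 || out < 0)
			return 1;
		/* Ask for 1 MiB; the kernel may copy less (short count) */
		n = copy_file_range(in, &off_in, out, &off_out, 1 << 20, 0);
		printf("copied %zd bytes\n", n);
		close(in);
		close(out);
		return n < 0;
	}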