From patchwork Fri Dec 11 13:51:38 2020
X-Patchwork-Id: 11971929
From: SelvaKumar S
To: linux-nvme@lists.infradead.org
Date: Fri, 11 Dec 2020 19:21:38 +0530
Message-Id: <20201211135139.49232-2-selvakuma.s1@samsung.com>
In-Reply-To: <20201211135139.49232-1-selvakuma.s1@samsung.com>
Cc: axboe@kernel.dk, damien.lemoal@wdc.com, SelvaKumar S, sagi@grimberg.me, linux-scsi@vger.kernel.org, selvajove@gmail.com, Johannes.Thumshirn@wdc.com, snitzer@redhat.com, linux-kernel@vger.kernel.org, nj.shetty@samsung.com, linux-block@vger.kernel.org, dm-devel@redhat.com, mpatocka@redhat.com, joshi.k@samsung.com, martin.petersen@oracle.com, kbusch@kernel.org, javier.gonz@samsung.com, hch@lst.de, bvanassche@acm.org
Subject: [dm-devel] [RFC PATCH v3 1/2] block: add simple copy support

Add a new BLKCOPY ioctl that offloads copying of multiple source ranges
to a destination to the device. The ioctl accepts a copy_range structure
that holds the destination, the number of source ranges, and a pointer
to the array of source ranges. Each range_entry contains the start and
length of a source range (in bytes).

Introduce REQ_OP_COPY, a no-merge copy offload operation. A bio is
created with the control information as its payload and submitted to
the device. REQ_OP_COPY(19) is a write op and takes the zone write lock
when submitted to a zoned device.

If the device doesn't support copy, or if copy offload is disabled, the
copy is emulated by reading and writing each source range one by one.

Introduce queue limits for simple copy along with helper functions, and
expose the device limits as sysfs entries:
- copy_offload
- max_copy_sectors
- max_copy_range_sectors
- max_copy_nr_ranges

copy_offload (= 0) is disabled by default. max_copy_sectors = 0
indicates that the device does not support copy. Simple copy is not
supported for stacked devices.
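For intuition, the emulation fallback amounts to a read-into-buffer,
write-out loop per source range. A minimal user-space sketch of the
equivalent semantics (illustrative only; the fd, offsets and the
src_range struct are assumptions, not part of the patch):

/* Sketch: what the blk_copy_emulate() fallback amounts to, expressed
 * with pread()/pwrite(). Names and values are illustrative. */
#include <stdlib.h>
#include <unistd.h>

struct src_range { off_t off; size_t len; };

static int copy_emulated(int fd, off_t dest, const struct src_range *r, int nr)
{
	for (int i = 0; i < nr; i++) {
		char *buf = malloc(r[i].len);	/* bounce buffer per range */

		if (!buf)
			return -1;
		if (pread(fd, buf, r[i].len, r[i].off) != (ssize_t)r[i].len ||
		    pwrite(fd, buf, r[i].len, dest) != (ssize_t)r[i].len) {
			free(buf);
			return -1;
		}
		dest += r[i].len;	/* destination advances range by range */
		free(buf);
	}
	return 0;
}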
Signed-off-by: SelvaKumar S
Signed-off-by: Kanchan Joshi
Signed-off-by: Nitesh Shetty
Signed-off-by: Javier González
---
 block/blk-core.c          |  94 ++++++++++++++++++--
 block/blk-lib.c           | 182 ++++++++++++++++++++++++++++++++++++++
 block/blk-merge.c         |   2 +
 block/blk-settings.c      |  10 +++
 block/blk-sysfs.c         |  50 +++++++++++
 block/blk-zoned.c         |   1 +
 block/bounce.c            |   1 +
 block/ioctl.c             |  43 +++++++++
 include/linux/bio.h       |   1 +
 include/linux/blk_types.h |  15 ++++
 include/linux/blkdev.h    |  16 ++++
 include/uapi/linux/fs.h   |  13 +++
 12 files changed, 420 insertions(+), 8 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 2db8bda43b6e..07d64514e77b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -719,6 +719,17 @@ static noinline int should_fail_bio(struct bio *bio)
 }
 ALLOW_ERROR_INJECTION(should_fail_bio, ERRNO);
 
+static inline int bio_check_copy_eod(struct bio *bio, sector_t start,
+		sector_t nr_sectors, sector_t maxsector)
+{
+	if (nr_sectors && maxsector &&
+	    (nr_sectors > maxsector || start > maxsector - nr_sectors)) {
+		handle_bad_sector(bio, maxsector);
+		return -EIO;
+	}
+	return 0;
+}
+
 /*
  * Check whether this bio extends beyond the end of the device or partition.
  * This may well happen - the kernel calls bread() without checking the size of
@@ -737,6 +748,65 @@ static inline int bio_check_eod(struct bio *bio, sector_t maxsector)
 	return 0;
 }
 
+/*
+ * Check for copy limits and remap source ranges if needed.
+ */
+static int blk_check_copy(struct bio *bio)
+{
+	struct hd_struct *p = NULL;
+	struct request_queue *q = bio->bi_disk->queue;
+	struct blk_copy_payload *payload;
+	int i, maxsector, start_sect = 0, ret = -EIO;
+	unsigned short nr_range;
+
+	rcu_read_lock();
+
+	if (bio->bi_partno) {
+		p = __disk_get_part(bio->bi_disk, bio->bi_partno);
+		if (unlikely(!p))
+			goto out;
+		if (unlikely(bio_check_ro(bio, p)))
+			goto out;
+		maxsector = part_nr_sects_read(p);
+		start_sect = p->start_sect;
+	} else {
+		if (unlikely(bio_check_ro(bio, &bio->bi_disk->part0)))
+			goto out;
+		maxsector = get_capacity(bio->bi_disk);
+	}
+
+	payload = bio_data(bio);
+	nr_range = payload->copy_range;
+
+	/* cannot handle copy crossing nr_ranges limit */
+	if (payload->copy_range > q->limits.max_copy_nr_ranges)
+		goto out;
+
+	/* cannot handle copy more than copy limits */
+	if (payload->copy_size > q->limits.max_copy_sectors)
+		goto out;
+
+	/* check if copy length crosses eod */
+	if (unlikely(bio_check_copy_eod(bio, bio->bi_iter.bi_sector,
+					payload->copy_size, maxsector)))
+		goto out;
+	bio->bi_iter.bi_sector += start_sect;
+
+	for (i = 0; i < nr_range; i++) {
+		if (unlikely(bio_check_copy_eod(bio, payload->range[i].src,
+					payload->range[i].len, maxsector)))
+			goto out;
+		payload->range[i].src += start_sect;
+	}
+
+	if (p)
+		bio->bi_partno = 0;
+	ret = 0;
+out:
+	rcu_read_unlock();
+	return ret;
+}
+
 /*
  * Remap block n of partition p to block n+start(p) of the disk.
  */
@@ -825,14 +895,16 @@ static noinline_for_stack bool submit_bio_checks(struct bio *bio)
 	if (should_fail_bio(bio))
 		goto end_io;
 
-	if (bio->bi_partno) {
-		if (unlikely(blk_partition_remap(bio)))
-			goto end_io;
-	} else {
-		if (unlikely(bio_check_ro(bio, &bio->bi_disk->part0)))
-			goto end_io;
-		if (unlikely(bio_check_eod(bio, get_capacity(bio->bi_disk))))
-			goto end_io;
+	if (likely(!op_is_copy(bio->bi_opf))) {
+		if (bio->bi_partno) {
+			if (unlikely(blk_partition_remap(bio)))
+				goto end_io;
+		} else {
+			if (unlikely(bio_check_ro(bio, &bio->bi_disk->part0)))
+				goto end_io;
+			if (unlikely(bio_check_eod(bio, get_capacity(bio->bi_disk))))
+				goto end_io;
+		}
 	}
 
 	/*
@@ -856,6 +928,12 @@ static noinline_for_stack bool submit_bio_checks(struct bio *bio)
 		if (!blk_queue_discard(q))
 			goto not_supported;
 		break;
+	case REQ_OP_COPY:
+		if (!blk_queue_copy(q))
+			goto not_supported;
+		if (unlikely(blk_check_copy(bio)))
+			goto end_io;
+		break;
 	case REQ_OP_SECURE_ERASE:
 		if (!blk_queue_secure_erase(q))
 			goto not_supported;
diff --git a/block/blk-lib.c b/block/blk-lib.c
index e90614fd8d6a..47e50e957e75 100644
--- a/block/blk-lib.c
+++ b/block/blk-lib.c
@@ -150,6 +150,188 @@ int blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 }
 EXPORT_SYMBOL(blkdev_issue_discard);
 
+int blk_copy_emulate(struct block_device *bdev, struct blk_copy_payload *payload,
+		gfp_t gfp_mask)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+	struct bio *bio;
+	void *buf = NULL;
+	int i, nr_srcs, max_range_len, ret, cur_dest, cur_size;
+
+	nr_srcs = payload->copy_range;
+	max_range_len = q->limits.max_copy_range_sectors << SECTOR_SHIFT;
+	cur_dest = payload->dest;
+	buf = kvmalloc(max_range_len, GFP_ATOMIC);
+	if (!buf)
+		return -ENOMEM;
+
+	for (i = 0; i < nr_srcs; i++) {
+		bio = bio_alloc(gfp_mask, 1);
+		bio->bi_iter.bi_sector = payload->range[i].src;
+		bio->bi_opf = REQ_OP_READ;
+		bio_set_dev(bio, bdev);
+
+		cur_size = payload->range[i].len << SECTOR_SHIFT;
+		ret = bio_add_page(bio, virt_to_page(buf), cur_size,
+				   offset_in_page(payload));
+		if (ret != cur_size) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		ret = submit_bio_wait(bio);
+		bio_put(bio);
+		if (ret)
+			goto out;
+
+		bio = bio_alloc(gfp_mask, 1);
+		bio_set_dev(bio, bdev);
+		bio->bi_opf = REQ_OP_WRITE;
+		bio->bi_iter.bi_sector = cur_dest;
+		ret = bio_add_page(bio, virt_to_page(buf), cur_size,
+				   offset_in_page(payload));
+		if (ret != cur_size) {
+			ret = -ENOMEM;
+			goto out;
+		}
+
+		ret = submit_bio_wait(bio);
+		bio_put(bio);
+		if (ret)
+			goto out;
+
+		cur_dest += payload->range[i].len;
+	}
+out:
+	kvfree(buf);
+	return ret;
+}
+
+int __blkdev_issue_copy(struct block_device *bdev, sector_t dest,
+		sector_t nr_srcs, struct range_entry *rlist, gfp_t gfp_mask,
+		int flags, struct bio **biop)
+{
+	struct request_queue *q = bdev_get_queue(bdev);
+	struct bio *bio;
+	struct blk_copy_payload *payload;
+	sector_t bs_mask;
+	sector_t src_sects, len = 0, total_len = 0;
+	int i, ret, total_size;
+
+	if (!q)
+		return -ENXIO;
+
+	if (!nr_srcs)
+		return -EINVAL;
+
+	if (bdev_read_only(bdev))
+		return -EPERM;
+
+	if (!blk_queue_copy(q))
+		return -EOPNOTSUPP;
+
+	bs_mask = (bdev_logical_block_size(bdev) >> 9) - 1;
+	if (dest & bs_mask)
+		return -EINVAL;
+
+	total_size = struct_size(payload, range, nr_srcs);
+	payload = kmalloc(total_size, GFP_ATOMIC | __GFP_NOWARN);
+	if (!payload)
+		return -ENOMEM;
+
+	payload->dest = dest;
+
+	for (i = 0; i < nr_srcs; i++) {
+		/* copy payload provided are in bytes */
+		src_sects = rlist[i].src;
+		if (src_sects & bs_mask) {
+			ret = -EINVAL;
+			goto err;
+		}
+		src_sects = src_sects >> SECTOR_SHIFT;
+
+		if (len & bs_mask) {
+			ret = -EINVAL;
+			goto err;
+		}
+
+		len = rlist[i].len >> SECTOR_SHIFT;
+		if (len > q->limits.max_copy_range_sectors) {
+			ret = -EINVAL;
+			goto err;
+		}
+
+		total_len += len;
+
+		WARN_ON_ONCE((src_sects << 9) > UINT_MAX);
+
+		payload->range[i].src = src_sects;
+		payload->range[i].len = len;
+	}
+
+	/* storing # of source ranges */
+	payload->copy_range = i;
+	/* storing copy len so far */
+	payload->copy_size = total_len;
+
+	if (q->limits.copy_offload) {
+		bio = bio_alloc(gfp_mask, 1);
+		bio->bi_iter.bi_sector = dest;
+		bio->bi_opf = REQ_OP_COPY | REQ_NOMERGE;
+		bio_set_dev(bio, bdev);
+
+		ret = bio_add_page(bio, virt_to_page(payload), total_size,
+				   offset_in_page(payload));
+		if (ret != total_size) {
+			ret = -ENOMEM;
+			bio_put(bio);
+			goto err;
+		}
+
+		*biop = bio;
+		return 0;
+	}
+
+	ret = blk_copy_emulate(bdev, payload, gfp_mask);
+err:
+	kfree(payload);
+	return ret;
+}
+EXPORT_SYMBOL(__blkdev_issue_copy);
+
+/**
+ * blkdev_issue_copy - queue a copy
+ * @bdev:	blockdev to issue copy for
+ * @dest:	dest sector
+ * @nr_srcs:	number of source ranges to copy
+ * @rlist:	list of range entries
+ * @gfp_mask:	memory allocation flags (for bio_alloc)
+ * @flags:	BLKDEV_COPY_* flags to control behaviour	//TODO
+ *
+ * Description:
+ *    Issue a copy request for dest sector with source in rlist
+ */
+int blkdev_issue_copy(struct block_device *bdev, sector_t dest,
+		int nr_srcs, struct range_entry *rlist,
+		gfp_t gfp_mask, unsigned long flags)
+{
+	struct bio *bio = NULL;
+	int ret;
+
+	ret = __blkdev_issue_copy(bdev, dest, nr_srcs, rlist, gfp_mask, flags,
+			&bio);
+	if (!ret && bio) {
+		ret = submit_bio_wait(bio);
+
+		kfree(page_address(bio_first_bvec_all(bio)->bv_page) +
+				bio_first_bvec_all(bio)->bv_offset);
+		bio_put(bio);
+	}
+
+	return ret;
+}
+EXPORT_SYMBOL(blkdev_issue_copy);
+
 /**
  * __blkdev_issue_write_same - generate number of bios with same page
  * @bdev:	target blockdev
diff --git a/block/blk-merge.c b/block/blk-merge.c
index bcf5e4580603..a16e7598d6ad 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -301,6 +301,8 @@ void __blk_queue_split(struct bio **bio, unsigned int *nr_segs)
 	struct bio *split = NULL;
 
 	switch (bio_op(*bio)) {
+	case REQ_OP_COPY:
+		break;
 	case REQ_OP_DISCARD:
 	case REQ_OP_SECURE_ERASE:
 		split = blk_bio_discard_split(q, *bio, &q->bio_split, nr_segs);
diff --git a/block/blk-settings.c b/block/blk-settings.c
index 9741d1d83e98..9980e681b8b5 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -60,6 +60,10 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->io_opt = 0;
 	lim->misaligned = 0;
 	lim->zoned = BLK_ZONED_NONE;
+	lim->copy_offload = 0;
+	lim->max_copy_sectors = 0;
+	lim->max_copy_nr_ranges = 0;
+	lim->max_copy_range_sectors = 0;
 }
 EXPORT_SYMBOL(blk_set_default_limits);
 
@@ -549,6 +553,12 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 	t->io_opt = lcm_not_zero(t->io_opt, b->io_opt);
 	t->chunk_sectors = lcm_not_zero(t->chunk_sectors, b->chunk_sectors);
 
+	/* simple copy not supported in stacked devices */
+	t->copy_offload = 0;
+	t->max_copy_sectors = 0;
+	t->max_copy_range_sectors = 0;
+	t->max_copy_nr_ranges = 0;
+
 	/* Physical block size a multiple of the logical block size? */
 	if (t->physical_block_size & (t->logical_block_size - 1)) {
 		t->physical_block_size = t->logical_block_size;
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index b513f1683af0..51b35a8311d9 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -166,6 +166,47 @@ static ssize_t queue_discard_granularity_show(struct request_queue *q, char *pag
 	return queue_var_show(q->limits.discard_granularity, page);
 }
 
+static ssize_t queue_copy_offload_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(q->limits.copy_offload, page);
+}
+
+static ssize_t queue_copy_offload_store(struct request_queue *q,
+				       const char *page, size_t count)
+{
+	unsigned long copy_offload;
+	ssize_t ret = queue_var_store(&copy_offload, page, count);
+
+	if (ret < 0)
+		return ret;
+
+	if (copy_offload < 0 || copy_offload > 1)
+		return -EINVAL;
+
+	if (q->limits.max_copy_sectors == 0 && copy_offload == 1)
+		return -EINVAL;
+
+	q->limits.copy_offload = copy_offload;
+	return ret;
+}
+
+static ssize_t queue_max_copy_sectors_show(struct request_queue *q, char *page)
+{
+	return queue_var_show(q->limits.max_copy_sectors, page);
+}
+
+static ssize_t queue_max_copy_range_sectors_show(struct request_queue *q,
+		char *page)
+{
+	return queue_var_show(q->limits.max_copy_range_sectors, page);
+}
+
+static ssize_t queue_max_copy_nr_ranges_show(struct request_queue *q,
+		char *page)
+{
+	return queue_var_show(q->limits.max_copy_nr_ranges, page);
+}
+
 static ssize_t queue_discard_max_hw_show(struct request_queue *q, char *page)
 {
@@ -591,6 +632,11 @@ QUEUE_RO_ENTRY(queue_nr_zones, "nr_zones");
 QUEUE_RO_ENTRY(queue_max_open_zones, "max_open_zones");
 QUEUE_RO_ENTRY(queue_max_active_zones, "max_active_zones");
 
+QUEUE_RW_ENTRY(queue_copy_offload, "copy_offload");
+QUEUE_RO_ENTRY(queue_max_copy_sectors, "max_copy_sectors");
+QUEUE_RO_ENTRY(queue_max_copy_range_sectors, "max_copy_range_sectors");
+QUEUE_RO_ENTRY(queue_max_copy_nr_ranges, "max_copy_nr_ranges");
+
 QUEUE_RW_ENTRY(queue_nomerges, "nomerges");
 QUEUE_RW_ENTRY(queue_rq_affinity, "rq_affinity");
 QUEUE_RW_ENTRY(queue_poll, "io_poll");
@@ -636,6 +682,10 @@ static struct attribute *queue_attrs[] = {
 	&queue_discard_max_entry.attr,
 	&queue_discard_max_hw_entry.attr,
 	&queue_discard_zeroes_data_entry.attr,
+	&queue_copy_offload_entry.attr,
+	&queue_max_copy_sectors_entry.attr,
+	&queue_max_copy_range_sectors_entry.attr,
+	&queue_max_copy_nr_ranges_entry.attr,
 	&queue_write_same_max_entry.attr,
 	&queue_write_zeroes_max_entry.attr,
 	&queue_zone_append_max_entry.attr,
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 6817a673e5ce..6e5fef3cc615 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -75,6 +75,7 @@ bool blk_req_needs_zone_write_lock(struct request *rq)
 	case REQ_OP_WRITE_ZEROES:
 	case REQ_OP_WRITE_SAME:
 	case REQ_OP_WRITE:
+	case REQ_OP_COPY:
 		return blk_rq_zone_is_seq(rq);
 	default:
 		return false;
diff --git a/block/bounce.c b/block/bounce.c
index 162a6eee8999..7fbdc52decb3 100644
--- a/block/bounce.c
+++ b/block/bounce.c
@@ -254,6 +254,7 @@ static struct bio *bounce_clone_bio(struct bio *bio_src, gfp_t gfp_mask,
 	bio->bi_iter.bi_size	= bio_src->bi_iter.bi_size;
 
 	switch (bio_op(bio)) {
+	case REQ_OP_COPY:
 	case REQ_OP_DISCARD:
 	case REQ_OP_SECURE_ERASE:
 	case REQ_OP_WRITE_ZEROES:
diff --git a/block/ioctl.c b/block/ioctl.c
index 6b785181344f..a4a507d85e56 100644
--- a/block/ioctl.c
+++ b/block/ioctl.c
@@ -142,6 +142,47 @@ static int blk_ioctl_discard(struct block_device *bdev, fmode_t mode,
 			GFP_KERNEL, flags);
 }
 
+static int blk_ioctl_copy(struct block_device *bdev, fmode_t mode,
+		unsigned long arg, unsigned long flags)
+{
+	struct copy_range crange;
+	struct range_entry *rlist;
+	struct request_queue *q = bdev_get_queue(bdev);
+	sector_t dest;
+	int ret;
+
+	if (!(mode & FMODE_WRITE))
+		return -EBADF;
+
+	if (!blk_queue_copy(q))
+		return -EOPNOTSUPP;
+
+	if (copy_from_user(&crange, (void __user *)arg, sizeof(crange)))
+		return -EFAULT;
+
+	if (crange.dest & ((1 << SECTOR_SHIFT) - 1))
+		return -EFAULT;
+	dest = crange.dest >> SECTOR_SHIFT;
+
+	rlist = kmalloc_array(crange.nr_range, sizeof(*rlist),
+			GFP_ATOMIC | __GFP_NOWARN);
+
+	if (!rlist)
+		return -ENOMEM;
+
+	if (copy_from_user(rlist, (void __user *)crange.range_list,
+				sizeof(*rlist) * crange.nr_range)) {
+		ret = -EFAULT;
+		goto out;
+	}
+
+	ret = blkdev_issue_copy(bdev, dest, crange.nr_range,
+			rlist, GFP_KERNEL, flags);
+out:
+	kfree(rlist);
+	return ret;
+}
+
 static int blk_ioctl_zeroout(struct block_device *bdev, fmode_t mode,
 		unsigned long arg)
 {
@@ -467,6 +508,8 @@ static int blkdev_common_ioctl(struct block_device *bdev, fmode_t mode,
 	case BLKSECDISCARD:
 		return blk_ioctl_discard(bdev, mode, arg,
 				BLKDEV_DISCARD_SECURE);
+	case BLKCOPY:
+		return blk_ioctl_copy(bdev, mode, arg, 0);
 	case BLKZEROOUT:
 		return blk_ioctl_zeroout(bdev, mode, arg);
 	case BLKREPORTZONE:
diff --git a/include/linux/bio.h b/include/linux/bio.h
index ecf67108f091..7e40a37f0ee5 100644
--- a/include/linux/bio.h
+++ b/include/linux/bio.h
@@ -71,6 +71,7 @@ static inline bool bio_has_data(struct bio *bio)
 static inline bool bio_no_advance_iter(const struct bio *bio)
 {
 	return bio_op(bio) == REQ_OP_DISCARD ||
+	       bio_op(bio) == REQ_OP_COPY ||
 	       bio_op(bio) == REQ_OP_SECURE_ERASE ||
 	       bio_op(bio) == REQ_OP_WRITE_SAME ||
 	       bio_op(bio) == REQ_OP_WRITE_ZEROES;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index d9b69bbde5cc..4ecb9c16702d 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -360,6 +360,8 @@ enum req_opf {
 	REQ_OP_ZONE_RESET	= 15,
 	/* reset all the zone present on the device */
 	REQ_OP_ZONE_RESET_ALL	= 17,
+	/* copy ranges within device */
+	REQ_OP_COPY		= 19,
 
 	/* SCSI passthrough using struct scsi_request */
 	REQ_OP_SCSI_IN		= 32,
@@ -486,6 +488,11 @@ static inline bool op_is_discard(unsigned int op)
 	return (op & REQ_OP_MASK) == REQ_OP_DISCARD;
 }
 
+static inline bool op_is_copy(unsigned int op)
+{
+	return (op & REQ_OP_MASK) == REQ_OP_COPY;
+}
+
 /*
  * Check if a bio or request operation is a zone management operation, with
  * the exception of REQ_OP_ZONE_RESET_ALL which is treated as a special case
@@ -545,4 +552,12 @@ struct blk_rq_stat {
 	u64 batch;
 };
 
+struct blk_copy_payload {
+	sector_t	dest;
+	int		copy_range;
+	int		copy_size;
+	int		err;
+	struct range_entry	range[];
+};
+
 #endif /* __LINUX_BLK_TYPES_H */
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 05b346a68c2e..5b656b00850b 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -340,10 +340,14 @@ struct queue_limits {
 	unsigned int		max_zone_append_sectors;
 	unsigned int		discard_granularity;
 	unsigned int		discard_alignment;
+	unsigned int		copy_offload;
+	unsigned int		max_copy_sectors;
 
 	unsigned short		max_segments;
 	unsigned short		max_integrity_segments;
 	unsigned short		max_discard_segments;
+	unsigned short		max_copy_range_sectors;
+	unsigned short		max_copy_nr_ranges;
 
 	unsigned char		misaligned;
 	unsigned char		discard_misaligned;
@@ -625,6 +629,7 @@ struct request_queue {
 #define QUEUE_FLAG_RQ_ALLOC_TIME 27	/* record rq->alloc_time_ns */
 #define QUEUE_FLAG_HCTX_ACTIVE	28	/* at least one blk-mq hctx is active */
 #define QUEUE_FLAG_NOWAIT       29	/* device supports NOWAIT */
+#define QUEUE_FLAG_COPY		30	/* supports copy */
 
 #define QUEUE_FLAG_MQ_DEFAULT	((1 << QUEUE_FLAG_IO_STAT) |		\
 				 (1 << QUEUE_FLAG_SAME_COMP) |		\
@@ -647,6 +652,7 @@ bool blk_queue_flag_test_and_set(unsigned int flag, struct request_queue *q);
 #define blk_queue_io_stat(q)	test_bit(QUEUE_FLAG_IO_STAT, &(q)->queue_flags)
 #define blk_queue_add_random(q)	test_bit(QUEUE_FLAG_ADD_RANDOM, &(q)->queue_flags)
 #define blk_queue_discard(q)	test_bit(QUEUE_FLAG_DISCARD, &(q)->queue_flags)
+#define blk_queue_copy(q)	test_bit(QUEUE_FLAG_COPY, &(q)->queue_flags)
 #define blk_queue_zone_resetall(q)	\
 	test_bit(QUEUE_FLAG_ZONE_RESETALL, &(q)->queue_flags)
 #define blk_queue_secure_erase(q) \
@@ -1059,6 +1065,9 @@ static inline unsigned int blk_queue_get_max_sectors(struct request_queue *q,
 		return min(q->limits.max_discard_sectors,
 			   UINT_MAX >> SECTOR_SHIFT);
 
+	if (unlikely(op == REQ_OP_COPY))
+		return q->limits.max_copy_sectors;
+
 	if (unlikely(op == REQ_OP_WRITE_SAME))
 		return q->limits.max_write_same_sectors;
 
@@ -1330,6 +1339,13 @@ extern int __blkdev_issue_discard(struct block_device *bdev, sector_t sector,
 		sector_t nr_sects, gfp_t gfp_mask, int flags,
 		struct bio **biop);
 
+extern int __blkdev_issue_copy(struct block_device *bdev, sector_t dest,
+		sector_t nr_srcs, struct range_entry *rlist, gfp_t gfp_mask,
+		int flags, struct bio **biop);
+extern int blkdev_issue_copy(struct block_device *bdev, sector_t dest,
+		int nr_srcs, struct range_entry *rlist,
+		gfp_t gfp_mask, unsigned long flags);
+
 #define BLKDEV_ZERO_NOUNMAP	(1 << 0)  /* do not free blocks */
 #define BLKDEV_ZERO_NOFALLBACK	(1 << 1)  /* don't write explicit zeroes */
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index f44eb0a04afd..5cadb176317a 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -64,6 +64,18 @@ struct fstrim_range {
 	__u64 minlen;
 };
 
+struct range_entry {
+	__u64 src;
+	__u64 len;
+};
+
+struct copy_range {
+	__u64 dest;
+	__u64 nr_range;
+	__u64 range_list;
+	__u64 rsvd;
+};
+
 /* extent-same (dedupe) ioctls; these MUST match the btrfs ioctl definitions */
 #define FILE_DEDUPE_RANGE_SAME		0
 #define FILE_DEDUPE_RANGE_DIFFERS	1
@@ -184,6 +196,7 @@ struct fsxattr {
 #define BLKSECDISCARD _IO(0x12,125)
 #define BLKROTATIONAL _IO(0x12,126)
 #define BLKZEROOUT _IO(0x12,127)
+#define BLKCOPY _IOWR(0x12, 128, struct copy_range)
 /*
  * A jump here: 130-131 are reserved for zoned block devices
  * (see uapi/linux/blkzoned.h)
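As a usage illustration for the uapi above, a minimal sketch of invoking
BLKCOPY from user space (the device path, offsets and lengths are made-up
values and must be logical-block aligned; error handling kept minimal):

/* Sketch: copy two 64 KiB source ranges to a destination offset.
 * All offsets and lengths are byte values, as the ioctl expects. */
#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/ioctl.h>
#include <linux/fs.h>	/* struct copy_range, struct range_entry, BLKCOPY */

int main(void)
{
	struct range_entry ranges[2] = {
		{ .src = 0,       .len = 65536 },	/* hypothetical source 1 */
		{ .src = 1 << 20, .len = 65536 },	/* hypothetical source 2 */
	};
	struct copy_range cr = {
		.dest = 8 << 20,			/* hypothetical destination */
		.nr_range = 2,
		.range_list = (uintptr_t)ranges,	/* user pointer as __u64 */
	};
	int fd = open("/dev/nvme0n1", O_RDWR);		/* device name is an assumption */

	if (fd < 0 || ioctl(fd, BLKCOPY, &cr) < 0)
		perror("BLKCOPY");
	return 0;
}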
From patchwork Fri Dec 11 13:51:39 2020
X-Patchwork-Id: 11971931
From: SelvaKumar S
To: linux-nvme@lists.infradead.org
Date: Fri, 11 Dec 2020 19:21:39 +0530
Message-Id: <20201211135139.49232-3-selvakuma.s1@samsung.com>
In-Reply-To: <20201211135139.49232-1-selvakuma.s1@samsung.com>
Cc: axboe@kernel.dk, damien.lemoal@wdc.com, SelvaKumar S, sagi@grimberg.me, linux-scsi@vger.kernel.org, selvajove@gmail.com, Johannes.Thumshirn@wdc.com, snitzer@redhat.com, linux-kernel@vger.kernel.org, nj.shetty@samsung.com, linux-block@vger.kernel.org, dm-devel@redhat.com, mpatocka@redhat.com, joshi.k@samsung.com, martin.petersen@oracle.com, kbusch@kernel.org, javier.gonz@samsung.com, hch@lst.de, bvanassche@acm.org
Subject: [dm-devel] [RFC PATCH v3 2/2] nvme: add simple copy support

Add support for TP 4065a ("Simple Copy Command"), v2020.05.04
("Ratified").

The implementation uses the payload passed from the block layer to form
the simple copy command. The device copy limits are set as queue limits.

Signed-off-by: SelvaKumar S
Signed-off-by: Kanchan Joshi
Signed-off-by: Nitesh Shetty
Signed-off-by: Javier González
---
 drivers/nvme/host/core.c | 89 ++++++++++++++++++++++++++++++++++++++++
 include/linux/nvme.h     | 43 +++++++++++++++++--
 2 files changed, 129 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 9b6ebeb29cca..d235156ff565 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -647,6 +647,65 @@ static inline void nvme_setup_flush(struct nvme_ns *ns,
 	cmnd->common.nsid = cpu_to_le32(ns->head->ns_id);
 }
 
+static inline blk_status_t nvme_setup_copy(struct nvme_ns *ns,
+	       struct request *req, struct nvme_command *cmnd)
+{
+	struct nvme_ctrl *ctrl = ns->ctrl;
+	struct nvme_copy_range *range = NULL;
+	struct blk_copy_payload *payload;
+	unsigned short nr_range = 0;
+	u16 control = 0, ssrl;
+	u32 dsmgmt = 0;
+	u64 slba;
+	int i;
+
+	payload = bio_data(req->bio);
+	nr_range = payload->copy_range;
+
+	if (req->cmd_flags & REQ_FUA)
+		control |= NVME_RW_FUA;
+
+	if (req->cmd_flags & REQ_FAILFAST_DEV)
+		control |= NVME_RW_LR;
+
+	cmnd->copy.opcode = nvme_cmd_copy;
+	cmnd->copy.nsid = cpu_to_le32(ns->head->ns_id);
+	cmnd->copy.sdlba = cpu_to_le64(blk_rq_pos(req) >> (ns->lba_shift - 9));
+
+	range = kmalloc_array(nr_range, sizeof(*range),
+			GFP_ATOMIC | __GFP_NOWARN);
+	if (!range)
+		return BLK_STS_RESOURCE;
+
+	for (i = 0; i < nr_range; i++) {
+		slba = payload->range[i].src;
+		slba = slba >> (ns->lba_shift - 9);
+
+		ssrl = payload->range[i].len;
+		ssrl = ssrl >> (ns->lba_shift - 9);
+
+		range[i].slba = cpu_to_le64(slba);
+		range[i].nlb = cpu_to_le16(ssrl - 1);
+	}
+
+	cmnd->copy.nr_range = nr_range - 1;
+
+	req->special_vec.bv_page = virt_to_page(range);
+	req->special_vec.bv_offset = offset_in_page(range);
+	req->special_vec.bv_len = sizeof(*range) * nr_range;
+	req->rq_flags |= RQF_SPECIAL_PAYLOAD;
+
+	if (ctrl->nr_streams)
+		nvme_assign_write_stream(ctrl, req, &control, &dsmgmt);
+
+	//TBD end-to-end
+
+	cmnd->rw.control = cpu_to_le16(control);
+	cmnd->rw.dsmgmt = cpu_to_le32(dsmgmt);
+
+	return BLK_STS_OK;
+}
+
 static blk_status_t nvme_setup_discard(struct nvme_ns *ns, struct request *req,
 		struct nvme_command *cmnd)
 {
@@ -829,6 +888,9 @@ blk_status_t nvme_setup_cmd(struct nvme_ns *ns, struct request *req,
 	case REQ_OP_DISCARD:
 		ret = nvme_setup_discard(ns, req, cmd);
 		break;
+	case REQ_OP_COPY:
+		ret = nvme_setup_copy(ns, req, cmd);
+		break;
 	case REQ_OP_READ:
 		ret = nvme_setup_rw(ns, req, cmd, nvme_cmd_read);
 		break;
@@ -1850,6 +1912,31 @@ static void nvme_config_discard(struct gendisk *disk, struct nvme_ns *ns)
 		blk_queue_max_write_zeroes_sectors(queue, UINT_MAX);
 }
 
+static void nvme_config_copy(struct gendisk *disk, struct nvme_ns *ns,
+				       struct nvme_id_ns *id)
+{
+	struct nvme_ctrl *ctrl = ns->ctrl;
+	struct request_queue *queue = disk->queue;
+
+	if (!(ctrl->oncs & NVME_CTRL_ONCS_COPY)) {
+		queue->limits.copy_offload = 0;
+		queue->limits.max_copy_sectors = 0;
+		queue->limits.max_copy_range_sectors = 0;
+		queue->limits.max_copy_nr_ranges = 0;
+		blk_queue_flag_clear(QUEUE_FLAG_COPY, queue);
+		return;
+	}
+
+	/* setting copy limits */
+	blk_queue_flag_test_and_set(QUEUE_FLAG_COPY, queue);
+	queue->limits.copy_offload = 0;
+	queue->limits.max_copy_sectors = le64_to_cpu(id->mcl) *
+		(1 << (ns->lba_shift - 9));
+	queue->limits.max_copy_range_sectors = le32_to_cpu(id->mssrl) *
+		(1 << (ns->lba_shift - 9));
+	queue->limits.max_copy_nr_ranges = id->msrc + 1;
+}
+
 static void nvme_config_write_zeroes(struct gendisk *disk, struct nvme_ns *ns)
 {
 	u64 max_blocks;
@@ -2045,6 +2132,7 @@ static void nvme_update_disk_info(struct gendisk *disk,
 	set_capacity_and_notify(disk, capacity);
 
 	nvme_config_discard(disk, ns);
+	nvme_config_copy(disk, ns, id);
 	nvme_config_write_zeroes(disk, ns);
 
 	if (id->nsattr & NVME_NS_ATTR_RO)
@@ -4616,6 +4704,7 @@ static inline void _nvme_check_size(void)
 	BUILD_BUG_ON(sizeof(struct nvme_download_firmware) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_format_cmd) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_dsm_cmd) != 64);
+	BUILD_BUG_ON(sizeof(struct nvme_copy_command) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_write_zeroes_cmd) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_abort_cmd) != 64);
 	BUILD_BUG_ON(sizeof(struct nvme_get_log_page_command) != 64);
diff --git a/include/linux/nvme.h b/include/linux/nvme.h
index d92535997687..11ed72a2164d 100644
--- a/include/linux/nvme.h
+++ b/include/linux/nvme.h
@@ -289,7 +289,7 @@ struct nvme_id_ctrl {
 	__u8			nvscc;
 	__u8			nwpc;
 	__le16			acwu;
-	__u8			rsvd534[2];
+	__le16			ocfs;
 	__le32			sgls;
 	__le32			mnan;
 	__u8			rsvd544[224];
@@ -314,6 +314,7 @@ enum {
 	NVME_CTRL_ONCS_WRITE_ZEROES		= 1 << 3,
 	NVME_CTRL_ONCS_RESERVATIONS		= 1 << 5,
 	NVME_CTRL_ONCS_TIMESTAMP		= 1 << 6,
+	NVME_CTRL_ONCS_COPY			= 1 << 8,
 	NVME_CTRL_VWC_PRESENT			= 1 << 0,
 	NVME_CTRL_OACS_SEC_SUPP			= 1 << 0,
 	NVME_CTRL_OACS_DIRECTIVES		= 1 << 5,
@@ -362,7 +363,10 @@ struct nvme_id_ns {
 	__le16			npdg;
 	__le16			npda;
 	__le16			nows;
-	__u8			rsvd74[18];
+	__le16			mssrl;
+	__le32			mcl;
+	__u8			msrc;
+	__u8			rsvd91[11];
 	__le32			anagrpid;
 	__u8			rsvd96[3];
 	__u8			nsattr;
@@ -673,6 +677,7 @@ enum nvme_opcode {
 	nvme_cmd_resv_report	= 0x0e,
 	nvme_cmd_resv_acquire	= 0x11,
 	nvme_cmd_resv_release	= 0x15,
+	nvme_cmd_copy		= 0x19,
 	nvme_cmd_zone_mgmt_send	= 0x79,
 	nvme_cmd_zone_mgmt_recv	= 0x7a,
 	nvme_cmd_zone_append	= 0x7d,
@@ -691,7 +696,8 @@ enum nvme_opcode {
 		nvme_opcode_name(nvme_cmd_resv_register),	\
 		nvme_opcode_name(nvme_cmd_resv_report),		\
 		nvme_opcode_name(nvme_cmd_resv_acquire),	\
-		nvme_opcode_name(nvme_cmd_resv_release))
+		nvme_opcode_name(nvme_cmd_resv_release),	\
+		nvme_opcode_name(nvme_cmd_copy))
 
 /*
@@ -863,6 +869,36 @@ struct nvme_dsm_range {
 	__le64			slba;
 };
 
+struct nvme_copy_command {
+	__u8			opcode;
+	__u8			flags;
+	__u16			command_id;
+	__le32			nsid;
+	__u64			rsvd2;
+	__le64			metadata;
+	union nvme_data_ptr	dptr;
+	__le64			sdlba;
+	__u8			nr_range;
+	__u8			rsvd12;
+	__le16			control;
+	__le16			rsvd13;
+	__le16			dspec;
+	__le32			ilbrt;
+	__le16			lbat;
+	__le16			lbatm;
+};
+
+struct nvme_copy_range {
+	__le64			rsvd0;
+	__le64			slba;
+	__le16			nlb;
+	__le16			rsvd18;
+	__le32			rsvd20;
+	__le32			eilbrt;
+	__le16			elbat;
+	__le16			elbatm;
+};
+
 struct nvme_write_zeroes_cmd {
 	__u8			opcode;
 	__u8			flags;
@@ -1400,6 +1436,7 @@ struct nvme_command {
 		struct nvme_download_firmware dlfw;
 		struct nvme_format_cmd format;
 		struct nvme_dsm_cmd dsm;
+		struct nvme_copy_command copy;
 		struct nvme_write_zeroes_cmd write_zeroes;
 		struct nvme_zone_mgmt_send_cmd zms;
 		struct nvme_zone_mgmt_recv_cmd zmr;
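For reference, the sector-to-LBA conversion performed by nvme_setup_copy()
can be checked in isolation. A small standalone sketch (lba_shift and the
sector values below are assumptions, not taken from the patch):

/* Block-layer values are 512-byte sectors; the namespace uses LBAs of
 * (1 << lba_shift) bytes, and the NLB/NR fields are 0's based. */
#include <assert.h>

int main(void)
{
	unsigned int lba_shift = 12;			/* 4 KiB LBAs, assumed */
	unsigned long long src_sect = 2048;		/* 512B sectors, made up */
	unsigned long long len_sect = 256;

	unsigned long long slba = src_sect >> (lba_shift - 9);	/* 2048 / 8 = 256 */
	unsigned int nlb = (len_sect >> (lba_shift - 9)) - 1;	/* 32 LBAs -> NLB 31 */

	assert(slba == 256 && nlb == 31);
	return 0;
}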