From patchwork Fri Apr 3 10:12:41 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472431
From: Johannes Thumshirn
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, Damien Le Moal, Keith Busch,
 linux-scsi@vger.kernel.org, Martin K. Petersen, linux-fsdevel@vger.kernel.org,
 Johannes Thumshirn, Christoph Hellwig
org" , Johannes Thumshirn , Christoph Hellwig Subject: [PATCH v4 01/10] block: provide fallbacks for blk_queue_zone_is_seq and blk_queue_zone_no Date: Fri, 3 Apr 2020 19:12:41 +0900 Message-Id: <20200403101250.33245-2-johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com> References: <20200403101250.33245-1-johannes.thumshirn@wdc.com> MIME-Version: 1.0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org blk_queue_zone_is_seq() and blk_queue_zone_no() have not been called with CONFIG_BLK_DEV_ZONED disabled until now. The introduction of REQ_OP_ZONE_APPEND will change this, so we need to provide noop fallbacks for the !CONFIG_BLK_DEV_ZONED case. Signed-off-by: Johannes Thumshirn Reviewed-by: Christoph Hellwig --- include/linux/blkdev.h | 10 ++++++++++ 1 file changed, 10 insertions(+) diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h index f629d40c645c..25b63f714619 100644 --- a/include/linux/blkdev.h +++ b/include/linux/blkdev.h @@ -729,6 +729,16 @@ static inline unsigned int blk_queue_nr_zones(struct request_queue *q) { return 0; } +static inline bool blk_queue_zone_is_seq(struct request_queue *q, + sector_t sector) +{ + return false; +} +static inline unsigned int blk_queue_zone_no(struct request_queue *q, + sector_t sector) +{ + return 0; +} #endif /* CONFIG_BLK_DEV_ZONED */ static inline bool rq_is_sync(struct request *rq) From patchwork Fri Apr 3 10:12:42 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Johannes Thumshirn X-Patchwork-Id: 11472435 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7D766159A for ; Fri, 3 Apr 2020 10:13:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 4807A208E4 for ; Fri, 3 Apr 2020 10:13:05 +0000 (UTC) Authentication-Results: mail.kernel.org; dkim=fail reason="signature verification failed" (2048-bit key) header.d=wdc.com header.i=@wdc.com header.b="Rjv/I1SX" Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2390442AbgDCKNE (ORCPT ); Fri, 3 Apr 2020 06:13:04 -0400 Received: from esa6.hgst.iphmx.com ([216.71.154.45]:56713 "EHLO esa6.hgst.iphmx.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727774AbgDCKNB (ORCPT ); Fri, 3 Apr 2020 06:13:01 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=simple/simple; d=wdc.com; i=@wdc.com; q=dns/txt; s=dkim.wdc.com; t=1585908782; x=1617444782; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8ktU2tWEaMJUhbWE3rZRtiiRyX7GPxPReS4h+tAAhRs=; b=Rjv/I1SXLJadoSHswWRKcnglDUpgEymL1gb4+Q3Cy6SvbIhdeTJWmj5i +ywBvrEum+9p7SUgspRdMBD4UjypQ+Ndo6WzhFGVlAhNgwzY3nNlgO5A7 /wPSJl9wk9+FUCS8YSExBz7Y2ZVVxAoZMdZY3tduc1qX0dh+Rkw7Mi55p yP0o3KdhGiM1W6OqWpFlUAJQcIJ/Tft5ZAZe842r3Al2UC1vNOtI784pQ y65NtasQ5XFGyC9nIDgbZB2tQLfgXbxQJIJakGQxm+aJ09R9YgEh4RlYH 62hV1Fh/rBsMg6S/NRRJ4L0SRKJ3BvY7lOx0F8K4vHWgKNegic/HNGOJg Q==; IronPort-SDR: sPE1AKDhEdm3UlliMEfKQCQVeQoi/PRvgqU8dyibAuVXmtUE4u5sHQm/D79qMjHRhKT3rpiILu S3dB9SqUIoSTNYft6WpR9IYCo2BX5p7mfEKHyWB9YuVhQDQLF71a/jPo5tYObXXnM6rltOp3P6 k4/eM5kBk8dJDWATqiBVjAb75Esor34GAjPV+ncIX0FY1JcR25xR1tYdTqS5IoygahCTTQs0LK tQ0zpw5kJmAH+G6XKWWrk838fbsZB2HT1h22SeaOLZchJe0iwuPej/MmXtD35PHzMyGsO7eDcx laU= X-IronPort-AV: 
From patchwork Fri Apr 3 10:12:42 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472435
From: Johannes Thumshirn
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, Damien Le Moal, Keith Busch,
 linux-scsi@vger.kernel.org, Martin K. Petersen, linux-fsdevel@vger.kernel.org,
 Johannes Thumshirn
Subject: [PATCH v4 02/10] block: Introduce REQ_OP_ZONE_APPEND
Date: Fri, 3 Apr 2020 19:12:42 +0900
Message-Id: <20200403101250.33245-3-johannes.thumshirn@wdc.com>
In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
References: <20200403101250.33245-1-johannes.thumshirn@wdc.com>

From: Keith Busch

Define REQ_OP_ZONE_APPEND to append-write sectors to a zone of a zoned
block device. This is a no-merge write operation.

A zone append write BIO must:
* Target a zoned block device
* Have a sector position indicating the start sector of the target zone
* Target a sequential write zone
* Not cross a zone boundary
* Not be split, so that a single range of LBAs is written with a single
  command

Implement these checks in generic_make_request_checks() using the helper
function blk_check_zone_append(). To avoid zone append BIO splitting,
introduce the new max_zone_append_sectors queue limit attribute and ensure
that a BIO's size never exceeds this limit. Export the new limit through
sysfs and check it in bio_full().

Also, when an LLDD cannot dispatch a request to a specific zone, it will
return BLK_STS_ZONE_RESOURCE to indicate that the request needs to be
delayed, e.g. because the zone it will be dispatched to is still
write-locked. If this happens, set the request aside in a local list and
continue trying to dispatch requests, such as READ requests or
WRITE/ZONE_APPEND requests targeting other zones. This way we can still
keep a high queue depth without starving other requests, even if one
request cannot be served due to zone write-locking.

Finally, make sure that the bio sector position indicates the actual write
position, as indicated by the device on completion.
Signed-off-by: Keith Busch
Signed-off-by: Johannes Thumshirn
---
 block/bio.c               | 57 ++++++++++++++++++++++++++++++++++++++-
 block/blk-core.c          | 52 ++++++++++++++++++++++++++++++++++++
 block/blk-mq.c            | 27 +++++++++++++++++++
 block/blk-settings.c      | 23 ++++++++++++++++
 block/blk-sysfs.c         | 13 +++++++++
 drivers/scsi/scsi_lib.c   |  1 +
 include/linux/blk_types.h | 14 ++++++++++
 include/linux/blkdev.h    | 11 +++++++++
 8 files changed, 197 insertions(+), 1 deletion(-)

diff --git a/block/bio.c b/block/bio.c
index 94d697217887..e8c9273884a6 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -679,6 +679,48 @@ struct bio *bio_clone_fast(struct bio *bio, gfp_t gfp_mask, struct bio_set *bs)
 }
 EXPORT_SYMBOL(bio_clone_fast);
 
+static bool bio_try_merge_zone_append_page(struct bio *bio, struct page *page,
+					   unsigned int len, unsigned int off)
+{
+	struct request_queue *q = bio->bi_disk->queue;
+	struct bio_vec *bv = &bio->bi_io_vec[bio->bi_vcnt - 1];
+	unsigned long mask = queue_segment_boundary(q);
+	phys_addr_t addr1 = page_to_phys(bv->bv_page) + bv->bv_offset;
+	phys_addr_t addr2 = page_to_phys(page) + off + len - 1;
+	bool same_page = false;
+
+	if ((addr1 | mask) != (addr2 | mask))
+		return false;
+	if (bv->bv_len + len > queue_max_segment_size(q))
+		return false;
+	return __bio_try_merge_page(bio, page, len, off, &same_page);
+}
+
+static int bio_add_append_page(struct bio *bio, struct page *page, unsigned len,
+			       size_t offset)
+{
+	struct request_queue *q = bio->bi_disk->queue;
+	unsigned int max_append_sectors = queue_max_zone_append_sectors(q);
+
+	if (WARN_ON_ONCE(!max_append_sectors))
+		return 0;
+
+	if (((bio->bi_iter.bi_size + len) >> 9) > max_append_sectors)
+		return 0;
+
+	if (bio->bi_vcnt > 0) {
+		if (bio_try_merge_zone_append_page(bio, page, len, offset))
+			return len;
+	}
+
+	if (bio->bi_vcnt >= queue_max_segments(q))
+		return 0;
+
+	__bio_add_page(bio, page, len, offset);
+
+	return len;
+}
+
 static inline bool page_is_mergeable(const struct bio_vec *bv,
 		struct page *page, unsigned int len, unsigned int off,
 		bool *same_page)
@@ -866,6 +908,7 @@ int bio_add_page(struct bio *bio, struct page *page,
 	if (!__bio_try_merge_page(bio, page, len, offset, &same_page)) {
 		if (bio_full(bio, len))
 			return 0;
+
 		__bio_add_page(bio, page, len, offset);
 	}
 	return len;
@@ -927,6 +970,7 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 	ssize_t size, left;
 	unsigned len, i;
 	size_t offset;
+	unsigned op = bio_op(bio);
 
 	/*
	 * Move page array up in the allocated memory for the bio vecs as far as
@@ -944,13 +988,20 @@ static int __bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		struct page *page = pages[i];
 
 		len = min_t(size_t, PAGE_SIZE - offset, left);
+		if (op == REQ_OP_ZONE_APPEND) {
+			int ret;
 
-		if (__bio_try_merge_page(bio, page, len, offset, &same_page)) {
+			ret = bio_add_append_page(bio, page, len, offset);
+			if (ret != len)
+				return -EINVAL;
+		} else if (__bio_try_merge_page(bio, page, len, offset,
+						&same_page)) {
 			if (same_page)
 				put_page(page);
 		} else {
 			if (WARN_ON_ONCE(bio_full(bio, len)))
 				return -EINVAL;
+
 			__bio_add_page(bio, page, len, offset);
 		}
 		offset = 0;
@@ -1895,6 +1946,10 @@ struct bio *bio_split(struct bio *bio, int sectors,
 	BUG_ON(sectors <= 0);
 	BUG_ON(sectors >= bio_sectors(bio));
 
+	/* Zone append commands cannot be split */
+	if (WARN_ON_ONCE(bio_op(bio) == REQ_OP_ZONE_APPEND))
+		return NULL;
+
 	split = bio_clone_fast(bio, gfp, bs);
 	if (!split)
 		return NULL;
diff --git a/block/blk-core.c b/block/blk-core.c
index 60dc9552ef8d..57127092d816 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -135,6 +135,7 @@ static const char *const blk_op_name[] = {
 	REQ_OP_NAME(ZONE_OPEN),
 	REQ_OP_NAME(ZONE_CLOSE),
 	REQ_OP_NAME(ZONE_FINISH),
+	REQ_OP_NAME(ZONE_APPEND),
 	REQ_OP_NAME(WRITE_SAME),
 	REQ_OP_NAME(WRITE_ZEROES),
 	REQ_OP_NAME(SCSI_IN),
@@ -240,6 +241,17 @@ static void req_bio_endio(struct request *rq, struct bio *bio,
 
 	bio_advance(bio, nbytes);
 
+	if (req_op(rq) == REQ_OP_ZONE_APPEND && error == BLK_STS_OK) {
+		/*
+		 * Partial zone append completions cannot be supported as the
+		 * BIO fragments may end up not being written sequentially.
+		 */
+		if (bio->bi_iter.bi_size)
+			bio->bi_status = BLK_STS_IOERR;
+		else
+			bio->bi_iter.bi_sector = rq->__sector;
+	}
+
 	/* don't actually finish bio if it's part of flush sequence */
 	if (bio->bi_iter.bi_size == 0 && !(rq->rq_flags & RQF_FLUSH_SEQ))
 		bio_endio(bio);
@@ -865,6 +877,41 @@ static inline int blk_partition_remap(struct bio *bio)
 	return ret;
 }
 
+/*
+ * Check write append to a zoned block device.
+ */
+static inline blk_status_t blk_check_zone_append(struct request_queue *q,
+						 struct bio *bio)
+{
+	sector_t pos = bio->bi_iter.bi_sector;
+	int nr_sectors = bio_sectors(bio);
+
+	/* Only applicable to zoned block devices */
+	if (!blk_queue_is_zoned(q))
+		return BLK_STS_NOTSUPP;
+
+	/* The bio sector must point to the start of a sequential zone */
+	if (pos & (blk_queue_zone_sectors(q) - 1) ||
+	    !blk_queue_zone_is_seq(q, pos))
+		return BLK_STS_IOERR;
+
+	/*
+	 * Not allowed to cross zone boundaries. Otherwise, the BIO will be
+	 * split and could result in non-contiguous sectors being written in
+	 * different zones.
+	 */
+	if (blk_queue_zone_no(q, pos) != blk_queue_zone_no(q, pos + nr_sectors))
+		return BLK_STS_IOERR;
+
+	/* Make sure the BIO is small enough and will not get split */
+	if (nr_sectors > q->limits.max_zone_append_sectors)
+		return BLK_STS_IOERR;
+
+	bio->bi_opf |= REQ_NOMERGE;
+
+	return BLK_STS_OK;
+}
+
 static noinline_for_stack bool
 generic_make_request_checks(struct bio *bio)
 {
@@ -937,6 +984,11 @@ generic_make_request_checks(struct bio *bio)
 		if (!q->limits.max_write_same_sectors)
 			goto not_supported;
 		break;
+	case REQ_OP_ZONE_APPEND:
+		status = blk_check_zone_append(q, bio);
+		if (status != BLK_STS_OK)
+			goto end_io;
+		break;
 	case REQ_OP_ZONE_RESET:
 	case REQ_OP_ZONE_OPEN:
 	case REQ_OP_ZONE_CLOSE:
diff --git a/block/blk-mq.c b/block/blk-mq.c
index d92088dec6c3..ce60a071660f 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1178,6 +1178,19 @@ static void blk_mq_update_dispatch_busy(struct blk_mq_hw_ctx *hctx, bool busy)
 
 #define BLK_MQ_RESOURCE_DELAY	3		/* ms units */
 
+static void blk_mq_handle_zone_resource(struct request *rq,
+					struct list_head *zone_list)
+{
+	/*
+	 * If we end up here it is because we cannot dispatch a request to a
+	 * specific zone due to LLD level zone-write locking or other zone
+	 * related resource not being available. In this case, set the request
+	 * aside in zone_list for retrying it later.
+	 */
+	list_add(&rq->queuelist, zone_list);
+	__blk_mq_requeue_request(rq);
+}
+
 /*
  * Returns true if we did some work AND can potentially do more.
 */
@@ -1189,6 +1202,7 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 	bool no_tag = false;
 	int errors, queued;
 	blk_status_t ret = BLK_STS_OK;
+	LIST_HEAD(zone_list);
 
 	if (list_empty(list))
 		return false;
@@ -1257,6 +1271,16 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 			list_add(&rq->queuelist, list);
 			__blk_mq_requeue_request(rq);
 			break;
+		} else if (ret == BLK_STS_ZONE_RESOURCE) {
+			/*
+			 * Move the request to zone_list and keep going through
+			 * the dispatch list to find more requests the drive can
+			 * accept.
+			 */
+			blk_mq_handle_zone_resource(rq, &zone_list);
+			if (list_empty(list))
+				break;
+			continue;
 		}
 
 		if (unlikely(ret != BLK_STS_OK)) {
@@ -1268,6 +1292,9 @@ bool blk_mq_dispatch_rq_list(struct request_queue *q, struct list_head *list,
 		queued++;
 	} while (!list_empty(list));
 
+	if (!list_empty(&zone_list))
+		list_splice_tail_init(&zone_list, list);
+
 	hctx->dispatched[queued_to_index(queued)]++;
 
 	/*
diff --git a/block/blk-settings.c b/block/blk-settings.c
index c8eda2e7b91e..5388965841df 100644
--- a/block/blk-settings.c
+++ b/block/blk-settings.c
@@ -48,6 +48,7 @@ void blk_set_default_limits(struct queue_limits *lim)
 	lim->chunk_sectors = 0;
 	lim->max_write_same_sectors = 0;
 	lim->max_write_zeroes_sectors = 0;
+	lim->max_zone_append_sectors = 0;
 	lim->max_discard_sectors = 0;
 	lim->max_hw_discard_sectors = 0;
 	lim->discard_granularity = 0;
@@ -83,6 +84,7 @@ void blk_set_stacking_limits(struct queue_limits *lim)
 	lim->max_dev_sectors = UINT_MAX;
 	lim->max_write_same_sectors = UINT_MAX;
 	lim->max_write_zeroes_sectors = UINT_MAX;
+	lim->max_zone_append_sectors = UINT_MAX;
 }
 EXPORT_SYMBOL(blk_set_stacking_limits);
 
@@ -257,6 +259,25 @@ void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
 }
 EXPORT_SYMBOL(blk_queue_max_write_zeroes_sectors);
 
+/**
+ * blk_queue_max_zone_append_sectors - set max sectors for a single zone append
+ * @q:  the request queue for the device
+ * @max_zone_append_sectors: maximum number of sectors to write per command
+ **/
+void blk_queue_max_zone_append_sectors(struct request_queue *q,
+		unsigned int max_zone_append_sectors)
+{
+	unsigned int max_sectors;
+
+	max_sectors = min(q->limits.max_hw_sectors, max_zone_append_sectors);
+	if (max_sectors)
+		max_sectors = min_not_zero(q->limits.chunk_sectors,
+					   max_sectors);
+
+	q->limits.max_zone_append_sectors = max_sectors;
+}
+EXPORT_SYMBOL_GPL(blk_queue_max_zone_append_sectors);
+
 /**
  * blk_queue_max_segments - set max hw segments for a request for this queue
  * @q:  the request queue for the device
@@ -506,6 +527,8 @@ int blk_stack_limits(struct queue_limits *t, struct queue_limits *b,
 					b->max_write_same_sectors);
 	t->max_write_zeroes_sectors = min(t->max_write_zeroes_sectors,
 					b->max_write_zeroes_sectors);
+	t->max_zone_append_sectors = min(t->max_zone_append_sectors,
+					b->max_zone_append_sectors);
 	t->bounce_pfn = min_not_zero(t->bounce_pfn, b->bounce_pfn);
 
 	t->seg_boundary_mask = min_not_zero(t->seg_boundary_mask,
diff --git a/block/blk-sysfs.c b/block/blk-sysfs.c
index fca9b158f4a0..02643e149d5e 100644
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -218,6 +218,13 @@ static ssize_t queue_write_zeroes_max_show(struct request_queue *q, char *page)
 		(unsigned long long)q->limits.max_write_zeroes_sectors << 9);
 }
 
+static ssize_t queue_zone_append_max_show(struct request_queue *q, char *page)
+{
+	unsigned long long max_sectors = q->limits.max_zone_append_sectors;
+
+	return sprintf(page, "%llu\n", max_sectors << SECTOR_SHIFT);
+}
+
 static ssize_t
 queue_max_sectors_store(struct request_queue *q, const char *page, size_t count)
 {
@@ -639,6 +646,11 @@ static struct queue_sysfs_entry queue_write_zeroes_max_entry = {
 	.show = queue_write_zeroes_max_show,
 };
 
+static struct queue_sysfs_entry queue_zone_append_max_entry = {
+	.attr = {.name = "zone_append_max_bytes", .mode = 0444 },
+	.show = queue_zone_append_max_show,
+};
+
 static struct queue_sysfs_entry queue_nonrot_entry = {
 	.attr = {.name = "rotational", .mode = 0644 },
 	.show = queue_show_nonrot,
@@ -749,6 +761,7 @@ static struct attribute *queue_attrs[] = {
 	&queue_discard_zeroes_data_entry.attr,
 	&queue_write_same_max_entry.attr,
 	&queue_write_zeroes_max_entry.attr,
+	&queue_zone_append_max_entry.attr,
 	&queue_nonrot_entry.attr,
 	&queue_zoned_entry.attr,
 	&queue_nr_zones_entry.attr,
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 610ee41fa54c..ea327f320b7f 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1706,6 +1706,7 @@ static blk_status_t scsi_queue_rq(struct blk_mq_hw_ctx *hctx,
 	case BLK_STS_OK:
 		break;
 	case BLK_STS_RESOURCE:
+	case BLK_STS_ZONE_RESOURCE:
 		if (atomic_read(&sdev->device_busy) ||
 		    scsi_device_blocked(sdev))
 			ret = BLK_STS_DEV_RESOURCE;
diff --git a/include/linux/blk_types.h b/include/linux/blk_types.h
index 70254ae11769..824ec2d89954 100644
--- a/include/linux/blk_types.h
+++ b/include/linux/blk_types.h
@@ -63,6 +63,18 @@ typedef u8 __bitwise blk_status_t;
  */
 #define BLK_STS_DEV_RESOURCE	((__force blk_status_t)13)
 
+/*
+ * BLK_STS_ZONE_RESOURCE is returned from the driver to the block layer if zone
+ * related resources are unavailable, but the driver can guarantee the queue
+ * will be rerun in the future once the resources become available again.
+ *
+ * This is different from BLK_STS_DEV_RESOURCE in that it explicitly references
+ * a zone specific resource and IO to a different zone on the same device could
+ * still be served. Examples of that are zones that are write-locked, but a read
+ * to the same zone could be served.
+ */
+#define BLK_STS_ZONE_RESOURCE	((__force blk_status_t)14)
+
 /**
  * blk_path_error - returns true if error may be path related
  * @error: status the request was completed with
@@ -296,6 +308,8 @@ enum req_opf {
 	REQ_OP_ZONE_CLOSE	= 11,
 	/* Transition a zone to full */
 	REQ_OP_ZONE_FINISH	= 12,
+	/* write data at the current zone write pointer */
+	REQ_OP_ZONE_APPEND	= 13,
 
 	/* SCSI passthrough using struct scsi_request */
 	REQ_OP_SCSI_IN		= 32,
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 25b63f714619..36111b10d514 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -336,6 +336,7 @@ struct queue_limits {
 	unsigned int		max_hw_discard_sectors;
 	unsigned int		max_write_same_sectors;
 	unsigned int		max_write_zeroes_sectors;
+	unsigned int		max_zone_append_sectors;
 	unsigned int		discard_granularity;
 	unsigned int		discard_alignment;
 
@@ -757,6 +758,9 @@ static inline bool rq_mergeable(struct request *rq)
 	if (req_op(rq) == REQ_OP_WRITE_ZEROES)
 		return false;
 
+	if (req_op(rq) == REQ_OP_ZONE_APPEND)
+		return false;
+
 	if (rq->cmd_flags & REQ_NOMERGE_FLAGS)
 		return false;
 	if (rq->rq_flags & RQF_NOMERGE_FLAGS)
@@ -1088,6 +1092,8 @@ extern void blk_queue_max_write_same_sectors(struct request_queue *q,
 extern void blk_queue_max_write_zeroes_sectors(struct request_queue *q,
 		unsigned int max_write_same_sectors);
 extern void blk_queue_logical_block_size(struct request_queue *, unsigned int);
+extern void blk_queue_max_zone_append_sectors(struct request_queue *q,
+		unsigned int max_zone_append_sectors);
 extern void blk_queue_physical_block_size(struct request_queue *, unsigned int);
 extern void blk_queue_alignment_offset(struct request_queue *q,
 				       unsigned int alignment);
@@ -1301,6 +1307,11 @@ static inline unsigned int queue_max_segment_size(const struct request_queue *q)
 	return q->limits.max_segment_size;
 }
 
+static inline unsigned int queue_max_zone_append_sectors(const struct request_queue *q)
+{
+	return q->limits.max_zone_append_sectors;
+}
+
 static inline unsigned queue_logical_block_size(const struct request_queue *q)
 {
 	int retval = 512;
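To see how the pieces above fit together, here is a minimal sketch of how a
caller could issue a zone append and retrieve the actual write position.
This is illustrative only and not part of the series; it assumes an open
zoned block device and a single data page, and it omits error handling:

#include <linux/bio.h>
#include <linux/blkdev.h>

static void example_zone_append(struct block_device *bdev, sector_t zone_start,
				struct page *page, unsigned int len)
{
	struct bio *bio = bio_alloc(GFP_KERNEL, 1);

	bio_set_dev(bio, bdev);
	/* the sector must be the start of the target sequential zone */
	bio->bi_iter.bi_sector = zone_start;
	bio->bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC;
	__bio_add_page(bio, page, len, 0);

	submit_bio_wait(bio);

	/*
	 * req_bio_endio() above rewrites bi_sector on completion, so this
	 * now holds the position the device actually wrote the data to.
	 */
	pr_info("appended at sector %llu\n",
		(unsigned long long)bio->bi_iter.bi_sector);
	bio_put(bio);
}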
From patchwork Fri Apr 3 10:12:43 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472447
From: Johannes Thumshirn
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, Damien Le Moal, Keith Busch,
 linux-scsi@vger.kernel.org, Martin K. Petersen, linux-fsdevel@vger.kernel.org,
 Johannes Thumshirn, Christoph Hellwig
Subject: [PATCH v4 03/10] block: introduce blk_req_zone_write_trylock
Date: Fri, 3 Apr 2020 19:12:43 +0900
Message-Id: <20200403101250.33245-4-johannes.thumshirn@wdc.com>
In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
References: <20200403101250.33245-1-johannes.thumshirn@wdc.com>

Introduce blk_req_zone_write_trylock(), which either grabs the write lock
for a sequential zone or returns false if the zone is already locked.
Signed-off-by: Johannes Thumshirn
Reviewed-by: Christoph Hellwig
---
 block/blk-zoned.c      | 14 ++++++++++++++
 include/linux/blkdev.h |  1 +
 2 files changed, 15 insertions(+)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 05741c6f618b..00b025b8b7c0 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -50,6 +50,20 @@ bool blk_req_needs_zone_write_lock(struct request *rq)
 }
 EXPORT_SYMBOL_GPL(blk_req_needs_zone_write_lock);
 
+bool blk_req_zone_write_trylock(struct request *rq)
+{
+	unsigned int zno = blk_rq_zone_no(rq);
+
+	if (test_and_set_bit(zno, rq->q->seq_zones_wlock))
+		return false;
+
+	WARN_ON_ONCE(rq->rq_flags & RQF_ZONE_WRITE_LOCKED);
+	rq->rq_flags |= RQF_ZONE_WRITE_LOCKED;
+
+	return true;
+}
+EXPORT_SYMBOL_GPL(blk_req_zone_write_trylock);
+
 void __blk_req_zone_write_lock(struct request *rq)
 {
 	if (WARN_ON_ONCE(test_and_set_bit(blk_rq_zone_no(rq),
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 36111b10d514..e591b22ace03 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1746,6 +1746,7 @@ extern int bdev_write_page(struct block_device *, sector_t, struct page *,
 
 #ifdef CONFIG_BLK_DEV_ZONED
 bool blk_req_needs_zone_write_lock(struct request *rq);
+bool blk_req_zone_write_trylock(struct request *rq);
 void __blk_req_zone_write_lock(struct request *rq);
 void __blk_req_zone_write_unlock(struct request *rq);
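A driver would typically use the new helper in its command preparation
path, combined with the BLK_STS_ZONE_RESOURCE status from the previous
patch when the zone is contended. A sketch of that pattern (the function
is hypothetical; the real user is sd_zbc_prepare_zone_append() later in
this series):

static blk_status_t example_prepare_zone_append(struct request *rq)
{
	/*
	 * Another write currently owns the zone write lock; ask the
	 * block layer to retry this request later instead of failing.
	 */
	if (!blk_req_zone_write_trylock(rq))
		return BLK_STS_ZONE_RESOURCE;

	/* ... patch the command LBA to the cached write pointer ... */
	return BLK_STS_OK;
}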
From patchwork Fri Apr 3 10:12:44 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472443
From: Johannes Thumshirn
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, Damien Le Moal, Keith Busch,
 linux-scsi@vger.kernel.org, Martin K. Petersen, linux-fsdevel@vger.kernel.org,
 Damien Le Moal
Subject: [PATCH v4 04/10] block: Modify revalidate zones
Date: Fri, 3 Apr 2020 19:12:44 +0900
Message-Id: <20200403101250.33245-5-johannes.thumshirn@wdc.com>
In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
References: <20200403101250.33245-1-johannes.thumshirn@wdc.com>

From: Damien Le Moal

Modify the interface of blk_revalidate_disk_zones() to add an optional
revalidation callback function that a driver can use to extend the checks
and processing done during zone revalidation. The callback, if defined, is
executed once for each zone inspected and a final time after all zones
have been inspected. blk_revalidate_disk_zones() is renamed
__blk_revalidate_disk_zones(), and blk_revalidate_disk_zones() is
implemented as an inline function calling __blk_revalidate_disk_zones()
with no revalidation callback specified, resulting in unchanged behavior
for all existing callers of blk_revalidate_disk_zones().
Signed-off-by: Damien Le Moal
---
 block/blk-zoned.c      | 38 +++++++++++++++++++++++++++-----------
 include/linux/blkdev.h | 11 ++++++++++-
 2 files changed, 37 insertions(+), 12 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 00b025b8b7c0..a5fed0fa1504 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -353,12 +353,14 @@ void blk_queue_free_zone_bitmaps(struct request_queue *q)
 }
 
 struct blk_revalidate_zone_args {
-	struct gendisk	*disk;
-	unsigned long	*conv_zones_bitmap;
-	unsigned long	*seq_zones_wlock;
-	unsigned int	nr_zones;
-	sector_t	zone_sectors;
-	sector_t	sector;
+	struct gendisk		*disk;
+	revalidate_zones_cb	revalidate_cb;
+	void			*revalidate_data;
+	unsigned long		*conv_zones_bitmap;
+	unsigned long		*seq_zones_wlock;
+	unsigned int		nr_zones;
+	sector_t		zone_sectors;
+	sector_t		sector;
 };
 
 /*
@@ -432,25 +434,37 @@ static int blk_revalidate_zone_cb(struct blk_zone *zone, unsigned int idx,
 		return -ENODEV;
 	}
 
+	if (args->revalidate_cb)
+		args->revalidate_cb(zone, idx, args->revalidate_data);
+
 	args->sector += zone->len;
 	return 0;
 }
 
 /**
- * blk_revalidate_disk_zones - (re)allocate and initialize zone bitmaps
- * @disk:	Target disk
+ * __blk_revalidate_disk_zones - (re)allocate and initialize zone bitmaps
+ * @disk:		Target disk
+ * @revalidate_cb:	LLD callback
+ * @revalidate_data:	LLD callback argument
  *
  * Helper function for low-level device drivers to (re) allocate and initialize
  * a disk request queue zone bitmaps. This function should normally be called
  * within the disk ->revalidate method for blk-mq based drivers. For BIO based
  * drivers only q->nr_zones needs to be updated so that the sysfs exposed value
  * is correct.
+ * If the driver @revalidate_cb callback function is not NULL, the callback is
+ * executed for each zone inspected as well as a final time to apply changes
+ * with the device request queue frozen.
 */
-int blk_revalidate_disk_zones(struct gendisk *disk)
+int __blk_revalidate_disk_zones(struct gendisk *disk,
+				revalidate_zones_cb revalidate_cb,
+				void *revalidate_data)
 {
 	struct request_queue *q = disk->queue;
 	struct blk_revalidate_zone_args args = {
-		.disk		= disk,
+		.disk		= disk,
+		.revalidate_cb	= revalidate_cb,
+		.revalidate_data = revalidate_data,
 	};
 	unsigned int noio_flag;
 	int ret;
@@ -480,6 +494,8 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 		q->nr_zones = args.nr_zones;
 		swap(q->seq_zones_wlock, args.seq_zones_wlock);
 		swap(q->conv_zones_bitmap, args.conv_zones_bitmap);
+		if (revalidate_cb)
+			revalidate_cb(NULL, 0, revalidate_data);
 		ret = 0;
 	} else {
 		pr_warn("%s: failed to revalidate zones\n", disk->disk_name);
@@ -491,4 +507,4 @@ int blk_revalidate_disk_zones(struct gendisk *disk)
 	kfree(args.conv_zones_bitmap);
 	return ret;
 }
-EXPORT_SYMBOL_GPL(blk_revalidate_disk_zones);
+EXPORT_SYMBOL_GPL(__blk_revalidate_disk_zones);
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e591b22ace03..49f41562b3f9 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -353,6 +353,9 @@ struct queue_limits {
 typedef int (*report_zones_cb)(struct blk_zone *zone, unsigned int idx,
 			       void *data);
 
+typedef void (*revalidate_zones_cb)(struct blk_zone *zone, unsigned int idx,
+				    void *data);
+
 #ifdef CONFIG_BLK_DEV_ZONED
 
 #define BLK_ALL_ZONES  ((unsigned int)-1)
@@ -362,7 +365,13 @@ unsigned int blkdev_nr_zones(struct gendisk *disk);
 extern int blkdev_zone_mgmt(struct block_device *bdev, enum req_opf op,
 			    sector_t sectors, sector_t nr_sectors,
 			    gfp_t gfp_mask);
-extern int blk_revalidate_disk_zones(struct gendisk *disk);
+int __blk_revalidate_disk_zones(struct gendisk *disk,
+				revalidate_zones_cb revalidate_cb,
+				void *revalidate_data);
+static inline int blk_revalidate_disk_zones(struct gendisk *disk)
+{
+	return __blk_revalidate_disk_zones(disk, NULL, NULL);
+}
 
 extern int blkdev_report_zones_ioctl(struct block_device *bdev, fmode_t mode,
 				     unsigned int cmd, unsigned long arg);
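A low-level driver opting into the extended interface supplies a callback
of the new revalidate_zones_cb type. The following is an illustrative
sketch only (all names are hypothetical; patch 07 adds the real sd
callback). The callback runs once per inspected zone with a valid zone
pointer, then a final time with zone == NULL while the request queue is
frozen:

struct example_zoned_disk {
	u32 *zones_wp_ofst;	/* cached write pointer offsets */
};

static void example_revalidate_cb(struct blk_zone *zone, unsigned int idx,
				  void *data)
{
	struct example_zoned_disk *d = data;

	if (zone)	/* per-zone pass */
		d->zones_wp_ofst[idx] = zone->wp - zone->start;
	/* final pass (zone == NULL): queue frozen, bitmaps already swapped */
}

The driver's revalidate path would then call:

	__blk_revalidate_disk_zones(disk, example_revalidate_cb, d);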
From patchwork Fri Apr 3 10:12:45 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472457
From: Johannes Thumshirn
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, Damien Le Moal, Keith Busch,
 linux-scsi@vger.kernel.org, Martin K. Petersen, linux-fsdevel@vger.kernel.org,
 Johannes Thumshirn, Christoph Hellwig
Subject: [PATCH v4 05/10] scsi: sd_zbc: factor out sanity checks for zoned commands
Date: Fri, 3 Apr 2020 19:12:45 +0900
Message-Id: <20200403101250.33245-6-johannes.thumshirn@wdc.com>
In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
References: <20200403101250.33245-1-johannes.thumshirn@wdc.com>

Factor out the sanity checks for zoned commands from
sd_zbc_setup_zone_mgmt_cmnd(). This will help with the introduction of an
emulated ZONE_APPEND command.
Signed-off-by: Johannes Thumshirn
Reviewed-by: Christoph Hellwig
---
 drivers/scsi/sd_zbc.c | 36 +++++++++++++++++++++++++-----------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c
index f45c22b09726..ee156fbf3780 100644
--- a/drivers/scsi/sd_zbc.c
+++ b/drivers/scsi/sd_zbc.c
@@ -209,6 +209,26 @@ int sd_zbc_report_zones(struct gendisk *disk, sector_t sector,
 	return ret;
 }
 
+static blk_status_t sd_zbc_cmnd_checks(struct scsi_cmnd *cmd)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	sector_t sector = blk_rq_pos(rq);
+
+	if (!sd_is_zoned(sdkp))
+		/* Not a zoned device */
+		return BLK_STS_IOERR;
+
+	if (sdkp->device->changed)
+		return BLK_STS_IOERR;
+
+	if (sector & (sd_zbc_zone_sectors(sdkp) - 1))
+		/* Unaligned request */
+		return BLK_STS_IOERR;
+
+	return BLK_STS_OK;
+}
+
 /**
  * sd_zbc_setup_zone_mgmt_cmnd - Prepare a zone ZBC_OUT command. The operations
  *			can be RESET WRITE POINTER, OPEN, CLOSE or FINISH.
@@ -223,20 +243,14 @@ blk_status_t sd_zbc_setup_zone_mgmt_cmnd(struct scsi_cmnd *cmd,
 					 unsigned char op, bool all)
 {
 	struct request *rq = cmd->request;
-	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 	sector_t sector = blk_rq_pos(rq);
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
 	sector_t block = sectors_to_logical(sdkp->device, sector);
+	blk_status_t ret;
 
-	if (!sd_is_zoned(sdkp))
-		/* Not a zoned device */
-		return BLK_STS_IOERR;
-
-	if (sdkp->device->changed)
-		return BLK_STS_IOERR;
-
-	if (sector & (sd_zbc_zone_sectors(sdkp) - 1))
-		/* Unaligned request */
-		return BLK_STS_IOERR;
+	ret = sd_zbc_cmnd_checks(cmd);
+	if (ret != BLK_STS_OK)
+		return ret;
 
 	cmd->cmd_len = 16;
 	memset(cmd->cmnd, 0, cmd->cmd_len);
From patchwork Fri Apr 3 10:12:46 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472453
From: Johannes Thumshirn
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, Damien Le Moal, Keith Busch,
 linux-scsi@vger.kernel.org, Martin K. Petersen, linux-fsdevel@vger.kernel.org,
 Johannes Thumshirn
Subject: [PATCH v4 06/10] scsi: export scsi_mq_uninit_cmd
Date: Fri, 3 Apr 2020 19:12:46 +0900
Message-Id: <20200403101250.33245-7-johannes.thumshirn@wdc.com>
In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
References: <20200403101250.33245-1-johannes.thumshirn@wdc.com>

scsi_mq_uninit_cmd() is used to free the sg_tables, uninitialize the
command and delete it from the command list. Export this function so it
can be used from modular code to free the memory allocated by
scsi_init_io() if the caller of scsi_init_io() needs to do error recovery.
Signed-off-by: Johannes Thumshirn
---
 drivers/scsi/scsi_lib.c  | 9 ++++++---
 include/scsi/scsi_cmnd.h | 1 +
 2 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index ea327f320b7f..4646575a89d6 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -57,8 +57,6 @@ static struct kmem_cache *scsi_sense_cache;
 static struct kmem_cache *scsi_sense_isadma_cache;
 static DEFINE_MUTEX(scsi_sense_cache_mutex);
 
-static void scsi_mq_uninit_cmd(struct scsi_cmnd *cmd);
-
 static inline struct kmem_cache *
 scsi_select_sense_cache(bool unchecked_isa_dma)
 {
@@ -558,12 +556,17 @@ static void scsi_mq_free_sgtables(struct scsi_cmnd *cmd)
 			SCSI_INLINE_PROT_SG_CNT);
 }
 
-static void scsi_mq_uninit_cmd(struct scsi_cmnd *cmd)
+/**
+ * scsi_mq_uninit_cmd - uninitialize a SCSI command
+ * @cmd: the command to free
+ */
+void scsi_mq_uninit_cmd(struct scsi_cmnd *cmd)
 {
 	scsi_mq_free_sgtables(cmd);
 	scsi_uninit_cmd(cmd);
 	scsi_del_cmd_from_list(cmd);
 }
+EXPORT_SYMBOL_GPL(scsi_mq_uninit_cmd);
 
 /* Returns false when no more bytes to process, true if there are more */
 static bool scsi_end_request(struct request *req, blk_status_t error,
diff --git a/include/scsi/scsi_cmnd.h b/include/scsi/scsi_cmnd.h
index a2849bb9cd19..65ff625db38b 100644
--- a/include/scsi/scsi_cmnd.h
+++ b/include/scsi/scsi_cmnd.h
@@ -167,6 +167,7 @@ extern void *scsi_kmap_atomic_sg(struct scatterlist *sg, int sg_count,
 extern void scsi_kunmap_atomic_sg(void *virt);
 
 extern blk_status_t scsi_init_io(struct scsi_cmnd *cmd);
+extern void scsi_mq_uninit_cmd(struct scsi_cmnd *cmd);
 
 #ifdef CONFIG_SCSI_DMA
 extern int scsi_dma_map(struct scsi_cmnd *cmd);
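The intended use is an error-unwind pattern like the following sketch
(illustrative only; example_prepare_hw() is a hypothetical stand-in for
driver-specific setup such as the zone append preparation added in the
next patch):

static blk_status_t example_setup_cmnd(struct scsi_cmnd *cmd)
{
	blk_status_t ret;

	ret = scsi_init_io(cmd);
	if (ret != BLK_STS_OK)
		return ret;

	ret = example_prepare_hw(cmd);		/* hypothetical */
	if (ret != BLK_STS_OK) {
		/* undo scsi_init_io(): free sg_tables, uninit the command */
		scsi_mq_uninit_cmd(cmd);
		return ret;
	}

	return BLK_STS_OK;
}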
From patchwork Fri Apr 3 10:12:47 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472465
From: Johannes Thumshirn
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, Damien Le Moal, Keith Busch,
 linux-scsi@vger.kernel.org, Martin K. Petersen, linux-fsdevel@vger.kernel.org,
 Johannes Thumshirn
Subject: [PATCH v4 07/10] scsi: sd_zbc: emulate ZONE_APPEND commands
Date: Fri, 3 Apr 2020 19:12:47 +0900
Message-Id: <20200403101250.33245-8-johannes.thumshirn@wdc.com>
In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
References: <20200403101250.33245-1-johannes.thumshirn@wdc.com>

Emulate ZONE_APPEND for SCSI disks using a regular WRITE(16) command with
a start LBA set to the target zone write pointer position.

In order to always know the write pointer position of a sequential write
zone, the queue flag QUEUE_FLAG_ZONE_WP_OFST is set to get an initialized
write pointer offset array attached to the device request queue. The
values of the cache are maintained in sync with the device as follows:

1) the write pointer offset of a zone is reset to 0 when a
   REQ_OP_ZONE_RESET command completes.
2) the write pointer offset of a zone is set to the zone size when a
   REQ_OP_ZONE_FINISH command completes.
3) the write pointer offset of a zone is incremented by the number of
   512B sectors written when a write or a zone append command completes.
4) the write pointer offset of all zones is reset to 0 when a
   REQ_OP_ZONE_RESET_ALL command completes.

Since the block layer does not write-lock zones for zone append commands,
to ensure a sequential ordering of the write commands used for the
emulation, the target zone of a zone append command is locked when the
function sd_zbc_prepare_zone_append() is called from
sd_setup_read_write_cmnd(). If the zone write lock cannot be obtained
(e.g. a zone append is in flight or a regular write has already locked the
zone), the zone append command dispatching is delayed by returning
BLK_STS_ZONE_RESOURCE.

Since zone reset and finish operations can be issued concurrently with
writes and zone append requests, ensure a coherent update of the zone
write pointer offsets by also write-locking the target zones for these
zone management requests.

Finally, to avoid the need to write-lock all zones for
REQ_OP_ZONE_RESET_ALL requests, use a spinlock to protect accesses and
modifications of the zone write pointer offsets. This spinlock is
initialized from sd_probe() using the new function sd_zbc_init_disk().

Signed-off-by: Johannes Thumshirn
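Rules 1) to 4) above map onto completion handling roughly as in the sketch
below. This illustrates only the bookkeeping and is not the patch code:
the field names follow the sd.h additions in this patch, but unit
conversions, error handling and the zone append LBA reporting are elided:

static void example_update_wp_cache(struct scsi_disk *sdkp, struct request *rq,
				    unsigned int good_sectors)
{
	unsigned int zno = blk_rq_zone_no(rq);

	spin_lock_bh(&sdkp->zones_wp_ofst_lock);
	switch (req_op(rq)) {
	case REQ_OP_WRITE:
	case REQ_OP_ZONE_APPEND:
		sdkp->zones_wp_ofst[zno] += good_sectors;	/* rule 3 */
		break;
	case REQ_OP_ZONE_RESET:
		sdkp->zones_wp_ofst[zno] = 0;			/* rule 1 */
		break;
	case REQ_OP_ZONE_FINISH:	/* zone size; units elided here */
		sdkp->zones_wp_ofst[zno] = sdkp->zone_blocks;	/* rule 2 */
		break;
	case REQ_OP_ZONE_RESET_ALL:
		memset(sdkp->zones_wp_ofst, 0,			/* rule 4 */
		       sdkp->nr_zones * sizeof(*sdkp->zones_wp_ofst));
		break;
	default:
		break;
	}
	spin_unlock_bh(&sdkp->zones_wp_ofst_lock);
}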
---
 drivers/scsi/sd.c     |  26 ++-
 drivers/scsi/sd.h     |  38 ++++-
 drivers/scsi/sd_zbc.c | 367 +++++++++++++++++++++++++++++++++++++++++-
 3 files changed, 413 insertions(+), 18 deletions(-)

diff --git a/drivers/scsi/sd.c b/drivers/scsi/sd.c
index 2710a0e5ae6d..569b22ab394e 100644
--- a/drivers/scsi/sd.c
+++ b/drivers/scsi/sd.c
@@ -1206,6 +1206,14 @@ static blk_status_t sd_setup_read_write_cmnd(struct scsi_cmnd *cmd)
 		}
 	}
 
+	if (req_op(rq) == REQ_OP_ZONE_APPEND) {
+		ret = sd_zbc_prepare_zone_append(cmd, &lba, nr_blocks);
+		if (ret) {
+			scsi_mq_uninit_cmd(cmd);
+			return ret;
+		}
+	}
+
 	fua = rq->cmd_flags & REQ_FUA ? 0x8 : 0;
 	dix = scsi_prot_sg_count(cmd);
 	dif = scsi_host_dif_capable(cmd->device->host, sdkp->protection_type);
@@ -1287,6 +1295,7 @@ static blk_status_t sd_init_command(struct scsi_cmnd *cmd)
 		return sd_setup_flush_cmnd(cmd);
 	case REQ_OP_READ:
 	case REQ_OP_WRITE:
+	case REQ_OP_ZONE_APPEND:
 		return sd_setup_read_write_cmnd(cmd);
 	case REQ_OP_ZONE_RESET:
 		return sd_zbc_setup_zone_mgmt_cmnd(cmd, ZO_RESET_WRITE_POINTER,
@@ -2055,7 +2064,7 @@ static int sd_done(struct scsi_cmnd *SCpnt)
 
  out:
 	if (sd_is_zoned(sdkp))
-		sd_zbc_complete(SCpnt, good_bytes, &sshdr);
+		good_bytes = sd_zbc_complete(SCpnt, good_bytes, &sshdr);
 
 	SCSI_LOG_HLCOMPLETE(1, scmd_printk(KERN_INFO, SCpnt,
 					   "sd_done: completed %d of %d bytes\n",
@@ -3371,6 +3380,10 @@ static int sd_probe(struct device *dev)
 	sdkp->first_scan = 1;
 	sdkp->max_medium_access_timeouts = SD_MAX_MEDIUM_TIMEOUTS;
 
+	error = sd_zbc_init_disk(sdkp);
+	if (error)
+		goto out_free_index;
+
 	sd_revalidate_disk(gd);
 
 	gd->flags = GENHD_FL_EXT_DEVT;
@@ -3408,6 +3421,7 @@ static int sd_probe(struct device *dev)
  out_put:
 	put_disk(gd);
  out_free:
+	sd_zbc_release_disk(sdkp);
 	kfree(sdkp);
  out:
 	scsi_autopm_put_device(sdp);
@@ -3484,6 +3498,8 @@ static void scsi_disk_release(struct device *dev)
 	put_disk(disk);
 	put_device(&sdkp->device->sdev_gendev);
 
+	sd_zbc_release_disk(sdkp);
+
 	kfree(sdkp);
 }
 
@@ -3664,19 +3680,19 @@ static int __init init_sd(void)
 	if (!sd_page_pool) {
 		printk(KERN_ERR "sd: can't init discard page pool\n");
 		err = -ENOMEM;
-		goto err_out_ppool;
+		goto err_out_cdb_pool;
 	}
 
 	err = scsi_register_driver(&sd_template.gendrv);
 	if (err)
-		goto err_out_driver;
+		goto err_out_ppool;
 
 	return 0;
 
-err_out_driver:
+err_out_ppool:
 	mempool_destroy(sd_page_pool);
 
-err_out_ppool:
+err_out_cdb_pool:
 	mempool_destroy(sd_cdb_pool);
 
 err_out_cache:
diff --git a/drivers/scsi/sd.h b/drivers/scsi/sd.h
index 50fff0bf8c8e..74448c250fea 100644
--- a/drivers/scsi/sd.h
+++ b/drivers/scsi/sd.h
@@ -79,6 +79,10 @@ struct scsi_disk {
 	u32		zones_optimal_open;
 	u32		zones_optimal_nonseq;
 	u32		zones_max_open;
+	u32		*zones_wp_ofst;
+	spinlock_t	zones_wp_ofst_lock;
+	struct work_struct zone_wp_ofst_work;
+	char		*zone_wp_update_buf;
 #endif
 	atomic_t	openers;
 	sector_t	capacity;	/* size in logical blocks */
@@ -207,17 +211,32 @@ static inline int sd_is_zoned(struct scsi_disk *sdkp)
 
 #ifdef CONFIG_BLK_DEV_ZONED
 
+int sd_zbc_init_disk(struct scsi_disk *sdkp);
+void sd_zbc_release_disk(struct scsi_disk *sdkp);
scsi_disk *sdkp); extern int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buffer); extern void sd_zbc_print_zones(struct scsi_disk *sdkp); blk_status_t sd_zbc_setup_zone_mgmt_cmnd(struct scsi_cmnd *cmd, unsigned char op, bool all); -extern void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes, - struct scsi_sense_hdr *sshdr); +unsigned int sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes, + struct scsi_sense_hdr *sshdr); int sd_zbc_report_zones(struct gendisk *disk, sector_t sector, unsigned int nr_zones, report_zones_cb cb, void *data); +blk_status_t sd_zbc_prepare_zone_append(struct scsi_cmnd *cmd, sector_t *lba, + unsigned int nr_blocks); + #else /* CONFIG_BLK_DEV_ZONED */ +static inline int sd_zbc_init(void) +{ + return 0; +} + +static inline void sd_zbc_exit(void) {} + +static inline void sd_zbc_init_disk(struct scsi_disk *sdkp) {} +static inline void sd_zbc_release_disk(struct scsi_disk *sdkp) {} + static inline int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf) { @@ -233,9 +252,18 @@ static inline blk_status_t sd_zbc_setup_zone_mgmt_cmnd(struct scsi_cmnd *cmd, return BLK_STS_TARGET; } -static inline void sd_zbc_complete(struct scsi_cmnd *cmd, - unsigned int good_bytes, - struct scsi_sense_hdr *sshdr) {} +static inline unsigned int sd_zbc_complete(struct scsi_cmnd *cmd, + unsigned int good_bytes, struct scsi_sense_hdr *sshdr) +{ + return 0; +} + +static inline blk_status_t sd_zbc_prepare_zone_append(struct scsi_cmnd *cmd, + sector_t *lba, + unsigned int nr_blocks) +{ + return BLK_STS_TARGET; +} #define sd_zbc_report_zones NULL diff --git a/drivers/scsi/sd_zbc.c b/drivers/scsi/sd_zbc.c index ee156fbf3780..49c78040fa84 100644 --- a/drivers/scsi/sd_zbc.c +++ b/drivers/scsi/sd_zbc.c @@ -42,6 +42,30 @@ static int sd_zbc_parse_report(struct scsi_disk *sdkp, u8 *buf, return cb(&zone, idx, data); } +static unsigned int sd_zbc_get_zone_wp_ofst(struct blk_zone *zone) +{ + if (zone->type == ZBC_ZONE_TYPE_CONV) + return 0; + + switch (zone->cond) { + case BLK_ZONE_COND_IMP_OPEN: + case BLK_ZONE_COND_EXP_OPEN: + case BLK_ZONE_COND_CLOSED: + return zone->wp - zone->start; + case BLK_ZONE_COND_FULL: + return zone->len; + case BLK_ZONE_COND_EMPTY: + case BLK_ZONE_COND_OFFLINE: + case BLK_ZONE_COND_READONLY: + default: + /* + * Offline and read-only zones do not have a valid + * write pointer. Use 0 as for an empty zone. + */ + return 0; + } +} + /** * sd_zbc_do_report_zones - Issue a REPORT ZONES scsi command. 
  * @sdkp: The target disk
@@ -229,6 +253,134 @@ static blk_status_t sd_zbc_cmnd_checks(struct scsi_cmnd *cmd)
 	return BLK_STS_OK;
 }
 
+#define SD_ZBC_INVALID_WP_OFST	~(0u)
+#define SD_ZBC_UPDATING_WP_OFST	(SD_ZBC_INVALID_WP_OFST - 1)
+
+static int sd_zbc_update_wp_ofst_cb(struct blk_zone *zone, unsigned int idx,
+				    void *data)
+{
+	struct scsi_disk *sdkp = data;
+
+	lockdep_assert_held(&sdkp->zones_wp_ofst_lock);
+
+	sdkp->zones_wp_ofst[idx] = sd_zbc_get_zone_wp_ofst(zone);
+
+	return 0;
+}
+
+static void sd_zbc_update_wp_ofst_workfn(struct work_struct *work)
+{
+	struct scsi_disk *sdkp;
+	unsigned int zno;
+	int ret;
+
+	sdkp = container_of(work, struct scsi_disk, zone_wp_ofst_work);
+
+	spin_lock_bh(&sdkp->zones_wp_ofst_lock);
+	for (zno = 0; zno < sdkp->nr_zones; zno++) {
+		if (sdkp->zones_wp_ofst[zno] != SD_ZBC_UPDATING_WP_OFST)
+			continue;
+
+		spin_unlock_bh(&sdkp->zones_wp_ofst_lock);
+		ret = sd_zbc_do_report_zones(sdkp, sdkp->zone_wp_update_buf,
+					     SD_BUF_SIZE,
+					     zno * sdkp->zone_blocks, true);
+		spin_lock_bh(&sdkp->zones_wp_ofst_lock);
+		if (!ret)
+			sd_zbc_parse_report(sdkp, sdkp->zone_wp_update_buf + 64,
+					    zno, sd_zbc_update_wp_ofst_cb,
+					    sdkp);
+	}
+	spin_unlock_bh(&sdkp->zones_wp_ofst_lock);
+
+	scsi_device_put(sdkp->device);
+}
+
+static blk_status_t sd_zbc_update_wp_ofst(struct scsi_disk *sdkp,
+					  unsigned int zno)
+{
+	/*
+	 * We are about to schedule work to update a zone write pointer offset,
+	 * which will cause the zone append command to be requeued. So make
+	 * sure that the scsi device does not go away while the work is
+	 * being processed.
+	 */
+	if (scsi_device_get(sdkp->device))
+		return BLK_STS_IOERR;
+
+	sdkp->zones_wp_ofst[zno] = SD_ZBC_UPDATING_WP_OFST;
+
+	schedule_work(&sdkp->zone_wp_ofst_work);
+
+	return BLK_STS_RESOURCE;
+}
+
+/**
+ * sd_zbc_prepare_zone_append() - Prepare an emulated ZONE_APPEND command.
+ * @cmd: the command to setup
+ * @lba: the LBA to patch
+ * @nr_blocks: the number of LBAs to be written
+ *
+ * Called from sd_setup_read_write_cmnd() for REQ_OP_ZONE_APPEND.
+ * sd_zbc_prepare_zone_append() handles the necessary zone write locking and
+ * patching of the lba for an emulated ZONE_APPEND command.
+ *
+ * In case the cached write pointer offset is %SD_ZBC_INVALID_WP_OFST it will
+ * schedule a REPORT ZONES command and return BLK_STS_RESOURCE so that the
+ * zone append command is requeued.
+ */
+blk_status_t sd_zbc_prepare_zone_append(struct scsi_cmnd *cmd, sector_t *lba,
+					unsigned int nr_blocks)
+{
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	unsigned int wp_ofst, zno = blk_rq_zone_no(rq);
+	blk_status_t ret;
+
+	ret = sd_zbc_cmnd_checks(cmd);
+	if (ret != BLK_STS_OK)
+		return ret;
+
+	if (!blk_rq_zone_is_seq(rq))
+		return BLK_STS_IOERR;
+
+	/* Unlock of the write lock will happen in sd_zbc_complete() */
+	if (!blk_req_zone_write_trylock(rq))
+		return BLK_STS_ZONE_RESOURCE;
+
+	spin_lock_bh(&sdkp->zones_wp_ofst_lock);
+
+	wp_ofst = sdkp->zones_wp_ofst[zno];
+	if (wp_ofst == SD_ZBC_UPDATING_WP_OFST) {
+		/* Write pointer offset update in progress: ask for a requeue */
+		ret = BLK_STS_RESOURCE;
+		goto err;
+	}
+
+	if (wp_ofst == SD_ZBC_INVALID_WP_OFST) {
+		/* Invalid write pointer offset: trigger an update from disk */
+		ret = sd_zbc_update_wp_ofst(sdkp, zno);
+		goto err;
+	}
+
+	wp_ofst = sectors_to_logical(sdkp->device, wp_ofst);
+	if (wp_ofst + nr_blocks > sdkp->zone_blocks) {
+		ret = BLK_STS_IOERR;
+		goto err;
+	}
+
+	/* Set the LBA for the write command used to emulate zone append */
+	*lba += wp_ofst;
+
+	spin_unlock_bh(&sdkp->zones_wp_ofst_lock);
+
+	return BLK_STS_OK;
+
+err:
+	spin_unlock_bh(&sdkp->zones_wp_ofst_lock);
+	blk_req_zone_write_unlock(rq);
+	return ret;
+}
+
 /**
  * sd_zbc_setup_zone_mgmt_cmnd - Prepare a zone ZBC_OUT command. The operations
  *			can be RESET WRITE POINTER, OPEN, CLOSE or FINISH.
@@ -266,19 +418,139 @@ blk_status_t sd_zbc_setup_zone_mgmt_cmnd(struct scsi_cmnd *cmd,
 	cmd->transfersize = 0;
 	cmd->allowed = 0;
 
+	/* Only zone reset and zone finish need zone write locking */
+	if (op != ZO_RESET_WRITE_POINTER && op != ZO_FINISH_ZONE)
+		return BLK_STS_OK;
+
+	if (all) {
+		/* We do not write lock all zones for an all zone reset */
+		if (op == ZO_RESET_WRITE_POINTER)
+			return BLK_STS_OK;
+
+		/* Finishing all zones is not supported */
+		return BLK_STS_IOERR;
+	}
+
+	if (!blk_rq_zone_is_seq(rq))
+		return BLK_STS_IOERR;
+
+	if (!blk_req_zone_write_trylock(rq))
+		return BLK_STS_ZONE_RESOURCE;
+
 	return BLK_STS_OK;
 }
 
+static inline bool sd_zbc_zone_needs_write_unlock(struct request *rq)
+{
+	/*
+	 * For zone append, the zone was locked in sd_zbc_prepare_zone_append().
+	 * For zone reset and zone finish, the zone was locked in
+	 * sd_zbc_setup_zone_mgmt_cmnd().
+	 * For regular writes, the zone is unlocked by the block layer elevator.
+	 */
+	return req_op(rq) == REQ_OP_ZONE_APPEND ||
+	       req_op(rq) == REQ_OP_ZONE_RESET ||
+	       req_op(rq) == REQ_OP_ZONE_FINISH;
+}
+
+static bool sd_zbc_need_zone_wp_update(struct request *rq)
+{
+	switch (req_op(rq)) {
+	case REQ_OP_ZONE_APPEND:
+	case REQ_OP_ZONE_FINISH:
+	case REQ_OP_ZONE_RESET:
+	case REQ_OP_ZONE_RESET_ALL:
+		return true;
+	case REQ_OP_WRITE:
+	case REQ_OP_WRITE_ZEROES:
+	case REQ_OP_WRITE_SAME:
+		return blk_rq_zone_is_seq(rq);
+	default:
+		return false;
+	}
+}
+
+/**
+ * sd_zbc_zone_wp_update - Update cached zone write pointer upon cmd completion
+ * @cmd: Completed command
+ * @good_bytes: Command reply bytes
+ *
+ * Called from sd_zbc_complete() to handle the update of the cached zone write
+ * pointer value in case an update is needed.
+ */
+static unsigned int sd_zbc_zone_wp_update(struct scsi_cmnd *cmd,
+					  unsigned int good_bytes)
+{
+	int result = cmd->result;
+	struct request *rq = cmd->request;
+	struct scsi_disk *sdkp = scsi_disk(rq->rq_disk);
+	unsigned int zno = blk_rq_zone_no(rq);
+	enum req_opf op = req_op(rq);
+
+	/*
+	 * If we got an error for a command that needs updating the write
+	 * pointer offset cache, we must mark the zone wp offset entry as
+	 * invalid to force an update from disk the next time a zone append
+	 * command is issued.
+	 */
+	spin_lock_bh(&sdkp->zones_wp_ofst_lock);
+
+	if (result && op != REQ_OP_ZONE_RESET_ALL) {
+		if (op == REQ_OP_ZONE_APPEND) {
+			/* Force complete completion (no retry) */
+			good_bytes = 0;
+			scsi_set_resid(cmd, blk_rq_bytes(rq));
+		}
+
+		/*
+		 * Force an update of the zone write pointer offset on
+		 * the next zone append access.
+		 */
+		if (sdkp->zones_wp_ofst[zno] != SD_ZBC_UPDATING_WP_OFST)
+			sdkp->zones_wp_ofst[zno] = SD_ZBC_INVALID_WP_OFST;
+		goto unlock_wp_ofst;
+	}
+
+	switch (op) {
+	case REQ_OP_ZONE_APPEND:
+		rq->__sector += sdkp->zones_wp_ofst[zno];
+		/* fallthrough */
+	case REQ_OP_WRITE_ZEROES:
+	case REQ_OP_WRITE_SAME:
+	case REQ_OP_WRITE:
+		if (sdkp->zones_wp_ofst[zno] < sd_zbc_zone_sectors(sdkp))
+			sdkp->zones_wp_ofst[zno] += good_bytes >> SECTOR_SHIFT;
+		break;
+	case REQ_OP_ZONE_RESET:
+		sdkp->zones_wp_ofst[zno] = 0;
+		break;
+	case REQ_OP_ZONE_FINISH:
+		sdkp->zones_wp_ofst[zno] = sd_zbc_zone_sectors(sdkp);
+		break;
+	case REQ_OP_ZONE_RESET_ALL:
+		memset(sdkp->zones_wp_ofst, 0,
+		       sdkp->nr_zones * sizeof(unsigned int));
+		break;
+	default:
+		break;
+	}
+
+unlock_wp_ofst:
+	spin_unlock_bh(&sdkp->zones_wp_ofst_lock);
+
+	return good_bytes;
+}
+
 /**
  * sd_zbc_complete - ZBC command post processing.
  * @cmd: Completed command
  * @good_bytes: Command reply bytes
  * @sshdr: command sense header
  *
- * Called from sd_done(). Process report zones reply and handle reset zone
- * and write commands errors.
+ * Called from sd_done() to handle zone command errors and updates to the
+ * device queue zone write pointer offset cache.
  */
-void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes,
+unsigned int sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes,
 		     struct scsi_sense_hdr *sshdr)
 {
 	int result = cmd->result;
@@ -294,7 +566,18 @@ void sd_zbc_complete(struct scsi_cmnd *cmd, unsigned int good_bytes,
 		 * so be quiet about the error.
 		 */
 		rq->rq_flags |= RQF_QUIET;
+		goto unlock_zone;
 	}
+
+	if (sd_zbc_need_zone_wp_update(rq))
+		good_bytes = sd_zbc_zone_wp_update(cmd, good_bytes);
+
+unlock_zone:
+	if (sd_zbc_zone_needs_write_unlock(rq))
+		blk_req_zone_write_unlock(rq);
+
+	return good_bytes;
 }
 
 /**
@@ -396,11 +679,52 @@ static int sd_zbc_check_capacity(struct scsi_disk *sdkp, unsigned char *buf,
 	return 0;
 }
 
+struct sd_zbc_revalidate_zone_args {
+	struct scsi_disk *sdkp;
+	u32 *zones_wp_ofst;
+};
+
+static void sd_zbc_revalidate_zones_cb(struct blk_zone *zone, unsigned int idx,
+				       void *data)
+{
+	struct sd_zbc_revalidate_zone_args *args = data;
+	struct scsi_disk *sdkp = args->sdkp;
+
+	if (zone) {
+		args->zones_wp_ofst[idx] = sd_zbc_get_zone_wp_ofst(zone);
+		return;
+	}
+
+	/* Final call: apply change */
+	swap(sdkp->zones_wp_ofst, args->zones_wp_ofst);
+}
+
+static int sd_zbc_revalidate_zones(struct scsi_disk *sdkp,
+				   unsigned int nr_zones)
+{
+	struct sd_zbc_revalidate_zone_args args = {
+		.sdkp = sdkp,
+	};
+	int ret;
+
+	args.zones_wp_ofst = kvcalloc(nr_zones, sizeof(u32), GFP_NOIO);
+	if (!args.zones_wp_ofst)
+		return -ENOMEM;
+
+	ret = __blk_revalidate_disk_zones(sdkp->disk,
+					  sd_zbc_revalidate_zones_cb, &args);
+	kvfree(args.zones_wp_ofst);
+
+	return ret;
+}
+
 int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 {
 	struct gendisk *disk = sdkp->disk;
+	struct request_queue *q = disk->queue;
 	unsigned int nr_zones;
 	u32 zone_blocks = 0;
+	u32 max_append;
 	int ret;
 
 	if (!sd_is_zoned(sdkp))
@@ -420,10 +744,14 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 	if (ret != 0)
 		goto err;
 
+	max_append = min_t(u32, logical_to_sectors(sdkp->device, zone_blocks),
+			   q->limits.max_segments << (PAGE_SHIFT - 9));
+	max_append = min_t(u32, max_append, queue_max_hw_sectors(q));
+
 	/* The drive satisfies the kernel restrictions: set it up */
-	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, sdkp->disk->queue);
-	blk_queue_required_elevator_features(sdkp->disk->queue,
-					     ELEVATOR_F_ZBD_SEQ_WRITE);
+	blk_queue_flag_set(QUEUE_FLAG_ZONE_RESETALL, q);
+	blk_queue_required_elevator_features(q, ELEVATOR_F_ZBD_SEQ_WRITE);
+	blk_queue_max_zone_append_sectors(q, max_append);
 	nr_zones = round_up(sdkp->capacity, zone_blocks) >> ilog2(zone_blocks);
 
 	/* READ16/WRITE16 is mandatory for ZBC disks */
@@ -443,8 +771,8 @@ int sd_zbc_read_zones(struct scsi_disk *sdkp, unsigned char *buf)
 
 	if (sdkp->zone_blocks != zone_blocks ||
 	    sdkp->nr_zones != nr_zones ||
-	    disk->queue->nr_zones != nr_zones) {
-		ret = blk_revalidate_disk_zones(disk);
+	    q->nr_zones != nr_zones) {
+		ret = sd_zbc_revalidate_zones(sdkp, nr_zones);
 		if (ret != 0)
 			goto err;
 		sdkp->zone_blocks = zone_blocks;
@@ -475,3 +803,26 @@ void sd_zbc_print_zones(struct scsi_disk *sdkp)
 			  sdkp->nr_zones,
 			  sdkp->zone_blocks);
 }
+
+int sd_zbc_init_disk(struct scsi_disk *sdkp)
+{
+	if (!sd_is_zoned(sdkp))
+		return 0;
+
+	sdkp->zones_wp_ofst = NULL;
+	spin_lock_init(&sdkp->zones_wp_ofst_lock);
+	INIT_WORK(&sdkp->zone_wp_ofst_work, sd_zbc_update_wp_ofst_workfn);
+	sdkp->zone_wp_update_buf = kzalloc(SD_BUF_SIZE, GFP_KERNEL);
+	if (!sdkp->zone_wp_update_buf)
+		return -ENOMEM;
+
+	return 0;
+}
+
+void sd_zbc_release_disk(struct scsi_disk *sdkp)
+{
+	kvfree(sdkp->zones_wp_ofst);
+	sdkp->zones_wp_ofst = NULL;
+	kfree(sdkp->zone_wp_update_buf);
+	sdkp->zone_wp_update_buf = NULL;
+}
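A note on the two sentinel values in the diff above: they carve the top
of the u32 offset range out of the set of valid offsets, which works
because a real write pointer offset stays well below ~0u for any sane
zone size. A reduced model of the dispatch-side decision in
sd_zbc_prepare_zone_append() (only the two sentinel definitions are
taken from the patch; everything else is illustrative):

	#include <stdint.h>
	#include <stdio.h>

	#define SD_ZBC_INVALID_WP_OFST	(~0u)
	#define SD_ZBC_UPDATING_WP_OFST	(SD_ZBC_INVALID_WP_OFST - 1)

	enum verdict { DISPATCH_NOW, REQUEUE, REQUEUE_AND_REFRESH };

	static enum verdict zone_append_verdict(uint32_t wp_ofst)
	{
		if (wp_ofst == SD_ZBC_UPDATING_WP_OFST)
			return REQUEUE;			/* report in flight: wait */
		if (wp_ofst == SD_ZBC_INVALID_WP_OFST)
			return REQUEUE_AND_REFRESH;	/* schedule REPORT ZONES */
		return DISPATCH_NOW;			/* patch the LBA and go */
	}

	int main(void)
	{
		printf("%d %d %d\n",
		       zone_append_verdict(0),
		       zone_append_verdict(SD_ZBC_UPDATING_WP_OFST),
		       zone_append_verdict(SD_ZBC_INVALID_WP_OFST));
		return 0;
	}

In the driver, REQUEUE maps to BLK_STS_RESOURCE and REQUEUE_AND_REFRESH
additionally kicks the zone_wp_ofst_work worker before returning.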
From patchwork Fri Apr 3 10:12:48 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472471
From: Johannes Thumshirn
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, Damien Le Moal, Keith Busch,
    linux-scsi@vger.kernel.org, "Martin K. Petersen",
    linux-fsdevel@vger.kernel.org, Damien Le Moal
Subject: [PATCH v4 08/10] null_blk: Support REQ_OP_ZONE_APPEND
Date: Fri, 3 Apr 2020 19:12:48 +0900
Message-Id: <20200403101250.33245-9-johannes.thumshirn@wdc.com>
In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
References: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
List-ID: linux-scsi@vger.kernel.org

From: Damien Le Moal

Support REQ_OP_ZONE_APPEND requests for null_blk devices with zoned
mode enabled. Use the internally tracked zone write pointer position
as the actual write position and return it using the command request
__sector field in the case of an mq device and using the command BIO
sector in the case of a BIO device.

Signed-off-by: Damien Le Moal
Reviewed-by: Christoph Hellwig
---
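For submitters, the practical consequence is that the sector given to a
zone append BIO only identifies the target zone; the completed BIO
carries the position the data actually landed at. A kernel-side sketch
of a synchronous submitter (a hypothetical helper, not from this
series; it assumes a single-page payload within the zone append limit):

	#include <linux/bio.h>
	#include <linux/blkdev.h>

	static int zone_append_sync(struct block_device *bdev,
				    sector_t zone_start, struct page *page,
				    unsigned int len, sector_t *written_at)
	{
		struct bio_vec bvec;
		struct bio bio;
		int ret;

		bio_init(&bio, &bvec, 1);
		bio_set_dev(&bio, bdev);
		bio.bi_iter.bi_sector = zone_start;	/* zone start, not the wp */
		bio.bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC;
		if (bio_add_page(&bio, page, len, 0) != len)
			return -EIO;

		ret = submit_bio_wait(&bio);
		if (!ret)
			/* The driver wrote back the sector it appended at. */
			*written_at = bio.bi_iter.bi_sector;
		return ret;
	}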
org" , Damien Le Moal Subject: [PATCH v4 08/10] null_blk: Support REQ_OP_ZONE_APPEND Date: Fri, 3 Apr 2020 19:12:48 +0900 Message-Id: <20200403101250.33245-9-johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com> References: <20200403101250.33245-1-johannes.thumshirn@wdc.com> MIME-Version: 1.0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org From: Damien Le Moal Support REQ_OP_ZONE_APPEND requests for null_blk devices with zoned mode enabled. Use the internally tracked zone write pointer position as the actual write position and return it using the command request __sector field in the case of an mq device and using the command BIO sector in the case of a BIO device. Signed-off-by: Damien Le Moal Reviewed-by: Christoph Hellwig --- drivers/block/null_blk_zoned.c | 39 +++++++++++++++++++++++++++------- 1 file changed, 31 insertions(+), 8 deletions(-) diff --git a/drivers/block/null_blk_zoned.c b/drivers/block/null_blk_zoned.c index c60b19432a2e..b664be0bbb5e 100644 --- a/drivers/block/null_blk_zoned.c +++ b/drivers/block/null_blk_zoned.c @@ -67,13 +67,22 @@ int null_init_zoned_dev(struct nullb_device *dev, struct request_queue *q) int null_register_zoned_dev(struct nullb *nullb) { + struct nullb_device *dev = nullb->dev; struct request_queue *q = nullb->q; - if (queue_is_mq(q)) - return blk_revalidate_disk_zones(nullb->disk); + if (queue_is_mq(q)) { + int ret = blk_revalidate_disk_zones(nullb->disk); + + if (ret) + return ret; + } else { + blk_queue_chunk_sectors(q, dev->zone_size_sects); + q->nr_zones = blkdev_nr_zones(nullb->disk); + } - blk_queue_chunk_sectors(q, nullb->dev->zone_size_sects); - q->nr_zones = blkdev_nr_zones(nullb->disk); + blk_queue_max_zone_append_sectors(q, + min_t(sector_t, q->limits.max_hw_sectors, + dev->zone_size_sects)); return 0; } @@ -133,7 +142,7 @@ size_t null_zone_valid_read_len(struct nullb *nullb, } static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector, - unsigned int nr_sectors) + unsigned int nr_sectors, bool append) { struct nullb_device *dev = cmd->nq->dev; unsigned int zno = null_zone_no(dev, sector); @@ -151,9 +160,21 @@ static blk_status_t null_zone_write(struct nullb_cmd *cmd, sector_t sector, case BLK_ZONE_COND_IMP_OPEN: case BLK_ZONE_COND_EXP_OPEN: case BLK_ZONE_COND_CLOSED: - /* Writes must be at the write pointer position */ - if (sector != zone->wp) + /* + * Regular writes must be at the write pointer position. + * Zone append writes are automatically issued at the write + * pointer and the position returned using the request or BIO + * sector. 
+		 */
+		if (append) {
+			sector = zone->wp;
+			if (cmd->bio)
+				cmd->bio->bi_iter.bi_sector = sector;
+			else
+				cmd->rq->__sector = sector;
+		} else if (sector != zone->wp) {
 			return BLK_STS_IOERR;
+		}
 
 		if (zone->cond != BLK_ZONE_COND_EXP_OPEN)
 			zone->cond = BLK_ZONE_COND_IMP_OPEN;
@@ -232,7 +253,9 @@ blk_status_t null_handle_zoned(struct nullb_cmd *cmd, enum req_opf op,
 {
 	switch (op) {
 	case REQ_OP_WRITE:
-		return null_zone_write(cmd, sector, nr_sectors);
+		return null_zone_write(cmd, sector, nr_sectors, false);
+	case REQ_OP_ZONE_APPEND:
+		return null_zone_write(cmd, sector, nr_sectors, true);
 	case REQ_OP_ZONE_RESET:
 	case REQ_OP_ZONE_RESET_ALL:
 	case REQ_OP_ZONE_OPEN:
From patchwork Fri Apr 3 10:12:49 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472485
From: Johannes Thumshirn
To: Jens Axboe
Cc: Christoph Hellwig, linux-block, Damien Le Moal, Keith Busch,
    linux-scsi@vger.kernel.org, "Martin K. Petersen",
    linux-fsdevel@vger.kernel.org, Johannes Thumshirn
Subject: [PATCH v4 09/10] block: export bio_release_pages and bio_iov_iter_get_pages
Date: Fri, 3 Apr 2020 19:12:49 +0900
Message-Id: <20200403101250.33245-10-johannes.thumshirn@wdc.com>
In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
References: <20200403101250.33245-1-johannes.thumshirn@wdc.com>
List-ID: linux-scsi@vger.kernel.org

Export bio_release_pages and bio_iov_iter_get_pages, so they can be
used from modular code.

Signed-off-by: Johannes Thumshirn
---
Changes to v3:
- Use EXPORT_SYMBOL_GPL
---
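A rough sketch of what these exports enable for modular code,
essentially the pattern zonefs uses in the next patch (the function
name and error handling here are illustrative, not from the series):

	#include <linux/bio.h>
	#include <linux/blkdev.h>
	#include <linux/uio.h>

	static int module_dio_write(struct block_device *bdev, sector_t sector,
				    struct iov_iter *from)
	{
		struct bio *bio;
		int ret;

		bio = bio_alloc(GFP_KERNEL, iov_iter_npages(from, BIO_MAX_PAGES));
		if (!bio)
			return -ENOMEM;

		bio_set_dev(bio, bdev);
		bio->bi_iter.bi_sector = sector;
		bio->bi_opf = REQ_OP_WRITE | REQ_SYNC;

		/* Map the caller's pages into the bio (now EXPORT_SYMBOL_GPL). */
		ret = bio_iov_iter_get_pages(bio, from);
		if (ret) {
			bio_put(bio);
			return ret;
		}

		ret = submit_bio_wait(bio);
		/* Drop the page references taken above; no dirtying on writes. */
		bio_release_pages(bio, false);
		bio_put(bio);
		return ret;
	}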
 block/bio.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/block/bio.c b/block/bio.c
index e8c9273884a6..7819b01d269c 100644
--- a/block/bio.c
+++ b/block/bio.c
@@ -929,6 +929,7 @@ void bio_release_pages(struct bio *bio, bool mark_dirty)
 		put_page(bvec->bv_page);
 	}
 }
+EXPORT_SYMBOL_GPL(bio_release_pages);
 
 static int __bio_iov_bvec_add_pages(struct bio *bio, struct iov_iter *iter)
 {
@@ -1050,6 +1051,7 @@ int bio_iov_iter_get_pages(struct bio *bio, struct iov_iter *iter)
 		bio_set_flag(bio, BIO_NO_PAGE_REF);
 	return bio->bi_vcnt ? 0 : ret;
 }
+EXPORT_SYMBOL_GPL(bio_iov_iter_get_pages);
 
 static void submit_bio_wait_endio(struct bio *bio)
 {
From patchwork Fri Apr 3 10:12:50 2020
X-Patchwork-Submitter: Johannes Thumshirn
X-Patchwork-Id: 11472479
org" , Johannes Thumshirn Subject: [PATCH v4 10/10] zonefs: use REQ_OP_ZONE_APPEND for sync DIO Date: Fri, 3 Apr 2020 19:12:50 +0900 Message-Id: <20200403101250.33245-11-johannes.thumshirn@wdc.com> X-Mailer: git-send-email 2.24.1 In-Reply-To: <20200403101250.33245-1-johannes.thumshirn@wdc.com> References: <20200403101250.33245-1-johannes.thumshirn@wdc.com> MIME-Version: 1.0 Sender: linux-scsi-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-scsi@vger.kernel.org Synchronous direct I/O to a sequential write only zone can be issued using the new REQ_OP_ZONE_APPEND request operation. As dispatching multiple BIOs can potentially result in reordering, we cannot support asynchronous IO via this interface. We also can only dispatch up to queue_max_zone_append_sectors() via the new zone-append method and have to return a short write back to user-space in case an IO larger than queue_max_zone_append_sectors() has been issued. Signed-off-by: Johannes Thumshirn --- fs/zonefs/super.c | 80 ++++++++++++++++++++++++++++++++++++++++++----- 1 file changed, 72 insertions(+), 8 deletions(-) diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c index 3ce9829a6936..0bf7009f50a2 100644 --- a/fs/zonefs/super.c +++ b/fs/zonefs/super.c @@ -20,6 +20,7 @@ #include #include #include +#include #include "zonefs.h" @@ -596,6 +597,61 @@ static const struct iomap_dio_ops zonefs_write_dio_ops = { .end_io = zonefs_file_write_dio_end_io, }; +static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from) +{ + struct inode *inode = file_inode(iocb->ki_filp); + struct zonefs_inode_info *zi = ZONEFS_I(inode); + struct block_device *bdev = inode->i_sb->s_bdev; + unsigned int max; + struct bio *bio; + ssize_t size; + int nr_pages; + ssize_t ret; + + nr_pages = iov_iter_npages(from, BIO_MAX_PAGES); + if (!nr_pages) + return 0; + + max = queue_max_zone_append_sectors(bdev_get_queue(bdev)); + max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize); + iov_iter_truncate(from, max); + + bio = bio_alloc_bioset(GFP_NOFS, nr_pages, &fs_bio_set); + if (!bio) + return -ENOMEM; + + bio_set_dev(bio, bdev); + bio->bi_iter.bi_sector = zi->i_zsector; + bio->bi_write_hint = iocb->ki_hint; + bio->bi_ioprio = iocb->ki_ioprio; + bio->bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE; + if (iocb->ki_flags & IOCB_DSYNC) + bio->bi_opf |= REQ_FUA; + + ret = bio_iov_iter_get_pages(bio, from); + if (unlikely(ret)) { + bio_io_error(bio); + return ret; + } + size = bio->bi_iter.bi_size; + task_io_account_write(ret); + + if (iocb->ki_flags & IOCB_HIPRI) + bio_set_polled(bio, iocb); + + ret = submit_bio_wait(bio); + + bio_put(bio); + + zonefs_file_write_dio_end_io(iocb, size, ret, 0); + if (ret >= 0) { + iocb->ki_pos += size; + return size; + } + + return ret; +} + /* * Handle direct writes. For sequential zone files, this is the only possible * write path. For these files, check that the user is issuing writes @@ -611,6 +667,8 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) struct inode *inode = file_inode(iocb->ki_filp); struct zonefs_inode_info *zi = ZONEFS_I(inode); struct super_block *sb = inode->i_sb; + bool sync = is_sync_kiocb(iocb); + bool append = false; size_t count; ssize_t ret; @@ -619,7 +677,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from) * as this can cause write reordering (e.g. the first aio gets EAGAIN * on the inode lock but the second goes through but is now unaligned). 
 fs/zonefs/super.c | 80 ++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 72 insertions(+), 8 deletions(-)

diff --git a/fs/zonefs/super.c b/fs/zonefs/super.c
index 3ce9829a6936..0bf7009f50a2 100644
--- a/fs/zonefs/super.c
+++ b/fs/zonefs/super.c
@@ -20,6 +20,7 @@
 #include
 #include
 #include
+#include
 
 #include "zonefs.h"
 
@@ -596,6 +597,61 @@ static const struct iomap_dio_ops zonefs_write_dio_ops = {
 	.end_io			= zonefs_file_write_dio_end_io,
 };
 
+static ssize_t zonefs_file_dio_append(struct kiocb *iocb, struct iov_iter *from)
+{
+	struct inode *inode = file_inode(iocb->ki_filp);
+	struct zonefs_inode_info *zi = ZONEFS_I(inode);
+	struct block_device *bdev = inode->i_sb->s_bdev;
+	unsigned int max;
+	struct bio *bio;
+	ssize_t size;
+	int nr_pages;
+	ssize_t ret;
+
+	nr_pages = iov_iter_npages(from, BIO_MAX_PAGES);
+	if (!nr_pages)
+		return 0;
+
+	max = queue_max_zone_append_sectors(bdev_get_queue(bdev));
+	max = ALIGN_DOWN(max << SECTOR_SHIFT, inode->i_sb->s_blocksize);
+	iov_iter_truncate(from, max);
+
+	bio = bio_alloc_bioset(GFP_NOFS, nr_pages, &fs_bio_set);
+	if (!bio)
+		return -ENOMEM;
+
+	bio_set_dev(bio, bdev);
+	bio->bi_iter.bi_sector = zi->i_zsector;
+	bio->bi_write_hint = iocb->ki_hint;
+	bio->bi_ioprio = iocb->ki_ioprio;
+	bio->bi_opf = REQ_OP_ZONE_APPEND | REQ_SYNC | REQ_IDLE;
+	if (iocb->ki_flags & IOCB_DSYNC)
+		bio->bi_opf |= REQ_FUA;
+
+	ret = bio_iov_iter_get_pages(bio, from);
+	if (unlikely(ret)) {
+		bio_io_error(bio);
+		return ret;
+	}
+	size = bio->bi_iter.bi_size;
+	task_io_account_write(ret);
+
+	if (iocb->ki_flags & IOCB_HIPRI)
+		bio_set_polled(bio, iocb);
+
+	ret = submit_bio_wait(bio);
+
+	bio_put(bio);
+
+	zonefs_file_write_dio_end_io(iocb, size, ret, 0);
+	if (ret >= 0) {
+		iocb->ki_pos += size;
+		return size;
+	}
+
+	return ret;
+}
+
 /*
  * Handle direct writes. For sequential zone files, this is the only possible
  * write path. For these files, check that the user is issuing writes
@@ -611,6 +667,8 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
 	struct inode *inode = file_inode(iocb->ki_filp);
 	struct zonefs_inode_info *zi = ZONEFS_I(inode);
 	struct super_block *sb = inode->i_sb;
+	bool sync = is_sync_kiocb(iocb);
+	bool append = false;
 	size_t count;
 	ssize_t ret;
 
@@ -619,7 +677,7 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
 	 * as this can cause write reordering (e.g. the first aio gets EAGAIN
 	 * on the inode lock but the second goes through but is now unaligned).
 	 */
-	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !is_sync_kiocb(iocb) &&
+	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && !sync &&
 	    (iocb->ki_flags & IOCB_NOWAIT))
 		return -EOPNOTSUPP;
 
@@ -643,16 +701,22 @@ static ssize_t zonefs_file_dio_write(struct kiocb *iocb, struct iov_iter *from)
 	}
 
 	/* Enforce sequential writes (append only) in sequential zones */
-	mutex_lock(&zi->i_truncate_mutex);
-	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ && iocb->ki_pos != zi->i_wpoffset) {
+	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ) {
+		mutex_lock(&zi->i_truncate_mutex);
+		if (iocb->ki_pos != zi->i_wpoffset) {
+			mutex_unlock(&zi->i_truncate_mutex);
+			ret = -EINVAL;
+			goto inode_unlock;
+		}
 		mutex_unlock(&zi->i_truncate_mutex);
-		ret = -EINVAL;
-		goto inode_unlock;
+		append = sync;
 	}
-	mutex_unlock(&zi->i_truncate_mutex);
 
-	ret = iomap_dio_rw(iocb, from, &zonefs_iomap_ops,
-			   &zonefs_write_dio_ops, is_sync_kiocb(iocb));
+	if (append)
+		ret = zonefs_file_dio_append(iocb, from);
+	else
+		ret = iomap_dio_rw(iocb, from, &zonefs_iomap_ops,
+				   &zonefs_write_dio_ops, sync);
 	if (zi->i_ztype == ZONEFS_ZTYPE_SEQ &&
 	    (ret > 0 || ret == -EIOCBQUEUED)) {
 		if (ret > 0)