From patchwork Tue Oct 20 06:54:18 2020
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 11845851
From: Jeffle Xu
To: snitzer@redhat.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, dm-devel@redhat.com, joseph.qi@linux.alibaba.com,
    xiaoguang.wang@linux.alibaba.com, haoxu@linux.alibaba.com
Subject: [RFC 1/3] block/mq: add iterator for polling hw queues
Date: Tue, 20 Oct 2020 14:54:18 +0800
Message-Id: <20201020065420.124885-2-jefflexu@linux.alibaba.com>
In-Reply-To: <20201020065420.124885-1-jefflexu@linux.alibaba.com>
References: <20201020065420.124885-1-jefflexu@linux.alibaba.com>
List-ID: linux-block@vger.kernel.org

Add a helper macro for iterating over all hardware queues in polling mode.

Signed-off-by: Jeffle Xu
---
 include/linux/blk-mq.h | 6 ++++++
 1 file changed, 6 insertions(+)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index b23eeca4d677..81c70ce97715 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -571,6 +571,12 @@ static inline void *blk_mq_rq_to_pdu(struct request *rq)
 	for ((i) = 0; (i) < (q)->nr_hw_queues &&			\
 	     ({ hctx = (q)->queue_hw_ctx[i]; 1; }); (i)++)
 
+#define queue_for_each_poll_hw_ctx(q, hctx, i)				\
+	for ((i) = 0; ((q)->tag_set->nr_maps > HCTX_TYPE_POLL) &&	\
+	     (i) < (q)->tag_set->map[HCTX_TYPE_POLL].nr_queues &&	\
+	     ({ hctx = (q)->queue_hw_ctx[((q)->tag_set->map[HCTX_TYPE_POLL].queue_offset + (i))]; 1; }); \
+	     (i)++)
+
 #define hctx_for_each_ctx(hctx, ctx, i)					\
 	for ((i) = 0; (i) < (hctx)->nr_ctx &&				\
 	     ({ ctx = (hctx)->ctxs[(i)]; 1; }); (i)++)

From patchwork Tue Oct 20 06:54:19 2020
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 11845855
From: Jeffle Xu
To: snitzer@redhat.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, dm-devel@redhat.com, joseph.qi@linux.alibaba.com,
    xiaoguang.wang@linux.alibaba.com, haoxu@linux.alibaba.com
Subject: [RFC 2/3] block: add back ->poll_fn in request queue
Date: Tue, 20 Oct 2020 14:54:19 +0800
Message-Id: <20201020065420.124885-3-jefflexu@linux.alibaba.com>
In-Reply-To: <20201020065420.124885-1-jefflexu@linux.alibaba.com>
References: <20201020065420.124885-1-jefflexu@linux.alibaba.com>
List-ID: linux-block@vger.kernel.org

This is a prep patch for adding IO polling support to dm devices.

->poll_fn was introduced in commit ea435e1b9392 ("block: add a poll_fn
callback to struct request_queue") to support non-mq queues such as nvme
multipath, but was removed in commit 529262d56dbe ("block: remove
->poll_fn"). To support IO polling for dm devices, polling of non-mq
devices must be supported again, and thus we need ->poll_fn back.

Commit c62b37d96b6e ("block: move ->make_request_fn to struct
block_device_operations") moved all such callbacks into struct
block_device_operations in the gendisk.
But ->poll_fn can't be moved there, since there is no way to get the
corresponding gendisk from a request_queue.

Signed-off-by: Jeffle Xu
---
 block/blk-mq.c         | 30 ++++++++++++++++++++++++------
 include/linux/blkdev.h |  3 +++
 2 files changed, 27 insertions(+), 6 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index 696450257ac1..b521ab01eaf3 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -43,6 +43,7 @@
 static DEFINE_PER_CPU(struct list_head, blk_cpu_done);
 
+static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie);
 static void blk_mq_poll_stats_start(struct request_queue *q);
 static void blk_mq_poll_stats_fn(struct blk_stat_callback *cb);
 
@@ -3212,6 +3213,9 @@ struct request_queue *blk_mq_init_allocated_queue(struct blk_mq_tag_set *set,
 
 	q->tag_set = set;
 
+	if (q->mq_ops->poll)
+		q->poll_fn = blk_mq_poll;
+
 	q->queue_flags |= QUEUE_FLAG_MQ_DEFAULT;
 
 	if (set->nr_maps > HCTX_TYPE_POLL &&
 	    set->map[HCTX_TYPE_POLL].nr_queues)
@@ -3856,7 +3860,8 @@ int blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin)
 	if (current->plug)
 		blk_flush_plug_list(current->plug, false);
 
-	hctx = q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)];
+	hctx = queue_is_mq(q) ?
+		q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)] : NULL;
 
 	/*
 	 * If we sleep, have the caller restart the poll loop to reset
@@ -3864,21 +3869,26 @@ int blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin)
 	 * caller is responsible for checking if the IO completed. If
 	 * the IO isn't complete, we'll get called again and will go
 	 * straight to the busy poll loop.
+	 *
+	 * Currently dm doesn't support hybrid polling.
 	 */
-	if (blk_mq_poll_hybrid(q, hctx, cookie))
+	if (hctx && blk_mq_poll_hybrid(q, hctx, cookie))
 		return 1;
 
-	hctx->poll_considered++;
+	if (hctx)
+		hctx->poll_considered++;
 
 	state = current->state;
 	do {
 		int ret;
 
-		hctx->poll_invoked++;
+		if (hctx)
+			hctx->poll_invoked++;
 
-		ret = q->mq_ops->poll(hctx);
+		ret = q->poll_fn(q, cookie);
 		if (ret > 0) {
-			hctx->poll_success++;
+			if (hctx)
+				hctx->poll_success++;
 			__set_current_state(TASK_RUNNING);
 			return ret;
 		}
@@ -3898,6 +3908,14 @@ int blk_poll(struct request_queue *q, blk_qc_t cookie, bool spin)
 }
 EXPORT_SYMBOL_GPL(blk_poll);
 
+static int blk_mq_poll(struct request_queue *q, blk_qc_t cookie)
+{
+	struct blk_mq_hw_ctx *hctx;
+
+	hctx = q->queue_hw_ctx[blk_qc_t_to_queue_num(cookie)];
+	return q->mq_ops->poll(hctx);
+}
+
 unsigned int blk_mq_rq_cpu(struct request *rq)
 {
 	return rq->mq_ctx->cpu;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 639cae2c158b..d05684449893 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -288,6 +288,8 @@ static inline unsigned short req_get_ioprio(struct request *req)
 
 struct blk_queue_ctx;
 
+typedef int (poll_q_fn)(struct request_queue *q, blk_qc_t cookie);
+
 struct bio_vec;
 
 enum blk_eh_timer_return {
@@ -486,6 +488,7 @@ struct request_queue {
 	struct blk_stat_callback	*poll_cb;
 	struct blk_rq_stat		poll_stat[BLK_MQ_POLL_STATS_BKTS];
 
+	poll_q_fn			*poll_fn;
 	struct timer_list		timeout;
 	struct work_struct		timeout_work;

From patchwork Tue Oct 20 06:54:20 2020
X-Patchwork-Submitter: Jingbo Xu
X-Patchwork-Id: 11845857
From: Jeffle Xu
To: snitzer@redhat.com, axboe@kernel.dk
Cc: linux-block@vger.kernel.org, dm-devel@redhat.com, joseph.qi@linux.alibaba.com,
    xiaoguang.wang@linux.alibaba.com, haoxu@linux.alibaba.com
Subject: [RFC 3/3] dm: add support for IO polling
Date: Tue, 20 Oct 2020 14:54:20 +0800
Message-Id: <20201020065420.124885-4-jefflexu@linux.alibaba.com>
In-Reply-To: <20201020065420.124885-1-jefflexu@linux.alibaba.com>
References: <20201020065420.124885-1-jefflexu@linux.alibaba.com>
List-ID: linux-block@vger.kernel.org

The cookie is designed as a per-bio concept. This doesn't work well when
a bio needs to be split, and it becomes a real issue when adding iopoll
support for dm devices. The current implementation is simple: the cookie
returned for a dm device is not actually used, since it is merely the
cookie of one of the cloned bios.
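For context, the cookie that blk_poll() consumes packs the hardware queue
number and the request tag into a single 32-bit value; mainline decodes the
queue number with blk_qc_t_to_queue_num(), a right shift by BLK_QC_T_SHIFT
(16 at the time of this series). A user-space sketch of that encoding —
the helper names here are illustrative, not kernel API:

```c
#include <stdint.h>

typedef uint32_t blk_qc_t;

#define BLK_QC_T_SHIFT 16  /* queue number lives in the upper 16 bits */

/* Pack a hw queue index and a per-queue request tag into one cookie. */
static inline blk_qc_t make_cookie(unsigned int queue_num, unsigned int tag)
{
	return (blk_qc_t)(queue_num << BLK_QC_T_SHIFT) | tag;
}

/* Sketch of blk_qc_t_to_queue_num(): recover the hw queue index. */
static inline unsigned int cookie_to_queue_num(blk_qc_t cookie)
{
	return cookie >> BLK_QC_T_SHIFT;
}

/* Recover the request tag from the lower bits. */
static inline unsigned int cookie_to_tag(blk_qc_t cookie)
{
	return cookie & ((1u << BLK_QC_T_SHIFT) - 1);
}
```

A split bio submits several clones, each completing with its own cookie;
since submit_bio() can return only one blk_qc_t, the value the caller gets
back identifies at most one underlying queue — which is why this series
falls back to polling all of them.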
Polling a dm device actually polls all hardware queues (in poll mode) of
all underlying target devices.

Signed-off-by: Jeffle Xu
---
 drivers/md/dm-core.h  |  1 +
 drivers/md/dm-table.c | 30 ++++++++++++++++++++++++++++++
 drivers/md/dm.c       | 39 +++++++++++++++++++++++++++++++++++++++
 3 files changed, 70 insertions(+)

diff --git a/drivers/md/dm-core.h b/drivers/md/dm-core.h
index d522093cb39d..f18e066beffe 100644
--- a/drivers/md/dm-core.h
+++ b/drivers/md/dm-core.h
@@ -187,4 +187,5 @@ extern atomic_t dm_global_event_nr;
 extern wait_queue_head_t dm_global_eventq;
 void dm_issue_global_event(void);
 
+int dm_io_poll(struct request_queue *q, blk_qc_t cookie);
 #endif
diff --git a/drivers/md/dm-table.c b/drivers/md/dm-table.c
index ce543b761be7..634b79842519 100644
--- a/drivers/md/dm-table.c
+++ b/drivers/md/dm-table.c
@@ -1809,6 +1809,31 @@ static bool dm_table_requires_stable_pages(struct dm_table *t)
 	return false;
 }
 
+static int device_not_support_poll(struct dm_target *ti, struct dm_dev *dev,
+				   sector_t start, sector_t len, void *data)
+{
+	struct request_queue *q = bdev_get_queue(dev->bdev);
+
+	return q && !test_bit(QUEUE_FLAG_POLL, &q->queue_flags);
+}
+
+bool dm_table_supports_poll(struct dm_table *t)
+{
+	struct dm_target *ti;
+	unsigned int i;
+
+	/* Ensure that all targets support polling. */
+	for (i = 0; i < dm_table_get_num_targets(t); i++) {
+		ti = dm_table_get_target(t, i);
+
+		if (!ti->type->iterate_devices ||
+		    ti->type->iterate_devices(ti, device_not_support_poll, NULL))
+			return false;
+	}
+
+	return true;
+}
+
 void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 			       struct queue_limits *limits)
 {
@@ -1901,6 +1926,11 @@ void dm_table_set_restrictions(struct dm_table *t, struct request_queue *q,
 #endif
 
 	blk_queue_update_readahead(q);
+
+	if (dm_table_supports_poll(t)) {
+		q->poll_fn = dm_io_poll;
+		blk_queue_flag_set(QUEUE_FLAG_POLL, q);
+	}
 }
 
 unsigned int dm_table_get_num_targets(struct dm_table *t)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index c18fc2548518..4eceaf87ffd4 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -1666,6 +1666,45 @@ static blk_qc_t dm_submit_bio(struct bio *bio)
 	return ret;
 }
 
+static int dm_poll_one_dev(struct request_queue *q, blk_qc_t cookie)
+{
+	/* Iterate over all polling queues of an mq device. */
+	if (queue_is_mq(q)) {
+		struct blk_mq_hw_ctx *hctx;
+		int i, ret = 0;
+
+		if (!percpu_ref_tryget(&q->q_usage_counter))
+			return 0;
+
+		queue_for_each_poll_hw_ctx(q, hctx, i)
+			ret += q->mq_ops->poll(hctx);
+
+		percpu_ref_put(&q->q_usage_counter);
+		return ret;
+	} else {
+		return q->poll_fn(q, cookie);
+	}
+}
+
+int dm_io_poll(struct request_queue *q, blk_qc_t cookie)
+{
+	struct mapped_device *md = q->queuedata;
+	struct dm_table *table;
+	struct dm_dev_internal *dd;
+	int srcu_idx;
+	int ret = 0;
+
+	table = dm_get_live_table(md, &srcu_idx);
+	if (!table)
+		goto out;
+
+	list_for_each_entry(dd, dm_table_get_devices(table), list)
+		ret += dm_poll_one_dev(bdev_get_queue(dd->dm_dev->bdev), cookie);
+out:
+	dm_put_live_table(md, srcu_idx);
+	return ret;
+}
+
 /*-----------------------------------------------------------------
  * An IDR is used to keep track of allocated minor numbers.
  *---------------------------------------------------------------*/
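The control flow of dm_io_poll() — ignore which clone the cookie came from,
poll every underlying device in the live table, and sum the reaped
completions so blk_poll() sees whether progress was made — can be modeled in
a few lines of plain C. This is a sketch with illustrative names, not the
kernel implementation:

```c
#include <stddef.h>

/* Toy stand-in for one underlying target device's poll queue. */
struct target_dev {
	int pending;	/* completions waiting to be reaped */
};

/* Model of polling one underlying device: reap whatever is ready. */
static int poll_one_dev(struct target_dev *dev)
{
	int done = dev->pending;

	dev->pending = 0;
	return done;
}

/*
 * Model of dm_io_poll(): poll every device in the table and sum the
 * counts. A non-zero total tells the caller that progress was made;
 * zero means spin again (or give up, if spinning was not requested).
 */
static int dm_poll_model(struct target_dev *devs, size_t ndevs)
{
	int total = 0;
	size_t i;

	for (i = 0; i < ndevs; i++)
		total += poll_one_dev(&devs[i]);
	return total;
}
```

The trade-off mirrors the commit message: because the returned cookie cannot
name all the queues the cloned bios landed on, the poll path gives up
cookie-directed precision and sweeps every candidate queue instead.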