From patchwork Fri Jan 31 10:37:25 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 11359585
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id A0AAB112B
	for ; Fri, 31 Jan 2020 10:38:16 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 891B6206F0
	for ; Fri, 31 Jan 2020 10:38:16 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1728423AbgAaKiP (ORCPT ); Fri, 31 Jan 2020 05:38:15 -0500
Received: from mx2.suse.de ([195.135.220.15]:55846 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1728372AbgAaKiG (ORCPT ); Fri, 31 Jan 2020 05:38:06 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254])
	by mx2.suse.de (Postfix) with ESMTP id 8A2F0AF96;
	Fri, 31 Jan 2020 10:38:01 +0000 (UTC)
From: Hannes Reinecke
To: Ilya Dryomov
Cc: Sage Weil, Daniel Disseldorp, Jens Axboe,
	ceph-devel@vger.kernel.org, linux-block@vger.kernel.org,
	Hannes Reinecke
Subject: [PATCH 01/15] rbd: lock object request list
Date: Fri, 31 Jan 2020 11:37:25 +0100
Message-Id: <20200131103739.136098-2-hare@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200131103739.136098-1-hare@suse.de>
References: <20200131103739.136098-1-hare@suse.de>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

The object request list can be accessed from various contexts, so we need
to lock it to avoid concurrent modifications and random crashes.

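For readers unfamiliar with the pattern, here is a minimal userspace sketch
of what the patch does. It is not part of the patch: the struct and function
names are hypothetical stand-ins, and a pthread mutex is assumed in place of
the kernel's struct mutex. It shows the list being walked under the lock and
every exit path, including the error path, going through one unlock label.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Hypothetical stand-ins for rbd_obj_request / rbd_img_request. */
struct obj_req {
	struct obj_req *next;
	int len;
};

struct img_req {
	struct obj_req *head;		/* the object request list */
	pthread_mutex_t object_mutex;	/* protects head and every ->next */
};

static void img_req_init(struct img_req *img)
{
	assert(img != NULL);
	img->head = NULL;
	pthread_mutex_init(&img->object_mutex, NULL);
}

static void img_req_add(struct img_req *img, struct obj_req *obj)
{
	assert(img != NULL && obj != NULL);
	pthread_mutex_lock(&img->object_mutex);
	obj->next = img->head;
	img->head = obj;
	pthread_mutex_unlock(&img->object_mutex);
}

/*
 * Sum the lengths of all queued objects; a negative length counts as an
 * error.  Returns 0 on success and -1 on error.  The lock is dropped on
 * both paths through the single out_unlock label.
 */
static int img_req_total_len(struct img_req *img, int *total)
{
	struct obj_req *obj;
	int ret = 0;

	assert(img != NULL && total != NULL);
	*total = 0;
	pthread_mutex_lock(&img->object_mutex);
	for (obj = img->head; obj; obj = obj->next) {
		if (obj->len < 0) {
			ret = -1;
			goto out_unlock;
		}
		*total += obj->len;
	}
out_unlock:
	pthread_mutex_unlock(&img->object_mutex);
	return ret;
}
```

The single unlock label mirrors the out_unlock label the patch adds to
rbd_img_fill_request(); the other functions touched by the patch unlock
inline before each early return instead.
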
Signed-off-by: Hannes Reinecke
---
 drivers/block/rbd.c | 36 ++++++++++++++++++++++++++++--------
 1 file changed, 28 insertions(+), 8 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 5710b2a8609c..db80b964d8ea 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -344,6 +344,7 @@ struct rbd_img_request {
 	struct list_head lock_item;
 	struct list_head object_extents; /* obj_req.ex structs */
+	struct mutex object_mutex;
 	struct mutex state_mutex;
 	struct pending_result pending;
@@ -1664,6 +1665,7 @@ static struct rbd_img_request *rbd_img_request_create(
 	INIT_LIST_HEAD(&img_request->lock_item);
 	INIT_LIST_HEAD(&img_request->object_extents);
 	mutex_init(&img_request->state_mutex);
+	mutex_init(&img_request->object_mutex);
 	kref_init(&img_request->kref);
 	return img_request;
@@ -1680,8 +1682,10 @@ static void rbd_img_request_destroy(struct kref *kref)
 	dout("%s: img %p\n", __func__, img_request);
 	WARN_ON(!list_empty(&img_request->lock_item));
+	mutex_lock(&img_request->object_mutex);
 	for_each_obj_request_safe(img_request, obj_request, next_obj_request)
 		rbd_img_obj_request_del(img_request, obj_request);
+	mutex_unlock(&img_request->object_mutex);
 	if (img_request_layered_test(img_request)) {
 		img_request_layered_clear(img_request);
@@ -2486,6 +2490,7 @@ static int __rbd_img_fill_request(struct rbd_img_request *img_req)
 	struct rbd_obj_request *obj_req, *next_obj_req;
 	int ret;
+	mutex_lock(&img_req->object_mutex);
 	for_each_obj_request_safe(img_req, obj_req, next_obj_req) {
 		switch (img_req->op_type) {
 		case OBJ_OP_READ:
@@ -2503,14 +2508,16 @@ static int __rbd_img_fill_request(struct rbd_img_request *img_req)
 		default:
 			BUG();
 		}
-		if (ret < 0)
+		if (ret < 0) {
+			mutex_unlock(&img_req->object_mutex);
 			return ret;
+		}
 		if (ret > 0) {
 			rbd_img_obj_request_del(img_req, obj_req);
 			continue;
 		}
 	}
-
+	mutex_unlock(&img_req->object_mutex);
 	img_req->state = RBD_IMG_START;
 	return 0;
 }
@@ -2569,6 +2576,7 @@ static int rbd_img_fill_request_nocopy(struct rbd_img_request
*img_req,
 	 * position in the provided bio (list) or bio_vec array.
 	 */
 	fctx->iter = *fctx->pos;
+	mutex_lock(&img_req->object_mutex);
 	for (i = 0; i < num_img_extents; i++) {
 		ret = ceph_file_to_extents(&img_req->rbd_dev->layout,
 					   img_extents[i].fe_off,
@@ -2576,10 +2584,12 @@ static int rbd_img_fill_request_nocopy(struct rbd_img_request *img_req,
 					   &img_req->object_extents,
 					   alloc_object_extent, img_req,
 					   fctx->set_pos_fn, &fctx->iter);
-		if (ret)
+		if (ret) {
+			mutex_unlock(&img_req->object_mutex);
 			return ret;
+		}
 	}
-
+	mutex_unlock(&img_req->object_mutex);
 	return __rbd_img_fill_request(img_req);
 }
@@ -2620,6 +2630,7 @@ static int rbd_img_fill_request(struct rbd_img_request *img_req,
 	 * or bio_vec array because when mapped, those bio_vecs can straddle
 	 * stripe unit boundaries.
 	 */
+	mutex_lock(&img_req->object_mutex);
 	fctx->iter = *fctx->pos;
 	for (i = 0; i < num_img_extents; i++) {
 		ret = ceph_file_to_extents(&rbd_dev->layout,
@@ -2629,15 +2640,17 @@ static int rbd_img_fill_request(struct rbd_img_request *img_req,
 					   alloc_object_extent, img_req,
 					   fctx->count_fn, &fctx->iter);
 		if (ret)
-			return ret;
+			goto out_unlock;
 	}
 	for_each_obj_request(img_req, obj_req) {
 		obj_req->bvec_pos.bvecs = kmalloc_array(obj_req->bvec_count,
 					      sizeof(*obj_req->bvec_pos.bvecs),
 					      GFP_NOIO);
-		if (!obj_req->bvec_pos.bvecs)
-			return -ENOMEM;
+		if (!obj_req->bvec_pos.bvecs) {
+			ret = -ENOMEM;
+			goto out_unlock;
+		}
 	}
 	/*
@@ -2652,10 +2665,14 @@ static int rbd_img_fill_request(struct rbd_img_request *img_req,
 					   &img_req->object_extents,
 					   fctx->copy_fn, &fctx->iter);
 		if (ret)
-			return ret;
+			goto out_unlock;
 	}
+	mutex_unlock(&img_req->object_mutex);
 	return __rbd_img_fill_request(img_req);
+out_unlock:
+	mutex_unlock(&img_req->object_mutex);
+	return ret;
 }
 static int rbd_img_fill_nodata(struct rbd_img_request *img_req,
@@ -3552,18 +3569,21 @@ static void rbd_img_object_requests(struct rbd_img_request *img_req)
 	rbd_assert(!img_req->pending.result && !img_req->pending.num_pending);
+
mutex_lock(&img_req->object_mutex);
 	for_each_obj_request(img_req, obj_req) {
 		int result = 0;
 		if (__rbd_obj_handle_request(obj_req, &result)) {
 			if (result) {
 				img_req->pending.result = result;
+				mutex_unlock(&img_req->object_mutex);
 				return;
 			}
 		} else {
 			img_req->pending.num_pending++;
 		}
 	}
+	mutex_unlock(&img_req->object_mutex);
 }
 static bool rbd_img_advance(struct rbd_img_request *img_req, int *result)

From patchwork Fri Jan 31 10:37:26 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 11359593
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3122813A4
	for ; Fri, 31 Jan 2020 10:38:20 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 1951A206F0
	for ; Fri, 31 Jan 2020 10:38:20 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1728384AbgAaKiE (ORCPT ); Fri, 31 Jan 2020 05:38:04 -0500
Received: from mx2.suse.de ([195.135.220.15]:55782 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1728325AbgAaKiD (ORCPT ); Fri, 31 Jan 2020 05:38:03 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254])
	by mx2.suse.de (Postfix) with ESMTP id 89986AF2D;
	Fri, 31 Jan 2020 10:38:01 +0000 (UTC)
From: Hannes Reinecke
To: Ilya Dryomov
Cc: Sage Weil, Daniel Disseldorp, Jens Axboe,
	ceph-devel@vger.kernel.org, linux-block@vger.kernel.org,
	Hannes Reinecke
Subject: [PATCH 02/15] rbd: use READ_ONCE() when checking the mapping size
Date: Fri, 31 Jan 2020 11:37:26 +0100
Message-Id: <20200131103739.136098-3-hare@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200131103739.136098-1-hare@suse.de>
References: <20200131103739.136098-1-hare@suse.de>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

The mapping size changes only very infrequently, so we don't need to take
the header rwsem just to check it; READ_ONCE() is sufficient here. This
keeps the rwsem out of the read hot path; writes still take it briefly to
pick up the snapshot context.

Signed-off-by: Hannes Reinecke
---
 drivers/block/rbd.c | 12 ++++++------
 1 file changed, 6 insertions(+), 6 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index db80b964d8ea..792180548e89 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -4788,13 +4788,13 @@ static void rbd_queue_workfn(struct work_struct *work)
 	blk_mq_start_request(rq);
-	down_read(&rbd_dev->header_rwsem);
-	mapping_size = rbd_dev->mapping.size;
+	mapping_size = READ_ONCE(rbd_dev->mapping.size);
 	if (op_type != OBJ_OP_READ) {
+		down_read(&rbd_dev->header_rwsem);
 		snapc = rbd_dev->header.snapc;
 		ceph_get_snap_context(snapc);
+		up_read(&rbd_dev->header_rwsem);
 	}
-	up_read(&rbd_dev->header_rwsem);
 	if (offset + length > mapping_size) {
 		rbd_warn(rbd_dev, "beyond EOD (%llu~%llu > %llu)", offset,
@@ -4981,9 +4981,9 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev)
 	u64 mapping_size;
 	int ret;
-	down_write(&rbd_dev->header_rwsem);
-	mapping_size = rbd_dev->mapping.size;
+	mapping_size = READ_ONCE(rbd_dev->mapping.size);
+	down_write(&rbd_dev->header_rwsem);
 	ret = rbd_dev_header_info(rbd_dev);
 	if (ret)
 		goto out;
@@ -4999,7 +4999,7 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev)
 	}
 	rbd_assert(!rbd_is_snap(rbd_dev));
-	rbd_dev->mapping.size = rbd_dev->header.image_size;
+	WRITE_ONCE(rbd_dev->mapping.size, rbd_dev->header.image_size);
 out:
 	up_write(&rbd_dev->header_rwsem);

From patchwork Fri Jan 31 10:37:27 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 11359605
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by
	pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id BA210159A
	for ; Fri, 31 Jan 2020 10:38:23 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id A2F0220CC7
	for ; Fri, 31 Jan 2020 10:38:23 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1728344AbgAaKiD (ORCPT ); Fri, 31 Jan 2020 05:38:03 -0500
Received: from mx2.suse.de ([195.135.220.15]:55706 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1728301AbgAaKiD (ORCPT ); Fri, 31 Jan 2020 05:38:03 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254])
	by mx2.suse.de (Postfix) with ESMTP id 899BCAF41;
	Fri, 31 Jan 2020 10:38:01 +0000 (UTC)
From: Hannes Reinecke
To: Ilya Dryomov
Cc: Sage Weil, Daniel Disseldorp, Jens Axboe,
	ceph-devel@vger.kernel.org, linux-block@vger.kernel.org,
	Hannes Reinecke
Subject: [PATCH 03/15] rbd: reorder rbd_img_advance()
Date: Fri, 31 Jan 2020 11:37:27 +0100
Message-Id: <20200131103739.136098-4-hare@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200131103739.136098-1-hare@suse.de>
References: <20200131103739.136098-1-hare@suse.de>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Reorder the switch statement to avoid the use of a label/goto, and add an
RBD_IMG_DONE state to signal that the state machine has completed.

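The shape of the reordered state machine may be easier to see outside the
driver. The following is a userspace sketch only, under stated assumptions:
demo_state, demo_req and demo_advance() are hypothetical names, lock_ret
stands in for the return value of rbd_img_exclusive_lock(), and a plain
counter replaces pending_result_dec(). Each case sets the next state before
doing its work, falls through when it can continue synchronously, and parks
in an explicit DONE state once finished.

```c
#include <assert.h>
#include <stdbool.h>

enum demo_state {
	DEMO_DONE,	/* terminal state, listed first so it has value 0 */
	DEMO_START,
	DEMO_LOCK,
	DEMO_REQUESTS,
};

struct demo_req {
	enum demo_state state;
	int lock_ret;	/* <0 error, 0 wait for completion, >0 already held */
	int pending;	/* outstanding sub-requests */
};

/* Returns true once the request has completed; *result carries the status. */
static bool demo_advance(struct demo_req *req, int *result)
{
	assert(req != 0 && result != 0);

	switch (req->state) {
	case DEMO_START:
		req->state = DEMO_LOCK;
		if (req->lock_ret < 0) {
			*result = req->lock_ret;
			req->state = DEMO_DONE;
			return true;
		}
		if (req->lock_ret == 0)
			return false;	/* re-entered when the lock arrives */
		/* fall through */
	case DEMO_LOCK:
		if (*result) {
			req->state = DEMO_DONE;
			return true;
		}
		req->state = DEMO_REQUESTS;
		if (!req->pending) {
			req->state = DEMO_DONE;
			return true;
		}
		return false;
	case DEMO_REQUESTS:
		if (--req->pending)
			return false;	/* still waiting for sub-requests */
		req->state = DEMO_DONE;
		/* fall through */
	case DEMO_DONE:
		return true;
	}
	return true;
}
```

Because every exit that reports completion leaves the request in DEMO_DONE,
a stray extra call is harmless and a debug print of the state is
unambiguous, which the old START = 1 numbering without a terminal state did
not offer.
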
Signed-off-by: Hannes Reinecke
---
 drivers/block/rbd.c | 30 +++++++++++++++++-------------
 1 file changed, 17 insertions(+), 13 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 792180548e89..c80942e08164 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -321,9 +321,9 @@ enum img_req_flags {
 };
 enum rbd_img_state {
-	RBD_IMG_START = 1,
+	RBD_IMG_DONE,
+	RBD_IMG_START,
 	RBD_IMG_EXCLUSIVE_LOCK,
-	__RBD_IMG_OBJECT_REQUESTS,
 	RBD_IMG_OBJECT_REQUESTS,
 };
@@ -3591,40 +3591,44 @@ static bool rbd_img_advance(struct rbd_img_request *img_req, int *result)
 	struct rbd_device *rbd_dev = img_req->rbd_dev;
 	int ret;
-again:
+	dout("%s: img %p state %d\n", __func__, img_req, img_req->state);
 	switch (img_req->state) {
 	case RBD_IMG_START:
 		rbd_assert(!*result);
+		img_req->state = RBD_IMG_EXCLUSIVE_LOCK;
 		ret = rbd_img_exclusive_lock(img_req);
 		if (ret < 0) {
 			*result = ret;
+			img_req->state = RBD_IMG_DONE;
 			return true;
 		}
-		img_req->state = RBD_IMG_EXCLUSIVE_LOCK;
-		if (ret > 0)
-			goto again;
-		return false;
+		if (ret == 0)
+			return false;
+		/* fall through */
 	case RBD_IMG_EXCLUSIVE_LOCK:
-		if (*result)
+		if (*result) {
+			img_req->state = RBD_IMG_DONE;
 			return true;
+		}
 		rbd_assert(!need_exclusive_lock(img_req) ||
 			   __rbd_is_lock_owner(rbd_dev));
+		img_req->state = RBD_IMG_OBJECT_REQUESTS;
 		rbd_img_object_requests(img_req);
 		if (!img_req->pending.num_pending) {
 			*result = img_req->pending.result;
-			img_req->state = RBD_IMG_OBJECT_REQUESTS;
-			goto again;
+			img_req->state = RBD_IMG_DONE;
+			return true;
 		}
-		img_req->state = __RBD_IMG_OBJECT_REQUESTS;
 		return false;
-	case __RBD_IMG_OBJECT_REQUESTS:
+	case RBD_IMG_OBJECT_REQUESTS:
 		if (!pending_result_dec(&img_req->pending, result))
 			return false;
+		img_req->state = RBD_IMG_DONE;
 		/* fall through */
-	case RBD_IMG_OBJECT_REQUESTS:
+	case RBD_IMG_DONE:
 		return true;
 	default:
 		BUG();

From patchwork Fri Jan 31 10:37:28 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 11359595
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 3AEFB112B
	for ; Fri, 31 Jan 2020 10:38:21 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 236FB206F0
	for ; Fri, 31 Jan 2020 10:38:21 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1728365AbgAaKiE (ORCPT ); Fri, 31 Jan 2020 05:38:04 -0500
Received: from mx2.suse.de ([195.135.220.15]:55736 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1728336AbgAaKiD (ORCPT ); Fri, 31 Jan 2020 05:38:03 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254])
	by mx2.suse.de (Postfix) with ESMTP id 89C9BAF43;
	Fri, 31 Jan 2020 10:38:01 +0000 (UTC)
From: Hannes Reinecke
To: Ilya Dryomov
Cc: Sage Weil, Daniel Disseldorp, Jens Axboe,
	ceph-devel@vger.kernel.org, linux-block@vger.kernel.org,
	Hannes Reinecke
Subject: [PATCH 04/15] rbd: reorder switch statement in rbd_advance_read()
Date: Fri, 31 Jan 2020 11:37:28 +0100
Message-Id: <20200131103739.136098-5-hare@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200131103739.136098-1-hare@suse.de>
References: <20200131103739.136098-1-hare@suse.de>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Reorder the switch statement to avoid the use of a label/goto statement,
and add a 'done' state to indicate that the state machine has completed.

Signed-off-by: Hannes Reinecke
---
 drivers/block/rbd.c | 31 +++++++++++++++++++------------
 1 file changed, 19 insertions(+), 12 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index c80942e08164..4d7857667e9c 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -235,7 +235,8 @@ enum obj_operation_type {
 #define RBD_OBJ_FLAG_NOOP_FOR_NONEXISTENT (1U << 4)
 enum rbd_obj_read_state {
-	RBD_OBJ_READ_START = 1,
+	RBD_OBJ_READ_DONE,
+	RBD_OBJ_READ_START,
 	RBD_OBJ_READ_OBJECT,
 	RBD_OBJ_READ_PARENT,
 };
@@ -2924,36 +2925,38 @@ static bool rbd_obj_advance_read(struct rbd_obj_request *obj_req, int *result)
 	struct rbd_device *rbd_dev = obj_req->img_request->rbd_dev;
 	int ret;
-again:
+	dout("%s: obj %p state %d\n", __func__, obj_req, obj_req->read_state);
 	switch (obj_req->read_state) {
 	case RBD_OBJ_READ_START:
 		rbd_assert(!*result);
-		if (!rbd_obj_may_exist(obj_req)) {
-			*result = -ENOENT;
+		if (rbd_obj_may_exist(obj_req)) {
+			ret = rbd_obj_read_object(obj_req);
+			if (ret) {
+				*result = ret;
+				obj_req->read_state = RBD_OBJ_READ_DONE;
+				return true;
+			}
 			obj_req->read_state = RBD_OBJ_READ_OBJECT;
-			goto again;
-		}
-
-		ret = rbd_obj_read_object(obj_req);
-		if (ret) {
-			*result = ret;
-			return true;
+			return false;
 		}
+		*result = -ENOENT;
 		obj_req->read_state = RBD_OBJ_READ_OBJECT;
-		return false;
+		/* fall through */
 	case RBD_OBJ_READ_OBJECT:
 		if (*result == -ENOENT && rbd_dev->parent_overlap) {
 			/* reverse map this object extent onto the parent */
 			ret = rbd_obj_calc_img_extents(obj_req, false);
 			if (ret) {
 				*result = ret;
+				obj_req->read_state = RBD_OBJ_READ_DONE;
 				return true;
 			}
 			if (obj_req->num_img_extents) {
 				ret = rbd_obj_read_from_parent(obj_req);
 				if (ret) {
 					*result = ret;
+					obj_req->read_state = RBD_OBJ_READ_DONE;
 					return true;
 				}
 				obj_req->read_state = RBD_OBJ_READ_PARENT;
@@ -2977,6 +2980,7 @@ static bool rbd_obj_advance_read(struct rbd_obj_request *obj_req, int *result)
 			rbd_assert(*result == obj_req->ex.oe_len);
 			*result = 0;
 		}
+		obj_req->read_state =
RBD_OBJ_READ_DONE;
 		return true;
 	case RBD_OBJ_READ_PARENT:
 		/*
@@ -2990,6 +2994,9 @@ static bool rbd_obj_advance_read(struct rbd_obj_request *obj_req, int *result)
 				rbd_obj_zero_range(obj_req, obj_overlap,
 					    obj_req->ex.oe_len - obj_overlap);
 		}
+		obj_req->read_state = RBD_OBJ_READ_DONE;
+		/* fall through */
+	case RBD_OBJ_READ_DONE:
 		return true;
 	default:
 		BUG();

From patchwork Fri Jan 31 10:37:29 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 11359569
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 4BE47112B
	for ; Fri, 31 Jan 2020 10:38:12 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id 3452E206F0
	for ; Fri, 31 Jan 2020 10:38:12 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1728406AbgAaKiG (ORCPT ); Fri, 31 Jan 2020 05:38:06 -0500
Received: from mx2.suse.de ([195.135.220.15]:55858 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1728390AbgAaKiF (ORCPT ); Fri, 31 Jan 2020 05:38:05 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254])
	by mx2.suse.de (Postfix) with ESMTP id E189CB011;
	Fri, 31 Jan 2020 10:38:01 +0000 (UTC)
From: Hannes Reinecke
To: Ilya Dryomov
Cc: Sage Weil, Daniel Disseldorp, Jens Axboe,
	ceph-devel@vger.kernel.org, linux-block@vger.kernel.org,
	Hannes Reinecke
Subject: [PATCH 05/15] rbd: reorder switch statement in rbd_advance_write()
Date: Fri, 31 Jan 2020 11:37:29 +0100
Message-Id: <20200131103739.136098-6-hare@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200131103739.136098-1-hare@suse.de>
References: <20200131103739.136098-1-hare@suse.de>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List:
	ceph-devel@vger.kernel.org

Reorder the switch statement to avoid the use of a label/goto statement,
and add a 'done' state to indicate that the state machine has completed.

Signed-off-by: Hannes Reinecke
---
 drivers/block/rbd.c | 36 +++++++++++++++++++++++++-----------
 1 file changed, 25 insertions(+), 11 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 4d7857667e9c..766d67e4d5e5 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -267,7 +267,8 @@ enum rbd_obj_read_state {
  * even if there is a parent).
  */
 enum rbd_obj_write_state {
-	RBD_OBJ_WRITE_START = 1,
+	RBD_OBJ_WRITE_DONE,
+	RBD_OBJ_WRITE_START,
 	RBD_OBJ_WRITE_PRE_OBJECT_MAP,
 	RBD_OBJ_WRITE_OBJECT,
 	__RBD_OBJ_WRITE_COPYUP,
@@ -3382,31 +3383,37 @@ static bool rbd_obj_advance_write(struct rbd_obj_request *obj_req, int *result)
 	int ret;
 again:
+	dout("%s: obj %p state %d\n", __func__, obj_req, obj_req->write_state);
 	switch (obj_req->write_state) {
 	case RBD_OBJ_WRITE_START:
 		rbd_assert(!*result);
-		if (rbd_obj_write_is_noop(obj_req))
+		if (rbd_obj_write_is_noop(obj_req)) {
+			obj_req->write_state = RBD_OBJ_WRITE_DONE;
 			return true;
+		}
 		ret = rbd_obj_write_pre_object_map(obj_req);
 		if (ret < 0) {
 			*result = ret;
+			obj_req->write_state = RBD_OBJ_WRITE_DONE;
 			return true;
 		}
 		obj_req->write_state = RBD_OBJ_WRITE_PRE_OBJECT_MAP;
-		if (ret > 0)
-			goto again;
-		return false;
+		if (ret == 0)
+			return false;
+		/* fall through */
 	case RBD_OBJ_WRITE_PRE_OBJECT_MAP:
 		if (*result) {
 			rbd_warn(rbd_dev, "pre object map update failed: %d",
 				 *result);
+			obj_req->write_state = RBD_OBJ_WRITE_DONE;
 			return true;
 		}
 		ret = rbd_obj_write_object(obj_req);
 		if (ret) {
 			*result = ret;
+			obj_req->write_state = RBD_OBJ_WRITE_DONE;
 			return true;
 		}
 		obj_req->write_state = RBD_OBJ_WRITE_OBJECT;
@@ -3426,33 +3433,40 @@ static bool rbd_obj_advance_write(struct rbd_obj_request *obj_req, int *result)
 			if (obj_req->flags & RBD_OBJ_FLAG_DELETION)
 				*result = 0;
 		}
-		if (*result)
+		if (*result) {
+			obj_req->write_state = RBD_OBJ_WRITE_DONE;
 			return true;
-
+		}
 		obj_req->write_state = RBD_OBJ_WRITE_COPYUP;
 		goto again;
 	case __RBD_OBJ_WRITE_COPYUP:
 		if (!rbd_obj_advance_copyup(obj_req, result))
 			return false;
+		obj_req->write_state = RBD_OBJ_WRITE_COPYUP;
 		/* fall through */
 	case RBD_OBJ_WRITE_COPYUP:
 		if (*result) {
 			rbd_warn(rbd_dev, "copyup failed: %d", *result);
+			obj_req->write_state = RBD_OBJ_WRITE_DONE;
 			return true;
 		}
+		obj_req->write_state = RBD_OBJ_WRITE_POST_OBJECT_MAP;
 		ret = rbd_obj_write_post_object_map(obj_req);
 		if (ret < 0) {
 			*result = ret;
+			obj_req->write_state = RBD_OBJ_WRITE_DONE;
 			return true;
 		}
-		obj_req->write_state = RBD_OBJ_WRITE_POST_OBJECT_MAP;
-		if (ret > 0)
-			goto again;
-		return false;
+		if (ret == 0)
+			return false;
+		/* fall through */
 	case RBD_OBJ_WRITE_POST_OBJECT_MAP:
 		if (*result)
 			rbd_warn(rbd_dev, "post object map update failed: %d",
 				 *result);
+		obj_req->write_state = RBD_OBJ_WRITE_DONE;
+		/* fall through */
+	case RBD_OBJ_WRITE_DONE:
 		return true;
 	default:
 		BUG();

From patchwork Fri Jan 31 10:37:30 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 11359581
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id D4EB513A4
	for ; Fri, 31 Jan 2020 10:38:15 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id BD57C214D8
	for ; Fri, 31 Jan 2020 10:38:15 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1728400AbgAaKiG (ORCPT ); Fri, 31 Jan 2020 05:38:06 -0500
Received: from mx2.suse.de ([195.135.220.15]:55852 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1728383AbgAaKiF (ORCPT ); Fri, 31 Jan 2020 05:38:05 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix)
	with ESMTP id D62A5B001; Fri, 31 Jan 2020 10:38:01 +0000 (UTC)
From: Hannes Reinecke
To: Ilya Dryomov
Cc: Sage Weil, Daniel Disseldorp, Jens Axboe,
	ceph-devel@vger.kernel.org, linux-block@vger.kernel.org,
	Hannes Reinecke
Subject: [PATCH 06/15] rbd: add 'done' state for rbd_obj_advance_copyup()
Date: Fri, 31 Jan 2020 11:37:30 +0100
Message-Id: <20200131103739.136098-7-hare@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200131103739.136098-1-hare@suse.de>
References: <20200131103739.136098-1-hare@suse.de>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Rename 'RBD_OBJ_COPYUP_WRITE_OBJECT' to 'RBD_OBJ_COPYUP_DONE' to signal
that the state machine has completed. With that we can rename
'__RBD_OBJ_COPYUP_WRITE_OBJECT' to 'RBD_OBJ_COPYUP_WRITE_OBJECT'.

Signed-off-by: Hannes Reinecke
---
 drivers/block/rbd.c | 25 ++++++++++++++++---------
 1 file changed, 16 insertions(+), 9 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index 766d67e4d5e5..c31507a5fdd2 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -277,11 +277,11 @@ enum rbd_obj_write_state {
 };
 enum rbd_obj_copyup_state {
-	RBD_OBJ_COPYUP_START = 1,
+	RBD_OBJ_COPYUP_DONE,
+	RBD_OBJ_COPYUP_START,
 	RBD_OBJ_COPYUP_READ_PARENT,
 	__RBD_OBJ_COPYUP_OBJECT_MAPS,
 	RBD_OBJ_COPYUP_OBJECT_MAPS,
-	__RBD_OBJ_COPYUP_WRITE_OBJECT,
 	RBD_OBJ_COPYUP_WRITE_OBJECT,
 };
@@ -3294,6 +3294,8 @@ static bool rbd_obj_advance_copyup(struct rbd_obj_request *obj_req, int *result)
 	int ret;
 again:
+	dout("%s: obj %p copyup %d pending %d\n", __func__,
+	     obj_req, obj_req->copyup_state, obj_req->pending.num_pending);
 	switch (obj_req->copyup_state) {
 	case RBD_OBJ_COPYUP_START:
 		rbd_assert(!*result);
@@ -3301,17 +3303,19 @@ static bool rbd_obj_advance_copyup(struct rbd_obj_request *obj_req, int *result)
 		ret = rbd_obj_copyup_read_parent(obj_req);
 		if (ret) {
 			*result = ret;
+			obj_req->copyup_state = RBD_OBJ_COPYUP_DONE;
 			return true;
 		}
 		if
(obj_req->num_img_extents)
 			obj_req->copyup_state = RBD_OBJ_COPYUP_READ_PARENT;
 		else
-			obj_req->copyup_state = RBD_OBJ_COPYUP_WRITE_OBJECT;
+			obj_req->copyup_state = RBD_OBJ_COPYUP_DONE;
 		return false;
 	case RBD_OBJ_COPYUP_READ_PARENT:
-		if (*result)
+		if (*result) {
+			obj_req->copyup_state = RBD_OBJ_COPYUP_DONE;
 			return true;
-
+		}
 		if (is_zero_bvecs(obj_req->copyup_bvecs,
 				  rbd_obj_img_extents_bytes(obj_req))) {
 			dout("%s %p detected zeros\n", __func__, obj_req);
@@ -3329,27 +3333,30 @@ static bool rbd_obj_advance_copyup(struct rbd_obj_request *obj_req, int *result)
 	case __RBD_OBJ_COPYUP_OBJECT_MAPS:
 		if (!pending_result_dec(&obj_req->pending, result))
 			return false;
+		obj_req->copyup_state = RBD_OBJ_COPYUP_OBJECT_MAPS;
 		/* fall through */
 	case RBD_OBJ_COPYUP_OBJECT_MAPS:
 		if (*result) {
 			rbd_warn(rbd_dev, "snap object map update failed: %d",
 				 *result);
+			obj_req->copyup_state = RBD_OBJ_COPYUP_DONE;
 			return true;
 		}
 		rbd_obj_copyup_write_object(obj_req);
 		if (!obj_req->pending.num_pending) {
 			*result = obj_req->pending.result;
-			obj_req->copyup_state = RBD_OBJ_COPYUP_WRITE_OBJECT;
+			obj_req->copyup_state = RBD_OBJ_COPYUP_DONE;
 			goto again;
 		}
-		obj_req->copyup_state = __RBD_OBJ_COPYUP_WRITE_OBJECT;
+		obj_req->copyup_state = RBD_OBJ_COPYUP_WRITE_OBJECT;
 		return false;
-	case __RBD_OBJ_COPYUP_WRITE_OBJECT:
+	case RBD_OBJ_COPYUP_WRITE_OBJECT:
 		if (!pending_result_dec(&obj_req->pending, result))
 			return false;
+		obj_req->copyup_state = RBD_OBJ_COPYUP_DONE;
 		/* fall through */
-	case RBD_OBJ_COPYUP_WRITE_OBJECT:
+	case RBD_OBJ_COPYUP_DONE:
 		return true;
 	default:
 		BUG();

From patchwork Fri Jan 31 10:37:31 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Hannes Reinecke
X-Patchwork-Id: 11359559
Return-Path:
Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123])
	by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id E1876159A
	for ; Fri, 31 Jan 2020 10:38:07 +0000 (UTC)
Received: from
	vger.kernel.org (vger.kernel.org [209.132.180.67])
	by mail.kernel.org (Postfix) with ESMTP id C9FAC20CC7
	for ; Fri, 31 Jan 2020 10:38:07 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1728414AbgAaKiH (ORCPT ); Fri, 31 Jan 2020 05:38:07 -0500
Received: from mx2.suse.de ([195.135.220.15]:55850 "EHLO mx2.suse.de"
	rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP
	id S1728380AbgAaKiF (ORCPT ); Fri, 31 Jan 2020 05:38:05 -0500
X-Virus-Scanned: by amavisd-new at test-mx.suse.de
Received: from relay2.suse.de (unknown [195.135.220.254])
	by mx2.suse.de (Postfix) with ESMTP id DB623B004;
	Fri, 31 Jan 2020 10:38:01 +0000 (UTC)
From: Hannes Reinecke
To: Ilya Dryomov
Cc: Sage Weil, Daniel Disseldorp, Jens Axboe,
	ceph-devel@vger.kernel.org, linux-block@vger.kernel.org,
	Hannes Reinecke
Subject: [PATCH 07/15] rbd: use callback for image request completion
Date: Fri, 31 Jan 2020 11:37:31 +0100
Message-Id: <20200131103739.136098-8-hare@suse.de>
X-Mailer: git-send-email 2.16.4
In-Reply-To: <20200131103739.136098-1-hare@suse.de>
References: <20200131103739.136098-1-hare@suse.de>
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Use callbacks to simplify the code and to separate the completion paths
for parent and child requests.

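As an illustration of the callback approach, here is a small userspace
sketch. It is not the driver code: blk_rq, parent_obj and the function names
are hypothetical stand-ins for struct request, rbd_obj_request and the
rbd_img_end_*() helpers. The request carries a function pointer plus an
opaque data pointer, so the common completion path no longer has to test a
flag to find out who started it.

```c
#include <assert.h>
#include <stddef.h>

struct img_req;
typedef void (*img_req_cb_t)(struct img_req *img, int result);

struct img_req {
	img_req_cb_t callback;	/* how to complete this request */
	void *callback_data;	/* initiator: block request or parent object */
};

/* Hypothetical initiators, reduced to the fields the demo needs. */
struct blk_rq {
	int status;
	int ended;
};

struct parent_obj {
	int result;
	int resumed;
};

/* Completion for a request that came from the block layer. */
static void end_block_request(struct img_req *img, int result)
{
	struct blk_rq *rq = img->callback_data;

	rq->status = result;
	rq->ended = 1;
}

/* Completion for a child request issued on behalf of a parent object. */
static void end_child_request(struct img_req *img, int result)
{
	struct parent_obj *obj = img->callback_data;

	obj->result = result;
	obj->resumed = 1;
}

/* Common completion path: dispatch without knowing the initiator type. */
static void img_req_complete(struct img_req *img, int result)
{
	assert(img != NULL && img->callback != NULL);
	img->callback(img, result);
}
```

The trade-off is that callback_data is an untyped pointer, so the pairing of
callback and data has to be kept correct by hand wherever a request is set
up.
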
Suggested-by: David Disseldorp
Signed-off-by: Hannes Reinecke
---
 drivers/block/rbd.c | 61 +++++++++++++++++++++++++++++------------------------
 1 file changed, 33 insertions(+), 28 deletions(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index c31507a5fdd2..8cfd9407cbb8 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -317,6 +317,9 @@ struct rbd_obj_request {
 	struct kref kref;
 };
+typedef void (*rbd_img_request_cb_t)(struct rbd_img_request *img_request,
+				     int result);
+
 enum img_req_flags {
 	IMG_REQ_CHILD, /* initiator: block = 0, child image = 1 */
 	IMG_REQ_LAYERED, /* ENOENT handling: normal = 0, layered = 1 */
@@ -339,11 +342,8 @@ struct rbd_img_request {
 		u64 snap_id; /* for reads */
 		struct ceph_snap_context *snapc; /* for writes */
 	};
-	union {
-		struct request *rq; /* block request */
-		struct rbd_obj_request *obj_request; /* obj req initiator */
-	};
-
+	void *callback_data;
+	rbd_img_request_cb_t callback;
 	struct list_head lock_item;
 	struct list_head object_extents; /* obj_req.ex structs */
 	struct mutex object_mutex;
@@ -506,6 +506,8 @@ static ssize_t add_single_major_store(struct bus_type *bus, const char *buf,
 static ssize_t remove_single_major_store(struct bus_type *bus, const char *buf,
 					 size_t count);
 static int rbd_dev_image_probe(struct rbd_device *rbd_dev, int depth);
+static void rbd_img_end_child_request(struct rbd_img_request *img_req,
+				      int result);
 static int rbd_dev_id_to_minor(int dev_id)
 {
@@ -2882,7 +2884,8 @@ static int rbd_obj_read_from_parent(struct rbd_obj_request *obj_req)
 		return -ENOMEM;
 	__set_bit(IMG_REQ_CHILD, &child_img_req->flags);
-	child_img_req->obj_request = obj_req;
+	child_img_req->callback = rbd_img_end_child_request;
+	child_img_req->callback_data = obj_req;
 	dout("%s child_img_req %p for obj_req %p\n", __func__, child_img_req,
 	     obj_req);
@@ -3506,14 +3509,12 @@ static bool __rbd_obj_handle_request(struct rbd_obj_request *obj_req,
 	return done;
 }
-/*
- * This is open-coded in rbd_img_handle_request() to
avoid parent chain
- * recursion.
- */
 static void rbd_obj_handle_request(struct rbd_obj_request *obj_req, int result)
 {
-	if (__rbd_obj_handle_request(obj_req, &result))
+	if (__rbd_obj_handle_request(obj_req, &result)) {
+		/* Recurse into parent */
 		rbd_img_handle_request(obj_req->img_request, result);
+	}
 }
 static bool need_exclusive_lock(struct rbd_img_request *img_req)
@@ -3695,26 +3696,29 @@ static bool __rbd_img_handle_request(struct rbd_img_request *img_req,
 	return done;
 }
-static void rbd_img_handle_request(struct rbd_img_request *img_req, int result)
+static void rbd_img_end_child_request(struct rbd_img_request *img_req,
+				      int result)
 {
-again:
-	if (!__rbd_img_handle_request(img_req, &result))
-		return;
+	struct rbd_obj_request *obj_req = img_req->callback_data;
-	if (test_bit(IMG_REQ_CHILD, &img_req->flags)) {
-		struct rbd_obj_request *obj_req = img_req->obj_request;
+	rbd_img_request_put(img_req);
+	rbd_obj_handle_request(obj_req, result);
+}
-		rbd_img_request_put(img_req);
-		if (__rbd_obj_handle_request(obj_req, &result)) {
-			img_req = obj_req->img_request;
-			goto again;
-		}
-	} else {
-		struct request *rq = img_req->rq;
+static void rbd_img_end_request(struct rbd_img_request *img_req, int result)
+{
+	struct request *rq = img_req->callback_data;
-		rbd_img_request_put(img_req);
-		blk_mq_end_request(rq, errno_to_blk_status(result));
-	}
+	rbd_img_request_put(img_req);
+	blk_mq_end_request(rq, errno_to_blk_status(result));
+}
+
+void rbd_img_handle_request(struct rbd_img_request *img_req, int result)
+{
+	if (!__rbd_img_handle_request(img_req, &result))
+		return;
+
+	img_req->callback(img_req, result);
 }
 static const struct rbd_client_id rbd_empty_cid;
@@ -4840,7 +4844,8 @@ static void rbd_queue_workfn(struct work_struct *work)
 		result = -ENOMEM;
 		goto err_rq;
 	}
-	img_request->rq = rq;
+	img_request->callback = rbd_img_end_request;
+	img_request->callback_data = rq;
 	snapc = NULL; /* img_request consumes a ref */
 	dout("%s rbd_dev %p img_req %p %s
%llu~%llu\n", __func__, rbd_dev, From patchwork Fri Jan 31 10:37:32 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hannes Reinecke X-Patchwork-Id: 11359601 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id AEEF813A4 for ; Fri, 31 Jan 2020 10:38:22 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 97384206F0 for ; Fri, 31 Jan 2020 10:38:22 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728351AbgAaKiD (ORCPT ); Fri, 31 Jan 2020 05:38:03 -0500 Received: from mx2.suse.de ([195.135.220.15]:55726 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728335AbgAaKiD (ORCPT ); Fri, 31 Jan 2020 05:38:03 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id B7622AFF6; Fri, 31 Jan 2020 10:38:01 +0000 (UTC) From: Hannes Reinecke To: Ilya Dryomov Cc: Sage Weil , Daniel Disseldorp , Jens Axboe , ceph-devel@vger.kernel.org, linux-block@vger.kernel.org, Hannes Reinecke Subject: [PATCH 08/15] rbd: add debugging statements for the state machine Date: Fri, 31 Jan 2020 11:37:32 +0100 Message-Id: <20200131103739.136098-9-hare@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200131103739.136098-1-hare@suse.de> References: <20200131103739.136098-1-hare@suse.de> Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add additional debugging statements to analyse the state machine. 
Signed-off-by: Hannes Reinecke --- drivers/block/rbd.c | 17 +++++++++++++++-- 1 file changed, 15 insertions(+), 2 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 8cfd9407cbb8..b708f5ecda07 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -291,6 +291,7 @@ struct rbd_obj_request { union { enum rbd_obj_read_state read_state; /* for reads */ enum rbd_obj_write_state write_state; /* for writes */ + unsigned char obj_state; /* generic access */ }; struct rbd_img_request *img_request; @@ -1352,8 +1353,12 @@ static inline void rbd_img_obj_request_add(struct rbd_img_request *img_request, static inline void rbd_img_obj_request_del(struct rbd_img_request *img_request, struct rbd_obj_request *obj_request) { - dout("%s: img %p obj %p\n", __func__, img_request, obj_request); - list_del(&obj_request->ex.oe_item); + dout("%s: img %p obj %p state %d copyup %d pending %d\n", __func__, + img_request, obj_request, obj_request->obj_state, + obj_request->copyup_state, obj_request->pending.num_pending); + WARN_ON(obj_request->obj_state > 1); + WARN_ON(obj_request->pending.num_pending); + list_del_init(&obj_request->ex.oe_item); rbd_assert(obj_request->img_request == img_request); rbd_obj_request_put(obj_request); } @@ -1497,6 +1502,8 @@ __rbd_obj_add_osd_request(struct rbd_obj_request *obj_req, req->r_callback = rbd_osd_req_callback; req->r_priv = obj_req; + dout("%s: osd_req %p for obj_req %p\n", __func__, req, obj_req); + /* * Data objects may be stored in a separate pool, but always in * the same namespace in that pool as the header in its pool. 
@@ -1686,6 +1693,7 @@ static void rbd_img_request_destroy(struct kref *kref) dout("%s: img %p\n", __func__, img_request); WARN_ON(!list_empty(&img_request->lock_item)); + WARN_ON(img_request->state != RBD_IMG_DONE); mutex_lock(&img_request->object_mutex); for_each_obj_request_safe(img_request, obj_request, next_obj_request) rbd_img_obj_request_del(img_request, obj_request); @@ -3513,6 +3521,8 @@ static void rbd_obj_handle_request(struct rbd_obj_request *obj_req, int result) { if (__rbd_obj_handle_request(obj_req, &result)) { /* Recurse into parent */ + dout("%s: obj %p parent %p result %d\n", __func__, + obj_req, obj_req->img_request, result); rbd_img_handle_request(obj_req->img_request, result); } } @@ -3603,6 +3613,9 @@ static void rbd_img_object_requests(struct rbd_img_request *img_req) int result = 0; if (__rbd_obj_handle_request(obj_req, &result)) { + dout("%s: obj %p parent %p img %p result %d\n", + __func__, obj_req, obj_req->img_request, + img_req, result); if (result) { img_req->pending.result = result; mutex_unlock(&img_req->object_mutex); From patchwork Fri Jan 31 10:37:33 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hannes Reinecke X-Patchwork-Id: 11359547 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6D1CB159A for ; Fri, 31 Jan 2020 10:38:05 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5565D206F0 for ; Fri, 31 Jan 2020 10:38:05 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728393AbgAaKiE (ORCPT ); Fri, 31 Jan 2020 05:38:04 -0500 Received: from mx2.suse.de ([195.135.220.15]:55716 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728330AbgAaKiD (ORCPT ); Fri, 31 Jan 2020 05:38:03 -0500 X-Virus-Scanned: by amavisd-new 
at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id B773FAFFC; Fri, 31 Jan 2020 10:38:01 +0000 (UTC) From: Hannes Reinecke To: Ilya Dryomov Cc: Sage Weil , Daniel Disseldorp , Jens Axboe , ceph-devel@vger.kernel.org, linux-block@vger.kernel.org, Hannes Reinecke Subject: [PATCH 09/15] rbd: count pending object requests in-line Date: Fri, 31 Jan 2020 11:37:33 +0100 Message-Id: <20200131103739.136098-10-hare@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200131103739.136098-1-hare@suse.de> References: <20200131103739.136098-1-hare@suse.de> Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Instead of having a counter for outstanding object requests check the state and count only those which are not in the final state. Signed-off-by: Hannes Reinecke --- drivers/block/rbd.c | 41 ++++++++++++++++++++++++++++++----------- 1 file changed, 30 insertions(+), 11 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index b708f5ecda07..a6c95b6e9c0c 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -350,7 +350,7 @@ struct rbd_img_request { struct mutex object_mutex; struct mutex state_mutex; - struct pending_result pending; + int pending_result; struct work_struct work; int work_result; struct kref kref; @@ -3602,11 +3602,12 @@ static int rbd_img_exclusive_lock(struct rbd_img_request *img_req) return 0; } -static void rbd_img_object_requests(struct rbd_img_request *img_req) +static int rbd_img_object_requests(struct rbd_img_request *img_req) { struct rbd_obj_request *obj_req; + int num_pending = 0; - rbd_assert(!img_req->pending.result && !img_req->pending.num_pending); + rbd_assert(!img_req->pending_result); mutex_lock(&img_req->object_mutex); for_each_obj_request(img_req, obj_req) { @@ -3617,15 +3618,33 @@ static void rbd_img_object_requests(struct rbd_img_request *img_req) __func__, obj_req, obj_req->img_request, 
img_req, result); if (result) { - img_req->pending.result = result; - mutex_unlock(&img_req->object_mutex); - return; + img_req->pending_result = result; + break; } } else { - img_req->pending.num_pending++; + num_pending++; } } mutex_unlock(&img_req->object_mutex); + return num_pending; +} + +static int rbd_img_object_requests_pending(struct rbd_img_request *img_req) +{ + struct rbd_obj_request *obj_req; + int num_pending = 0; + + mutex_lock(&img_req->object_mutex); + for_each_obj_request(img_req, obj_req) { + if (obj_req->obj_state > 1) + num_pending++; + else if (WARN_ON(obj_req->obj_state == 1)) + num_pending++; + else if (WARN_ON(obj_req->pending.num_pending)) + num_pending++; + } + mutex_unlock(&img_req->object_mutex); + return num_pending; } static bool rbd_img_advance(struct rbd_img_request *img_req, int *result) @@ -3658,16 +3677,16 @@ static bool rbd_img_advance(struct rbd_img_request *img_req, int *result) __rbd_is_lock_owner(rbd_dev)); img_req->state = RBD_IMG_OBJECT_REQUESTS; - rbd_img_object_requests(img_req); - if (!img_req->pending.num_pending) { - *result = img_req->pending.result; + if (!rbd_img_object_requests(img_req)) { + *result = img_req->pending_result; img_req->state = RBD_IMG_DONE; return true; } return false; case RBD_IMG_OBJECT_REQUESTS: - if (!pending_result_dec(&img_req->pending, result)) + if (rbd_img_object_requests_pending(img_req)) return false; + *result = img_req->pending_result; img_req->state = RBD_IMG_DONE; /* fall through */ case RBD_IMG_DONE: From patchwork Fri Jan 31 10:37:34 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hannes Reinecke X-Patchwork-Id: 11359563 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id C038B112B for ; Fri, 31 Jan 2020 10:38:09 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by 
mail.kernel.org (Postfix) with ESMTP id A8BA2206F0 for ; Fri, 31 Jan 2020 10:38:09 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728417AbgAaKiI (ORCPT ); Fri, 31 Jan 2020 05:38:08 -0500 Received: from mx2.suse.de ([195.135.220.15]:55862 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728325AbgAaKiF (ORCPT ); Fri, 31 Jan 2020 05:38:05 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id EE52AB01E; Fri, 31 Jan 2020 10:38:01 +0000 (UTC) From: Hannes Reinecke To: Ilya Dryomov Cc: Sage Weil , Daniel Disseldorp , Jens Axboe , ceph-devel@vger.kernel.org, linux-block@vger.kernel.org, Hannes Reinecke Subject: [PATCH 10/15] rbd: kill 'work_result' Date: Fri, 31 Jan 2020 11:37:34 +0100 Message-Id: <20200131103739.136098-11-hare@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200131103739.136098-1-hare@suse.de> References: <20200131103739.136098-1-hare@suse.de> Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Use 'pending_result' instead of 'work_result' and kill the latter. 
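For illustration only, a userspace sketch of the pattern this relies on: the result is stored in the request itself before the work item is handed over, and the work handler recovers the request from the embedded work item. All fake_* names are hypothetical stand-ins (for struct work_struct, struct rbd_img_request and container_of()); nothing is actually queued here, the caller is assumed to invoke the handler directly.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for struct work_struct. */
struct fake_work {
	void (*fn)(struct fake_work *work);
};

/* Hypothetical stand-in for struct rbd_img_request. */
struct fake_img_request {
	int pending_result;	/* also carries the result handed to the worker */
	int handled_result;	/* what the handler finally saw */
	struct fake_work work;
};

/* Userspace equivalent of the kernel's container_of(). */
#define fake_container_of(ptr, type, member) \
	((type *)((char *)(ptr) - offsetof(type, member)))

/* Work handler: recover the request from the embedded work item. */
static void fake_handle_request_work(struct fake_work *work)
{
	struct fake_img_request *img;

	assert(work != NULL);
	img = fake_container_of(work, struct fake_img_request, work);
	img->handled_result = img->pending_result;
}

/* Store the result in the request, then hand only the work item over. */
static struct fake_work *fake_img_schedule(struct fake_img_request *img,
					   int result)
{
	img->work.fn = fake_handle_request_work;
	img->pending_result = result;
	return &img->work;	/* a real driver would queue this instead */
}
```

Since the handler only ever sees the work item, any value it needs has to live in the enclosing structure; in this sketch a single field serves that purpose.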
Signed-off-by: Hannes Reinecke --- drivers/block/rbd.c | 5 ++--- 1 file changed, 2 insertions(+), 3 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index a6c95b6e9c0c..671e941d6edf 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -352,7 +352,6 @@ struct rbd_img_request { struct mutex state_mutex; int pending_result; struct work_struct work; - int work_result; struct kref kref; }; @@ -2834,13 +2833,13 @@ static void rbd_img_handle_request_work(struct work_struct *work) struct rbd_img_request *img_req = container_of(work, struct rbd_img_request, work); - rbd_img_handle_request(img_req, img_req->work_result); + rbd_img_handle_request(img_req, img_req->pending_result); } static void rbd_img_schedule(struct rbd_img_request *img_req, int result) { INIT_WORK(&img_req->work, rbd_img_handle_request_work); - img_req->work_result = result; + img_req->pending_result = result; queue_work(rbd_wq, &img_req->work); } From patchwork Fri Jan 31 10:37:35 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hannes Reinecke X-Patchwork-Id: 11359561 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id B3EAE112B for ; Fri, 31 Jan 2020 10:38:08 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 9C918206F0 for ; Fri, 31 Jan 2020 10:38:08 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728416AbgAaKiI (ORCPT ); Fri, 31 Jan 2020 05:38:08 -0500 Received: from mx2.suse.de ([195.135.220.15]:55856 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728389AbgAaKiF (ORCPT ); Fri, 31 Jan 2020 05:38:05 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 
EF638B020; Fri, 31 Jan 2020 10:38:01 +0000 (UTC) From: Hannes Reinecke To: Ilya Dryomov Cc: Sage Weil , Daniel Disseldorp , Jens Axboe , ceph-devel@vger.kernel.org, linux-block@vger.kernel.org, Hannes Reinecke Subject: [PATCH 11/15] rbd: drop state_mutex in __rbd_img_handle_request() Date: Fri, 31 Jan 2020 11:37:35 +0100 Message-Id: <20200131103739.136098-12-hare@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200131103739.136098-1-hare@suse.de> References: <20200131103739.136098-1-hare@suse.de> Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org The use of READ_ONCE/WRITE_ONCE for the image request state allows us to drop the state_mutex in __rbd_img_handle_request(). Signed-off-by: Hannes Reinecke --- drivers/block/rbd.c | 26 +++++++++----------------- 1 file changed, 9 insertions(+), 17 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 671e941d6edf..db04401c4d8b 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -349,7 +349,6 @@ struct rbd_img_request { struct list_head object_extents; /* obj_req.ex structs */ struct mutex object_mutex; - struct mutex state_mutex; int pending_result; struct work_struct work; struct kref kref; @@ -1674,7 +1673,6 @@ static struct rbd_img_request *rbd_img_request_create( INIT_LIST_HEAD(&img_request->lock_item); INIT_LIST_HEAD(&img_request->object_extents); - mutex_init(&img_request->state_mutex); mutex_init(&img_request->object_mutex); kref_init(&img_request->kref); @@ -2529,7 +2527,7 @@ static int __rbd_img_fill_request(struct rbd_img_request *img_req) } } mutex_unlock(&img_req->object_mutex); - img_req->state = RBD_IMG_START; + WRITE_ONCE(img_req->state, RBD_IMG_START); return 0; } @@ -3652,15 +3650,15 @@ static bool rbd_img_advance(struct rbd_img_request *img_req, int *result) int ret; dout("%s: img %p state %d\n", __func__, img_req, img_req->state); - switch (img_req->state) { + switch (READ_ONCE(img_req->state)) { case 
RBD_IMG_START: rbd_assert(!*result); - img_req->state = RBD_IMG_EXCLUSIVE_LOCK; + WRITE_ONCE(img_req->state, RBD_IMG_EXCLUSIVE_LOCK); ret = rbd_img_exclusive_lock(img_req); if (ret < 0) { *result = ret; - img_req->state = RBD_IMG_DONE; + WRITE_ONCE(img_req->state, RBD_IMG_DONE); return true; } if (ret == 0) @@ -3668,17 +3666,17 @@ static bool rbd_img_advance(struct rbd_img_request *img_req, int *result) /* fall through */ case RBD_IMG_EXCLUSIVE_LOCK: if (*result) { - img_req->state = RBD_IMG_DONE; + WRITE_ONCE(img_req->state, RBD_IMG_DONE); return true; } rbd_assert(!need_exclusive_lock(img_req) || __rbd_is_lock_owner(rbd_dev)); - img_req->state = RBD_IMG_OBJECT_REQUESTS; + WRITE_ONCE(img_req->state, RBD_IMG_OBJECT_REQUESTS); if (!rbd_img_object_requests(img_req)) { *result = img_req->pending_result; - img_req->state = RBD_IMG_DONE; + WRITE_ONCE(img_req->state, RBD_IMG_DONE); return true; } return false; @@ -3686,7 +3684,7 @@ static bool rbd_img_advance(struct rbd_img_request *img_req, int *result) if (rbd_img_object_requests_pending(img_req)) return false; *result = img_req->pending_result; - img_req->state = RBD_IMG_DONE; + WRITE_ONCE(img_req->state, RBD_IMG_DONE); /* fall through */ case RBD_IMG_DONE: return true; @@ -3706,16 +3704,12 @@ static bool __rbd_img_handle_request(struct rbd_img_request *img_req, if (need_exclusive_lock(img_req)) { down_read(&rbd_dev->lock_rwsem); - mutex_lock(&img_req->state_mutex); done = rbd_img_advance(img_req, result); if (done) rbd_lock_del_request(img_req); - mutex_unlock(&img_req->state_mutex); up_read(&rbd_dev->lock_rwsem); } else { - mutex_lock(&img_req->state_mutex); done = rbd_img_advance(img_req, result); - mutex_unlock(&img_req->state_mutex); } if (done && *result) { @@ -3985,10 +3979,8 @@ static void wake_lock_waiters(struct rbd_device *rbd_dev, int result) } list_for_each_entry(img_req, &rbd_dev->acquiring_list, lock_item) { - mutex_lock(&img_req->state_mutex); - rbd_assert(img_req->state == RBD_IMG_EXCLUSIVE_LOCK); + 
rbd_assert(READ_ONCE(img_req->state) == RBD_IMG_EXCLUSIVE_LOCK); rbd_img_schedule(img_req, result); - mutex_unlock(&img_req->state_mutex); } list_splice_tail_init(&rbd_dev->acquiring_list, &rbd_dev->running_list); From patchwork Fri Jan 31 10:37:36 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hannes Reinecke X-Patchwork-Id: 11359555 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 6F0FF112B for ; Fri, 31 Jan 2020 10:38:07 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 57BC4206F0 for ; Fri, 31 Jan 2020 10:38:07 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728412AbgAaKiG (ORCPT ); Fri, 31 Jan 2020 05:38:06 -0500 Received: from mx2.suse.de ([195.135.220.15]:55860 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728385AbgAaKiF (ORCPT ); Fri, 31 Jan 2020 05:38:05 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 0EBB2B027; Fri, 31 Jan 2020 10:38:02 +0000 (UTC) From: Hannes Reinecke To: Ilya Dryomov Cc: Sage Weil , Daniel Disseldorp , Jens Axboe , ceph-devel@vger.kernel.org, linux-block@vger.kernel.org, Hannes Reinecke Subject: [PATCH 12/15] rbd: kill img_request kref Date: Fri, 31 Jan 2020 11:37:36 +0100 Message-Id: <20200131103739.136098-13-hare@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200131103739.136098-1-hare@suse.de> References: <20200131103739.136098-1-hare@suse.de> Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org The reference count is never incremented, so we might as well call rbd_img_request_destroy() directly and drop the kref. 
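As a rough userspace illustration of that argument (not kernel code): if an object starts with one reference and nobody ever takes a second one, dropping that reference and destroying the object directly are the same operation. The fake_obj type and its helpers are hypothetical, and the counter is a plain int rather than an atomic kref.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical object; stands in for an image request. */
struct fake_obj {
	int refcount;	/* only used by the refcounted release path */
	int *destroyed;	/* incremented when the object is torn down */
};

static struct fake_obj *fake_obj_create(int *destroyed)
{
	struct fake_obj *obj = calloc(1, sizeof(*obj));

	if (obj) {
		obj->refcount = 1;	/* like kref_init(): creator holds one ref */
		obj->destroyed = destroyed;
	}
	return obj;
}

/* Direct teardown, no reference counting involved. */
static void fake_obj_destroy(struct fake_obj *obj)
{
	(*obj->destroyed)++;
	free(obj);
}

/* Refcounted release: tears down only when the last reference is dropped. */
static void fake_obj_put(struct fake_obj *obj)
{
	assert(obj->refcount > 0);
	if (--obj->refcount == 0)
		fake_obj_destroy(obj);
}
```

With no second reference ever taken, fake_obj_put(obj) and fake_obj_destroy(obj) have the same effect in this sketch, so the counter adds nothing.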
Signed-off-by: Hannes Reinecke --- drivers/block/rbd.c | 24 +++++------------------- 1 file changed, 5 insertions(+), 19 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index db04401c4d8b..2566d6bd8230 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -351,7 +351,6 @@ struct rbd_img_request { int pending_result; struct work_struct work; - struct kref kref; }; #define for_each_obj_request(ireq, oreq) \ @@ -1329,15 +1328,6 @@ static void rbd_obj_request_put(struct rbd_obj_request *obj_request) kref_put(&obj_request->kref, rbd_obj_request_destroy); } -static void rbd_img_request_destroy(struct kref *kref); -static void rbd_img_request_put(struct rbd_img_request *img_request) -{ - rbd_assert(img_request != NULL); - dout("%s: img %p (was %d)\n", __func__, img_request, - kref_read(&img_request->kref)); - kref_put(&img_request->kref, rbd_img_request_destroy); -} - static inline void rbd_img_obj_request_add(struct rbd_img_request *img_request, struct rbd_obj_request *obj_request) { @@ -1674,19 +1664,15 @@ static struct rbd_img_request *rbd_img_request_create( INIT_LIST_HEAD(&img_request->lock_item); INIT_LIST_HEAD(&img_request->object_extents); mutex_init(&img_request->object_mutex); - kref_init(&img_request->kref); return img_request; } -static void rbd_img_request_destroy(struct kref *kref) +static void rbd_img_request_destroy(struct rbd_img_request *img_request) { - struct rbd_img_request *img_request; struct rbd_obj_request *obj_request; struct rbd_obj_request *next_obj_request; - img_request = container_of(kref, struct rbd_img_request, kref); - dout("%s: img %p\n", __func__, img_request); WARN_ON(!list_empty(&img_request->lock_item)); @@ -2920,7 +2906,7 @@ static int rbd_obj_read_from_parent(struct rbd_obj_request *obj_req) obj_req->copyup_bvecs); } if (ret) { - rbd_img_request_put(child_img_req); + rbd_img_request_destroy(child_img_req); return ret; } @@ -3726,7 +3712,7 @@ static void rbd_img_end_child_request(struct rbd_img_request 
*img_req, { struct rbd_obj_request *obj_req = img_req->callback_data; - rbd_img_request_put(img_req); + rbd_img_request_destroy(img_req); rbd_obj_handle_request(obj_req, result); } @@ -3734,7 +3720,7 @@ static void rbd_img_end_request(struct rbd_img_request *img_req, int result) { struct request *rq = img_req->callback_data; - rbd_img_request_put(img_req); + rbd_img_request_destroy(img_req); blk_mq_end_request(rq, errno_to_blk_status(result)); } @@ -4886,7 +4872,7 @@ static void rbd_queue_workfn(struct work_struct *work) return; err_img_request: - rbd_img_request_put(img_request); + rbd_img_request_destroy(img_request); err_rq: if (result) rbd_warn(rbd_dev, "%s %llx at %llx result %d", From patchwork Fri Jan 31 10:37:37 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hannes Reinecke X-Patchwork-Id: 11359571 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id 7551313A4 for ; Fri, 31 Jan 2020 10:38:13 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 5DD5B206F0 for ; Fri, 31 Jan 2020 10:38:13 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728422AbgAaKiM (ORCPT ); Fri, 31 Jan 2020 05:38:12 -0500 Received: from mx2.suse.de ([195.135.220.15]:55864 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728387AbgAaKiG (ORCPT ); Fri, 31 Jan 2020 05:38:06 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 00DA7B021; Fri, 31 Jan 2020 10:38:02 +0000 (UTC) From: Hannes Reinecke To: Ilya Dryomov Cc: Sage Weil , Daniel Disseldorp , Jens Axboe , ceph-devel@vger.kernel.org, linux-block@vger.kernel.org, Hannes Reinecke Subject: [PATCH 13/15] rbd: schedule 
image_request after preparation Date: Fri, 31 Jan 2020 11:37:37 +0100 Message-Id: <20200131103739.136098-14-hare@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200131103739.136098-1-hare@suse.de> References: <20200131103739.136098-1-hare@suse.de> Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Instead of pushing I/O directly to the workqueue we should be preparing it first, and push it onto the workqueue as the last step. This allows us to signal some back-pressure to the block layer in case the queue fills up. Signed-off-by: Hannes Reinecke --- drivers/block/rbd.c | 52 +++++++++++++++------------------------------------- 1 file changed, 15 insertions(+), 37 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 2566d6bd8230..9829f225c57d 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -4775,9 +4775,10 @@ static int rbd_obj_method_sync(struct rbd_device *rbd_dev, return ret; } -static void rbd_queue_workfn(struct work_struct *work) +static blk_status_t rbd_queue_rq(struct blk_mq_hw_ctx *hctx, + const struct blk_mq_queue_data *bd) { - struct request *rq = blk_mq_rq_from_pdu(work); + struct request *rq = bd->rq; struct rbd_device *rbd_dev = rq->q->queuedata; struct rbd_img_request *img_request; struct ceph_snap_context *snapc = NULL; @@ -4802,24 +4803,14 @@ static void rbd_queue_workfn(struct work_struct *work) break; default: dout("%s: non-fs request type %d\n", __func__, req_op(rq)); - result = -EIO; - goto err; - } - - /* Ignore/skip any zero-length requests */ - - if (!length) { - dout("%s: zero-length request\n", __func__); - result = 0; - goto err_rq; + return BLK_STS_IOERR; } if (op_type != OBJ_OP_READ) { if (rbd_is_ro(rbd_dev)) { rbd_warn(rbd_dev, "%s on read-only mapping", obj_op_name(op_type)); - result = -EIO; - goto err; + return BLK_STS_IOERR; } rbd_assert(!rbd_is_snap(rbd_dev)); } @@ -4827,11 +4818,17 @@ static void rbd_queue_workfn(struct 
work_struct *work) if (offset && length > U64_MAX - offset + 1) { rbd_warn(rbd_dev, "bad request range (%llu~%llu)", offset, length); - result = -EINVAL; - goto err_rq; /* Shouldn't happen */ + return BLK_STS_NOSPC; /* Shouldn't happen */ } blk_mq_start_request(rq); + /* Ignore/skip any zero-length requests */ + if (!length) { + dout("%s: zero-length request\n", __func__); + result = 0; + goto err; + } + mapping_size = READ_ONCE(rbd_dev->mapping.size); if (op_type != OBJ_OP_READ) { @@ -4868,8 +4865,8 @@ static void rbd_queue_workfn(struct work_struct *work) if (result) goto err_img_request; - rbd_img_handle_request(img_request, 0); - return; + rbd_img_schedule(img_request, 0); + return BLK_STS_OK; err_img_request: rbd_img_request_destroy(img_request); @@ -4880,15 +4877,6 @@ static void rbd_queue_workfn(struct work_struct *work) ceph_put_snap_context(snapc); err: blk_mq_end_request(rq, errno_to_blk_status(result)); -} - -static blk_status_t rbd_queue_rq(struct blk_mq_hw_ctx *hctx, - const struct blk_mq_queue_data *bd) -{ - struct request *rq = bd->rq; - struct work_struct *work = blk_mq_rq_to_pdu(rq); - - queue_work(rbd_wq, work); return BLK_STS_OK; } @@ -5055,18 +5043,8 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev) return ret; } -static int rbd_init_request(struct blk_mq_tag_set *set, struct request *rq, - unsigned int hctx_idx, unsigned int numa_node) -{ - struct work_struct *work = blk_mq_rq_to_pdu(rq); - - INIT_WORK(work, rbd_queue_workfn); - return 0; -} - static const struct blk_mq_ops rbd_mq_ops = { .queue_rq = rbd_queue_rq, - .init_request = rbd_init_request, }; static int rbd_init_disk(struct rbd_device *rbd_dev) From patchwork Fri Jan 31 10:37:38 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hannes Reinecke X-Patchwork-Id: 11359565 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org 
(Postfix) with ESMTP id 2251B159A for ; Fri, 31 Jan 2020 10:38:10 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id 0AE70206F0 for ; Fri, 31 Jan 2020 10:38:10 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728419AbgAaKiJ (ORCPT ); Fri, 31 Jan 2020 05:38:09 -0500 Received: from mx2.suse.de ([195.135.220.15]:55848 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728374AbgAaKiF (ORCPT ); Fri, 31 Jan 2020 05:38:05 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id DC947B00A; Fri, 31 Jan 2020 10:38:01 +0000 (UTC) From: Hannes Reinecke To: Ilya Dryomov Cc: Sage Weil , Daniel Disseldorp , Jens Axboe , ceph-devel@vger.kernel.org, linux-block@vger.kernel.org, Hannes Reinecke Subject: [PATCH 14/15] rbd: embed image request as blk_mq request payload Date: Fri, 31 Jan 2020 11:37:38 +0100 Message-Id: <20200131103739.136098-15-hare@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200131103739.136098-1-hare@suse.de> References: <20200131103739.136098-1-hare@suse.de> Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Instead of allocating an image request for every block request, we might as well embed it as the payload and save the allocation. 
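A minimal userspace sketch of the embedding idea, under stated assumptions: struct fake_request, fake_rq_to_pdu() and the other fake_* names are hypothetical stand-ins for the block layer's request, blk_mq_rq_to_pdu() and the rbd helpers, and calloc()/free() replace the kernel allocators. It shows the payload living directly behind the request, per-I/O setup that allocates nothing, and a destroy path that frees only standalone image requests.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-in for the block layer's struct request. */
struct fake_request {
	int tag;
	/* the driver payload ("PDU") follows directly after this struct */
};

#define FAKE_REQ_EMBEDDED 0x1u	/* payload lives inside a fake_request */

/* Hypothetical stand-in for struct rbd_img_request. */
struct fake_img_request {
	unsigned int flags;
	int op_type;
	int pending_result;
};

static int fake_heap_frees;	/* counts how often destroy really freed memory */

/* Allocate a request with cmd_size bytes of zeroed payload behind it. */
static struct fake_request *fake_request_alloc(int tag, size_t cmd_size)
{
	struct fake_request *rq = calloc(1, sizeof(*rq) + cmd_size);

	if (rq)
		rq->tag = tag;
	return rq;
}

/* Same idea as blk_mq_rq_to_pdu(): the payload starts right after the request. */
static void *fake_rq_to_pdu(struct fake_request *rq)
{
	return rq + 1;
}

/* One-time initialisation, in the spirit of an init_request callback. */
static void fake_init_request(struct fake_request *rq)
{
	struct fake_img_request *img = fake_rq_to_pdu(rq);

	img->flags |= FAKE_REQ_EMBEDDED;
}

/* Per-I/O setup: fills in fields and performs no allocation. */
static void fake_img_request_setup(struct fake_img_request *img, int op_type)
{
	img->op_type = op_type;
	img->pending_result = 0;
}

/* Standalone allocation path, e.g. for child image requests. */
static struct fake_img_request *fake_img_request_create(int op_type)
{
	struct fake_img_request *img = calloc(1, sizeof(*img));

	if (img)
		fake_img_request_setup(img, op_type);
	return img;
}

/* Only standalone (heap-allocated) image requests are freed here. */
static void fake_img_request_destroy(struct fake_img_request *img)
{
	if (!(img->flags & FAKE_REQ_EMBEDDED)) {
		free(img);
		fake_heap_frees++;
	}
}
```

Note that the setup helper in this sketch must not zero the whole structure, otherwise the embedded flag set at initialisation time would be lost and destroy would try to free memory that belongs to the enclosing request.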
Signed-off-by: Hannes Reinecke --- drivers/block/rbd.c | 56 +++++++++++++++++++++++++++++++++++------------------ 1 file changed, 37 insertions(+), 19 deletions(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index 9829f225c57d..cc3e5116fe58 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -324,6 +324,7 @@ typedef void (*rbd_img_request_cb_t)(struct rbd_img_request *img_request, enum img_req_flags { IMG_REQ_CHILD, /* initiator: block = 0, child image = 1 */ IMG_REQ_LAYERED, /* ENOENT handling: normal = 0, layered = 1 */ + IMG_REQ_EMBEDDED, /* free handling: normal = 0, embedded = 1 */ }; enum rbd_img_state { @@ -1640,17 +1641,10 @@ static bool rbd_dev_parent_get(struct rbd_device *rbd_dev) * that comprises the image request, and the Linux request pointer * (if there is one). */ -static struct rbd_img_request *rbd_img_request_create( - struct rbd_device *rbd_dev, - enum obj_operation_type op_type, - struct ceph_snap_context *snapc) +static void rbd_img_request_setup(struct rbd_img_request *img_request, + struct rbd_device *rbd_dev, enum obj_operation_type op_type, + struct ceph_snap_context *snapc) { - struct rbd_img_request *img_request; - - img_request = kmem_cache_zalloc(rbd_img_request_cache, GFP_NOIO); - if (!img_request) - return NULL; - img_request->rbd_dev = rbd_dev; img_request->op_type = op_type; if (!rbd_img_is_write(img_request)) @@ -1661,9 +1655,25 @@ static struct rbd_img_request *rbd_img_request_create( if (rbd_dev_parent_get(rbd_dev)) img_request_layered_set(img_request); + img_request->pending_result = 0; + img_request->state = RBD_IMG_DONE; INIT_LIST_HEAD(&img_request->lock_item); INIT_LIST_HEAD(&img_request->object_extents); mutex_init(&img_request->object_mutex); +} + +struct rbd_img_request *rbd_img_request_create( + struct rbd_device *rbd_dev, + enum obj_operation_type op_type, + struct ceph_snap_context *snapc) +{ + struct rbd_img_request *img_request; + + img_request = kmem_cache_zalloc(rbd_img_request_cache, GFP_NOIO); 
+ if (!img_request) + return NULL; + + rbd_img_request_setup(img_request, rbd_dev, op_type, snapc); return img_request; } @@ -1690,7 +1700,8 @@ static void rbd_img_request_destroy(struct rbd_img_request *img_request) if (rbd_img_is_write(img_request)) ceph_put_snap_context(img_request->snapc); - kmem_cache_free(rbd_img_request_cache, img_request); + if (!test_bit(IMG_REQ_EMBEDDED, &img_request->flags)) + kmem_cache_free(rbd_img_request_cache, img_request); } #define BITS_PER_OBJ 2 @@ -4780,7 +4791,7 @@ static blk_status_t rbd_queue_rq(struct blk_mq_hw_ctx *hctx, { struct request *rq = bd->rq; struct rbd_device *rbd_dev = rq->q->queuedata; - struct rbd_img_request *img_request; + struct rbd_img_request *img_request = blk_mq_rq_to_pdu(rq); struct ceph_snap_context *snapc = NULL; u64 offset = (u64)blk_rq_pos(rq) << SECTOR_SHIFT; u64 length = blk_rq_bytes(rq); @@ -4845,11 +4856,7 @@ static blk_status_t rbd_queue_rq(struct blk_mq_hw_ctx *hctx, goto err_rq; } - img_request = rbd_img_request_create(rbd_dev, op_type, snapc); - if (!img_request) { - result = -ENOMEM; - goto err_rq; - } + rbd_img_request_setup(img_request, rbd_dev, op_type, snapc); img_request->callback = rbd_img_end_request; img_request->callback_data = rq; snapc = NULL; /* img_request consumes a ref */ @@ -4865,7 +4872,7 @@ static blk_status_t rbd_queue_rq(struct blk_mq_hw_ctx *hctx, if (result) goto err_img_request; - rbd_img_schedule(img_request, 0); + queue_work(rbd_wq, &img_request->work); return BLK_STS_OK; err_img_request: @@ -5043,8 +5050,19 @@ static int rbd_dev_refresh(struct rbd_device *rbd_dev) return ret; } +static int rbd_init_request(struct blk_mq_tag_set *set, struct request *rq, + unsigned int hctx_idx, unsigned int numa_node) +{ + struct rbd_img_request *img_req = blk_mq_rq_to_pdu(rq); + + INIT_WORK(&img_req->work, rbd_img_handle_request_work); + set_bit(IMG_REQ_EMBEDDED, &img_req->flags); + return 0; +} + static const struct blk_mq_ops rbd_mq_ops = { .queue_rq = rbd_queue_rq, + 
.init_request = rbd_init_request, }; static int rbd_init_disk(struct rbd_device *rbd_dev) @@ -5077,7 +5095,7 @@ static int rbd_init_disk(struct rbd_device *rbd_dev) rbd_dev->tag_set.numa_node = NUMA_NO_NODE; rbd_dev->tag_set.flags = BLK_MQ_F_SHOULD_MERGE; rbd_dev->tag_set.nr_hw_queues = 1; - rbd_dev->tag_set.cmd_size = sizeof(struct work_struct); + rbd_dev->tag_set.cmd_size = sizeof(struct rbd_img_request); err = blk_mq_alloc_tag_set(&rbd_dev->tag_set); if (err) From patchwork Fri Jan 31 10:37:39 2020 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Hannes Reinecke X-Patchwork-Id: 11359589 Return-Path: Received: from mail.kernel.org (pdx-korg-mail-1.web.codeaurora.org [172.30.200.123]) by pdx-korg-patchwork-2.web.codeaurora.org (Postfix) with ESMTP id DD38713A4 for ; Fri, 31 Jan 2020 10:38:17 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.kernel.org (Postfix) with ESMTP id C5D4B206F0 for ; Fri, 31 Jan 2020 10:38:17 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728427AbgAaKiR (ORCPT ); Fri, 31 Jan 2020 05:38:17 -0500 Received: from mx2.suse.de ([195.135.220.15]:55866 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1728386AbgAaKiF (ORCPT ); Fri, 31 Jan 2020 05:38:05 -0500 X-Virus-Scanned: by amavisd-new at test-mx.suse.de Received: from relay2.suse.de (unknown [195.135.220.254]) by mx2.suse.de (Postfix) with ESMTP id 16881B028; Fri, 31 Jan 2020 10:38:02 +0000 (UTC) From: Hannes Reinecke To: Ilya Dryomov Cc: Sage Weil , Daniel Disseldorp , Jens Axboe , ceph-devel@vger.kernel.org, linux-block@vger.kernel.org, Hannes Reinecke Subject: [PATCH 15/15] rbd: switch to blk-mq Date: Fri, 31 Jan 2020 11:37:39 +0100 Message-Id: <20200131103739.136098-16-hare@suse.de> X-Mailer: git-send-email 2.16.4 In-Reply-To: <20200131103739.136098-1-hare@suse.de> References: <20200131103739.136098-1-hare@suse.de> 
Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Allocate one queue per CPU and get a performance boost from higher parallelism. Signed-off-by: Hannes Reinecke --- drivers/block/rbd.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c index cc3e5116fe58..dc3b44177fea 100644 --- a/drivers/block/rbd.c +++ b/drivers/block/rbd.c @@ -5094,7 +5094,7 @@ static int rbd_init_disk(struct rbd_device *rbd_dev) rbd_dev->tag_set.queue_depth = rbd_dev->opts->queue_depth; rbd_dev->tag_set.numa_node = NUMA_NO_NODE; rbd_dev->tag_set.flags = BLK_MQ_F_SHOULD_MERGE; - rbd_dev->tag_set.nr_hw_queues = 1; + rbd_dev->tag_set.nr_hw_queues = num_present_cpus(); rbd_dev->tag_set.cmd_size = sizeof(struct rbd_img_request); err = blk_mq_alloc_tag_set(&rbd_dev->tag_set);