From patchwork Thu Jul 27 10:32:57 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: xiongweijiang666@gmail.com
X-Patchwork-Id: 9866543
Return-Path:
Received: from mail.wl.linuxfoundation.org
 (pdx-wl-mail.web.codeaurora.org [172.30.200.125])
 by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 06B0060382
 for ; Thu, 27 Jul 2017 10:33:28 +0000 (UTC)
Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id EB772287E7
 for ; Thu, 27 Jul 2017 10:33:27 +0000 (UTC)
Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486)
 id E00D5287FB; Thu, 27 Jul 2017 10:33:27 +0000 (UTC)
X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on
 pdx-wl-mail.web.codeaurora.org
X-Spam-Level:
X-Spam-Status: No, score=-6.3 required=2.0 tests=BAYES_00,
 DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_HI,
 RCVD_IN_SORBS_SPAM, T_DKIM_INVALID autolearn=unavailable version=3.3.1
Received: from vger.kernel.org (vger.kernel.org [209.132.180.67])
 by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 89F06287D7
 for ; Thu, 27 Jul 2017 10:33:27 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
 id S1751113AbdG0KdO (ORCPT );
 Thu, 27 Jul 2017 06:33:14 -0400
Received: from mail-oi0-f65.google.com ([209.85.218.65]:37518 "EHLO
 mail-oi0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
 with ESMTP id S1750836AbdG0KdN (ORCPT );
 Thu, 27 Jul 2017 06:33:13 -0400
Received: by mail-oi0-f65.google.com with SMTP id j194so9451379oib.4;
 Thu, 27 Jul 2017 03:33:13 -0700 (PDT)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=gmail.com; s=20161025;
 h=from:to:cc:subject:date:message-id;
 bh=Fm0vt1TglOHcRzqdE+cFE72xqhetSWLALvDZeTPafjM=;
 b=dx7nE3D0/x84neSyw/i1E4U/8X2pOvhGUNvHswgdESGFiqvIg2dCBVzkKq3J8T13Hf
 Voi8TvleI4nFaU8QoMScePJkExybfrvcWhtTPiRbHEVH0cD1NdTD0tk5POtG/8JiKDmy
 8BxfiwaPl/UJzc8zn59YhCvYVDDByYqaGKvYW8a5clEo0ImLCcDpqGM0OvNAegNzhFLg
 r8P5HkV2RMnFgZJTadNDCvCjG9OsbuKh4KML26Z29OkFAlecVOD7DmNxA/af590DuMLO
 gI4cknAhCKQM0InD/GL6+J9lUgRqh2CdoThm4cg75Emv69CPHHI0FLXyhVsuf4Aluj8n
 yf3Q==
X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed;
 d=1e100.net; s=20161025;
 h=x-gm-message-state:from:to:cc:subject:date:message-id;
 bh=Fm0vt1TglOHcRzqdE+cFE72xqhetSWLALvDZeTPafjM=;
 b=oOWVz7F0j3t9m+B6utuHxvl6id7N7qD75Isd9n54xrpjguqwCPzEETYRjALoCc90PX
 ivGDnE/qq0zJsEIL+dFUBWzXBbnFNmWidTRIMhY2u1VSgZsOXLYBQQayDbwdPIl4Jtxl
 kyTq4TdG5cRA2/juOcyA2TWvpKtXiFnLNslNSCs5cuvOAa2Ahvx3OC+N2RVJUzfMEd4/
 NOipNldYUHHrX82Rba+L1mWTKhpydTRHhw2KyAh0zdpxCfkYqoN1svNZySWldNL9wRK5
 QMcQVqpdtnfAAF34qE9W3VxJnHsmGl7bj3ac6+1t18td6jNE+k6UqjL0bEJcEsBKbFNu
 0XfA==
X-Gm-Message-State: AIVw112fX6O3a5UlazUbl/4H7lslen50Gieas6f+Vzcn3hnUrefdH7jN
 secTnof91TdAiQ==
X-Received: by 10.202.108.146 with SMTP id h140mr3203540oic.125.1501151592722;
 Thu, 27 Jul 2017 03:33:12 -0700 (PDT)
Received: from ali-55502n.hz.ali.com ([205.204.117.19])
 by smtp.gmail.com with ESMTPSA id t186sm8590068oie.29.2017.07.27.03.33.09
 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128);
 Thu, 27 Jul 2017 03:33:12 -0700 (PDT)
From: xiongweijiang666@gmail.com
X-Google-Original-From: xiongwei.jiang@alibaba-inc.com
To: idryomov@gmail.com, sage@redhat.com, elder@kernel.org,
 ceph-devel@vger.kernel.org, linux-kernel@vger.kernel.org
Cc: chenggang.qcg@alibaba-inc.com, xiongwei.jiang@alibaba-inc.com
Subject: [PATCH] rbd: add timeout function to rbd driver
Date: Thu, 27 Jul 2017 18:32:57 +0800
Message-Id: <20170727103257.11608-1-xiongwei.jiang@alibaba-inc.com>
X-Mailer: git-send-email 2.13.3.windows.1
Sender: ceph-devel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org
X-Virus-Scanned: ClamAV using ClamSMTP

From: Xiongwei Jiang

When an application is reading from or writing to an rbd device and some
or all OSDs crash, the application hangs and cannot be killed because it
is in D state.
Even if the OSDs come back up later, the application may still be stuck
in D state. So we need a timeout mechanism to solve this problem.

Signed-off-by: Xiongwei Jiang
---
 drivers/block/rbd.c | 34 +++++++++++++++++++++++++++++++++-
 1 file changed, 33 insertions(+), 1 deletion(-)

diff --git a/drivers/block/rbd.c b/drivers/block/rbd.c
index c16f745..33a1c97 100644
--- a/drivers/block/rbd.c
+++ b/drivers/block/rbd.c
@@ -159,6 +159,13 @@ struct rbd_image_header {
 	u64 *snap_sizes;	/* format 1 only */
 };
 
+
+struct rbd_request_linker {
+	struct work_struct work;
+	void *img_request;
+};
+
+
 /*
  * An rbd image specification.
  *
@@ -4013,6 +4020,7 @@ static void rbd_queue_workfn(struct work_struct *work)
 	struct request *rq = blk_mq_rq_from_pdu(work);
 	struct rbd_device *rbd_dev = rq->q->queuedata;
 	struct rbd_img_request *img_request;
+	struct rbd_request_linker *linker = blk_mq_rq_to_pdu(rq);
 	struct ceph_snap_context *snapc = NULL;
 	u64 offset = (u64)blk_rq_pos(rq) << SECTOR_SHIFT;
 	u64 length = blk_rq_bytes(rq);
@@ -4120,6 +4128,7 @@ static void rbd_queue_workfn(struct work_struct *work)
 		goto err_unlock;
 	}
 	img_request->rq = rq;
+	linker->img_request = img_request;
 	snapc = NULL; /* img_request consumes a ref */
 
 	if (op_type == OBJ_OP_DISCARD)
@@ -4358,9 +4367,32 @@ static int rbd_init_request(struct blk_mq_tag_set *set, struct request *rq,
 	return 0;
 }
 
+static enum blk_eh_timer_return rbd_request_timeout(struct request *rq,
+		bool reserved)
+{
+	struct rbd_obj_request *obj_request;
+	struct rbd_obj_request *next_obj_request;
+	struct rbd_img_request *img_request;
+	struct rbd_request_linker *linker = blk_mq_rq_to_pdu(rq);
+
+	img_request = (struct rbd_img_request *)linker->img_request;
+	for_each_obj_request_safe(img_request, obj_request, next_obj_request) {
+		struct ceph_osd_request *osd_req = obj_request->osd_req;
+
+		if (!osd_req)
+			printk(KERN_INFO "osd_req is null \n");
+		else
+			ceph_osdc_cancel_request(osd_req);
+	}
+	return BLK_EH_HANDLED;
+}
+
+
+
 static const struct blk_mq_ops rbd_mq_ops = {
 	.queue_rq	= rbd_queue_rq,
 	.init_request	= rbd_init_request,
+	.timeout	= rbd_request_timeout,
 };
 
 static int rbd_init_disk(struct rbd_device *rbd_dev)
@@ -4392,7 +4424,7 @@ static int rbd_init_disk(struct rbd_device *rbd_dev)
 	rbd_dev->tag_set.numa_node = NUMA_NO_NODE;
 	rbd_dev->tag_set.flags = BLK_MQ_F_SHOULD_MERGE | BLK_MQ_F_SG_MERGE;
 	rbd_dev->tag_set.nr_hw_queues = 1;
-	rbd_dev->tag_set.cmd_size = sizeof(struct work_struct);
+	rbd_dev->tag_set.cmd_size = sizeof(struct rbd_request_linker);
 
 	err = blk_mq_alloc_tag_set(&rbd_dev->tag_set);
 	if (err)