From patchwork Fri Mar 3 19:05:46 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Ilya Dryomov X-Patchwork-Id: 9603397 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 5C10E604E2 for ; Fri, 3 Mar 2017 19:14:35 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 504B7285CA for ; Fri, 3 Mar 2017 19:14:35 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 4488C285D5; Fri, 3 Mar 2017 19:14:35 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.3 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_HI, RCVD_IN_SORBS_SPAM, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 74708285CF for ; Fri, 3 Mar 2017 19:14:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1752398AbdCCTOb (ORCPT ); Fri, 3 Mar 2017 14:14:31 -0500 Received: from mail-wm0-f66.google.com ([74.125.82.66]:34350 "EHLO mail-wm0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752396AbdCCTO2 (ORCPT ); Fri, 3 Mar 2017 14:14:28 -0500 Received: by mail-wm0-f66.google.com with SMTP id m70so4548022wma.1 for ; Fri, 03 Mar 2017 11:14:22 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=Flk8bKGP7xTw0QByl0P++8ffpIP0ttPbbN2BuPzOkNM=; b=VJ4dGmiO/BUfxeEO03tFTo/IYAFknHaHbt9q0PevkOdNxde2rFycO8apfLeJssxvg/ yVkDBYJxw49U6bLIJXKaZyroHazsy6nnY20J2BsRqB/5/+winSzKGYBQBssiOqbaDCfL 9SI9Q44KfNhr3YNyjR4J18D4SZr5Ex2PjxywmjS2Wra0IjzIWE91hqzcXJEy2oHxaCzo VxIwkAVMZy1+YOl5+BmaKGQCkP31WUBYJbTfX9Tql0OXiRNrWOduOMf8WoKMKVMsaGIE jQl0JA5Fgn1RLW8i2UC8fLvodwU0++QBaFlig61Q3li087fyTMXcsAkt2ZQair3ibQqV 9UiQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=Flk8bKGP7xTw0QByl0P++8ffpIP0ttPbbN2BuPzOkNM=; b=OuQFyO6odT3YpQPWtQSqxiMW5gc2RxdpopxZMerijkzVlo1zMR1ymOz3O0qNmAqh3K Qro9VFIz002Q6JZwMwlOLaPU9sLfr3gpTQ0xyeKWRI+IU2auJXN6ElTBuf7ENhnOhVtM uPLus1wqe7O8vJsTG+ynIE8BghmG1F2WU5u1c9XrHWOpW7cmnyOwV2S/5H7kVJdlZBWS LgS1zd2EDUFMJpIS7O9BnUwf4nJ1liDfMw7Pn7Zzg7y2tRKKS5SBPPu1F5/AgIQRtaIf EJoAhThcb33Ua5acEmrAcdO2Rl+oOMI1iKo0a0O481pWHuZB0NRNnaod0OppRDzkeRzq SE2A== X-Gm-Message-State: AMke39nxPN0jy2xLDdKk6npZP2LEaPi88PwoEXpJgEzIgOxOCWoHpiYZQn3j4IA3B9UxfA== X-Received: by 10.28.98.2 with SMTP id w2mr4181846wmb.66.1488567967402; Fri, 03 Mar 2017 11:06:07 -0800 (PST) Received: from localhost.localdomain.com (ip-78-102-108-116.net.upcbroadband.cz. [78.102.108.116]) by smtp.gmail.com with ESMTPSA id f67sm4099052wmd.0.2017.03.03.11.06.06 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 03 Mar 2017 11:06:06 -0800 (PST) From: Ilya Dryomov To: ceph-devel@vger.kernel.org Cc: Artur Molchanov Subject: [PATCH] libceph: osd_request_timeout option Date: Fri, 3 Mar 2017 20:05:46 +0100 Message-Id: <1488567946-1696-1-git-send-email-idryomov@gmail.com> X-Mailer: git-send-email 2.4.3 Sender: ceph-devel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP osd_request_timeout specifies how many seconds to wait for a response from OSDs before returning -ETIMEDOUT from an OSD request. 0 (default) means no limit. osd_request_timeout is osdkeepalive-precise -- in-flight requests are swept through every osdkeepalive seconds. With ack vs commit behaviour gone, abort_request() is really simple. This is based on a patch from Artur Molchanov . Tested-by: Artur Molchanov Signed-off-by: Ilya Dryomov Reviewed-by: Sage Weil --- include/linux/ceph/libceph.h | 2 ++ include/linux/ceph/osd_client.h | 1 + net/ceph/ceph_common.c | 15 +++++++++++++++ net/ceph/osd_client.c | 36 +++++++++++++++++++++++++++++++++++- 4 files changed, 53 insertions(+), 1 deletion(-) diff --git a/include/linux/ceph/libceph.h b/include/linux/ceph/libceph.h index 1816c5e26581..88cd5dc8e238 100644 --- a/include/linux/ceph/libceph.h +++ b/include/linux/ceph/libceph.h @@ -48,6 +48,7 @@ struct ceph_options { unsigned long mount_timeout; /* jiffies */ unsigned long osd_idle_ttl; /* jiffies */ unsigned long osd_keepalive_timeout; /* jiffies */ + unsigned long osd_request_timeout; /* jiffies */ /* * any type that can't be simply compared or doesn't need need @@ -68,6 +69,7 @@ struct ceph_options { #define CEPH_MOUNT_TIMEOUT_DEFAULT msecs_to_jiffies(60 * 1000) #define CEPH_OSD_KEEPALIVE_DEFAULT msecs_to_jiffies(5 * 1000) #define CEPH_OSD_IDLE_TTL_DEFAULT msecs_to_jiffies(60 * 1000) +#define CEPH_OSD_REQUEST_TIMEOUT_DEFAULT 0 /* no timeout */ #define CEPH_MONC_HUNT_INTERVAL msecs_to_jiffies(3 * 1000) #define CEPH_MONC_PING_INTERVAL msecs_to_jiffies(10 * 1000) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index e1cb5d825bc5..b04a2ca11e60 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -190,6 +190,7 @@ struct ceph_osd_request { /* internal */ unsigned long r_stamp; /* jiffies, send or check time */ + unsigned long r_start_stamp; /* jiffies */ int r_attempts; struct ceph_eversion r_replay_version; /* aka reassert_version */ u32 r_last_force_resend; diff --git a/net/ceph/ceph_common.c b/net/ceph/ceph_common.c index 464e88599b9d..108533859a53 100644 --- a/net/ceph/ceph_common.c +++ b/net/ceph/ceph_common.c @@ -230,6 +230,7 @@ enum { Opt_osdkeepalivetimeout, Opt_mount_timeout, Opt_osd_idle_ttl, + Opt_osd_request_timeout, Opt_last_int, /* int args above */ Opt_fsid, @@ -256,6 +257,7 @@ static match_table_t opt_tokens = { {Opt_osdkeepalivetimeout, "osdkeepalive=%d"}, {Opt_mount_timeout, "mount_timeout=%d"}, {Opt_osd_idle_ttl, "osd_idle_ttl=%d"}, + {Opt_osd_request_timeout, "osd_request_timeout=%d"}, /* int args above */ {Opt_fsid, "fsid=%s"}, {Opt_name, "name=%s"}, @@ -361,6 +363,7 @@ ceph_parse_options(char *options, const char *dev_name, opt->osd_keepalive_timeout = CEPH_OSD_KEEPALIVE_DEFAULT; opt->mount_timeout = CEPH_MOUNT_TIMEOUT_DEFAULT; opt->osd_idle_ttl = CEPH_OSD_IDLE_TTL_DEFAULT; + opt->osd_request_timeout = CEPH_OSD_REQUEST_TIMEOUT_DEFAULT; /* get mon ip(s) */ /* ip1[:port1][,ip2[:port2]...] */ @@ -473,6 +476,15 @@ ceph_parse_options(char *options, const char *dev_name, } opt->mount_timeout = msecs_to_jiffies(intval * 1000); break; + case Opt_osd_request_timeout: + /* 0 is "wait forever" (i.e. infinite timeout) */ + if (intval < 0 || intval > INT_MAX / 1000) { + pr_err("osd_request_timeout out of range\n"); + err = -EINVAL; + goto out; + } + opt->osd_request_timeout = msecs_to_jiffies(intval * 1000); + break; case Opt_share: opt->flags &= ~CEPH_OPT_NOSHARE; @@ -557,6 +569,9 @@ int ceph_print_client_options(struct seq_file *m, struct ceph_client *client) if (opt->osd_keepalive_timeout != CEPH_OSD_KEEPALIVE_DEFAULT) seq_printf(m, "osdkeepalivetimeout=%d,", jiffies_to_msecs(opt->osd_keepalive_timeout) / 1000); + if (opt->osd_request_timeout != CEPH_OSD_REQUEST_TIMEOUT_DEFAULT) + seq_printf(m, "osd_request_timeout=%d,", + jiffies_to_msecs(opt->osd_request_timeout) / 1000); /* drop redundant comma */ if (m->count != pos) diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index e4f712ebcf05..534c2cd17582 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -1727,6 +1727,8 @@ static void account_request(struct ceph_osd_request *req) req->r_flags |= CEPH_OSD_FLAG_ONDISK; atomic_inc(&req->r_osdc->num_requests); + + req->r_start_stamp = jiffies; } static void submit_request(struct ceph_osd_request *req, bool wrlocked) @@ -1853,6 +1855,14 @@ static void cancel_request(struct ceph_osd_request *req) ceph_osdc_put_request(req); } +static void abort_request(struct ceph_osd_request *req, int err) +{ + dout("%s req %p tid %llu err %d\n", __func__, req, req->r_tid, err); + + cancel_map_check(req); + complete_request(req, err); +} + static void check_pool_dne(struct ceph_osd_request *req) { struct ceph_osd_client *osdc = req->r_osdc; @@ -2551,6 +2561,7 @@ static void handle_timeout(struct work_struct *work) container_of(work, struct ceph_osd_client, timeout_work.work); struct ceph_options *opts = osdc->client->options; unsigned long cutoff = jiffies - opts->osd_keepalive_timeout; + unsigned long expiry_cutoff = jiffies - opts->osd_request_timeout; LIST_HEAD(slow_osds); struct rb_node *n, *p; @@ -2566,15 +2577,23 @@ static void handle_timeout(struct work_struct *work) struct ceph_osd *osd = rb_entry(n, struct ceph_osd, o_node); bool found = false; - for (p = rb_first(&osd->o_requests); p; p = rb_next(p)) { + for (p = rb_first(&osd->o_requests); p; ) { struct ceph_osd_request *req = rb_entry(p, struct ceph_osd_request, r_node); + p = rb_next(p); /* abort_request() */ + if (time_before(req->r_stamp, cutoff)) { dout(" req %p tid %llu on osd%d is laggy\n", req, req->r_tid, osd->o_osd); found = true; } + if (opts->osd_request_timeout && + time_before(req->r_start_stamp, expiry_cutoff)) { + pr_err_ratelimited("tid %llu on osd%d timeout\n", + req->r_tid, osd->o_osd); + abort_request(req, -ETIMEDOUT); + } } for (p = rb_first(&osd->o_linger_requests); p; p = rb_next(p)) { struct ceph_osd_linger_request *lreq = @@ -2594,6 +2613,21 @@ static void handle_timeout(struct work_struct *work) list_move_tail(&osd->o_keepalive_item, &slow_osds); } + if (opts->osd_request_timeout) { + for (p = rb_first(&osdc->homeless_osd.o_requests); p; ) { + struct ceph_osd_request *req = + rb_entry(p, struct ceph_osd_request, r_node); + + p = rb_next(p); /* abort_request() */ + + if (time_before(req->r_start_stamp, expiry_cutoff)) { + pr_err_ratelimited("tid %llu on osd%d timeout\n", + req->r_tid, osdc->homeless_osd.o_osd); + abort_request(req, -ETIMEDOUT); + } + } + } + if (atomic_read(&osdc->num_homeless) || !list_empty(&slow_osds)) maybe_request_map(osdc);