From patchwork Fri Jan 20 15:17:32 2017
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton <jlayton@redhat.com>
X-Patchwork-Id: 9528739
From: Jeff Layton <jlayton@redhat.com>
To: ceph-devel@vger.kernel.org
Cc: jspray@redhat.com, idryomov@gmail.com, zyan@redhat.com, sage@redhat.com
Subject: [PATCH v1 1/7] libceph: add ceph_osdc_cancel_writes
Date: Fri, 20 Jan 2017 10:17:32 -0500
Message-Id: <20170120151738.9584-2-jlayton@redhat.com>
In-Reply-To: <20170120151738.9584-1-jlayton@redhat.com>
References: <20170120151738.9584-1-jlayton@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

When a Ceph volume hits capacity, a flag is set in the OSD map to
indicate that, and a new map is sprayed around the cluster. When the
cephfs client sees that, we want it to shut down any in-progress OSD
writes with an -ENOSPC error, as they will otherwise just hang.

Add a callback to the osdc that gets called on map updates, and a small
API to register the callback.
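To illustrate the intended use (this is not part of the patch): a consumer
such as the cephfs client could register a callback that fails pending
writes with -ENOSPC once the map reports the cluster as full. The
handle_osdmap_full() callback and the example registration helper below are
hypothetical; only ceph_osdc_register_map_cb(), ceph_osdc_complete_writes(),
ceph_osdmap_flag() and CEPH_OSDMAP_FULL come from this patch or the existing
osd_client code. The callback is invoked from ceph_osdc_handle_map() with
osdc->lock held for write, which satisfies the lockdep assertion in
ceph_osdc_complete_writes().

#include <linux/ceph/osd_client.h>

/* Hypothetical consumer, for illustration only. */
static void handle_osdmap_full(struct ceph_osd_client *osdc, void *data)
{
	/* Invoked under osdc->lock (write-held) from ceph_osdc_handle_map(). */
	if (ceph_osdmap_flag(osdc, CEPH_OSDMAP_FULL)) {
		u32 epoch = ceph_osdc_complete_writes(osdc, -ENOSPC);

		if (epoch)
			pr_info("failed writes up to osdmap epoch %u\n", epoch);
	}
}

/* Hypothetical setup path, e.g. right after ceph_osdc_init(): */
static void example_register_map_cb(struct ceph_osd_client *osdc)
{
	ceph_osdc_register_map_cb(osdc, handle_osdmap_full, NULL);
}

Note that ceph_osdc_complete_writes() also checks per-pool full conditions
via pool_full(), so a real consumer would likely call it on every map update
rather than gating on CEPH_OSDMAP_FULL alone as the sketch does.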
[ jlayton: code style cleanup and adaptation to new osd msg handling ]

Signed-off-by: John Spray <jspray@redhat.com>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 include/linux/ceph/osd_client.h | 12 ++++++++++
 net/ceph/osd_client.c           | 50 +++++++++++++++++++++++++++++++++++++
 2 files changed, 62 insertions(+)

diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index 03a6653d329a..a5298c02bde4 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -21,6 +21,7 @@ struct ceph_osd_client;
 /*
  * completion callback for async writepages
  */
+typedef void (*ceph_osdc_map_callback_t)(struct ceph_osd_client *, void *);
 typedef void (*ceph_osdc_callback_t)(struct ceph_osd_request *);
 typedef void (*ceph_osdc_unsafe_callback_t)(struct ceph_osd_request *, bool);
 
@@ -289,6 +290,9 @@ struct ceph_osd_client {
 	struct ceph_msgpool	msgpool_op_reply;
 
 	struct workqueue_struct	*notify_wq;
+
+	ceph_osdc_map_callback_t	map_cb;
+	void				*map_p;
 };
 
 static inline bool ceph_osdmap_flag(struct ceph_osd_client *osdc, int flag)
@@ -391,6 +395,7 @@ extern void ceph_osdc_put_request(struct ceph_osd_request *req);
 extern int ceph_osdc_start_request(struct ceph_osd_client *osdc,
 				   struct ceph_osd_request *req,
 				   bool nofail);
+extern u32 ceph_osdc_complete_writes(struct ceph_osd_client *osdc, int r);
 extern void ceph_osdc_cancel_request(struct ceph_osd_request *req);
 extern int ceph_osdc_wait_request(struct ceph_osd_client *osdc,
 				  struct ceph_osd_request *req);
@@ -457,5 +462,12 @@ int ceph_osdc_list_watchers(struct ceph_osd_client *osdc,
 			    struct ceph_object_locator *oloc,
 			    struct ceph_watch_item **watchers,
 			    u32 *num_watchers);
+
+static inline void ceph_osdc_register_map_cb(struct ceph_osd_client *osdc,
+					     ceph_osdc_map_callback_t cb, void *data)
+{
+	osdc->map_cb = cb;
+	osdc->map_p = data;
+}
 
 #endif
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 3a2417bb6ff0..0562ea76c772 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -18,6 +18,7 @@
 #include
 #include
 #include
+#include
 
 #define OSD_OPREPLY_FRONT_LEN	512
 
@@ -1771,6 +1772,51 @@ static void complete_request(struct ceph_osd_request *req, int err)
 	ceph_osdc_put_request(req);
 }
 
+/*
+ * Drop all pending write/modify requests and complete
+ * them with the `r` as return code.
+ *
+ * Returns the highest OSD map epoch of a request that was
+ * cancelled, or 0 if none were cancelled.
+ */
+u32 ceph_osdc_complete_writes(struct ceph_osd_client *osdc, int r)
+{
+	struct ceph_osd_request *req;
+	struct ceph_osd *osd;
+	struct rb_node *m, *n;
+	u32 latest_epoch = 0;
+
+	lockdep_assert_held(&osdc->lock);
+
+	dout("enter complete_writes r=%d\n", r);
+
+	for (n = rb_first(&osdc->osds); n; n = rb_next(n)) {
+		osd = rb_entry(n, struct ceph_osd, o_node);
+		m = rb_first(&osd->o_requests);
+		mutex_lock(&osd->lock);
+		while (m) {
+			req = rb_entry(m, struct ceph_osd_request, r_node);
+			m = rb_next(m);
+
+			if (req->r_flags & CEPH_OSD_FLAG_WRITE &&
+			    (ceph_osdmap_flag(osdc, CEPH_OSDMAP_FULL) ||
+			     pool_full(osdc, req->r_t.base_oloc.pool))) {
+				u32 cur_epoch = le32_to_cpu(req->r_replay_version.epoch);
+
+				dout("%s: complete tid=%llu flags 0x%x\n", __func__, req->r_tid, req->r_flags);
+				complete_request(req, r);
+				if (cur_epoch > latest_epoch)
+					latest_epoch = cur_epoch;
+			}
+		}
+		mutex_unlock(&osd->lock);
+	}
+
+	dout("return complete_writes latest_epoch=%u\n", latest_epoch);
+	return latest_epoch;
+}
+EXPORT_SYMBOL(ceph_osdc_complete_writes);
+
 static void cancel_map_check(struct ceph_osd_request *req)
 {
 	struct ceph_osd_client *osdc = req->r_osdc;
@@ -3286,6 +3332,8 @@ void ceph_osdc_handle_map(struct ceph_osd_client *osdc, struct ceph_msg *msg)
 
 	ceph_monc_got_map(&osdc->client->monc, CEPH_SUB_OSDMAP,
 			  osdc->osdmap->epoch);
+	if (osdc->map_cb)
+		osdc->map_cb(osdc, osdc->map_p);
 	up_write(&osdc->lock);
 	wake_up_all(&osdc->client->auth_wq);
 	return;
@@ -4090,6 +4138,8 @@ int ceph_osdc_init(struct ceph_osd_client *osdc, struct ceph_client *client)
 	osdc->linger_requests = RB_ROOT;
 	osdc->map_checks = RB_ROOT;
 	osdc->linger_map_checks = RB_ROOT;
+	osdc->map_cb = NULL;
+	osdc->map_p = NULL;
 	INIT_DELAYED_WORK(&osdc->timeout_work, handle_timeout);
 	INIT_DELAYED_WORK(&osdc->osds_timeout_work, handle_osds_timeout);