From patchwork Thu Mar 30 18:07:03 2017
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 9654923
From: Jeff Layton
To: idryomov@gmail.com, zyan@redhat.com, sage@redhat.com
Cc: jspray@redhat.com, ceph-devel@vger.kernel.org
Subject: [PATCH v6 3/7] libceph: abort already submitted but abortable requests when map or pool goes full
Date: Thu, 30 Mar 2017 14:07:03 -0400
Message-Id: <20170330180707.11137-3-jlayton@redhat.com>
In-Reply-To: <20170330180707.11137-1-jlayton@redhat.com>
References: <20170330180546.11021-1-jlayton@redhat.com>
 <20170330180707.11137-1-jlayton@redhat.com>
List-ID: <ceph-devel@vger.kernel.org>

When a Ceph volume hits capacity, a flag is set in the OSD map to
indicate that, and a new map is sprayed around the cluster. With cephfs
we want it to shut down any abortable requests that are in progress
with an -ENOSPC error as they'd just hang otherwise.

Add a new ceph_osdc_abort_on_full helper function to handle this.
It will first check whether there is an out-of-space condition in the
cluster and then walk the tree and abort any request that has
r_abort_on_full set with a -ENOSPC error.

Call this new function directly whenever we get a new OSD map.

Reviewed-by: "Yan, Zheng"
Signed-off-by: Jeff Layton
Reviewed-by: Ilya Dryomov
---
 net/ceph/osd_client.c | 35 +++++++++++++++++++++++++++++++++++
 1 file changed, 35 insertions(+)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index 781048990599..4e56cd1ec265 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1806,6 +1806,40 @@ static void abort_request(struct ceph_osd_request *req, int err)
 	complete_request(req, err);
 }
 
+/*
+ * Drop all pending requests that are stalled waiting on a full condition to
+ * clear, and complete them with ENOSPC as the return code.
+ */
+static void ceph_osdc_abort_on_full(struct ceph_osd_client *osdc)
+{
+	struct rb_node *n;
+	bool osdmap_full = ceph_osdmap_flag(osdc, CEPH_OSDMAP_FULL);
+
+	dout("enter abort_on_full\n");
+
+	if (!osdmap_full && !have_pool_full(osdc))
+		goto out;
+
+	for (n = rb_first(&osdc->osds); n; n = rb_next(n)) {
+		struct ceph_osd *osd = rb_entry(n, struct ceph_osd, o_node);
+		struct rb_node *m;
+
+		m = rb_first(&osd->o_requests);
+		while (m) {
+			struct ceph_osd_request *req = rb_entry(m,
+					struct ceph_osd_request, r_node);
+			m = rb_next(m);
+
+			if (req->r_abort_on_full &&
+			    (ceph_osdmap_flag(osdc, CEPH_OSDMAP_FULL) ||
+			     pool_full(osdc, req->r_t.target_oloc.pool)))
+				abort_request(req, -ENOSPC);
+		}
+	}
+out:
+	dout("return abort_on_full\n");
+}
+
 static void check_pool_dne(struct ceph_osd_request *req)
 {
 	struct ceph_osd_client *osdc = req->r_osdc;
@@ -3264,6 +3298,7 @@ void ceph_osdc_handle_map(struct ceph_osd_client *osdc, struct ceph_msg *msg)
 
 	kick_requests(osdc, &need_resend, &need_resend_linger);
 
+	ceph_osdc_abort_on_full(osdc);
 	ceph_monc_got_map(&osdc->client->monc, CEPH_SUB_OSDMAP,
 			  osdc->osdmap->epoch);
 	up_write(&osdc->lock);
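
For reference, a minimal caller-side sketch (illustrative only, not part of
this patch) of how a submitter opts in to this behaviour. It assumes the
r_abort_on_full flag introduced earlier in this series and an already
prepared ceph_osd_request; submit_abortable_request() is a hypothetical
wrapper, while ceph_osdc_start_request() and ceph_osdc_wait_request() are
the existing libceph submission and wait helpers:

#include <linux/ceph/osd_client.h>

/*
 * Hypothetical wrapper, for illustration only: mark a request as abortable
 * before submitting it, so that ceph_osdc_abort_on_full() can complete it
 * with -ENOSPC when a later OSD map marks the cluster or the target pool
 * full, instead of leaving it to hang.
 */
static int submit_abortable_request(struct ceph_osd_client *osdc,
				    struct ceph_osd_request *req)
{
	/* Opt in to abort-on-full handling (flag added earlier in this series). */
	req->r_abort_on_full = true;

	/* Submit without nofail semantics. */
	ceph_osdc_start_request(osdc, req, false);

	/* Returns -ENOSPC if the request was aborted on a full map/pool. */
	return ceph_osdc_wait_request(osdc, req);
}

Synchronous waiters are shown here for brevity; asynchronous submitters see
the same -ENOSPC through their completion callback once abort_request()
completes the request.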