From patchwork Mon Jan 8 20:21:18 2018
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Michael Lyle
X-Patchwork-Id: 10150559
From: Michael Lyle
To: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org
Cc: axboe@fb.com, Rui Hua, Michael Lyle
Subject: [416 PATCH 01/13] bcache: ret IOERR when read meets metadata error
Date: Mon, 8 Jan 2018 12:21:18 -0800
Message-Id: <20180108202130.31303-2-mlyle@lyle.org>
X-Mailer: git-send-email 2.14.1
In-Reply-To: <20180108202130.31303-1-mlyle@lyle.org>
References: <20180108202130.31303-1-mlyle@lyle.org>
Sender: linux-block-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-block@vger.kernel.org

From: Rui Hua

A read request might hit an error while searching the btree, but the
error was not handled in cache_lookup(). This kind of metadata failure
never goes into cached_dev_read_error(), so the upper layer ends up
receiving bi_status=0.

This patch detects the metadata error from the return value of
bch_btree_map_keys(). Two paths can give rise to the error:

1. Because the btree is not entirely cached in memory, we may get an
   error when reading a btree node from the cache device (see
   bch_btree_node_get()). The likely errnos are -EIO and -ENOMEM.

2. When a read miss happens, bch_btree_insert_check_key() is called to
   insert a "replace_key" into the btree (see cached_dev_cache_miss();
   this is preparatory work before inserting the missed data into the
   cache device). A failure can also happen here; the likely errno is
   -ENOMEM.

bch_btree_map_keys() returns MAP_DONE in the normal case, but returns
either -EIO or -ENOMEM in the two cases above. If this happens, we
should NOT recover data from the backing device (when the cache device
is dirty), because we don't know whether the bkeys covered by the read
request are all clean.

When this happens, s->iop.status still has its initial value (0) from
before we submit s->bio.bio, so we set it to BLK_STS_IOERR. The request
can then go into cached_dev_read_error(), where the error is finally
either passed to the upper layer or recovered by rereading from the
backing device.

[edit by mlyle: patch formatting, word-wrap, comment spelling, commit log
format]

Signed-off-by: Hua Rui
Reviewed-by: Michael Lyle
Signed-off-by: Michael Lyle
---
 drivers/md/bcache/request.c | 22 ++++++++++++++++++++++
 1 file changed, 22 insertions(+)

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index c493fb947dc9..52b4ce24f9e2 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -576,6 +576,7 @@ static void cache_lookup(struct closure *cl)
 {
 	struct search *s = container_of(cl, struct search, iop.cl);
 	struct bio *bio = &s->bio.bio;
+	struct cached_dev *dc;
 	int ret;
 
 	bch_btree_op_init(&s->op, -1);
@@ -588,6 +589,27 @@ static void cache_lookup(struct closure *cl)
 		return;
 	}
 
+	/*
+	 * We might meet err when searching the btree, If that happens, we will
+	 * get negative ret, in this scenario we should not recover data from
+	 * backing device (when cache device is dirty) because we don't know
+	 * whether bkeys the read request covered are all clean.
+	 *
+	 * And after that happened, s->iop.status is still its initial value
+	 * before we submit s->bio.bio
+	 */
+	if (ret < 0) {
+		BUG_ON(ret == -EINTR);
+		if (s->d && s->d->c &&
+				!UUID_FLASH_ONLY(&s->d->c->uuids[s->d->id])) {
+			dc = container_of(s->d, struct cached_dev, disk);
+			if (dc && atomic_read(&dc->has_dirty))
+				s->recoverable = false;
+		}
+		if (!s->iop.status)
+			s->iop.status = BLK_STS_IOERR;
+	}
+
 	closure_return(cl);
 }
 
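
For readers who want to reason about the new branch outside the kernel, the
decision it makes can be modelled in plain userspace C. This is only a rough
sketch, not the kernel code: struct model_search, the MODEL_STS_* values and
model_handle_lookup_ret() are hypothetical names invented for illustration,
the numeric status values are arbitrary, and the NULL checks on s->d and
s->d->c from the patch are assumed to have already passed.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-ins for blk_status_t values; the numbers are arbitrary. */
enum model_status { MODEL_STS_OK = 0, MODEL_STS_IOERR = 1 };

/* Hypothetical, simplified view of the state the new branch consults. */
struct model_search {
	bool flash_only;	/* models UUID_FLASH_ONLY() for the device */
	bool has_dirty;		/* models atomic_read(&dc->has_dirty) */
	bool recoverable;	/* models s->recoverable */
	int status;		/* models s->iop.status */
};

/* Apply the patch's handling to the return value of the btree lookup. */
static void model_handle_lookup_ret(struct model_search *s, int ret)
{
	if (ret >= 0)
		return;			/* no metadata error: leave state alone */

	assert(ret != -EINTR);		/* mirrors BUG_ON(ret == -EINTR) */

	/* A dirty cached device cannot be safely reread from backing store. */
	if (!s->flash_only && s->has_dirty)
		s->recoverable = false;

	/* Only overwrite the status if no error was recorded earlier. */
	if (s->status == MODEL_STS_OK)
		s->status = MODEL_STS_IOERR;
}
```

In this model, a failed lookup (-EIO or -ENOMEM) on a dirty cached device
clears recoverable and sets the error status, while the same failure on a
clean device only sets the error status, so a reread from the backing device
remains possible.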