From patchwork Wed Sep 20 19:38:59 2017
X-Patchwork-Submitter: Coly Li
X-Patchwork-Id: 9962183
Subject: Re: [PATCHv2] bcache: option for allow stale data on read failure
To: Kent Overstreet
Cc: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org, Nix, Kai Krakow, Eric Wheeler, Junhui Tang, stable@vger.kernel.org
References: <20170919222433.24336-1-colyli@suse.de> <20170920160735.jp4riq7x3qc472px@kmo-pixel>
From: Coly Li
Date: Wed, 20 Sep 2017 21:38:59 +0200
In-Reply-To: <20170920160735.jp4riq7x3qc472px@kmo-pixel>
X-Mailing-List: linux-block@vger.kernel.org

On 2017/9/20 6:07 PM, Kent Overstreet wrote:
> On Wed, Sep 20, 2017 at 06:24:33AM +0800, Coly Li wrote:
>> When bcache does read I/Os, for example in writeback or writethrough mode,
>> if a read request on the cache device fails, bcache will try to recover
>> the request by reading from the cached device. If the data on the cached
>> device is not synced with the cache device, the requester will get stale data.
>>
>> For critical storage systems like databases, providing stale data from
>> recovery may result in application-level data corruption, which is
>> unacceptable. But for some other situations like a multi-media stream cache,
>> continuous service may be more important and it is acceptable to fetch
>> a chunk of stale data.
>>
>> This patch tries to solve the above conflict by adding a sysfs option
>> /sys/block/bcache/bcache/allow_stale_data_on_failure
>> which is cleared (set to 0) by default, i.e. disabled. Now people can make
>> choices for different situations.
>
> IMO this is just a bug, I'd rather not have an option to keep the buggy
> behaviour. How about this patch:
>

Hi Kent,

OK. When I last discussed this with other bcache developers, people wanted
to keep this behavior, so I made it an option in this version of the patch.
I support fixing it without an option, because there are too many options
already.
Good to know you came to a similar decision :-)

> commit 2746f9c1f962288d8c5d7dabe698bf7b3fddd405
> Author: Kent Overstreet
> Date:   Wed Sep 20 18:06:37 2017 +0200
>
>     bcache: Don't recover from IO errors when reading dirty data
>
>     Signed-off-by: Kent Overstreet
>
> diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
> index 382397772a..c2d57ef953 100644
> --- a/drivers/md/bcache/request.c
> +++ b/drivers/md/bcache/request.c
> @@ -532,8 +532,10 @@ static int cache_lookup_fn(struct btree_op *op, struct btree *b, struct bkey *k)
>
>  	PTR_BUCKET(b->c, k, ptr)->prio = INITIAL_PRIO;
>
> -	if (KEY_DIRTY(k))
> +	if (KEY_DIRTY(k)) {
>  		s->read_dirty_data = true;
> +		s->recoverable = false;
> +	}
>

I thought of fixing it here. The reason I gave up on modifying this spot is
that cache_lookup_fn() is only called for keys in leaf nodes (b->level == 0),
and bch_btree_map_keys_recurse() needs to do I/O to fetch the upper-level
nodes before it can access a leaf node. When an SSD has failed,
bch_btree_node_get() fails before cache_lookup_fn() is ever executed. So with
your patch there is no chance to set s->recoverable to false, and recovery
still happens.

If you don't like an option, the following modification should be much
simpler. It is the simplest way I know of for now.

Thanks.

Coly Li

diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index 681b4f12b05a..f397785d9c38 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -697,8 +697,10 @@ static void cached_dev_read_error(struct closure *cl)
 {
 	struct search *s = container_of(cl, struct search, cl);
 	struct bio *bio = &s->bio.bio;
+	struct cached_dev *dc = container_of(s->d, struct cached_dev, disk);
 
-	if (s->recoverable) {
+	if (s->recoverable &&
+	    (dc && !atomic_read(&dc->has_dirty))) {
 		/* Retry from the backing device: */
 		trace_bcache_read_retry(s->orig_bio);