From patchwork Mon Mar 19 00:36:25 2018
X-Patchwork-Submitter: Michael Lyle
X-Patchwork-Id: 10291477
From: Michael Lyle
To: linux-bcache@vger.kernel.org, linux-block@vger.kernel.org
Cc: axboe@fb.com, Coly Li, Michael Lyle, Junhui Tang
Subject: [for-4.17 12/20] bcache: add io_disable to struct cached_dev
Date: Sun, 18 Mar 2018 17:36:25 -0700
Message-Id: <20180319003633.27225-13-mlyle@lyle.org>
In-Reply-To: <20180319003633.27225-1-mlyle@lyle.org>
References: <20180319003633.27225-1-mlyle@lyle.org>

From: Coly Li

If a bcache device is configured in writeback mode, the current code does
not handle write I/O errors on backing devices properly.

In writeback mode, a write request is written to the cache device and
later flushed to the backing device. If I/O fails when writing from the
cache device to the backing device, bcache just ignores the error, and
upper layer code is NOT notified that the backing device is broken.

This patch tries to handle backing device failure in the same way as
cache device failure:
- Add an error counter 'io_errors' and an error limit 'error_limit' to
  struct cached_dev. Add another flag, io_disable, to struct cached_dev
  to disable I/Os on the problematic backing device.
- When an I/O error happens on the backing device, increase the
  io_errors counter. If io_errors reaches error_limit, set
  cached_dev->io_disable to true and stop the bcache device.

The result is that if the backing device is broken or disconnected, and
I/O errors reach the error limit, the backing device is disabled and the
associated bcache device is removed from the system.

Changelog:
v2: remove "bcache: " prefix in pr_err(), and use correct name string to
    print out bcache device gendisk name.
v1: this patch is newly added in the v2 patch set.

Signed-off-by: Coly Li
Reviewed-by: Hannes Reinecke
Reviewed-by: Michael Lyle
Cc: Michael Lyle
Cc: Junhui Tang
---
 drivers/md/bcache/bcache.h  |  6 ++++++
 drivers/md/bcache/io.c      | 14 ++++++++++++++
 drivers/md/bcache/request.c | 14 ++++++++++++--
 drivers/md/bcache/super.c   | 21 +++++++++++++++++++++
 drivers/md/bcache/sysfs.c   | 15 ++++++++++++++-
 5 files changed, 67 insertions(+), 3 deletions(-)

diff --git a/drivers/md/bcache/bcache.h b/drivers/md/bcache/bcache.h
index 5e9f3610c6fd..d338b7086013 100644
--- a/drivers/md/bcache/bcache.h
+++ b/drivers/md/bcache/bcache.h
@@ -367,6 +367,7 @@ struct cached_dev {
 	unsigned		sequential_cutoff;
 	unsigned		readahead;
 
+	unsigned		io_disable:1;
 	unsigned		verify:1;
 	unsigned		bypass_torture_test:1;
 
@@ -388,6 +389,9 @@ struct cached_dev {
 	unsigned		writeback_rate_minimum;
 
 	enum stop_on_failure	stop_when_cache_set_failed;
+#define DEFAULT_CACHED_DEV_ERROR_LIMIT	64
+	atomic_t		io_errors;
+	unsigned		error_limit;
 };
 
 enum alloc_reserve {
@@ -911,6 +915,7 @@ static inline void wait_for_kthread_stop(void)
 
 /* Forward declarations */
 
+void bch_count_backing_io_errors(struct cached_dev *dc, struct bio *bio);
 void bch_count_io_errors(struct cache *, blk_status_t, int, const char *);
 void bch_bbio_count_io_errors(struct cache_set *, struct bio *,
 			      blk_status_t, const char *);
@@ -938,6 +943,7 @@ int bch_bucket_alloc_set(struct cache_set *, unsigned,
 			 struct bkey *, int, bool);
 bool bch_alloc_sectors(struct cache_set *, struct bkey *, unsigned,
 		       unsigned, unsigned, bool);
+bool bch_cached_dev_error(struct cached_dev *dc);
 
 __printf(2, 3)
 bool bch_cache_set_error(struct cache_set *, const char *, ...);
diff --git a/drivers/md/bcache/io.c b/drivers/md/bcache/io.c
index 8013ecbcdbda..7fac97ae036e 100644
--- a/drivers/md/bcache/io.c
+++ b/drivers/md/bcache/io.c
@@ -50,6 +50,20 @@ void bch_submit_bbio(struct bio *bio, struct cache_set *c,
 }
 
 /* IO errors */
+void bch_count_backing_io_errors(struct cached_dev *dc, struct bio *bio)
+{
+	char buf[BDEVNAME_SIZE];
+	unsigned errors;
+
+	WARN_ONCE(!dc, "NULL pointer of struct cached_dev");
+
+	errors = atomic_add_return(1, &dc->io_errors);
+	if (errors < dc->error_limit)
+		pr_err("%s: IO error on backing device, unrecoverable",
+			bio_devname(bio, buf));
+	else
+		bch_cached_dev_error(dc);
+}
 
 void bch_count_io_errors(struct cache *ca,
 			 blk_status_t error,
diff --git a/drivers/md/bcache/request.c b/drivers/md/bcache/request.c
index b4a5768afbe9..5a82237c7025 100644
--- a/drivers/md/bcache/request.c
+++ b/drivers/md/bcache/request.c
@@ -637,6 +637,8 @@ static void backing_request_endio(struct bio *bio)
 
 	if (bio->bi_status) {
 		struct search *s = container_of(cl, struct search, cl);
+		struct cached_dev *dc = container_of(s->d,
+						     struct cached_dev, disk);
 		/*
 		 * If a bio has REQ_PREFLUSH for writeback mode, it is
 		 * speically assembled in cached_dev_write() for a non-zero
@@ -657,6 +659,7 @@ static void backing_request_endio(struct bio *bio)
 		}
 		s->recoverable = false;
 		/* should count I/O error for backing device here */
+		bch_count_backing_io_errors(dc, bio);
 	}
 
 	bio_put(bio);
@@ -1065,8 +1068,14 @@ static void detached_dev_end_io(struct bio *bio)
 			    bio_data_dir(bio),
 			    &ddip->d->disk->part0, ddip->start_time);
 
+	if (bio->bi_status) {
+		struct cached_dev *dc = container_of(ddip->d,
+						     struct cached_dev, disk);
+		/* should count I/O error for backing device here */
+		bch_count_backing_io_errors(dc, bio);
+	}
+
 	kfree(ddip);
-
 	bio->bi_end_io(bio);
 }
 
@@ -1105,7 +1114,8 @@ static blk_qc_t cached_dev_make_request(struct request_queue *q,
 	struct cached_dev *dc = container_of(d, struct cached_dev, disk);
 	int rw = bio_data_dir(bio);
 
-	if (unlikely(d->c && test_bit(CACHE_SET_IO_DISABLE, &d->c->flags))) {
+	if (unlikely((d->c && test_bit(CACHE_SET_IO_DISABLE, &d->c->flags)) ||
+		     dc->io_disable)) {
 		bio->bi_status = BLK_STS_IOERR;
 		bio_endio(bio);
 		return BLK_QC_T_NONE;
diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index c9fea035065d..640ff8072ed8 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1208,6 +1208,9 @@ static int cached_dev_init(struct cached_dev *dc, unsigned block_size)
 		max(dc->disk.disk->queue->backing_dev_info->ra_pages,
 		    q->backing_dev_info->ra_pages);
 
+	atomic_set(&dc->io_errors, 0);
+	dc->io_disable = false;
+	dc->error_limit = DEFAULT_CACHED_DEV_ERROR_LIMIT;
 	/* default to auto */
 	dc->stop_when_cache_set_failed = BCH_CACHED_DEV_STOP_AUTO;
 
@@ -1362,6 +1365,24 @@ int bch_flash_dev_create(struct cache_set *c, uint64_t size)
 	return flash_dev_run(c, u);
 }
 
+bool bch_cached_dev_error(struct cached_dev *dc)
+{
+	char name[BDEVNAME_SIZE];
+
+	if (!dc || test_bit(BCACHE_DEV_CLOSING, &dc->disk.flags))
+		return false;
+
+	dc->io_disable = true;
+	/* make others know io_disable is true earlier */
+	smp_mb();
+
+	pr_err("stop %s: too many IO errors on backing device %s\n",
+		dc->disk.disk->disk_name, bdevname(dc->bdev, name));
+
+	bcache_device_stop(&dc->disk);
+	return true;
+}
+
 /* Cache set */
 
 __printf(2, 3)
diff --git a/drivers/md/bcache/sysfs.c b/drivers/md/bcache/sysfs.c
index 8c3fd05db87a..dfeef583ee50 100644
--- a/drivers/md/bcache/sysfs.c
+++ b/drivers/md/bcache/sysfs.c
@@ -141,7 +141,9 @@ SHOW(__bch_cached_dev)
 	var_print(writeback_delay);
 	var_print(writeback_percent);
 	sysfs_hprint(writeback_rate,	dc->writeback_rate.rate << 9);
-
+	sysfs_hprint(io_errors,		atomic_read(&dc->io_errors));
+	sysfs_printf(io_error_limit,	"%i", dc->error_limit);
+	sysfs_printf(io_disable,	"%i", dc->io_disable);
 	var_print(writeback_rate_update_seconds);
 	var_print(writeback_rate_i_term_inverse);
 	var_print(writeback_rate_p_term_inverse);
@@ -232,6 +234,14 @@ STORE(__cached_dev)
 	d_strtoul(writeback_rate_i_term_inverse);
 	d_strtoul_nonzero(writeback_rate_p_term_inverse);
 
+	sysfs_strtoul_clamp(io_error_limit, dc->error_limit, 0, INT_MAX);
+
+	if (attr == &sysfs_io_disable) {
+		int v = strtoul_or_return(buf);
+
+		dc->io_disable = v ? 1 : 0;
+	}
+
 	d_strtoi_h(sequential_cutoff);
 	d_strtoi_h(readahead);
 
@@ -352,6 +362,9 @@ static struct attribute *bch_cached_dev_files[] = {
 	&sysfs_writeback_rate_i_term_inverse,
 	&sysfs_writeback_rate_p_term_inverse,
 	&sysfs_writeback_rate_debug,
+	&sysfs_errors,
+	&sysfs_io_error_limit,
+	&sysfs_io_disable,
 	&sysfs_dirty_data,
 	&sysfs_stripe_size,
 	&sysfs_partial_stripes_expensive,
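
Illustration (not part of the patch): the error accounting described in the
commit message can be modelled in user space roughly as follows. This is
only a sketch under stated assumptions. The names backing_dev_model,
model_init(), model_count_io_error() and model_submit_allowed() are
hypothetical, and C11 atomics stand in for the kernel's atomic_t and
smp_mb(). Stopping the bcache device and logging are left out.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for the three fields added to struct cached_dev. */
struct backing_dev_model {
	atomic_uint	io_errors;	/* mirrors dc->io_errors */
	unsigned int	error_limit;	/* mirrors dc->error_limit */
	atomic_bool	io_disable;	/* mirrors dc->io_disable */
};

/* Same value as DEFAULT_CACHED_DEV_ERROR_LIMIT in the patch. */
#define MODEL_DEFAULT_ERROR_LIMIT 64

static inline void model_init(struct backing_dev_model *m, unsigned int limit)
{
	atomic_init(&m->io_errors, 0);
	atomic_init(&m->io_disable, false);
	m->error_limit = limit;
}

/*
 * Count one failed backing-device I/O. Returns true once the counter has
 * reached error_limit, which is the point where the patch calls
 * bch_cached_dev_error(). Below the limit the patch only logs the error.
 */
static inline bool model_count_io_error(struct backing_dev_model *m)
{
	/* atomic_fetch_add() returns the old value, so add 1 for the new one */
	unsigned int errors = atomic_fetch_add(&m->io_errors, 1) + 1;

	if (errors < m->error_limit)
		return false;

	/* sequentially consistent store, playing the role of smp_mb() */
	atomic_store(&m->io_disable, true);
	return true;
}

/* Model of the check added to cached_dev_make_request(). */
static inline bool model_submit_allowed(struct backing_dev_model *m)
{
	return !atomic_load(&m->io_disable);
}

/* Small self-check: with a limit of 3, the third error disables I/O. */
static inline void model_self_check(void)
{
	struct backing_dev_model m;

	model_init(&m, 3);
	assert(!model_count_io_error(&m));
	assert(!model_count_io_error(&m));
	assert(model_submit_allowed(&m));
	assert(model_count_io_error(&m));
	assert(!model_submit_allowed(&m));
}
```

Because the comparison is "errors < error_limit", a limit of 0 disables
the device on the first error in this model.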