diff mbox series

[09/18] bcache: add failure check to run_cache_set() for journal replay

Message ID 20190424164843.29535-10-colyli@suse.de (mailing list archive)
State New, archived
Headers show
Series bcache patches for Linux v5.2 | expand

Commit Message

Coly Li April 24, 2019, 4:48 p.m. UTC
Currently run_cache_set() has no return value, if there is failure in
bch_journal_replay(), the caller of run_cache_set() has no idea about
such failure and just continue to execute following code after
run_cache_set().  The internal failure is triggered inside
bch_journal_replay() and being handled in async way. This behavior is
inefficient, while failure handling inside bch_journal_replay(), cache
register code is still running to start the cache set. Registering and
unregistering code running as same time may introduce some rare race
condition, and make the code to be more hard to be understood.

This patch adds return value to run_cache_set(), and returns -EIO if
bch_journal_rreplay() fails. Then caller of run_cache_set() may detect
such failure and stop registering code flow immedidately inside
register_cache_set().

If journal replay fails, run_cache_set() can report error immediately
to register_cache_set(). This patch makes the failure handling for
bch_journal_replay() be in synchronized way, easier to understand and
debug, and avoid poetential race condition for register-and-unregister
in same time.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/super.c | 17 ++++++++++++-----
 1 file changed, 12 insertions(+), 5 deletions(-)

Comments

Hannes Reinecke April 25, 2019, 6:01 a.m. UTC | #1
On 4/24/19 6:48 PM, Coly Li wrote:
> Currently run_cache_set() has no return value, if there is failure in
> bch_journal_replay(), the caller of run_cache_set() has no idea about
> such failure and just continue to execute following code after
> run_cache_set().  The internal failure is triggered inside
> bch_journal_replay() and being handled in async way. This behavior is
> inefficient, while failure handling inside bch_journal_replay(), cache
> register code is still running to start the cache set. Registering and
> unregistering code running as same time may introduce some rare race
> condition, and make the code to be more hard to be understood.
> 
> This patch adds return value to run_cache_set(), and returns -EIO if
> bch_journal_rreplay() fails. Then caller of run_cache_set() may detect
> such failure and stop registering code flow immedidately inside
> register_cache_set().
> 
> If journal replay fails, run_cache_set() can report error immediately
> to register_cache_set(). This patch makes the failure handling for
> bch_journal_replay() be in synchronized way, easier to understand and
> debug, and avoid poetential race condition for register-and-unregister
> in same time.
> 
> Signed-off-by: Coly Li <colyli@suse.de>
> ---
>   drivers/md/bcache/super.c | 17 ++++++++++++-----
>   1 file changed, 12 insertions(+), 5 deletions(-)
> 
Reviewed-by: Hannes Reinecke <hare@suse.com>

Cheers,

Hannes
diff mbox series

Patch

diff --git a/drivers/md/bcache/super.c b/drivers/md/bcache/super.c
index 53c5e3e0ac22..8c7fdada0acf 100644
--- a/drivers/md/bcache/super.c
+++ b/drivers/md/bcache/super.c
@@ -1773,7 +1773,7 @@  struct cache_set *bch_cache_set_alloc(struct cache_sb *sb)
 	return NULL;
 }
 
-static void run_cache_set(struct cache_set *c)
+static int run_cache_set(struct cache_set *c)
 {
 	const char *err = "cannot allocate memory";
 	struct cached_dev *dc, *t;
@@ -1867,7 +1867,9 @@  static void run_cache_set(struct cache_set *c)
 		if (j->version < BCACHE_JSET_VERSION_UUID)
 			__uuid_write(c);
 
-		bch_journal_replay(c, &journal);
+		err = "bcache: replay journal failed";
+		if (bch_journal_replay(c, &journal))
+			goto err;
 	} else {
 		pr_notice("invalidating existing data");
 
@@ -1935,11 +1937,13 @@  static void run_cache_set(struct cache_set *c)
 	flash_devs_run(c);
 
 	set_bit(CACHE_SET_RUNNING, &c->flags);
-	return;
+	return 0;
 err:
 	closure_sync(&cl);
 	/* XXX: test this, it's broken */
 	bch_cache_set_error(c, "%s", err);
+
+	return -EIO;
 }
 
 static bool can_attach_cache(struct cache *ca, struct cache_set *c)
@@ -2003,8 +2007,11 @@  static const char *register_cache_set(struct cache *ca)
 	ca->set->cache[ca->sb.nr_this_dev] = ca;
 	c->cache_by_alloc[c->caches_loaded++] = ca;
 
-	if (c->caches_loaded == c->sb.nr_in_set)
-		run_cache_set(c);
+	if (c->caches_loaded == c->sb.nr_in_set) {
+		err = "failed to run cache set";
+		if (run_cache_set(c) < 0)
+			goto err;
+	}
 
 	return NULL;
 err: