[v1,05/10] bcache: stop dc->writeback_rate_update if cache set is stopping

Message ID 20180103140325.63175-6-colyli@suse.de (mailing list archive)
State New, archived

Commit Message

Coly Li Jan. 3, 2018, 2:03 p.m. UTC
struct delayed_work writeback_rate_update in struct cached_dev is a delayed
worker that periodically calls update_writeback_rate() (the interval is
defined by dc->writeback_rate_update_seconds).

When a metadata I/O error happens on the cache device, the bcache error
handling routine bch_cache_set_error() calls bch_cache_set_unregister() to
retire the whole cache set. On the unregister code path, cached_dev_free()
calls cancel_delayed_work_sync(&dc->writeback_rate_update) to stop this
delayed work.

dc->writeback_rate_update differs from the other delayed works in bcache:
its routine update_writeback_rate() re-arms the delayed work each time it
runs. That means that after cancel_delayed_work_sync() returns, this
delayed work can still be executed dc->writeback_rate_update_seconds
seconds later.

The problem is that after cancel_delayed_work_sync() returns, the cache set
unregister code path eventually releases the memory of struct cache_set.
When the re-armed delayed work then runs, update_writeback_rate() accesses
the already released cache set through a NULL pointer, and a NULL pointer
dereference panic is triggered.

In order to avoid the above problem, this patch checks the cache set flags
in the delayed work routine update_writeback_rate(). If the flag
CACHE_SET_STOPPING is set, the routine quits without re-arming the delayed
work, so the NULL pointer dereference panic won't happen after the cache
set is released.

Signed-off-by: Coly Li <colyli@suse.de>
---
 drivers/md/bcache/writeback.c | 9 +++++++++
 1 file changed, 9 insertions(+)
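
The flag check described in the commit message can be modelled outside the
kernel. The sketch below is a single-threaded userspace model, not kernel
code and not the workqueue API: every model_* name and the armed/runs fields
are hypothetical stand-ins, and the boolean stopping field stands in for the
CACHE_SET_STOPPING bit. It only illustrates that once the flag is set, the
routine runs at most one more time and does not re-arm itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-ins; not the kernel workqueue API. */
struct model_cache_set {
	bool stopping;		/* models the CACHE_SET_STOPPING bit */
};

struct model_cached_dev {
	struct model_cache_set *c;
	bool armed;		/* true while the periodic work is queued */
	unsigned int runs;	/* how many times the body did real work */
};

/* Models update_writeback_rate(): do the work, then re-arm unless stopping. */
static void model_update_writeback_rate(struct model_cached_dev *dc)
{
	struct model_cache_set *c = dc->c;

	assert(c != NULL);	/* the model never runs against a released set */
	dc->armed = false;	/* this invocation consumes the pending timer */

	if (c->stopping)	/* quit directly if the cache set is stopping */
		return;

	dc->runs++;		/* stands in for the rate calculation */

	if (c->stopping)	/* the flag may have been set in the meantime */
		return;

	dc->armed = true;	/* stands in for schedule_delayed_work() */
}

/* Fire the timer up to `ticks` times; returns how many invocations ran. */
static unsigned int model_run_ticks(struct model_cached_dev *dc,
				    unsigned int ticks)
{
	unsigned int fired = 0;

	while (ticks-- > 0 && dc->armed) {
		model_update_writeback_rate(dc);
		fired++;
	}
	return fired;
}
```

In this model the teardown path would set stopping before it waits for the
work and releases the cache set; the real kernel code additionally has to
deal with concurrency, which the sketch deliberately leaves out.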

Comments

Hannes Reinecke Jan. 8, 2018, 7:22 a.m. UTC | #1
On 01/03/2018 03:03 PM, Coly Li wrote:
> struct delayed_work writeback_rate_update in struct cached_dev is a delayed
> worker that periodically calls update_writeback_rate() (the interval is
> defined by dc->writeback_rate_update_seconds).
>
> When a metadata I/O error happens on the cache device, the bcache error
> handling routine bch_cache_set_error() calls bch_cache_set_unregister() to
> retire the whole cache set. On the unregister code path, cached_dev_free()
> calls cancel_delayed_work_sync(&dc->writeback_rate_update) to stop this
> delayed work.
>
> dc->writeback_rate_update differs from the other delayed works in bcache:
> its routine update_writeback_rate() re-arms the delayed work each time it
> runs. That means that after cancel_delayed_work_sync() returns, this
> delayed work can still be executed dc->writeback_rate_update_seconds
> seconds later.
>
> The problem is that after cancel_delayed_work_sync() returns, the cache set
> unregister code path eventually releases the memory of struct cache_set.
> When the re-armed delayed work then runs, update_writeback_rate() accesses
> the already released cache set through a NULL pointer, and a NULL pointer
> dereference panic is triggered.
>
> In order to avoid the above problem, this patch checks the cache set flags
> in the delayed work routine update_writeback_rate(). If the flag
> CACHE_SET_STOPPING is set, the routine quits without re-arming the delayed
> work, so the NULL pointer dereference panic won't happen after the cache
> set is released.
> 
> Signed-off-by: Coly Li <colyli@suse.de>
> ---
>  drivers/md/bcache/writeback.c | 9 +++++++++
>  1 file changed, 9 insertions(+)
> 
> diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
> index 0789a9e18337..745d9b2a326f 100644
> --- a/drivers/md/bcache/writeback.c
> +++ b/drivers/md/bcache/writeback.c
> @@ -91,6 +91,11 @@ static void update_writeback_rate(struct work_struct *work)
>  	struct cached_dev *dc = container_of(to_delayed_work(work),
>  					     struct cached_dev,
>  					     writeback_rate_update);
> +	struct cache_set *c = dc->disk.c;
> +
> +	/* quit directly if cache set is stopping */
> +	if (test_bit(CACHE_SET_STOPPING, &c->flags))
> +		return;
>  
>  	down_read(&dc->writeback_lock);
>  
> @@ -100,6 +105,10 @@ static void update_writeback_rate(struct work_struct *work)
>  
>  	up_read(&dc->writeback_lock);
>  
> +	/* do not schedule delayed work if cache set is stopping */
> +	if (test_bit(CACHE_SET_STOPPING, &c->flags))
> +		return;
> +
>  	schedule_delayed_work(&dc->writeback_rate_update,
>  			      dc->writeback_rate_update_seconds * HZ);
>  }
> 
This is actually not quite correct; the function might still be called
after 'struct cached_dev' has been removed.
The correct way of fixing is to either take a reference to struct
cached_dev and release it once 'update_writeback_rate' is finished, or
to call 'cancel_delayed_work_sync()' before deleting struct cached_dev.

Cheers,

Hannes
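
The first alternative suggested above, holding a reference for as long as the
work is queued, can be sketched in userspace as follows. This is a minimal
single-threaded model under stated assumptions, not the bcache
implementation: the model_* names and the freed_flag field are hypothetical,
and a plain int counter stands in for whatever atomic reference counting the
kernel code would use.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical stand-in for a reference-counted device structure. */
struct model_dev {
	int refcount;
	bool *freed_flag;	/* set when the object is released (for checks) */
};

static struct model_dev *model_dev_alloc(bool *freed_flag)
{
	struct model_dev *dev = malloc(sizeof(*dev));

	if (!dev)
		return NULL;
	dev->refcount = 1;	/* the owner's reference */
	dev->freed_flag = freed_flag;
	*freed_flag = false;
	return dev;
}

static void model_dev_get(struct model_dev *dev)
{
	assert(dev->refcount > 0);
	dev->refcount++;
}

/* Drop one reference; the last put releases the object. */
static void model_dev_put(struct model_dev *dev)
{
	assert(dev->refcount > 0);
	if (--dev->refcount == 0) {
		*dev->freed_flag = true;
		free(dev);
	}
}

/* Queueing the work takes a reference on behalf of the pending work. */
static void model_arm_work(struct model_dev *dev)
{
	model_dev_get(dev);
}

/* The work body may use dev safely, then drops the work's reference. */
static void model_work_fn(struct model_dev *dev)
{
	model_dev_put(dev);
}
```

With this scheme the teardown path only drops the owner's reference, and the
memory is released by whichever side calls model_dev_put() last.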
Coly Li Jan. 8, 2018, 4:01 p.m. UTC | #2
On 08/01/2018 3:22 PM, Hannes Reinecke wrote:
> This is actually not quite correct; the function might still be called
> after 'struct cached_dev' has been removed.
> The correct way of fixing is to either take a reference to struct
> cached_dev and release it once 'update_writeback_rate' is finished, or
> to call 'cancel_delayed_work_sync()' before deleting struct cached_dev.

Hi Hannes,

The problem is not cached_dev, it is cache_set. struct cache_set is
referenced in __update_writeback_rate(). The solution is similar to what
you suggested: call cancel_delayed_work_sync() before deleting struct
cache_set. Junhui posted another patch to fix the duplicated writeback
threads issue, and it fixes this problem too. Therefore just preventing
this kworker from re-arming itself should be enough, and my next patch,
which stops dc->writeback_thread and dc->writeback_rate_update, can be
ignored. Junhui's patch is already in bcache-for-next.

Thanks.

Coly Li

Patch

diff --git a/drivers/md/bcache/writeback.c b/drivers/md/bcache/writeback.c
index 0789a9e18337..745d9b2a326f 100644
--- a/drivers/md/bcache/writeback.c
+++ b/drivers/md/bcache/writeback.c
@@ -91,6 +91,11 @@  static void update_writeback_rate(struct work_struct *work)
 	struct cached_dev *dc = container_of(to_delayed_work(work),
 					     struct cached_dev,
 					     writeback_rate_update);
+	struct cache_set *c = dc->disk.c;
+
+	/* quit directly if cache set is stopping */
+	if (test_bit(CACHE_SET_STOPPING, &c->flags))
+		return;
 
 	down_read(&dc->writeback_lock);
 
@@ -100,6 +105,10 @@  static void update_writeback_rate(struct work_struct *work)
 
 	up_read(&dc->writeback_lock);
 
+	/* do not schedule delayed work if cache set is stopping */
+	if (test_bit(CACHE_SET_STOPPING, &c->flags))
+		return;
+
 	schedule_delayed_work(&dc->writeback_rate_update,
 			      dc->writeback_rate_update_seconds * HZ);
 }