diff mbox series

[1/2] btrfs: warn for any missed cleanup at btrfs_close_one_device

Message ID 584322021db1e182838b5dc9d90459850e5fcf36.1680619177.git.anand.jain@oracle.com (mailing list archive)
State New, archived
Headers show
Series minor cleanups for device close and free functions | expand

Commit Message

Anand Jain April 4, 2023, 2:55 p.m. UTC
During my recent search for the root cause of a reported bug, I realized
that it's a good idea to issue a warning for missed cleanup instead of
using debug-only assertions. Since most installations run with debug off,
missed cleanups and premature calls to close could go unnoticed. However,
these issues are serious enough to warrant reporting and fixing.

Signed-off-by: Anand Jain <anand.jain@oracle.com>
---
 fs/btrfs/volumes.c | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

David Sterba April 5, 2023, 5:37 p.m. UTC | #1
On Tue, Apr 04, 2023 at 10:55:11PM +0800, Anand Jain wrote:
> During my recent search for the root cause of a reported bug, I realized
> that it's a good idea to issue a warning for missed cleanup instead of
> using debug-only assertions. Since most installations run with debug off,
> missed cleanups and premature calls to close could go unnoticed. However,
> these issues are serious enough to warrant reporting and fixing.

There are other WARN_ONs checking for device state so it should be ok to
add some more.

> Signed-off-by: Anand Jain <anand.jain@oracle.com>
> ---
>  fs/btrfs/volumes.c | 8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
> index eead4a1f53b7..0e3677650a78 100644
> --- a/fs/btrfs/volumes.c
> +++ b/fs/btrfs/volumes.c
> @@ -1150,10 +1150,10 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>  	device->last_flush_error = 0;
>  
>  	/* Verify the device is back in a pristine state  */
> -	ASSERT(!test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
> -	ASSERT(!test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state));
> -	ASSERT(list_empty(&device->dev_alloc_list));
> -	ASSERT(list_empty(&device->post_commit_list));
> +	WARN_ON(test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
> +	WARN_ON(test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state));
> +	WARN_ON(!list_empty(&device->dev_alloc_list));
> +	WARN_ON(!list_empty(&device->post_commit_list));

I'm thinking if we can reset some of the values to a safe state too, not
keeping it until the next time the device is opened, because not
everything is reset at device open time.
Anand Jain April 6, 2023, 5:15 a.m. UTC | #2
On 4/6/23 01:37, David Sterba wrote:
> On Tue, Apr 04, 2023 at 10:55:11PM +0800, Anand Jain wrote:
>> During my recent search for the root cause of a reported bug, I realized
>> that it's a good idea to issue a warning for missed cleanup instead of
>> using debug-only assertions. Since most installations run with debug off,
>> missed cleanups and premature calls to close could go unnoticed. However,
>> these issues are serious enough to warrant reporting and fixing.
> 
> There are other WARN_ONs checking for device state so it should be ok to
> add some more.
> 
>> Signed-off-by: Anand Jain <anand.jain@oracle.com>
>> ---
>>   fs/btrfs/volumes.c | 8 ++++----
>>   1 file changed, 4 insertions(+), 4 deletions(-)
>>
>> diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
>> index eead4a1f53b7..0e3677650a78 100644
>> --- a/fs/btrfs/volumes.c
>> +++ b/fs/btrfs/volumes.c
>> @@ -1150,10 +1150,10 @@ static void btrfs_close_one_device(struct btrfs_device *device)
>>   	device->last_flush_error = 0;
>>   
>>   	/* Verify the device is back in a pristine state  */
>> -	ASSERT(!test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
>> -	ASSERT(!test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state));
>> -	ASSERT(list_empty(&device->dev_alloc_list));
>> -	ASSERT(list_empty(&device->post_commit_list));
>> +	WARN_ON(test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
>> +	WARN_ON(test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state));
>> +	WARN_ON(!list_empty(&device->dev_alloc_list));
>> +	WARN_ON(!list_empty(&device->post_commit_list));
> 
> I'm thinking if we can reset some of the values to a safe state too, not
> keeping it until the next time the device is opened, because not
> everything is reset at device open time.

Good idea. We need some cleanup around these code lines. I will send 
patches.

Thanks, Anand
diff mbox series

Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index eead4a1f53b7..0e3677650a78 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -1150,10 +1150,10 @@  static void btrfs_close_one_device(struct btrfs_device *device)
 	device->last_flush_error = 0;
 
 	/* Verify the device is back in a pristine state  */
-	ASSERT(!test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
-	ASSERT(!test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state));
-	ASSERT(list_empty(&device->dev_alloc_list));
-	ASSERT(list_empty(&device->post_commit_list));
+	WARN_ON(test_bit(BTRFS_DEV_STATE_FLUSH_SENT, &device->dev_state));
+	WARN_ON(test_bit(BTRFS_DEV_STATE_REPLACE_TGT, &device->dev_state));
+	WARN_ON(!list_empty(&device->dev_alloc_list));
+	WARN_ON(!list_empty(&device->post_commit_list));
 }
 
 static void close_fs_devices(struct btrfs_fs_devices *fs_devices)