[v5,1/4] btrfs: Reset device size when btrfs_update_device() failed in btrfs_grow_device()

Message ID 20200109071634.32384-2-wqu@suse.com (mailing list archive)
State New, archived
Series Introduce per-profile available space array to avoid over-confident can_overcommit()

Commit Message

Qu Wenruo Jan. 9, 2020, 7:16 a.m. UTC
When btrfs_update_device() fails due to ENOMEM, we don't reset the device
size back to its original value, leaving the in-memory device size larger
than the original.

If the memory pressure is later resolved and the fs gets committed, the
device item is not updated but the super block total size is, which would
cause a mount failure due to the size mismatch.

So revert the device size and the super block total size to their original
values when btrfs_update_device() fails, just like what we do in
shrink_device().

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/volumes.c | 20 +++++++++++++++++++-
 1 file changed, 19 insertions(+), 1 deletion(-)
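
The snapshot-and-revert idea in the commit message can be sketched in plain
user-space C. This is only a rough model under stated assumptions: the
model_* structures and functions below are hypothetical stand-ins, not the
kernel's btrfs types, and locking (chunk_mutex) is left out entirely.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical user-space stand-ins, not the kernel structures. */
struct model_device {
	uint64_t total_bytes;      /* in-memory device size */
	uint64_t item_total_bytes; /* size recorded in the device item */
};

struct model_fs {
	uint64_t super_total_bytes; /* total size recorded in the super block */
	struct model_device dev;
};

/* Stand-in for btrfs_update_device(): fails with -ENOMEM on request. */
static int model_update_device(struct model_device *dev, int inject_enomem)
{
	if (inject_enomem)
		return -ENOMEM;
	dev->item_total_bytes = dev->total_bytes;
	return 0;
}

static int model_grow_device(struct model_fs *fs, uint64_t new_size,
			     int inject_enomem)
{
	/* Snapshot the values that the grow is about to change. */
	uint64_t old_device_size = fs->dev.total_bytes;
	uint64_t old_total = fs->super_total_bytes;
	int ret;

	if (new_size <= old_device_size)
		return -EINVAL;

	fs->super_total_bytes = old_total + (new_size - old_device_size);
	fs->dev.total_bytes = new_size;

	ret = model_update_device(&fs->dev, inject_enomem);
	if (ret < 0) {
		/* Revert, so super block and device item sizes still agree. */
		fs->super_total_bytes = old_total;
		fs->dev.total_bytes = old_device_size;
	}
	return ret;
}

/* Exercises both the injected-failure path and the success path. */
int model_self_check(void)
{
	struct model_fs fs = {
		.super_total_bytes = 4096,
		.dev = { .total_bytes = 4096, .item_total_bytes = 4096 },
	};

	/* Injected failure: every size must be back at its original value. */
	assert(model_grow_device(&fs, 8192, 1) == -ENOMEM);
	assert(fs.super_total_bytes == 4096);
	assert(fs.dev.total_bytes == 4096);
	assert(fs.dev.item_total_bytes == 4096);

	/* Success: in-memory size, device item and super block all agree. */
	assert(model_grow_device(&fs, 8192, 0) == 0);
	assert(fs.super_total_bytes == 8192);
	assert(fs.dev.total_bytes == fs.dev.item_total_bytes);
	return 0;
}
```

Calling model_self_check() from a small main() is enough to run the model.
Without the revert branch, the first group of assertions would fail because
the super block total would stay at 8192 while the device item still says
4096, which is the mismatch the commit message describes.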

Comments

Josef Bacik Jan. 9, 2020, 2:21 p.m. UTC | #1
On 1/9/20 2:16 AM, Qu Wenruo wrote:
> When btrfs_update_device() failed due to ENOMEM, we didn't reset device
> size back to its original size, causing the in-memory device size larger
> than original.
> 
> If somehow the memory pressure get solved, and the fs committed, since
> the device item is not updated, but super block total size get updated,
> it would cause mount failure due to size mismatch.
> 
> So here revert device size and super size to its original size when
> btrfs_update_device() failed, just like what we did in shrink_device().
> 
> Signed-off-by: Qu Wenruo <wqu@suse.com>

Did you test this with error injection to make sure nothing else wonky came out 
of this?  If you are going to fix this I'd rather it be in a different series 
because it's not necessarily related to what you are doing, and isn't any more 
broken with your other patches.  The thing you are fixing in this series is 
important and I'd rather not hold it up on some error handling shenanigans.  Thanks,

Josef
Qu Wenruo Jan. 10, 2020, 1:40 a.m. UTC | #2
On 2020/1/9 10:21 PM, Josef Bacik wrote:
> On 1/9/20 2:16 AM, Qu Wenruo wrote:
>> When btrfs_update_device() failed due to ENOMEM, we didn't reset device
>> size back to its original size, causing the in-memory device size larger
>> than original.
>>
>> If somehow the memory pressure get solved, and the fs committed, since
>> the device item is not updated, but super block total size get updated,
>> it would cause mount failure due to size mismatch.
>>
>> So here revert device size and super size to its original size when
>> btrfs_update_device() failed, just like what we did in shrink_device().
>>
>> Signed-off-by: Qu Wenruo <wqu@suse.com>
> 
> Did you test this with error injection to make sure nothing else wonky
> came out of this?  If you are going to fix this I'd rather it be in a
> different series because it's not necessarily related to what you are
> doing, and isn't any more broken with your other patches.  The thing you
> are fixing in this series is important and I'd rather not hold it up on
> some error handling shenanigans.  Thanks,

Yes, I have the same feeling.

But sometimes I just can't stop myself from addressing a comment that makes sense.

And you're right, I forgot the error injection test, and it detected one bug.
In the error handling path, I forgot to re-update the per-profile
available space, causing df to show the grown size, not the old size.

To David, what's your opinion on this?
I guess the patchset can't be backported anyway due to the new infrastructure.
I'm OK with solving the problem by either removing this patch, or fixing the
bug exposed by the error injection.

Thanks,
Qu

> 
> Josef
Qu Wenruo Jan. 15, 2020, 7:05 a.m. UTC | #3
On 2020/1/10 9:40 AM, Qu Wenruo wrote:
> 
> 
> On 2020/1/9 10:21 PM, Josef Bacik wrote:
>> On 1/9/20 2:16 AM, Qu Wenruo wrote:
>>> When btrfs_update_device() failed due to ENOMEM, we didn't reset device
>>> size back to its original size, causing the in-memory device size larger
>>> than original.
>>>
>>> If somehow the memory pressure get solved, and the fs committed, since
>>> the device item is not updated, but super block total size get updated,
>>> it would cause mount failure due to size mismatch.
>>>
>>> So here revert device size and super size to its original size when
>>> btrfs_update_device() failed, just like what we did in shrink_device().
>>>
>>> Signed-off-by: Qu Wenruo <wqu@suse.com>
>>
>> Did you test this with error injection to make sure nothing else wonky
>> came out of this?  If you are going to fix this I'd rather it be in a
>> different series because it's not necessarily related to what you are
>> doing, and isn't any more broken with your other patches.  The thing you
>> are fixing in this series is important and I'd rather not hold it up on
>> some error handling shenanigans.  Thanks,
> 
> Yes, I have the same feeling.
> 
> But sometimes I just can't stop addressing the comment that makes sense.
> 
> And you're right, I forgot the error injection test, and it detects one bug.
> In the error handling path, I forgot the re-update per-profile
> available, causing df showing the grown size, not the old size.
> 
> To David, what's your idea on this?
> I guess the patchset can't be backported anyway due to new infrastructure.
> I'm OK solving the problem by either removing this patch, or fix the bug
> exposed by the error injection.

Gentle ping.

Any feedback, David?

Thanks,
Qu

> 
> Thanks,
> Qu
> 
>>
>> Josef
>
Patch

diff --git a/fs/btrfs/volumes.c b/fs/btrfs/volumes.c
index d8e5560db285..be638465f210 100644
--- a/fs/btrfs/volumes.c
+++ b/fs/btrfs/volumes.c
@@ -2633,8 +2633,10 @@  int btrfs_grow_device(struct btrfs_trans_handle *trans,
 {
 	struct btrfs_fs_info *fs_info = device->fs_info;
 	struct btrfs_super_block *super_copy = fs_info->super_copy;
+	u64 old_device_size;
 	u64 old_total;
 	u64 diff;
+	int ret;
 
 	if (!test_bit(BTRFS_DEV_STATE_WRITEABLE, &device->dev_state))
 		return -EACCES;
@@ -2642,6 +2644,7 @@  int btrfs_grow_device(struct btrfs_trans_handle *trans,
 	new_size = round_down(new_size, fs_info->sectorsize);
 
 	mutex_lock(&fs_info->chunk_mutex);
+	old_device_size = device->total_bytes;
 	old_total = btrfs_super_total_bytes(super_copy);
 	diff = round_down(new_size - device->total_bytes, fs_info->sectorsize);
 
@@ -2663,7 +2666,22 @@  int btrfs_grow_device(struct btrfs_trans_handle *trans,
 			      &trans->transaction->dev_update_list);
 	mutex_unlock(&fs_info->chunk_mutex);
 
-	return btrfs_update_device(trans, device);
+	ret = btrfs_update_device(trans, device);
+	if (ret < 0) {
+		/*
+		 * Although we dropped chunk_mutex halfway for
+		 * btrfs_update_device(), we have FS_EXCL_OP bit to prevent
+		 * shrinking/growing race.
+		 * So we're safe to use the old size directly.
+		 */
+		mutex_lock(&fs_info->chunk_mutex);
+		btrfs_set_super_total_bytes(super_copy, old_total);
+		device->fs_devices->total_rw_bytes -= diff;
+		btrfs_device_set_total_bytes(device, old_device_size);
+		btrfs_device_set_disk_total_bytes(device, old_device_size);
+		mutex_unlock(&fs_info->chunk_mutex);
+	}
+	return ret;
 }
 
 static int btrfs_free_chunk(struct btrfs_trans_handle *trans, u64 chunk_offset)
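
The bug found by error injection in comment #2 (a derived value that was not
recomputed on the error path) can be illustrated with a small user-space
sketch. Everything here is an assumption made for illustration: the sim_*
names are hypothetical, the "available" value is simply set equal to the
device size, and the failure is injected with a plain flag rather than the
kernel's fault injection facilities.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

struct sim_fs {
	uint64_t device_bytes;     /* primary state: device size */
	uint64_t cached_available; /* derived state, what a df-like query reports */
};

/* Hypothetical derivation: all device bytes count as available. */
static void sim_recalc_available(struct sim_fs *fs)
{
	fs->cached_available = fs->device_bytes;
}

/* Stand-in for the step that may fail, with the failure injected by flag. */
static int sim_commit(bool inject_failure)
{
	return inject_failure ? -ENOMEM : 0;
}

static int sim_grow(struct sim_fs *fs, uint64_t new_size,
		    bool inject_failure, bool recalc_on_error)
{
	uint64_t old_size = fs->device_bytes;
	int ret;

	fs->device_bytes = new_size;
	sim_recalc_available(fs);

	ret = sim_commit(inject_failure);
	if (ret < 0) {
		fs->device_bytes = old_size;
		/* Skipping this leaves the cached value at the grown size. */
		if (recalc_on_error)
			sim_recalc_available(fs);
	}
	return ret;
}

/* Returns true if a failed grow left every field as it was before. */
static bool sim_failed_grow_is_clean(bool recalc_on_error)
{
	struct sim_fs fs = { .device_bytes = 100 };
	struct sim_fs before;

	sim_recalc_available(&fs);
	before = fs;

	if (sim_grow(&fs, 200, true, recalc_on_error) != -ENOMEM)
		return false;
	return fs.device_bytes == before.device_bytes &&
	       fs.cached_available == before.cached_available;
}

int sim_self_check(void)
{
	/* With the recalculation, the failed grow is fully undone. */
	assert(sim_failed_grow_is_clean(true));
	/* Without it, the stale cached value is detected. */
	assert(!sim_failed_grow_is_clean(false));
	return 0;
}
```

The check compares the whole state before and after the failed operation
instead of only the fields the error path is known to touch, which is what
lets it catch a forgotten derived value.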