Message ID | 20200211214042.4645-5-josef@toxicpanda.com (mailing list archive) |
---|---|
State | New, archived |
Series | Error condition failure fixes |
On 2020/2/12 5:40 AM, Josef Bacik wrote:
> I hit the following warning while running my error injection stress testing
>
> ------------[ cut here ]------------
> WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> Call Trace:
> btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
> __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
> elfcorehdr_read+0x40/0x40
> ? elfcorehdr_read+0x40/0x40
> ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
> ? dput+0xb4/0x2a0
> ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
> ? btrfs_sync_file+0x30e/0x420 [btrfs]
> ? do_fsync+0x38/0x70
> ? __x64_sys_fdatasync+0x13/0x20
> ? do_syscall_64+0x5b/0x1b0
> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ---[ end trace 70ccb5d0fe51151c ]---
>
> This happens if we fail to insert our reserved file extent. At this
> point we've already converted our reservation from ->bytes_may_use to
> ->bytes_reserved. However once we break we will attempt to free
> everything from [cur_offset, end] from ->bytes_may_use, but our extent
> reservation will overlap part of this.
>
> Fix this problem by adding ins.offset (our extent allocation size) to
> cur_offset so we remove the actual remaining part from ->bytes_may_use.
>
> I validated this fix using my inject-error.py script
>
> python inject-error.py -o should_fail_bio -t cache_save_setup -t \
> 	__btrfs_prealloc_file_range \
> 	-t insert_reserved_file_extent.constprop.0 \
> 	-r "-5" ./run-fsstress.sh
>
> where run-fsstress.sh simply mounts and runs fsstress on a disk.
>
> Signed-off-by: Josef Bacik <josef@toxicpanda.com>

Reviewed-by: Qu Wenruo <wqu@suse.com>

Thanks,
Qu

> ---
>  fs/btrfs/inode.c | 8 ++++++++
>  1 file changed, 8 insertions(+)
>
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 84e649724549..747d860aedf6 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -9919,6 +9919,14 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
>  						  ins.offset, 0, 0, 0,
>  						  BTRFS_FILE_EXTENT_PREALLOC);
>  		if (ret) {
> +			/*
> +			 * We've reserved this space, and thus converted it from
> +			 * ->bytes_may_use to ->bytes_reserved, which we cleanup
> +			 * here. We need to adjust cur_offset so that we only
> +			 * drop the ->bytes_may_use for the area we still have
> +			 * remaining in ->>bytes_may_use.
> +			 */
> +			cur_offset += ins.objectid;
>  			btrfs_free_reserved_extent(fs_info, ins.objectid,
>  						   ins.offset, 0);
>  			btrfs_abort_transaction(trans, ret);
>
On 11.02.20 23:40, Josef Bacik wrote:
> I hit the following warning while running my error injection stress testing
>
> ------------[ cut here ]------------
> WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
> Call Trace:
> btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
> __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
> elfcorehdr_read+0x40/0x40
> ? elfcorehdr_read+0x40/0x40
> ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
> ? dput+0xb4/0x2a0
> ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
> ? btrfs_sync_file+0x30e/0x420 [btrfs]
> ? do_fsync+0x38/0x70
> ? __x64_sys_fdatasync+0x13/0x20
> ? do_syscall_64+0x5b/0x1b0
> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
> ---[ end trace 70ccb5d0fe51151c ]---
>
> This happens if we fail to insert our reserved file extent. At this
> point we've already converted our reservation from ->bytes_may_use to
> ->bytes_reserved. However once we break we will attempt to free
> everything from [cur_offset, end] from ->bytes_may_use, but our extent
> reservation will overlap part of this.
>
> Fix this problem by adding ins.offset (our extent allocation size) to
> cur_offset so we remove the actual remaining part from ->bytes_may_use.

This contradicts the code: you are adding ins.objectid, which is the
offset and not the size. This means either the code is buggy or the
changelog is wrong.

<snip>

>  		if (ret) {
> +			/*
> +			 * We've reserved this space, and thus converted it from
> +			 * ->bytes_may_use to ->bytes_reserved, which we cleanup
> +			 * here. We need to adjust cur_offset so that we only
> +			 * drop the ->bytes_may_use for the area we still have
> +			 * remaining in ->>bytes_may_use.
> +			 */
> +			cur_offset += ins.objectid;
>  			btrfs_free_reserved_extent(fs_info, ins.objectid,
>  						   ins.offset, 0);
>  			btrfs_abort_transaction(trans, ret);
>
On 2/13/20 5:17 AM, Nikolay Borisov wrote:
>
>
> On 11.02.20 23:40, Josef Bacik wrote:
>> I hit the following warning while running my error injection stress testing
>>
>> ------------[ cut here ]------------
>> WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
>> RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
>> Call Trace:
>> btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
>> __btrfs_prealloc_file_range+0x378/0x470 [btrfs]
>> elfcorehdr_read+0x40/0x40
>> ? elfcorehdr_read+0x40/0x40
>> ? btrfs_commit_transaction+0xca/0xa50 [btrfs]
>> ? dput+0xb4/0x2a0
>> ? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
>> ? btrfs_sync_file+0x30e/0x420 [btrfs]
>> ? do_fsync+0x38/0x70
>> ? __x64_sys_fdatasync+0x13/0x20
>> ? do_syscall_64+0x5b/0x1b0
>> ? entry_SYSCALL_64_after_hwframe+0x44/0xa9
>> ---[ end trace 70ccb5d0fe51151c ]---
>>
>> This happens if we fail to insert our reserved file extent. At this
>> point we've already converted our reservation from ->bytes_may_use to
>> ->bytes_reserved. However once we break we will attempt to free
>> everything from [cur_offset, end] from ->bytes_may_use, but our extent
>> reservation will overlap part of this.
>>
>> Fix this problem by adding ins.offset (our extent allocation size) to
>> cur_offset so we remove the actual remaining part from ->bytes_may_use.
>
> This contradicts the code: you are adding ins.objectid, which is the
> offset and not the size. This means either the code is buggy or the
> changelog is wrong.

Oops, you're right. I was getting lucky because we're making the whole
allocation at once, and ins.objectid was past extent_end, so we ended up
doing the right thing, but for the wrong reasons. In fact I need to
adjust this for the other error condition as well, so I'll fix this up.

Thanks,

Josef
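The "lucky" outcome Josef describes can be checked with a little arithmetic. The sketch below is a simplified model, not kernel code. It assumes that the cleanup after the loop releases `end - cur_offset + 1` bytes from ->bytes_may_use whenever `cur_offset < end`, and nothing otherwise. The struct `extent_key`, the helper names, and the sample numbers are all hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Simplified model (not kernel code): bytes the cleanup after the loop
 * would release from ->bytes_may_use for the range [cur_offset, end].
 * Assumption: nothing is released once cur_offset has moved past end.
 */
static uint64_t bytes_released(uint64_t cur_offset, uint64_t end)
{
	return cur_offset < end ? end - cur_offset + 1 : 0;
}

/* Hypothetical stand-in for the key the allocator fills in ("ins"). */
struct extent_key {
	uint64_t objectid;	/* disk byte offset of the allocated extent */
	uint64_t offset;	/* size of the allocated extent in bytes */
};

/* What the commit message describes: skip the extent we just reserved. */
static uint64_t released_adjust_by_size(uint64_t cur_offset, uint64_t end,
					struct extent_key ins)
{
	return bytes_released(cur_offset + ins.offset, end);
}

/* What the posted diff does: skip by the extent's disk offset instead. */
static uint64_t released_adjust_by_objectid(uint64_t cur_offset, uint64_t end,
					    struct extent_key ins)
{
	return bytes_released(cur_offset + ins.objectid, end);
}

/* Made-up numbers: a 1 MiB file range, extent placed at disk offset 1 GiB. */
static void example_whole_vs_partial(void)
{
	uint64_t end = (1ULL << 20) - 1;
	struct extent_key whole = { .objectid = 1ULL << 30, .offset = 1ULL << 20 };
	struct extent_key partial = { .objectid = 1ULL << 30, .offset = 256ULL << 10 };

	/* Whole range allocated at once: both variants release nothing. */
	assert(released_adjust_by_size(0, end, whole) == 0);
	assert(released_adjust_by_objectid(0, end, whole) == 0);

	/* Partial allocation: only the size-based variant releases the rest. */
	assert(released_adjust_by_size(0, end, partial) == 768ULL << 10);
	assert(released_adjust_by_objectid(0, end, partial) == 0);
}
```

Under these assumptions the two variants agree only when the extent covers the whole range. With a partial allocation, the `ins.objectid` variant jumps past `end` and the remaining 768 KiB would never be released, which is consistent with the reviewers' reading that the diff worked for the wrong reasons.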
diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 84e649724549..747d860aedf6 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -9919,6 +9919,14 @@ static int __btrfs_prealloc_file_range(struct inode *inode, int mode,
 						  ins.offset, 0, 0, 0,
 						  BTRFS_FILE_EXTENT_PREALLOC);
 		if (ret) {
+			/*
+			 * We've reserved this space, and thus converted it from
+			 * ->bytes_may_use to ->bytes_reserved, which we cleanup
+			 * here. We need to adjust cur_offset so that we only
+			 * drop the ->bytes_may_use for the area we still have
+			 * remaining in ->>bytes_may_use.
+			 */
+			cur_offset += ins.objectid;
 			btrfs_free_reserved_extent(fs_info, ins.objectid,
 						   ins.offset, 0);
 			btrfs_abort_transaction(trans, ret);
I hit the following warning while running my error injection stress testing

------------[ cut here ]------------
WARNING: CPU: 3 PID: 1453 at fs/btrfs/space-info.h:108 btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
RIP: 0010:btrfs_free_reserved_data_space_noquota+0xfd/0x160 [btrfs]
Call Trace:
btrfs_free_reserved_data_space+0x4f/0x70 [btrfs]
__btrfs_prealloc_file_range+0x378/0x470 [btrfs]
elfcorehdr_read+0x40/0x40
? elfcorehdr_read+0x40/0x40
? btrfs_commit_transaction+0xca/0xa50 [btrfs]
? dput+0xb4/0x2a0
? btrfs_log_dentry_safe+0x55/0x70 [btrfs]
? btrfs_sync_file+0x30e/0x420 [btrfs]
? do_fsync+0x38/0x70
? __x64_sys_fdatasync+0x13/0x20
? do_syscall_64+0x5b/0x1b0
? entry_SYSCALL_64_after_hwframe+0x44/0xa9
---[ end trace 70ccb5d0fe51151c ]---

This happens if we fail to insert our reserved file extent. At this
point we've already converted our reservation from ->bytes_may_use to
->bytes_reserved. However once we break we will attempt to free
everything from [cur_offset, end] from ->bytes_may_use, but our extent
reservation will overlap part of this.

Fix this problem by adding ins.offset (our extent allocation size) to
cur_offset so we remove the actual remaining part from ->bytes_may_use.

I validated this fix using my inject-error.py script

python inject-error.py -o should_fail_bio -t cache_save_setup -t \
	__btrfs_prealloc_file_range \
	-t insert_reserved_file_extent.constprop.0 \
	-r "-5" ./run-fsstress.sh

where run-fsstress.sh simply mounts and runs fsstress on a disk.

Signed-off-by: Josef Bacik <josef@toxicpanda.com>
---
 fs/btrfs/inode.c | 8 ++++++++
 1 file changed, 8 insertions(+)
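The accounting problem described in the commit message can be illustrated with a toy model of the two counters. This is a sketch under stated assumptions, not the btrfs implementation: `struct space_model`, `model_sub`, and `model_failed_insert` are hypothetical names, the file range is assumed to start at offset 0, and the underflow handling (flag it and clamp to zero) is only loosely modeled on the check that fires the warning in the trace.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical, much simplified stand-in for the space_info counters. */
struct space_model {
	uint64_t bytes_may_use;
	uint64_t bytes_reserved;
	bool warned;	/* set when a counter would have gone negative */
};

/* Subtract with an underflow check, in the spirit of the WARNING above. */
static void model_sub(struct space_model *s, uint64_t *counter, uint64_t bytes)
{
	if (*counter < bytes) {
		s->warned = true;
		*counter = 0;
		return;
	}
	*counter -= bytes;
}

/*
 * Model one failed loop iteration for a range of range_len bytes starting
 * at file offset 0, with an extent of extent_len bytes already reserved.
 * If apply_fix is true, cur_offset is advanced by the extent size before
 * the cleanup runs, as the commit message describes.
 */
static struct space_model model_failed_insert(uint64_t range_len,
					      uint64_t extent_len,
					      bool apply_fix)
{
	struct space_model s = { .bytes_may_use = range_len };
	uint64_t cur_offset = 0;
	uint64_t end = range_len - 1;

	assert(extent_len > 0 && extent_len <= range_len);

	/* Reserving the extent converts bytes_may_use into bytes_reserved. */
	model_sub(&s, &s.bytes_may_use, extent_len);
	s.bytes_reserved += extent_len;

	/* Inserting the file extent fails here, so the error path begins. */
	if (apply_fix)
		cur_offset += extent_len;

	/* The error path returns the reserved extent ... */
	model_sub(&s, &s.bytes_reserved, extent_len);

	/* ... and the cleanup releases [cur_offset, end] from bytes_may_use. */
	if (cur_offset < end)
		model_sub(&s, &s.bytes_may_use, end - cur_offset + 1);

	return s;
}
```

In this model, calling `model_failed_insert(1 << 20, 256 << 10, false)` sets `warned`, because the cleanup tries to release the full 1 MiB although only 768 KiB is still accounted in `bytes_may_use`. With `apply_fix` set to true, both counters return to zero and no warning is flagged.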