[00/23] Change data reservations to use the ticketing infra

Message ID 20200630135921.745612-1-josef@toxicpanda.com (mailing list archive)

Message

Josef Bacik June 30, 2020, 1:58 p.m. UTC
We've had two different mechanisms in place to reserve data and metadata
space, because generally speaking data is much simpler.  However, the data
reservations still suffered from the same issue that plagued metadata
reservations: multiple tasks could race to get reservations.  This causes
problems in cases like write/delete loops, where we should be able to fill
up the fs, delete everything, and go again.  You would sometimes race with a
flusher that's trying to unpin the free space, take its reservation, and
then it would fail.
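
To make the failure mode concrete, here is a hypothetical reproducer sketch
of such a write/delete loop (an illustration only, not part of the series;
the mount point, file name, and sizes are made up):

/*
 * Hypothetical reproducer sketch: fill the fs, delete everything, repeat.
 * On an affected kernel an iteration can hit ENOSPC early because a writer
 * raced with the flusher that was still unpinning the space freed by the
 * previous delete.
 */
#include <errno.h>
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
	static char buf[1 << 20];		/* write 1 MiB at a time */
	const char *path = "/mnt/scratch/file";	/* assumed scratch mount */

	memset(buf, 0xaa, sizeof(buf));
	for (int iter = 0; iter < 10; iter++) {
		int fd = open(path, O_CREAT | O_TRUNC | O_WRONLY, 0644);

		if (fd < 0) {
			perror("open");
			return 1;
		}
		/* Fill the filesystem until the write fails. */
		while (write(fd, buf, sizeof(buf)) > 0)
			;
		if (errno != ENOSPC)
			fprintf(stderr, "iter %d: unexpected errno %d\n",
				iter, errno);
		close(fd);
		/* Delete everything; the next pass should fill it again. */
		unlink(path);
	}
	return 0;
}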

Fix this by moving the data reservations under the metadata ticketing
infrastructure.  This gets rid of that problem and further simplifies our
enospc code by consolidating it into a single path.  Thanks,

Josef
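
For readers who haven't looked at the metadata ticketing code, here is a
minimal userspace sketch of the idea, with made-up names and pthreads in
place of the kernel primitives; the real btrfs implementation additionally
tracks flush states and priority tickets.  A task that can't reserve space
queues a ticket and sleeps, and space returned by the flusher is handed out
strictly in FIFO order, so a late arrival can no longer steal space out from
under an earlier waiter:

#include <pthread.h>
#include <stdbool.h>

struct ticket {
	unsigned long bytes;		/* space this waiter needs */
	bool granted;			/* set once the flusher satisfies us */
	pthread_cond_t wait;
	struct ticket *next;
};

struct space_info {
	pthread_mutex_t lock;
	unsigned long free_bytes;	/* space currently available */
	struct ticket *head, *tail;	/* FIFO of waiting tickets */
};

/* Reserve @bytes, queueing a ticket and sleeping if space is short. */
static int reserve_bytes(struct space_info *si, unsigned long bytes)
{
	struct ticket t = {
		.bytes = bytes,
		.wait = PTHREAD_COND_INITIALIZER,
	};

	pthread_mutex_lock(&si->lock);
	/* Fast path: enough space and nobody queued ahead of us. */
	if (!si->head && si->free_bytes >= bytes) {
		si->free_bytes -= bytes;
		pthread_mutex_unlock(&si->lock);
		return 0;
	}
	/* Slow path: join the FIFO and wait for the flusher. */
	if (si->tail)
		si->tail->next = &t;
	else
		si->head = &t;
	si->tail = &t;
	while (!t.granted)
		pthread_cond_wait(&t.wait, &si->lock);
	pthread_mutex_unlock(&si->lock);
	return 0;
}

/*
 * Called as space is freed (e.g. by the flusher after unpinning); hands
 * the space to waiters strictly in arrival order.
 */
static void return_bytes(struct space_info *si, unsigned long bytes)
{
	pthread_mutex_lock(&si->lock);
	si->free_bytes += bytes;
	while (si->head && si->free_bytes >= si->head->bytes) {
		struct ticket *t = si->head;

		si->free_bytes -= t->bytes;
		si->head = t->next;
		if (!si->head)
			si->tail = NULL;
		t->granted = true;
		pthread_cond_signal(&t->wait);
	}
	pthread_mutex_unlock(&si->lock);
}

The key property is that the fast path succeeds only when nobody is already
queued, which is what removes the class of races described above.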

Comments

David Sterba July 3, 2020, 4:30 p.m. UTC | #1
On Tue, Jun 30, 2020 at 09:58:58AM -0400, Josef Bacik wrote:
> We've had two different mechanisms in place to reserve data and metadata
> space, because generally speaking data is much simpler.  However, the data
> reservations still suffered from the same issue that plagued metadata
> reservations: multiple tasks could race to get reservations.  This causes
> problems in cases like write/delete loops, where we should be able to fill
> up the fs, delete everything, and go again.  You would sometimes race with a
> flusher that's trying to unpin the free space, take its reservation, and
> then it would fail.
> 
> Fix this by moving the data reservations under the metadata ticketing
> infrastructure.  This gets rid of that problem and further simplifies our
> enospc code by consolidating it into a single path.  Thanks,

I created a for-next-20200703 snapshot with this branch included, and it
went up to generic/102 and got stuck due to softlockups. This could mean
the machine is overloaded, but it usually recovers from that; not in this
case, as it was stuck for more than 7 hours before I killed the VM.

The dio-iomap patchset is also in the branch, so the two patchsets could
interact in some way. I'll do more tests next week; this is an early
warning that something might be going on.

generic/102             [02:56:08][11376.253779] run fstests generic/102 at 2020-07-03 02:56:08
[11376.582218] BTRFS info (device vda): disk space caching is enabled
[11376.585713] BTRFS info (device vda): has skinny extents
[11376.692139] BTRFS: device fsid 55b09ca8-60e8-4488-8fa0-3ff05c3e6906 devid 1 transid 5 /dev/vdb scanned by mkfs.btrfs (5196)
[11376.717361] BTRFS info (device vdb): disk space caching is enabled
[11376.720437] BTRFS info (device vdb): has skinny extents
[11376.722974] BTRFS info (device vdb): flagging fs with big metadata feature
[11376.731535] BTRFS info (device vdb): checking UUID tree
[11436.612816] watchdog: BUG: soft lockup - CPU#1 stuck for 23s! [kworker/u8:17:8005]
[11436.615341] Modules linked in: dm_snapshot dm_thin_pool dm_persistent_data dm_bio_prison dm_bufio dm_log_writes dm_flakey dm_mod dax btrfs blake2b_generic libcrc32c crc32c_intel xor zstd_decompress zstd_compres
[11436.621957] irq event stamp: 0
[11436.623062] hardirqs last  enabled at (0): [<0000000000000000>] 0x0
[11436.624882] hardirqs last disabled at (0): [<ffffffff9a07fd9b>] copy_process+0x67b/0x1b00
[11436.627545] softirqs last  enabled at (0): [<ffffffff9a07fd9b>] copy_process+0x67b/0x1b00
[11436.630161] softirqs last disabled at (0): [<0000000000000000>] 0x0
[11436.631919] CPU: 1 PID: 8005 Comm: kworker/u8:17 Tainted: G             L    5.8.0-rc3-default+ #1173
[11436.634311] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS rel-1.12.0-59-gc9ba527-rebuilt.opensuse.org 04/01/2014
[11436.637193] Workqueue: events_unbound btrfs_async_reclaim_data_space [btrfs]
[11436.638878] RIP: 0010:__slab_alloc+0x5c/0x70
[11436.644622] RSP: 0018:ffffbc1dc60f7cd0 EFLAGS: 00000246
[11436.645961] RAX: 0000000000000000 RBX: 0000000000000246 RCX: 0000000000120001
[11436.647989] RDX: ffffa2a70b349cc0 RSI: ffffffff9a262774 RDI: ffffffff9a2639ba
[11436.649924] RBP: 0000000000000d40 R08: ffffa2a76d1cf138 R09: 0000000000000001
[11436.652014] R10: 0000000000000000 R11: ffffa2a76d1cec10 R12: ffffa2a76d1cf138
[11436.654134] R13: 00000000ffffffff R14: ffffffffc0149633 R15: ffffdc1dbf802b50
[11436.656186] FS:  0000000000000000(0000) GS:ffffa2a77d800000(0000) knlGS:0000000000000000
[11436.658920] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[11436.660679] CR2: 00005641be0178b8 CR3: 0000000041011002 CR4: 0000000000160ee0
[11436.662693] Call Trace:
[11436.663642]  ? start_transaction+0x133/0x5e0 [btrfs]
[11436.664991]  kmem_cache_alloc+0x1a2/0x2f0
[11436.666248]  start_transaction+0x133/0x5e0 [btrfs]
[11436.667733]  flush_space+0x315/0x670 [btrfs]
[11436.669116]  btrfs_async_reclaim_data_space+0xd8/0x1a0 [btrfs]
[11436.670853]  process_one_work+0x22c/0x600
[11436.672182]  worker_thread+0x50/0x3b0
[11436.673436]  ? process_one_work+0x600/0x600
[11436.674675]  kthread+0x137/0x150
[11436.675795]  ? kthread_stop+0x2a0/0x2a0
[11436.677099]  ret_from_fork+0x1f/0x30
Josef Bacik July 7, 2020, 3:28 p.m. UTC | #2
On 7/3/20 12:30 PM, David Sterba wrote:
> On Tue, Jun 30, 2020 at 09:58:58AM -0400, Josef Bacik wrote:
>> We've had two different mechanisms in place to reserve data and metadata
>> space, because generally speaking data is much simpler.  However, the data
>> reservations still suffered from the same issue that plagued metadata
>> reservations: multiple tasks could race to get reservations.  This causes
>> problems in cases like write/delete loops, where we should be able to fill
>> up the fs, delete everything, and go again.  You would sometimes race with a
>> flusher that's trying to unpin the free space, take its reservation, and
>> then it would fail.
>>
>> Fix this by moving the data reservations under the metadata ticketing
>> infrastructure.  This gets rid of that problem and further simplifies our
>> enospc code by consolidating it into a single path.  Thanks,
> 
> I created a for-next-20200703 snapshot with this branch included, and it
> went up to generic/102 and got stuck due to softlockups. This could mean
> the machine is overloaded, but it usually recovers from that; not in this
> case, as it was stuck for more than 7 hours before I killed the VM.
> 
> The dio-iomap patchset is also in the branch, so the two patchsets could
> interact in some way. I'll do more tests next week; this is an early
> warning that something might be going on.

Softlockup in __slab_alloc; I think we were just the victim here.  Thanks,

Josef