Message ID | 20200313195809.141753-1-josef@toxicpanda.com (mailing list archive) |
---|---|
Series | Deal with a few ENOSPC corner cases |
On 13.03.20 at 21:58, Josef Bacik wrote:
> v1->v2:
> - Dropped "btrfs: only take normal tickets into account in
>   may_commit_transaction" because "btrfs: only check priority tickets for
>   priority flushing" should actually fix the problem, and Nikolay pointed out
>   that evict uses the priority list but is allowed to commit, so we need to take
>   into account priority tickets sometimes.
> - Added "btrfs: allow us to use up to 90% of the global rsv for" so that the
>   global rsv change was separate from the serialization patch.
> - Fixed up some changelogs.
> - Dropped an extra trace_printk that made it into v2.
>
> ----------------------- Original email --------------------------------------
>
> Nikolay has been digging into a failure of generic/320 on ppc64.  This has
> shaken out a variety of issues, and he's done a good job at running all of the
> weird corners down and then testing my ideas to get them all fixed.  This is the
> series that has survived the longest, so we're declaring victory.
>
> First there is the global reserve stealing logic.  The way unlink works is it
> attempts to start a transaction with a normal reservation amount, and if this
> fails with ENOSPC we fall back to stealing from the global reserve.  This is
> problematic because of all the same reasons we had with previous iterations of
> the ENOSPC handling, thundering herd.  We get a bunch of failures all at once,
> everybody tries to allocate from the global reserve, some win and some lose, we
> get an ENOSPC.
>
> To fix this we need to integrate this logic into the normal ENOSPC
> infrastructure.  The idea is simple, we add a new flushing state that indicates
> we are allowed to steal from the global reserve.  We still go through all of the
> normal flushing work, and at the moment we begin to fail all the tickets we try
> to satisfy any tickets that are allowed to steal by stealing from the global
> reserve.  If this works we start the flushing system over again just like we
> would with a normal ticket satisfaction.  This serializes our global reserve
> stealing, so we don't have the thundering herd problem.
>
> This isn't the only problem however.  Nikolay also noticed that we would
> sometimes have huge amounts of space in the trans block rsv and we would ENOSPC
> out.  This is because the may_commit_transaction() logic didn't take into
> account the space that would be reclaimed by all of the outstanding trans
> handles being required to stop in order to commit the transaction.
>
> Another corner here was that priority tickets could race in and make
> may_commit_transaction() think that it had no work left to do, and thus not
> commit the transaction.
>
> Those fixes all address the failures that Nikolay was seeing.  The last two
> patches are just cleanups around how we handle priority tickets.  We shouldn't
> even be serializing priority tickets behind normal tickets, only behind other
> priority tickets.  And finally there would be a small window where priority
> tickets would fail out if there were multiple priority tickets and one of them
> failed.  This is addressed by the previous patch.
>
> Nikolay has put these through many iterations of generic/320, and so far it
> hasn't failed.  Thanks,
>
> Josef
>

I tested this on PPC64LE and didn't observe any regressions (apart from the one
fixed by "[PATCH] btrfs: force chunk allocation if our global rsv is larger than
metadata"), so:

Tested-by: Nikolay Borisov <nborisov@suse.com>
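The serialized stealing described in the cover letter can be sketched in user-space C as follows. This is only an illustrative model, not the code from the series: the types `ticket` and `global_rsv`, the `FLUSH_*` names, and both function names are hypothetical stand-ins, and the 10% floor is an assumption based on reading the "up to 90% of the global rsv" patch title. The point it shows is that the queue is walked in order and the walk stops as soon as one steal succeeds, so the caller can restart flushing instead of letting every waiter race for the reserve at once.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flush levels, loosely modelled on the names in the thread. */
enum flush_mode {
	FLUSH_ALL,       /* normal ticket, not allowed to steal */
	FLUSH_ALL_STEAL, /* normal ticket, allowed to steal from the global reserve */
};

/* Hypothetical stand-in for a queued reservation ticket. */
struct ticket {
	uint64_t bytes;       /* bytes the waiter still needs */
	enum flush_mode mode;
	int error;            /* 0, or -ENOSPC once the ticket has been failed */
	bool done;            /* satisfied or failed; no longer waiting */
};

/* Hypothetical stand-in for the global block reserve. */
struct global_rsv {
	uint64_t size;     /* target size of the reserve */
	uint64_t reserved; /* bytes currently held in the reserve */
};

/*
 * Satisfy one ticket from the global reserve, but only if at least 10% of
 * the reserve's size stays behind (assumed floor, see the lead-in).
 */
static bool steal_from_global_rsv(struct global_rsv *rsv, struct ticket *t)
{
	uint64_t floor;

	assert(rsv != NULL && t != NULL);
	floor = rsv->size / 10;
	if (rsv->reserved < floor + t->bytes)
		return false;

	rsv->reserved -= t->bytes;
	t->bytes = 0;
	t->done = true;
	return true;
}

/*
 * Called once normal flushing has run out of options.  Walk the queue in
 * order: a ticket that may steal and can be covered by the reserve ends the
 * walk, and the caller is expected to restart flushing.  Every other ticket
 * reached by the walk is failed with -ENOSPC.
 *
 * Returns true if a steal succeeded (restart flushing), false if every
 * waiting ticket was failed.
 */
static bool maybe_fail_all_tickets(struct global_rsv *rsv,
				   struct ticket *queue, size_t n)
{
	for (size_t i = 0; i < n; i++) {
		struct ticket *t = &queue[i];

		if (t->done)
			continue;
		if (t->mode == FLUSH_ALL_STEAL && steal_from_global_rsv(rsv, t))
			return true;

		t->error = -ENOSPC;
		t->done = true;
	}
	return false;
}
```

Because only one ticket is satisfied per pass, two unlinks that both want to steal are handled one after the other: the first pass serves the head of the queue, and the second pass re-evaluates the reserve before deciding whether the next ticket can still be covered.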
On Fri, Mar 13, 2020 at 03:58:04PM -0400, Josef Bacik wrote:
> v1->v2:
> - Dropped "btrfs: only take normal tickets into account in
>   may_commit_transaction" because "btrfs: only check priority tickets for
>   priority flushing" should actually fix the problem, and Nikolay pointed out
>   that evict uses the priority list but is allowed to commit, so we need to take
>   into account priority tickets sometimes.
> - Added "btrfs: allow us to use up to 90% of the global rsv for" so that the
>   global rsv change was separate from the serialization patch.
> - Fixed up some changelogs.
> - Dropped an extra trace_printk that made it into v2.

The patchset seems to be based on some other code, I think it's the tickets
for data chunks. The compilation fails because BTRFS_RESERVE_FLUSH_DATA is not
defined, but it's mentioned in several patches.

If the base patchset is a hard requirement then both would need to go in at the
same time, otherwise if it's possible to refresh this branch I could add it to
for-next now.
On 25.03.20 at 17:50, David Sterba wrote:
> On Fri, Mar 13, 2020 at 03:58:04PM -0400, Josef Bacik wrote:
>> v1->v2:
>> - Dropped "btrfs: only take normal tickets into account in
>>   may_commit_transaction" because "btrfs: only check priority tickets for
>>   priority flushing" should actually fix the problem, and Nikolay pointed out
>>   that evict uses the priority list but is allowed to commit, so we need to take
>>   into account priority tickets sometimes.
>> - Added "btrfs: allow us to use up to 90% of the global rsv for" so that the
>>   global rsv change was separate from the serialization patch.
>> - Fixed up some changelogs.
>> - Dropped an extra trace_printk that made it into v2.
>
> The patchset seems to be based on some other code, I think it's the
> tickets for data chunks. The compilation fails because
> BTRFS_RESERVE_FLUSH_DATA is not defined, but it's mentioned in several
> patches.
>
> If the base patchset is a hard requirement then both would need to go in
> at the same time, otherwise if it's possible to refresh this branch I
> could add it to for-next now.
>

No, the data ticket is not a hard requirement. I've tested this branch on our
SLE kernels without it. So the conflict resolution is really minor: simply
removing the conditions involving BTRFS_RESERVE_FLUSH_DATA.
On Wed, Mar 25, 2020 at 05:52:38PM +0200, Nikolay Borisov wrote:
>
>
> On 25.03.20 at 17:50, David Sterba wrote:
> > On Fri, Mar 13, 2020 at 03:58:04PM -0400, Josef Bacik wrote:
> >> v1->v2:
> >> - Dropped "btrfs: only take normal tickets into account in
> >>   may_commit_transaction" because "btrfs: only check priority tickets for
> >>   priority flushing" should actually fix the problem, and Nikolay pointed out
> >>   that evict uses the priority list but is allowed to commit, so we need to take
> >>   into account priority tickets sometimes.
> >> - Added "btrfs: allow us to use up to 90% of the global rsv for" so that the
> >>   global rsv change was separate from the serialization patch.
> >> - Fixed up some changelogs.
> >> - Dropped an extra trace_printk that made it into v2.
> >
> > The patchset seems to be based on some other code, I think it's the
> > tickets for data chunks. The compilation fails because
> > BTRFS_RESERVE_FLUSH_DATA is not defined, but it's mentioned in several
> > patches.
> >
> > If the base patchset is a hard requirement then both would need to go in
> > at the same time, otherwise if it's possible to refresh this branch I
> > could add it to for-next now.
> >
>
> No, the data ticket is not a hard requirement. I've tested this branch
> on our SLE kernels without it. So the conflict resolution is really minor
> - simply removing the conditions involving BTRFS_RESERVE_FLUSH_DATA.

Ok, thanks. With this diff applied, I'll add the branch to for-next and then to
misc-next once some tests finish.

--- a/fs/btrfs/space-info.c
+++ b/fs/btrfs/space-info.c
@@ -1188,8 +1188,7 @@ static int handle_reserve_ticket(struct btrfs_fs_info *fs_info,
  */
 static inline bool is_normal_flushing(enum btrfs_reserve_flush_enum flush)
 {
-	return (flush == BTRFS_RESERVE_FLUSH_DATA) ||
-		(flush == BTRFS_RESERVE_FLUSH_ALL) ||
+	return (flush == BTRFS_RESERVE_FLUSH_ALL) ||
 		(flush == BTRFS_RESERVE_FLUSH_ALL_STEAL);
 }
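For readers without the tree at hand, the helper as it stands after the diff can be exercised in isolation with the stand-alone sketch below. The enum here is a trimmed local stand-in rather than the kernel's full `enum btrfs_reserve_flush_enum`: `BTRFS_RESERVE_FLUSH_EVICT` is included only because the changelog notes that evict uses the priority list, and `pick_ticket_list()` with its `LIST_*` values is a hypothetical illustration of how such a predicate might select a queue, not code from the series.

```c
#include <assert.h>
#include <stdbool.h>

/* Trimmed local stand-in for the kernel enum; not the full definition. */
enum btrfs_reserve_flush_enum {
	BTRFS_RESERVE_FLUSH_EVICT,
	BTRFS_RESERVE_FLUSH_ALL,
	BTRFS_RESERVE_FLUSH_ALL_STEAL,
};

/* Hypothetical labels for the two ticket queues discussed in the thread. */
enum ticket_list {
	LIST_NORMAL,
	LIST_PRIORITY,
};

/* The helper with the BTRFS_RESERVE_FLUSH_DATA condition removed. */
static inline bool is_normal_flushing(enum btrfs_reserve_flush_enum flush)
{
	return (flush == BTRFS_RESERVE_FLUSH_ALL) ||
	       (flush == BTRFS_RESERVE_FLUSH_ALL_STEAL);
}

/*
 * Hypothetical helper: normal flushers wait on the normal list, everything
 * else modelled here (only evict) waits on the priority list.
 */
static inline enum ticket_list
pick_ticket_list(enum btrfs_reserve_flush_enum flush)
{
	/* Guard against values outside the trimmed enum above. */
	assert(flush <= BTRFS_RESERVE_FLUSH_ALL_STEAL);

	return is_normal_flushing(flush) ? LIST_NORMAL : LIST_PRIORITY;
}
```

With the data-flush level gone from the predicate, only the two `FLUSH_ALL` variants count as normal flushing in this model.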
On Fri, Mar 13, 2020 at 03:58:04PM -0400, Josef Bacik wrote:
> v1->v2:
> - Dropped "btrfs: only take normal tickets into account in
>   may_commit_transaction" because "btrfs: only check priority tickets for
>   priority flushing" should actually fix the problem, and Nikolay pointed out
>   that evict uses the priority list but is allowed to commit, so we need to take
>   into account priority tickets sometimes.
> - Added "btrfs: allow us to use up to 90% of the global rsv for" so that the
>   global rsv change was separate from the serialization patch.
> - Fixed up some changelogs.
> - Dropped an extra trace_printk that made it into v2.

Patchset moved to misc-next, thanks.