[0/3] fixup work fixes

Message ID 20200121165144.2174309-1-josef@toxicpanda.com (mailing list archive)

Message

Josef Bacik Jan. 21, 2020, 4:51 p.m. UTC
This series is to address a few issues with the fixup worker we hit in
production.

The first of these is a resend of

  Btrfs: keep pages dirty when using

I've cleaned this up based on the feedback and added a bunch more comments to
make it clear what is happening and why we're doing it.

The next patch is a cleanup that is made possible by the previous patch, again
to clear up the fixup worker's job.

  btrfs: drop the -EBUSY case in __extent_writepage_io

And finally the deadlock fix that I submitted earlier.  I noticed while trying
to backport this onto our kernel that we had changed the error case with the
above patch from Chris, and actually we really, really need Chris's fix as well.
There is also a change in the error handling from v1 where we now set the page
error properly but only once we've locked the page and verified we're still
responsible for COW'ing the page.  Thanks,

Josef

Comments

David Sterba Jan. 29, 2020, 3:11 p.m. UTC | #1
On Tue, Jan 21, 2020 at 11:51:41AM -0500, Josef Bacik wrote:
> This series is to address a few issues with the fixup worker we hit in
> production.
> 
> The first of this is a resend of
> 
>   Btrfs: keep pages dirty when using
> 
> I've cleaned this up based on the feedback and added a bunch more comments to
> make it clear what is happening and why we're doing it.
> 
> The next patch is a cleanup that is made possible by the previous patch, again
> to clear up the fixup workers job.
> 
>   btrfs: drop the -EBUSY case in __extent_writepage_io
> 
> And finally the deadlock fix that I submitted earlier.  I noticed while trying
> to backport this onto our kernel that we had changed the error case with the
> above patch from Chris, and actually we really, really need Chris's fix as well.
> There is also a change in the error handling from v1 where we now set the page
> error properly but only once we've locked the page and verified we're still
> responsible for COW'ing the page.  Thanks,

Reviewed-by: David Sterba <dsterba@suse.com>

Very tricky stuff, that fixup worker. It's like the worst present a
filesystem can get from memory management.

Estimated merge target is 5.6, post-rc1, so we have enough time for
testing. It'll appear either in misc-next or for-next.