mbox series

[v3,0/5] btrfs: remove buffer heads form superblock handling

Message ID 20200127155931.10818-1-johannes.thumshirn@wdc.com (mailing list archive)
Headers show
Series btrfs: remove buffer heads form superblock handling | expand

Message

Johannes Thumshirn Jan. 27, 2020, 3:59 p.m. UTC
This patch series removes the use of buffer_heads from btrfs' super block read
and write paths. It also converts the integrity-checking code to only work
with pages and BIOs.

Compared to buffer heads, this gives us a leaner call path, as the
buffer_head code wraps around getting pages from the page-cache and adding
them to BIOs to submit.

The first patch removes buffer_heads from superblock reading.  The second
removes it from super_block writing and the subsequent patches remove the
buffer_heads from the integrity check code.

It's based on misc-next from Wednesday January 22, and doesn't show any
regressions in xfstests to the baseline.

Changes to v2:
- Removed patch #1 again
- Added Reviews from Josef
- Re-visited page locking, but not changes, it retains the same locking scheme
  the buffer_heads had
- Incroptorated comments from David regarding open-coding functions
- For more details see the idividual patches.


Changes to v1:
- Added patch #1
- Converted sb reading and integrity checking to use the page cache
- Added rationale behind the conversion to the commit messages.
- For more details see the idividual patches.


Johannes Thumshirn (5):
  btrfs: remove buffer heads from super block reading
  btrfs: remove use of buffer_heads from superblock writeout
  btrfs: remove btrfsic_submit_bh()
  btrfs: remove buffer_heads from btrfsic_process_written_block()
  btrfs: remove buffer_heads form superblock mirror integrity checking

 fs/btrfs/check-integrity.c | 218 +++++++++++--------------------------
 fs/btrfs/check-integrity.h |   2 -
 fs/btrfs/disk-io.c         | 200 +++++++++++++++++++++-------------
 fs/btrfs/disk-io.h         |   4 +-
 fs/btrfs/volumes.c         |  66 ++++++-----
 fs/btrfs/volumes.h         |   2 -
 6 files changed, 230 insertions(+), 262 deletions(-)

Comments

David Sterba Jan. 29, 2020, 2:25 p.m. UTC | #1
On Tue, Jan 28, 2020 at 12:59:26AM +0900, Johannes Thumshirn wrote:
> This patch series removes the use of buffer_heads from btrfs' super block read
> and write paths. It also converts the integrity-checking code to only work
> with pages and BIOs.
> 
> Compared to buffer heads, this gives us a leaner call path, as the
> buffer_head code wraps around getting pages from the page-cache and adding
> them to BIOs to submit.
> 
> The first patch removes buffer_heads from superblock reading.  The second
> removes it from super_block writing and the subsequent patches remove the
> buffer_heads from the integrity check code.
> 
> It's based on misc-next from Wednesday January 22, and doesn't show any
> regressions in xfstests to the baseline.
> 
> Changes to v2:
> - Removed patch #1 again
> - Added Reviews from Josef
> - Re-visited page locking, but not changes, it retains the same locking scheme
>   the buffer_heads had
> - Incroptorated comments from David regarding open-coding functions
> - For more details see the idividual patches.
> 
> 
> Changes to v1:
> - Added patch #1
> - Converted sb reading and integrity checking to use the page cache
> - Added rationale behind the conversion to the commit messages.
> - For more details see the idividual patches.
> 
> 
> Johannes Thumshirn (5):
>   btrfs: remove buffer heads from super block reading
>   btrfs: remove use of buffer_heads from superblock writeout

For next iteration, please change the subjects of the two patches to say
"replace buffer heads with bio for superblock reading". Thanks.
Johannes Thumshirn Jan. 30, 2020, 11:23 a.m. UTC | #2
On 29/01/2020 15:25, David Sterba wrote:
>> Johannes Thumshirn (5):
>>    btrfs: remove buffer heads from super block reading
>>    btrfs: remove use of buffer_heads from superblock writeout
> 
> For next iteration, please change the subjects of the two patches to say
> "replace buffer heads with bio for superblock reading". Thanks.


Sure but with hch's proposed change to using read_cache_page_gfp() this 
doesn't make too much sense anymore at least for the read path.

Maybe "use page cache for superblock reading"?

Byte,
	Johannes
David Sterba Jan. 30, 2020, 12:15 p.m. UTC | #3
On Thu, Jan 30, 2020 at 11:23:24AM +0000, Johannes Thumshirn wrote:
> On 29/01/2020 15:25, David Sterba wrote:
> >> Johannes Thumshirn (5):
> >>    btrfs: remove buffer heads from super block reading
> >>    btrfs: remove use of buffer_heads from superblock writeout
> > 
> > For next iteration, please change the subjects of the two patches to say
> > "replace buffer heads with bio for superblock reading". Thanks.
> 
> 
> Sure but with hch's proposed change to using read_cache_page_gfp() this 
> doesn't make too much sense anymore at least for the read path.
> 
> Maybe "use page cache for superblock reading"?

That works too. We might need a new iteration that summarizes up all the
feedback so far, so we have same code to refer to.
Christoph Hellwig Jan. 30, 2020, 1:39 p.m. UTC | #4
On Thu, Jan 30, 2020 at 01:15:30PM +0100, David Sterba wrote:
> > Sure but with hch's proposed change to using read_cache_page_gfp() this 
> > doesn't make too much sense anymore at least for the read path.
> > 
> > Maybe "use page cache for superblock reading"?
> 
> That works too. We might need a new iteration that summarizes up all the
> feedback so far, so we have same code to refer to.

Per my question on the second patch:  why even use the page cache at
all.  btrfs already caches the value outside the pagecache, so why
even bother with the page cache overhead?
Johannes Thumshirn Jan. 30, 2020, 3:53 p.m. UTC | #5
On 30/01/2020 14:39, Christoph Hellwig wrote:
> On Thu, Jan 30, 2020 at 01:15:30PM +0100, David Sterba wrote:
>>> Sure but with hch's proposed change to using read_cache_page_gfp() this
>>> doesn't make too much sense anymore at least for the read path.
>>>
>>> Maybe "use page cache for superblock reading"?
>>
>> That works too. We might need a new iteration that summarizes up all the
>> feedback so far, so we have same code to refer to.
> 
> Per my question on the second patch:  why even use the page cache at
> all.  btrfs already caches the value outside the pagecache, so why
> even bother with the page cache overhead?
> 
This is what my first version did, alloc_page() and submit_bio() 
directly [1]. But reviewers told me to go the route via page cache.

[1] 
https://lore.kernel.org/linux-btrfs/20200117125105.20989-1-johannes.thumshirn@wdc.com/
Christoph Hellwig Jan. 30, 2020, 3:56 p.m. UTC | #6
On Thu, Jan 30, 2020 at 03:53:37PM +0000, Johannes Thumshirn wrote:
> On 30/01/2020 14:39, Christoph Hellwig wrote:
> > On Thu, Jan 30, 2020 at 01:15:30PM +0100, David Sterba wrote:
> >>> Sure but with hch's proposed change to using read_cache_page_gfp() this
> >>> doesn't make too much sense anymore at least for the read path.
> >>>
> >>> Maybe "use page cache for superblock reading"?
> >>
> >> That works too. We might need a new iteration that summarizes up all the
> >> feedback so far, so we have same code to refer to.
> > 
> > Per my question on the second patch:  why even use the page cache at
> > all.  btrfs already caches the value outside the pagecache, so why
> > even bother with the page cache overhead?
> > 
> This is what my first version did, alloc_page() and submit_bio() 
> directly [1]. But reviewers told me to go the route via page cache.

I only see your patch at the url, not any reply.  What is the issue
of not using the page cache?  Also you really shoudn't need a separate
alloc_page - you should be able to use the already cached superblock
as the destination and source of I/O, assuming they are properly aligned
(and if not that could be fixed easily).
Johannes Thumshirn Jan. 30, 2020, 4:09 p.m. UTC | #7
On 30/01/2020 16:56, Christoph Hellwig wrote:
> On Thu, Jan 30, 2020 at 03:53:37PM +0000, Johannes Thumshirn wrote:
>> On 30/01/2020 14:39, Christoph Hellwig wrote:
>>> On Thu, Jan 30, 2020 at 01:15:30PM +0100, David Sterba wrote:
>>>>> Sure but with hch's proposed change to using read_cache_page_gfp() this
>>>>> doesn't make too much sense anymore at least for the read path.
>>>>>
>>>>> Maybe "use page cache for superblock reading"?
>>>>
>>>> That works too. We might need a new iteration that summarizes up all the
>>>> feedback so far, so we have same code to refer to.
>>>
>>> Per my question on the second patch:  why even use the page cache at
>>> all.  btrfs already caches the value outside the pagecache, so why
>>> even bother with the page cache overhead?
>>>
>> This is what my first version did, alloc_page() and submit_bio()
>> directly [1]. But reviewers told me to go the route via page cache.
> 
> I only see your patch at the url, not any reply.  What is the issue
> of not using the page cache?  Also you really shoudn't need a separate
> alloc_page - you should be able to use the already cached superblock
> as the destination and source of I/O, assuming they are properly aligned
> (and if not that could be fixed easily).
> 

Care to elaborate? Who would have cached the superblock, when we haven't 
mounted the FS yet.

So here's the answer from that thread:

"IIRC we had some funny bugs when mount and device scan (udev) raced 
just after mkfs, the page cache must be used so there's no way to read 
stale data."
https://lore.kernel.org/linux-btrfs/20200117151352.GK3929@twin.jikos.cz/
Christoph Hellwig Jan. 30, 2020, 4:15 p.m. UTC | #8
On Thu, Jan 30, 2020 at 04:09:45PM +0000, Johannes Thumshirn wrote:
> > alloc_page - you should be able to use the already cached superblock
> > as the destination and source of I/O, assuming they are properly aligned
> > (and if not that could be fixed easily).
> > 
> 
> Care to elaborate? Who would have cached the superblock, when we haven't 
> mounted the FS yet.

You use one allocation, which then gets owned by the in-memory suprblock
once probed.  No need to allocate memory again to write it out.

> 
> So here's the answer from that thread:
> 
> "IIRC we had some funny bugs when mount and device scan (udev) raced 
> just after mkfs, the page cache must be used so there's no way to read 
> stale data."
> https://lore.kernel.org/linux-btrfs/20200117151352.GK3929@twin.jikos.cz/

Well, XFS has not used the page cache for any metadata since 2011, and
we've not run into those problems.  But then again XFS doesn't use
the page cache for mkfs, does btrfs?
David Sterba Jan. 30, 2020, 4:16 p.m. UTC | #9
On Thu, Jan 30, 2020 at 05:39:21AM -0800, Christoph Hellwig wrote:
> On Thu, Jan 30, 2020 at 01:15:30PM +0100, David Sterba wrote:
> > > Sure but with hch's proposed change to using read_cache_page_gfp() this 
> > > doesn't make too much sense anymore at least for the read path.
> > > 
> > > Maybe "use page cache for superblock reading"?
> > 
> > That works too. We might need a new iteration that summarizes up all the
> > feedback so far, so we have same code to refer to.
> 
> Per my question on the second patch:  why even use the page cache at
> all.  btrfs already caches the value outside the pagecache, so why
> even bother with the page cache overhead?

I'd like to remove the buffer_head interface in two steps. First remove
the wrappers and open code the calls, so the functionality is unchanged.
Then have another look if we can optimize that further, eg. removing the
page cache.

We've had subtle bugs when mount/scanning ioctl/mkfs interacted and did
not see a consistent state. See 6f60cbd3ae442cb35861bb522f388db123d42ec1
("btrfs: access superblock via pagecache in scan_one_device"). It's been
a few years so I don't recall all details, but it was quite hard to
catch. Mkfs followed by mount sometimes did not work.

So page cache is the common access point for all the parts and for now
we rely on that. If removing is possible, I'd like to see a good
explanation why and not such change lost in a well meant cleanup.
Johannes Thumshirn Jan. 31, 2020, 1:43 p.m. UTC | #10
On 30/01/2020 17:16, David Sterba wrote:
> I'd like to remove the buffer_head interface in two steps. First remove
> the wrappers and open code the calls, so the functionality is unchanged.
> Then have another look if we can optimize that further, eg. removing the
> page cache.
> 
> We've had subtle bugs when mount/scanning ioctl/mkfs interacted and did
> not see a consistent state. See 6f60cbd3ae442cb35861bb522f388db123d42ec1
> ("btrfs: access superblock via pagecache in scan_one_device"). It's been
> a few years so I don't recall all details, but it was quite hard to
> catch. Mkfs followed by mount sometimes did not work.
> 
> So page cache is the common access point for all the parts and for now
> we rely on that. If removing is possible, I'd like to see a good
> explanation why and not such change lost in a well meant cleanup.
> 

I have a local version now, where btrfs_read_dev_one_super() uses 
read_cache_page_gfp() and btrfs_scratch_superblocks() uses 
write_one_page() offloading all the I/O to the pagecache. So far this 
works as expected.

Now when I started converting write_dev_supers() and friends I wasn't 
sure if I can/should mix up read_cache_page_gfp() and submit_bio_wait(). 
I could also convert it to write_one_page() but this way we would loose 
integrity checking for the super block.

Any advice?
Christoph Hellwig Feb. 3, 2020, 8:29 a.m. UTC | #11
On Fri, Jan 31, 2020 at 01:43:19PM +0000, Johannes Thumshirn wrote:
> 
> I have a local version now, where btrfs_read_dev_one_super() uses 
> read_cache_page_gfp() and btrfs_scratch_superblocks() uses 
> write_one_page() offloading all the I/O to the pagecache. So far this 
> works as expected.
> 
> Now when I started converting write_dev_supers() and friends I wasn't 
> sure if I can/should mix up read_cache_page_gfp() and submit_bio_wait(). 
> I could also convert it to write_one_page() but this way we would loose 
> integrity checking for the super block.
> 
> Any advice?

You can mix it, although it needs some big fat comments explaining what
is going on there to the reader.