[00/20] btrfs: refactor and generalize chunk/dev_extent/extent allocation

Message ID: 20200206104214.400857-1-naohiro.aota@wdc.com

Message

Naohiro Aota Feb. 6, 2020, 10:41 a.m. UTC
This series refactors the chunk allocation, device extent allocation and
extent allocation functions and generalizes them so that other
allocation policies can be implemented easily.

On top of this series, we can simplify parts of the "btrfs: zoned
block device support" series by adding new chunk and extent allocator
types for zoned block devices. Furthermore, we will be able to
implement and test other allocators from the ideas page of the wiki,
e.g. SSD caching, a dedicated metadata drive, chunk allocation groups,
and so on.

This series has no functional changes except introducing "enum
btrfs_chunk_allocation_policy" and "enum
btrfs_extent_allocation_policy".

* Refactoring chunk/dev_extent allocator

Two functions are split out of find_free_dev_extent_start():
dev_extent_search_start() decides the starting position of the search,
and dev_extent_hole_check() checks whether a found hole is suitable for
device extent allocation.

__btrfs_alloc_chunk() is split into four functions: set_parameters()
initializes the parameters of an allocation, gather_device_info()
loops over the devices and gathers information about them,
decide_stripe_size() decides the size of the chunk and device extents,
and create_chunk() creates the chunk and the device extents.

* Refactoring extent allocator

Three functions are introduced in find_free_extent():
prepare_allocation() initializes the parameters and gives a hint byte
to start the allocation from, do_allocation() handles the actual
allocation within a given block group, and release_block_group() is
called when the allocator gives up on a block group and needs to reset
the allocation context.

Two functions are introduced in find_free_extent_update_loop():
found_extent() is called when the allocator finally finds a proper
extent, and chunk_allocation_failed() is called when allocating a new
chunk fails. An allocator implementation can use this hook to set the
next stage to try, e.g. LOOP_NO_EMPTY_SIZE.

Furthermore, the LOOP_NO_EMPTY_SIZE stage is tweaked so that allocators
other than the current clustered allocator skip it.

* Patch organization

Patch 1 is a trivial patch to fix the type of an argument of
find_free_extent_update_loop().

Patches 2-9 refactor the chunk and device extent allocation functions
find_free_dev_extent_start() and __btrfs_alloc_chunk().

Patches 10-20 refactor the extent allocation functions
find_free_extent() and find_free_extent_update_loop().

Naohiro Aota (20):
  btrfs: change type of full_search to bool
  btrfs: introduce chunk allocation policy
  btrfs: refactor find_free_dev_extent_start()
  btrfs: introduce alloc_chunk_ctl
  btrfs: factor out set_parameters()
  btrfs: factor out gather_device_info()
  btrfs: factor out decide_stripe_size()
  btrfs: factor out create_chunk()
  btrfs: parameterize dev_extent_min
  btrfs: introduce extent allocation policy
  btrfs: move hint_byte into find_free_extent_ctl
  btrfs: introduce clustered_alloc_info
  btrfs: factor out do_allocation()
  btrfs: drop unnecessary arguments from clustered allocation functions
  btrfs: factor out release_block_group()
  btrfs: factor out found_extent()
  btrfs: drop unnecessary arguments from find_free_extent_update_loop()
  btrfs: factor out chunk_allocation_failed()
  btrfs: add assertion on LOOP_NO_EMPTY_SIZE stage
  btrfs: factor out prepare_allocation()

 fs/btrfs/extent-tree.c | 378 ++++++++++++++++++++++++++-------------
 fs/btrfs/volumes.c     | 397 +++++++++++++++++++++++++++--------------
 fs/btrfs/volumes.h     |   6 +
 3 files changed, 525 insertions(+), 256 deletions(-)

Comments

Martin Steigerwald Feb. 6, 2020, 11:43 a.m. UTC | #1
Hi Naohiro.

Naohiro Aota - 06.02.20, 11:41:54 CET:
> This series refactors chunk allocation, device_extent allocation and
> extent allocation functions and make them generalized to be able to
> implement other allocation policy easily.
> 
> On top of this series, we can simplify some part of the "btrfs: zoned
> block device support" series as adding a new type of chunk allocator
> and extent allocator for zoned block devices. Furthermore, we will be
> able to implement and test some other allocator in the idea page of
> the wiki e.g. SSD caching, dedicated metadata drive, chunk allocation
> groups, and so on.

Regarding SSD caching, are you aware that there has been previous work,
which even involved handling part of it in the Virtual Filesystem
Switch (VFS)?

VFS hot-data tracking, LWN article:

https://lwn.net/Articles/525651/

Patchset, not sure whether it is the most recent one:

[PATCH v2 00/12] VFS hot tracking

https://lore.kernel.org/linux-btrfs/1368493184-5939-1-git-send-email-zwu.kernel@gmail.com/

So for SSD caching you may be able to re-use or pick up some of this 
work, unless it would be unsuitable for this new approach.

Thanks,
Naohiro Aota Feb. 7, 2020, 6:06 a.m. UTC | #2
On Thu, Feb 06, 2020 at 12:43:30PM +0100, Martin Steigerwald wrote:
>Hi Naohiro.
>
>Naohiro Aota - 06.02.20, 11:41:54 CET:
>> This series refactors chunk allocation, device_extent allocation and
>> extent allocation functions and make them generalized to be able to
>> implement other allocation policy easily.
>>
>> On top of this series, we can simplify some part of the "btrfs: zoned
>> block device support" series as adding a new type of chunk allocator
>> and extent allocator for zoned block devices. Furthermore, we will be
>> able to implement and test some other allocator in the idea page of
>> the wiki e.g. SSD caching, dedicated metadata drive, chunk allocation
>> groups, and so on.
>
>Regarding SSD caching, are you aware that there has been previous work
>with even involved handling part of it in the Virtual Filesystem Switch
>(VFS)?
>
>VFS hot-data tracking, LWN article:
>
>https://lwn.net/Articles/525651/
>
>Patchset, not sure whether it is the most recent one:
>
>[PATCH v2 00/12] VFS hot tracking
>
>https://lore.kernel.org/linux-btrfs/1368493184-5939-1-git-send-email-zwu.kernel@gmail.com/

Yes, I once saw that patchset, though I'm not sure about the details.

>So for SSD caching you may be able to re-use or pick up some of this
>work, unless it would be unsuitable to be used with this new approach.

Currently, I have no plan to implement the SSD caching feature
myself. I think some patches of a series like this [1] can be
reworked on top of my series, adding an "SSD caching chunk allocator".
So, suggestions about the hook interface are welcome.

[1] https://lore.kernel.org/linux-btrfs/1371817260-8615-3-git-send-email-zwu.kernel@gmail.com/

>Thanks,
>-- 
>Martin
>
>
Martin Steigerwald Feb. 7, 2020, 8:02 a.m. UTC | #3
Naohiro Aota - 07.02.20, 07:06:00 CET:
> On Thu, Feb 06, 2020 at 12:43:30PM +0100, Martin Steigerwald wrote:
> >Hi Naohiro.
> >
> >Naohiro Aota - 06.02.20, 11:41:54 CET:
> >> This series refactors chunk allocation, device_extent allocation and
> >> extent allocation functions and make them generalized to be able to
> >> implement other allocation policy easily.
> >> 
> >> On top of this series, we can simplify some part of the "btrfs: zoned
> >> block device support" series as adding a new type of chunk allocator
> >> and extent allocator for zoned block devices. Furthermore, we will be
> >> able to implement and test some other allocator in the idea page of
> >> the wiki e.g. SSD caching, dedicated metadata drive, chunk allocation
> >> groups, and so on.
> >
> >Regarding SSD caching, are you aware that there has been previous
> >work with even involved handling part of it in the Virtual
> >Filesystem Switch (VFS)?
> >
> >VFS hot-data tracking, LWN article:
> >
> >https://lwn.net/Articles/525651/
> >
> >Patchset, not sure whether it is the most recent one:
> >
> >[PATCH v2 00/12] VFS hot tracking
> >
> >https://lore.kernel.org/linux-btrfs/1368493184-5939-1-git-send-email-zwu.kernel@gmail.com/
> Yes, I once saw the patchset. Not sure about the detail, though.
> 
> >So for SSD caching you may be able to re-use or pick up some of this
> >work, unless it would be unsuitable to be used with this new
> >approach.
> Currently, I have no plan to implement the SSD caching feature by
> myself. I think some patches of the series like this [1] can be
> reworked on my series as adding an "SSD caching chunk allocator." So,
> it's welcome to hear suggestions about the hook interface.

Thank you. Adding Zhi Yong Wu to Cc.

I don't know details about this patch set either.

@Zhi Yong Wu: Are you interested in rebasing your SSD caching patch on 
top of this refactoring work?

Ciao,
Martin

> [1]
> https://lore.kernel.org/linux-btrfs/1371817260-8615-3-git-send-email-zwu.kernel@gmail.com/
> >Thanks,