[0/3,RESEND,Rebased] readmirror feature
mbox series

Message ID 20190626083402.1895-1-anand.jain@oracle.com
Headers show
Series
  • readmirror feature
Related show

Message

Anand Jain June 26, 2019, 8:33 a.m. UTC
These patches are tested to be working fine.

Function call chain  __btrfs_map_block()->find_live_mirror() uses
thread pid to determine the %mirror_num when the mirror_num=0.

This patch introduces a framework so that we can add policies to determine
the %mirror_num. And adds the devid as the readmirror policy.

The property is stored as an extented attributes of root inode
(BTRFS_FS_TREE_OBJECTID).
User provided devid list is validated against the fs_devices::dev_list.

 For example:
   Usage:
     btrfs property set <mnt> readmirror devid<n>[,<m>...]
     btrfs property set <mnt> readmirror ""

   mkfs.btrfs -fq -draid1 -mraid1 /dev/sd[b-d] && mount /dev/sdb /btrfs
   btrfs prop set /btrfs readmirror devid1,2
   btrfs prop get /btrfs readmirror
    readmirror=devid1,2
   getfattr -n btrfs.readmirror --absolute-names /btrfs
    btrfs.readmirror="devid1,2"
   btrfs prop set /btrfs readmirror ""
   getfattr -n btrfs.readmirror --absolute-names /btrfs
    /btrfs: btrfs.readmirror: No such attribute
   btrfs prop get /btrfs readmirror

RFC->v1:
  Drops pid as one of the readmirror policy choices and as usual remains
  as default. And when the devid is reset the readmirror policy falls back
  to pid.
  Drops the mount -o readmirror idea, it can be added at a later point of
  time.
  Property now accepts more than 1 devid as readmirror device. As shown
  in the example above.

Anand Jain (3):
  btrfs: add inode pointer to prop_handler::validate()
  btrfs: add readmirror property framework
  btrfs: add readmirror devid property

 fs/btrfs/props.c   | 120 +++++++++++++++++++++++++++++++++++++++++++--
 fs/btrfs/props.h   |   4 +-
 fs/btrfs/volumes.c |  25 +++++++++-
 fs/btrfs/volumes.h |   8 +++
 fs/btrfs/xattr.c   |   2 +-
 5 files changed, 150 insertions(+), 9 deletions(-)

Comments

Anand Jain July 2, 2019, 8:09 a.m. UTC | #1
Ping?


On 26/6/19 4:33 PM, Anand Jain wrote:
> These patches are tested to be working fine.
> 
> Function call chain  __btrfs_map_block()->find_live_mirror() uses
> thread pid to determine the %mirror_num when the mirror_num=0.
> 
> This patch introduces a framework so that we can add policies to determine
> the %mirror_num. And adds the devid as the readmirror policy.
> 
> The property is stored as an extented attributes of root inode
> (BTRFS_FS_TREE_OBJECTID).
> User provided devid list is validated against the fs_devices::dev_list.
> 
>   For example:
>     Usage:
>       btrfs property set <mnt> readmirror devid<n>[,<m>...]
>       btrfs property set <mnt> readmirror ""
> 
>     mkfs.btrfs -fq -draid1 -mraid1 /dev/sd[b-d] && mount /dev/sdb /btrfs
>     btrfs prop set /btrfs readmirror devid1,2
>     btrfs prop get /btrfs readmirror
>      readmirror=devid1,2
>     getfattr -n btrfs.readmirror --absolute-names /btrfs
>      btrfs.readmirror="devid1,2"
>     btrfs prop set /btrfs readmirror ""
>     getfattr -n btrfs.readmirror --absolute-names /btrfs
>      /btrfs: btrfs.readmirror: No such attribute
>     btrfs prop get /btrfs readmirror
> 
> RFC->v1:
>    Drops pid as one of the readmirror policy choices and as usual remains
>    as default. And when the devid is reset the readmirror policy falls back
>    to pid.
>    Drops the mount -o readmirror idea, it can be added at a later point of
>    time.
>    Property now accepts more than 1 devid as readmirror device. As shown
>    in the example above.
> 
> Anand Jain (3):
>    btrfs: add inode pointer to prop_handler::validate()
>    btrfs: add readmirror property framework
>    btrfs: add readmirror devid property
> 
>   fs/btrfs/props.c   | 120 +++++++++++++++++++++++++++++++++++++++++++--
>   fs/btrfs/props.h   |   4 +-
>   fs/btrfs/volumes.c |  25 +++++++++-
>   fs/btrfs/volumes.h |   8 +++
>   fs/btrfs/xattr.c   |   2 +-
>   5 files changed, 150 insertions(+), 9 deletions(-)
>
Qu Wenruo July 24, 2019, 12:20 a.m. UTC | #2
On 2019/6/26 下午4:33, Anand Jain wrote:
> These patches are tested to be working fine.
> 
> Function call chain  __btrfs_map_block()->find_live_mirror() uses
> thread pid to determine the %mirror_num when the mirror_num=0.
> 
> This patch introduces a framework so that we can add policies to determine
> the %mirror_num. And adds the devid as the readmirror policy.
> 
> The property is stored as an extented attributes of root inode
> (BTRFS_FS_TREE_OBJECTID).

This doesn't look right to me.

As readmirror should work at chunk layer, putting it into root tree
doesn't follow the layer separation of btrfs.

And furthermore, this breaks the XATTR schema. Normally we only have
XATTR item after an INODE item, not a ROOT_ITEM.

Is the on-disk format already accepted or still under design stage?

Thanks,
Qu

> User provided devid list is validated against the fs_devices::dev_list.
> 
>  For example:
>    Usage:
>      btrfs property set <mnt> readmirror devid<n>[,<m>...]
>      btrfs property set <mnt> readmirror ""
> 
>    mkfs.btrfs -fq -draid1 -mraid1 /dev/sd[b-d] && mount /dev/sdb /btrfs
>    btrfs prop set /btrfs readmirror devid1,2
>    btrfs prop get /btrfs readmirror
>     readmirror=devid1,2
>    getfattr -n btrfs.readmirror --absolute-names /btrfs
>     btrfs.readmirror="devid1,2"
>    btrfs prop set /btrfs readmirror ""
>    getfattr -n btrfs.readmirror --absolute-names /btrfs
>     /btrfs: btrfs.readmirror: No such attribute
>    btrfs prop get /btrfs readmirror
> 
> RFC->v1:
>   Drops pid as one of the readmirror policy choices and as usual remains
>   as default. And when the devid is reset the readmirror policy falls back
>   to pid.
>   Drops the mount -o readmirror idea, it can be added at a later point of
>   time.
>   Property now accepts more than 1 devid as readmirror device. As shown
>   in the example above.
> 
> Anand Jain (3):
>   btrfs: add inode pointer to prop_handler::validate()
>   btrfs: add readmirror property framework
>   btrfs: add readmirror devid property
> 
>  fs/btrfs/props.c   | 120 +++++++++++++++++++++++++++++++++++++++++++--
>  fs/btrfs/props.h   |   4 +-
>  fs/btrfs/volumes.c |  25 +++++++++-
>  fs/btrfs/volumes.h |   8 +++
>  fs/btrfs/xattr.c   |   2 +-
>  5 files changed, 150 insertions(+), 9 deletions(-)
>
Anand Jain July 24, 2019, 2:26 a.m. UTC | #3
> On 24 Jul 2019, at 8:20 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
> 
> 
> 
> On 2019/6/26 下午4:33, Anand Jain wrote:
>> These patches are tested to be working fine.
>> 
>> Function call chain  __btrfs_map_block()->find_live_mirror() uses
>> thread pid to determine the %mirror_num when the mirror_num=0.
>> 
>> This patch introduces a framework so that we can add policies to determine
>> the %mirror_num. And adds the devid as the readmirror policy.
>> 
>> The property is stored as an extented attributes of root inode
>> (BTRFS_FS_TREE_OBJECTID).
> 
> This doesn't look right to me.
> 
> As readmirror should work at chunk layer, putting it into root tree
> doesn't follow the layer separation of btrfs.
> 
> And furthermore, this breaks the XATTR schema. Normally we only have
> XATTR item after an INODE item, not a ROOT_ITEM.
> 
> Is the on-disk format already accepted or still under design stage?
> 

 I mentioned about the storage for this new property in the RFC patch, as I knew there will be some surprises.

 The advantage of using the XATTR on the ROOT_ITEM is there is no on-disk format update nor there is any new KEY, albeit it deviates from the traditional way of using the xattr. Also, this approach don’t need an ioctl, as things work using the existing get/set xattr interface.

 The other way I had in mind was to introduce a new Key in the dev-tree such as

    (BTRFS_READMIRROR_OBJECTID, BTRFS_PERSISTENT_ITEM_KEY, devid)

 Again the interface can be ioctl or the get/set xattr. If we have to use the xattr then irrespective which inode is used we would anyway store it in the dev-tree using the above key.

 This is still open for changes, the idea is to get a long lasting flexible design, so comments are welcome.

Thanks, Anand


> Thanks,
> Qu
> 
>> User provided devid list is validated against the fs_devices::dev_list.
>> 
>> For example:
>>   Usage:
>>     btrfs property set <mnt> readmirror devid<n>[,<m>...]
>>     btrfs property set <mnt> readmirror ""
>> 
>>   mkfs.btrfs -fq -draid1 -mraid1 /dev/sd[b-d] && mount /dev/sdb /btrfs
>>   btrfs prop set /btrfs readmirror devid1,2
>>   btrfs prop get /btrfs readmirror
>>    readmirror=devid1,2
>>   getfattr -n btrfs.readmirror --absolute-names /btrfs
>>    btrfs.readmirror="devid1,2"
>>   btrfs prop set /btrfs readmirror ""
>>   getfattr -n btrfs.readmirror --absolute-names /btrfs
>>    /btrfs: btrfs.readmirror: No such attribute
>>   btrfs prop get /btrfs readmirror
>> 
>> RFC->v1:
>>  Drops pid as one of the readmirror policy choices and as usual remains
>>  as default. And when the devid is reset the readmirror policy falls back
>>  to pid.
>>  Drops the mount -o readmirror idea, it can be added at a later point of
>>  time.
>>  Property now accepts more than 1 devid as readmirror device. As shown
>>  in the example above.
>> 
>> Anand Jain (3):
>>  btrfs: add inode pointer to prop_handler::validate()
>>  btrfs: add readmirror property framework
>>  btrfs: add readmirror devid property
>> 
>> fs/btrfs/props.c   | 120 +++++++++++++++++++++++++++++++++++++++++++--
>> fs/btrfs/props.h   |   4 +-
>> fs/btrfs/volumes.c |  25 +++++++++-
>> fs/btrfs/volumes.h |   8 +++
>> fs/btrfs/xattr.c   |   2 +-
>> 5 files changed, 150 insertions(+), 9 deletions(-)
>> 
>
Qu Wenruo July 24, 2019, 3:05 a.m. UTC | #4
On 2019/7/24 上午10:26, Anand Jain wrote:
> 
> 
>> On 24 Jul 2019, at 8:20 AM, Qu Wenruo <quwenruo.btrfs@gmx.com> wrote:
>>
>>
>>
>> On 2019/6/26 下午4:33, Anand Jain wrote:
>>> These patches are tested to be working fine.
>>>
>>> Function call chain  __btrfs_map_block()->find_live_mirror() uses
>>> thread pid to determine the %mirror_num when the mirror_num=0.
>>>
>>> This patch introduces a framework so that we can add policies to determine
>>> the %mirror_num. And adds the devid as the readmirror policy.
>>>
>>> The property is stored as an extented attributes of root inode
>>> (BTRFS_FS_TREE_OBJECTID).
>>
>> This doesn't look right to me.
>>
>> As readmirror should work at chunk layer, putting it into root tree
>> doesn't follow the layer separation of btrfs.
>>
>> And furthermore, this breaks the XATTR schema. Normally we only have
>> XATTR item after an INODE item, not a ROOT_ITEM.
>>
>> Is the on-disk format already accepted or still under design stage?
>>
> 
>  I mentioned about the storage for this new property in the RFC patch, as I knew there will be some surprises.
> 
>  The advantage of using the XATTR on the ROOT_ITEM is there is no on-disk format update nor there is any new KEY, albeit it deviates from the traditional way of using the xattr. Also, this approach don’t need an ioctl, as things work using the existing get/set xattr interface.
> 
>  The other way I had in mind was to introduce a new Key in the dev-tree such as
> 
>     (BTRFS_READMIRROR_OBJECTID, BTRFS_PERSISTENT_ITEM_KEY, devid)

I'd say, this is more generic.
If one day we have some more features, we can just add new objectid for it.
And to my point of view, it's easier to validate than some floating
XATTR in root tree, especially for tree checker.

The only minor comment for this key is the offset and where the tree it
belongs to.

If the readmirror policy is global, I'd prefer to put it into root tree
and set the key offset to 0.

If the policy is per-chunk, it would be more easier to understand by
putting it into chunk tree, and use chunk bytenr as key offset.

If the policy is per-device, then your current key looks pretty good to me.
> 
>  Again the interface can be ioctl or the get/set xattr. If we have to use the xattr then irrespective which inode is used we would anyway store it in the dev-tree using the above key.

The prop interface can be enhanced, as we have filesystem level prop
already, I see no reason why it can't handle things in other trees.

Thanks,
Qu

> 
>  This is still open for changes, the idea is to get a long lasting flexible design, so comments are welcome.
> 
> Thanks, Anand
> 
> 
>> Thanks,
>> Qu
>>
>>> User provided devid list is validated against the fs_devices::dev_list.
>>>
>>> For example:
>>>   Usage:
>>>     btrfs property set <mnt> readmirror devid<n>[,<m>...]
>>>     btrfs property set <mnt> readmirror ""
>>>
>>>   mkfs.btrfs -fq -draid1 -mraid1 /dev/sd[b-d] && mount /dev/sdb /btrfs
>>>   btrfs prop set /btrfs readmirror devid1,2
>>>   btrfs prop get /btrfs readmirror
>>>    readmirror=devid1,2
>>>   getfattr -n btrfs.readmirror --absolute-names /btrfs
>>>    btrfs.readmirror="devid1,2"
>>>   btrfs prop set /btrfs readmirror ""
>>>   getfattr -n btrfs.readmirror --absolute-names /btrfs
>>>    /btrfs: btrfs.readmirror: No such attribute
>>>   btrfs prop get /btrfs readmirror
>>>
>>> RFC->v1:
>>>  Drops pid as one of the readmirror policy choices and as usual remains
>>>  as default. And when the devid is reset the readmirror policy falls back
>>>  to pid.
>>>  Drops the mount -o readmirror idea, it can be added at a later point of
>>>  time.
>>>  Property now accepts more than 1 devid as readmirror device. As shown
>>>  in the example above.
>>>
>>> Anand Jain (3):
>>>  btrfs: add inode pointer to prop_handler::validate()
>>>  btrfs: add readmirror property framework
>>>  btrfs: add readmirror devid property
>>>
>>> fs/btrfs/props.c   | 120 +++++++++++++++++++++++++++++++++++++++++++--
>>> fs/btrfs/props.h   |   4 +-
>>> fs/btrfs/volumes.c |  25 +++++++++-
>>> fs/btrfs/volumes.h |   8 +++
>>> fs/btrfs/xattr.c   |   2 +-
>>> 5 files changed, 150 insertions(+), 9 deletions(-)
>>>
>>
>