[2/5] btrfs: get bdev from latest_dev for dio bh_result
diff mbox series

Message ID 2ae45bba23d97e0d60d9949d9c65dbab9961cb34.1570474492.git.dsterba@suse.com
State New
Headers show
Series
  • Remove extent_map::bdev
Related show

Commit Message

David Sterba Oct. 7, 2019, 7:37 p.m. UTC
To remove use of extent_map::bdev we need to find a replacement, and the
latest_bdev is the only one we can use here, because inode::i_bdev and
superblock::s_bdev are NULL.

The only thing that DIO code uses from the bdev is the blocksize to
perform alignment checks in do_blockdev_direct_IO, but we do them in
btrfs code before any call to DIO. We can't pass NULL because there are
dereferences of bdev in __blockdev_direct_IO, even though it's not going
to be used later.

So it's safe to pass any valid bdev that's used within the filesystem.

Signed-off-by: David Sterba <dsterba@suse.com>
---
 fs/btrfs/inode.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

Comments

Nikolay Borisov Oct. 9, 2019, 10:42 a.m. UTC | #1
On 7.10.19 г. 22:37 ч., David Sterba wrote:
> To remove use of extent_map::bdev we need to find a replacement, and the
> latest_bdev is the only one we can use here, because inode::i_bdev and
> superblock::s_bdev are NULL.
> 
> The only thing that DIO code uses from the bdev is the blocksize to
> perform alignment checks in do_blockdev_direct_IO, but we do them in
> btrfs code before any call to DIO. We can't pass NULL because there are

nit: This is not entirely correct. In fact map_bh in
do_blockdev_direct_IO gets filled in :

do_direct_IO
  get_more_blocks
   sdio->get_block() <-- this is btrfs_get_blocks_direct

Subsequently the map_bh->b_dev member is used in
clean_bdev_aliases and dio_new_bio to set the bio's bdev to that of the
buffer_head. However, because we have provided a submit function
dio_bio_submit calls our submission function and ignores the bdev.

Furthermore __blockdev_direct_IO already uses latest_bdev as per
btrfs_direct_IO.


> dereferences of bdev in __blockdev_direct_IO, even though it's not going
> to be used later.
> 
> So it's safe to pass any valid bdev that's used within the filesystem.
> 
> Signed-off-by: David Sterba <dsterba@suse.com>

The code is ok so :

Reviewed-by: Nikolay Borisov <nborisov@suse.com>

> ---
>  fs/btrfs/inode.c | 6 ++++--
>  1 file changed, 4 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
> index 067cbd6e3923..8e085d21c3c5 100644
> --- a/fs/btrfs/inode.c
> +++ b/fs/btrfs/inode.c
> @@ -7603,6 +7603,8 @@ static int btrfs_get_blocks_direct_read(struct extent_map *em,
>  					struct inode *inode,
>  					u64 start, u64 len)
>  {
> +	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
> +
>  	if (em->block_start == EXTENT_MAP_HOLE ||
>  			test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
>  		return -ENOENT;
> @@ -7612,7 +7614,7 @@ static int btrfs_get_blocks_direct_read(struct extent_map *em,
>  	bh_result->b_blocknr = (em->block_start + (start - em->start)) >>
>  		inode->i_blkbits;
>  	bh_result->b_size = len;
> -	bh_result->b_bdev = em->bdev;
> +	bh_result->b_bdev = fs_info->fs_devices->latest_bdev;
>  	set_buffer_mapped(bh_result);
>  
>  	return 0;
> @@ -7695,7 +7697,7 @@ static int btrfs_get_blocks_direct_write(struct extent_map **map,
>  	bh_result->b_blocknr = (em->block_start + (start - em->start)) >>
>  		inode->i_blkbits;
>  	bh_result->b_size = len;
> -	bh_result->b_bdev = em->bdev;
> +	bh_result->b_bdev = fs_info->fs_devices->latest_bdev;
>  	set_buffer_mapped(bh_result);
>  
>  	if (!test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
>
David Sterba Oct. 11, 2019, 6:26 p.m. UTC | #2
On Wed, Oct 09, 2019 at 01:42:00PM +0300, Nikolay Borisov wrote:
> 
> 
> On 7.10.19 г. 22:37 ч., David Sterba wrote:
> > To remove use of extent_map::bdev we need to find a replacement, and the
> > latest_bdev is the only one we can use here, because inode::i_bdev and
> > superblock::s_bdev are NULL.
> > 
> > The only thing that DIO code uses from the bdev is the blocksize to
> > perform alignment checks in do_blockdev_direct_IO, but we do them in
> > btrfs code before any call to DIO. We can't pass NULL because there are
> 
> nit: This is not entirely correct. In fact map_bh in
> do_blockdev_direct_IO gets filled in :
> 
> do_direct_IO
>   get_more_blocks
>    sdio->get_block() <-- this is btrfs_get_blocks_direct
> 
> Subsequently the map_bh->b_dev member is used in
> clean_bdev_aliases and dio_new_bio to set the bio's bdev to that of the
> buffer_head. However, because we have provided a submit function
> dio_bio_submit calls our submission function and ignores the bdev.

You're right, and actually I got crashes in clean_bdev_aliases when I
supplied a NULL bdev, so I'll add it to the changelog. Thanks.
David Sterba Oct. 11, 2019, 6:54 p.m. UTC | #3
On Fri, Oct 11, 2019 at 08:26:42PM +0200, David Sterba wrote:
> You're right, and actually I got crashes in clean_bdev_aliases when I
> supplied a NULL bdev, so I'll add it to the changelog. Thanks.

Unless there are further comments, I won't resend the whole patchset.
The changelog in this patch will be:

---
    btrfs: get bdev from latest_dev for dio bh_result

    To remove use of extent_map::bdev we need to find a replacement, and the
    latest_bdev is the only one we can use here, because inode::i_bdev and
    superblock::s_bdev are NULL.

    The DIO code uses bdev in two places:

    * to read blocksize to perform alignment checks in
      do_blockdev_direct_IO, but we do them in btrfs code before any call to
      DIO

    * in the following call chain:

      do_direct_IO
        get_more_blocks
         sdio->get_block() <-- this is btrfs_get_blocks_direct

      subsequently the map_bh->b_dev member is used in clean_bdev_aliases
      and dio_new_bio to set the bio's bdev to that of the buffer_head.
      However, because we have provided a submit function dio_bio_submit
      calls our submission function and ignores the bdev.

    We can't pass NULL because there are dereferences of bdev in
    __blockdev_direct_IO, even though it's not going to be used later.

    So it's safe to pass any valid bdev that's used within the filesystem.
---

Patch
diff mbox series

diff --git a/fs/btrfs/inode.c b/fs/btrfs/inode.c
index 067cbd6e3923..8e085d21c3c5 100644
--- a/fs/btrfs/inode.c
+++ b/fs/btrfs/inode.c
@@ -7603,6 +7603,8 @@  static int btrfs_get_blocks_direct_read(struct extent_map *em,
 					struct inode *inode,
 					u64 start, u64 len)
 {
+	struct btrfs_fs_info *fs_info = btrfs_sb(inode->i_sb);
+
 	if (em->block_start == EXTENT_MAP_HOLE ||
 			test_bit(EXTENT_FLAG_PREALLOC, &em->flags))
 		return -ENOENT;
@@ -7612,7 +7614,7 @@  static int btrfs_get_blocks_direct_read(struct extent_map *em,
 	bh_result->b_blocknr = (em->block_start + (start - em->start)) >>
 		inode->i_blkbits;
 	bh_result->b_size = len;
-	bh_result->b_bdev = em->bdev;
+	bh_result->b_bdev = fs_info->fs_devices->latest_bdev;
 	set_buffer_mapped(bh_result);
 
 	return 0;
@@ -7695,7 +7697,7 @@  static int btrfs_get_blocks_direct_write(struct extent_map **map,
 	bh_result->b_blocknr = (em->block_start + (start - em->start)) >>
 		inode->i_blkbits;
 	bh_result->b_size = len;
-	bh_result->b_bdev = em->bdev;
+	bh_result->b_bdev = fs_info->fs_devices->latest_bdev;
 	set_buffer_mapped(bh_result);
 
 	if (!test_bit(EXTENT_FLAG_PREALLOC, &em->flags))