[v2,1/3] btrfs: change how we calculate the nrptrs for btrfs_buffered_write()

Message ID 20200824075959.85212-2-wqu@suse.com
State New, archived
Series btrfs: basic refactor of btrfs_buffered_write()

Commit Message

Qu Wenruo Aug. 24, 2020, 7:59 a.m. UTC
@nrptrs of btrfs_buffered_write() determines the upper limit of pages we
can process in one batch.

Normally we want it to be as large as possible as btrfs metadata/data
reserve and release can cost quite a lot of time.

Commit 142349f541d0 ("btrfs: lower the dirty balance poll interval")
introduced two extra limitations which look suspicious now:
- Limit the page number to nr_dirtied_pause - nr_dirtied
  However, I can't find any mainline fs or iomap code that uses these
  two members.
  Although page writeback still uses those two members, no other fs
  uses them at all, so I doubt their usefulness.

- Ensure we always allocate at least 8 pages
  The lower limit of 8 looks pretty strange: even if we're just writing
  4K, we will allocate struct page pointers for 8 pages no matter what.
  To me, 8 pages looks more like an upper limit.

This patch changes it by:
- Extracting the calculation into another function
  This allows us to add more comments explaining each calculation.

- Doing a proper page alignment calculation
  The old calculation, DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
  doesn't take @pos into consideration.
  In fact, we can easily have iov_iter_count(i) == 2 but still cross two
  pages (pos == page_offset() + PAGE_SIZE - 1).

- Removing the useless max(8)

- Using a PAGE_SIZE independent upper limit
  Now we use 64K as the nr_pages limit, so we should get similar
  performance across different arches.

Signed-off-by: Qu Wenruo <wqu@suse.com>
---
 fs/btrfs/file.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)
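
To illustrate the page alignment point in the commit message, here is a
minimal userspace sketch, not kernel code. The function names and the
explicit page_size parameter are assumptions made for illustration. It
compares a count-only estimate with one that also accounts for the start
offset.

```c
#include <assert.h>
#include <stdint.h>

/* Userspace model only: these names are illustrative, not kernel API. */

/* Old-style estimate: ceil(count / page_size), ignoring the start offset. */
static uint64_t pages_ignoring_pos(uint64_t count, uint64_t page_size)
{
	assert(page_size > 0);
	return (count + page_size - 1) / page_size;
}

/* New-style estimate: pages spanned by the byte range [pos, pos + count). */
static uint64_t pages_spanned(uint64_t pos, uint64_t count, uint64_t page_size)
{
	uint64_t start, end;

	/* The mask arithmetic below needs a power-of-two page size. */
	assert(page_size > 0 && (page_size & (page_size - 1)) == 0);

	start = pos & ~(page_size - 1);                          /* round down */
	end = (pos + count + page_size - 1) & ~(page_size - 1);  /* round up */
	return (end - start) / page_size;
}
```

With 4K pages, a 2-byte write starting at offset 4095 touches two pages,
so pages_spanned(4095, 2, 4096) gives 2 while pages_ignoring_pos(2, 4096)
gives 1.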

Comments

Nikolay Borisov Aug. 24, 2020, 10:32 a.m. UTC | #1
On 24.08.20 at 10:59, Qu Wenruo wrote:
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 5a818ebcb01f..c592350a5a82 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1620,6 +1620,29 @@ void btrfs_check_nocow_unlock(struct btrfs_inode *inode)
>  	btrfs_drew_write_unlock(&inode->root->snapshot_lock);
>  }
>  
> +/* Helper to get how many pages we should alloc for the batch */
> +static int get_nr_pages(struct btrfs_fs_info *fs_info, loff_t pos,
> +			struct iov_iter *iov)

Drop the fs_info parameter as it's unused. Also rename the function to
calc_nr_pages, since that's closer to what the function actually does.

> +{
> +	int nr_pages;
> +
> +	/*
> +	 * Try to cover the full iov range, as btrfs metadata/data reserve
> +	 * and release can be pretty slow, thus the more pages we process in
> +	 * one batch the better.
> +	 */
> +	nr_pages = (round_up(pos + iov_iter_count(iov), PAGE_SIZE) -
> +		    round_down(pos, PAGE_SIZE)) / PAGE_SIZE;
> +
> +	/*
> +	 * But still limit it to 64KiB, so we can still get a similar
> +	 * buffered write performance between different page sizes
> +	 */
> +	nr_pages = min_t(int, nr_pages, SZ_64K / PAGE_SIZE);
> +
> +	return nr_pages;
> +}
> +
>  static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>  					       struct iov_iter *i)
>  {
> @@ -1638,10 +1661,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>  	bool only_release_metadata = false;
>  	bool force_page_uptodate = false;
>  
> -	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
> -			PAGE_SIZE / (sizeof(struct page *)));
> -	nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied);
> -	nrptrs = max(nrptrs, 8);
> +	nrptrs = get_nr_pages(fs_info, pos, i);
>  	pages = kmalloc_array(nrptrs, sizeof(struct page *), GFP_KERNEL);
>  	if (!pages)
>  		return -ENOMEM;
>
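
For reference, the helper with the review feedback above applied might
look roughly like the following. This is a hypothetical userspace model,
not the code from any later revision of this series: page_size is passed
as a parameter (the kernel uses the compile-time PAGE_SIZE) and
MODEL_SZ_64K stands in for SZ_64K, so the effect of the 64KiB cap can be
compared across page sizes.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_SZ_64K (64u * 1024u)	/* stand-in for the kernel's SZ_64K */

/*
 * Hypothetical model of the helper after review: renamed to calc_nr_pages
 * and without the unused fs_info argument.
 */
static int calc_nr_pages(uint64_t pos, uint64_t count, uint64_t page_size)
{
	uint64_t first, last_end, nr, cap;

	assert(page_size > 0 && page_size <= MODEL_SZ_64K);

	first = pos / page_size;				/* first page index */
	last_end = (pos + count + page_size - 1) / page_size;	/* one past the last page */
	nr = last_end - first;

	cap = MODEL_SZ_64K / page_size;		/* 64KiB worth of pages */
	return (int)(nr < cap ? nr : cap);
}
```

For a 1MiB write at offset 0, this model returns 16 with 4K pages and 1
with 64K pages, which is 64KiB per batch in both cases.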
Josef Bacik Aug. 24, 2020, 4:59 p.m. UTC | #2
On 8/24/20 3:59 AM, Qu Wenruo wrote:
> diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
> index 5a818ebcb01f..c592350a5a82 100644
> --- a/fs/btrfs/file.c
> +++ b/fs/btrfs/file.c
> @@ -1620,6 +1620,29 @@ void btrfs_check_nocow_unlock(struct btrfs_inode *inode)
>   	btrfs_drew_write_unlock(&inode->root->snapshot_lock);
>   }
>   
> +/* Helper to get how many pages we should alloc for the batch */
> +static int get_nr_pages(struct btrfs_fs_info *fs_info, loff_t pos,
> +			struct iov_iter *iov)
> +{
> +	int nr_pages;
> +
> +	/*
> +	 * Try to cover the full iov range, as btrfs metadata/data reserve
> +	 * and release can be pretty slow, thus the more pages we process in
> +	 * one batch the better.
> +	 */
> +	nr_pages = (round_up(pos + iov_iter_count(iov), PAGE_SIZE) -
> +		    round_down(pos, PAGE_SIZE)) / PAGE_SIZE;
> +
> +	/*
> +	 * But still limit it to 64KiB, so we can still get a similar
> +	 * buffered write performance between different page sizes
> +	 */
> +	nr_pages = min_t(int, nr_pages, SZ_64K / PAGE_SIZE);
> +
> +	return nr_pages;
> +}
> +
>   static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>   					       struct iov_iter *i)
>   {
> @@ -1638,10 +1661,7 @@ static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
>   	bool only_release_metadata = false;
>   	bool force_page_uptodate = false;
>   
> -	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
> -			PAGE_SIZE / (sizeof(struct page *)));
> -	nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied);
> -	nrptrs = max(nrptrs, 8);
> +	nrptrs = get_nr_pages(fs_info, pos, i);
>   	pages = kmalloc_array(nrptrs, sizeof(struct page *), GFP_KERNEL);

These cleanups are valuable, but I don't want to change this behavior in a
cleanup patch. Rework the code first, then make the behavior changes you want
in a separate patch. That way, when we go back and blame, we have a reason why
the behavior was changed, and we don't end up with a refactoring patch that
also happened to change the behavior.  Thanks,

Josef
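
One way to read the "rework the code first" request is a refactor-only
step that moves the existing three-line calculation into a helper
unchanged. The sketch below is a hypothetical userspace model of that
step, not a proposed patch: old_nrptrs and its parameters are invented
stand-ins for iov_iter_count(i), PAGE_SIZE, and the
current->nr_dirtied_pause / current->nr_dirtied fields, and
sizeof(void *) stands in for sizeof(struct page *).

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical model: the original calculation extracted into a helper
 * with its behavior left as it was.
 */
static long old_nrptrs(size_t count, size_t page_size,
		       long nr_dirtied_pause, long nr_dirtied)
{
	long nrptrs, limit;

	assert(page_size > 0);

	/* Pages needed for count bytes, ignoring the start offset. */
	nrptrs = (long)((count + page_size - 1) / page_size);

	/* At most one page worth of page pointers. */
	limit = (long)(page_size / sizeof(void *));
	if (nrptrs > limit)
		nrptrs = limit;

	/* Cap by nr_dirtied_pause - nr_dirtied, as the old code did. */
	limit = nr_dirtied_pause - nr_dirtied;
	if (nrptrs > limit)
		nrptrs = limit;

	/* Never fewer than 8 slots. */
	if (nrptrs < 8)
		nrptrs = 8;

	return nrptrs;
}
```

In this model a single 4K write still gets 8 slots, for example
old_nrptrs(4096, 4096, 32, 0) returns 8. Removing the floor or the
dirty-page cap would then be a separate, easily blamed change.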

Patch

diff --git a/fs/btrfs/file.c b/fs/btrfs/file.c
index 5a818ebcb01f..c592350a5a82 100644
--- a/fs/btrfs/file.c
+++ b/fs/btrfs/file.c
@@ -1620,6 +1620,29 @@  void btrfs_check_nocow_unlock(struct btrfs_inode *inode)
 	btrfs_drew_write_unlock(&inode->root->snapshot_lock);
 }
 
+/* Helper to get how many pages we should alloc for the batch */
+static int get_nr_pages(struct btrfs_fs_info *fs_info, loff_t pos,
+			struct iov_iter *iov)
+{
+	int nr_pages;
+
+	/*
+	 * Try to cover the full iov range, as btrfs metadata/data reserve
+	 * and release can be pretty slow, thus the more pages we process in
+	 * one batch the better.
+	 */
+	nr_pages = (round_up(pos + iov_iter_count(iov), PAGE_SIZE) -
+		    round_down(pos, PAGE_SIZE)) / PAGE_SIZE;
+
+	/*
+	 * But still limit it to 64KiB, so we can still get a similar
+	 * buffered write performance between different page sizes
+	 */
+	nr_pages = min_t(int, nr_pages, SZ_64K / PAGE_SIZE);
+
+	return nr_pages;
+}
+
 static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 					       struct iov_iter *i)
 {
@@ -1638,10 +1661,7 @@  static noinline ssize_t btrfs_buffered_write(struct kiocb *iocb,
 	bool only_release_metadata = false;
 	bool force_page_uptodate = false;
 
-	nrptrs = min(DIV_ROUND_UP(iov_iter_count(i), PAGE_SIZE),
-			PAGE_SIZE / (sizeof(struct page *)));
-	nrptrs = min(nrptrs, current->nr_dirtied_pause - current->nr_dirtied);
-	nrptrs = max(nrptrs, 8);
+	nrptrs = get_nr_pages(fs_info, pos, i);
 	pages = kmalloc_array(nrptrs, sizeof(struct page *), GFP_KERNEL);
 	if (!pages)
 		return -ENOMEM;