[RFC,v2,04/14] readahead: set file_ra_state->ra_pages to be at least mapping_min_order

Message ID 20240213093713.1753368-5-kernel@pankajraghav.com
State New
Series enable bs > ps in XFS

Commit Message

Pankaj Raghav (Samsung) Feb. 13, 2024, 9:37 a.m. UTC
From: Luis Chamberlain <mcgrof@kernel.org>

Set file_ra_state->ra_pages in file_ra_state_init() to at least the
number of pages in a mapping_min_order folio if bdi->ra_pages is
smaller than that.

Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
---
 mm/readahead.c | 5 +++++
 1 file changed, 5 insertions(+)
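
For context, mapping_min_folio_nrpages() is expected to be provided
earlier in this series and to convert the mapping's minimum folio
order into a page count. A minimal sketch of that relationship,
assuming the helper names used by the series:

	static inline unsigned int
	mapping_min_folio_nrpages(struct address_space *mapping)
	{
		/* A minimum folio order of N means 2^N base pages per folio. */
		return 1U << mapping_min_folio_order(mapping);
	}

For example, a 16k block size filesystem on a machine with 4k pages
has a minimum folio order of 2, so min_nrpages is 4 and ra_pages is
raised to at least 4 pages (16k).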

Comments

Hannes Reinecke Feb. 13, 2024, 2:59 p.m. UTC | #1
On 2/13/24 10:37, Pankaj Raghav (Samsung) wrote:
> From: Luis Chamberlain <mcgrof@kernel.org>
> 
> Set the file_ra_state->ra_pages in file_ra_state_init() to be at least
> mapping_min_order of pages if the bdi->ra_pages is less than that.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>   mm/readahead.c | 5 +++++
>   1 file changed, 5 insertions(+)
> 
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 2648ec4f0494..4fa7d0e65706 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -138,7 +138,12 @@
>   void
>   file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
>   {
> +	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
> +	unsigned int max_pages = inode_to_bdi(mapping->host)->io_pages;
> +
>   	ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
> +	if (ra->ra_pages < min_nrpages && min_nrpages < max_pages)
> +		ra->ra_pages = min_nrpages;
>   	ra->prev_pos = -1;
>   }
>   EXPORT_SYMBOL_GPL(file_ra_state_init);

Reviewed-by: Hannes Reinecke <hare@suse.de>

Cheers,

Hannes
Darrick J. Wong Feb. 13, 2024, 4:46 p.m. UTC | #2
On Tue, Feb 13, 2024 at 10:37:03AM +0100, Pankaj Raghav (Samsung) wrote:
> From: Luis Chamberlain <mcgrof@kernel.org>
> 
> Set the file_ra_state->ra_pages in file_ra_state_init() to be at least
> mapping_min_order of pages if the bdi->ra_pages is less than that.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>

Looks good to me,
Acked-by: Darrick J. Wong <djwong@kernel.org>

--D

Dave Chinner Feb. 13, 2024, 10:09 p.m. UTC | #3
On Tue, Feb 13, 2024 at 10:37:03AM +0100, Pankaj Raghav (Samsung) wrote:
> From: Luis Chamberlain <mcgrof@kernel.org>
> 
> Set the file_ra_state->ra_pages in file_ra_state_init() to be at least
> mapping_min_order of pages if the bdi->ra_pages is less than that.
> 
> Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> ---
>  mm/readahead.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/readahead.c b/mm/readahead.c
> index 2648ec4f0494..4fa7d0e65706 100644
> --- a/mm/readahead.c
> +++ b/mm/readahead.c
> @@ -138,7 +138,12 @@
>  void
>  file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
>  {
> +	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
> +	unsigned int max_pages = inode_to_bdi(mapping->host)->io_pages;
> +
>  	ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
> +	if (ra->ra_pages < min_nrpages && min_nrpages < max_pages)
> +		ra->ra_pages = min_nrpages;

Why do we want to clamp readahead in this case to io_pages?

We're still going to be allocating a min_order folio in the page
cache, but it is far more efficient to initialise the entire folio
all in a single readahead pass than it is to only partially fill it
with data here and then have to issue and wait for more IO to bring
the folio fully up to date before we can read data out of it,
right?

-Dave.
Pankaj Raghav (Samsung) Feb. 14, 2024, 1:32 p.m. UTC | #4
On Wed, Feb 14, 2024 at 09:09:53AM +1100, Dave Chinner wrote:
> On Tue, Feb 13, 2024 at 10:37:03AM +0100, Pankaj Raghav (Samsung) wrote:
> > From: Luis Chamberlain <mcgrof@kernel.org>
> > 
> > Set the file_ra_state->ra_pages in file_ra_state_init() to be at least
> > mapping_min_order of pages if the bdi->ra_pages is less than that.
> > 
> > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> > ---
> >  mm/readahead.c | 5 +++++
> >  1 file changed, 5 insertions(+)
> > 
> > diff --git a/mm/readahead.c b/mm/readahead.c
> > index 2648ec4f0494..4fa7d0e65706 100644
> > --- a/mm/readahead.c
> > +++ b/mm/readahead.c
> > @@ -138,7 +138,12 @@
> >  void
> >  file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
> >  {
> > +	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
> > +	unsigned int max_pages = inode_to_bdi(mapping->host)->io_pages;
> > +
> >  	ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
> > +	if (ra->ra_pages < min_nrpages && min_nrpages < max_pages)
> > +		ra->ra_pages = min_nrpages;
> 
> Why do we want to clamp readahead in this case to io_pages?
> 
> We're still going to be allocating a min_order folio in the page
> cache, but it is far more efficient to initialise the entire folio
> all in a single readahead pass than it is to only partially fill it
> with data here and then have to issue and wait for more IO to bring
> the folio fully up to date before we can read data out of it,
> right?

We are not clamping it to io_pages. ra_pages is set to min_nrpages only
if bdi->ra_pages is less than min_nrpages. The io_pages parameter is
used as a sanity check so that min_nrpages does not go beyond it.

So maybe this is not the right place to check whether we can send at
least min_nrpages to the backing device; perhaps that check belongs at
mount time instead?
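
To make the distinction concrete, here is a small standalone C sketch
of the same decision (hypothetical userspace code with an illustrative
name, not the kernel implementation), enumerating the three cases:

	#include <stdio.h>

	/*
	 * Mirror of the patch's logic: raise ra_pages to min_nrpages only
	 * when min_nrpages still fits under the device's io_pages bound.
	 */
	static unsigned int init_ra_pages(unsigned int bdi_ra_pages,
					  unsigned int min_nrpages,
					  unsigned int io_pages)
	{
		unsigned int ra_pages = bdi_ra_pages;

		if (ra_pages < min_nrpages && min_nrpages < io_pages)
			ra_pages = min_nrpages;
		return ra_pages;
	}

	int main(void)
	{
		/* bdi ra_pages already large enough: left untouched. */
		printf("%u\n", init_ra_pages(32, 4, 128));  /* 32 */
		/* bdi ra_pages too small: raised to min_nrpages. */
		printf("%u\n", init_ra_pages(2, 4, 128));   /* 4 */
		/* min_nrpages above io_pages: left untouched. */
		printf("%u\n", init_ra_pages(2, 256, 128)); /* 2 */
		return 0;
	}

The third case is the one in question: when min_nrpages exceeds
io_pages, ra_pages stays below the folio size, so a single folio
cannot be brought fully up to date in one readahead pass.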

Pankaj Raghav (Samsung) Feb. 14, 2024, 1:53 p.m. UTC | #5
On Wed, Feb 14, 2024 at 02:32:20PM +0100, Pankaj Raghav (Samsung) wrote:
> On Wed, Feb 14, 2024 at 09:09:53AM +1100, Dave Chinner wrote:
> > On Tue, Feb 13, 2024 at 10:37:03AM +0100, Pankaj Raghav (Samsung) wrote:
> > > From: Luis Chamberlain <mcgrof@kernel.org>
> > > 
> > > Set the file_ra_state->ra_pages in file_ra_state_init() to be at least
> > > mapping_min_order of pages if the bdi->ra_pages is less than that.
> > > 
> > > Signed-off-by: Luis Chamberlain <mcgrof@kernel.org>
> > > ---
> > >  mm/readahead.c | 5 +++++
> > >  1 file changed, 5 insertions(+)
> > > 
> > > diff --git a/mm/readahead.c b/mm/readahead.c
> > > index 2648ec4f0494..4fa7d0e65706 100644
> > > --- a/mm/readahead.c
> > > +++ b/mm/readahead.c
> > > @@ -138,7 +138,12 @@
> > >  void
> > >  file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
> > >  {
> > > +	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
> > > +	unsigned int max_pages = inode_to_bdi(mapping->host)->io_pages;
> > > +
> > >  	ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
> > > +	if (ra->ra_pages < min_nrpages && min_nrpages < max_pages)
> > > +		ra->ra_pages = min_nrpages;
> > 
> > Why do we want to clamp readahead in this case to io_pages?
> > 
> > We're still going to be allocating a min_order folio in the page
> > cache, but it is far more efficient to initialise the entire folio
> > all in a single readahead pass than it is to only partially fill it
> > with data here and then have to issue and wait for more IO to bring
> > the folio fully up to date before we can read data out of it,
> > right?

I think I misunderstood your question. I got more context after seeing
your next response.

You are right; I will remove the check against io_pages. A single FSB
might then be split into multiple IOs if the underlying block device
has io_pages < min_nrpages.
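
A sketch of what the reworked initialisation might look like with the
io_pages check dropped (not the final patch; max() as in
include/linux/minmax.h), so ra_pages is unconditionally raised to the
folio size:

	ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
	ra->ra_pages = max(ra->ra_pages, min_nrpages);
	ra->prev_pos = -1;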


Patch

diff --git a/mm/readahead.c b/mm/readahead.c
index 2648ec4f0494..4fa7d0e65706 100644
--- a/mm/readahead.c
+++ b/mm/readahead.c
@@ -138,7 +138,12 @@
 void
 file_ra_state_init(struct file_ra_state *ra, struct address_space *mapping)
 {
+	unsigned int min_nrpages = mapping_min_folio_nrpages(mapping);
+	unsigned int max_pages = inode_to_bdi(mapping->host)->io_pages;
+
 	ra->ra_pages = inode_to_bdi(mapping->host)->ra_pages;
+	if (ra->ra_pages < min_nrpages && min_nrpages < max_pages)
+		ra->ra_pages = min_nrpages;
 	ra->prev_pos = -1;
 }
 EXPORT_SYMBOL_GPL(file_ra_state_init);