diff mbox series

[v2] rcu/kvfree: Prevents cache growing when the backoff_page_cache_fill is set

Message ID 20230406001238.647536-1-qiang1.zhang@intel.com (mailing list archive)
State Superseded
Headers show
Series [v2] rcu/kvfree: Prevents cache growing when the backoff_page_cache_fill is set | expand

Commit Message

Zqiang April 6, 2023, 12:12 a.m. UTC
Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
executed before kfree_rcu_monitor() to drain page cache, if the bnode
structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
page cache again in kfree_rcu_monitor(), this commit add a check
for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
if the krcp structure's->backoff_page_cache_fill is set, prevent page
cache growing.

Signed-off-by: Zqiang <qiang1.zhang@intel.com>
---
 kernel/rcu/tree.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Uladzislau Rezki April 6, 2023, 4:37 a.m. UTC | #1
On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> executed before kfree_rcu_monitor() to drain page cache, if the bnode
> structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> page cache again in kfree_rcu_monitor(), this commit add a check
> for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> if the krcp structure's->backoff_page_cache_fill is set, prevent page
> cache growing.
> 
> Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> ---
>  kernel/rcu/tree.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9cc0a7766fd2..f25430ae1936 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2907,6 +2907,8 @@ static inline bool
>  put_cached_bnode(struct kfree_rcu_cpu *krcp,
>  	struct kvfree_rcu_bulk_data *bnode)
>  {
> +	if (atomic_read(&krcp->backoff_page_cache_fill))
> +		return false;
>  	// Check the limit.
>  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
>  		return false;
> -- 
> 2.32.0
>
Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

--
Uladzislau Rezki
Paul E. McKenney April 6, 2023, 5:50 p.m. UTC | #2
On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > page cache again in kfree_rcu_monitor(), this commit add a check
> > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > cache growing.
> > 
> > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > ---
> >  kernel/rcu/tree.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 9cc0a7766fd2..f25430ae1936 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2907,6 +2907,8 @@ static inline bool
> >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> >  	struct kvfree_rcu_bulk_data *bnode)
> >  {
> > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > +		return false;
> >  	// Check the limit.
> >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> >  		return false;
> > -- 
> > 2.32.0
> >
> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>

Thank you both!

One question, though.  Might it be better to instead modify the "for"
loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
of starting at zero?  That way, we still provide a single page under
low-memory conditions, but provide rcu_min_cached_objs of them if memory
is plentiful.

Alternatively, if we really don't want to allow any pages at all under
low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
flag is set?  This would likely mean also breaking out of that loop if
krcp->backoff_page_cache_fill was set in the meantime (which happens
implicitly with Zqiang's patch).

Or am I missing something subtle here?

							Thanx, Paul
Zqiang April 6, 2023, 11:11 p.m. UTC | #3
>>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > page cache again in kfree_rcu_monitor(), this commit add a check
> > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > cache growing.
> > 
> > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > ---
> >  kernel/rcu/tree.c | 2 ++
> >  1 file changed, 2 insertions(+)
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 9cc0a7766fd2..f25430ae1936 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2907,6 +2907,8 @@ static inline bool
> >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> >  	struct kvfree_rcu_bulk_data *bnode)
> >  {
> > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > +		return false;
> >  	// Check the limit.
> >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> >  		return false;
> > -- 
> > 2.32.0
> >
> Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
>
>Thank you both!
>
>One question, though.  Might it be better to instead modify the "for"
>loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
>of starting at zero?  That way, we still provide a single page under
>low-memory conditions, but provide rcu_min_cached_objs of them if memory
>is plentiful.
>
>Alternatively, if we really don't want to allow any pages at all under
>low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
>to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
>flag is set?  

Hi, Paul

If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
the allocated single page will also be freed in fill_page_cache_func().

or it would be better not to allocate under memory pressure.

How about like this?

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9cc0a7766fd2..94aedbc3da36 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2907,6 +2907,8 @@ static inline bool
 put_cached_bnode(struct kfree_rcu_cpu *krcp,
        struct kvfree_rcu_bulk_data *bnode)
 {
+       if (atomic_read(&krcp->backoff_page_cache_fill))
+               return false;
        // Check the limit.
        if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
                return false;
@@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
        int i;

        nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
-               1 : rcu_min_cached_objs;
+               0 : rcu_min_cached_objs;

        for (i = 0; i < nr_pages; i++) {
                bnode = (struct kvfree_rcu_bulk_data *)

Thanks
Zqiang

>This would likely mean also breaking out of that loop if
>krcp->backoff_page_cache_fill was set in the meantime (which happens
>implicitly with Zqiang's patch).
>
>Or am I missing something subtle here?
>
>							Thanx, Paul
Paul E. McKenney April 7, 2023, 12:04 a.m. UTC | #4
On Thu, Apr 06, 2023 at 11:11:37PM +0000, Zhang, Qiang1 wrote:
> >>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> > On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > > page cache again in kfree_rcu_monitor(), this commit add a check
> > > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > > cache growing.
> > > 
> > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > ---
> > >  kernel/rcu/tree.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 9cc0a7766fd2..f25430ae1936 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -2907,6 +2907,8 @@ static inline bool
> > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > >  	struct kvfree_rcu_bulk_data *bnode)
> > >  {
> > > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > > +		return false;
> > >  	// Check the limit.
> > >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > >  		return false;
> > > -- 
> > > 2.32.0
> > >
> > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> >
> >Thank you both!
> >
> >One question, though.  Might it be better to instead modify the "for"
> >loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
> >of starting at zero?  That way, we still provide a single page under
> >low-memory conditions, but provide rcu_min_cached_objs of them if memory
> >is plentiful.
> >
> >Alternatively, if we really don't want to allow any pages at all under
> >low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
> >to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
> >flag is set?  
> 
> Hi, Paul
> 
> If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
> the allocated single page will also be freed in fill_page_cache_func().
> 
> or it would be better not to allocate under memory pressure.

That was my thought.  ;-)

> How about like this?
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9cc0a7766fd2..94aedbc3da36 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2907,6 +2907,8 @@ static inline bool
>  put_cached_bnode(struct kfree_rcu_cpu *krcp,
>         struct kvfree_rcu_bulk_data *bnode)
>  {
> +       if (atomic_read(&krcp->backoff_page_cache_fill))
> +               return false;
>         // Check the limit.
>         if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
>                 return false;
> @@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
>         int i;
> 
>         nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
> -               1 : rcu_min_cached_objs;
> +               0 : rcu_min_cached_objs;
> 
>         for (i = 0; i < nr_pages; i++) {

The other question is why this loop does not allow for any pages
that might already be allocated, thus perhaps looking like this:

		for (i = krcp->nr_bkv_objs; i < nr_pages; i++) {

Or do we somehow know that krcp->nr_bkv_objs is equal to zero?  (I am not
seeing this, but I do feel the need to ask.)

							Thanx, Paul

>                 bnode = (struct kvfree_rcu_bulk_data *)
> 
> Thanks
> Zqiang
> 
> >This would likely mean also breaking out of that loop if
> >krcp->backoff_page_cache_fill was set in the meantime (which happens
> >implicitly with Zqiang's patch).
> >
> >Or am I missing something subtle here?
> >
> >							Thanx, Paul
Zqiang April 7, 2023, 1:26 a.m. UTC | #5
> >>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> > On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > > page cache again in kfree_rcu_monitor(), this commit add a check
> > > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > > cache growing.
> > > 
> > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > ---
> > >  kernel/rcu/tree.c | 2 ++
> > >  1 file changed, 2 insertions(+)
> > > 
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 9cc0a7766fd2..f25430ae1936 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -2907,6 +2907,8 @@ static inline bool
> > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > >  	struct kvfree_rcu_bulk_data *bnode)
> > >  {
> > > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > > +		return false;
> > >  	// Check the limit.
> > >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > >  		return false;
> > > -- 
> > > 2.32.0
> > >
> > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> >
> >Thank you both!
> >
> >One question, though.  Might it be better to instead modify the "for"
> >loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
> >of starting at zero?  That way, we still provide a single page under
> >low-memory conditions, but provide rcu_min_cached_objs of them if memory
> >is plentiful.
> >
> >Alternatively, if we really don't want to allow any pages at all under
> >low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
> >to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
> >flag is set?  
> 
> Hi, Paul
> 
> If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
> the allocated single page will also be freed in fill_page_cache_func().
> 
> or it would be better not to allocate under memory pressure.
>
>That was my thought.  ;-)
>
> How about like this?
> 
> diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> index 9cc0a7766fd2..94aedbc3da36 100644
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -2907,6 +2907,8 @@ static inline bool
>  put_cached_bnode(struct kfree_rcu_cpu *krcp,
>         struct kvfree_rcu_bulk_data *bnode)
>  {
> +       if (atomic_read(&krcp->backoff_page_cache_fill))
> +               return false;
>         // Check the limit.
>         if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
>                 return false;
> @@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
>         int i;
> 
>         nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
> -               1 : rcu_min_cached_objs;
> +               0 : rcu_min_cached_objs;
> 
>         for (i = 0; i < nr_pages; i++) {
>
>The other question is why this loop does not allow for any pages
>that might already be allocated, thus perhaps looking like this:
>
>		for (i = krcp->nr_bkv_objs; i < nr_pages; i++) {
>
>Or do we somehow know that krcp->nr_bkv_objs is equal to zero?  (I am not
>seeing this, but I do feel the need to ask.)


The fill_page_cache_func() is triggered when we invoke get_cached_bnode() return NULL,
this also means that krcp->nr_bkv_objs is equal to zero. 
But if can_alloc is set,  we will unlock krcp0->lock and allocated single page,  after that
we will reacquire krcp1 and lock,  but the krcp1 at this time may be different from the
previous krcp0,  if !bnode is true, also trigger work to invoke fill_page_cache_func(),  but
maybe the krcp1-> nr_bkv_objs is not equal to zero.


Thanks
Zqiang


>
>							Thanx, Paul
>
>                 bnode = (struct kvfree_rcu_bulk_data *)
> 
> Thanks
> Zqiang
> 
> >This would likely mean also breaking out of that loop if
> >krcp->backoff_page_cache_fill was set in the meantime (which happens
> >implicitly with Zqiang's patch).
> >
> >Or am I missing something subtle here?
> >
> >							Thanx, Paul
Paul E. McKenney April 7, 2023, 3:33 p.m. UTC | #6
On Fri, Apr 07, 2023 at 01:26:39AM +0000, Zhang, Qiang1 wrote:
> > >>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> > > On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > > > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > > > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > > > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > > > page cache again in kfree_rcu_monitor(), this commit add a check
> > > > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > > > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > > > cache growing.
> > > > 
> > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > > ---
> > > >  kernel/rcu/tree.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 9cc0a7766fd2..f25430ae1936 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -2907,6 +2907,8 @@ static inline bool
> > > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > > >  	struct kvfree_rcu_bulk_data *bnode)
> > > >  {
> > > > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > > > +		return false;
> > > >  	// Check the limit.
> > > >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > > >  		return false;
> > > > -- 
> > > > 2.32.0
> > > >
> > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > >
> > >Thank you both!
> > >
> > >One question, though.  Might it be better to instead modify the "for"
> > >loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
> > >of starting at zero?  That way, we still provide a single page under
> > >low-memory conditions, but provide rcu_min_cached_objs of them if memory
> > >is plentiful.
> > >
> > >Alternatively, if we really don't want to allow any pages at all under
> > >low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
> > >to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
> > >flag is set?  
> > 
> > Hi, Paul
> > 
> > If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
> > the allocated single page will also be freed in fill_page_cache_func().
> > 
> > or it would be better not to allocate under memory pressure.
> >
> >That was my thought.  ;-)
> >
> > How about like this?
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 9cc0a7766fd2..94aedbc3da36 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2907,6 +2907,8 @@ static inline bool
> >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> >         struct kvfree_rcu_bulk_data *bnode)
> >  {
> > +       if (atomic_read(&krcp->backoff_page_cache_fill))
> > +               return false;
> >         // Check the limit.
> >         if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> >                 return false;
> > @@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
> >         int i;
> > 
> >         nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
> > -               1 : rcu_min_cached_objs;
> > +               0 : rcu_min_cached_objs;
> > 
> >         for (i = 0; i < nr_pages; i++) {
> >
> >The other question is why this loop does not allow for any pages
> >that might already be allocated, thus perhaps looking like this:
> >
> >		for (i = krcp->nr_bkv_objs; i < nr_pages; i++) {
> >
> >Or do we somehow know that krcp->nr_bkv_objs is equal to zero?  (I am not
> >seeing this, but I do feel the need to ask.)
> 
> 
> The fill_page_cache_func() is triggered when we invoke get_cached_bnode() return NULL,
> this also means that krcp->nr_bkv_objs is equal to zero. 
> But if can_alloc is set,  we will unlock krcp0->lock and allocated single page,  after that
> we will reacquire krcp1 and lock,  but the krcp1 at this time may be different from the
> previous krcp0,  if !bnode is true, also trigger work to invoke fill_page_cache_func(),  but
> maybe the krcp1-> nr_bkv_objs is not equal to zero.

OK.  Given all of these good points, what would be a good patch for
this issue?  ;-)

							Thanx, Paul

> Thanks
> Zqiang
> 
> 
> >
> >							Thanx, Paul
> >
> >                 bnode = (struct kvfree_rcu_bulk_data *)
> > 
> > Thanks
> > Zqiang
> > 
> > >This would likely mean also breaking out of that loop if
> > >krcp->backoff_page_cache_fill was set in the meantime (which happens
> > >implicitly with Zqiang's patch).
> > >
> > >Or am I missing something subtle here?
> > >
> > >							Thanx, Paul
Zqiang April 8, 2023, 1:56 a.m. UTC | #7
On Fri, Apr 07, 2023 at 01:26:39AM +0000, Zhang, Qiang1 wrote:
> > >>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> > > On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > > > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > > > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > > > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > > > page cache again in kfree_rcu_monitor(), this commit add a check
> > > > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > > > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > > > cache growing.
> > > > 
> > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > > ---
> > > >  kernel/rcu/tree.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 9cc0a7766fd2..f25430ae1936 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -2907,6 +2907,8 @@ static inline bool
> > > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > > >  	struct kvfree_rcu_bulk_data *bnode)
> > > >  {
> > > > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > > > +		return false;
> > > >  	// Check the limit.
> > > >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > > >  		return false;
> > > > -- 
> > > > 2.32.0
> > > >
> > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > >
> > >Thank you both!
> > >
> > >One question, though.  Might it be better to instead modify the "for"
> > >loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
> > >of starting at zero?  That way, we still provide a single page under
> > >low-memory conditions, but provide rcu_min_cached_objs of them if memory
> > >is plentiful.
> > >
> > >Alternatively, if we really don't want to allow any pages at all under
> > >low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
> > >to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
> > >flag is set?  
> > 
> > Hi, Paul
> > 
> > If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
> > the allocated single page will also be freed in fill_page_cache_func().
> > 
> > or it would be better not to allocate under memory pressure.
> >
> >That was my thought.  ;-)
> >
> > How about like this?
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 9cc0a7766fd2..94aedbc3da36 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2907,6 +2907,8 @@ static inline bool
> >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> >         struct kvfree_rcu_bulk_data *bnode)
> >  {
> > +       if (atomic_read(&krcp->backoff_page_cache_fill))
> > +               return false;
> >         // Check the limit.
> >         if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> >                 return false;
> > @@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
> >         int i;
> > 
> >         nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
> > -               1 : rcu_min_cached_objs;
> > +               0 : rcu_min_cached_objs;
> > 
> >         for (i = 0; i < nr_pages; i++) {
> >
> >The other question is why this loop does not allow for any pages
> >that might already be allocated, thus perhaps looking like this:
> >
> >		for (i = krcp->nr_bkv_objs; i < nr_pages; i++) {
> >
> >Or do we somehow know that krcp->nr_bkv_objs is equal to zero?  (I am not
> >seeing this, but I do feel the need to ask.)
> 
> 
> The fill_page_cache_func() is triggered when we invoke get_cached_bnode() return NULL,
> this also means that krcp->nr_bkv_objs is equal to zero. 
> But if can_alloc is set,  we will unlock krcp0->lock and allocated single page,  after that
> we will reacquire krcp1 and lock,  but the krcp1 at this time may be different from the
> previous krcp0,  if !bnode is true, also trigger work to invoke fill_page_cache_func(),  but
> maybe the krcp1-> nr_bkv_objs is not equal to zero.
>
>OK.  Given all of these good points, what would be a good patch for
>this issue?  ;-)

Is it possible to keep the filling of the page always on the correct krcp?

--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -3303,7 +3303,7 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
                        // scenarios.
                        bnode = (struct kvfree_rcu_bulk_data *)
                                __get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
-                       *krcp = krc_this_cpu_lock(flags);
+                       raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
                }

                if (!bnode)


thoughts?

Thanks
Zqiang


>
>							Thanx, Paul
>
> Thanks
> Zqiang
> 
> 
> >
> >							Thanx, Paul
> >
> >                 bnode = (struct kvfree_rcu_bulk_data *)
> > 
> > Thanks
> > Zqiang
> > 
> > >This would likely mean also breaking out of that loop if
> > >krcp->backoff_page_cache_fill was set in the meantime (which happens
> > >implicitly with Zqiang's patch).
> > >
> > >Or am I missing something subtle here?
> > >
> > >							Thanx, Paul
Uladzislau Rezki April 8, 2023, 8 a.m. UTC | #8
On Thu, Apr 06, 2023 at 05:04:03PM -0700, Paul E. McKenney wrote:
> On Thu, Apr 06, 2023 at 11:11:37PM +0000, Zhang, Qiang1 wrote:
> > >>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> > > On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > > > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > > > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > > > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > > > page cache again in kfree_rcu_monitor(), this commit add a check
> > > > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > > > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > > > cache growing.
> > > > 
> > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > > ---
> > > >  kernel/rcu/tree.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 9cc0a7766fd2..f25430ae1936 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -2907,6 +2907,8 @@ static inline bool
> > > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > > >  	struct kvfree_rcu_bulk_data *bnode)
> > > >  {
> > > > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > > > +		return false;
> > > >  	// Check the limit.
> > > >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > > >  		return false;
> > > > -- 
> > > > 2.32.0
> > > >
> > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > >
> > >Thank you both!
> > >
> > >One question, though.  Might it be better to instead modify the "for"
> > >loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
> > >of starting at zero?  That way, we still provide a single page under
> > >low-memory conditions, but provide rcu_min_cached_objs of them if memory
> > >is plentiful.
> > >
> > >Alternatively, if we really don't want to allow any pages at all under
> > >low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
> > >to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
> > >flag is set?  
> > 
> > Hi, Paul
> > 
> > If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
> > the allocated single page will also be freed in fill_page_cache_func().
> > 
> > or it would be better not to allocate under memory pressure.
> 
> That was my thought.  ;-)
> 
> > How about like this?
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 9cc0a7766fd2..94aedbc3da36 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2907,6 +2907,8 @@ static inline bool
> >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> >         struct kvfree_rcu_bulk_data *bnode)
> >  {
> > +       if (atomic_read(&krcp->backoff_page_cache_fill))
> > +               return false;
> >         // Check the limit.
> >         if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> >                 return false;
> > @@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
> >         int i;
> > 
> >         nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
> > -               1 : rcu_min_cached_objs;
> > +               0 : rcu_min_cached_objs;
> > 
> >         for (i = 0; i < nr_pages; i++) {
> 
> The other question is why this loop does not allow for any pages
> that might already be allocated, thus perhaps looking like this:
> 
> 		for (i = krcp->nr_bkv_objs; i < nr_pages; i++) {
> 
> Or do we somehow know that krcp->nr_bkv_objs is equal to zero?  (I am not
> seeing this, but I do feel the need to ask.)
> 
Usually we start from zero, this is when a ptr. was not added into
a bulk array, due to no memory reason for a single argument and no
cache pages anymore for double argument.

In the fill page function, the limit is checked by the put_cached_bnode() itself
so it stops prefetch once nr_bkv_objs contains desired value.

--
Uladzislau Rezki
Uladzislau Rezki April 8, 2023, 9:04 a.m. UTC | #9
On Sat, Apr 08, 2023 at 01:56:40AM +0000, Zhang, Qiang1 wrote:
> On Fri, Apr 07, 2023 at 01:26:39AM +0000, Zhang, Qiang1 wrote:
> > > >>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> > > > On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > > > > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > > > > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > > > > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > > > > page cache again in kfree_rcu_monitor(), this commit add a check
> > > > > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > > > > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > > > > cache growing.
> > > > > 
> > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > > > ---
> > > > >  kernel/rcu/tree.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > index 9cc0a7766fd2..f25430ae1936 100644
> > > > > --- a/kernel/rcu/tree.c
> > > > > +++ b/kernel/rcu/tree.c
> > > > > @@ -2907,6 +2907,8 @@ static inline bool
> > > > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > > > >  	struct kvfree_rcu_bulk_data *bnode)
> > > > >  {
> > > > > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > > > > +		return false;
> > > > >  	// Check the limit.
> > > > >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > > > >  		return false;
> > > > > -- 
> > > > > 2.32.0
> > > > >
> > > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > >
> > > >Thank you both!
> > > >
> > > >One question, though.  Might it be better to instead modify the "for"
> > > >loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
> > > >of starting at zero?  That way, we still provide a single page under
> > > >low-memory conditions, but provide rcu_min_cached_objs of them if memory
> > > >is plentiful.
> > > >
> > > >Alternatively, if we really don't want to allow any pages at all under
> > > >low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
> > > >to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
> > > >flag is set?  
> > > 
> > > Hi, Paul
> > > 
> > > If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
> > > the allocated single page will also be freed in fill_page_cache_func().
> > > 
> > > or it would be better not to allocate under memory pressure.
> > >
> > >That was my thought.  ;-)
> > >
> > > How about like this?
> > > 
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 9cc0a7766fd2..94aedbc3da36 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -2907,6 +2907,8 @@ static inline bool
> > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > >         struct kvfree_rcu_bulk_data *bnode)
> > >  {
> > > +       if (atomic_read(&krcp->backoff_page_cache_fill))
> > > +               return false;
> > >         // Check the limit.
> > >         if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > >                 return false;
> > > @@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
> > >         int i;
> > > 
> > >         nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
> > > -               1 : rcu_min_cached_objs;
> > > +               0 : rcu_min_cached_objs;
> > > 
> > >         for (i = 0; i < nr_pages; i++) {
> > >
> > >The other question is why this loop does not allow for any pages
> > >that might already be allocated, thus perhaps looking like this:
> > >
> > >		for (i = krcp->nr_bkv_objs; i < nr_pages; i++) {
> > >
> > >Or do we somehow know that krcp->nr_bkv_objs is equal to zero?  (I am not
> > >seeing this, but I do feel the need to ask.)
> > 
> > 
> > The fill_page_cache_func() is triggered when we invoke get_cached_bnode() return NULL,
> > this also means that krcp->nr_bkv_objs is equal to zero. 
> > But if can_alloc is set,  we will unlock krcp0->lock and allocated single page,  after that
> > we will reacquire krcp1 and lock,  but the krcp1 at this time may be different from the
> > previous krcp0,  if !bnode is true, also trigger work to invoke fill_page_cache_func(),  but
> > maybe the krcp1-> nr_bkv_objs is not equal to zero.
> >
> >OK.  Given all of these good points, what would be a good patch for
> >this issue?  ;-)
> 
> Is it possible to keep the filling of the page always on the correct krcp?
> 
> --- a/kernel/rcu/tree.c
> +++ b/kernel/rcu/tree.c
> @@ -3303,7 +3303,7 @@ add_ptr_to_bulk_krc_lock(struct kfree_rcu_cpu **krcp,
>                         // scenarios.
>                         bnode = (struct kvfree_rcu_bulk_data *)
>                                 __get_free_page(GFP_KERNEL | __GFP_NORETRY | __GFP_NOMEMALLOC | __GFP_NOWARN);
> -                       *krcp = krc_this_cpu_lock(flags);
> +                       raw_spin_lock_irqsave(&(*krcp)->lock, *flags);
>                 }
> 
I do not expect any contention if this is applied. From the other
hand it might be that it was done deliberately for some reason or
because we most likely anyway stay on the same CPU.

I did some test to see how many times a migration occurs during this
small window and according to my data it was negligible.

Anyway i do not have any objections as of now against this change
and it looks correct to me.

--
Uladzislau Rezki
Zqiang April 12, 2023, 4:18 a.m. UTC | #10
> On Thu, Apr 06, 2023 at 11:11:37PM +0000, Zhang, Qiang1 wrote:
> > >>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> > > On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > > > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > > > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > > > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > > > page cache again in kfree_rcu_monitor(), this commit add a check
> > > > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > > > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > > > cache growing.
> > > > 
> > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > > ---
> > > >  kernel/rcu/tree.c | 2 ++
> > > >  1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 9cc0a7766fd2..f25430ae1936 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -2907,6 +2907,8 @@ static inline bool
> > > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > > >  	struct kvfree_rcu_bulk_data *bnode)
> > > >  {
> > > > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > > > +		return false;
> > > >  	// Check the limit.
> > > >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > > >  		return false;
> > > > -- 
> > > > 2.32.0
> > > >
> > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > >
> > >Thank you both!
> > >
> > >One question, though.  Might it be better to instead modify the "for"
> > >loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
> > >of starting at zero?  That way, we still provide a single page under
> > >low-memory conditions, but provide rcu_min_cached_objs of them if memory
> > >is plentiful.
> > >
> > >Alternatively, if we really don't want to allow any pages at all under
> > >low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
> > >to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
> > >flag is set?  
> > 
> > Hi, Paul
> > 
> > If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
> > the allocated single page will also be freed in fill_page_cache_func().
> > 
> > or it would be better not to allocate under memory pressure.
> 
> That was my thought.  ;-)
> 
> > How about like this?
> > 
> > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > index 9cc0a7766fd2..94aedbc3da36 100644
> > --- a/kernel/rcu/tree.c
> > +++ b/kernel/rcu/tree.c
> > @@ -2907,6 +2907,8 @@ static inline bool
> >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> >         struct kvfree_rcu_bulk_data *bnode)
> >  {
> > +       if (atomic_read(&krcp->backoff_page_cache_fill))
> > +               return false;
> >         // Check the limit.
> >         if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> >                 return false;
> > @@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
> >         int i;
> > 
> >         nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
> > -               1 : rcu_min_cached_objs;
> > +               0 : rcu_min_cached_objs;
> > 
> >         for (i = 0; i < nr_pages; i++) {
> 
> The other question is why this loop does not allow for any pages
> that might already be allocated, thus perhaps looking like this:
> 
> 		for (i = krcp->nr_bkv_objs; i < nr_pages; i++) {
> 
> Or do we somehow know that krcp->nr_bkv_objs is equal to zero?  (I am not
> seeing this, but I do feel the need to ask.)
> 
>Usually we start from zero, this is when a ptr. was not added into
>a bulk array, due to no memory reason for a single argument and no
>cache pages anymore for double argument.
>
>In the fill page function, the limit is checked by the put_cached_bnode() itself
>so it stops prefetch once nr_bkv_objs contains desired value.
>

If the krcp->nr_bkv_objs is updated in kfree_rcu_work() and happens before invoke fill_page_cache_func(),
when invoke fill_page_cache_func(), we start from zero,  will allocate page and hold krcp->lock,
fill krcp->bkvcache,  but if krcp->nr_bkv_objs already equal to rcu_min_cached_objs,  this page will
be freed and exit loop,  this allocate page seems like a meaningless operation. 

I also want to ask if starting from krcp->nr_bkv_objs is necessary?


Thanks
Zqiang

>
>--
>Uladzislau Rezki
Uladzislau Rezki April 12, 2023, 12:36 p.m. UTC | #11
On Wed, Apr 12, 2023 at 04:18:20AM +0000, Zhang, Qiang1 wrote:
> > On Thu, Apr 06, 2023 at 11:11:37PM +0000, Zhang, Qiang1 wrote:
> > > >>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> > > > On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > > > > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > > > > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > > > > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > > > > page cache again in kfree_rcu_monitor(), this commit add a check
> > > > > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > > > > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > > > > cache growing.
> > > > > 
> > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > > > ---
> > > > >  kernel/rcu/tree.c | 2 ++
> > > > >  1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > index 9cc0a7766fd2..f25430ae1936 100644
> > > > > --- a/kernel/rcu/tree.c
> > > > > +++ b/kernel/rcu/tree.c
> > > > > @@ -2907,6 +2907,8 @@ static inline bool
> > > > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > > > >  	struct kvfree_rcu_bulk_data *bnode)
> > > > >  {
> > > > > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > > > > +		return false;
> > > > >  	// Check the limit.
> > > > >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > > > >  		return false;
> > > > > -- 
> > > > > 2.32.0
> > > > >
> > > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > >
> > > >Thank you both!
> > > >
> > > >One question, though.  Might it be better to instead modify the "for"
> > > >loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
> > > >of starting at zero?  That way, we still provide a single page under
> > > >low-memory conditions, but provide rcu_min_cached_objs of them if memory
> > > >is plentiful.
> > > >
> > > >Alternatively, if we really don't want to allow any pages at all under
> > > >low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
> > > >to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
> > > >flag is set?  
> > > 
> > > Hi, Paul
> > > 
> > > If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
> > > the allocated single page will also be freed in fill_page_cache_func().
> > > 
> > > or it would be better not to allocate under memory pressure.
> > 
> > That was my thought.  ;-)
> > 
> > > How about like this?
> > > 
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 9cc0a7766fd2..94aedbc3da36 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -2907,6 +2907,8 @@ static inline bool
> > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > >         struct kvfree_rcu_bulk_data *bnode)
> > >  {
> > > +       if (atomic_read(&krcp->backoff_page_cache_fill))
> > > +               return false;
> > >         // Check the limit.
> > >         if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > >                 return false;
> > > @@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
> > >         int i;
> > > 
> > >         nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
> > > -               1 : rcu_min_cached_objs;
> > > +               0 : rcu_min_cached_objs;
> > > 
> > >         for (i = 0; i < nr_pages; i++) {
> > 
> > The other question is why this loop does not allow for any pages
> > that might already be allocated, thus perhaps looking like this:
> > 
> > 		for (i = krcp->nr_bkv_objs; i < nr_pages; i++) {
> > 
> > Or do we somehow know that krcp->nr_bkv_objs is equal to zero?  (I am not
> > seeing this, but I do feel the need to ask.)
> > 
> >Usually we start from zero, this is when a ptr. was not added into
> >a bulk array, due to no memory reason for a single argument and no
> >cache pages anymore for double argument.
> >
> >In the fill page function, the limit is checked by the put_cached_bnode() itself
> >so it stops prefetch once nr_bkv_objs contains desired value.
> >
> 
> If the krcp->nr_bkv_objs is updated in kfree_rcu_work() and happens before invoke fill_page_cache_func(),
> when invoke fill_page_cache_func(), we start from zero,  will allocate page and hold krcp->lock,
> fill krcp->bkvcache,  but if krcp->nr_bkv_objs already equal to rcu_min_cached_objs,  this page will
> be freed and exit loop,  this allocate page seems like a meaningless operation. 
> 
> I also want to ask if starting from krcp->nr_bkv_objs is necessary?
> 
At least it does not break anything. The example like you described can
occur. So starting from the krcp->nr_bkv_objs is worth to do.

So, if it happens it would be good of you could simulate it and update
the commit message accordingly.


--
Uladzislau Rezki
Uladzislau Rezki April 12, 2023, 1:17 p.m. UTC | #12
On Wed, Apr 12, 2023 at 02:36:30PM +0200, Uladzislau Rezki wrote:
> On Wed, Apr 12, 2023 at 04:18:20AM +0000, Zhang, Qiang1 wrote:
> > > On Thu, Apr 06, 2023 at 11:11:37PM +0000, Zhang, Qiang1 wrote:
> > > > >>On Thu, Apr 06, 2023 at 06:37:53AM +0200, Uladzislau Rezki wrote:
> > > > > On Thu, Apr 06, 2023 at 08:12:38AM +0800, Zqiang wrote:
> > > > > > Currently, in kfree_rcu_shrink_scan(), the drain_page_cache() is
> > > > > > executed before kfree_rcu_monitor() to drain page cache, if the bnode
> > > > > > structure's->gp_snap has done, the kvfree_rcu_bulk() will fill the
> > > > > > page cache again in kfree_rcu_monitor(), this commit add a check
> > > > > > for krcp structure's->backoff_page_cache_fill in put_cached_bnode(),
> > > > > > if the krcp structure's->backoff_page_cache_fill is set, prevent page
> > > > > > cache growing.
> > > > > > 
> > > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com>
> > > > > > ---
> > > > > >  kernel/rcu/tree.c | 2 ++
> > > > > >  1 file changed, 2 insertions(+)
> > > > > > 
> > > > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > > > index 9cc0a7766fd2..f25430ae1936 100644
> > > > > > --- a/kernel/rcu/tree.c
> > > > > > +++ b/kernel/rcu/tree.c
> > > > > > @@ -2907,6 +2907,8 @@ static inline bool
> > > > > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > > > > >  	struct kvfree_rcu_bulk_data *bnode)
> > > > > >  {
> > > > > > +	if (atomic_read(&krcp->backoff_page_cache_fill))
> > > > > > +		return false;
> > > > > >  	// Check the limit.
> > > > > >  	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > > > > >  		return false;
> > > > > > -- 
> > > > > > 2.32.0
> > > > > >
> > > > > Reviewed-by: Uladzislau Rezki (Sony) <urezki@gmail.com>
> > > > >
> > > > >Thank you both!
> > > > >
> > > > >One question, though.  Might it be better to instead modify the "for"
> > > > >loop in fill_page_cache_func() to start at krcp->nr_bkv_objs instead
> > > > >of starting at zero?  That way, we still provide a single page under
> > > > >low-memory conditions, but provide rcu_min_cached_objs of them if memory
> > > > >is plentiful.
> > > > >
> > > > >Alternatively, if we really don't want to allow any pages at all under
> > > > >low-memory conditions, shouldn't the fill_page_cache_func() set nr_pages
> > > > >to zero (instead of the current 1) when the krcp->backoff_page_cache_fill
> > > > >flag is set?  
> > > > 
> > > > Hi, Paul
> > > > 
> > > > If the krcp->backoff_page_cache_fill is true, the put_cached_bnode () return false,
> > > > the allocated single page will also be freed in fill_page_cache_func().
> > > > 
> > > > or it would be better not to allocate under memory pressure.
> > > 
> > > That was my thought.  ;-)
> > > 
> > > > How about like this?
> > > > 
> > > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > > index 9cc0a7766fd2..94aedbc3da36 100644
> > > > --- a/kernel/rcu/tree.c
> > > > +++ b/kernel/rcu/tree.c
> > > > @@ -2907,6 +2907,8 @@ static inline bool
> > > >  put_cached_bnode(struct kfree_rcu_cpu *krcp,
> > > >         struct kvfree_rcu_bulk_data *bnode)
> > > >  {
> > > > +       if (atomic_read(&krcp->backoff_page_cache_fill))
> > > > +               return false;
> > > >         // Check the limit.
> > > >         if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
> > > >                 return false;
> > > > @@ -3220,7 +3222,7 @@ static void fill_page_cache_func(struct work_struct *work)
> > > >         int i;
> > > > 
> > > >         nr_pages = atomic_read(&krcp->backoff_page_cache_fill) ?
> > > > -               1 : rcu_min_cached_objs;
> > > > +               0 : rcu_min_cached_objs;
> > > > 
> > > >         for (i = 0; i < nr_pages; i++) {
> > > 
> > > The other question is why this loop does not allow for any pages
> > > that might already be allocated, thus perhaps looking like this:
> > > 
> > > 		for (i = krcp->nr_bkv_objs; i < nr_pages; i++) {
> > > 
> > > Or do we somehow know that krcp->nr_bkv_objs is equal to zero?  (I am not
> > > seeing this, but I do feel the need to ask.)
> > > 
> > >Usually we start from zero, this is when a ptr. was not added into
> > >a bulk array, due to no memory reason for a single argument and no
> > >cache pages anymore for double argument.
> > >
> > >In the fill page function, the limit is checked by the put_cached_bnode() itself
> > >so it stops prefetch once nr_bkv_objs contains desired value.
> > >
> > 
> > If the krcp->nr_bkv_objs is updated in kfree_rcu_work() and happens before invoke fill_page_cache_func(),
> > when invoke fill_page_cache_func(), we start from zero,  will allocate page and hold krcp->lock,
> > fill krcp->bkvcache,  but if krcp->nr_bkv_objs already equal to rcu_min_cached_objs,  this page will
> > be freed and exit loop,  this allocate page seems like a meaningless operation. 
> > 
> > I also want to ask if starting from krcp->nr_bkv_objs is necessary?
> > 
> At least it does not break anything. The example like you described can
> occur. So starting from the krcp->nr_bkv_objs is worth to do.
> 
> So, if it happens it would be good of you could simulate it and update
> the commit message accordingly.
> 
A small nit, the krcp->nr_bkv_objs can not be accessed without a lock.
So it must then accessed using READ_ONCE()/WRITE_ONCE() helpers.

--
Uladzislau Rezki
diff mbox series

Patch

diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
index 9cc0a7766fd2..f25430ae1936 100644
--- a/kernel/rcu/tree.c
+++ b/kernel/rcu/tree.c
@@ -2907,6 +2907,8 @@  static inline bool
 put_cached_bnode(struct kfree_rcu_cpu *krcp,
 	struct kvfree_rcu_bulk_data *bnode)
 {
+	if (atomic_read(&krcp->backoff_page_cache_fill))
+		return false;
 	// Check the limit.
 	if (krcp->nr_bkv_objs >= rcu_min_cached_objs)
 		return false;