diff mbox series

[2/2] fs/dcache: Make negative dentries easier to be reclaimed

Message ID 1535476780-5773-3-git-send-email-longman@redhat.com (mailing list archive)
State New, archived
Headers show
Series fs/dcache: Track # of negative dentries | expand

Commit Message

Waiman Long Aug. 28, 2018, 5:19 p.m. UTC
For negative dentries that are accessed once and never used again, they
should be removed first before other dentries when shrinker is running.
This is done by putting negative dentries at the head of the LRU list
instead at the tail.

A new DCACHE_NEW_NEGATIVE flag is now added to a negative dentry when it
is initially created. When such a dentry is added to the LRU, it will be
added to the head so that it will be the first to go when a shrinker is
running if it is never accessed again (DCACHE_REFERENCED bit not set).
The flag is cleared after the LRU list addition.

Suggested-by: Larry Woodman <lwoodman@redhat.com>
Signed-off-by: Waiman Long <longman@redhat.com>
---
 fs/dcache.c              | 25 +++++++++++++++++--------
 include/linux/dcache.h   |  1 +
 include/linux/list_lru.h | 17 +++++++++++++++++
 mm/list_lru.c            | 16 ++++++++++++++--
 4 files changed, 49 insertions(+), 10 deletions(-)

Comments

Matthew Wilcox Aug. 28, 2018, 10:13 p.m. UTC | #1
On Tue, Aug 28, 2018 at 01:19:40PM -0400, Waiman Long wrote:
> @@ -134,7 +135,7 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item)
>  	spin_lock(&nlru->lock);
>  	if (list_empty(item)) {
>  		l = list_lru_from_kmem(nlru, item, &memcg);
> -		list_add_tail(item, &l->list);
> +		(add_tail ? list_add_tail : list_add)(item, &l->list);
>  		/* Set shrinker bit if the first element was added */
>  		if (!l->nr_items++)
>  			memcg_set_shrinker_bit(memcg, nid,

That's not OK.  Write it out properly, ie:

		if (add_tail)
			list_add_tail(item, &l->list);
		else
			list_add(item, &l->list);
Waiman Long Aug. 28, 2018, 10:29 p.m. UTC | #2
On 08/28/2018 06:13 PM, Matthew Wilcox wrote:
> On Tue, Aug 28, 2018 at 01:19:40PM -0400, Waiman Long wrote:
>> @@ -134,7 +135,7 @@ bool list_lru_add(struct list_lru *lru, struct list_head *item)
>>  	spin_lock(&nlru->lock);
>>  	if (list_empty(item)) {
>>  		l = list_lru_from_kmem(nlru, item, &memcg);
>> -		list_add_tail(item, &l->list);
>> +		(add_tail ? list_add_tail : list_add)(item, &l->list);
>>  		/* Set shrinker bit if the first element was added */
>>  		if (!l->nr_items++)
>>  			memcg_set_shrinker_bit(memcg, nid,
> That's not OK.  Write it out properly, ie:
>
> 		if (add_tail)
> 			list_add_tail(item, &l->list);
> 		else
> 			list_add(item, &l->list);
>
Yes, I can rewrite it. What is the problem with the abbreviated form?

Cheers,
Longman
Andrew Morton Aug. 28, 2018, 11:01 p.m. UTC | #3
Another pet peeve ;)

On Tue, 28 Aug 2018 13:19:40 -0400 Waiman Long <longman@redhat.com> wrote:

>  /**
> + * list_lru_add_head: add an element to the lru list's head
> + * @list_lru: the lru pointer
> + * @item: the item to be added.
> + *
> + * This is similar to list_lru_add(). The only difference is the location
> + * where the new item will be added. The list_lru_add() function will add

People often use the term "the foo() function".  I don't know why -
just say "foo()"!

> + * the new item to the tail as it is the most recently used one. The
> + * list_lru_add_head() will add the new item into the head so that it

Ditto.

"to the head"

> + * will the first to go if a shrinker is running. So this function should

"will be the"

> + * only be used for less important item that can be the first to go if

"items"

> + * the system is under memory pressure.
> + *
> + * Return value: true if the list was updated, false otherwise
> + */
Linus Torvalds Aug. 28, 2018, 11:10 p.m. UTC | #4
On Tue, Aug 28, 2018 at 3:29 PM Waiman Long <longman@redhat.com> wrote:
>
> Yes, I can rewrite it. What is the problem with the abbreviated form?

Either gcc rewrites it for you, or you end up _actually_ using a
function pointer and calling through it.

The latter would be absolutely horribly bad for something like
"list_add()", which should expand to just a couple of instructions.

And the former would be ok, except for the "you wrote code the garbage
way, and then depended on the compiler fixing it up". Which we
generally try to avoid in the kernel.

(Don't get me wrong - we definitely depend on the compiler doing a
good job at CSE and dead code elimination etc, but generally we try to
avoid the whole "compiler has to rewrite code to be good" model).

                 Linus
Andrew Morton Aug. 28, 2018, 11:22 p.m. UTC | #5
On Tue, 28 Aug 2018 16:10:24 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:

> On Tue, Aug 28, 2018 at 3:29 PM Waiman Long <longman@redhat.com> wrote:
> >
> > Yes, I can rewrite it. What is the problem with the abbreviated form?
> 
> Either gcc rewrites it for you, or you end up _actually_ using a
> function pointer and calling through it.
> 
> The latter would be absolutely horribly bad for something like
> "list_add()", which should expand to just a couple of instructions.
> 
> And the former would be ok, except for the "you wrote code the garbage
> way, and then depended on the compiler fixing it up". Which we
> generally try to avoid in the kernel.
> 
> (Don't get me wrong - we definitely depend on the compiler doing a
> good job at CSE and dead code elimination etc, but generally we try to
> avoid the whole "compiler has to rewrite code to be good" model).
> 

And the "abbreviated form" will surely explode if one or both of those
"functions" happens to be implemented (or later reimplemented) as a macro.
It's best not to unnecessarily make such assumptions.
Dave Chinner Aug. 29, 2018, 1:02 a.m. UTC | #6
On Tue, Aug 28, 2018 at 01:19:40PM -0400, Waiman Long wrote:
> For negative dentries that are accessed once and never used again, they
> should be removed first before other dentries when shrinker is running.
> This is done by putting negative dentries at the head of the LRU list
> instead at the tail.
> 
> A new DCACHE_NEW_NEGATIVE flag is now added to a negative dentry when it
> is initially created. When such a dentry is added to the LRU, it will be
> added to the head so that it will be the first to go when a shrinker is
> running if it is never accessed again (DCACHE_REFERENCED bit not set).
> The flag is cleared after the LRU list addition.

This exposes internal LRU list functionality to callers. I carefully
hid what end of the list was the most or least recent from
subsystems interacting with LRUs precisely because "head" and "tail"
are completely confusing when interacting with a LRU.

LRUs are about tracking relative object age, not heads and tails.

IOWs, "track this object as most recently used", not "add this
object to the list tail". The opposite is "track this object is
least recently used", not "add object at head of list".

IOWs, the interface should be list_lru_add_oldest() or maybe
list_lru_add_least_recent() to indicate that these objects are
considered to be the oldest and therefore prime immediate reclaim
candidates.

Which points out a problem. That the most recent negative dentry
will be the first to be reclaimed. That's not LRU behaviour, and
prevents a working set of negative dentries from being retained
when the shrinker rotates through the dentries in LRU order. i.e.
this patch turns the LRU into a MRU list for negative dentries.

And then there's shrinker behaviour. What happens if the shrinker
isolate callback returns LRU_ROTATE on a negative dentry? It gets
moved to the most recent end of the list, so it won't have an
attempt to reclaim it again until it's tried to reclaim all the real
dentries. IOWs, it goes back to behaving like LRUs are supposed to
behaving.

IOWs, reclaim behaviour of negative dentries will be
highly unpredictable, it will not easily retain a working set, nor
will the working set it does retain be predictable or easy to eject
from memory when the workload changes.

Is this the behavour what we want for negative dentries?

Perhaps a second internal LRU list in the list_lru for "immediate
reclaim" objects would be a better solution. i.e. similar to the
active/inactive lists used for prioritising the working set iover
single use pages in page reclaim. negative dentries go onto the
immediate list, real dentries go on the existing list. Both are LRU,
and the shrinker operates on the immediate list first. When we
rotate referenced negative dentries on the immediate list, promote
them to the active list with all the real dentries so they age out
with the rest of the working set. That way single use negative
dentries get removed in LRU order in preference to the working set
of real dentries.

Being able to make changes to the list implementation easily was one
of the reasons I hid the implementation of the list_lru from the
interface callers use....

[...]

> @@ -430,8 +424,20 @@ static void d_lru_add(struct dentry *dentry)
>  	D_FLAG_VERIFY(dentry, 0);
>  	dentry->d_flags |= DCACHE_LRU_LIST;
>  	this_cpu_inc(nr_dentry_unused);
> +	if (d_is_negative(dentry)) {
> +		__neg_dentry_inc(dentry);

/me notes this patch now open codes this, like suggested in the
previous patch.

> +		if (dentry->d_flags & DCACHE_NEW_NEGATIVE) {
> +			/*
> +			 * Add the negative dentry to the head once, it
> +			 * will be added to the tail next time.
> +			 */
> +			WARN_ON_ONCE(!list_lru_add_head(
> +				&dentry->d_sb->s_dentry_lru, &dentry->d_lru));
> +			dentry->d_flags &= ~DCACHE_NEW_NEGATIVE;
> +			return;
> +		}
> +	}
>  	WARN_ON_ONCE(!list_lru_add(&dentry->d_sb->s_dentry_lru, &dentry->d_lru));
> -	neg_dentry_inc(dentry);
>  }
>  
>  static void d_lru_del(struct dentry *dentry)
> @@ -2620,6 +2626,9 @@ static inline void __d_add(struct dentry *dentry, struct inode *inode)
>  		__d_set_inode_and_type(dentry, inode, add_flags);
>  		raw_write_seqcount_end(&dentry->d_seq);
>  		fsnotify_update_flags(dentry);
> +	} else {
> +		/* It is a negative dentry, add it to LRU head initially. */
> +		dentry->d_flags |= DCACHE_NEW_NEGATIVE;

Comments about LRU behaviour should not be put anywhere but in the
lru code. Otherwise it ends up stale and wrong, assuming it is
correct to start with....

>  	}
>  	__d_rehash(dentry);
>  	if (dir)
> diff --git a/include/linux/dcache.h b/include/linux/dcache.h
> index df942e5..03a1918 100644
> --- a/include/linux/dcache.h
> +++ b/include/linux/dcache.h
> @@ -214,6 +214,7 @@ struct dentry_operations {
>  #define DCACHE_FALLTHRU			0x01000000 /* Fall through to lower layer */
>  #define DCACHE_ENCRYPTED_WITH_KEY	0x02000000 /* dir is encrypted with a valid key */
>  #define DCACHE_OP_REAL			0x04000000
> +#define DCACHE_NEW_NEGATIVE		0x08000000 /* New negative dentry */
>  
>  #define DCACHE_PAR_LOOKUP		0x10000000 /* being looked up (with parent locked shared) */
>  #define DCACHE_DENTRY_CURSOR		0x20000000
> diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
> index aa5efd9..bfac057 100644
> --- a/include/linux/list_lru.h
> +++ b/include/linux/list_lru.h
> @@ -90,6 +90,23 @@ int __list_lru_init(struct list_lru *lru, bool memcg_aware,
>  bool list_lru_add(struct list_lru *lru, struct list_head *item);
>  
>  /**
> + * list_lru_add_head: add an element to the lru list's head
> + * @list_lru: the lru pointer
> + * @item: the item to be added.
> + *
> + * This is similar to list_lru_add(). The only difference is the location
> + * where the new item will be added. The list_lru_add() function will add
> + * the new item to the tail as it is the most recently used one. The
> + * list_lru_add_head() will add the new item into the head so that it
> + * will the first to go if a shrinker is running. So this function should
> + * only be used for less important item that can be the first to go if
> + * the system is under memory pressure.

As mentioned above, this API needs reference object ages, not
internal list ordering and shrinker implementation details.

Cheers,

Dave.
Waiman Long Aug. 29, 2018, 1:18 a.m. UTC | #7
On 08/28/2018 07:10 PM, Linus Torvalds wrote:
> On Tue, Aug 28, 2018 at 3:29 PM Waiman Long <longman@redhat.com> wrote:
>> Yes, I can rewrite it. What is the problem with the abbreviated form?
> Either gcc rewrites it for you, or you end up _actually_ using a
> function pointer and calling through it.

Yes, function pointer will be really bad.
>
> The latter would be absolutely horribly bad for something like
> "list_add()", which should expand to just a couple of instructions.
>
> And the former would be ok, except for the "you wrote code the garbage
> way, and then depended on the compiler fixing it up". Which we
> generally try to avoid in the kernel.
>
> (Don't get me wrong - we definitely depend on the compiler doing a
> good job at CSE and dead code elimination etc, but generally we try to
> avoid the whole "compiler has to rewrite code to be good" model).
>
>                  Linus

I see your point here. I will rewrite to use the regular if-then-else.

Thanks,
Longman
Waiman Long Aug. 29, 2018, 1:18 a.m. UTC | #8
On 08/28/2018 07:22 PM, Andrew Morton wrote:
> On Tue, 28 Aug 2018 16:10:24 -0700 Linus Torvalds <torvalds@linux-foundation.org> wrote:
>
>> On Tue, Aug 28, 2018 at 3:29 PM Waiman Long <longman@redhat.com> wrote:
>>> Yes, I can rewrite it. What is the problem with the abbreviated form?
>> Either gcc rewrites it for you, or you end up _actually_ using a
>> function pointer and calling through it.
>>
>> The latter would be absolutely horribly bad for something like
>> "list_add()", which should expand to just a couple of instructions.
>>
>> And the former would be ok, except for the "you wrote code the garbage
>> way, and then depended on the compiler fixing it up". Which we
>> generally try to avoid in the kernel.
>>
>> (Don't get me wrong - we definitely depend on the compiler doing a
>> good job at CSE and dead code elimination etc, but generally we try to
>> avoid the whole "compiler has to rewrite code to be good" model).
>>
> And the "abbreviated form" will surely explode if one or both of those
> "functions" happens to be implemented (or later reimplemented) as a macro.
> It's best not to unnecessarily make such assumptions.
>
Yes,  that is true.

Thanks,
Longman
Michal Hocko Aug. 29, 2018, 7:51 a.m. UTC | #9
On Tue 28-08-18 13:19:40, Waiman Long wrote:
> For negative dentries that are accessed once and never used again, they
> should be removed first before other dentries when shrinker is running.
> This is done by putting negative dentries at the head of the LRU list
> instead at the tail.
> 
> A new DCACHE_NEW_NEGATIVE flag is now added to a negative dentry when it
> is initially created. When such a dentry is added to the LRU, it will be
> added to the head so that it will be the first to go when a shrinker is
> running if it is never accessed again (DCACHE_REFERENCED bit not set).
> The flag is cleared after the LRU list addition.

Placing object to the head of the LRU list can be really tricky as Dave
pointed out. I am not familiar with the dentry cache reclaim so my
comparison below might not apply. Let me try anyway.

Negative dentries sound very similar to MADV_FREE pages from the reclaim
POV. They are primary candidate for reclaim, yet you want to preserve
aging to other easily reclaimable objects (including other MADV_FREE
pages). What we do for those pages is to move them from the anonymous
LRU list to the inactive file LRU list. Now you obviously do not have
anon/file LRUs but something similar to active/inactive LRU lists might
be a reasonably good match. Have easily reclaimable dentries on the
inactive list including negative dentries. If negative entries are
heavily used then they can promote to the active list because there is
no reason to reclaim them soon.

Just my 2c
Paul E. McKenney Aug. 29, 2018, 5:54 p.m. UTC | #10
On Tue, Aug 28, 2018 at 04:01:50PM -0700, Andrew Morton wrote:
> 
> Another pet peeve ;)
> 
> On Tue, 28 Aug 2018 13:19:40 -0400 Waiman Long <longman@redhat.com> wrote:
> 
> >  /**
> > + * list_lru_add_head: add an element to the lru list's head
> > + * @list_lru: the lru pointer
> > + * @item: the item to be added.
> > + *
> > + * This is similar to list_lru_add(). The only difference is the location
> > + * where the new item will be added. The list_lru_add() function will add
> 
> People often use the term "the foo() function".  I don't know why -
> just say "foo()"!

For whatever it is worth...

I tend to use "The foo() function ..." instead of "foo() ..." in order
to properly capitalize the first word of the sentence.  So I might say
"The call_rcu() function enqueues an RCU callback." rather than something
like "call_rcu() enqueues an RCU callback."  Or I might use some other
trick to keep "call_rcu()" from being the first word of the sentence.
But if the end of the previous sentence introduced call_rcu(), you
usually want the next sentence's first use of "call_rcu()" to be very
early in the sentence, because otherwise the flow will seem choppy.

And no, I have no idea what I would do if I were writing in German,
where nouns are capitalized, given that function names tend to be used
as nouns.  Probably I would get yelled at a lot for capitalizing my
function names.  ;-)

							Thanx, Paul
Waiman Long Aug. 29, 2018, 7:34 p.m. UTC | #11
On 08/28/2018 09:02 PM, Dave Chinner wrote:
> On Tue, Aug 28, 2018 at 01:19:40PM -0400, Waiman Long wrote:
>> For negative dentries that are accessed once and never used again, they
>> should be removed first before other dentries when shrinker is running.
>> This is done by putting negative dentries at the head of the LRU list
>> instead at the tail.
>>
>> A new DCACHE_NEW_NEGATIVE flag is now added to a negative dentry when it
>> is initially created. When such a dentry is added to the LRU, it will be
>> added to the head so that it will be the first to go when a shrinker is
>> running if it is never accessed again (DCACHE_REFERENCED bit not set).
>> The flag is cleared after the LRU list addition.
> This exposes internal LRU list functionality to callers. I carefully
> hid what end of the list was the most or least recent from
> subsystems interacting with LRUs precisely because "head" and "tail"
> are completely confusing when interacting with a LRU.
>
> LRUs are about tracking relative object age, not heads and tails.
>
> IOWs, "track this object as most recently used", not "add this
> object to the list tail". The opposite is "track this object is
> least recently used", not "add object at head of list".
>
> IOWs, the interface should be list_lru_add_oldest() or maybe
> list_lru_add_least_recent() to indicate that these objects are
> considered to be the oldest and therefore prime immediate reclaim
> candidates.

Good point. I will use age as function name instead.

> Which points out a problem. That the most recent negative dentry
> will be the first to be reclaimed. That's not LRU behaviour, and
> prevents a working set of negative dentries from being retained
> when the shrinker rotates through the dentries in LRU order. i.e.
> this patch turns the LRU into a MRU list for negative dentries.
>
> And then there's shrinker behaviour. What happens if the shrinker
> isolate callback returns LRU_ROTATE on a negative dentry? It gets
> moved to the most recent end of the list, so it won't have an
> attempt to reclaim it again until it's tried to reclaim all the real
> dentries. IOWs, it goes back to behaving like LRUs are supposed to
> behaving.
>
> IOWs, reclaim behaviour of negative dentries will be
> highly unpredictable, it will not easily retain a working set, nor
> will the working set it does retain be predictable or easy to eject
> from memory when the workload changes.
>
> Is this the behavour what we want for negative dentries?

I am aware that the behavior is not strictly LRU for negative dentries.
This is a compromise for using one LRU list for 2 different classes of
dentries. The basic idea is that negative dentries that are used only
once will go first irrespective of their age.

> Perhaps a second internal LRU list in the list_lru for "immediate
> reclaim" objects would be a better solution. i.e. similar to the
> active/inactive lists used for prioritising the working set iover
> single use pages in page reclaim. negative dentries go onto the
> immediate list, real dentries go on the existing list. Both are LRU,
> and the shrinker operates on the immediate list first. When we
> rotate referenced negative dentries on the immediate list, promote
> them to the active list with all the real dentries so they age out
> with the rest of the working set. That way single use negative
> dentries get removed in LRU order in preference to the working set
> of real dentries.
>
> Being able to make changes to the list implementation easily was one
> of the reasons I hid the implementation of the list_lru from the
> interface callers use....
>
> [...]

I have thought about using 2 lists for dentries. That will require much
more extensive changes to the code and hence much more testing will be
needed to verify their correctness. That is the main reason why I try to
avoid doing that.

As you have suggested, we could implement this 2-level LRU list in the
list_lru API. But it is used by other subsystems as well. Extensive
change like that will have similar issue in term of testing and
verification effort.

>> @@ -430,8 +424,20 @@ static void d_lru_add(struct dentry *dentry)
>>  	D_FLAG_VERIFY(dentry, 0);
>>  	dentry->d_flags |= DCACHE_LRU_LIST;
>>  	this_cpu_inc(nr_dentry_unused);
>> +	if (d_is_negative(dentry)) {
>> +		__neg_dentry_inc(dentry);
> /me notes this patch now open codes this, like suggested in the
> previous patch.
>
>> +		if (dentry->d_flags & DCACHE_NEW_NEGATIVE) {
>> +			/*
>> +			 * Add the negative dentry to the head once, it
>> +			 * will be added to the tail next time.
>> +			 */
>> +			WARN_ON_ONCE(!list_lru_add_head(
>> +				&dentry->d_sb->s_dentry_lru, &dentry->d_lru));
>> +			dentry->d_flags &= ~DCACHE_NEW_NEGATIVE;
>> +			return;
>> +		}
>> +	}
>>  	WARN_ON_ONCE(!list_lru_add(&dentry->d_sb->s_dentry_lru, &dentry->d_lru));
>> -	neg_dentry_inc(dentry);
>>  }
>>  
>>  static void d_lru_del(struct dentry *dentry)
>> @@ -2620,6 +2626,9 @@ static inline void __d_add(struct dentry *dentry, struct inode *inode)
>>  		__d_set_inode_and_type(dentry, inode, add_flags);
>>  		raw_write_seqcount_end(&dentry->d_seq);
>>  		fsnotify_update_flags(dentry);
>> +	} else {
>> +		/* It is a negative dentry, add it to LRU head initially. */
>> +		dentry->d_flags |= DCACHE_NEW_NEGATIVE;
> Comments about LRU behaviour should not be put anywhere but in the
> lru code. Otherwise it ends up stale and wrong, assuming it is
> correct to start with....
>

I will document all that in the LRU code.

>>  	}
>>  	__d_rehash(dentry);
>>  	if (dir)
>> diff --git a/include/linux/dcache.h b/include/linux/dcache.h
>> index df942e5..03a1918 100644
>> --- a/include/linux/dcache.h
>> +++ b/include/linux/dcache.h
>> @@ -214,6 +214,7 @@ struct dentry_operations {
>>  #define DCACHE_FALLTHRU			0x01000000 /* Fall through to lower layer */
>>  #define DCACHE_ENCRYPTED_WITH_KEY	0x02000000 /* dir is encrypted with a valid key */
>>  #define DCACHE_OP_REAL			0x04000000
>> +#define DCACHE_NEW_NEGATIVE		0x08000000 /* New negative dentry */
>>  
>>  #define DCACHE_PAR_LOOKUP		0x10000000 /* being looked up (with parent locked shared) */
>>  #define DCACHE_DENTRY_CURSOR		0x20000000
>> diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
>> index aa5efd9..bfac057 100644
>> --- a/include/linux/list_lru.h
>> +++ b/include/linux/list_lru.h
>> @@ -90,6 +90,23 @@ int __list_lru_init(struct list_lru *lru, bool memcg_aware,
>>  bool list_lru_add(struct list_lru *lru, struct list_head *item);
>>  
>>  /**
>> + * list_lru_add_head: add an element to the lru list's head
>> + * @list_lru: the lru pointer
>> + * @item: the item to be added.
>> + *
>> + * This is similar to list_lru_add(). The only difference is the location
>> + * where the new item will be added. The list_lru_add() function will add
>> + * the new item to the tail as it is the most recently used one. The
>> + * list_lru_add_head() will add the new item into the head so that it
>> + * will the first to go if a shrinker is running. So this function should
>> + * only be used for less important item that can be the first to go if
>> + * the system is under memory pressure.
> As mentioned above, this API needs reference object ages, not
> internal list ordering and shrinker implementation details.

Sure. Thanks for the suggestion.

Cheers,
Longman
Waiman Long Aug. 29, 2018, 7:58 p.m. UTC | #12
On 08/29/2018 03:51 AM, Michal Hocko wrote:
> On Tue 28-08-18 13:19:40, Waiman Long wrote:
>> For negative dentries that are accessed once and never used again, they
>> should be removed first before other dentries when shrinker is running.
>> This is done by putting negative dentries at the head of the LRU list
>> instead at the tail.
>>
>> A new DCACHE_NEW_NEGATIVE flag is now added to a negative dentry when it
>> is initially created. When such a dentry is added to the LRU, it will be
>> added to the head so that it will be the first to go when a shrinker is
>> running if it is never accessed again (DCACHE_REFERENCED bit not set).
>> The flag is cleared after the LRU list addition.
> Placing object to the head of the LRU list can be really tricky as Dave
> pointed out. I am not familiar with the dentry cache reclaim so my
> comparison below might not apply. Let me try anyway.
>
> Negative dentries sound very similar to MADV_FREE pages from the reclaim
> POV. They are primary candidate for reclaim, yet you want to preserve
> aging to other easily reclaimable objects (including other MADV_FREE
> pages). What we do for those pages is to move them from the anonymous
> LRU list to the inactive file LRU list. Now you obviously do not have
> anon/file LRUs but something similar to active/inactive LRU lists might
> be a reasonably good match. Have easily reclaimable dentries on the
> inactive list including negative dentries. If negative entries are
> heavily used then they can promote to the active list because there is
> no reason to reclaim them soon.
>
> Just my 2c

As mentioned in my reply to Dave, I did considered using a 2 LRU list
solution. However, that will add more complexity to the dcache LRU
management code than my current approach and probably more potential for
slowdown.

One point that I forgot to mention is that the current dcache LRU list
will only update the order of the dentries when the list is being
shrinked. If a dentry was referenced again before (the DCACHE_REFERENCED
bit set), it will rotated to the tail of the list via LRU rotation
instead of being removed. This is not strictly LRU anyway. The way that
my patch work will just make it move further away from true LRU behavior.

Cheers,
Longman
Waiman Long Aug. 29, 2018, 8:03 p.m. UTC | #13
On 08/29/2018 01:54 PM, Paul E. McKenney wrote:
> On Tue, Aug 28, 2018 at 04:01:50PM -0700, Andrew Morton wrote:
>> Another pet peeve ;)
>>
>> On Tue, 28 Aug 2018 13:19:40 -0400 Waiman Long <longman@redhat.com> wrote:
>>
>>>  /**
>>> + * list_lru_add_head: add an element to the lru list's head
>>> + * @list_lru: the lru pointer
>>> + * @item: the item to be added.
>>> + *
>>> + * This is similar to list_lru_add(). The only difference is the location
>>> + * where the new item will be added. The list_lru_add() function will add
>> People often use the term "the foo() function".  I don't know why -
>> just say "foo()"!
> For whatever it is worth...
>
> I tend to use "The foo() function ..." instead of "foo() ..." in order
> to properly capitalize the first word of the sentence.  So I might say
> "The call_rcu() function enqueues an RCU callback." rather than something
> like "call_rcu() enqueues an RCU callback."  Or I might use some other
> trick to keep "call_rcu()" from being the first word of the sentence.
> But if the end of the previous sentence introduced call_rcu(), you
> usually want the next sentence's first use of "call_rcu()" to be very
> early in the sentence, because otherwise the flow will seem choppy.
>
> And no, I have no idea what I would do if I were writing in German,
> where nouns are capitalized, given that function names tend to be used
> as nouns.  Probably I would get yelled at a lot for capitalizing my
> function names.  ;-)
>
> 							Thanx, Paul
>
Yes, doing proper capitalization of the first letter of a sentence is
the main reason I used "The foo() function" in a sentence.

Cheers,
Longman
Matthew Wilcox Aug. 29, 2018, 9:04 p.m. UTC | #14
On Wed, Aug 29, 2018 at 10:54:05AM -0700, Paul E. McKenney wrote:
> On Tue, Aug 28, 2018 at 04:01:50PM -0700, Andrew Morton wrote:
> > People often use the term "the foo() function".  I don't know why -
> > just say "foo()"!
> 
> For whatever it is worth...
> 
> I tend to use "The foo() function ..." instead of "foo() ..." in order
> to properly capitalize the first word of the sentence.  So I might say
> "The call_rcu() function enqueues an RCU callback." rather than something
> like "call_rcu() enqueues an RCU callback."  Or I might use some other
> trick to keep "call_rcu()" from being the first word of the sentence.

I tend to write 'Use call_rcu() to enqueue a callback." or "When
call_rcu() returns, the callback will have been enqueued".
Dave Chinner Aug. 30, 2018, 1:12 a.m. UTC | #15
On Wed, Aug 29, 2018 at 03:34:05PM -0400, Waiman Long wrote:
> On 08/28/2018 09:02 PM, Dave Chinner wrote:
> > On Tue, Aug 28, 2018 at 01:19:40PM -0400, Waiman Long wrote:
> > And then there's shrinker behaviour. What happens if the shrinker
> > isolate callback returns LRU_ROTATE on a negative dentry? It gets
> > moved to the most recent end of the list, so it won't have an
> > attempt to reclaim it again until it's tried to reclaim all the real
> > dentries. IOWs, it goes back to behaving like LRUs are supposed to
> > behaving.
> >
> > IOWs, reclaim behaviour of negative dentries will be
> > highly unpredictable, it will not easily retain a working set, nor
> > will the working set it does retain be predictable or easy to eject
> > from memory when the workload changes.
> >
> > Is this the behavour what we want for negative dentries?
> 
> I am aware that the behavior is not strictly LRU for negative dentries.
> This is a compromise for using one LRU list for 2 different classes of
> dentries.

Thus demonstrating just enough knowledge to be dangerous.

We already 3 different classes of dentries on the LRU list:

	- in use dentries (because we use lazy removal to avoid
	  lru list lock contention on cache hit lookups)
	- unused, referenced dentries
	- unused, unreferenced dentries.

Each of these classes of dentries are treated differently by the
shrinker, but the key point is that they are all aged the same way
and so there's consistent maintenance of the working set under
memory pressure. Putting negative dentries at the head of the list
doesn't create a new "class" of object on the LRU, it just changes
the ordering of the lru list. This will cause unpredictable
behaviour because objects haven't had a chance to age gracefully
before they are reclaimed.

FYI, the inode cache has the same list_lru setup, object types and
shrinker algorithm as the dentry cache, so this isn't a one-off.
Indeed, the XFS buffer cache has a multi-reference heirarchy of 13
different types of {unused, referenced} buffers in it's list_lru to
implement a quasi aging-NFU reclaim algorithm in it's shrinker.

i.e. the list_lru infrastructure has never provided or enforced a
pure LRU algorithm. It is common infrastructure intended to provide
a scalable, flexible and memcg-aware FIFO-like object tracking
system that interates tightly with memory reclaim to allow
subsystems to implement cache reclaim algorithms that are optimal
for that subsystem.

IOWs, the list_lru doesn't define the reclaim algorithm the
subsystem uses and there's no reason why we can't extend the
infrastructure to support more complex algorithms without impacting
existing subsystem reclaim algorithms at all. Only the subsystems
that use the new infrastructure and algorithms would need careful
examination.  Of course, the overall system cache balancing
behaviour under different and changing workloads would still need to
be verified, but you have to do that for any cache reclaim algorithm
change that is made....

> The basic idea is that negative dentries that are used only
> once will go first irrespective of their age.

Using MRU for negative dentries, as I've previously explained, is a
flawed concept. It might be expedient to solve your specific
problem, but it's not a solution to the general problem of negative
dentry management.

> > Perhaps a second internal LRU list in the list_lru for "immediate
> > reclaim" objects would be a better solution. i.e. similar to the
> > active/inactive lists used for prioritising the working set iover
> > single use pages in page reclaim. negative dentries go onto the
> > immediate list, real dentries go on the existing list. Both are LRU,
> > and the shrinker operates on the immediate list first. When we
> > rotate referenced negative dentries on the immediate list, promote
> > them to the active list with all the real dentries so they age out
> > with the rest of the working set. That way single use negative
> > dentries get removed in LRU order in preference to the working set
> > of real dentries.
> >
> > Being able to make changes to the list implementation easily was one
> > of the reasons I hid the implementation of the list_lru from the
> > interface callers use....
> >
> > [...]
> 
> I have thought about using 2 lists for dentries. That will require much
> more extensive changes to the code and hence much more testing will be
> needed to verify their correctness.  That is the main reason why I try to
> avoid doing that.

i.e. expediency.

However, you're changing the behaviour of core caching and memory
reclaim algorithms. The amount and level of testing and verification
you need to do is the same regardless of whether it's a small change
or a large change.  Sure, you've shown that *one* artificial
micro-benchmark improves, but what about everything else?

> As you have suggested, we could implement this 2-level LRU list in the
> list_lru API. But it is used by other subsystems as well. Extensive
> change like that will have similar issue in term of testing and
> verification effort.

I know what changing reclaim algorithm involves, how difficult
it is to balance the competing caches quickly and to the desired
ratios for acceptable performance, how difficult it is to measure
and control the system reacts to transient and impulse memory
pressure events, etc.

I also know that "simple" and/or "obviously correct" subsystem
changes can cause very unexepected system level effects, and that
it's almost never what you think it is that caused the unexpected
behaviour.  IOWs, getting anything even slightly wrong in these
algorithms will adversely affect system performance and balance
significantly.  Hence the bar is /always/ set high for core caching
algorithm changes like this.

Cheers,

Dave.
Michal Hocko Aug. 30, 2018, 7:20 a.m. UTC | #16
On Wed 29-08-18 15:58:52, Waiman Long wrote:
> On 08/29/2018 03:51 AM, Michal Hocko wrote:
> > On Tue 28-08-18 13:19:40, Waiman Long wrote:
> >> For negative dentries that are accessed once and never used again, they
> >> should be removed first before other dentries when shrinker is running.
> >> This is done by putting negative dentries at the head of the LRU list
> >> instead at the tail.
> >>
> >> A new DCACHE_NEW_NEGATIVE flag is now added to a negative dentry when it
> >> is initially created. When such a dentry is added to the LRU, it will be
> >> added to the head so that it will be the first to go when a shrinker is
> >> running if it is never accessed again (DCACHE_REFERENCED bit not set).
> >> The flag is cleared after the LRU list addition.
> > Placing object to the head of the LRU list can be really tricky as Dave
> > pointed out. I am not familiar with the dentry cache reclaim so my
> > comparison below might not apply. Let me try anyway.
> >
> > Negative dentries sound very similar to MADV_FREE pages from the reclaim
> > POV. They are primary candidate for reclaim, yet you want to preserve
> > aging to other easily reclaimable objects (including other MADV_FREE
> > pages). What we do for those pages is to move them from the anonymous
> > LRU list to the inactive file LRU list. Now you obviously do not have
> > anon/file LRUs but something similar to active/inactive LRU lists might
> > be a reasonably good match. Have easily reclaimable dentries on the
> > inactive list including negative dentries. If negative entries are
> > heavily used then they can promote to the active list because there is
> > no reason to reclaim them soon.
> >
> > Just my 2c
> 
> As mentioned in my reply to Dave, I did considered using a 2 LRU list
> solution. However, that will add more complexity to the dcache LRU
> management code than my current approach and probably more potential for
> slowdown.

I completely agree with Dave here. This is not easy but trying to sneak
in something that works for an _artificial_ workload is simply a no go.
So if it takes to come with a more complex solution to cover more
general workloads then be it. Someone has to bite a bullet and explore
that direction. It won't be a simple project but well, if negative
dentries really matter then it is worth making the reclaim design robust
and comprehensible rather than adhoc and unpredictable.
Waiman Long Aug. 30, 2018, 9:48 p.m. UTC | #17
On 08/30/2018 03:20 AM, Michal Hocko wrote:
> On Wed 29-08-18 15:58:52, Waiman Long wrote:
>> On 08/29/2018 03:51 AM, Michal Hocko wrote:
>>> On Tue 28-08-18 13:19:40, Waiman Long wrote:
>>>> For negative dentries that are accessed once and never used again, they
>>>> should be removed first before other dentries when shrinker is running.
>>>> This is done by putting negative dentries at the head of the LRU list
>>>> instead at the tail.
>>>>
>>>> A new DCACHE_NEW_NEGATIVE flag is now added to a negative dentry when it
>>>> is initially created. When such a dentry is added to the LRU, it will be
>>>> added to the head so that it will be the first to go when a shrinker is
>>>> running if it is never accessed again (DCACHE_REFERENCED bit not set).
>>>> The flag is cleared after the LRU list addition.
>>> Placing object to the head of the LRU list can be really tricky as Dave
>>> pointed out. I am not familiar with the dentry cache reclaim so my
>>> comparison below might not apply. Let me try anyway.
>>>
>>> Negative dentries sound very similar to MADV_FREE pages from the reclaim
>>> POV. They are primary candidate for reclaim, yet you want to preserve
>>> aging to other easily reclaimable objects (including other MADV_FREE
>>> pages). What we do for those pages is to move them from the anonymous
>>> LRU list to the inactive file LRU list. Now you obviously do not have
>>> anon/file LRUs but something similar to active/inactive LRU lists might
>>> be a reasonably good match. Have easily reclaimable dentries on the
>>> inactive list including negative dentries. If negative entries are
>>> heavily used then they can promote to the active list because there is
>>> no reason to reclaim them soon.
>>>
>>> Just my 2c
>> As mentioned in my reply to Dave, I did considered using a 2 LRU list
>> solution. However, that will add more complexity to the dcache LRU
>> management code than my current approach and probably more potential for
>> slowdown.
> I completely agree with Dave here. This is not easy but trying to sneak
> in something that works for an _artificial_ workload is simply a no go.
> So if it takes to come with a more complex solution to cover more
> general workloads then be it. Someone has to bite a bullet and explore
> that direction. It won't be a simple project but well, if negative
> dentries really matter then it is worth making the reclaim design robust
> and comprehensible rather than adhoc and unpredictable.

OK, I will need to spend more time to think about a better way of doing
that.

Cheers,
Longman
Waiman Long Aug. 30, 2018, 9:51 p.m. UTC | #18
On 08/29/2018 09:12 PM, Dave Chinner wrote:
> On Wed, Aug 29, 2018 at 03:34:05PM -0400, Waiman Long wrote:
>> On 08/28/2018 09:02 PM, Dave Chinner wrote:
>>> On Tue, Aug 28, 2018 at 01:19:40PM -0400, Waiman Long wrote:
>>> And then there's shrinker behaviour. What happens if the shrinker
>>> isolate callback returns LRU_ROTATE on a negative dentry? It gets
>>> moved to the most recent end of the list, so it won't have an
>>> attempt to reclaim it again until it's tried to reclaim all the real
>>> dentries. IOWs, it goes back to behaving like LRUs are supposed to
>>> behaving.
>>>
>>> IOWs, reclaim behaviour of negative dentries will be
>>> highly unpredictable, it will not easily retain a working set, nor
>>> will the working set it does retain be predictable or easy to eject
>>> from memory when the workload changes.
>>>
>>> Is this the behavour what we want for negative dentries?
>> I am aware that the behavior is not strictly LRU for negative dentries.
>> This is a compromise for using one LRU list for 2 different classes of
>> dentries.
> Thus demonstrating just enough knowledge to be dangerous.
>
> We already 3 different classes of dentries on the LRU list:
>
> 	- in use dentries (because we use lazy removal to avoid
> 	  lru list lock contention on cache hit lookups)
> 	- unused, referenced dentries
> 	- unused, unreferenced dentries.
>
> Each of these classes of dentries are treated differently by the
> shrinker, but the key point is that they are all aged the same way
> and so there's consistent maintenance of the working set under
> memory pressure. Putting negative dentries at the head of the list
> doesn't create a new "class" of object on the LRU, it just changes
> the ordering of the lru list. This will cause unpredictable
> behaviour because objects haven't had a chance to age gracefully
> before they are reclaimed.
>
> FYI, the inode cache has the same list_lru setup, object types and
> shrinker algorithm as the dentry cache, so this isn't a one-off.
> Indeed, the XFS buffer cache has a multi-reference heirarchy of 13
> different types of {unused, referenced} buffers in it's list_lru to
> implement a quasi aging-NFU reclaim algorithm in it's shrinker.
>
> i.e. the list_lru infrastructure has never provided or enforced a
> pure LRU algorithm. It is common infrastructure intended to provide
> a scalable, flexible and memcg-aware FIFO-like object tracking
> system that interates tightly with memory reclaim to allow
> subsystems to implement cache reclaim algorithms that are optimal
> for that subsystem.
>
> IOWs, the list_lru doesn't define the reclaim algorithm the
> subsystem uses and there's no reason why we can't extend the
> infrastructure to support more complex algorithms without impacting
> existing subsystem reclaim algorithms at all. Only the subsystems
> that use the new infrastructure and algorithms would need careful
> examination.  Of course, the overall system cache balancing
> behaviour under different and changing workloads would still need to
> be verified, but you have to do that for any cache reclaim algorithm
> change that is made....
>
>> The basic idea is that negative dentries that are used only
>> once will go first irrespective of their age.
> Using MRU for negative dentries, as I've previously explained, is a
> flawed concept. It might be expedient to solve your specific
> problem, but it's not a solution to the general problem of negative
> dentry management.
>
>>> Perhaps a second internal LRU list in the list_lru for "immediate
>>> reclaim" objects would be a better solution. i.e. similar to the
>>> active/inactive lists used for prioritising the working set iover
>>> single use pages in page reclaim. negative dentries go onto the
>>> immediate list, real dentries go on the existing list. Both are LRU,
>>> and the shrinker operates on the immediate list first. When we
>>> rotate referenced negative dentries on the immediate list, promote
>>> them to the active list with all the real dentries so they age out
>>> with the rest of the working set. That way single use negative
>>> dentries get removed in LRU order in preference to the working set
>>> of real dentries.
>>>
>>> Being able to make changes to the list implementation easily was one
>>> of the reasons I hid the implementation of the list_lru from the
>>> interface callers use....
>>>
>>> [...]
>> I have thought about using 2 lists for dentries. That will require much
>> more extensive changes to the code and hence much more testing will be
>> needed to verify their correctness.  That is the main reason why I try to
>> avoid doing that.
> i.e. expediency.
>
> However, you're changing the behaviour of core caching and memory
> reclaim algorithms. The amount and level of testing and verification
> you need to do is the same regardless of whether it's a small change
> or a large change.  Sure, you've shown that *one* artificial
> micro-benchmark improves, but what about everything else?
>
>> As you have suggested, we could implement this 2-level LRU list in the
>> list_lru API. But it is used by other subsystems as well. Extensive
>> change like that will have similar issue in term of testing and
>> verification effort.
> I know what changing reclaim algorithm involves, how difficult
> it is to balance the competing caches quickly and to the desired
> ratios for acceptable performance, how difficult it is to measure
> and control the system reacts to transient and impulse memory
> pressure events, etc.
>
> I also know that "simple" and/or "obviously correct" subsystem
> changes can cause very unexepected system level effects, and that
> it's almost never what you think it is that caused the unexpected
> behaviour.  IOWs, getting anything even slightly wrong in these
> algorithms will adversely affect system performance and balance
> significantly.  Hence the bar is /always/ set high for core caching
> algorithm changes like this.
>
> Cheers,
>
> Dave.

Thanks for the comments. I will need more time to think about it.

Cheers,
Longman
diff mbox series

Patch

diff --git a/fs/dcache.c b/fs/dcache.c
index 69f5541..ab6a4cf 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -242,12 +242,6 @@  static inline void __neg_dentry_inc(struct dentry *dentry)
 	this_cpu_inc(nr_dentry_neg);
 }
 
-static inline void neg_dentry_inc(struct dentry *dentry)
-{
-	if (unlikely(d_is_negative(dentry)))
-		__neg_dentry_inc(dentry);
-}
-
 static inline int dentry_cmp(const struct dentry *dentry, const unsigned char *ct, unsigned tcount)
 {
 	/*
@@ -353,7 +347,7 @@  static inline void __d_set_inode_and_type(struct dentry *dentry,
 
 	dentry->d_inode = inode;
 	flags = READ_ONCE(dentry->d_flags);
-	flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU);
+	flags &= ~(DCACHE_ENTRY_TYPE | DCACHE_FALLTHRU | DCACHE_NEW_NEGATIVE);
 	flags |= type_flags;
 	WRITE_ONCE(dentry->d_flags, flags);
 }
@@ -430,8 +424,20 @@  static void d_lru_add(struct dentry *dentry)
 	D_FLAG_VERIFY(dentry, 0);
 	dentry->d_flags |= DCACHE_LRU_LIST;
 	this_cpu_inc(nr_dentry_unused);
+	if (d_is_negative(dentry)) {
+		__neg_dentry_inc(dentry);
+		if (dentry->d_flags & DCACHE_NEW_NEGATIVE) {
+			/*
+			 * Add the negative dentry to the head once, it
+			 * will be added to the tail next time.
+			 */
+			WARN_ON_ONCE(!list_lru_add_head(
+				&dentry->d_sb->s_dentry_lru, &dentry->d_lru));
+			dentry->d_flags &= ~DCACHE_NEW_NEGATIVE;
+			return;
+		}
+	}
 	WARN_ON_ONCE(!list_lru_add(&dentry->d_sb->s_dentry_lru, &dentry->d_lru));
-	neg_dentry_inc(dentry);
 }
 
 static void d_lru_del(struct dentry *dentry)
@@ -2620,6 +2626,9 @@  static inline void __d_add(struct dentry *dentry, struct inode *inode)
 		__d_set_inode_and_type(dentry, inode, add_flags);
 		raw_write_seqcount_end(&dentry->d_seq);
 		fsnotify_update_flags(dentry);
+	} else {
+		/* It is a negative dentry, add it to LRU head initially. */
+		dentry->d_flags |= DCACHE_NEW_NEGATIVE;
 	}
 	__d_rehash(dentry);
 	if (dir)
diff --git a/include/linux/dcache.h b/include/linux/dcache.h
index df942e5..03a1918 100644
--- a/include/linux/dcache.h
+++ b/include/linux/dcache.h
@@ -214,6 +214,7 @@  struct dentry_operations {
 #define DCACHE_FALLTHRU			0x01000000 /* Fall through to lower layer */
 #define DCACHE_ENCRYPTED_WITH_KEY	0x02000000 /* dir is encrypted with a valid key */
 #define DCACHE_OP_REAL			0x04000000
+#define DCACHE_NEW_NEGATIVE		0x08000000 /* New negative dentry */
 
 #define DCACHE_PAR_LOOKUP		0x10000000 /* being looked up (with parent locked shared) */
 #define DCACHE_DENTRY_CURSOR		0x20000000
diff --git a/include/linux/list_lru.h b/include/linux/list_lru.h
index aa5efd9..bfac057 100644
--- a/include/linux/list_lru.h
+++ b/include/linux/list_lru.h
@@ -90,6 +90,23 @@  int __list_lru_init(struct list_lru *lru, bool memcg_aware,
 bool list_lru_add(struct list_lru *lru, struct list_head *item);
 
 /**
+ * list_lru_add_head: add an element to the lru list's head
+ * @list_lru: the lru pointer
+ * @item: the item to be added.
+ *
+ * This is similar to list_lru_add(). The only difference is the location
+ * where the new item will be added. The list_lru_add() function will add
+ * the new item to the tail as it is the most recently used one. The
+ * list_lru_add_head() will add the new item into the head so that it
+ * will the first to go if a shrinker is running. So this function should
+ * only be used for less important item that can be the first to go if
+ * the system is under memory pressure.
+ *
+ * Return value: true if the list was updated, false otherwise
+ */
+bool list_lru_add_head(struct list_lru *lru, struct list_head *item);
+
+/**
  * list_lru_del: delete an element to the lru list
  * @list_lru: the lru pointer
  * @item: the item to be deleted.
diff --git a/mm/list_lru.c b/mm/list_lru.c
index 5b30625..133f41c 100644
--- a/mm/list_lru.c
+++ b/mm/list_lru.c
@@ -124,7 +124,8 @@  static inline bool list_lru_memcg_aware(struct list_lru *lru)
 }
 #endif /* CONFIG_MEMCG_KMEM */
 
-bool list_lru_add(struct list_lru *lru, struct list_head *item)
+static inline bool __list_lru_add(struct list_lru *lru, struct list_head *item,
+				  const bool add_tail)
 {
 	int nid = page_to_nid(virt_to_page(item));
 	struct list_lru_node *nlru = &lru->node[nid];
@@ -134,7 +135,7 @@  bool list_lru_add(struct list_lru *lru, struct list_head *item)
 	spin_lock(&nlru->lock);
 	if (list_empty(item)) {
 		l = list_lru_from_kmem(nlru, item, &memcg);
-		list_add_tail(item, &l->list);
+		(add_tail ? list_add_tail : list_add)(item, &l->list);
 		/* Set shrinker bit if the first element was added */
 		if (!l->nr_items++)
 			memcg_set_shrinker_bit(memcg, nid,
@@ -146,8 +147,19 @@  bool list_lru_add(struct list_lru *lru, struct list_head *item)
 	spin_unlock(&nlru->lock);
 	return false;
 }
+
+bool list_lru_add(struct list_lru *lru, struct list_head *item)
+{
+	return __list_lru_add(lru, item, true);
+}
 EXPORT_SYMBOL_GPL(list_lru_add);
 
+bool list_lru_add_head(struct list_lru *lru, struct list_head *item)
+{
+	return __list_lru_add(lru, item, false);
+}
+EXPORT_SYMBOL_GPL(list_lru_add_head);
+
 bool list_lru_del(struct list_lru *lru, struct list_head *item)
 {
 	int nid = page_to_nid(virt_to_page(item));