diff mbox series

[v3,1/7] mm/lru: add per lruvec lock for memcg

Message ID 1573874106-23802-2-git-send-email-alex.shi@linux.alibaba.com (mailing list archive)
State New, archived
Headers show
Series per lruvec lru_lock for memcg | expand

Commit Message

Alex Shi Nov. 16, 2019, 3:15 a.m. UTC
Currently memcg still use per node pgdat->lru_lock to guard its lruvec.
That causes some lru_lock contention in a high container density system.

If we can use per lruvec lock, that could relief much of the lru_lock
contention.

The later patches will replace the pgdat->lru_lock with lruvec->lru_lock
and show the performance benefit by benchmarks.

Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Mel Gorman <mgorman@techsingularity.net>
Cc: Wei Yang <richard.weiyang@gmail.com>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Arun KS <arunks@codeaurora.org>
Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
Cc: Hugh Dickins <hughd@google.com>
Cc: cgroups@vger.kernel.org
Cc: linux-mm@kvack.org
Cc: linux-kernel@vger.kernel.org
---
 include/linux/mmzone.h | 2 ++
 mm/mmzone.c            | 1 +
 2 files changed, 3 insertions(+)

Comments

Shakeel Butt Nov. 16, 2019, 6:28 a.m. UTC | #1
On Fri, Nov 15, 2019 at 7:15 PM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>
> Currently memcg still use per node pgdat->lru_lock to guard its lruvec.
> That causes some lru_lock contention in a high container density system.
>
> If we can use per lruvec lock, that could relief much of the lru_lock
> contention.
>
> The later patches will replace the pgdat->lru_lock with lruvec->lru_lock
> and show the performance benefit by benchmarks.

Merge this patch with actual usage. No need to have a separate patch.

>
> Signed-off-by: Alex Shi <alex.shi@linux.alibaba.com>
> Cc: Andrew Morton <akpm@linux-foundation.org>
> Cc: Vlastimil Babka <vbabka@suse.cz>
> Cc: Dan Williams <dan.j.williams@intel.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Cc: Mel Gorman <mgorman@techsingularity.net>
> Cc: Wei Yang <richard.weiyang@gmail.com>
> Cc: Johannes Weiner <hannes@cmpxchg.org>
> Cc: Arun KS <arunks@codeaurora.org>
> Cc: Konstantin Khlebnikov <khlebnikov@yandex-team.ru>
> Cc: Hugh Dickins <hughd@google.com>
> Cc: cgroups@vger.kernel.org
> Cc: linux-mm@kvack.org
> Cc: linux-kernel@vger.kernel.org
> ---
>  include/linux/mmzone.h | 2 ++
>  mm/mmzone.c            | 1 +
>  2 files changed, 3 insertions(+)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 07b784ffbb14..a13b8a602ee5 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -267,6 +267,8 @@ struct lruvec {
>         unsigned long                   refaults;
>         /* Various lruvec state flags (enum lruvec_flags) */
>         unsigned long                   flags;
> +       /* per lruvec lru_lock for memcg */
> +       spinlock_t                      lru_lock;
>  #ifdef CONFIG_MEMCG
>         struct pglist_data *pgdat;
>  #endif
> diff --git a/mm/mmzone.c b/mm/mmzone.c
> index 4686fdc23bb9..3750a90ed4a0 100644
> --- a/mm/mmzone.c
> +++ b/mm/mmzone.c
> @@ -91,6 +91,7 @@ void lruvec_init(struct lruvec *lruvec)
>         enum lru_list lru;
>
>         memset(lruvec, 0, sizeof(struct lruvec));
> +       spin_lock_init(&lruvec->lru_lock);
>
>         for_each_lru(lru)
>                 INIT_LIST_HEAD(&lruvec->lists[lru]);
> --
> 1.8.3.1
>
Alex Shi Nov. 18, 2019, 2:44 a.m. UTC | #2
在 2019/11/16 下午2:28, Shakeel Butt 写道:
> On Fri, Nov 15, 2019 at 7:15 PM Alex Shi <alex.shi@linux.alibaba.com> wrote:
>>
>> Currently memcg still use per node pgdat->lru_lock to guard its lruvec.
>> That causes some lru_lock contention in a high container density system.
>>
>> If we can use per lruvec lock, that could relief much of the lru_lock
>> contention.
>>
>> The later patches will replace the pgdat->lru_lock with lruvec->lru_lock
>> and show the performance benefit by benchmarks.
> 
> Merge this patch with actual usage. No need to have a separate patch.

Thanks for comment, Shakeel!

Yes, but considering the 3rd, huge and un-splitable patch of actully replacing, I'd rather to pull sth out from 
it. Ty to make patches a bit more readable, Do you think so?

Thanks
Alex
Matthew Wilcox Nov. 18, 2019, 12:08 p.m. UTC | #3
On Mon, Nov 18, 2019 at 10:44:57AM +0800, Alex Shi wrote:
> 
> 
> 在 2019/11/16 下午2:28, Shakeel Butt 写道:
> > On Fri, Nov 15, 2019 at 7:15 PM Alex Shi <alex.shi@linux.alibaba.com> wrote:
> >>
> >> Currently memcg still use per node pgdat->lru_lock to guard its lruvec.
> >> That causes some lru_lock contention in a high container density system.
> >>
> >> If we can use per lruvec lock, that could relief much of the lru_lock
> >> contention.
> >>
> >> The later patches will replace the pgdat->lru_lock with lruvec->lru_lock
> >> and show the performance benefit by benchmarks.
> > 
> > Merge this patch with actual usage. No need to have a separate patch.
> 
> Thanks for comment, Shakeel!
> 
> Yes, but considering the 3rd, huge and un-splitable patch of actully replacing, I'd rather to pull sth out from 
> it. Ty to make patches a bit more readable, Do you think so?

This method of splitting the patches doesn't help with the reviewability of
the patch series.
Alex Shi Nov. 18, 2019, 12:37 p.m. UTC | #4
在 2019/11/18 下午8:08, Matthew Wilcox 写道:
>>> Merge this patch with actual usage. No need to have a separate patch.
>> Thanks for comment, Shakeel!
>>
>> Yes, but considering the 3rd, huge and un-splitable patch of actully replacing, I'd rather to pull sth out from 
>> it. Ty to make patches a bit more readable, Do you think so?
> This method of splitting the patches doesn't help with the reviewability of
> the patch series.

Hi Matthew,

Thanks for comments!
I will fold them into the 3rd patch as you are all insist on this. :)

Thanks
Alex
Alex Shi Nov. 19, 2019, 10:05 a.m. UTC | #5
在 2019/11/18 下午8:08, Matthew Wilcox 写道:
>> Thanks for comment, Shakeel!
>>
>> Yes, but considering the 3rd, huge and un-splitable patch of actully replacing, I'd rather to pull sth out from 
>> it. Ty to make patches a bit more readable, Do you think so?
> This method of splitting the patches doesn't help with the reviewability of
> the patch series.

thanks for comments! I will reorg my patchset for better understanding.

Alex
diff mbox series

Patch

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 07b784ffbb14..a13b8a602ee5 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -267,6 +267,8 @@  struct lruvec {
 	unsigned long			refaults;
 	/* Various lruvec state flags (enum lruvec_flags) */
 	unsigned long			flags;
+	/* per lruvec lru_lock for memcg */
+	spinlock_t			lru_lock;
 #ifdef CONFIG_MEMCG
 	struct pglist_data *pgdat;
 #endif
diff --git a/mm/mmzone.c b/mm/mmzone.c
index 4686fdc23bb9..3750a90ed4a0 100644
--- a/mm/mmzone.c
+++ b/mm/mmzone.c
@@ -91,6 +91,7 @@  void lruvec_init(struct lruvec *lruvec)
 	enum lru_list lru;
 
 	memset(lruvec, 0, sizeof(struct lruvec));
+	spin_lock_init(&lruvec->lru_lock);
 
 	for_each_lru(lru)
 		INIT_LIST_HEAD(&lruvec->lists[lru]);