Message ID | 20230429082759.1600796-2-fengwei.yin@intel.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | Reduce lock contention related with large folio | expand |
On Sat, Apr 29, 2023 at 04:27:58PM +0800, Yin Fengwei wrote: > free_transhuge_page() acquires split queue lock then check > whether the THP was added to deferred list or not. It brings > high deferred queue lock contention. > > It's safe to check whether the THP is in deferred list or not > without holding the deferred queue lock in free_transhuge_page() > because when code hit free_transhuge_page(), there is no one > tries to add the folio to _deferred_list. > > Running page_fault1 of will-it-scale + order 2 folio for anonymous > mapping with 96 processes on an Ice Lake 48C/96T test box, we could > see the 61% split_queue_lock contention: > - 63.02% 0.01% page_fault1_pro [kernel.kallsyms] [k] free_transhuge_page > - 63.01% free_transhuge_page > + 62.91% _raw_spin_lock_irqsave > > With this patch applied, the split_queue_lock contention is less > than 1%. > > Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Yin Fengwei <fengwei.yin@intel.com> writes: > free_transhuge_page() acquires split queue lock then check > whether the THP was added to deferred list or not. It brings > high deferred queue lock contention. > > It's safe to check whether the THP is in deferred list or not > without holding the deferred queue lock in free_transhuge_page() > because when code hit free_transhuge_page(), there is no one > tries to add the folio to _deferred_list. > > Running page_fault1 of will-it-scale + order 2 folio for anonymous > mapping with 96 processes on an Ice Lake 48C/96T test box, we could > see the 61% split_queue_lock contention: > - 63.02% 0.01% page_fault1_pro [kernel.kallsyms] [k] free_transhuge_page > - 63.01% free_transhuge_page > + 62.91% _raw_spin_lock_irqsave > > With this patch applied, the split_queue_lock contention is less > than 1%. > > Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> Thanks! Reviewed-by: "Huang, Ying" <ying.huang@intel.com> > --- > mm/huge_memory.c | 17 ++++++++++++----- > 1 file changed, 12 insertions(+), 5 deletions(-) > > diff --git a/mm/huge_memory.c b/mm/huge_memory.c > index 032fb0ef9cd1..2a1df2c24c8e 100644 > --- a/mm/huge_memory.c > +++ b/mm/huge_memory.c > @@ -2799,12 +2799,19 @@ void free_transhuge_page(struct page *page) > struct deferred_split *ds_queue = get_deferred_split_queue(folio); > unsigned long flags; > > - spin_lock_irqsave(&ds_queue->split_queue_lock, flags); > - if (!list_empty(&folio->_deferred_list)) { > - ds_queue->split_queue_len--; > - list_del(&folio->_deferred_list); > + /* > + * At this point, there is no one trying to add the folio to > + * deferred_list. If folio is not in deferred_list, it's safe > + * to check without acquiring the split_queue_lock. > + */ > + if (data_race(!list_empty(&folio->_deferred_list))) { > + spin_lock_irqsave(&ds_queue->split_queue_lock, flags); > + if (!list_empty(&folio->_deferred_list)) { > + ds_queue->split_queue_len--; > + list_del(&folio->_deferred_list); > + } > + spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); > } > - spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); > free_compound_page(page); > }
Hi Kirill, On 5/4/2023 7:48 PM, kirill@shutemov.name wrote: > On Sat, Apr 29, 2023 at 04:27:58PM +0800, Yin Fengwei wrote: >> free_transhuge_page() acquires split queue lock then check >> whether the THP was added to deferred list or not. It brings >> high deferred queue lock contention. >> >> It's safe to check whether the THP is in deferred list or not >> without holding the deferred queue lock in free_transhuge_page() >> because when code hit free_transhuge_page(), there is no one >> tries to add the folio to _deferred_list. >> >> Running page_fault1 of will-it-scale + order 2 folio for anonymous >> mapping with 96 processes on an Ice Lake 48C/96T test box, we could >> see the 61% split_queue_lock contention: >> - 63.02% 0.01% page_fault1_pro [kernel.kallsyms] [k] free_transhuge_page >> - 63.01% free_transhuge_page >> + 62.91% _raw_spin_lock_irqsave >> >> With this patch applied, the split_queue_lock contention is less >> than 1%. >> >> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> > > Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> Thanks a lot for the reviewing. Regards Yin, Fengwei >
Hi Ying, On 5/5/2023 8:52 AM, Huang, Ying wrote: > Yin Fengwei <fengwei.yin@intel.com> writes: > >> free_transhuge_page() acquires split queue lock then check >> whether the THP was added to deferred list or not. It brings >> high deferred queue lock contention. >> >> It's safe to check whether the THP is in deferred list or not >> without holding the deferred queue lock in free_transhuge_page() >> because when code hit free_transhuge_page(), there is no one >> tries to add the folio to _deferred_list. >> >> Running page_fault1 of will-it-scale + order 2 folio for anonymous >> mapping with 96 processes on an Ice Lake 48C/96T test box, we could >> see the 61% split_queue_lock contention: >> - 63.02% 0.01% page_fault1_pro [kernel.kallsyms] [k] free_transhuge_page >> - 63.01% free_transhuge_page >> + 62.91% _raw_spin_lock_irqsave >> >> With this patch applied, the split_queue_lock contention is less >> than 1%. >> >> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> > > Thanks! > > Reviewed-by: "Huang, Ying" <ying.huang@intel.com> Thanks a lot for reviewing. Regards Yin, Fengwei > >> --- >> mm/huge_memory.c | 17 ++++++++++++----- >> 1 file changed, 12 insertions(+), 5 deletions(-) >> >> diff --git a/mm/huge_memory.c b/mm/huge_memory.c >> index 032fb0ef9cd1..2a1df2c24c8e 100644 >> --- a/mm/huge_memory.c >> +++ b/mm/huge_memory.c >> @@ -2799,12 +2799,19 @@ void free_transhuge_page(struct page *page) >> struct deferred_split *ds_queue = get_deferred_split_queue(folio); >> unsigned long flags; >> >> - spin_lock_irqsave(&ds_queue->split_queue_lock, flags); >> - if (!list_empty(&folio->_deferred_list)) { >> - ds_queue->split_queue_len--; >> - list_del(&folio->_deferred_list); >> + /* >> + * At this point, there is no one trying to add the folio to >> + * deferred_list. If folio is not in deferred_list, it's safe >> + * to check without acquiring the split_queue_lock. >> + */ >> + if (data_race(!list_empty(&folio->_deferred_list))) { >> + spin_lock_irqsave(&ds_queue->split_queue_lock, flags); >> + if (!list_empty(&folio->_deferred_list)) { >> + ds_queue->split_queue_len--; >> + list_del(&folio->_deferred_list); >> + } >> + spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); >> } >> - spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); >> free_compound_page(page); >> }
Hi Andrew, On 5/4/23 19:48, kirill@shutemov.name wrote: > On Sat, Apr 29, 2023 at 04:27:58PM +0800, Yin Fengwei wrote: >> free_transhuge_page() acquires split queue lock then check >> whether the THP was added to deferred list or not. It brings >> high deferred queue lock contention. >> >> It's safe to check whether the THP is in deferred list or not >> without holding the deferred queue lock in free_transhuge_page() >> because when code hit free_transhuge_page(), there is no one >> tries to add the folio to _deferred_list. >> >> Running page_fault1 of will-it-scale + order 2 folio for anonymous >> mapping with 96 processes on an Ice Lake 48C/96T test box, we could >> see the 61% split_queue_lock contention: >> - 63.02% 0.01% page_fault1_pro [kernel.kallsyms] [k] free_transhuge_page >> - 63.01% free_transhuge_page >> + 62.91% _raw_spin_lock_irqsave >> >> With this patch applied, the split_queue_lock contention is less >> than 1%. >> >> Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> > > Acked-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com> I didn't get the green light for patch2 (which was trying to reduce lru lock contention) from Matthew. It may need more time to figure out how to reduce lru lock contention. I am wondering whether this patch1 (without patch2) can be picked as - It has nothing to do with patch2 - It could reduce the deferred queue lock contention. - It got acked-by from Kirill. Let me know if you want me to resend this patch1. Thanks. Regards Yin, Fengwei >
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 032fb0ef9cd1..2a1df2c24c8e 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2799,12 +2799,19 @@ void free_transhuge_page(struct page *page) struct deferred_split *ds_queue = get_deferred_split_queue(folio); unsigned long flags; - spin_lock_irqsave(&ds_queue->split_queue_lock, flags); - if (!list_empty(&folio->_deferred_list)) { - ds_queue->split_queue_len--; - list_del(&folio->_deferred_list); + /* + * At this point, there is no one trying to add the folio to + * deferred_list. If folio is not in deferred_list, it's safe + * to check without acquiring the split_queue_lock. + */ + if (data_race(!list_empty(&folio->_deferred_list))) { + spin_lock_irqsave(&ds_queue->split_queue_lock, flags); + if (!list_empty(&folio->_deferred_list)) { + ds_queue->split_queue_len--; + list_del(&folio->_deferred_list); + } + spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); } - spin_unlock_irqrestore(&ds_queue->split_queue_lock, flags); free_compound_page(page); }
free_transhuge_page() acquires split queue lock then check whether the THP was added to deferred list or not. It brings high deferred queue lock contention. It's safe to check whether the THP is in deferred list or not without holding the deferred queue lock in free_transhuge_page() because when code hit free_transhuge_page(), there is no one tries to add the folio to _deferred_list. Running page_fault1 of will-it-scale + order 2 folio for anonymous mapping with 96 processes on an Ice Lake 48C/96T test box, we could see the 61% split_queue_lock contention: - 63.02% 0.01% page_fault1_pro [kernel.kallsyms] [k] free_transhuge_page - 63.01% free_transhuge_page + 62.91% _raw_spin_lock_irqsave With this patch applied, the split_queue_lock contention is less than 1%. Signed-off-by: Yin Fengwei <fengwei.yin@intel.com> --- mm/huge_memory.c | 17 ++++++++++++----- 1 file changed, 12 insertions(+), 5 deletions(-)