Message ID | 20240412064353.133497-1-zhaoyang.huang@unisoc.com (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [1/1] mm: protect xa split stuff under lruvec->lru_lock during migration | expand |
On Fri, Apr 12, 2024 at 02:43:53PM +0800, zhaoyang.huang wrote: > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > Livelock in [1] is reported multitimes since v515, where the zero-ref > folio is repeatly found on the page cache by find_get_entry. A possible > timing sequence is proposed in [2], which can be described briefly as I have no patience for going through another one of your "analyses". 1. Can you reproduce this bug without this patch? 2. Does the reproducer stop working after this patch? Otherwise I'm not interested. Sorry. You burnt all my good will. > the lockless xarray operation could get harmed by an illegal folio > remaining on the slot[offset]. This commit would like to protect > the xa split stuff(folio_ref_freeze and __split_huge_page) under > lruvec->lock to remove the race window.
On Fri, 12 Apr 2024 14:43:53 +0800 "zhaoyang.huang" <zhaoyang.huang@unisoc.com> wrote:
> Livelock in [1] is reported multitimes since v515,
Are you able to provide us with a means by which others can reproduce this?
Thanks.
loop Dave, since he has ever helped set up an reproducer in https://lore.kernel.org/linux-mm/20221101071721.GV2703033@dread.disaster.area/ @Dave Chinner , I would like to ask for your kindly help on if you can verify this patch on your environment if convenient. Thanks a lot. On Sat, Apr 13, 2024 at 5:34 AM Andrew Morton <akpm@linux-foundation.org> wrote: > > On Fri, 12 Apr 2024 14:43:53 +0800 "zhaoyang.huang" <zhaoyang.huang@unisoc.com> wrote: > > > Livelock in [1] is reported multitimes since v515, > > Are you able to provide us with a means by which others can reproduce this? > > Thanks.
On Fri, Apr 12, 2024 at 8:24 PM Matthew Wilcox <willy@infradead.org> wrote: > > On Fri, Apr 12, 2024 at 02:43:53PM +0800, zhaoyang.huang wrote: > > From: Zhaoyang Huang <zhaoyang.huang@unisoc.com> > > > > Livelock in [1] is reported multitimes since v515, where the zero-ref > > folio is repeatly found on the page cache by find_get_entry. A possible > > timing sequence is proposed in [2], which can be described briefly as > > I have no patience for going through another one of your "analyses". > > 1. Can you reproduce this bug without this patch? > 2. Does the reproducer stop working after this patch? > > Otherwise I'm not interested. Sorry. You burnt all my good will. This bug has been reported many times by three people including me as below for at least two years, have you ever tried to solve it? Do Dave and Brian also burnt your good will if you ever have? Be aware that you are the maintainer who has the responsibility for maintaining the code but not us. "To wear crowns shall bear the heavy or give up". Put me on your SPAM list, thank you. https://lore.kernel.org/linux-mm/20221018223042.GJ2703033@dread.disaster.area/ https://lore.kernel.org/linux-mm/Y0%2FkZbIvMgkNhWpM@bfoster/ > > > the lockless xarray operation could get harmed by an illegal folio > > remaining on the slot[offset]. This commit would like to protect > > the xa split stuff(folio_ref_freeze and __split_huge_page) under > > lruvec->lock to remove the race window.
On Sat, Apr 13, 2024 at 10:01:27AM +0800, Zhaoyang Huang wrote: > loop Dave, since he has ever helped set up an reproducer in > https://lore.kernel.org/linux-mm/20221101071721.GV2703033@dread.disaster.area/ > @Dave Chinner , I would like to ask for your kindly help on if you can > verify this patch on your environment if convenient. Thanks a lot. I don't have the test environment from 18 months ago available any more. Also, I haven't seen this problem since that specific test environment tripped over the issue. Hence I don't have any way of confirming that the problem is fixed, either, because first I'd have to reproduce it... -Dave.
On Mon, Apr 15, 2024 at 8:09 AM Dave Chinner <david@fromorbit.com> wrote: > > On Sat, Apr 13, 2024 at 10:01:27AM +0800, Zhaoyang Huang wrote: > > loop Dave, since he has ever helped set up an reproducer in > > https://lore.kernel.org/linux-mm/20221101071721.GV2703033@dread.disaster.area/ > > @Dave Chinner , I would like to ask for your kindly help on if you can > > verify this patch on your environment if convenient. Thanks a lot. > > I don't have the test environment from 18 months ago available any > more. Also, I haven't seen this problem since that specific test > environment tripped over the issue. Hence I don't have any way of > confirming that the problem is fixed, either, because first I'd have > to reproduce it... Thanks for the information. I noticed that you reported another soft lockup which is related to xas_load since NOV.2023. This patch is supposed to be helpful for this. With regard to the version timing, this commit is actually a revert of <mm/thp: narrow lru locking> b6769834aac1d467fa1c71277d15688efcbb4d76 which is merged before v5.15. For saving your time, a brief description below. IMO, b6769834aa introduce a potential stall between freeze the folio's refcnt and store it back to 2, which have the xas_load->folio_try_get_rcu loops as livelock if it stalls the lru_lock's holder. b6769834aa split_huge_page_to_list - spin_lock(lru_lock) xas_split(&xas, folio,order) folio_refcnt_freeze(folio, 1 + folio_nr_pages(folio0) + spin_lock(lru_lock) xas_store(&xas, offset++, head+i) page_ref_add(head, 2) spin_unlock(lru_lock) Sorry in advance if the above doesn't make sense, I am just a developer who is also suffering from this bug and trying to fix it > > -Dave. > -- > Dave Chinner > david@fromorbit.com
diff --git a/mm/huge_memory.c b/mm/huge_memory.c index 9859aa4f7553..418e8d03480a 100644 --- a/mm/huge_memory.c +++ b/mm/huge_memory.c @@ -2891,7 +2891,7 @@ static void __split_huge_page(struct page *page, struct list_head *list, { struct folio *folio = page_folio(page); struct page *head = &folio->page; - struct lruvec *lruvec; + struct lruvec *lruvec = folio_lruvec(folio); struct address_space *swap_cache = NULL; unsigned long offset = 0; int i, nr_dropped = 0; @@ -2908,8 +2908,6 @@ static void __split_huge_page(struct page *page, struct list_head *list, xa_lock(&swap_cache->i_pages); } - /* lock lru list/PageCompound, ref frozen by page_ref_freeze */ - lruvec = folio_lruvec_lock(folio); ClearPageHasHWPoisoned(head); @@ -2942,7 +2940,6 @@ static void __split_huge_page(struct page *page, struct list_head *list, folio_set_order(new_folio, new_order); } - unlock_page_lruvec(lruvec); /* Caller disabled irqs, so they are still disabled here */ split_page_owner(head, order, new_order); @@ -2961,7 +2958,6 @@ static void __split_huge_page(struct page *page, struct list_head *list, folio_ref_add(folio, 1 + new_nr); xa_unlock(&folio->mapping->i_pages); } - local_irq_enable(); if (nr_dropped) shmem_uncharge(folio->mapping->host, nr_dropped); @@ -3048,6 +3044,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list, int extra_pins, ret; pgoff_t end; bool is_hzp; + struct lruvec *lruvec; VM_BUG_ON_FOLIO(!folio_test_locked(folio), folio); VM_BUG_ON_FOLIO(!folio_test_large(folio), folio); @@ -3159,6 +3156,14 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list, /* block interrupt reentry in xa_lock and spinlock */ local_irq_disable(); + + /* + * take lruvec's lock before freeze the folio to prevent the folio + * remains in the page cache with refcnt == 0, which could lead to + * find_get_entry enters livelock by iterating the xarray. + */ + lruvec = folio_lruvec_lock(folio); + if (mapping) { /* * Check if the folio is present in page cache. @@ -3203,12 +3208,16 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list, } __split_huge_page(page, list, end, new_order); + unlock_page_lruvec(lruvec); + local_irq_enable(); ret = 0; } else { spin_unlock(&ds_queue->split_queue_lock); fail: if (mapping) xas_unlock(&xas); + + unlock_page_lruvec(lruvec); local_irq_enable(); remap_page(folio, folio_nr_pages(folio)); ret = -EAGAIN;