[v3,1/4] mm/hwpoison: delete all entries before traversal in __folio_free_raw_hwp

Message ID 20230707201904.953262-2-jiaqiyan@google.com (mailing list archive)
State New
Series Improve hugetlbfs read on HWPOISON hugepages

Commit Message

Jiaqi Yan July 7, 2023, 8:19 p.m. UTC
Traversing an llist (e.g. with llist_for_each_safe) is only safe AFTER
the entries have been deleted from the llist. Correct the way
__folio_free_raw_hwp deletes and frees the raw_hwp_page entries in
raw_hwp_list: first llist_del_all, then kfree within llist_for_each_safe.
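
For reference, the safe consumer pattern reads roughly as below. This is
a minimal sketch against <linux/llist.h>; struct my_entry and my_list are
made-up names for illustration, not anything in the kernel tree:

#include <linux/llist.h>
#include <linux/slab.h>

struct my_entry {
        struct llist_node node;
        /* payload ... */
};

static LLIST_HEAD(my_list);

static void my_free_all(void)
{
        struct llist_node *first, *pos, *t;

        /*
         * Atomically detach the whole list before touching any entry;
         * traversing a live llist may race with concurrent deleters.
         */
        first = llist_del_all(&my_list);
        llist_for_each_safe(pos, t, first)
                kfree(llist_entry(pos, struct my_entry, node));
}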

As of today, concurrent adding, deleting, and traversal of raw_hwp_list
from hugetlb.c and/or memory-failure.c are safe with respect to each
other. Note this is guaranteed partly by the lock-free nature of llist,
and partly by holding hugetlb_lock and/or mf_mutex. For example, since
llist_del_all is lock-free with itself, the folio_clear_hugetlb_hwpoison()
calls from __update_and_free_hugetlb_folio and memory_failure need no
explicit locking when freeing the raw_hwp_list. New code that manipulates
raw_hwp_list must take care to ensure concurrency correctness.
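
To illustrate why "lock-free with itself" rules out a double free, here
is a userspace analogue in plain C11 atomics and pthreads. All names are
hypothetical; in the kernel, llist_del_all boils down to an atomic xchg
of head->first, so exactly one of the racing threads obtains the detached
entries and the other sees an empty list:

#include <pthread.h>
#include <stdatomic.h>
#include <stdlib.h>

struct node {
        struct node *next;
};

/* Userspace stand-in for struct llist_head; zero-initialized to NULL. */
static _Atomic(struct node *) list_head;

/* Analogue of llist_add(): lock-free push. */
static void push(struct node *n)
{
        struct node *old = atomic_load(&list_head);

        do {
                n->next = old;
        } while (!atomic_compare_exchange_weak(&list_head, &old, n));
}

/* Analogue of llist_del_all(): one atomic exchange detaches everything. */
static struct node *del_all(void)
{
        return atomic_exchange(&list_head, NULL);
}

/* Analogue of __folio_free_raw_hwp(): detach first, then free. */
static void *free_all(void *arg)
{
        struct node *n = del_all();     /* at most one thread gets entries */

        while (n) {
                struct node *next = n->next;

                free(n);                /* safe: we own the detached list */
                n = next;
        }
        return NULL;
}

int main(void)
{
        pthread_t t1, t2;

        for (int i = 0; i < 100; i++)
                push(malloc(sizeof(struct node)));

        /*
         * Two concurrent freers, mimicking e.g. unpoison_memory vs.
         * __update_and_free_hugetlb_folio racing on the same folio.
         */
        pthread_create(&t1, NULL, free_all, NULL);
        pthread_create(&t2, NULL, free_all, NULL);
        pthread_join(t1, NULL);
        pthread_join(t2, NULL);
        return 0;
}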

Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>
---
 mm/memory-failure.c | 8 +++-----
 1 file changed, 3 insertions(+), 5 deletions(-)

Comments

Miaohe Lin July 8, 2023, 2:40 a.m. UTC | #1
On 2023/7/8 4:19, Jiaqi Yan wrote:
> Traversing an llist (e.g. with llist_for_each_safe) is only safe AFTER
> the entries have been deleted from the llist. Correct the way
> __folio_free_raw_hwp deletes and frees the raw_hwp_page entries in
> raw_hwp_list: first llist_del_all, then kfree within llist_for_each_safe.
> 
> As of today, concurrent adding, deleting, and traversal of raw_hwp_list
> from hugetlb.c and/or memory-failure.c are safe with respect to each

I think there's a race on freeing the raw_hwp_list between unpoison_memory
and __update_and_free_hugetlb_folio:

  unpoison_memory                  __update_and_free_hugetlb_folio
                                     if (folio_test_hwpoison)
                                       folio_clear_hugetlb_hwpoison
    folio_free_raw_hwp                     folio_free_raw_hwp
    folio_test_clear_hwpoison

unpoison_memory and __update_and_free_hugetlb_folio can traverse and free
the raw_hwp_list at the same time. And I believe your patch will fix the
problem. Thanks.
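
Concretely, with the patched __folio_free_raw_hwp(), the interleaving
above becomes harmless: llist_del_all() boils down to an atomic xchg of
head->first, so only one of the two callers receives the entries and the
other gets NULL. A rough trace, assuming the patched code:

  unpoison_memory                    __update_and_free_hugetlb_folio
    folio_free_raw_hwp                 folio_free_raw_hwp
      llist_del_all() -> entries         llist_del_all() -> NULL
      kfree() each entry                  loop body never runs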

> other. Note this is guaranteed partly by the lock-free nature of llist,
> and partly by holding hugetlb_lock and/or mf_mutex. For example, since
> llist_del_all is lock-free with itself, the folio_clear_hugetlb_hwpoison()
> calls from __update_and_free_hugetlb_folio and memory_failure need no
> explicit locking when freeing the raw_hwp_list. New code that manipulates
> raw_hwp_list must take care to ensure concurrency correctness.
> 
> Acked-by: Mike Kravetz <mike.kravetz@oracle.com>
> Acked-by: Naoya Horiguchi <naoya.horiguchi@nec.com>
> Signed-off-by: Jiaqi Yan <jiaqiyan@google.com>

Anyway, this patch looks good to me.

Reviewed-by: Miaohe Lin <linmiaohe@huawei.com>
Thanks.

Patch

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index e245191e6b04..a08677dcf953 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -1829,12 +1829,11 @@ static inline struct llist_head *raw_hwp_list_head(struct folio *folio)
 
 static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 {
-	struct llist_head *head;
-	struct llist_node *t, *tnode;
+	struct llist_node *t, *tnode, *head;
 	unsigned long count = 0;
 
-	head = raw_hwp_list_head(folio);
-	llist_for_each_safe(tnode, t, head->first) {
+	head = llist_del_all(raw_hwp_list_head(folio));
+	llist_for_each_safe(tnode, t, head) {
 		struct raw_hwp_page *p = container_of(tnode, struct raw_hwp_page, node);
 
 		if (move_flag)
@@ -1844,7 +1843,6 @@ static unsigned long __folio_free_raw_hwp(struct folio *folio, bool move_flag)
 		kfree(p);
 		count++;
 	}
-	llist_del_all(head);
 	return count;
 }