[4.19.y] fs/iomap: use consistent gfp flags during xfs readpages

Message ID 20241210094617.66152-1-husong@kylinos.cn (mailing list archive)
State New

Commit Message

Hu Song Dec. 10, 2024, 9:46 a.m. UTC
In low-memory situations (specifically in Docker containers), the
xfs_vm_readpages path might declare a memcg OOM during a filesystem
page fault and kill applications.

This patch extends commit 8a5c743e308d ("mm, memcg: use consistent
gfp flags during readahead") to cover XFS by making its readahead path
use readahead_gfp_mask. Specifically, the gfp_mask logic in
xfs_vm_readpages and related functions (iomap_next_page in fs/iomap.c)
is now aligned with readahead_gfp_mask, which adds
__GFP_NORETRY|__GFP_NOWARN, so that readahead behaves consistently.
With __GFP_NORETRY set, a readahead charge that cannot be satisfied
fails instead of triggering the memcg OOM killer. This prevents
potential OOMs caused by discrepancies in gfp_mask handling.
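
The mask composition described above can be sketched in userspace C. This
is only a model, not kernel code: the sk_ names are hypothetical, and the
bit values are assumptions inferred from the masks logged in this message
(4.19.y), not copied from kernel headers. It assumes readahead_gfp_mask()
ORs __GFP_NORETRY and __GFP_NOWARN into the mapping's mask, which matches
the flags that appear in the "After the fix" output.

```c
#include <stdint.h>

/* Hypothetical stand-in for the kernel's gfp_t. */
typedef uint32_t sk_gfp_t;

/* Assumed bit values, inferred from the masks logged in this message. */
#define SK_GFP_NOWARN   0x000200u
#define SK_GFP_NORETRY  0x001000u
#define SK_GFP_NOFS     0x600040u  /* value printed on the oom-killer line */

/* Models readahead_gfp_mask(): base mask plus "do not retry, do not warn". */
static sk_gfp_t sk_readahead_gfp_mask(sk_gfp_t mapping_mask)
{
	return mapping_mask | SK_GFP_NORETRY | SK_GFP_NOWARN;
}

/* Models the expression the patch passes to add_to_page_cache_lru(). */
static sk_gfp_t sk_patched_mask(sk_gfp_t mapping_mask)
{
	return sk_readahead_gfp_mask(mapping_mask) | SK_GFP_NOFS;
}
```

Feeding the logged "before" mask 0x62004a into either helper yields the
logged "after" mask 0x62124a, so under these assumptions the two masks
differ only by the two readahead flags.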

Test Results:
Run a Docker container with a 100 MiB memory limit:
docker container run --name wget.100m.ky -d
	--memory 104857600 --memory-swap 104857600;
Inside the container, download a large (2G) test file:
wget http://172.17.0.1/testfile

Before the fix:
A debug printk of the parameters of try_to_free_mem_cgroup_pages shows
gfp_mask=0x62004a
(GFP_NOFS|__GFP_HIGHMEM|__GFP_HARDWALL|__GFP_MOVABLE)
and a return value of nr_reclaimed: 0

[  153.390196] CPU: 1 PID: 5405 Comm: wget Kdump: loaded Not tainted 4.19.90-25+ #24
[  153.390197] Hardware name: American Megatrends Inc. To be filled by O.E.M./To be filled by O.E.M., BIOS ITSW3001 09/14/2020
[  153.390197] Call Trace:
[  153.390199]  dump_stack+0x64/0x88
[  153.390200]  try_to_free_mem_cgroup_pages.cold+0x30/0x3e
[  153.390201]  try_charge+0x2d9/0x7a0
[  153.390202]  ? memcg_check_events+0xdd/0x250
[  153.390203]  mem_cgroup_try_charge+0x8b/0x180
[  153.390204]  __add_to_page_cache_locked+0x64/0x240
[  153.390205]  add_to_page_cache_lru+0x48/0xe0
[  153.390206]  iomap_readpages_actor+0x10e/0x240
[  153.390207]  iomap_apply+0xc3/0x130
[  153.390208]  ? iomap_write_begin.constprop.0+0x310/0x310
[  153.390209]  iomap_readpages+0xa4/0x190
[  153.390210]  ? iomap_write_begin.constprop.0+0x310/0x310
[  153.390211]  read_pages.isra.0+0x72/0x190
[  153.390212]  __do_page_cache_readahead+0x1b2/0x1d0
[  153.390214]  filemap_fault+0x2d6/0x570
[  153.390235]  __xfs_filemap_fault+0x6b/0x200 [xfs]
[  153.390236]  __do_fault+0x38/0x120
[  153.390237]  do_fault+0x119/0x3e0
[  153.390238]  __handle_mm_fault+0x455/0x5d0
[  153.390239]  handle_mm_fault+0x90/0x1b0
[  153.390240]  __do_page_fault+0x2ea/0x540
[  153.390242]  do_page_fault+0x33/0x120
[  153.390243]  ? page_fault+0x8/0x30
[  153.390243]  page_fault+0x1e/0x30
[  153.390244] RIP: 0033:0x7f5404794514
[  153.390246] Code: Bad RIP value.
[  153.390246] RSP: 002b:00007fff244f0728 EFLAGS: 00010246
[  153.390246] RAX: 0000000000001000 RBX: 0000000000001000 RCX: 00007f5404794514
[  153.390247] RDX: 0000000000001000 RSI: 000055ef7f87e640 RDI: 0000000000000004
[  153.390247] RBP: 000055ef7f87e640 R08: 0000000000000000 R09: 000055ef7f87e670
[  153.390248] R10: 000055ef7f87e620 R11: 0000000000000246 R12: 000055ef7f879d80
[  153.390248] R13: 0000000000001000 R14: 00007f540485d7c0 R15: 0000000000001000
[  153.390257] wget invoked oom-killer: gfp_mask=0x600040(GFP_NOFS), nodemask=(null), order=0, oom_score_adj=0
[  153.390257] wget cpuset=bae816dd30bd6e193684d5580f57fd54df29c0a695dec5b7606931d248c18dd2 mems_allowed=0

wget downloads a 2G file and is OOM-killed almost every time.

After the fix:
A debug printk of the parameters of try_to_free_mem_cgroup_pages shows
gfp_mask=0x62124a
(GFP_NOFS|__GFP_HIGHMEM|__GFP_NOWARN|__GFP_NORETRY|
__GFP_HARDWALL|__GFP_MOVABLE)
and a return value of nr_reclaimed: 55
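
To make the difference between the two logged masks easier to see, here is
a small userspace sketch that names the flags present in one mask but not
in the other. The table and the sk_ helper are hypothetical, and the bit
values are assumptions inferred from the decoded masks in this message
rather than taken from kernel headers.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct sk_flag {
	uint32_t bit;
	const char *name;
};

/* Assumed 4.19.y bit values, inferred from the logged masks. */
static const struct sk_flag sk_flags[] = {
	{ 0x000002u, "__GFP_HIGHMEM" },
	{ 0x000008u, "__GFP_MOVABLE" },
	{ 0x000200u, "__GFP_NOWARN" },
	{ 0x001000u, "__GFP_NORETRY" },
	{ 0x020000u, "__GFP_HARDWALL" },
};

/*
 * Writes the '|'-separated names of the table flags that are set in
 * "after" but not in "before" into out. len must be at least 1.
 */
static void sk_flags_added(uint32_t before, uint32_t after,
			   char *out, size_t len)
{
	uint32_t added = after & ~before;
	size_t i;

	out[0] = '\0';
	for (i = 0; i < sizeof(sk_flags) / sizeof(sk_flags[0]); i++) {
		if (!(added & sk_flags[i].bit))
			continue;
		if (out[0] != '\0')
			strncat(out, "|", len - strlen(out) - 1);
		strncat(out, sk_flags[i].name, len - strlen(out) - 1);
	}
}
```

Calling sk_flags_added(0x62004a, 0x62124a, buf, sizeof(buf)) with a
64-byte buffer leaves "__GFP_NOWARN|__GFP_NORETRY" in buf, the two flags
that the fix adds to the reclaim mask.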

[  196.970857] CPU: 9 PID: 5326 Comm: wget Kdump: loaded Not tainted 4.19.90-25+ #23
[  196.970858] Hardware name: American Megatrends Inc. To be filled by O.E.M./To be filled by O.E.M., BIOS ITSW3001 09/14/2020
[  196.970858] Call Trace:
[  196.970860]  dump_stack+0x64/0x88
[  196.970861]  try_to_free_mem_cgroup_pages.cold+0x30/0x3e
[  196.970862]  try_charge+0x2d9/0x7a0
[  196.970863]  ? memcg_check_events+0xdd/0x250
[  196.970865]  mem_cgroup_try_charge+0x8b/0x180
[  196.970865]  __add_to_page_cache_locked+0x64/0x240
[  196.970866]  add_to_page_cache_lru+0x48/0xe0
[  196.970868]  iomap_readpages_actor+0x125/0x250
[  196.970869]  iomap_apply+0xc3/0x130
[  196.970870]  ? iomap_write_begin.constprop.0+0x310/0x310
[  196.970871]  iomap_readpages+0xa4/0x190
[  196.970872]  ? iomap_write_begin.constprop.0+0x310/0x310
[  196.970873]  read_pages.isra.0+0x72/0x190
[  196.970875]  __do_page_cache_readahead+0x160/0x1d0
[  196.970876]  filemap_fault+0x2d6/0x570
[  196.970897]  __xfs_filemap_fault+0x6b/0x200 [xfs]
[  196.970899]  __do_fault+0x38/0x120
[  196.970900]  do_fault+0x119/0x3e0
[  196.970901]  __handle_mm_fault+0x455/0x5d0
[  196.970903]  handle_mm_fault+0x90/0x1b0
[  196.970905]  __do_page_fault+0x2ea/0x540
[  196.970906]  do_page_fault+0x33/0x120
[  196.970907]  ? page_fault+0x8/0x30
[  196.970908]  page_fault+0x1e/0x30
[  196.970909] RIP: 0033:0x7fed5d34b340
[  196.970911] Code: Bad RIP value.
[  196.970912] RSP: 002b:00007ffcf231fd68 EFLAGS: 00010246
[  196.970913] RAX: 0000000000000000 RBX: 000055f860649030 RCX: 00000000061a9000
[  196.970913] RDX: 000055f860664980 RSI: 0000000000000000 RDI: 000055f860649030
[  196.970913] RBP: 000000000000003b R08: 7fffffffffffffff R09: 7ffffffff9e58fff
[  196.970914] R10: 000055f860667620 R11: 0000000000000246 R12: 00000000061a9000
[  196.970914] R13: 0000000000000000 R14: 000055f860664b50 R15: 000055f860664980

wget downloads a 2G file; in 500 test runs it was never killed.

Fixes: 8a5c743e308d ("mm, memcg: use consistent gfp flags during readahead")
Signed-off-by: Hu Song <husong@kylinos.cn>
---
 fs/iomap.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

Patch

diff --git a/fs/iomap.c b/fs/iomap.c
index 04e82b6bd9bf..a34e4ec874f0 100644
--- a/fs/iomap.c
+++ b/fs/iomap.c
@@ -424,6 +424,7 @@  static struct page *
 iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos,
 		loff_t length, loff_t *done)
 {
+	gfp_t gfp_mask = readahead_gfp_mask(inode->i_mapping);
 	while (!list_empty(pages)) {
 		struct page *page = lru_to_page(pages);
 
@@ -432,7 +433,7 @@  iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos,
 
 		list_del(&page->lru);
 		if (!add_to_page_cache_lru(page, inode->i_mapping, page->index,
-				GFP_NOFS))
+				gfp_mask | GFP_NOFS))
 			return page;
 
 		/*