[v2,0/8] fsdax,xfs: fix warning messages

Message ID: 1669908538-55-1-git-send-email-ruansy.fnst@fujitsu.com

Shiyang Ruan Dec. 1, 2022, 3:28 p.m. UTC
Changes since v1:
 1. Added a snippet of the warning message and some of the failed cases
 2. Split the patch up for easier review
 3. Added page->share and its helper functions
 4. Included the patch[1] that removes the restrictions on fsdax and reflink
[1] https://lore.kernel.org/linux-xfs/1663234002-17-1-git-send-email-ruansy.fnst@fujitsu.com/

Many test cases fail in dax+reflink mode with a warning message in dmesg,
for example generic/051, 075 and 127.  The warning message looks like this:
[  775.509337] ------------[ cut here ]------------
[  775.509636] WARNING: CPU: 1 PID: 16815 at fs/dax.c:386 dax_insert_entry.cold+0x2e/0x69
[  775.510151] Modules linked in: auth_rpcgss oid_registry nfsv4 algif_hash af_alg af_packet nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 ip_set nf_tables nfnetlink ip6table_filter ip6_tables iptable_filter ip_tables x_tables dax_pmem nd_pmem nd_btt sch_fq_codel configfs xfs libcrc32c fuse
[  775.524288] CPU: 1 PID: 16815 Comm: fsx Kdump: loaded Tainted: G        W          6.1.0-rc4+ #164 eb34e4ee4200c7cbbb47de2b1892c5a3e027fd6d
[  775.524904] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.16.0-3-3 04/01/2014
[  775.525460] RIP: 0010:dax_insert_entry.cold+0x2e/0x69
[  775.525797] Code: c7 c7 18 eb e0 81 48 89 4c 24 20 48 89 54 24 10 e8 73 6d ff ff 48 83 7d 18 00 48 8b 54 24 10 48 8b 4c 24 20 0f 84 e3 e9 b9 ff <0f> 0b e9 dc e9 b9 ff 48 c7 c6 a0 20 c3 81 48 c7 c7 f0 ea e0 81 48
[  775.526708] RSP: 0000:ffffc90001d57b30 EFLAGS: 00010082
[  775.527042] RAX: 000000000000002a RBX: 0000000000000000 RCX: 0000000000000042
[  775.527396] RDX: ffffea000a0f6c80 RSI: ffffffff81dfab1b RDI: 00000000ffffffff
[  775.527819] RBP: ffffea000a0f6c40 R08: 0000000000000000 R09: ffffffff820625e0
[  775.528241] R10: ffffc90001d579d8 R11: ffffffff820d2628 R12: ffff88815fc98320
[  775.528598] R13: ffffc90001d57c18 R14: 0000000000000000 R15: 0000000000000001
[  775.528997] FS:  00007f39fc75d740(0000) GS:ffff88817bc80000(0000) knlGS:0000000000000000
[  775.529474] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  775.529800] CR2: 00007f39fc772040 CR3: 0000000107eb6001 CR4: 00000000003706e0
[  775.530214] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  775.530592] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  775.531002] Call Trace:
[  775.531230]  <TASK>
[  775.531444]  dax_fault_iter+0x267/0x6c0
[  775.531719]  dax_iomap_pte_fault+0x198/0x3d0
[  775.532002]  __xfs_filemap_fault+0x24a/0x2d0 [xfs aa8d25411432b306d9554da38096f4ebb86bdfe7]
[  775.532603]  __do_fault+0x30/0x1e0
[  775.532903]  do_fault+0x314/0x6c0
[  775.533166]  __handle_mm_fault+0x646/0x1250
[  775.533480]  handle_mm_fault+0xc1/0x230
[  775.533810]  do_user_addr_fault+0x1ac/0x610
[  775.534110]  exc_page_fault+0x63/0x140
[  775.534389]  asm_exc_page_fault+0x22/0x30
[  775.534678] RIP: 0033:0x7f39fc55820a
[  775.534950] Code: 00 01 00 00 00 74 99 83 f9 c0 0f 87 7b fe ff ff c5 fe 6f 4e 20 48 29 fe 48 83 c7 3f 49 8d 0c 10 48 83 e7 c0 48 01 fe 48 29 f9 <f3> a4 c4 c1 7e 7f 00 c4 c1 7e 7f 48 20 c5 f8 77 c3 0f 1f 44 00 00
[  775.535839] RSP: 002b:00007ffc66a08118 EFLAGS: 00010202
[  775.536157] RAX: 00007f39fc772001 RBX: 0000000000042001 RCX: 00000000000063c1
[  775.536537] RDX: 0000000000006400 RSI: 00007f39fac42050 RDI: 00007f39fc772040
[  775.536919] RBP: 0000000000006400 R08: 00007f39fc772001 R09: 0000000000042000
[  775.537304] R10: 0000000000000001 R11: 0000000000000246 R12: 0000000000000001
[  775.537694] R13: 00007f39fc772000 R14: 0000000000006401 R15: 0000000000000003
[  775.538086]  </TASK>
[  775.538333] ---[ end trace 0000000000000000 ]---
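
When going through dmesg from a batch of test runs, the warning site can be pulled out of the WARNING line.  A minimal Python sketch, assuming the line layout shown in the trace above; parse_warning() and WARNING_RE are hypothetical names used only for illustration:

```python
import re

# Matches lines such as:
#   WARNING: CPU: 1 PID: 16815 at fs/dax.c:386 dax_insert_entry.cold+0x2e/0x69
WARNING_RE = re.compile(
    r"WARNING: CPU: (?P<cpu>\d+) PID: (?P<pid>\d+) "
    r"at (?P<file>\S+):(?P<line>\d+) (?P<symbol>[^\s+]+)\+"
)


def parse_warning(dmesg_line):
    """Return the warning site as a dict, or None if the line is not a WARNING."""
    m = WARNING_RE.search(dmesg_line)
    if m is None:
        return None
    return {
        "cpu": int(m["cpu"]),
        "pid": int(m["pid"]),
        "file": m["file"],
        "line": int(m["line"]),
        "symbol": m["symbol"],
    }


sample = (
    "[  775.509636] WARNING: CPU: 1 PID: 16815 at fs/dax.c:386 "
    "dax_insert_entry.cold+0x2e/0x69"
)
print(parse_warning(sample))
```

Grouping the results by file, line and symbol should make it easier to see whether different test cases hit the same check.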

This also affects dax+noreflink mode if the test is run after a
dax+reflink test.  So the most urgent thing is to get rid of the warning
messages.

With these fixes, most of the warning messages in dax_associate_entry()
are gone.  But honestly, generic/388 still fails randomly with the
warning.  That case shuts down the xfs filesystem while fsstress is
running, and does so many times.  I think the reason is that dax pages
still in use cannot be invalidated in time when the fs is shut down.  The
next time such a dax page is associated, it still holds the mapping value
set last time.  I'll keep working on it.
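
The suspected sequence can be illustrated with a toy model in Python.  This is not kernel code: Page, associate(), disassociate() and run() are hypothetical names, and the only behavior assumed is that associating a page warns when the page still carries a mapping from an earlier association.

```python
class Page:
    """Stand-in for struct page; only the field the check looks at."""

    def __init__(self):
        self.mapping = None


def associate(page, mapping, warnings):
    # Assumed check: a page being associated should carry no mapping.
    if page.mapping is not None:
        warnings.append(
            f"page still mapped to {page.mapping!r} while associating {mapping!r}"
        )
    page.mapping = mapping


def disassociate(page):
    # Invalidation clears the association so the page can be reused.
    page.mapping = None


def run(invalidated_in_time):
    """Associate a page, shut down, then associate the same page again."""
    warnings = []
    page = Page()
    associate(page, "mapping before shutdown", warnings)
    if invalidated_in_time:
        disassociate(page)
    associate(page, "mapping after remount", warnings)
    return warnings


print(run(invalidated_in_time=True))
print(run(invalidated_in_time=False))
```

In this model the warning appears only when the page is reused without having been invalidated first, which matches the guess above about pages that stay in use across a shutdown.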

The warning message in dax_writeback_one() is also fixed, thanks to
the dax unshare support.


Shiyang Ruan (8):
  fsdax: introduce page->share for fsdax in reflink mode
  fsdax: invalidate pages when CoW
  fsdax: zero the edges if source is HOLE or UNWRITTEN
  fsdax,xfs: set the shared flag when file extent is shared
  fsdax: dedupe: iter two files at the same time
  xfs: use dax ops for zero and truncate in fsdax mode
  fsdax,xfs: port unshare to fsdax
  xfs: remove restrictions for fsdax and reflink

 fs/dax.c                   | 220 +++++++++++++++++++++++++------------
 fs/xfs/xfs_ioctl.c         |   4 -
 fs/xfs/xfs_iomap.c         |   6 +-
 fs/xfs/xfs_iops.c          |   4 -
 fs/xfs/xfs_reflink.c       |   8 +-
 include/linux/dax.h        |   2 +
 include/linux/mm_types.h   |   5 +-
 include/linux/page-flags.h |   2 +-
 8 files changed, 166 insertions(+), 85 deletions(-)

Comments

Dan Williams Dec. 3, 2022, 1:21 a.m. UTC | #1
Shiyang Ruan wrote:
> Changes since v1:
>  1. Added a snippet of the warning message and some of the failed cases
>  2. Split the patch up for easier review
>  3. Added page->share and its helper functions
>  4. Included the patch[1] that removes the restrictions on fsdax and reflink
> [1] https://lore.kernel.org/linux-xfs/1663234002-17-1-git-send-email-ruansy.fnst@fujitsu.com/
> 
> Many test cases fail in dax+reflink mode with a warning message in dmesg,
> for example generic/051, 075 and 127.  The warning message looks like this:
> ...
> 
> This also affects dax+noreflink mode if the test is run after a
> dax+reflink test.  So the most urgent thing is to get rid of the warning
> messages.
> 
> With these fixes, most of the warning messages in dax_associate_entry()
> are gone.  But honestly, generic/388 still fails randomly with the
> warning.  That case shuts down the xfs filesystem while fsstress is
> running, and does so many times.  I think the reason is that dax pages
> still in use cannot be invalidated in time when the fs is shut down.  The
> next time such a dax page is associated, it still holds the mapping value
> set last time.  I'll keep working on it.

This one also sounds like it is going to be relevant for CXL PMEM, and
the improvements to the reference counting. CXL has a facility where the
driver asserts that no more writes are in-flight to the device so that
the device can assert a clean shutdown. Part of that will be making sure
that page access ends at fs shutdown.

Shiyang Ruan Dec. 29, 2022, 8:23 a.m. UTC | #2
On 2022/12/3 9:21, Dan Williams wrote:
> Shiyang Ruan wrote:
>> Changes since v1:
>>   1. Added a snippet of the warning message and some of the failed cases
>>   2. Split the patch up for easier review
>>   3. Added page->share and its helper functions
>>   4. Included the patch[1] that removes the restrictions on fsdax and reflink
>> [1] https://lore.kernel.org/linux-xfs/1663234002-17-1-git-send-email-ruansy.fnst@fujitsu.com/
>>
...
>>
>> This also affects dax+noreflink mode if the test is run after a
>> dax+reflink test.  So the most urgent thing is to get rid of the warning
>> messages.
>>
>> With these fixes, most of the warning messages in dax_associate_entry()
>> are gone.  But honestly, generic/388 still fails randomly with the
>> warning.  That case shuts down the xfs filesystem while fsstress is
>> running, and does so many times.  I think the reason is that dax pages
>> still in use cannot be invalidated in time when the fs is shut down.  The
>> next time such a dax page is associated, it still holds the mapping value
>> set last time.  I'll keep working on it.
> 
> This one also sounds like it is going to be relevant for CXL PMEM, and
> the improvements to the reference counting. CXL has a facility where the
> driver asserts that no more writes are in-flight to the device so that
> the device can assert a clean shutdown. Part of that will be making sure
> that page access ends at fs shutdown.

I was trying to locate the root cause of the failure in generic/388.  But
since it's an fsstress test, I can't replay the operation sequence to
narrow down the operations involved.  So I tried replacing fsstress with
fsx, which can replay after the case fails, but that doesn't reproduce
the failure.  I think another important factor is that fsstress tests
with multiple threads.  So for now it's hard for me to locate the cause
by running the test.

Then I updated the kernel to the latest v6.2-rc1 and ran generic/388
many times.  The warning no longer shows up in dmesg.

How does this case behave in your testing?  Does it still fail on the
latest kernel?  If so, I think I'll have to keep looking for the cause,
and would need your advice.


--
Thanks,
Ruan.
