Message ID | 20201215121414.253660-1-ruansy.fnst@cn.fujitsu.com (mailing list archive) |
---|---|
Series | fsdax: introduce fs query to support reflink |

Hi, Shiyang,

On 12/15/2020 4:14 AM, Shiyang Ruan wrote:
> The call trace is like this:
> memory_failure()
>  pgmap->ops->memory_failure() => pmem_pgmap_memory_failure()
>   gendisk->fops->corrupted_range() => - pmem_corrupted_range()
>                                       - md_blk_corrupted_range()
>    sb->s_ops->currupted_range() => xfs_fs_corrupted_range()
>     xfs_rmap_query_range()
>      xfs_currupt_helper()
>       * corrupted on metadata
>           try to recover data, call xfs_force_shutdown()
>       * corrupted on file data
>           try to recover data, call mf_dax_mapping_kill_procs()
>
> The fsdax & reflink support for XFS is not contained in this patchset.
>
> (Rebased on v5.10)

So I tried the patchset with pmem error injection, the SIGBUS payload
does not look right -

** SIGBUS(7): **
** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **

I expect the payload looks like

** si_addr(0x7f3672e00000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **

thanks,
-jane

On 2020/12/17 4:55 AM, Jane Chu wrote:
> Hi, Shiyang,
>
> On 12/15/2020 4:14 AM, Shiyang Ruan wrote:
>> The call trace is like this:
>> [...]
>
> So I tried the patchset with pmem error injection, the SIGBUS payload
> does not look right -
>
> ** SIGBUS(7): **
> ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **
>
> I expect the payload looks like
>
> ** si_addr(0x7f3672e00000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **

Thanks for testing. I test the SIGBUS by writing a program which calls
madvise(..., MADV_HWPOISON) to inject memory-failure. It just shows that
the program is killed by SIGBUS. I cannot get any detail from it. So,
could you please show me the right way (test tools) to test it?

--
Thanks,
Ruan Shiyang.

On Fri, Dec 18, 2020 at 10:44:26AM +0800, Ruan Shiyang wrote:
> On 2020/12/17 4:55 AM, Jane Chu wrote:
> > Hi, Shiyang,
> >
> > On 12/15/2020 4:14 AM, Shiyang Ruan wrote:
> > > The call trace is like this:
> > > [...]
> >
> > So I tried the patchset with pmem error injection, the SIGBUS payload
> > does not look right -
> >
> > ** SIGBUS(7): **
> > ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **
> >
> > I expect the payload looks like
> >
> > ** si_addr(0x7f3672e00000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **
>
> Thanks for testing. I test the SIGBUS by writing a program which calls
> madvise(..., MADV_HWPOISON) to inject memory-failure. It just shows that
> the program is killed by SIGBUS. I cannot get any detail from it. So,
> could you please show me the right way (test tools) to test it?

I'm assuming that Jane is using a program that calls sigaction to
install a SIGBUS handler, and dumps the entire siginfo_t structure
whenever it receives one...

--D

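For readers following along, the kind of test program Darrick describes might look roughly like the sketch below. It is not Jane's actual tool: the helper names (`bus_code_name`, `format_sigbus_info`, `install_sigbus_dumper`) and the output layout are assumptions made for illustration, loosely mirroring the payload lines quoted in this thread. It assumes Linux with glibc, where `siginfo_t` exposes `si_addr_lsb`.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <signal.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

/* Map SIGBUS si_code values to their symbolic names. */
const char *bus_code_name(int code)
{
    switch (code) {
    case BUS_ADRALN:    return "BUS_ADRALN";
    case BUS_ADRERR:    return "BUS_ADRERR";
    case BUS_OBJERR:    return "BUS_OBJERR";
    case BUS_MCEERR_AR: return "BUS_MCEERR_AR";
    case BUS_MCEERR_AO: return "BUS_MCEERR_AO";
    default:            return "unknown";
    }
}

/* Hypothetical helper: format the fields of interest into buf.
 * Returns the value of snprintf(). */
int format_sigbus_info(const siginfo_t *si, char *buf, size_t len)
{
    return snprintf(buf, len,
                    "si_addr(0x%lx), si_lsb(0x%X), si_code(0x%x, %s)\n",
                    (unsigned long)(uintptr_t)si->si_addr,
                    (unsigned int)(unsigned short)si->si_addr_lsb,
                    (unsigned int)si->si_code,
                    bus_code_name(si->si_code));
}

/* SIGBUS handler: dump the payload to stderr and exit.
 * snprintf() is not async-signal-safe; that is tolerated here only
 * because this is a throwaway test tool that exits immediately. */
void sigbus_handler(int sig, siginfo_t *si, void *ucontext)
{
    char buf[128];
    int n;
    ssize_t rc;

    (void)sig;
    (void)ucontext;
    n = format_sigbus_info(si, buf, sizeof(buf));
    if (n > 0) {
        size_t out = (size_t)n < sizeof(buf) ? (size_t)n : sizeof(buf) - 1;
        rc = write(STDERR_FILENO, buf, out);
        (void)rc;
    }
    /* Returning would retry the faulting access, so exit instead. */
    _exit(1);
}

/* Install the handler with SA_SIGINFO so siginfo_t is delivered.
 * Returns 0 on success, -1 on failure (as sigaction() does). */
int install_sigbus_dumper(void)
{
    struct sigaction sa;

    memset(&sa, 0, sizeof(sa));
    sa.sa_sigaction = sigbus_handler;
    sa.sa_flags = SA_SIGINFO;
    sigemptyset(&sa.sa_mask);
    return sigaction(SIGBUS, &sa, NULL);
}
```

A test program would call `install_sigbus_dumper()` early in `main()`, map a file on the fsdax mount, inject the error, and then touch the poisoned range. The formatting is kept in a separate function so it can be checked without actually raising SIGBUS.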
On 2020/12/18 11:49 AM, Darrick J. Wong wrote:
> On Fri, Dec 18, 2020 at 10:44:26AM +0800, Ruan Shiyang wrote:
>> On 2020/12/17 4:55 AM, Jane Chu wrote:
>>> Hi, Shiyang,
>>>
>>> On 12/15/2020 4:14 AM, Shiyang Ruan wrote:
>>>> The call trace is like this:
>>>> [...]
>>>
>>> So I tried the patchset with pmem error injection, the SIGBUS payload
>>> does not look right -
>>>
>>> ** SIGBUS(7): **
>>> ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **
>>>
>>> I expect the payload looks like
>>>
>>> ** si_addr(0x7f3672e00000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **
>>
>> Thanks for testing. I test the SIGBUS by writing a program which calls
>> madvise(..., MADV_HWPOISON) to inject memory-failure. It just shows that
>> the program is killed by SIGBUS. I cannot get any detail from it. So,
>> could you please show me the right way (test tools) to test it?
>
> I'm assuming that Jane is using a program that calls sigaction to
> install a SIGBUS handler, and dumps the entire siginfo_t structure
> whenever it receives one...

OK. Let me try it and figure out what's wrong in it.

--
Thanks,
Ruan Shiyang.

Hi, Shiyang,

On 12/18/2020 1:13 AM, Ruan Shiyang wrote:
>>>> So I tried the patchset with pmem error injection, the SIGBUS payload
>>>> does not look right -
>>>>
>>>> ** SIGBUS(7): **
>>>> ** si_addr(0x(nil)), si_lsb(0xC), si_code(0x4, BUS_MCEERR_AR) **
>>>>
>>>> I expect the payload looks like
>>>>
>>>> ** si_addr(0x7f3672e00000), si_lsb(0x15), si_code(0x4, BUS_MCEERR_AR) **
>>>
>>> Thanks for testing. I test the SIGBUS by writing a program which calls
>>> madvise(..., MADV_HWPOISON) to inject memory-failure. It just shows that
>>> the program is killed by SIGBUS. I cannot get any detail from it. So,
>>> could you please show me the right way (test tools) to test it?
>>
>> I'm assuming that Jane is using a program that calls sigaction to
>> install a SIGBUS handler, and dumps the entire siginfo_t structure
>> whenever it receives one...

Yes, thanks Darrick.

> OK. Let me try it and figure out what's wrong in it.

I injected poison via "ndctl inject-error", though I would not expect
that to make any difference.

Any luck?

thanks,
-jane