Message ID | 20151120001813.27997.41722.stgit@dwillia2-desk3.jf.intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Nov 19, 2015 at 04:18:13PM -0800, Dan Williams wrote: > Neither the filemap_fault() nor the xfs dax fault path is updating time. > This call leads to the following WARN() when the block device has been > torn down: I don't think that is right. In xfs_filemap_fault(): .... /* DAX can shortcut the normal fault path on write faults! */ if ((vmf->flags & FAULT_FLAG_WRITE) && IS_DAX(inode)) return xfs_filemap_page_mkwrite(vma, vmf); .... And xfs_filemap_page_mkwrite() most definitely calls file_update_time(): .... trace_xfs_filemap_page_mkwrite(XFS_I(inode)); sb_start_pagefault(inode->i_sb); file_update_time(vma->vm_file); xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); .... And, finally, in xfs_filemap_pmd_fault(): .... if (flags & FAULT_FLAG_WRITE) { sb_start_pagefault(inode->i_sb); file_update_time(vma->vm_file); } .... So we are clearly updating timestamps in XFS on every write fault that occurs, whether it be through ->page_mkwrite, ->fault or ->pmd_fault. Hence removing those from ext4 can't be the righ tthing to do. > > WARNING: CPU: 0 PID: 2133 at fs/fs-writeback.c:2065 __mark_inode_dirty+0x261/0x350() > bdi-block not registered ^^^^^^^^^^^^^^^^^^^^^^^^ > [..] > Call Trace: > [<ffffffff81459f62>] dump_stack+0x44/0x62 > [<ffffffff810a2052>] warn_slowpath_common+0x82/0xc0 > [<ffffffff810a20ec>] warn_slowpath_fmt+0x5c/0x80 > [<ffffffff812831a1>] __mark_inode_dirty+0x261/0x350 > [<ffffffff8126d109>] generic_update_time+0x79/0xd0 > [<ffffffff8126d28d>] file_update_time+0xbd/0x110 > [<ffffffff812e4bc8>] ext4_dax_fault+0x68/0x110 > [<ffffffff811f816e>] __do_fault+0x4e/0xf0 > [<ffffffff811fc2a7>] handle_mm_fault+0x5e7/0x1b50 > [<ffffffff811fbd11>] ? handle_mm_fault+0x51/0x1b50 > [<ffffffff810689c1>] __do_page_fault+0x191/0x3f0 > [<ffffffff81068cef>] trace_do_page_fault+0x4f/0x120 > [<ffffffff8106314a>] do_async_page_fault+0x1a/0xa0 > [<ffffffff81902678>] async_page_fault+0x28/0x30 Doesn't this indicate some problem at the block/bdi level? __mark_inode_dirty() should not throw warnings like this regardless of where it is called from... Cheers, Dave.
On Thu, Nov 19, 2015 at 7:03 PM, Dave Chinner <david@fromorbit.com> wrote: > On Thu, Nov 19, 2015 at 04:18:13PM -0800, Dan Williams wrote: >> Neither the filemap_fault() nor the xfs dax fault path is updating time. >> This call leads to the following WARN() when the block device has been >> torn down: > > I don't think that is right. In xfs_filemap_fault(): > > > .... > /* DAX can shortcut the normal fault path on write faults! */ > if ((vmf->flags & FAULT_FLAG_WRITE) && IS_DAX(inode)) > return xfs_filemap_page_mkwrite(vma, vmf); > .... > > And xfs_filemap_page_mkwrite() most definitely calls file_update_time(): > > .... > trace_xfs_filemap_page_mkwrite(XFS_I(inode)); > > sb_start_pagefault(inode->i_sb); > file_update_time(vma->vm_file); > xfs_ilock(XFS_I(inode), XFS_MMAPLOCK_SHARED); > .... > > And, finally, in xfs_filemap_pmd_fault(): > > .... > if (flags & FAULT_FLAG_WRITE) { > sb_start_pagefault(inode->i_sb); > file_update_time(vma->vm_file); > } > .... > > So we are clearly updating timestamps in XFS on every write fault > that occurs, whether it be through ->page_mkwrite, ->fault or > ->pmd_fault. Hence removing those from ext4 can't be the righ tthing > to do. > Ah sorry I missed that. When I saw that xfs did not trigger the same warning as ext4 I just assumed it wasn't doing the time update. >> >> WARNING: CPU: 0 PID: 2133 at fs/fs-writeback.c:2065 __mark_inode_dirty+0x261/0x350() >> bdi-block not registered > ^^^^^^^^^^^^^^^^^^^^^^^^ >> [..] >> Call Trace: >> [<ffffffff81459f62>] dump_stack+0x44/0x62 >> [<ffffffff810a2052>] warn_slowpath_common+0x82/0xc0 >> [<ffffffff810a20ec>] warn_slowpath_fmt+0x5c/0x80 >> [<ffffffff812831a1>] __mark_inode_dirty+0x261/0x350 >> [<ffffffff8126d109>] generic_update_time+0x79/0xd0 >> [<ffffffff8126d28d>] file_update_time+0xbd/0x110 >> [<ffffffff812e4bc8>] ext4_dax_fault+0x68/0x110 >> [<ffffffff811f816e>] __do_fault+0x4e/0xf0 >> [<ffffffff811fc2a7>] handle_mm_fault+0x5e7/0x1b50 >> [<ffffffff811fbd11>] ? handle_mm_fault+0x51/0x1b50 >> [<ffffffff810689c1>] __do_page_fault+0x191/0x3f0 >> [<ffffffff81068cef>] trace_do_page_fault+0x4f/0x120 >> [<ffffffff8106314a>] do_async_page_fault+0x1a/0xa0 >> [<ffffffff81902678>] async_page_fault+0x28/0x30 > > Doesn't this indicate some problem at the block/bdi level? > __mark_inode_dirty() should not throw warnings like this regardless > of where it is called from... > I'll look closer at how the xfs path avoids triggering this... -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 19, 2015 at 08:39:06PM -0800, Dan Williams wrote: > On Thu, Nov 19, 2015 at 7:03 PM, Dave Chinner <david@fromorbit.com> wrote: > > On Thu, Nov 19, 2015 at 04:18:13PM -0800, Dan Williams wrote: > >> WARNING: CPU: 0 PID: 2133 at fs/fs-writeback.c:2065 __mark_inode_dirty+0x261/0x350() > >> bdi-block not registered > > ^^^^^^^^^^^^^^^^^^^^^^^^ > >> [..] > >> Call Trace: > >> [<ffffffff81459f62>] dump_stack+0x44/0x62 > >> [<ffffffff810a2052>] warn_slowpath_common+0x82/0xc0 > >> [<ffffffff810a20ec>] warn_slowpath_fmt+0x5c/0x80 > >> [<ffffffff812831a1>] __mark_inode_dirty+0x261/0x350 > >> [<ffffffff8126d109>] generic_update_time+0x79/0xd0 > >> [<ffffffff8126d28d>] file_update_time+0xbd/0x110 > >> [<ffffffff812e4bc8>] ext4_dax_fault+0x68/0x110 > >> [<ffffffff811f816e>] __do_fault+0x4e/0xf0 > >> [<ffffffff811fc2a7>] handle_mm_fault+0x5e7/0x1b50 > >> [<ffffffff811fbd11>] ? handle_mm_fault+0x51/0x1b50 > >> [<ffffffff810689c1>] __do_page_fault+0x191/0x3f0 > >> [<ffffffff81068cef>] trace_do_page_fault+0x4f/0x120 > >> [<ffffffff8106314a>] do_async_page_fault+0x1a/0xa0 > >> [<ffffffff81902678>] async_page_fault+0x28/0x30 > > > > Doesn't this indicate some problem at the block/bdi level? > > __mark_inode_dirty() should not throw warnings like this regardless > > of where it is called from... > > > > I'll look closer at how the xfs path avoids triggering this... XFS doesn't mark inodes dirty at the VFS level for inode metadata changes - dirty metadata is tracked by the journalling subsystem, not the VFS. Cheers, Dave.
On Fri 20-11-15 14:03:44, Dave Chinner wrote: > On Thu, Nov 19, 2015 at 04:18:13PM -0800, Dan Williams wrote: > > WARNING: CPU: 0 PID: 2133 at fs/fs-writeback.c:2065 __mark_inode_dirty+0x261/0x350() > > bdi-block not registered > ^^^^^^^^^^^^^^^^^^^^^^^^ > > [..] > > Call Trace: > > [<ffffffff81459f62>] dump_stack+0x44/0x62 > > [<ffffffff810a2052>] warn_slowpath_common+0x82/0xc0 > > [<ffffffff810a20ec>] warn_slowpath_fmt+0x5c/0x80 > > [<ffffffff812831a1>] __mark_inode_dirty+0x261/0x350 > > [<ffffffff8126d109>] generic_update_time+0x79/0xd0 > > [<ffffffff8126d28d>] file_update_time+0xbd/0x110 > > [<ffffffff812e4bc8>] ext4_dax_fault+0x68/0x110 > > [<ffffffff811f816e>] __do_fault+0x4e/0xf0 > > [<ffffffff811fc2a7>] handle_mm_fault+0x5e7/0x1b50 > > [<ffffffff811fbd11>] ? handle_mm_fault+0x51/0x1b50 > > [<ffffffff810689c1>] __do_page_fault+0x191/0x3f0 > > [<ffffffff81068cef>] trace_do_page_fault+0x4f/0x120 > > [<ffffffff8106314a>] do_async_page_fault+0x1a/0xa0 > > [<ffffffff81902678>] async_page_fault+0x28/0x30 > > Doesn't this indicate some problem at the block/bdi level? > __mark_inode_dirty() should not throw warnings like this regardless > of where it is called from... The warning warns that we are dirtying inodes against a backing device that advertises it does writeback but is not registered with writeback code (so flusher does not run for it). That makes sense but people seem to be experimenting a lot with removing devices under filesystems lately and so this warning triggers. Probably we have to refine it to not trigger for such cases... Honza
On Mon, Nov 23, 2015 at 03:15:56PM +0100, Jan Kara wrote: > On Fri 20-11-15 14:03:44, Dave Chinner wrote: > > On Thu, Nov 19, 2015 at 04:18:13PM -0800, Dan Williams wrote: > > > WARNING: CPU: 0 PID: 2133 at fs/fs-writeback.c:2065 __mark_inode_dirty+0x261/0x350() > > > bdi-block not registered > > ^^^^^^^^^^^^^^^^^^^^^^^^ > > > [..] > > > Call Trace: > > > [<ffffffff81459f62>] dump_stack+0x44/0x62 > > > [<ffffffff810a2052>] warn_slowpath_common+0x82/0xc0 > > > [<ffffffff810a20ec>] warn_slowpath_fmt+0x5c/0x80 > > > [<ffffffff812831a1>] __mark_inode_dirty+0x261/0x350 > > > [<ffffffff8126d109>] generic_update_time+0x79/0xd0 > > > [<ffffffff8126d28d>] file_update_time+0xbd/0x110 > > > [<ffffffff812e4bc8>] ext4_dax_fault+0x68/0x110 > > > [<ffffffff811f816e>] __do_fault+0x4e/0xf0 > > > [<ffffffff811fc2a7>] handle_mm_fault+0x5e7/0x1b50 > > > [<ffffffff811fbd11>] ? handle_mm_fault+0x51/0x1b50 > > > [<ffffffff810689c1>] __do_page_fault+0x191/0x3f0 > > > [<ffffffff81068cef>] trace_do_page_fault+0x4f/0x120 > > > [<ffffffff8106314a>] do_async_page_fault+0x1a/0xa0 > > > [<ffffffff81902678>] async_page_fault+0x28/0x30 > > > > Doesn't this indicate some problem at the block/bdi level? > > __mark_inode_dirty() should not throw warnings like this regardless > > of where it is called from... > > The warning warns that we are dirtying inodes against a backing device that > advertises it does writeback but is not registered with writeback code (so > flusher does not run for it). That makes sense but people seem to be > experimenting a lot with removing devices under filesystems lately and so > this warning triggers. Probably we have to refine it to not trigger for > such cases... Or, alternatively, have the block devices inform the filesystem that it no longer has a valid bdi backing it. That way the filesystem can error out at a higher layer and we never get to the point of trying to mark an inode dirty when the backing device has been unplugged... Cheers, Dave.
diff --git a/fs/ext4/file.c b/fs/ext4/file.c index 113837e7ba98..76854dd6dd7d 100644 --- a/fs/ext4/file.c +++ b/fs/ext4/file.c @@ -214,7 +214,6 @@ static int ext4_dax_fault(struct vm_area_struct *vma, struct vm_fault *vmf) if (write) { sb_start_pagefault(sb); - file_update_time(vma->vm_file); handle = ext4_journal_start_sb(sb, EXT4_HT_WRITE_PAGE, EXT4_DATA_TRANS_BLOCKS(sb)); } @@ -245,7 +244,6 @@ static int ext4_dax_pmd_fault(struct vm_area_struct *vma, unsigned long addr, if (write) { sb_start_pagefault(sb); - file_update_time(vma->vm_file); handle = ext4_journal_start_sb(sb, EXT4_HT_WRITE_PAGE, ext4_chunk_trans_blocks(inode, PMD_SIZE / PAGE_SIZE));
Neither the filemap_fault() nor the xfs dax fault path is updating time. This call leads to the following WARN() when the block device has been torn down: WARNING: CPU: 0 PID: 2133 at fs/fs-writeback.c:2065 __mark_inode_dirty+0x261/0x350() bdi-block not registered [..] Call Trace: [<ffffffff81459f62>] dump_stack+0x44/0x62 [<ffffffff810a2052>] warn_slowpath_common+0x82/0xc0 [<ffffffff810a20ec>] warn_slowpath_fmt+0x5c/0x80 [<ffffffff812831a1>] __mark_inode_dirty+0x261/0x350 [<ffffffff8126d109>] generic_update_time+0x79/0xd0 [<ffffffff8126d28d>] file_update_time+0xbd/0x110 [<ffffffff812e4bc8>] ext4_dax_fault+0x68/0x110 [<ffffffff811f816e>] __do_fault+0x4e/0xf0 [<ffffffff811fc2a7>] handle_mm_fault+0x5e7/0x1b50 [<ffffffff811fbd11>] ? handle_mm_fault+0x51/0x1b50 [<ffffffff810689c1>] __do_page_fault+0x191/0x3f0 [<ffffffff81068cef>] trace_do_page_fault+0x4f/0x120 [<ffffffff8106314a>] do_async_page_fault+0x1a/0xa0 [<ffffffff81902678>] async_page_fault+0x28/0x30 Cc: Jan Kara <jack@suse.com> Cc: Ted Ts'o <tytso@mit.edu> Cc: Andreas Dilger <adilger.kernel@dilger.ca> Signed-off-by: Dan Williams <dan.j.williams@intel.com> --- fs/ext4/file.c | 2 -- 1 file changed, 2 deletions(-) -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html