Message ID | 5cdda475417b2719dced162cce89a283153cb818.1466012020.git.osandov@fb.com (mailing list archive) |
---|---|
State | New, archived |
On Wed, Jun 15, 2016 at 10:42:05AM -0700, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Someone at Facebook reported that their coredumps were much faster when
> using a pipe helper than when dumping directly to a file, which doesn't
> make much sense. It turns out that this difference is because in
> do_coredump(), we truncate the core file and thus trigger the ext4
> auto_da_alloc heuristic. We can't use O_TRUNC because we might bail out
> of do_coredump() in certain conditions, so instead, avoid truncating
> when the file is already empty. In cases where we're actually
> overwriting a core file, this won't help, but the common case will be
> much better.
>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
> Hi, Al and Ted,
>
> This is probably the wrong solution to the problem I described in the
> commit message. Do you guys have any better ideas? Something like
> 0eab928221ba ("ext4: Don't treat a truncation of a zero-length file as
> replace-via-truncate") would also work, but that apparently wasn't
> right, as it was reverted in 5534fb5bb35a ("ext4: Fix the alloc on close
> after a truncate hueristic").
>
> Thanks.

Ping, any thoughts on this?

> fs/coredump.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index 281b768000e6..9da7357773f0 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -741,8 +741,10 @@ void do_coredump(const siginfo_t *siginfo)
>  			goto close_fail;
>  		if (!(cprm.file->f_mode & FMODE_CAN_WRITE))
>  			goto close_fail;
> -		if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
> -			goto close_fail;
> +		if (i_size_read(file_inode(cprm.file)) != 0) {
> +			if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
> +				goto close_fail;
> +		}
>  	}
>
>  	/* get us an unshared descriptor table; almost always a no-op */
> --
> 2.8.3
>
On 06/15/2016 01:42 PM, Omar Sandoval wrote:
> From: Omar Sandoval <osandov@fb.com>
>
> Someone at Facebook reported that their coredumps were much faster when
> using a pipe helper than when dumping directly to a file, which doesn't
> make much sense. It turns out that this difference is because in
> do_coredump(), we truncate the core file and thus trigger the ext4
> auto_da_alloc heuristic. We can't use O_TRUNC because we might bail out
> of do_coredump() in certain conditions, so instead, avoid truncating
> when the file is already empty. In cases where we're actually
> overwriting a core file, this won't help, but the common case will be
> much better.
>
> Signed-off-by: Omar Sandoval <osandov@fb.com>
> ---
> Hi, Al and Ted,
>
> This is probably the wrong solution to the problem I described in the
> commit message. Do you guys have any better ideas? Something like
> 0eab928221ba ("ext4: Don't treat a truncation of a zero-length file as
> replace-via-truncate") would also work, but that apparently wasn't
> right, as it was reverted in 5534fb5bb35a ("ext4: Fix the alloc on close
> after a truncate hueristic").
>
> Thanks.
>
> fs/coredump.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/fs/coredump.c b/fs/coredump.c
> index 281b768000e6..9da7357773f0 100644
> --- a/fs/coredump.c
> +++ b/fs/coredump.c
> @@ -741,8 +741,10 @@ void do_coredump(const siginfo_t *siginfo)
>  			goto close_fail;
>  		if (!(cprm.file->f_mode & FMODE_CAN_WRITE))
>  			goto close_fail;
> -		if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
> -			goto close_fail;
> +		if (i_size_read(file_inode(cprm.file)) != 0) {
> +			if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
> +				goto close_fail;
> +		}
>  	}
>
>  	/* get us an unshared descriptor table; almost always a no-op */
>

Omar, this probably breaks the case where we do
fallocate(FALLOC_FL_KEEP_SIZE), the i_size will be 0 but there will be
blocks to truncate. Probably want to check i_blocks or something.
Thanks,

Josef
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
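The case Josef describes can be observed from userspace. The following is a minimal sketch, not kernel code: it preallocates space with fallocate(FALLOC_FL_KEEP_SIZE) on a scratch file and reports what fstat() sees. The helper name probe_keep_size() and the /tmp path are assumptions made for illustration. On a filesystem that supports preallocation, st_size should stay 0 while st_blocks becomes nonzero, which is the state an i_size-only check would not notice.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for fallocate() and FALLOC_FL_KEEP_SIZE */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: preallocate `len` bytes on a fresh scratch file
 * without changing its size, then report the size and block count.
 * Returns 0 on success, -1 if the file could not be created or stat'ed.
 * A failing fallocate() (for example EOPNOTSUPP) is not treated as fatal;
 * the file then simply has no preallocated blocks.
 */
int probe_keep_size(off_t len, off_t *size, long long *blocks)
{
	char path[] = "/tmp/keep_size_XXXXXX";
	struct stat st;
	int fd, ret = -1;

	assert(len > 0);

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* scratch file disappears once fd is closed */

	/* Allocate blocks beyond EOF; i_size is left untouched. */
	(void)fallocate(fd, FALLOC_FL_KEEP_SIZE, 0, len);

	if (fstat(fd, &st) == 0) {
		*size = st.st_size;
		*blocks = (long long)st.st_blocks;
		ret = 0;
	}
	close(fd);
	return ret;
}
```

Calling probe_keep_size(1 << 20, &size, &blocks) and printing both values shows whether the filesystem under /tmp reserved blocks for a file that still reports a size of zero.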
On Tue, Jul 05, 2016 at 09:42:13AM -0400, Josef Bacik wrote:
> > diff --git a/fs/coredump.c b/fs/coredump.c
> > index 281b768000e6..9da7357773f0 100644
> > --- a/fs/coredump.c
> > +++ b/fs/coredump.c
> > @@ -741,8 +741,10 @@ void do_coredump(const siginfo_t *siginfo)
> >  			goto close_fail;
> >  		if (!(cprm.file->f_mode & FMODE_CAN_WRITE))
> >  			goto close_fail;
> > -		if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
> > -			goto close_fail;
> > +		if (i_size_read(file_inode(cprm.file)) != 0) {
> > +			if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
> > +				goto close_fail;
> > +		}
> >  	}
> >
> >  	/* get us an unshared descriptor table; almost always a no-op */
> >
>
> Omar, this probably breaks the case where we do
> fallocate(FALLOC_FL_KEEP_SIZE), the i_size will be 0 but there will be
> blocks to truncate. Probably want to check i_blocks or something. Thanks,

Sure, but this is in the coredump code; do we care there? What are
the odds that someone will have fallocated blocks beyond i_size in a
file named "core"? And if so, it's not like it's going to make the
coredump invalid or non-useful in any way.

- Ted
On 07/05/2016 10:37 AM, Theodore Ts'o wrote:
> On Tue, Jul 05, 2016 at 09:42:13AM -0400, Josef Bacik wrote:
>>> diff --git a/fs/coredump.c b/fs/coredump.c
>>> index 281b768000e6..9da7357773f0 100644
>>> --- a/fs/coredump.c
>>> +++ b/fs/coredump.c
>>> @@ -741,8 +741,10 @@ void do_coredump(const siginfo_t *siginfo)
>>>  			goto close_fail;
>>>  		if (!(cprm.file->f_mode & FMODE_CAN_WRITE))
>>>  			goto close_fail;
>>> -		if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
>>> -			goto close_fail;
>>> +		if (i_size_read(file_inode(cprm.file)) != 0) {
>>> +			if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
>>> +				goto close_fail;
>>> +		}
>>>  	}
>>>
>>>  	/* get us an unshared descriptor table; almost always a no-op */
>>>
>>
>> Omar, this probably breaks the case where we do
>> fallocate(FALLOC_FL_KEEP_SIZE), the i_size will be 0 but there will be
>> blocks to truncate. Probably want to check i_blocks or something. Thanks,
>
> Sure, but this is in the coredump code; do we care there? What are
> the odds that someone will have fallocated blocks beyond i_size in a
> file named "core"? And if so, it's not like it's going to make the
> coredump invalid or non-useful in any way.

Wow I totally didn't notice this was in coredump.c, I thought it was in ext4
code because you said it failed regression tests, which I assumed were your
ext4 tests. Ignore me. Thanks,

Josef
On Tue, Jul 05, 2016 at 11:01:40AM -0400, Josef Bacik wrote:
> > > Omar, this probably breaks the case where we do
> > > fallocate(FALLOC_FL_KEEP_SIZE), the i_size will be 0 but there will be
> > > blocks to truncate. Probably want to check i_blocks or something. Thanks,
> >
> > Sure, but this is in the coredump code; do we care there? What are
> > the odds that someone will have fallocated blocks beyond i_size in a
> > file named "core"? And if so, it's not like it's going to make the
> > coredump invalid or non-useful in any way.
>
> Wow I totally didn't notice this was in coredump.c, I thought it was in ext4
> code because you said it failed regression tests, which I assumed were your
> ext4 tests. Ignore me. Thanks,

Yeah, Omar's original patch was something he described as a "hack" to
the coredump code. I actually don't think it's that bad, but it does
make sense to have ext4 not enable the "replace-via-truncate" code
when the truncate is a no-op, but it turns out this is a bit tricky
because the places where we set i_size and where we decide to truncate
beyond i_size are separated. I tried to do something simple but it
didn't quite work right; I'll look into why it didn't work hopefully
later today.

- Ted
diff --git a/fs/coredump.c b/fs/coredump.c
index 281b768000e6..9da7357773f0 100644
--- a/fs/coredump.c
+++ b/fs/coredump.c
@@ -741,8 +741,10 @@ void do_coredump(const siginfo_t *siginfo)
 			goto close_fail;
 		if (!(cprm.file->f_mode & FMODE_CAN_WRITE))
 			goto close_fail;
-		if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
-			goto close_fail;
+		if (i_size_read(file_inode(cprm.file)) != 0) {
+			if (do_truncate(cprm.file->f_path.dentry, 0, 0, cprm.file))
+				goto close_fail;
+		}
 	}
 
 	/* get us an unshared descriptor table; almost always a no-op */
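The idea behind the patch, skipping the truncate when the file is already empty, can be sketched in userspace as follows. This is only an analogue under stated assumptions: fstat() and ftruncate() stand in for i_size_read() and do_truncate(), and the names truncate_if_nonempty() and demo_truncate_if_nonempty() are hypothetical. Like the patch, it looks only at the file size, so it does not detect blocks preallocated beyond EOF.

```c
#include <assert.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Userspace analogue of the patched check (hypothetical helper):
 * truncate only when the file is non-empty.
 * Returns 1 if the file was truncated, 0 if the truncate was skipped
 * because the size was already zero, and -1 on error.
 */
int truncate_if_nonempty(int fd)
{
	struct stat st;

	assert(fd >= 0);

	if (fstat(fd, &st) != 0)
		return -1;
	if (st.st_size == 0)
		return 0;	/* already empty: no truncate issued */
	if (ftruncate(fd, 0) != 0)
		return -1;
	return 1;
}

/*
 * Exercise both paths on a scratch file under /tmp (an assumed location).
 * Returns 0 if the helper behaved as described, -1 otherwise.
 */
int demo_truncate_if_nonempty(void)
{
	char path[] = "/tmp/core_demo_XXXXXX";
	struct stat st;
	int fd, ok;

	fd = mkstemp(path);
	if (fd < 0)
		return -1;
	unlink(path);	/* scratch file disappears once fd is closed */

	ok = truncate_if_nonempty(fd) == 0;		/* empty: skipped */
	ok = ok && write(fd, "core", 4) == 4;
	ok = ok && truncate_if_nonempty(fd) == 1;	/* non-empty: truncated */
	ok = ok && fstat(fd, &st) == 0 && st.st_size == 0;

	close(fd);
	return ok ? 0 : -1;
}
```

The three-way return value is a choice made for the sketch so a caller can tell a skipped truncate from a performed one; the kernel patch itself only distinguishes success from failure.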