Message ID | 1444653923-22111-1-git-send-email-jack@suse.com (mailing list archive) |
---|---|
State | New, archived |
On Mon, 12 Oct 2015 14:45:23 +0200 Jan Kara <jack@suse.com> wrote:

> Currently a simple program below issues a sendfile(2) system call which
> takes about 62 days to complete in my test KVM instance.

Geeze some people are impatient.

> 	int fd;
> 	off_t off = 0;
>
> 	fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
> 	ftruncate(fd, 2);
> 	lseek(fd, 0, SEEK_END);
> 	sendfile(fd, fd, &off, 0xfffffff);
>
> Now you should not ask kernel to do a stupid stuff like copying 256MB in
> 2-byte chunks and call fsync(2) after each chunk but if you do, sysadmin
> should have a way to stop you.
>
> We actually do have a check for fatal_signal_pending() in
> generic_perform_write() which triggers in this path however because we
> always succeed in writing something before the check is done, we return
> value > 0 from generic_perform_write() and thus the information about
> signal gets lost.

ah.

> Fix the problem by doing the signal check before writing anything. That
> way generic_perform_write() returns -EINTR, the error gets propagated up
> and the sendfile loop terminates early.
>
> ...
>
> --- a/mm/filemap.c
> +++ b/mm/filemap.c
> @@ -2488,6 +2488,11 @@ again:
>  			break;
>  		}
>
> +		if (fatal_signal_pending(current)) {
> +			status = -EINTR;
> +			break;
> +		}
> +
>  		status = a_ops->write_begin(file, mapping, pos, bytes, flags,
>  						&page, &fsdata);
>  		if (unlikely(status < 0))
> @@ -2525,10 +2530,6 @@ again:
>  		written += copied;
>
>  		balance_dirty_pages_ratelimited(mapping);
> -		if (fatal_signal_pending(current)) {
> -			status = -EINTR;
> -			break;
> -		}
>  	} while (iov_iter_count(i));
>
>  	return written ? written : status;

This won't work, will it? If user hits ^C after we've written a few
pages, `written' is non-zero and the same thing happens?

--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu 15-10-15 13:46:44, Andrew Morton wrote:
> On Mon, 12 Oct 2015 14:45:23 +0200 Jan Kara <jack@suse.com> wrote:
>
> > Currently a simple program below issues a sendfile(2) system call which
> > takes about 62 days to complete in my test KVM instance.
>
> Geeze some people are impatient.
>
> > 	int fd;
> > 	off_t off = 0;
> >
> > 	fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
> > 	ftruncate(fd, 2);
> > 	lseek(fd, 0, SEEK_END);
> > 	sendfile(fd, fd, &off, 0xfffffff);
> >
> > Now you should not ask kernel to do a stupid stuff like copying 256MB in
> > 2-byte chunks and call fsync(2) after each chunk but if you do, sysadmin
> > should have a way to stop you.
> >
> > We actually do have a check for fatal_signal_pending() in
> > generic_perform_write() which triggers in this path however because we
> > always succeed in writing something before the check is done, we return
> > value > 0 from generic_perform_write() and thus the information about
> > signal gets lost.
>
> ah.
>
> > Fix the problem by doing the signal check before writing anything. That
> > way generic_perform_write() returns -EINTR, the error gets propagated up
> > and the sendfile loop terminates early.
> >
> > ...
> >
> > --- a/mm/filemap.c
> > +++ b/mm/filemap.c
> > @@ -2488,6 +2488,11 @@ again:
> >  			break;
> >  		}
> >
> > +		if (fatal_signal_pending(current)) {
> > +			status = -EINTR;
> > +			break;
> > +		}
> > +
> >  		status = a_ops->write_begin(file, mapping, pos, bytes, flags,
> >  						&page, &fsdata);
> >  		if (unlikely(status < 0))
> > @@ -2525,10 +2530,6 @@ again:
> >  		written += copied;
> >
> >  		balance_dirty_pages_ratelimited(mapping);
> > -		if (fatal_signal_pending(current)) {
> > -			status = -EINTR;
> > -			break;
> > -		}
> >  	} while (iov_iter_count(i));
> >
> >  	return written ? written : status;
>
> This won't work, will it? If user hits ^C after we've written a few
> pages, `written' is non-zero and the same thing happens?

It does work - I've tested it :). Sure, the generic_perform_write() call
that is running when the signal is delivered will return with value > 0.
But the interesting thing is what happens after that: Either we return to
userspace (and then we are fine) or generic_perform_write() gets called
again because there's more to write and *that* call will return -EINTR
which ends up terminating the whole sendfile syscall.

Actually there is one general lesson to be learned here: When you check for
fatal signal and bail out, it's better to do it before doing any work. That
way things keep working even if the function is called in a loop.

								Honza
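The point about check placement can be tried out in userspace. What follows is only a rough sketch, not kernel code: every sim_* name is made up for illustration, the "signal" is just a counter that trips after a given number of chunks has been written, and the outer loop only approximates a caller (such as the sendfile path) that keeps calling the writer until it sees an error. The return convention `written ? written : status' is the one from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

enum check_pos { CHECK_AFTER_WRITE, CHECK_BEFORE_WRITE };

/* Simulation state (hypothetical): the "signal" becomes pending once
 * sim_signal_at_chunk chunks have been written in total. */
static long sim_chunks_done;
static long sim_signal_at_chunk;

/* Stand-in for fatal_signal_pending(current). */
static bool sim_fatal_signal_pending(void)
{
	return sim_chunks_done >= sim_signal_at_chunk;
}

/* Rough model of the generic_perform_write() loop: "writes" len bytes in
 * chunk-sized steps and returns bytes written, or a negative status if
 * nothing was written at all. */
static long sim_perform_write(size_t len, size_t chunk, enum check_pos where)
{
	long written = 0, status = 0;

	assert(len > 0 && chunk > 0);
	do {
		size_t bytes = len < chunk ? len : chunk;

		if (where == CHECK_BEFORE_WRITE && sim_fatal_signal_pending()) {
			status = -EINTR;
			break;
		}

		/* Pretend the chunk was written successfully. */
		written += (long)bytes;
		len -= bytes;
		sim_chunks_done++;

		if (where == CHECK_AFTER_WRITE && sim_fatal_signal_pending()) {
			status = -EINTR;	/* lost below if written > 0 */
			break;
		}
	} while (len);

	return written ? written : status;
}

/* Rough model of a caller that hands the writer `piece' bytes at a time
 * and stops on the first error. Returns how many writer calls it made. */
static long sim_sendfile(size_t total, size_t piece, size_t chunk,
			 long signal_at_chunk, enum check_pos where)
{
	long calls = 0;

	sim_chunks_done = 0;
	sim_signal_at_chunk = signal_at_chunk;

	while (total) {
		size_t n = total < piece ? total : piece;
		long ret = sim_perform_write(n, chunk, where);

		calls++;
		if (ret <= 0)
			break;
		total -= (size_t)ret;
	}
	return calls;
}
```

With 1000 bytes handed over 2 bytes at a time and the signal arriving after the third chunk, the check-after variant never reports the signal (each call has already written something) and runs all 500 calls, while the check-before variant stops on the fourth call. Handing over 8 bytes at a time with the signal arriving mid-call models the ^C case asked about: the call in flight returns a short positive count and the next call returns -EINTR.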
On Fri, 16 Oct 2015 08:40:27 +0200 Jan Kara <jack@suse.cz> wrote:

> > >  		balance_dirty_pages_ratelimited(mapping);
> > > -		if (fatal_signal_pending(current)) {
> > > -			status = -EINTR;
> > > -			break;
> > > -		}
> > >  	} while (iov_iter_count(i));
> > >
> > >  	return written ? written : status;
> >
> > This won't work, will it? If user hits ^C after we've written a few
> > pages, `written' is non-zero and the same thing happens?
>
> It does work - I've tested it :). Sure, the generic_perform_write() call
> that is running when the signal is delivered will return with value > 0.
> But the interesting thing is what happens after that: Either we return to
> userspace (and then we are fine) or generic_perform_write() gets called
> again because there's more to write and *that* call will return -EINTR
> which ends up terminating the whole sendfile syscall.

OK. I guess that's better behaviour than overwriting a non-zero
`written' when signalled.

I'm going to tag this one for -stable. It's a bit of a DoS.
diff --git a/mm/filemap.c b/mm/filemap.c
index 1cc5467cf36c..327910c2400c 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -2488,6 +2488,11 @@ again:
 			break;
 		}
 
+		if (fatal_signal_pending(current)) {
+			status = -EINTR;
+			break;
+		}
+
 		status = a_ops->write_begin(file, mapping, pos, bytes, flags,
 						&page, &fsdata);
 		if (unlikely(status < 0))
@@ -2525,10 +2530,6 @@ again:
 		written += copied;
 
 		balance_dirty_pages_ratelimited(mapping);
-		if (fatal_signal_pending(current)) {
-			status = -EINTR;
-			break;
-		}
 	} while (iov_iter_count(i));
 
 	return written ? written : status;
Currently a simple program below issues a sendfile(2) system call which
takes about 62 days to complete in my test KVM instance.

	int fd;
	off_t off = 0;

	fd = open("file", O_RDWR | O_TRUNC | O_SYNC | O_CREAT, 0644);
	ftruncate(fd, 2);
	lseek(fd, 0, SEEK_END);
	sendfile(fd, fd, &off, 0xfffffff);

Now you should not ask kernel to do a stupid stuff like copying 256MB in
2-byte chunks and call fsync(2) after each chunk but if you do, sysadmin
should have a way to stop you.

We actually do have a check for fatal_signal_pending() in
generic_perform_write() which triggers in this path however because we
always succeed in writing something before the check is done, we return
value > 0 from generic_perform_write() and thus the information about
signal gets lost.

Fix the problem by doing the signal check before writing anything. That
way generic_perform_write() returns -EINTR, the error gets propagated up
and the sendfile loop terminates early.

Reported-by: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: Jan Kara <jack@suse.com>
---
 mm/filemap.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)