Message ID | 1473842236-28655-7-git-send-email-mszeredi@redhat.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Wed, Sep 14, 2016 at 10:37:11AM +0200, Miklos Szeredi wrote: > Things could happen to that page that make it not uptodate while sitting in > the pipe, but it's questionable whether we should care about that. > Checking for being uptodate in the face of such page state change is always > going to be racy. I'm not sure it's the right thing to do here; that area looks like a victim of serious bitrot - once upon a time it was ->map(), which used to lock page, verity that it's valid, and kmap it. ->unmap() did kunmap + unlock. Then the validate part got split off (->pin(), later renamed to ->confirm()), with lock _not_ held over the kmap/kunmap. That's the point when it got racy, AFAICS. OTOH, I would really hate to hold a page locked over e.g. copying to userland - too easy to get deadlocks that way. Jens, could you comment? Pages definitely shouldn't be getting into pipe without having been uptodate; the question is what (if anything) should be done about having a page go non-uptodate (on truncate, hole-punching, etc.) while a reference to it is sitting in a pipe buffer. -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Sep 27, 2016 at 5:40 AM, Al Viro <viro@zeniv.linux.org.uk> wrote: > On Wed, Sep 14, 2016 at 10:37:11AM +0200, Miklos Szeredi wrote: > >> Things could happen to that page that make it not uptodate while sitting in >> the pipe, but it's questionable whether we should care about that. >> Checking for being uptodate in the face of such page state change is always >> going to be racy. > > I'm not sure it's the right thing to do here; that area looks like a victim > of serious bitrot - once upon a time it was ->map(), which used to lock > page, verity that it's valid, and kmap it. ->unmap() did kunmap + unlock. > > Then the validate part got split off (->pin(), later renamed to ->confirm()), > with lock _not_ held over the kmap/kunmap. That's the point when it got racy, > AFAICS. OTOH, I would really hate to hold a page locked over e.g. copying to > userland - too easy to get deadlocks that way. > > Jens, could you comment? Pages definitely shouldn't be getting into pipe > without having been uptodate; the question is what (if anything) should be > done about having a page go non-uptodate (on truncate, hole-punching, etc.) > while a reference to it is sitting in a pipe buffer. Truncate/holepunch is interesting. It doesn't make the page go non-uptodate, just clears page->mapping. Works like a charm, old data can be read from the pipe just fine. Partial truncate is even more interesting, because it will result in partially cleared data (race is there with plain read() as well, AFAICS, but very narrow). Page invalidation by filesystems is similar to truncate. Old data can be read from the pipe. And in fact this probably the way we want it to work, since redoing the page lookup in ->confirm() would be really messy. Also just modifying the data sitting in the pipe will also result in undefined behavior; either the old or the new data can be read out from the other end of the pipe. And while not explicitly documented, all the above cases are fine and implicit in the non-copy behavior of splice. Perhaps the man page should note that modifying data after splice but before reading from the other end of the pipe results in undefined behavior. I haven't looked at filesystems, but generic code calls ClearPageUptodate() from only a few places: end_swap_bio_read(): does it on a non-uptodate page page_endio(): AFAICS part of the page reading chain, again doing it on a non-uptodate page me_swapcache_dirty(): memory error on dirty swapcache page, this is the only one that would make sense to trigger EIO on reading the pipe buffer. But then why only the dirty swapcache case? Thanks, Miklos -- To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/fs/splice.c b/fs/splice.c index ba7a2240d58c..0ecbe3011796 100644 --- a/fs/splice.c +++ b/fs/splice.c @@ -92,51 +92,10 @@ static void page_cache_pipe_buf_release(struct pipe_inode_info *pipe, buf->flags &= ~PIPE_BUF_FLAG_LRU; } -/* - * Check whether the contents of buf is OK to access. Since the content - * is a page cache page, IO may be in flight. - */ -static int page_cache_pipe_buf_confirm(struct pipe_inode_info *pipe, - struct pipe_buffer *buf) -{ - struct page *page = buf->page; - int err; - - if (!PageUptodate(page)) { - lock_page(page); - - /* - * Page got truncated/unhashed. This will cause a 0-byte - * splice, if this is the first page. - */ - if (!page->mapping) { - err = -ENODATA; - goto error; - } - - /* - * Uh oh, read-error from disk. - */ - if (!PageUptodate(page)) { - err = -EIO; - goto error; - } - - /* - * Page is ok afterall, we are done. - */ - unlock_page(page); - } - - return 0; -error: - unlock_page(page); - return err; -} const struct pipe_buf_operations page_cache_pipe_buf_ops = { .can_merge = 0, - .confirm = page_cache_pipe_buf_confirm, + .confirm = generic_pipe_buf_confirm, .release = page_cache_pipe_buf_release, .steal = page_cache_pipe_buf_steal, .get = generic_pipe_buf_get,
page_cache_pipe_buf_confirm() is used only in page_cache_pipe_buf_ops. page_cache_pipe_buf_ops is used in two places: 1) __generic_file_splice_read() This iterates all the pages, if not uptodate locks page, and if still not uptodate does ->readpage() which reads the page synchronously. 2) shmem_file_splice_read() This calls shmem_getpage() on individual pages. shmem_getpage() swaps in the page synchronously if not in memory, so it also seems to return an uptodate page. So all pages put into the buffer will be uptodate. Things could happen to that page that make it not uptodate while sitting in the pipe, but it's questionable whether we should care about that. Checking for being uptodate in the face of such page state change is always going to be racy. Signed-off-by: Miklos Szeredi <mszeredi@redhat.com> --- fs/splice.c | 43 +------------------------------------------ 1 file changed, 1 insertion(+), 42 deletions(-)