From patchwork Thu Feb 9 10:29:45 2023
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: David Howells
X-Patchwork-Id: 13134342
From: David Howells
To: Jens Axboe, Al Viro, Christoph Hellwig
Cc: David Howells, Matthew Wilcox, Jan Kara, Jeff Layton,
 David Hildenbrand, Jason Gunthorpe, Logan Gunthorpe, Hillf Danton,
 linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org,
 linux-kernel@vger.kernel.org, linux-mm@kvack.org,
 Christoph Hellwig, John Hubbard
Subject: [PATCH v13 03/12] splice: Do splice read from a buffered file
 without using ITER_PIPE
Date: Thu, 9 Feb 2023 10:29:45 +0000
Message-Id: <20230209102954.528942-4-dhowells@redhat.com>
In-Reply-To: <20230209102954.528942-1-dhowells@redhat.com>
References: <20230209102954.528942-1-dhowells@redhat.com>

Provide a function to do splice read from a buffered file, pulling the
folios out of the pagecache directly by calling filemap_get_pages() to do
any required reading and then pasting the returned folios into the pipe.
A helper function is provided to do the actual folio pasting and will
handle multipage folios by splicing as many of the relevant subpages as
will fit into the pipe.

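As a reading aid for the paragraph above, the subpage arithmetic of the folio-pasting helper can be modelled in userspace. This is only a sketch: `model_splice_folio()`, `struct model_buf`, `MODEL_PAGE_SIZE` and the fixed 4 KiB page size are assumptions made for illustration and are not part of the patch, and the bounds check on the starting position is an addition the kernel helper does not need.

```c
#include <stddef.h>

#define MODEL_PAGE_SIZE 4096u	/* assumption: 4 KiB pages */

/* Hypothetical stand-in for one pipe slot. */
struct model_buf {
	size_t page_index;	/* index of the subpage within the folio */
	size_t offset;		/* first byte used within that subpage */
	size_t len;		/* number of bytes taken from that subpage */
};

/*
 * Cut the range [fpos_in_folio, fpos_in_folio + size) of a folio into
 * per-page pieces until either the range or the free pipe slots run out.
 * Returns the number of bytes covered; *nr_out receives the slots used.
 */
static size_t model_splice_folio(size_t folio_size, size_t fpos_in_folio,
				 size_t size, struct model_buf *bufs,
				 size_t free_slots, size_t *nr_out)
{
	size_t spliced = 0, nr = 0;
	size_t page_index, offset;

	*nr_out = 0;
	if (fpos_in_folio >= folio_size)	/* guard added for the model */
		return 0;

	page_index = fpos_in_folio / MODEL_PAGE_SIZE;
	offset = fpos_in_folio % MODEL_PAGE_SIZE;
	if (size > folio_size - fpos_in_folio)
		size = folio_size - fpos_in_folio;

	while (spliced < size && nr < free_slots) {
		/* Only the first piece can start part-way into a page. */
		size_t part = MODEL_PAGE_SIZE - offset;

		if (part > size - spliced)
			part = size - spliced;

		bufs[nr].page_index = page_index++;
		bufs[nr].offset = offset;
		bufs[nr].len = part;
		nr++;
		spliced += part;
		offset = 0;
	}

	*nr_out = nr;
	return spliced;
}
```

For a four-page folio, a request starting 5000 bytes in for 10000 bytes would occupy three slots in this model; with only two free slots it stops after the second page, which corresponds to the "as many of the relevant subpages as will fit" behaviour described above.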
The ITER_BVEC-based splicing previously added is then only used for
splicing from O_DIRECT files.

The code is loosely based on filemap_read() and might belong in
mm/filemap.c with that as it needs to use filemap_get_pages().

With this, ITER_PIPE is no longer used.

Signed-off-by: David Howells
cc: Jens Axboe
cc: Christoph Hellwig
cc: Al Viro
cc: David Hildenbrand
cc: John Hubbard
cc: linux-mm@kvack.org
cc: linux-block@vger.kernel.org
cc: linux-fsdevel@vger.kernel.org
---
 fs/splice.c | 159 ++++++++++++++++++++++++++++++++++++++++++++--------
 1 file changed, 135 insertions(+), 24 deletions(-)

diff --git a/fs/splice.c b/fs/splice.c
index b4be6fc314a1..963cbf20abc8 100644
--- a/fs/splice.c
+++ b/fs/splice.c
@@ -22,6 +22,7 @@
 #include
 #include
 #include
+#include
 #include
 #include
 #include
@@ -375,6 +376,135 @@ static ssize_t generic_file_direct_splice_read(struct file *in, loff_t *ppos,
 	return ret;
 }
 
+/*
+ * Splice subpages from a folio into a pipe.
+ */
+static size_t splice_folio_into_pipe(struct pipe_inode_info *pipe,
+				     struct folio *folio,
+				     loff_t fpos, size_t size)
+{
+	struct page *page;
+	size_t spliced = 0, offset = offset_in_folio(folio, fpos);
+
+	page = folio_page(folio, offset / PAGE_SIZE);
+	size = min(size, folio_size(folio) - offset);
+	offset %= PAGE_SIZE;
+
+	while (spliced < size &&
+	       !pipe_full(pipe->head, pipe->tail, pipe->max_usage)) {
+		struct pipe_buffer *buf = &pipe->bufs[pipe->head & (pipe->ring_size - 1)];
+		size_t part = min_t(size_t, PAGE_SIZE - offset, size - spliced);
+
+		*buf = (struct pipe_buffer) {
+			.ops = &page_cache_pipe_buf_ops,
+			.page = page,
+			.offset = offset,
+			.len = part,
+		};
+		folio_get(folio);
+		pipe->head++;
+		page++;
+		spliced += part;
+		offset = 0;
+	}
+
+	return spliced;
+}
+
+/*
+ * Splice folios from the pagecache of a buffered (ie. non-O_DIRECT) file into
+ * a pipe.
+ */
+static ssize_t generic_file_buffered_splice_read(struct file *in, loff_t *ppos,
+						 struct pipe_inode_info *pipe,
+						 size_t len,
+						 unsigned int flags)
+{
+	struct folio_batch fbatch;
+	size_t total_spliced = 0, used, npages;
+	loff_t isize, end_offset;
+	bool writably_mapped;
+	int i, error = 0;
+
+	struct kiocb iocb = {
+		.ki_filp = in,
+		.ki_pos = *ppos,
+	};
+
+	/* Work out how much data we can actually add into the pipe */
+	used = pipe_occupancy(pipe->head, pipe->tail);
+	npages = max_t(ssize_t, pipe->max_usage - used, 0);
+	len = min_t(size_t, len, npages * PAGE_SIZE);
+
+	folio_batch_init(&fbatch);
+
+	do {
+		cond_resched();
+
+		if (*ppos >= i_size_read(file_inode(in)))
+			break;
+
+		iocb.ki_pos = *ppos;
+		error = filemap_get_pages(&iocb, len, &fbatch, true);
+		if (error < 0)
+			break;
+
+		/*
+		 * i_size must be checked after we know the pages are Uptodate.
+		 *
+		 * Checking i_size after the check allows us to calculate
+		 * the correct value for "nr", which means the zero-filled
+		 * part of the page is not copied back to userspace (unless
+		 * another truncate extends the file - this is desired though).
+		 */
+		isize = i_size_read(file_inode(in));
+		if (unlikely(*ppos >= isize))
+			break;
+		end_offset = min_t(loff_t, isize, *ppos + len);
+
+		/*
+		 * Once we start copying data, we don't want to be touching any
+		 * cachelines that might be contended:
+		 */
+		writably_mapped = mapping_writably_mapped(in->f_mapping);
+
+		for (i = 0; i < folio_batch_count(&fbatch); i++) {
+			struct folio *folio = fbatch.folios[i];
+			size_t n;
+
+			if (folio_pos(folio) >= end_offset)
+				goto out;
+			folio_mark_accessed(folio);
+
+			/*
+			 * If users can be writing to this folio using arbitrary
+			 * virtual addresses, take care of potential aliasing
+			 * before reading the folio on the kernel side.
+			 */
+			if (writably_mapped)
+				flush_dcache_folio(folio);
+
+			n = splice_folio_into_pipe(pipe, folio, *ppos, len);
+			if (!n)
+				goto out;
+			len -= n;
+			total_spliced += n;
+			*ppos += n;
+			in->f_ra.prev_pos = *ppos;
+			if (pipe_full(pipe->head, pipe->tail, pipe->max_usage))
+				goto out;
+		}
+
+		folio_batch_release(&fbatch);
+	} while (len);
+
+out:
+	folio_batch_release(&fbatch);
+	file_accessed(in);
+
+	return total_spliced ? total_spliced : error;
+}
+
 /**
  * generic_file_splice_read - splice data from file to a pipe
  * @in: file to splice from
@@ -392,32 +522,13 @@ ssize_t generic_file_splice_read(struct file *in, loff_t *ppos,
 				 struct pipe_inode_info *pipe, size_t len,
 				 unsigned int flags)
 {
-	struct iov_iter to;
-	struct kiocb kiocb;
-	int ret;
-
+	if (unlikely(*ppos >= file_inode(in)->i_sb->s_maxbytes))
+		return 0;
+	if (unlikely(!len))
+		return 0;
 	if (in->f_flags & O_DIRECT)
 		return generic_file_direct_splice_read(in, ppos, pipe, len, flags);
-
-	iov_iter_pipe(&to, ITER_DEST, pipe, len);
-	init_sync_kiocb(&kiocb, in);
-	kiocb.ki_pos = *ppos;
-	ret = call_read_iter(in, &kiocb, &to);
-	if (ret > 0) {
-		*ppos = kiocb.ki_pos;
-		file_accessed(in);
-	} else if (ret < 0) {
-		/* free what was emitted */
-		pipe_discard_from(pipe, to.start_head);
-		/*
-		 * callers of ->splice_read() expect -EAGAIN on
-		 * "can't put anything in there", rather than -EFAULT.
-		 */
-		if (ret == -EFAULT)
-			ret = -EAGAIN;
-	}
-
-	return ret;
+	return generic_file_buffered_splice_read(in, ppos, pipe, len, flags);
 }
 EXPORT_SYMBOL(generic_file_splice_read);
 
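
For anyone wanting to poke at the buffered path from userspace, a minimal sketch using splice(2) follows. It assumes Linux with glibc or musl; `splice_file_to_pipe()` is a hypothetical helper name introduced here, not something this patch adds. Whether a given file actually reaches generic_file_splice_read() depends on the filesystem's ->splice_read() implementation.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE		/* for splice() and loff_t */
#endif
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Splice up to len bytes from a regular file, opened without O_DIRECT,
 * starting at *off, into the write end of a pipe.  On success *off is
 * advanced by the number of bytes moved.  Returns what splice() returns:
 * bytes moved, 0 at end of file, or -1 with errno set.
 */
static ssize_t splice_file_to_pipe(int file_fd, loff_t *off, int pipe_wr,
				   size_t len)
{
	loff_t pos = *off;
	ssize_t n;

	/* SPLICE_F_NONBLOCK: fail with EAGAIN rather than wait on a full pipe. */
	n = splice(file_fd, &pos, pipe_wr, NULL, len, SPLICE_F_NONBLOCK);
	if (n > 0)
		*off = pos;
	return n;
}
```

A short return here is expected rather than an error: as in the patch, the amount moved is limited by the free slots in the pipe and by i_size, so callers would normally loop until the helper returns 0 or fails.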