From patchwork Thu Dec 12 19:01:29 2019
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 11289219
From: Jens Axboe
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org
Cc: willy@infradead.org, clm@fb.com, torvalds@linux-foundation.org, david@fromorbit.com, Jens Axboe
Subject: [PATCH 1/5] fs: add read support for RWF_UNCACHED
Date: Thu, 12 Dec 2019 12:01:29 -0700
Message-Id: <20191212190133.18473-2-axboe@kernel.dk>
In-Reply-To: <20191212190133.18473-1-axboe@kernel.dk>
References: <20191212190133.18473-1-axboe@kernel.dk>

If RWF_UNCACHED is set for io_uring (or preadv2(2)), we'll use private
pages for the buffered reads. These pages will never be inserted into
the page cache, and they are simply dropped when we have done the copy
at the end of IO.

If pages in the read range are already in the page cache, then we use
those to copy the data instead of starting IO on private pages.
A previous solution used the page cache even for non-cached ranges, but
the cost of doing so was too high. Removing the nodes at the end of the
IO is expensive, even with the LRU bypass. On top of that, repeatedly
instantiating new xarray nodes is very costly, as it needs to memset 576
bytes of data per node, and freeing said nodes involves an RCU call per
node as well. All of that adds up, making uncached somewhat slower than
O_DIRECT. With the current solution, we're basically at O_DIRECT levels
of performance for RWF_UNCACHED IO.

Protect against truncate the same way O_DIRECT does, by calling
inode_dio_begin() to elevate inode->i_dio_count.

Signed-off-by: Jens Axboe
---
 include/linux/fs.h      |  3 +++
 include/uapi/linux/fs.h |  5 ++++-
 mm/filemap.c            | 40 +++++++++++++++++++++++++++++++++-------
 3 files changed, 40 insertions(+), 8 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 98e0349adb52..092ea2a4319b 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -314,6 +314,7 @@ enum rw_hint {
 #define IOCB_SYNC		(1 << 5)
 #define IOCB_WRITE		(1 << 6)
 #define IOCB_NOWAIT		(1 << 7)
+#define IOCB_UNCACHED		(1 << 8)
 
 struct kiocb {
 	struct file		*ki_filp;
@@ -3418,6 +3419,8 @@ static inline int kiocb_set_rw_flags(struct kiocb *ki, rwf_t flags)
 		ki->ki_flags |= (IOCB_DSYNC | IOCB_SYNC);
 	if (flags & RWF_APPEND)
 		ki->ki_flags |= IOCB_APPEND;
+	if (flags & RWF_UNCACHED)
+		ki->ki_flags |= IOCB_UNCACHED;
 	return 0;
 }

diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index 379a612f8f1d..357ebb0e0c5d 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -299,8 +299,11 @@ typedef int __bitwise __kernel_rwf_t;
 /* per-IO O_APPEND */
 #define RWF_APPEND	((__force __kernel_rwf_t)0x00000010)
 
+/* drop cache after reading or writing data */
+#define RWF_UNCACHED	((__force __kernel_rwf_t)0x00000040)
+
 /* mask of flags supported by the kernel */
 #define RWF_SUPPORTED	(RWF_HIPRI | RWF_DSYNC | RWF_SYNC | RWF_NOWAIT |\
-			 RWF_APPEND)
+			 RWF_APPEND | RWF_UNCACHED)
 
 #endif /* _UAPI_LINUX_FS_H */

diff --git a/mm/filemap.c b/mm/filemap.c
index bf6aa30be58d..5d299d69b185 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1990,6 +1990,13 @@ static void shrink_readahead_size_eio(struct file *filp,
 	ra->ra_pages /= 4;
 }
 
+static void buffered_put_page(struct page *page, bool clear_mapping)
+{
+	if (clear_mapping)
+		page->mapping = NULL;
+	put_page(page);
+}
+
 /**
  * generic_file_buffered_read - generic file read routine
  * @iocb:	the iocb to read
@@ -2013,6 +2020,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 	struct address_space *mapping = filp->f_mapping;
 	struct inode *inode = mapping->host;
 	struct file_ra_state *ra = &filp->f_ra;
+	bool did_dio_begin = false;
 	loff_t *ppos = &iocb->ki_pos;
 	pgoff_t index;
 	pgoff_t last_index;
@@ -2032,6 +2040,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 	offset = *ppos & ~PAGE_MASK;
 
 	for (;;) {
+		bool clear_mapping = false;
 		struct page *page;
 		pgoff_t end_index;
 		loff_t isize;
@@ -2048,6 +2057,13 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 		if (!page) {
 			if (iocb->ki_flags & IOCB_NOWAIT)
 				goto would_block;
+			/* UNCACHED implies no read-ahead */
+			if (iocb->ki_flags & IOCB_UNCACHED) {
+				did_dio_begin = true;
+				/* block truncate for UNCACHED reads */
+				inode_dio_begin(inode);
+				goto no_cached_page;
+			}
 			page_cache_sync_readahead(mapping,
 					ra, filp,
 					index, last_index - index);
@@ -2106,7 +2122,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 		isize = i_size_read(inode);
 		end_index = (isize - 1) >> PAGE_SHIFT;
 		if (unlikely(!isize || index > end_index)) {
-			put_page(page);
+			buffered_put_page(page, clear_mapping);
 			goto out;
 		}
 
@@ -2115,7 +2131,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 		if (index == end_index) {
 			nr = ((isize - 1) & ~PAGE_MASK) + 1;
 			if (nr <= offset) {
-				put_page(page);
+				buffered_put_page(page, clear_mapping);
 				goto out;
 			}
 		}
@@ -2147,7 +2163,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 		offset &= ~PAGE_MASK;
 		prev_offset = offset;
 
-		put_page(page);
+		buffered_put_page(page, clear_mapping);
 		written += ret;
 		if (!iov_iter_count(iter))
 			goto out;
@@ -2189,7 +2205,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 
 		if (unlikely(error)) {
 			if (error == AOP_TRUNCATED_PAGE) {
-				put_page(page);
+				buffered_put_page(page, clear_mapping);
 				error = 0;
 				goto find_page;
 			}
@@ -2206,7 +2222,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 			 * invalidate_mapping_pages got it
 			 */
 			unlock_page(page);
-			put_page(page);
+			buffered_put_page(page, clear_mapping);
 			goto find_page;
 		}
 		unlock_page(page);
@@ -2221,7 +2237,7 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 
 readpage_error:
 		/* UHHUH! A synchronous read error occurred. Report it */
-		put_page(page);
+		buffered_put_page(page, clear_mapping);
 		goto out;
 
 no_cached_page:
@@ -2234,7 +2250,15 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 			error = -ENOMEM;
 			goto out;
 		}
-		error = add_to_page_cache_lru(page, mapping, index,
+		if (iocb->ki_flags & IOCB_UNCACHED) {
+			__SetPageLocked(page);
+			page->mapping = mapping;
+			page->index = index;
+			clear_mapping = true;
+			goto readpage;
+		}
+
+		error = add_to_page_cache(page, mapping, index,
 				mapping_gfp_constraint(mapping, GFP_KERNEL));
 		if (error) {
 			put_page(page);
@@ -2250,6 +2274,8 @@ static ssize_t generic_file_buffered_read(struct kiocb *iocb,
 would_block:
 	error = -EAGAIN;
 out:
+	if (did_dio_begin)
+		inode_dio_end(inode);
 	ra->prev_pos = prev_index;
 	ra->prev_pos <<= PAGE_SHIFT;
 	ra->prev_pos |= prev_offset;

From patchwork Thu Dec 12 19:01:30 2019
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 11289227
From: Jens Axboe
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org
Cc: willy@infradead.org, clm@fb.com, torvalds@linux-foundation.org, david@fromorbit.com, Jens Axboe
Subject: [PATCH 2/5] mm: make generic_perform_write() take a struct kiocb
Date: Thu, 12 Dec 2019 12:01:30 -0700
Message-Id: <20191212190133.18473-3-axboe@kernel.dk>
In-Reply-To: <20191212190133.18473-1-axboe@kernel.dk>
References: <20191212190133.18473-1-axboe@kernel.dk>

Right now all callers pass in iocb->ki_pos; just pass in the iocb
instead. This is in preparation for using the iocb flags in
generic_perform_write().

Signed-off-by: Jens Axboe
---
 fs/ceph/file.c     | 2 +-
 fs/ext4/file.c     | 2 +-
 fs/nfs/file.c      | 2 +-
 include/linux/fs.h | 3 ++-
 mm/filemap.c       | 8 +++++---
 5 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 11929d2bb594..096c009f188f 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1538,7 +1538,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
		 * are pending vmtruncate. So write and vmtruncate
		 * can not run at the same time
		 */
-		written = generic_perform_write(file, from, pos);
+		written = generic_perform_write(file, from, iocb);
		if (likely(written >= 0))
			iocb->ki_pos = pos + written;
		ceph_end_io_write(inode);

diff --git a/fs/ext4/file.c b/fs/ext4/file.c
index 6a7293a5cda2..9ffb857765d5 100644
--- a/fs/ext4/file.c
+++ b/fs/ext4/file.c
@@ -249,7 +249,7 @@ static ssize_t ext4_buffered_write_iter(struct kiocb *iocb,
		goto out;
 
	current->backing_dev_info = inode_to_bdi(inode);
-	ret = generic_perform_write(iocb->ki_filp, from, iocb->ki_pos);
+	ret = generic_perform_write(iocb->ki_filp, from, iocb);
	current->backing_dev_info = NULL;
 
out:

diff --git a/fs/nfs/file.c b/fs/nfs/file.c
index 8eb731d9be3e..d8f51a702a4e 100644
--- a/fs/nfs/file.c
+++ b/fs/nfs/file.c
@@ -624,7 +624,7 @@ ssize_t nfs_file_write(struct kiocb *iocb, struct iov_iter *from)
	result = generic_write_checks(iocb, from);
	if (result > 0) {
		current->backing_dev_info = inode_to_bdi(inode);
-		result = generic_perform_write(file, from, iocb->ki_pos);
+		result = generic_perform_write(file, from, iocb);
		current->backing_dev_info = NULL;
	}
	nfs_end_io_write(inode);

diff --git a/include/linux/fs.h b/include/linux/fs.h
index 092ea2a4319b..bf58db1bc032 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -3103,7 +3103,8 @@ extern ssize_t generic_file_read_iter(struct kiocb *, struct iov_iter *);
 extern ssize_t __generic_file_write_iter(struct kiocb *, struct iov_iter *);
 extern ssize_t generic_file_write_iter(struct kiocb *, struct iov_iter *);
 extern ssize_t generic_file_direct_write(struct kiocb *, struct iov_iter *);
-extern ssize_t generic_perform_write(struct file *, struct iov_iter *, loff_t);
+extern ssize_t generic_perform_write(struct file *, struct iov_iter *,
+				     struct kiocb *);
 
 ssize_t vfs_iter_read(struct file *file, struct iov_iter *iter, loff_t *ppos,
		rwf_t flags);

diff --git a/mm/filemap.c b/mm/filemap.c
index 5d299d69b185..00b8e87fb9cf 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3292,10 +3292,11 @@ struct page *grab_cache_page_write_begin(struct address_space *mapping,
 EXPORT_SYMBOL(grab_cache_page_write_begin);
 
 ssize_t generic_perform_write(struct file *file,
-				struct iov_iter *i, loff_t pos)
+				struct iov_iter *i, struct kiocb *iocb)
 {
	struct address_space *mapping = file->f_mapping;
	const struct address_space_operations *a_ops = mapping->a_ops;
+	loff_t pos = iocb->ki_pos;
	long status = 0;
	ssize_t written = 0;
	unsigned int flags = 0;
@@ -3429,7 +3430,8 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
		if (written < 0 || !iov_iter_count(from) || IS_DAX(inode))
			goto out;
 
-		status = generic_perform_write(file, from, pos = iocb->ki_pos);
+		pos = iocb->ki_pos;
+		status = generic_perform_write(file, from, iocb);
		/*
		 * If generic_perform_write() returned a synchronous error
		 * then we want to return the number of bytes which were
@@ -3461,7 +3463,7 @@ ssize_t __generic_file_write_iter(struct kiocb *iocb, struct iov_iter *from)
		 */
		}
	} else {
-		written = generic_perform_write(file, from, iocb->ki_pos);
+		written = generic_perform_write(file, from, iocb);
		if (likely(written > 0))
			iocb->ki_pos += written;
	}

From patchwork Thu Dec 12 19:01:31 2019
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 11289233
From: Jens Axboe
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org
Cc: willy@infradead.org, clm@fb.com, torvalds@linux-foundation.org, david@fromorbit.com, Jens Axboe
Subject: [PATCH 3/5] mm: make buffered writes work with RWF_UNCACHED
Date: Thu, 12 Dec 2019 12:01:31 -0700
Message-Id: <20191212190133.18473-4-axboe@kernel.dk>
In-Reply-To: <20191212190133.18473-1-axboe@kernel.dk>
References: <20191212190133.18473-1-axboe@kernel.dk>

If RWF_UNCACHED is set for io_uring (or pwritev2(2)), we'll drop the
page cache instantiated by the buffered writes. If no new pages were
instantiated, we leave them alone. This provides similar semantics to
reads with RWF_UNCACHED set.

Signed-off-by: Jens Axboe
---
 include/linux/fs.h |  1 +
 mm/filemap.c       | 41 +++++++++++++++++++++++++++++++++++++++--
 2 files changed, 40 insertions(+), 2 deletions(-)

diff --git a/include/linux/fs.h b/include/linux/fs.h
index bf58db1bc032..5ea5fc167524 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -285,6 +285,7 @@ enum positive_aop_returns {
 #define AOP_FLAG_NOFS			0x0002	/* used by filesystem to direct
						 * helper code (eg buffer layer)
						 * to clear GFP_FS from alloc */
+#define AOP_FLAG_UNCACHED		0x0004
 
 /*
  * oh the beauties of C type declarations.

diff --git a/mm/filemap.c b/mm/filemap.c
index 00b8e87fb9cf..fbcd4537979d 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -3277,10 +3277,12 @@ struct page *grab_cache_page_write_begin(struct address_space *mapping,
					pgoff_t index, unsigned flags)
 {
	struct page *page;
-	int fgp_flags = FGP_LOCK|FGP_WRITE|FGP_CREAT;
+	int fgp_flags = FGP_LOCK|FGP_WRITE;
 
	if (flags & AOP_FLAG_NOFS)
		fgp_flags |= FGP_NOFS;
+	if (!(flags & AOP_FLAG_UNCACHED))
+		fgp_flags |= FGP_CREAT;
 
	page = pagecache_get_page(mapping, index, fgp_flags,
			mapping_gfp_mask(mapping));
@@ -3301,6 +3303,9 @@ ssize_t generic_perform_write(struct file *file,
	ssize_t written = 0;
	unsigned int flags = 0;
 
+	if (iocb->ki_flags & IOCB_UNCACHED)
+		flags |= AOP_FLAG_UNCACHED;
+
	do {
		struct page *page;
		unsigned long offset;	/* Offset into pagecache page */
@@ -3333,10 +3338,16 @@ ssize_t generic_perform_write(struct file *file,
			break;
		}
 
+retry:
		status = a_ops->write_begin(file, mapping, pos, bytes, flags,
						&page, &fsdata);
-		if (unlikely(status < 0))
+		if (unlikely(status < 0)) {
+			if (status == -ENOMEM && (flags & AOP_FLAG_UNCACHED)) {
+				flags &= ~AOP_FLAG_UNCACHED;
+				goto retry;
+			}
			break;
+		}
 
		if (mapping_writably_mapped(mapping))
			flush_dcache_page(page);
@@ -3372,6 +3383,32 @@ ssize_t generic_perform_write(struct file *file,
		balance_dirty_pages_ratelimited(mapping);
	} while (iov_iter_count(i));
 
+	if (written && (iocb->ki_flags & IOCB_UNCACHED)) {
+		loff_t end;
+
+		pos = iocb->ki_pos;
+		end = pos + written;
+
+		status = filemap_write_and_wait_range(mapping, pos, end);
+		if (status)
+			goto out;
+
+		/*
+		 * No pages were created for this range, we're done
+		 */
+		if (flags & AOP_FLAG_UNCACHED)
+			goto out;
+
+		/*
+		 * Try to invalidate cache pages for the range we just wrote.
+		 * We don't care if invalidation fails as the write has still
+		 * worked and leaving clean uptodate pages in the page cache
+		 * isn't a corruption vector for uncached IO.
+		 */
+		invalidate_inode_pages2_range(mapping,
+				pos >> PAGE_SHIFT, end >> PAGE_SHIFT);
+	}
+out:
	return written ? written : status;
 }
 EXPORT_SYMBOL(generic_perform_write);

From patchwork Thu Dec 12 19:01:32 2019
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 11289237
From: Jens Axboe
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org, linux-block@vger.kernel.org
Cc: willy@infradead.org, clm@fb.com, torvalds@linux-foundation.org, david@fromorbit.com, Jens Axboe
Subject: [PATCH 4/5] iomap: add struct iomap_data
Date: Thu, 12 Dec 2019 12:01:32 -0700
Message-Id: <20191212190133.18473-5-axboe@kernel.dk>
In-Reply-To: <20191212190133.18473-1-axboe@kernel.dk>
References: <20191212190133.18473-1-axboe@kernel.dk>

We pass a lot of arguments to iomap_apply(), and subsequently to the
actors that it calls. In preparation for adding one more argument,
switch them to using a struct iomap_data instead.
The actor gets a const version of that; it is not supposed to change
anything in it.

Signed-off-by: Jens Axboe
---
 fs/dax.c               |  25 +++--
 fs/iomap/apply.c       |  26 +++---
 fs/iomap/buffered-io.c | 202 +++++++++++++++++++++++++----------------
 fs/iomap/direct-io.c   |  57 +++++++-----
 fs/iomap/fiemap.c      |  48 ++++++----
 fs/iomap/seek.c        |  64 ++++++++-----
 fs/iomap/swapfile.c    |  27 +++---
 include/linux/iomap.h  |  15 ++-
 8 files changed, 278 insertions(+), 186 deletions(-)

diff --git a/fs/dax.c b/fs/dax.c
index 1f1f0201cad1..d1c32dbbdf24 100644
--- a/fs/dax.c
+++ b/fs/dax.c
@@ -1090,13 +1090,16 @@ int __dax_zero_page_range(struct block_device *bdev,
 EXPORT_SYMBOL_GPL(__dax_zero_page_range);
 
 static loff_t
-dax_iomap_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap, struct iomap *srcmap)
+dax_iomap_actor(const struct iomap_data *data, struct iomap *iomap,
+		struct iomap *srcmap)
 {
	struct block_device *bdev = iomap->bdev;
	struct dax_device *dax_dev = iomap->dax_dev;
-	struct iov_iter *iter = data;
+	struct iov_iter *iter = data->priv;
+	loff_t pos = data->pos;
+	loff_t length = data->len;
	loff_t end = pos + length, done = 0;
+	struct inode *inode = data->inode;
	ssize_t ret = 0;
	size_t xfer;
	int id;
@@ -1197,22 +1200,26 @@ dax_iomap_rw(struct kiocb *iocb, struct iov_iter *iter,
 {
	struct address_space *mapping = iocb->ki_filp->f_mapping;
	struct inode *inode = mapping->host;
-	loff_t pos = iocb->ki_pos, ret = 0, done = 0;
-	unsigned flags = 0;
+	loff_t ret = 0, done = 0;
+	struct iomap_data data = {
+		.inode	= inode,
+		.pos	= iocb->ki_pos,
+		.priv	= iter,
+	};
 
	if (iov_iter_rw(iter) == WRITE) {
		lockdep_assert_held_write(&inode->i_rwsem);
-		flags |= IOMAP_WRITE;
+		data.flags |= IOMAP_WRITE;
	} else {
		lockdep_assert_held(&inode->i_rwsem);
	}
 
	while (iov_iter_count(iter)) {
-		ret = iomap_apply(inode, pos, iov_iter_count(iter), flags, ops,
-				iter, dax_iomap_actor);
+		data.len = iov_iter_count(iter);
+		ret = iomap_apply(&data, ops, dax_iomap_actor);
		if (ret <= 0)
			break;
-		pos += ret;
+		data.pos += ret;
		done += ret;
	}

diff --git a/fs/iomap/apply.c b/fs/iomap/apply.c
index 76925b40b5fd..e76148db03b8 100644
--- a/fs/iomap/apply.c
+++ b/fs/iomap/apply.c
@@ -21,15 +21,16 @@
  * iomap_end call.
  */
 loff_t
-iomap_apply(struct inode *inode, loff_t pos, loff_t length, unsigned flags,
-		const struct iomap_ops *ops, void *data, iomap_actor_t actor)
+iomap_apply(struct iomap_data *data, const struct iomap_ops *ops,
+		iomap_actor_t actor)
 {
	struct iomap iomap = { .type = IOMAP_HOLE };
	struct iomap srcmap = { .type = IOMAP_HOLE };
	loff_t written = 0, ret;
	u64 end;
 
-	trace_iomap_apply(inode, pos, length, flags, ops, actor, _RET_IP_);
+	trace_iomap_apply(data->inode, data->pos, data->len, data->flags, ops,
+				actor, _RET_IP_);
 
	/*
	 * Need to map a range from start position for length bytes. This can
@@ -43,17 +44,18 @@ iomap_apply(struct iomap_data *data, const struct iomap_ops *ops,
	 * expose transient stale data. If the reserve fails, we can safely
	 * back out at this point as there is nothing to undo.
	 */
-	ret = ops->iomap_begin(inode, pos, length, flags, &iomap, &srcmap);
+	ret = ops->iomap_begin(data->inode, data->pos, data->len, data->flags,
+				&iomap, &srcmap);
	if (ret)
		return ret;
-	if (WARN_ON(iomap.offset > pos))
+	if (WARN_ON(iomap.offset > data->pos))
		return -EIO;
	if (WARN_ON(iomap.length == 0))
		return -EIO;
 
-	trace_iomap_apply_dstmap(inode, &iomap);
+	trace_iomap_apply_dstmap(data->inode, &iomap);
	if (srcmap.type != IOMAP_HOLE)
-		trace_iomap_apply_srcmap(inode, &srcmap);
+		trace_iomap_apply_srcmap(data->inode, &srcmap);
 
	/*
	 * Cut down the length to the one actually provided by the filesystem,
@@ -62,8 +64,8 @@ iomap_apply(struct iomap_data *data, const struct iomap_ops *ops,
	end = iomap.offset + iomap.length;
	if (srcmap.type != IOMAP_HOLE)
		end = min(end, srcmap.offset + srcmap.length);
-	if (pos + length > end)
-		length = end - pos;
+	if (data->pos + data->len > end)
+		data->len = end - data->pos;
 
	/*
	 * Now that we have guaranteed that the space allocation will succeed,
@@ -77,7 +79,7 @@ iomap_apply(struct iomap_data *data, const struct iomap_ops *ops,
	 * iomap into the actors so that they don't need to have special
	 * handling for the two cases.
	 */
-	written = actor(inode, pos, length, data, &iomap,
+	written = actor(data, &iomap,
			srcmap.type != IOMAP_HOLE ? &srcmap : &iomap);
 
	/*
@@ -85,9 +87,9 @@ iomap_apply(struct iomap_data *data, const struct iomap_ops *ops,
	 * should not fail unless the filesystem has had a fatal error.
	 */
	if (ops->iomap_end) {
-		ret = ops->iomap_end(inode, pos, length,
+		ret = ops->iomap_end(data->inode, data->pos, data->len,
				     written > 0 ? written : 0,
-				     flags, &iomap);
+				     data->flags, &iomap);
	}
	return written ? written : ret;

diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 828444e14d09..0a1a195ed1cc 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -248,14 +248,15 @@ static inline bool iomap_block_needs_zeroing(struct inode *inode,
 }
 
 static loff_t
-iomap_readpage_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap, struct iomap *srcmap)
+iomap_readpage_actor(const struct iomap_data *data, struct iomap *iomap,
+		struct iomap *srcmap)
 {
-	struct iomap_readpage_ctx *ctx = data;
+	struct iomap_readpage_ctx *ctx = data->priv;
+	struct inode *inode = data->inode;
	struct page *page = ctx->cur_page;
	struct iomap_page *iop = iomap_page_create(inode, page);
	bool same_page = false, is_contig = false;
-	loff_t orig_pos = pos;
+	loff_t pos = data->pos, orig_pos = data->pos;
	unsigned poff, plen;
	sector_t sector;
 
@@ -266,7 +267,7 @@ iomap_readpage_actor(const struct iomap_data *data, struct iomap *iomap,
	}
 
	/* zero post-eof blocks as the page may be mapped */
-	iomap_adjust_read_range(inode, iop, &pos, length, &poff, &plen);
+	iomap_adjust_read_range(inode, iop, &pos, data->len, &poff, &plen);
	if (plen == 0)
		goto done;
 
@@ -302,7 +303,7 @@ iomap_readpage_actor(const struct iomap_data *data, struct iomap *iomap,
	if (!ctx->bio || !is_contig || bio_full(ctx->bio, plen)) {
		gfp_t gfp = mapping_gfp_constraint(page->mapping, GFP_KERNEL);
-		int nr_vecs = (length + PAGE_SIZE - 1) >> PAGE_SHIFT;
+		int nr_vecs = (data->len + PAGE_SIZE - 1) >> PAGE_SHIFT;
 
		if (ctx->bio)
			submit_bio(ctx->bio);
@@ -333,16 +334,20 @@ int
 iomap_readpage(struct page *page, const struct iomap_ops *ops)
 {
	struct iomap_readpage_ctx ctx = { .cur_page = page };
-	struct inode *inode = page->mapping->host;
+	struct iomap_data data = {
+		.inode	= page->mapping->host,
+		.priv	= &ctx,
+		.flags	= 0
+	};
	unsigned poff;
	loff_t ret;
 
	trace_iomap_readpage(page->mapping->host, 1);
 
	for (poff = 0; poff < PAGE_SIZE; poff += ret) {
-		ret = iomap_apply(inode, page_offset(page) + poff,
-				PAGE_SIZE - poff, 0, ops, &ctx,
-				iomap_readpage_actor);
+		data.pos = page_offset(page) + poff;
+		data.len = PAGE_SIZE - poff;
+		ret = iomap_apply(&data, ops, iomap_readpage_actor);
		if (ret <= 0) {
			WARN_ON_ONCE(ret == 0);
			SetPageError(page);
@@ -396,28 +401,34 @@ iomap_next_page(struct inode *inode, struct list_head *pages, loff_t pos,
 }
 
 static loff_t
-iomap_readpages_actor(struct inode *inode, loff_t pos, loff_t length,
-		void *data, struct iomap *iomap, struct iomap *srcmap)
+iomap_readpages_actor(const struct iomap_data *data, struct iomap *iomap,
+		struct iomap *srcmap)
 {
-	struct iomap_readpage_ctx *ctx = data;
+	struct iomap_readpage_ctx *ctx = data->priv;
	loff_t done, ret;
 
-	for (done = 0; done < length; done += ret) {
-		if (ctx->cur_page && offset_in_page(pos + done) == 0) {
+	for (done = 0; done < data->len; done += ret) {
+		struct iomap_data rp_data = {
+			.inode	= data->inode,
+			.pos	= data->pos + done,
+			.len	= data->len - done,
+			.priv	= ctx,
+		};
+
+		if (ctx->cur_page && offset_in_page(rp_data.pos) == 0) {
			if (!ctx->cur_page_in_bio)
				unlock_page(ctx->cur_page);
			put_page(ctx->cur_page);
			ctx->cur_page = NULL;
		}
		if (!ctx->cur_page) {
-			ctx->cur_page = iomap_next_page(inode, ctx->pages,
-					pos, length, &done);
+			ctx->cur_page = iomap_next_page(data->inode, ctx->pages,
+					data->pos, data->len, &done);
			if (!ctx->cur_page)
				break;
			ctx->cur_page_in_bio = false;
		}
-		ret = iomap_readpage_actor(inode, pos + done, length - done,
-				ctx, iomap, srcmap);
+		ret = iomap_readpage_actor(&rp_data, iomap, srcmap);
	}
 
	return done;
@@ -431,21 +442,27 @@ int
 iomap_readpages(struct address_space *mapping, struct list_head *pages,
 {
		.pages		= pages,
		.is_readahead	= true,
	};
-	loff_t pos = page_offset(list_entry(pages->prev, struct page, lru));
+	struct iomap_data data = {
+		.inode	= mapping->host,
+		.priv	= &ctx,
+		.flags	= 0
+	};
	loff_t last = page_offset(list_entry(pages->next, struct page, lru));
-	loff_t length = last - pos + PAGE_SIZE, ret = 0;
+	loff_t ret = 0;
+
+	data.pos = page_offset(list_entry(pages->prev, struct page, lru));
+	data.len = last - data.pos + PAGE_SIZE;
 
-	trace_iomap_readpages(mapping->host, nr_pages);
+	trace_iomap_readpages(data.inode, nr_pages);
 
-	while (length > 0) {
-		ret = iomap_apply(mapping->host, pos, length, 0, ops,
-				&ctx, iomap_readpages_actor);
+	while (data.len > 0) {
+		ret = iomap_apply(&data, ops, iomap_readpages_actor);
		if (ret <= 0) {
			WARN_ON_ONCE(ret == 0);
			goto done;
		}
-		pos += ret;
-		length -= ret;
+		data.pos += ret;
+		data.len -= ret;
	}
	ret = 0;
done:
@@ -796,10 +813,13 @@ iomap_write_end(struct inode *inode, loff_t pos, unsigned len, unsigned copied,
 }
 
 static loff_t
-iomap_write_actor(struct inode *inode, loff_t pos, loff_t length, void *data,
-		struct iomap *iomap, struct iomap *srcmap)
+iomap_write_actor(const struct iomap_data *data, struct iomap *iomap,
+		struct iomap *srcmap)
 {
-	struct iov_iter *i = data;
+	struct inode *inode = data->inode;
+	struct iov_iter *i = data->priv;
+	loff_t length = data->len;
+	loff_t pos = data->pos;
	long status = 0;
	ssize_t written = 0;
 
@@ -879,15 +899,20 @@ ssize_t
 iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *iter,
		const struct iomap_ops *ops)
 {
-	struct inode *inode = iocb->ki_filp->f_mapping->host;
-	loff_t pos = iocb->ki_pos, ret = 0, written = 0;
+	struct iomap_data data = {
+		.inode	= iocb->ki_filp->f_mapping->host,
+		.pos	= iocb->ki_pos,
+		.priv	= iter,
+		.flags	= IOMAP_WRITE
+	};
+	loff_t ret = 0, written = 0;
 
	while (iov_iter_count(iter)) {
-		ret = iomap_apply(inode, pos, iov_iter_count(iter),
-				IOMAP_WRITE, ops, iter, iomap_write_actor);
+		data.len = iov_iter_count(iter);
+		ret = iomap_apply(&data, ops, iomap_write_actor);
		if (ret <= 0)
			break;
-		pos += ret;
+		data.pos += ret;
		written += ret;
	}
 
@@ -896,9 +921,11 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *iter,
 EXPORT_SYMBOL_GPL(iomap_file_buffered_write);
 
 static loff_t
-iomap_unshare_actor(struct inode *inode, loff_t
pos, loff_t length, void *data, - struct iomap *iomap, struct iomap *srcmap) +iomap_unshare_actor(const struct iomap_data *data, struct iomap *iomap, + struct iomap *srcmap) { + loff_t pos = data->pos; + loff_t length = data->len; long status = 0; ssize_t written = 0; @@ -914,13 +941,13 @@ iomap_unshare_actor(struct inode *inode, loff_t pos, loff_t length, void *data, unsigned long bytes = min_t(loff_t, PAGE_SIZE - offset, length); struct page *page; - status = iomap_write_begin(inode, pos, bytes, + status = iomap_write_begin(data->inode, pos, bytes, IOMAP_WRITE_F_UNSHARE, &page, iomap, srcmap); if (unlikely(status)) return status; - status = iomap_write_end(inode, pos, bytes, bytes, page, iomap, - srcmap); + status = iomap_write_end(data->inode, pos, bytes, bytes, page, + iomap, srcmap); if (unlikely(status <= 0)) { if (WARN_ON_ONCE(status == 0)) return -EIO; @@ -933,7 +960,7 @@ iomap_unshare_actor(struct inode *inode, loff_t pos, loff_t length, void *data, written += status; length -= status; - balance_dirty_pages_ratelimited(inode->i_mapping); + balance_dirty_pages_ratelimited(data->inode->i_mapping); } while (length); return written; @@ -943,15 +970,20 @@ int iomap_file_unshare(struct inode *inode, loff_t pos, loff_t len, const struct iomap_ops *ops) { + struct iomap_data data = { + .inode = inode, + .pos = pos, + .len = len, + .flags = IOMAP_WRITE, + }; loff_t ret; - while (len) { - ret = iomap_apply(inode, pos, len, IOMAP_WRITE, ops, NULL, - iomap_unshare_actor); + while (data.len) { + ret = iomap_apply(&data, ops, iomap_unshare_actor); if (ret <= 0) return ret; - pos += ret; - len -= ret; + data.pos += ret; + data.len -= ret; } return 0; @@ -982,16 +1014,18 @@ static int iomap_dax_zero(loff_t pos, unsigned offset, unsigned bytes, } static loff_t -iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count, - void *data, struct iomap *iomap, struct iomap *srcmap) +iomap_zero_range_actor(const struct iomap_data *data, struct iomap *iomap, + struct 
iomap *srcmap) { - bool *did_zero = data; + bool *did_zero = data->priv; + loff_t count = data->len; + loff_t pos = data->pos; loff_t written = 0; int status; /* already zeroed? we're done. */ if (srcmap->type == IOMAP_HOLE || srcmap->type == IOMAP_UNWRITTEN) - return count; + return data->len; do { unsigned offset, bytes; @@ -999,11 +1033,11 @@ iomap_zero_range_actor(struct inode *inode, loff_t pos, loff_t count, offset = offset_in_page(pos); bytes = min_t(loff_t, PAGE_SIZE - offset, count); - if (IS_DAX(inode)) + if (IS_DAX(data->inode)) status = iomap_dax_zero(pos, offset, bytes, iomap); else - status = iomap_zero(inode, pos, offset, bytes, iomap, - srcmap); + status = iomap_zero(data->inode, pos, offset, bytes, + iomap, srcmap); if (status < 0) return status; @@ -1021,16 +1055,22 @@ int iomap_zero_range(struct inode *inode, loff_t pos, loff_t len, bool *did_zero, const struct iomap_ops *ops) { + struct iomap_data data = { + .inode = inode, + .pos = pos, + .len = len, + .priv = did_zero, + .flags = IOMAP_ZERO + }; loff_t ret; - while (len > 0) { - ret = iomap_apply(inode, pos, len, IOMAP_ZERO, - ops, did_zero, iomap_zero_range_actor); + while (data.len > 0) { + ret = iomap_apply(&data, ops, iomap_zero_range_actor); if (ret <= 0) return ret; - pos += ret; - len -= ret; + data.pos += ret; + data.len -= ret; } return 0; @@ -1052,57 +1092,59 @@ iomap_truncate_page(struct inode *inode, loff_t pos, bool *did_zero, EXPORT_SYMBOL_GPL(iomap_truncate_page); static loff_t -iomap_page_mkwrite_actor(struct inode *inode, loff_t pos, loff_t length, - void *data, struct iomap *iomap, struct iomap *srcmap) +iomap_page_mkwrite_actor(const struct iomap_data *data, + struct iomap *iomap, struct iomap *srcmap) { - struct page *page = data; + struct page *page = data->priv; int ret; if (iomap->flags & IOMAP_F_BUFFER_HEAD) { - ret = __block_write_begin_int(page, pos, length, NULL, iomap); + ret = __block_write_begin_int(page, data->pos, data->len, NULL, + iomap); if (ret) return ret; 
- block_commit_write(page, 0, length); + block_commit_write(page, 0, data->len); } else { WARN_ON_ONCE(!PageUptodate(page)); - iomap_page_create(inode, page); + iomap_page_create(data->inode, page); set_page_dirty(page); } - return length; + return data->len; } vm_fault_t iomap_page_mkwrite(struct vm_fault *vmf, const struct iomap_ops *ops) { struct page *page = vmf->page; - struct inode *inode = file_inode(vmf->vma->vm_file); - unsigned long length; - loff_t offset, size; + struct iomap_data data = { + .inode = file_inode(vmf->vma->vm_file), + .pos = page_offset(page), + .flags = IOMAP_WRITE | IOMAP_FAULT, + .priv = page, + }; ssize_t ret; + loff_t size; lock_page(page); - size = i_size_read(inode); - offset = page_offset(page); - if (page->mapping != inode->i_mapping || offset > size) { + size = i_size_read(data.inode); + if (page->mapping != data.inode->i_mapping || data.pos > size) { /* We overload EFAULT to mean page got truncated */ ret = -EFAULT; goto out_unlock; } /* page is wholly or partially inside EOF */ - if (offset > size - PAGE_SIZE) - length = offset_in_page(size); + if (data.pos > size - PAGE_SIZE) + data.len = offset_in_page(size); else - length = PAGE_SIZE; + data.len = PAGE_SIZE; - while (length > 0) { - ret = iomap_apply(inode, offset, length, - IOMAP_WRITE | IOMAP_FAULT, ops, page, - iomap_page_mkwrite_actor); + while (data.len > 0) { + ret = iomap_apply(&data, ops, iomap_page_mkwrite_actor); if (unlikely(ret <= 0)) goto out_unlock; - offset += ret; - length -= ret; + data.pos += ret; + data.len -= ret; } wait_for_stable_page(page); diff --git a/fs/iomap/direct-io.c b/fs/iomap/direct-io.c index 23837926c0c5..e561ca9329ac 100644 --- a/fs/iomap/direct-io.c +++ b/fs/iomap/direct-io.c @@ -364,24 +364,27 @@ iomap_dio_inline_actor(struct inode *inode, loff_t pos, loff_t length, } static loff_t -iomap_dio_actor(struct inode *inode, loff_t pos, loff_t length, - void *data, struct iomap *iomap, struct iomap *srcmap) +iomap_dio_actor(const struct 
iomap_data *data, struct iomap *iomap, + struct iomap *srcmap) { - struct iomap_dio *dio = data; + struct iomap_dio *dio = data->priv; switch (iomap->type) { case IOMAP_HOLE: if (WARN_ON_ONCE(dio->flags & IOMAP_DIO_WRITE)) return -EIO; - return iomap_dio_hole_actor(length, dio); + return iomap_dio_hole_actor(data->len, dio); case IOMAP_UNWRITTEN: if (!(dio->flags & IOMAP_DIO_WRITE)) - return iomap_dio_hole_actor(length, dio); - return iomap_dio_bio_actor(inode, pos, length, dio, iomap); + return iomap_dio_hole_actor(data->len, dio); + return iomap_dio_bio_actor(data->inode, data->pos, data->len, + dio, iomap); case IOMAP_MAPPED: - return iomap_dio_bio_actor(inode, pos, length, dio, iomap); + return iomap_dio_bio_actor(data->inode, data->pos, data->len, + dio, iomap); case IOMAP_INLINE: - return iomap_dio_inline_actor(inode, pos, length, dio, iomap); + return iomap_dio_inline_actor(data->inode, data->pos, data->len, + dio, iomap); default: WARN_ON_ONCE(1); return -EIO; @@ -404,16 +407,19 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, { struct address_space *mapping = iocb->ki_filp->f_mapping; struct inode *inode = file_inode(iocb->ki_filp); - size_t count = iov_iter_count(iter); - loff_t pos = iocb->ki_pos; - loff_t end = iocb->ki_pos + count - 1, ret = 0; - unsigned int flags = IOMAP_DIRECT; + struct iomap_data data = { + .inode = inode, + .pos = iocb->ki_pos, + .len = iov_iter_count(iter), + .flags = IOMAP_DIRECT + }; + loff_t end = data.pos + data.len - 1, ret = 0; struct blk_plug plug; struct iomap_dio *dio; lockdep_assert_held(&inode->i_rwsem); - if (!count) + if (!data.len) return 0; if (WARN_ON(is_sync_kiocb(iocb) && !wait_for_completion)) @@ -436,14 +442,16 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, dio->submit.cookie = BLK_QC_T_NONE; dio->submit.last_queue = NULL; + data.priv = dio; + if (iov_iter_rw(iter) == READ) { - if (pos >= dio->i_size) + if (data.pos >= dio->i_size) goto out_free_dio; if (iter_is_iovec(iter)) dio->flags |= 
IOMAP_DIO_DIRTY; } else { - flags |= IOMAP_WRITE; + data.flags |= IOMAP_WRITE; dio->flags |= IOMAP_DIO_WRITE; /* for data sync or sync, we need sync completion processing */ @@ -461,14 +469,14 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, } if (iocb->ki_flags & IOCB_NOWAIT) { - if (filemap_range_has_page(mapping, pos, end)) { + if (filemap_range_has_page(mapping, data.pos, end)) { ret = -EAGAIN; goto out_free_dio; } - flags |= IOMAP_NOWAIT; + data.flags |= IOMAP_NOWAIT; } - ret = filemap_write_and_wait_range(mapping, pos, end); + ret = filemap_write_and_wait_range(mapping, data.pos, end); if (ret) goto out_free_dio; @@ -479,7 +487,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, * pretty crazy thing to do, so we don't support it 100%. */ ret = invalidate_inode_pages2_range(mapping, - pos >> PAGE_SHIFT, end >> PAGE_SHIFT); + data.pos >> PAGE_SHIFT, end >> PAGE_SHIFT); if (ret) dio_warn_stale_pagecache(iocb->ki_filp); ret = 0; @@ -495,8 +503,7 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, blk_start_plug(&plug); do { - ret = iomap_apply(inode, pos, count, flags, ops, dio, - iomap_dio_actor); + ret = iomap_apply(&data, ops, iomap_dio_actor); if (ret <= 0) { /* magic error code to fall back to buffered I/O */ if (ret == -ENOTBLK) { @@ -505,18 +512,18 @@ iomap_dio_rw(struct kiocb *iocb, struct iov_iter *iter, } break; } - pos += ret; + data.pos += ret; - if (iov_iter_rw(iter) == READ && pos >= dio->i_size) { + if (iov_iter_rw(iter) == READ && data.pos >= dio->i_size) { /* * We only report that we've read data up to i_size. * Revert iter to a state corresponding to that as * some callers (such as splice code) rely on it. 
*/ - iov_iter_revert(iter, pos - dio->i_size); + iov_iter_revert(iter, data.pos - dio->i_size); break; } - } while ((count = iov_iter_count(iter)) > 0); + } while ((data.len = iov_iter_count(iter)) > 0); blk_finish_plug(&plug); if (ret < 0) diff --git a/fs/iomap/fiemap.c b/fs/iomap/fiemap.c index bccf305ea9ce..4075fbe0e3f5 100644 --- a/fs/iomap/fiemap.c +++ b/fs/iomap/fiemap.c @@ -43,20 +43,20 @@ static int iomap_to_fiemap(struct fiemap_extent_info *fi, } static loff_t -iomap_fiemap_actor(struct inode *inode, loff_t pos, loff_t length, void *data, - struct iomap *iomap, struct iomap *srcmap) +iomap_fiemap_actor(const struct iomap_data *data, struct iomap *iomap, + struct iomap *srcmap) { - struct fiemap_ctx *ctx = data; - loff_t ret = length; + struct fiemap_ctx *ctx = data->priv; + loff_t ret = data->len; if (iomap->type == IOMAP_HOLE) - return length; + return data->len; ret = iomap_to_fiemap(ctx->fi, &ctx->prev, 0); ctx->prev = *iomap; switch (ret) { case 0: /* success */ - return length; + return data->len; case 1: /* extent array full */ return 0; default: @@ -68,6 +68,13 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi, loff_t start, loff_t len, const struct iomap_ops *ops) { struct fiemap_ctx ctx; + struct iomap_data data = { + .inode = inode, + .pos = start, + .len = len, + .flags = IOMAP_REPORT, + .priv = &ctx + }; loff_t ret; memset(&ctx, 0, sizeof(ctx)); @@ -84,9 +91,8 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi, return ret; } - while (len > 0) { - ret = iomap_apply(inode, start, len, IOMAP_REPORT, ops, &ctx, - iomap_fiemap_actor); + while (data.len > 0) { + ret = iomap_apply(&data, ops, iomap_fiemap_actor); /* inode with no (attribute) mapping will give ENOENT */ if (ret == -ENOENT) break; @@ -95,8 +101,8 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi, if (ret == 0) break; - start += ret; - len -= ret; + data.pos += ret; + data.len -= ret; } if (ctx.prev.type != IOMAP_HOLE) { @@ 
-110,13 +116,14 @@ int iomap_fiemap(struct inode *inode, struct fiemap_extent_info *fi, EXPORT_SYMBOL_GPL(iomap_fiemap); static loff_t -iomap_bmap_actor(struct inode *inode, loff_t pos, loff_t length, - void *data, struct iomap *iomap, struct iomap *srcmap) +iomap_bmap_actor(const struct iomap_data *data, struct iomap *iomap, + struct iomap *srcmap) { - sector_t *bno = data, addr; + sector_t *bno = data->priv, addr; if (iomap->type == IOMAP_MAPPED) { - addr = (pos - iomap->offset + iomap->addr) >> inode->i_blkbits; + addr = (data->pos - iomap->offset + iomap->addr) >> + data->inode->i_blkbits; if (addr > INT_MAX) WARN(1, "would truncate bmap result\n"); else @@ -131,16 +138,19 @@ iomap_bmap(struct address_space *mapping, sector_t bno, const struct iomap_ops *ops) { struct inode *inode = mapping->host; - loff_t pos = bno << inode->i_blkbits; - unsigned blocksize = i_blocksize(inode); + struct iomap_data data = { + .inode = inode, + .pos = bno << inode->i_blkbits, + .len = i_blocksize(inode), + .priv = &bno + }; int ret; if (filemap_write_and_wait(mapping)) return 0; bno = 0; - ret = iomap_apply(inode, pos, blocksize, 0, ops, &bno, - iomap_bmap_actor); + ret = iomap_apply(&data, ops, iomap_bmap_actor); if (ret) return 0; return bno; diff --git a/fs/iomap/seek.c b/fs/iomap/seek.c index 89f61d93c0bc..288bee0b5d9b 100644 --- a/fs/iomap/seek.c +++ b/fs/iomap/seek.c @@ -118,21 +118,23 @@ page_cache_seek_hole_data(struct inode *inode, loff_t offset, loff_t length, static loff_t -iomap_seek_hole_actor(struct inode *inode, loff_t offset, loff_t length, - void *data, struct iomap *iomap, struct iomap *srcmap) +iomap_seek_hole_actor(const struct iomap_data *data, struct iomap *iomap, + struct iomap *srcmap) { + loff_t offset = data->pos; + switch (iomap->type) { case IOMAP_UNWRITTEN: - offset = page_cache_seek_hole_data(inode, offset, length, - SEEK_HOLE); + offset = page_cache_seek_hole_data(data->inode, offset, + data->len, SEEK_HOLE); if (offset < 0) - return length; + 
return data->len; /* fall through */ case IOMAP_HOLE: - *(loff_t *)data = offset; + *(loff_t *)data->priv = offset; return 0; default: - return length; + return data->len; } } @@ -140,23 +142,28 @@ loff_t iomap_seek_hole(struct inode *inode, loff_t offset, const struct iomap_ops *ops) { loff_t size = i_size_read(inode); - loff_t length = size - offset; + struct iomap_data data = { + .inode = inode, + .len = size - offset, + .priv = &offset, + .flags = IOMAP_REPORT + }; loff_t ret; /* Nothing to be found before or beyond the end of the file. */ if (offset < 0 || offset >= size) return -ENXIO; - while (length > 0) { - ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops, - &offset, iomap_seek_hole_actor); + while (data.len > 0) { + data.pos = offset; + ret = iomap_apply(&data, ops, iomap_seek_hole_actor); if (ret < 0) return ret; if (ret == 0) break; offset += ret; - length -= ret; + data.len -= ret; } return offset; @@ -164,20 +171,22 @@ iomap_seek_hole(struct inode *inode, loff_t offset, const struct iomap_ops *ops) EXPORT_SYMBOL_GPL(iomap_seek_hole); static loff_t -iomap_seek_data_actor(struct inode *inode, loff_t offset, loff_t length, - void *data, struct iomap *iomap, struct iomap *srcmap) +iomap_seek_data_actor(const struct iomap_data *data, struct iomap *iomap, + struct iomap *srcmap) { + loff_t offset = data->pos; + switch (iomap->type) { case IOMAP_HOLE: - return length; + return data->len; case IOMAP_UNWRITTEN: - offset = page_cache_seek_hole_data(inode, offset, length, - SEEK_DATA); + offset = page_cache_seek_hole_data(data->inode, offset, + data->len, SEEK_DATA); if (offset < 0) - return length; + return data->len; /*FALLTHRU*/ default: - *(loff_t *)data = offset; + *(loff_t *)data->priv = offset; return 0; } } @@ -186,26 +195,31 @@ loff_t iomap_seek_data(struct inode *inode, loff_t offset, const struct iomap_ops *ops) { loff_t size = i_size_read(inode); - loff_t length = size - offset; + struct iomap_data data = { + .inode = inode, + .len = size - 
offset, + .priv = &offset, + .flags = IOMAP_REPORT + }; loff_t ret; /* Nothing to be found before or beyond the end of the file. */ if (offset < 0 || offset >= size) return -ENXIO; - while (length > 0) { - ret = iomap_apply(inode, offset, length, IOMAP_REPORT, ops, - &offset, iomap_seek_data_actor); + while (data.len > 0) { + data.pos = offset; + ret = iomap_apply(&data, ops, iomap_seek_data_actor); if (ret < 0) return ret; if (ret == 0) break; offset += ret; - length -= ret; + data.len -= ret; } - if (length <= 0) + if (data.len <= 0) return -ENXIO; return offset; } diff --git a/fs/iomap/swapfile.c b/fs/iomap/swapfile.c index a648dbf6991e..d911ab4b69ea 100644 --- a/fs/iomap/swapfile.c +++ b/fs/iomap/swapfile.c @@ -75,11 +75,10 @@ static int iomap_swapfile_add_extent(struct iomap_swapfile_info *isi) * swap only cares about contiguous page-aligned physical extents and makes no * distinction between written and unwritten extents. */ -static loff_t iomap_swapfile_activate_actor(struct inode *inode, loff_t pos, - loff_t count, void *data, struct iomap *iomap, - struct iomap *srcmap) +static loff_t iomap_swapfile_activate_actor(const struct iomap_data *data, + struct iomap *iomap, struct iomap *srcmap) { - struct iomap_swapfile_info *isi = data; + struct iomap_swapfile_info *isi = data->priv; int error; switch (iomap->type) { @@ -125,7 +124,7 @@ static loff_t iomap_swapfile_activate_actor(struct inode *inode, loff_t pos, return error; memcpy(&isi->iomap, iomap, sizeof(isi->iomap)); } - return count; + return data->len; } /* @@ -142,8 +141,13 @@ int iomap_swapfile_activate(struct swap_info_struct *sis, }; struct address_space *mapping = swap_file->f_mapping; struct inode *inode = mapping->host; - loff_t pos = 0; - loff_t len = ALIGN_DOWN(i_size_read(inode), PAGE_SIZE); + struct iomap_data data = { + .inode = inode, + .pos = 0, + .len = ALIGN_DOWN(i_size_read(inode), PAGE_SIZE), + .priv = &isi, + .flags = IOMAP_REPORT + }; loff_t ret; /* @@ -154,14 +158,13 @@ int 
iomap_swapfile_activate(struct swap_info_struct *sis,
 	if (ret)
 		return ret;

-	while (len > 0) {
-		ret = iomap_apply(inode, pos, len, IOMAP_REPORT,
-				ops, &isi, iomap_swapfile_activate_actor);
+	while (data.len > 0) {
+		ret = iomap_apply(&data, ops, iomap_swapfile_activate_actor);
 		if (ret <= 0)
 			return ret;

-		pos += ret;
-		len -= ret;
+		data.pos += ret;
+		data.len -= ret;
 	}

 	if (isi.iomap.length) {
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 8b09463dae0d..30f40145a9e9 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -145,11 +145,18 @@ struct iomap_ops {
 /*
  * Main iomap iterator function.
  */
-typedef loff_t (*iomap_actor_t)(struct inode *inode, loff_t pos, loff_t len,
-		void *data, struct iomap *iomap, struct iomap *srcmap);
+struct iomap_data {
+	struct inode *inode;
+	loff_t pos;
+	loff_t len;
+	void *priv;
+	unsigned flags;
+};
+
+typedef loff_t (*iomap_actor_t)(const struct iomap_data *data,
+		struct iomap *iomap, struct iomap *srcmap);

-loff_t iomap_apply(struct inode *inode, loff_t pos, loff_t length,
-		unsigned flags, const struct iomap_ops *ops, void *data,
+loff_t iomap_apply(struct iomap_data *data, const struct iomap_ops *ops,
 		iomap_actor_t actor);

 ssize_t iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *from,

From patchwork Thu Dec 12 19:01:33 2019
X-Patchwork-Submitter: Jens Axboe
X-Patchwork-Id: 11289241
From: Jens Axboe
To: linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
	linux-block@vger.kernel.org
Cc: willy@infradead.org, clm@fb.com, torvalds@linux-foundation.org,
	david@fromorbit.com, Jens Axboe
Subject: [PATCH 5/5] iomap: support RWF_UNCACHED for buffered writes
Date: Thu, 12 Dec 2019 12:01:33 -0700
Message-Id: <20191212190133.18473-6-axboe@kernel.dk>
X-Mailer: git-send-email 2.24.1
In-Reply-To: <20191212190133.18473-1-axboe@kernel.dk>
References: <20191212190133.18473-1-axboe@kernel.dk>
X-Mailing-List: linux-block@vger.kernel.org

This adds support for RWF_UNCACHED for file systems using iomap to
perform buffered writes. We use the generic infrastructure for this, by
tracking pages we created and calling write_drop_cached_pages() to
issue writeback and prune those pages.
Signed-off-by: Jens Axboe
---
 fs/iomap/apply.c       | 24 ++++++++++++++++++++++++
 fs/iomap/buffered-io.c | 23 +++++++++++++++++++----
 include/linux/iomap.h  |  5 +++++
 3 files changed, 48 insertions(+), 4 deletions(-)

diff --git a/fs/iomap/apply.c b/fs/iomap/apply.c
index e76148db03b8..11b6812f7b37 100644
--- a/fs/iomap/apply.c
+++ b/fs/iomap/apply.c
@@ -92,5 +92,29 @@ iomap_apply(struct iomap_data *data, const struct iomap_ops *ops,
 				data->flags, &iomap);
 	}

+	if (written && (data->flags & IOMAP_UNCACHED)) {
+		struct address_space *mapping = data->inode->i_mapping;
+
+		end = data->pos + written;
+		ret = filemap_write_and_wait_range(mapping, data->pos, end);
+		if (ret)
+			goto out;
+
+		/*
+		 * No pages were created for this range, we're done
+		 */
+		if (!(iomap.flags & IOMAP_F_PAGE_CREATE))
+			goto out;
+
+		/*
+		 * Try to invalidate cache pages for the range we just wrote.
+		 * We don't care if invalidation fails as the write has still
+		 * worked and leaving clean uptodate pages in the page cache
+		 * isn't a corruption vector for uncached IO.
+		 */
+		invalidate_inode_pages2_range(mapping,
+				data->pos >> PAGE_SHIFT, end >> PAGE_SHIFT);
+	}
+out:
 	return written ? written : ret;
 }
diff --git a/fs/iomap/buffered-io.c b/fs/iomap/buffered-io.c
index 0a1a195ed1cc..df9d6002858e 100644
--- a/fs/iomap/buffered-io.c
+++ b/fs/iomap/buffered-io.c
@@ -659,6 +659,7 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		struct page **pagep, struct iomap *iomap, struct iomap *srcmap)
 {
 	const struct iomap_page_ops *page_ops = iomap->page_ops;
+	unsigned aop_flags;
 	struct page *page;
 	int status = 0;

@@ -675,8 +676,11 @@ iomap_write_begin(struct inode *inode, loff_t pos, unsigned len, unsigned flags,
 		return status;
 	}

+	aop_flags = AOP_FLAG_NOFS;
+	if (flags & IOMAP_UNCACHED)
+		aop_flags |= AOP_FLAG_UNCACHED;
 	page = grab_cache_page_write_begin(inode->i_mapping, pos >> PAGE_SHIFT,
-			AOP_FLAG_NOFS);
+			aop_flags);
 	if (!page) {
 		status = -ENOMEM;
 		goto out_no_page;
@@ -818,6 +822,7 @@ iomap_write_actor(const struct iomap_data *data, struct iomap *iomap,
 {
 	struct inode *inode = data->inode;
 	struct iov_iter *i = data->priv;
+	unsigned flags = data->flags;
 	loff_t length = data->len;
 	loff_t pos = data->pos;
 	long status = 0;
@@ -851,10 +856,17 @@ iomap_write_actor(const struct iomap_data *data, struct iomap *iomap,
 			break;
 		}

-		status = iomap_write_begin(inode, pos, bytes, 0, &page, iomap,
-				srcmap);
-		if (unlikely(status))
+retry:
+		status = iomap_write_begin(inode, pos, bytes, flags,
+				&page, iomap, srcmap);
+		if (unlikely(status)) {
+			if (status == -ENOMEM && (flags & IOMAP_UNCACHED)) {
+				iomap->flags |= IOMAP_F_PAGE_CREATE;
+				flags &= ~IOMAP_UNCACHED;
+				goto retry;
+			}
 			break;
+		}

 		if (mapping_writably_mapped(inode->i_mapping))
 			flush_dcache_page(page);
@@ -907,6 +919,9 @@ iomap_file_buffered_write(struct kiocb *iocb, struct iov_iter *iter,
 	};
 	loff_t ret = 0, written = 0;

+	if (iocb->ki_flags & IOCB_UNCACHED)
+		data.flags |= IOMAP_UNCACHED;
+
 	while (iov_iter_count(iter)) {
 		data.len = iov_iter_count(iter);
 		ret = iomap_apply(&data, ops, iomap_write_actor);
diff --git a/include/linux/iomap.h b/include/linux/iomap.h
index 30f40145a9e9..30bb248e1d0d 100644
--- a/include/linux/iomap.h
+++ b/include/linux/iomap.h
@@ -48,12 +48,16 @@ struct vm_fault;
  *
  * IOMAP_F_BUFFER_HEAD indicates that the file system requires the use of
  * buffer heads for this mapping.
+ *
+ * IOMAP_F_PAGE_CREATE indicates that pages had to be allocated to satisfy
+ * this operation.
  */
 #define IOMAP_F_NEW		0x01
 #define IOMAP_F_DIRTY		0x02
 #define IOMAP_F_SHARED		0x04
 #define IOMAP_F_MERGED		0x08
 #define IOMAP_F_BUFFER_HEAD	0x10
+#define IOMAP_F_PAGE_CREATE	0x20

 /*
  * Flags set by the core iomap code during operations:
@@ -121,6 +125,7 @@ struct iomap_page_ops {
 #define IOMAP_FAULT		(1 << 3) /* mapping for page fault */
 #define IOMAP_DIRECT		(1 << 4) /* direct I/O */
 #define IOMAP_NOWAIT		(1 << 5) /* do not block */
+#define IOMAP_UNCACHED		(1 << 6)

 struct iomap_ops {
 	/*