From patchwork Wed Mar 27 09:15:17 2019
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kirill Smelkov
X-Patchwork-Id: 10873087
From: Kirill Smelkov
Subject: [RESEND1, PATCH 1/2] fuse: convert printk -> pr_*
X-Mailer: git-send-email 2.21.0.392.gf8f6787159
To: Miklos Szeredi, Miklos Szeredi
Cc: Brian Foster, Maxim Patlasov, Anatol Pomozov, Pavel Emelyanov, Andrew Gallagher, "Anand V . Avati", Alexey Kuznetsov, Andrey Ryabinin, Kirill Tkhai, Constantine Shulyupin, Chad Austin, Dan Schatzberg, Han-Wen Nienhuys, Andrew Morton, Kirill Smelkov
Date: Wed, 27 Mar 2019 09:15:17 +0000
Sender: linux-fsdevel-owner@vger.kernel.org
X-Mailing-List: linux-fsdevel@vger.kernel.org

Functions, like pr_err, are a more modern variant of printing compared to printk.
They can be used to denoise sources by encoding the log level in the
print function name, and by automatically inserting a per-driver /
function / ... print prefix as defined by the pr_fmt macro. pr_* are also
referred to in Documentation/process/coding-style.rst, and more recent
code - for example overlayfs - uses them instead of printk.

Convert CUSE and FUSE to use the new pr_* functions.

CUSE output stays completely unchanged, while FUSE output is amended a
bit: the second line of the "trying to steal weird page" warning now also
comes with the "fuse:" prefix, and the init and exit messages change from
"fuse init ..." / "fuse exit" to "fuse: init ..." / "fuse: exit".
I hope it is ok.

Suggested-by: Kirill Tkhai
Signed-off-by: Kirill Smelkov
Reviewed-by: Kirill Tkhai
---
 fs/fuse/cuse.c   | 13 +++++++------
 fs/fuse/dev.c    |  4 ++--
 fs/fuse/fuse_i.h |  4 ++++
 fs/fuse/inode.c  |  6 +++---
 4 files changed, 16 insertions(+), 11 deletions(-)

diff --git a/fs/fuse/cuse.c b/fs/fuse/cuse.c
index 55a26f351467..4b41df1d4642 100644
--- a/fs/fuse/cuse.c
+++ b/fs/fuse/cuse.c
@@ -33,6 +33,8 @@
  * closed.
  */
 
+#define pr_fmt(fmt) "CUSE: " fmt
+
 #include
 #include
 #include
@@ -225,7 +227,7 @@ static int cuse_parse_one(char **pp, char *end, char **keyp, char **valp)
 		return 0;
 
 	if (end[-1] != '\0') {
-		printk(KERN_ERR "CUSE: info not properly terminated\n");
+		pr_err("info not properly terminated\n");
 		return -EINVAL;
 	}
 
@@ -242,7 +244,7 @@ static int cuse_parse_one(char **pp, char *end, char **keyp, char **valp)
 		key = strstrip(key);
 
 	if (!strlen(key)) {
-		printk(KERN_ERR "CUSE: zero length info key specified\n");
+		pr_err("zero length info key specified\n");
 		return -EINVAL;
 	}
 
@@ -282,12 +284,11 @@ static int cuse_parse_devinfo(char *p, size_t len, struct cuse_devinfo *devinfo)
 		if (strcmp(key, "DEVNAME") == 0)
 			devinfo->name = val;
 		else
-			printk(KERN_WARNING "CUSE: unknown device info \"%s\"\n",
-			       key);
+			pr_warn("unknown device info \"%s\"\n", key);
 	}
 
 	if (!devinfo->name || !strlen(devinfo->name)) {
-		printk(KERN_ERR "CUSE: DEVNAME unspecified\n");
+		pr_err("DEVNAME unspecified\n");
 		return -EINVAL;
 	}
 
@@ -341,7 +342,7 @@ static void cuse_process_init_reply(struct fuse_conn *fc, struct fuse_req *req)
 	else
 		rc = register_chrdev_region(devt, 1, devinfo.name);
 	if (rc) {
-		printk(KERN_ERR "CUSE: failed to register chrdev region\n");
+		pr_err("failed to register chrdev region\n");
 		goto err;
 	}
 
diff --git a/fs/fuse/dev.c b/fs/fuse/dev.c
index 8a63e52785e9..ccb4c3980829 100644
--- a/fs/fuse/dev.c
+++ b/fs/fuse/dev.c
@@ -906,8 +906,8 @@ static int fuse_check_page(struct page *page)
 	       1 << PG_lru |
 	       1 << PG_active |
 	       1 << PG_reclaim))) {
-		printk(KERN_WARNING "fuse: trying to steal weird page\n");
-		printk(KERN_WARNING " page=%p index=%li flags=%08lx, count=%i, mapcount=%i, mapping=%p\n", page, page->index, page->flags, page_count(page), page_mapcount(page), page->mapping);
+		pr_warn("trying to steal weird page\n");
+		pr_warn(" page=%p index=%li flags=%08lx, count=%i, mapcount=%i, mapping=%p\n", page, page->index, page->flags, page_count(page), page_mapcount(page), page->mapping);
 		return 1;
 	}
 	return 0;
diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 0920c0c032a0..e6195bc8f836 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -9,6 +9,10 @@
 #ifndef _FS_FUSE_I_H
 #define _FS_FUSE_I_H
 
+#ifndef pr_fmt
+# define pr_fmt(fmt) "fuse: " fmt
+#endif
+
 #include
 #include
 #include
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 1b3f3b67d9f0..1bca5023bcc5 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -1397,8 +1397,8 @@ static int __init fuse_init(void)
 {
 	int res;
 
-	printk(KERN_INFO "fuse init (API version %i.%i)\n",
-	       FUSE_KERNEL_VERSION, FUSE_KERNEL_MINOR_VERSION);
+	pr_info("init (API version %i.%i)\n",
+		FUSE_KERNEL_VERSION, FUSE_KERNEL_MINOR_VERSION);
 
 	INIT_LIST_HEAD(&fuse_conn_list);
 	res = fuse_fs_init();
@@ -1434,7 +1434,7 @@ static int __init fuse_init(void)
 
 static void __exit fuse_exit(void)
 {
-	printk(KERN_DEBUG "fuse exit\n");
+	pr_debug("exit\n");
 
 	fuse_ctl_cleanup();
 	fuse_sysfs_cleanup();

From patchwork Wed Mar 27 10:14:12 2019
Content-Type: text/plain;
 charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Kirill Smelkov
X-Patchwork-Id: 10873237
From: Kirill Smelkov
Subject: [RESEND1, PATCH v2 2/2] fuse: allow filesystems to have precise control over data cache
X-Mailer: git-send-email 2.21.0.392.gf8f6787159
To: Miklos Szeredi, Miklos Szeredi
Cc: Brian Foster, Maxim Patlasov, Anatol Pomozov, Pavel Emelyanov, Andrew Gallagher, "Anand V . Avati", Alexey Kuznetsov, Andrey Ryabinin, Kirill Tkhai, Constantine Shulyupin, Chad Austin, Dan Schatzberg, Han-Wen Nienhuys, Andrew Morton, Kirill Smelkov
Date: Wed, 27 Mar 2019 10:14:12 +0000
Sender: linux-fsdevel-owner@vger.kernel.org
X-Mailing-List: linux-fsdevel@vger.kernel.org

On networked filesystems file data can be changed externally.
FUSE provides notification messages that let a filesystem inform the
kernel that the metadata or a data region of a file needs to be
invalidated in the local page cache. That provides the basis for
filesystem implementations to invalidate the kernel cache precisely,
based on observed filesystem-specific events.

FUSE also has an "automatic" invalidation mode(*), in which the kernel
automatically invalidates the data cache of a file if it sees an mtime
change. It also automatically invalidates the whole data cache of a file
if it sees the file size change.

The automatic mode has a corresponding capability - FUSE_AUTO_INVAL_DATA.
However, probably for historical reasons, that capability controls only
whether an mtime change results in automatic invalidation. A change in
file size always results in invalidating the whole data cache of a file,
regardless of whether FUSE_AUTO_INVAL_DATA was negotiated(+).

The filesystem I write[1] represents data arrays stored in a networked
database as local files suitable for mmap. It is a read-only filesystem -
changes to data are committed externally via database interfaces, and the
filesystem only glues data into contiguous file streams suitable for mmap
and traditional array processing. The files are big - hundreds of
gigabytes and more. The files change regularly, frequently by data being
appended to their end. The size of the files thus changes frequently.

If a file was accessed locally and some part of its data got into the
page cache, we want that data to stay cached unless there is memory
pressure, or unless the corresponding part of the file was actually
changed. However, the current FUSE behaviour - when it sees a file size
change - is to invalidate the whole file. The data cache of the file is
thus completely lost even on a small size change, even though the
filesystem server is careful to accurately translate database changes
into FUSE invalidation messages to the kernel.
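
For illustration only, the behaviour described so far can be modelled in
a few lines of userspace C. This is a sketch, not the kernel code:
struct cached_attr and current_inval_on_attr_change() are hypothetical
names, the mtime comparison is reduced to whole seconds, and writeback
mode (see (+)) is left out.

```c
/* Userspace model of the pre-patch decision; not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, simplified view of the attributes the kernel last saw. */
struct cached_attr {
	uint64_t size;   /* file size in bytes */
	int64_t  mtime;  /* modification time, whole seconds only */
};

/*
 * Returns true when the whole data cache of the file would be dropped
 * after the filesystem reports new attributes.
 */
static bool current_inval_on_attr_change(const struct cached_attr *old_attr,
					 const struct cached_attr *new_attr,
					 bool auto_inval_data)
{
	assert(old_attr != NULL && new_attr != NULL);

	/* A size change invalidates everything, whatever was negotiated. */
	if (old_attr->size != new_attr->size)
		return true;

	/* An mtime change only matters with FUSE_AUTO_INVAL_DATA. */
	if (auto_inval_data && old_attr->mtime != new_attr->mtime)
		return true;

	return false;
}
```

In this model an append that only grows the size returns true even with
auto_inval_data == false, which is the case the text above complains
about.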
Let's fix it: if a filesystem, through the new FUSE_PRECISE_INVAL_DATA
capability, indicates to the kernel that it is fully responsible for data
cache invalidation, then the kernel won't invalidate a file's data cache
on size change and will only truncate that cache to the new size in case
the size decreased.

(*) see 72d0d248ca "fuse: add FUSE_AUTO_INVAL_DATA init flag",
    eed2179efe "fuse: invalidate inode mapping if mtime changes"

(+) in writeback mode the kernel does not invalidate the data cache on
    file size change, but neither does it allow the filesystem to set the
    size due to an external event (see 8373200b12 "fuse: Trust kernel
    i_size only")

[1] https://lab.nexedi.com/kirr/wendelin.core/blob/a50f1d9f/wcfs/wcfs.go#L20

Signed-off-by: Kirill Smelkov
---
 fs/fuse/fuse_i.h          |  3 +++
 fs/fuse/inode.c           | 12 ++++++++++--
 include/uapi/linux/fuse.h |  7 ++++++-
 3 files changed, 19 insertions(+), 3 deletions(-)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index e6195bc8f836..154f6cdd94d1 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -694,6 +694,9 @@ struct fuse_conn {
 	/** Use enhanced/automatic page cache invalidation. */
 	unsigned auto_inval_data:1;
 
+	/** Filesystem is fully responsible for page cache invalidation. */
+	unsigned precise_inval_data:1;
+
 	/** Does the filesystem support readdirplus? */
 	unsigned do_readdirplus:1;
 
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 1bca5023bcc5..2be0bca7f76c 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -237,7 +237,8 @@ void fuse_change_attributes(struct inode *inode, struct fuse_attr *attr,
 
 		if (oldsize != attr->size) {
 			truncate_pagecache(inode, attr->size);
-			inval = true;
+			if (!fc->precise_inval_data)
+				inval = true;
 		} else if (fc->auto_inval_data) {
 			struct timespec64 new_mtime = {
 				.tv_sec = attr->mtime,
@@ -912,6 +913,13 @@ static void process_init_reply(struct fuse_conn *fc, struct fuse_req *req)
 				fc->dont_mask = 1;
 			if (arg->flags & FUSE_AUTO_INVAL_DATA)
 				fc->auto_inval_data = 1;
+			if (arg->flags & FUSE_PRECISE_INVAL_DATA)
+				fc->precise_inval_data = 1;
+			if (fc->auto_inval_data && fc->precise_inval_data) {
+				pr_warn("filesystem requested both auto and "
+					"precise cache control - using auto\n");
+				fc->precise_inval_data = 0;
+			}
 			if (arg->flags & FUSE_DO_READDIRPLUS) {
 				fc->do_readdirplus = 1;
 				if (arg->flags & FUSE_READDIRPLUS_AUTO)
@@ -973,7 +981,7 @@ static void fuse_send_init(struct fuse_conn *fc, struct fuse_req *req)
 		FUSE_WRITEBACK_CACHE | FUSE_NO_OPEN_SUPPORT |
 		FUSE_PARALLEL_DIROPS | FUSE_HANDLE_KILLPRIV | FUSE_POSIX_ACL |
 		FUSE_ABORT_ERROR | FUSE_MAX_PAGES | FUSE_CACHE_SYMLINKS |
-		FUSE_NO_OPENDIR_SUPPORT;
+		FUSE_NO_OPENDIR_SUPPORT | FUSE_PRECISE_INVAL_DATA;
 	req->in.h.opcode = FUSE_INIT;
 	req->in.numargs = 1;
 	req->in.args[0].size = sizeof(*arg);
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 2ac598614a8f..33de8f6391ec 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -125,6 +125,9 @@
  *
  * 7.29
  * - add FUSE_NO_OPENDIR_SUPPORT flag
+ *
+ * 7.30
+ * - add FUSE_PRECISE_INVAL_DATA
  */
 
 #ifndef _LINUX_FUSE_H
@@ -160,7 +163,7 @@
 #define FUSE_KERNEL_VERSION 7
 
 /** Minor version number of this interface */
-#define FUSE_KERNEL_MINOR_VERSION 29
+#define FUSE_KERNEL_MINOR_VERSION 30
 
 /** The node ID of the root inode */
 #define FUSE_ROOT_ID 1
@@ -263,6 +266,7 @@ struct fuse_file_lock {
  * FUSE_MAX_PAGES: init_out.max_pages contains the max number of req pages
  * FUSE_CACHE_SYMLINKS: cache READLINK responses
  * FUSE_NO_OPENDIR_SUPPORT: kernel supports zero-message opendir
+ * FUSE_PRECISE_INVAL_DATA: filesystem is fully responsible for data cache invalidation
  */
 #define FUSE_ASYNC_READ (1 << 0)
 #define FUSE_POSIX_LOCKS (1 << 1)
@@ -289,6 +293,7 @@ struct fuse_file_lock {
 #define FUSE_MAX_PAGES (1 << 22)
 #define FUSE_CACHE_SYMLINKS (1 << 23)
 #define FUSE_NO_OPENDIR_SUPPORT (1 << 24)
+#define FUSE_PRECISE_INVAL_DATA (1 << 25)
 
 /**
  * CUSE INIT request/reply flags
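
As a reading aid, the two inode.c hunks of this patch can be condensed
into a small userspace model. It is only a sketch under stated
assumptions: struct conn_model, model_process_init_flags() and
model_inval_on_attr_change() are hypothetical names, the pr_warn() and
truncate_pagecache() calls are reduced to comments, and the value of
FUSE_AUTO_INVAL_DATA is taken from the existing uapi header rather than
from this diff.

```c
/* Userspace model of the patched logic; not kernel code. */
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FUSE_AUTO_INVAL_DATA    (1 << 12)  /* assumed from uapi fuse.h */
#define FUSE_PRECISE_INVAL_DATA (1 << 25)  /* added by this patch */

/* Hypothetical stand-in for the two struct fuse_conn bits involved. */
struct conn_model {
	bool auto_inval_data;
	bool precise_inval_data;
};

/* Mirrors the process_init_reply() hunk: auto wins if both are set. */
static void model_process_init_flags(struct conn_model *fc, uint32_t flags)
{
	assert(fc != NULL);

	fc->auto_inval_data = (flags & FUSE_AUTO_INVAL_DATA) != 0;
	fc->precise_inval_data = (flags & FUSE_PRECISE_INVAL_DATA) != 0;
	if (fc->auto_inval_data && fc->precise_inval_data) {
		/* the patch emits a pr_warn() here */
		fc->precise_inval_data = false;
	}
}

/* Mirrors the fuse_change_attributes() hunk (non-writeback case). */
static bool model_inval_on_attr_change(const struct conn_model *fc,
				       uint64_t oldsize, uint64_t newsize,
				       bool mtime_changed)
{
	assert(fc != NULL);

	if (oldsize != newsize) {
		/* the cache is still truncated to newsize in the kernel */
		return !fc->precise_inval_data;
	}
	return fc->auto_inval_data && mtime_changed;
}
```

With only FUSE_PRECISE_INVAL_DATA negotiated, the model never reports a
whole-file invalidation: a size change keeps the cache, and an mtime
change is ignored because auto mode is off.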