From patchwork Fri Jul 24 18:38:08 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Goyal
X-Patchwork-Id: 11684111
From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, miklos@szeredi.hu
Cc: vgoyal@redhat.com, virtio-fs@redhat.com
Subject: [PATCH 1/5] fuse: Introduce the notion of FUSE_HANDLE_KILLPRIV_V2
Date: Fri, 24 Jul 2020 14:38:08 -0400
Message-Id: <20200724183812.19573-2-vgoyal@redhat.com>
In-Reply-To: <20200724183812.19573-1-vgoyal@redhat.com>
References: <20200724183812.19573-1-vgoyal@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

The FUSE_HANDLE_KILLPRIV flag says that the file server will remove
suid/sgid/caps on truncate/chown/write. But to be consistent with VFS
behavior, what we want is:

- caps are always cleared on chown/write/truncate
- suid is always cleared on chown, while for truncate/write it is cleared
  only if the caller does not have CAP_FSETID.
- sgid is always cleared on chown, while for truncate/write it is cleared
  only if the caller does not have CAP_FSETID and the file has group
  execute permission.

The previous flag did not provide these semantics, so implement a V2 of
the protocol with the constraints above.

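As a reading aid for the rules listed above, here is a minimal user-space
sketch of the mode bits a server might end up with under the V2 semantics.
It is not part of this patch: killpriv_v2_mode(), enum killpriv_op and the
caller_has_fsetid parameter are hypothetical names, the sketch covers only
the suid/sgid bits of a regular file, and the clearing of
security.capability (which the rules say always happens) is left out.

```c
#include <stdbool.h>
#include <sys/stat.h>
#include <sys/types.h>

/* Hypothetical operation kinds, used only by this sketch. */
enum killpriv_op { OP_CHOWN, OP_WRITE, OP_TRUNCATE };

/*
 * Return the mode after applying the V2 suid/sgid rules from the commit
 * message. caller_has_fsetid is an assumed input telling whether the
 * caller holds CAP_FSETID.
 */
static mode_t killpriv_v2_mode(mode_t mode, enum killpriv_op op,
			       bool caller_has_fsetid)
{
	/* chown: suid and sgid are always cleared. */
	if (op == OP_CHOWN)
		return mode & ~(mode_t)(S_ISUID | S_ISGID);

	/* write/truncate: a caller with CAP_FSETID keeps both bits. */
	if (caller_has_fsetid)
		return mode;

	/* write/truncate without CAP_FSETID: suid is cleared ... */
	mode &= ~(mode_t)S_ISUID;

	/* ... and sgid is cleared only if group execute is set. */
	if (mode & S_IXGRP)
		mode &= ~(mode_t)S_ISGID;

	return mode;
}
```

Under these rules a write by a caller without CAP_FSETID turns mode 06755
into 0755, while the same write on mode 06745 (no group execute) yields
02745, because sgid survives when S_IXGRP is not set.
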
Signed-off-by: Vivek Goyal
---
 fs/fuse/fuse_i.h          | 6 ++++++
 fs/fuse/inode.c           | 5 ++++-
 include/uapi/linux/fuse.h | 7 +++++++
 3 files changed, 17 insertions(+), 1 deletion(-)

diff --git a/fs/fuse/fuse_i.h b/fs/fuse/fuse_i.h
index 740a8a7d7ae6..71bede0a57c9 100644
--- a/fs/fuse/fuse_i.h
+++ b/fs/fuse/fuse_i.h
@@ -610,6 +610,12 @@ struct fuse_conn {
 	/** cache READLINK responses in page cache */
 	unsigned cache_symlinks:1;
 
+	/** fs kills suid/sgid/cap on write/chown/trunc. suid is
+	    killed on write/trunc only if caller did not have CAP_FSETID.
+	    sgid is killed on write/truncate only if caller did not have
+	    CAP_FSETID as well as file has group execute permission. */
+	unsigned handle_killpriv_v2:1;
+
 	/*
 	 * The following bitfields are only for optimization purposes
 	 * and hence races in setting them will not cause malfunction
diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index bba747520e9b..113ba149e08d 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -965,6 +965,8 @@ static void process_init_reply(struct fuse_conn *fc, struct fuse_args *args,
 					min_t(unsigned int, FUSE_MAX_MAX_PAGES,
 					max_t(unsigned int, arg->max_pages, 1));
 			}
+			if (arg->flags & FUSE_HANDLE_KILLPRIV_V2)
+				fc->handle_killpriv_v2 = 1;
 		} else {
 			ra_pages = fc->max_read / PAGE_SIZE;
 			fc->no_lock = 1;
@@ -1002,7 +1004,8 @@ void fuse_send_init(struct fuse_conn *fc)
 		FUSE_WRITEBACK_CACHE | FUSE_NO_OPEN_SUPPORT |
 		FUSE_PARALLEL_DIROPS | FUSE_HANDLE_KILLPRIV | FUSE_POSIX_ACL |
 		FUSE_ABORT_ERROR | FUSE_MAX_PAGES | FUSE_CACHE_SYMLINKS |
-		FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA;
+		FUSE_NO_OPENDIR_SUPPORT | FUSE_EXPLICIT_INVAL_DATA |
+		FUSE_HANDLE_KILLPRIV_V2;
 	ia->args.opcode = FUSE_INIT;
 	ia->args.in_numargs = 1;
 	ia->args.in_args[0].size = sizeof(ia->in);
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 373cada89815..960ba8af5cf4 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -172,6 +172,7 @@
  *  - add FUSE_WRITE_KILL_PRIV flag
  *  - add FUSE_SETUPMAPPING and FUSE_REMOVEMAPPING
  *  - add map_alignment to fuse_init_out, add FUSE_MAP_ALIGNMENT flag
+ *  - add FUSE_HANDLE_KILLPRIV_V2
  */
 
 #ifndef _LINUX_FUSE_H
@@ -314,6 +315,11 @@ struct fuse_file_lock {
  * FUSE_NO_OPENDIR_SUPPORT: kernel supports zero-message opendir
  * FUSE_EXPLICIT_INVAL_DATA: only invalidate cached pages on explicit request
  * FUSE_MAP_ALIGNMENT: map_alignment field is valid
+ * FUSE_HANDLE_KILLPRIV_V2: fs kills suid/sgid/cap on write/chown/trunc.
+ *			Upon write/truncate suid/sgid is only killed if caller
+ *			does not have CAP_FSETID. Additionally upon
+ *			write/truncate sgid is killed only if file has group
+ *			execute permission. (Same as Linux VFS behavior).
  */
 #define FUSE_ASYNC_READ		(1 << 0)
 #define FUSE_POSIX_LOCKS	(1 << 1)
@@ -342,6 +348,7 @@ struct fuse_file_lock {
 #define FUSE_NO_OPENDIR_SUPPORT (1 << 24)
 #define FUSE_EXPLICIT_INVAL_DATA (1 << 25)
 #define FUSE_MAP_ALIGNMENT	(1 << 26)
+#define FUSE_HANDLE_KILLPRIV_V2	(1 << 27)
 
 /**
  * CUSE INIT request/reply flags

From patchwork Fri Jul 24 18:38:09 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Goyal
X-Patchwork-Id: 11684105
From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, miklos@szeredi.hu
Cc: vgoyal@redhat.com, virtio-fs@redhat.com
Subject: [PATCH 2/5] fuse: Set FUSE_WRITE_KILL_PRIV in cached write path
Date: Fri, 24 Jul 2020 14:38:09 -0400
Message-Id: <20200724183812.19573-3-vgoyal@redhat.com>
In-Reply-To: <20200724183812.19573-1-vgoyal@redhat.com>
References: <20200724183812.19573-1-vgoyal@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

If the caller does not have CAP_FSETID, we set FUSE_WRITE_KILL_PRIV in the
direct I/O path but not in the cached write path. Set it there as well so
that the server can clear suid/sgid/caps as needed.

Set it only if fc->handle_killpriv_v2 is set. Otherwise the client is
responsible for killing suid/sgid. We do it in the direct I/O path anyway
because we don't call file_remove_privs() there (with the cache=none
option).

Signed-off-by: Vivek Goyal
---
 fs/fuse/file.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/fs/fuse/file.c b/fs/fuse/file.c
index 83d917f7e542..57899afc7cba 100644
--- a/fs/fuse/file.c
+++ b/fs/fuse/file.c
@@ -1083,6 +1083,8 @@ static ssize_t fuse_send_write_pages(struct fuse_io_args *ia,
 
 	fuse_write_args_fill(ia, ff, pos, count);
 	ia->write.in.flags = fuse_write_flags(iocb);
+	if (fc->handle_killpriv_v2 && !capable(CAP_FSETID))
+		ia->write.in.write_flags |= FUSE_WRITE_KILL_PRIV;
 
 	err = fuse_simple_request(fc, &ap->args);
 	if (!err && ia->write.out.size > count)

From patchwork Fri Jul 24 18:38:10 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Goyal
X-Patchwork-Id: 11684103
From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, miklos@szeredi.hu
Cc: vgoyal@redhat.com, virtio-fs@redhat.com
Subject: [PATCH 3/5] fuse: Add a flag FUSE_SETATTR_KILL_PRIV
Date: Fri, 24 Jul 2020 14:38:10 -0400
Message-Id: <20200724183812.19573-4-vgoyal@redhat.com>
In-Reply-To: <20200724183812.19573-1-vgoyal@redhat.com>
References: <20200724183812.19573-1-vgoyal@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

With handle_killpriv_v2, the server needs to kill suid/sgid on truncate
(setattr), but it does not know whether the caller has CAP_FSETID. So, as
for write, send killpriv information in fuse_setattr_in by adding a flag,
FUSE_SETATTR_KILL_PRIV.

Signed-off-by: Vivek Goyal
---
 fs/fuse/dir.c             | 11 ++++++++---
 include/uapi/linux/fuse.h | 11 ++++++++++-
 2 files changed, 18 insertions(+), 4 deletions(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 26f028bc760b..82747ca4c5c8 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1437,13 +1437,15 @@ void fuse_release_nowrite(struct inode *inode)
 static void fuse_setattr_fill(struct fuse_conn *fc, struct fuse_args *args,
 			      struct inode *inode,
 			      struct fuse_setattr_in *inarg_p,
-			      struct fuse_attr_out *outarg_p)
+			      struct fuse_attr_out *outarg_p,
+			      uint32_t setattr_flags)
 {
 	args->opcode = FUSE_SETATTR;
 	args->nodeid = get_node_id(inode);
 	args->in_numargs = 1;
 	args->in_args[0].size = sizeof(*inarg_p);
 	args->in_args[0].value = inarg_p;
+	inarg_p->setattr_flags = setattr_flags;
 	args->out_numargs = 1;
 	args->out_args[0].size = sizeof(*outarg_p);
 	args->out_args[0].value = outarg_p;
@@ -1474,7 +1476,7 @@ int fuse_flush_times(struct inode *inode, struct fuse_file *ff)
 		inarg.valid |= FATTR_FH;
 		inarg.fh = ff->fh;
 	}
-	fuse_setattr_fill(fc, &args, inode, &inarg, &outarg);
+	fuse_setattr_fill(fc, &args, inode, &inarg, &outarg, 0);
 
 	return fuse_simple_request(fc, &args);
 }
@@ -1501,6 +1503,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
 	loff_t oldsize;
 	int err;
 	bool trust_local_cmtime = is_wb && S_ISREG(inode->i_mode);
+	uint32_t setattr_flags = 0;
 
 	if (!fc->default_permissions)
 		attr->ia_valid |= ATTR_FORCE;
@@ -1529,6 +1532,8 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
 	if (attr->ia_valid & ATTR_SIZE) {
 		if (WARN_ON(!S_ISREG(inode->i_mode)))
 			return -EIO;
+		if (fc->handle_killpriv_v2 && !capable(CAP_FSETID))
+			setattr_flags |= FUSE_SETATTR_KILL_PRIV;
 		is_truncate = true;
 	}
 
@@ -1565,7 +1570,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
 		inarg.valid |= FATTR_LOCKOWNER;
 		inarg.lock_owner = fuse_lock_owner_id(fc, current->files);
 	}
-	fuse_setattr_fill(fc, &args, inode, &inarg, &outarg);
+	fuse_setattr_fill(fc, &args, inode, &inarg, &outarg, setattr_flags);
 	err = fuse_simple_request(fc, &args);
 	if (err) {
 		if (err == -EINTR)
diff --git a/include/uapi/linux/fuse.h b/include/uapi/linux/fuse.h
index 960ba8af5cf4..4b275653ac2e 100644
--- a/include/uapi/linux/fuse.h
+++ b/include/uapi/linux/fuse.h
@@ -173,6 +173,7 @@
  *  - add FUSE_SETUPMAPPING and FUSE_REMOVEMAPPING
  *  - add map_alignment to fuse_init_out, add FUSE_MAP_ALIGNMENT flag
  *  - add FUSE_HANDLE_KILLPRIV_V2
+ *  - add FUSE_SETATTR_KILL_PRIV
  */
 
 #ifndef _LINUX_FUSE_H
@@ -368,6 +369,14 @@ struct fuse_file_lock {
  */
 #define FUSE_GETATTR_FH		(1 << 0)
 
+/**
+ * Setattr flags
+ * FUSE_SETATTR_KILL_PRIV: kill suid and sgid bits. sgid should be killed
+ * only if group execute bit (S_IXGRP) is set. Meant to be used together
+ * with FUSE_HANDLE_KILLPRIV_V2.
+ */
+#define FUSE_SETATTR_KILL_PRIV	(1 << 0)
+
 /**
  * Lock flags
  */
@@ -566,7 +575,7 @@ struct fuse_link_in {
 
 struct fuse_setattr_in {
 	uint32_t	valid;
-	uint32_t	padding;
+	uint32_t	setattr_flags;
 	uint64_t	fh;
 	uint64_t	size;
 	uint64_t	lock_owner;

From patchwork Fri Jul 24 18:38:11 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Goyal
X-Patchwork-Id: 11684101
From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, miklos@szeredi.hu
Cc: vgoyal@redhat.com, virtio-fs@redhat.com
Subject: [PATCH 4/5] fuse: For sending setattr in case of open(O_TRUNC)
Date: Fri, 24 Jul 2020 14:38:11 -0400
Message-Id: <20200724183812.19573-5-vgoyal@redhat.com>
In-Reply-To: <20200724183812.19573-1-vgoyal@redhat.com>
References: <20200724183812.19573-1-vgoyal@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

open(O_TRUNC) will not kill suid/sgid on the server, and fuse_open_in does
not carry information about whether the caller has CAP_FSETID. So force
sending the setattr() that is called after open(O_TRUNC), so that the
server clears setuid/setgid.

Signed-off-by: Vivek Goyal
---
 fs/fuse/dir.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/fuse/dir.c b/fs/fuse/dir.c
index 82747ca4c5c8..0572779abbbe 100644
--- a/fs/fuse/dir.c
+++ b/fs/fuse/dir.c
@@ -1516,7 +1516,7 @@ int fuse_do_setattr(struct dentry *dentry, struct iattr *attr,
 		/* This is coming from open(..., ... | O_TRUNC); */
 		WARN_ON(!(attr->ia_valid & ATTR_SIZE));
 		WARN_ON(attr->ia_size != 0);
-		if (fc->atomic_o_trunc) {
+		if (fc->atomic_o_trunc && !fc->handle_killpriv_v2) {
 			/*
 			 * No need to send request to userspace, since actual
 			 * truncation has already been done by OPEN.  But still

From patchwork Fri Jul 24 18:38:12 2020
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Vivek Goyal
X-Patchwork-Id: 11684109
From: Vivek Goyal
To: linux-fsdevel@vger.kernel.org, miklos@szeredi.hu
Cc: vgoyal@redhat.com, virtio-fs@redhat.com
Subject: [PATCH 5/5] virtiofs: Support SB_NOSEC flag to improve direct write performance
Date: Fri, 24 Jul 2020 14:38:12 -0400
Message-Id: <20200724183812.19573-6-vgoyal@redhat.com>
In-Reply-To: <20200724183812.19573-1-vgoyal@redhat.com>
References: <20200724183812.19573-1-vgoyal@redhat.com>
X-Mailing-List: linux-fsdevel@vger.kernel.org

Ganesh Mahalingam reported that virtiofs is slow with small direct random
writes when virtiofsd is run with cache=always.

https://github.com/kata-containers/runtime/issues/2815

A little debugging showed that file_remove_privs() is called in the cached
write path on every write. Every time, it calls
security_inode_need_killpriv(), which results in a call to
__vfs_getxattr(XATTR_NAME_CAPS), and this goes to the file server to fetch
the xattr. This extra round trip for every write slows down writes a lot.

Normally, to avoid paying this penalty on every write, the VFS caches this
information in the inode (S_NOSEC). The VFS sets S_NOSEC if the filesystem
opted in using the super block flag SB_NOSEC. S_NOSEC is cleared when the
setuid/setgid bit is set or when a security xattr is set on the inode, so
that the next time a write happens, we check the inode again for clearing
the setuid/setgid bits as well as any security.capability xattr.

This seems to work well for local filesystems, but for remote filesystems
it is possible that the VFS does not have the full picture: a different
client may set the setuid/setgid bit or the security.capability xattr on
the file, which means the VFS information about S_NOSEC on another client
will be stale. So for remote filesystems SB_NOSEC was disabled by default.

commit 9e1f1de02c2275d7172e18dc4e7c2065777611bf
Author: Al Viro
Date:   Fri Jun 3 18:24:58 2011 -0400

    more conservative S_NOSEC handling

That commit mentioned that these filesystems can still make use of
SB_NOSEC as long as they clear S_NOSEC when they are refreshing inode
attributes from the server.

So this patch tries to enable SB_NOSEC on fuse (regular fuse as well as
virtiofs), and to clear S_NOSEC when we are refreshing inode attributes.

We need to clear S_NOSEC either when the inode has the setuid/setgid bit
set or when the security.capability xattr has been set. We have the first
piece of information available in the FUSE_GETATTR response. But we don't
know whether security.capability has been set on the file.

The question is, do we really need to know about security.capability?
file_remove_privs() always removes security.capability if a file is being
written to. That means when the server writes to the file,
security.capability should be removed without the guest having to tell it
anything.

That means we don't have to worry about knowing whether
security.capability was set, as long as writes by the client don't get
cached and always go to the server. And a server write should clear
security.capability. Hence, I clear SB_NOSEC when writeback cache is
enabled.

This change improves random write performance very significantly. I am
running virtiofsd with cache=auto and the following fio command:

fio --ioengine=libaio --direct=1 --name=test --filename=/mnt/virtiofs/random_read_write.fio --bs=4k --iodepth=64 --size=4G --readwrite=randwrite

Before this patch I get around 40MB/s, and after the patch I get around
300MB/s bandwidth. So the improvement is very significant.

Reported-by: "Mahalingam, Ganesh"
Signed-off-by: Vivek Goyal
---
 fs/fuse/inode.c     | 12 ++++++++++++
 fs/fuse/virtio_fs.c |  3 +++
 2 files changed, 15 insertions(+)

diff --git a/fs/fuse/inode.c b/fs/fuse/inode.c
index 113ba149e08d..412ab08607ca 100644
--- a/fs/fuse/inode.c
+++ b/fs/fuse/inode.c
@@ -187,6 +187,16 @@ void fuse_change_attributes_common(struct inode *inode, struct fuse_attr *attr,
 		inode->i_mode &= ~S_ISVTX;
 
 	fi->orig_ino = attr->ino;
+
+	/*
+	 * We are refreshing inode data and it is possible that another
+	 * client set suid/sgid or security.capability xattr. So clear
+	 * S_NOSEC. Ideally, we could have cleared it only if suid/sgid
+	 * was set or if security.capability xattr was set. But we don't
+	 * know if security.capability has been set or not. So clear it
+	 * anyway. It's less efficient but should be safe.
+	 */
+	inode->i_flags &= ~S_NOSEC;
 }
 
 void fuse_change_attributes(struct inode *inode, struct fuse_attr *attr,
@@ -1281,6 +1291,8 @@ static int fuse_fill_super(struct super_block *sb, struct fs_context *fsc)
 	 */
 	fput(file);
 	fuse_send_init(get_fuse_conn_super(sb));
+	if (fc->handle_killpriv_v2)
+		sb->s_flags |= SB_NOSEC;
 	return 0;
 
  err_put_conn:
diff --git a/fs/fuse/virtio_fs.c b/fs/fuse/virtio_fs.c
index 4c4ef5d69298..be05e4995e60 100644
--- a/fs/fuse/virtio_fs.c
+++ b/fs/fuse/virtio_fs.c
@@ -1126,6 +1126,9 @@ static int virtio_fs_fill_super(struct super_block *sb)
 	/* Previous unmount will stop all queues. Start these again */
 	virtio_fs_start_all_queues(fs);
 	fuse_send_init(fc);
+
+	if (fc->handle_killpriv_v2)
+		sb->s_flags |= SB_NOSEC;
 	mutex_unlock(&virtio_fs_mutex);
 	return 0;
 
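
The caching behaviour that the last patch relies on can be illustrated
with a small user-space model. This is only a sketch under simplifying
assumptions, not kernel code: struct sim_inode, sim_write(),
sim_refresh_attrs() and sim_count_lookups() are hypothetical names, the
nosec field stands in for S_NOSEC, the sb_nosec argument for SB_NOSEC, and
the counter for the per-write xattr lookups that go to the file server.
The modelled file is assumed to carry no suid/sgid bits and no
security.capability xattr.

```c
#include <stdbool.h>

/* Hypothetical stand-in for an inode; not the kernel's struct inode. */
struct sim_inode {
	bool nosec;		/* models S_NOSEC on the inode */
	unsigned int lookups;	/* models xattr round trips to the server */
};

/* Models the privilege check done on the write path. */
static void sim_write(struct sim_inode *inode, bool sb_nosec)
{
	/* Fast path: an earlier check already found nothing to kill. */
	if (inode->nosec)
		return;

	/* Slow path: ask the server whether security.capability is set. */
	inode->lookups++;

	/* Cache the negative result only if the filesystem opted in. */
	if (sb_nosec)
		inode->nosec = true;
}

/* Models an attribute refresh, which drops the cached result. */
static void sim_refresh_attrs(struct sim_inode *inode)
{
	inode->nosec = false;
}

/*
 * Count server lookups for a run of writes. refresh_every == 0 means
 * attributes are never refreshed during the run.
 */
static unsigned int sim_count_lookups(bool sb_nosec, int writes,
				      int refresh_every)
{
	struct sim_inode inode = { .nosec = false, .lookups = 0 };

	for (int i = 0; i < writes; i++) {
		sim_write(&inode, sb_nosec);
		if (refresh_every > 0 && (i + 1) % refresh_every == 0)
			sim_refresh_attrs(&inode);
	}
	return inode.lookups;
}
```

In this model, 1000 writes cost 1000 lookups without the cached flag, one
lookup with it, and ten lookups when attributes are refreshed after every
100th write. That mirrors the trade-off described in the commit message:
clearing the flag on every refresh is less efficient than clearing it
selectively, but it stays on the safe side.
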