From patchwork Thu Feb 4 15:02:06 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Hajnoczi X-Patchwork-Id: 12067523 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MIME_BASE64_TEXT,SPF_HELO_NONE,SPF_PASS autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4951AC433DB for ; Thu, 4 Feb 2021 15:06:58 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id D955664DF2 for ; Thu, 4 Feb 2021 15:06:57 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org D955664DF2 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:60610 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1l7gDf-0007OW-Th for qemu-devel@archiver.kernel.org; Thu, 04 Feb 2021 10:06:55 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:40536) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l7g9V-000308-Q9 for qemu-devel@nongnu.org; Thu, 04 Feb 2021 10:02:38 -0500 Received: from us-smtp-delivery-124.mimecast.com ([63.128.21.124]:50567) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1l7g9T-0002XV-Ug for qemu-devel@nongnu.org; Thu, 04 Feb 2021 10:02:37 -0500 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612450955; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=dUguKk7pEcNYDonCk/K2qnrIwMEERiH0yxbvA57CzCA=; b=IUKCiqUTpoiEGzmhxTd4ZxTbrb/qE2euCbDUbjeMaBobkZ1aCWmk/Fq6p23U2ToA9ZTnHu ym2kNXaXb9qjDtPeKjJOBAbUZwFgqb2B/0vwQR8gE4ajmYgyfnOJ5ZsDJfzsAPDvUceb4E aSczUb9XllpcqskNo8wlyyIjMF+0bto= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-394-lo0YQtksOKmX4yrWH2z8Aw-1; Thu, 04 Feb 2021 10:02:33 -0500 X-MC-Unique: lo0YQtksOKmX4yrWH2z8Aw-1 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7D1FA1936B61; Thu, 4 Feb 2021 15:02:32 +0000 (UTC) Received: from localhost (ovpn-115-89.ams2.redhat.com [10.36.115.89]) by smtp.corp.redhat.com (Postfix) with ESMTP id 8D3B55DA2D; Thu, 4 Feb 2021 15:02:21 +0000 (UTC) From: Stefan Hajnoczi To: qemu-devel@nongnu.org Subject: [PATCH v5 1/3] virtiofsd: extract lo_do_open() from lo_open() Date: Thu, 4 Feb 2021 15:02:06 +0000 Message-Id: <20210204150208.367837-2-stefanha@redhat.com> In-Reply-To: <20210204150208.367837-1-stefanha@redhat.com> References: <20210204150208.367837-1-stefanha@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=stefanha@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=63.128.21.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -13 X-Spam_score: -1.4 X-Spam_bar: - 
X-Spam_report: (-1.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.351, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MIME_BASE64_TEXT=1.741, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H4=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mszeredi@redhat.com, Daniel Berrange , slp@redhat.com, Greg Kurz , P J P , virtio-fs@redhat.com, vgoyal@redhat.com, Stefan Hajnoczi , Laszlo Ersek , "Dr. David Alan Gilbert" Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" Both lo_open() and lo_create() have similar code to open a file. Extract a common lo_do_open() function from lo_open() that will be used by lo_create() in a later commit. Since lo_do_open() does not otherwise need fuse_req_t req, convert lo_add_fd_mapping() to use struct lo_data *lo instead. 
Signed-off-by: Stefan Hajnoczi Reviewed-by: Greg Kurz --- v4: * Return positive errno if openat(2) fails in lo_do_open() [Greg] --- tools/virtiofsd/passthrough_ll.c | 73 ++++++++++++++++++++------------ 1 file changed, 46 insertions(+), 27 deletions(-) diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c index 5fb36d9407..f14fa5124d 100644 --- a/tools/virtiofsd/passthrough_ll.c +++ b/tools/virtiofsd/passthrough_ll.c @@ -459,17 +459,17 @@ static void lo_map_remove(struct lo_map *map, size_t key) } /* Assumes lo->mutex is held */ -static ssize_t lo_add_fd_mapping(fuse_req_t req, int fd) +static ssize_t lo_add_fd_mapping(struct lo_data *lo, int fd) { struct lo_map_elem *elem; - elem = lo_map_alloc_elem(&lo_data(req)->fd_map); + elem = lo_map_alloc_elem(&lo->fd_map); if (!elem) { return -1; } elem->fd = fd; - return elem - lo_data(req)->fd_map.elems; + return elem - lo->fd_map.elems; } /* Assumes lo->mutex is held */ @@ -1651,6 +1651,38 @@ static void update_open_flags(int writeback, int allow_direct_io, } } +static int lo_do_open(struct lo_data *lo, struct lo_inode *inode, + struct fuse_file_info *fi) +{ + char buf[64]; + ssize_t fh; + int fd; + + update_open_flags(lo->writeback, lo->allow_direct_io, fi); + + sprintf(buf, "%i", inode->fd); + fd = openat(lo->proc_self_fd, buf, fi->flags & ~O_NOFOLLOW); + if (fd == -1) { + return errno; + } + + pthread_mutex_lock(&lo->mutex); + fh = lo_add_fd_mapping(lo, fd); + pthread_mutex_unlock(&lo->mutex); + if (fh == -1) { + close(fd); + return ENOMEM; + } + + fi->fh = fh; + if (lo->cache == CACHE_NONE) { + fi->direct_io = 1; + } else if (lo->cache == CACHE_ALWAYS) { + fi->keep_cache = 1; + } + return 0; +} + static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name, mode_t mode, struct fuse_file_info *fi) { @@ -1691,7 +1723,7 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name, ssize_t fh; pthread_mutex_lock(&lo->mutex); - fh = lo_add_fd_mapping(req, fd); + fh = 
lo_add_fd_mapping(lo, fd); pthread_mutex_unlock(&lo->mutex); if (fh == -1) { close(fd); @@ -1892,38 +1924,25 @@ static void lo_fsyncdir(fuse_req_t req, fuse_ino_t ino, int datasync, static void lo_open(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info *fi) { - int fd; - ssize_t fh; - char buf[64]; struct lo_data *lo = lo_data(req); + struct lo_inode *inode = lo_inode(req, ino); + int err; fuse_log(FUSE_LOG_DEBUG, "lo_open(ino=%" PRIu64 ", flags=%d)\n", ino, fi->flags); - update_open_flags(lo->writeback, lo->allow_direct_io, fi); - - sprintf(buf, "%i", lo_fd(req, ino)); - fd = openat(lo->proc_self_fd, buf, fi->flags & ~O_NOFOLLOW); - if (fd == -1) { - return (void)fuse_reply_err(req, errno); - } - - pthread_mutex_lock(&lo->mutex); - fh = lo_add_fd_mapping(req, fd); - pthread_mutex_unlock(&lo->mutex); - if (fh == -1) { - close(fd); - fuse_reply_err(req, ENOMEM); + if (!inode) { + fuse_reply_err(req, EBADF); return; } - fi->fh = fh; - if (lo->cache == CACHE_NONE) { - fi->direct_io = 1; - } else if (lo->cache == CACHE_ALWAYS) { - fi->keep_cache = 1; + err = lo_do_open(lo, inode, fi); + lo_inode_put(lo, &inode); + if (err) { + fuse_reply_err(req, err); + } else { + fuse_reply_open(req, fi); } - fuse_reply_open(req, fi); } static void lo_release(fuse_req_t req, fuse_ino_t ino, From patchwork Thu Feb 4 15:02:07 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Hajnoczi X-Patchwork-Id: 12067509 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MIME_BASE64_TEXT,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with 
ESMTP id C2EA6C433E0 for ; Thu, 4 Feb 2021 15:05:04 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 3B80764F60 for ; Thu, 4 Feb 2021 15:05:04 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 3B80764F60 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55012 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1l7gBr-0004n2-0n for qemu-devel@archiver.kernel.org; Thu, 04 Feb 2021 10:05:03 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:40618) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l7g9o-000356-2C for qemu-devel@nongnu.org; Thu, 04 Feb 2021 10:02:58 -0500 Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:30350) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1l7g9i-0002dU-8Y for qemu-devel@nongnu.org; Thu, 04 Feb 2021 10:02:55 -0500 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612450969; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=TDEce3OIkshhfGWpJ6O1gNUvo+qNj3ylTwkmaF9ii04=; b=cTvHzeP5x7xKcFDaPDbja34W2hzekuCnP16iYvhHYkuogTiDt8ay0gF9UufC7nILl4wskS +71LyaWtmEiePA158lUfdDuFGYHEOv9QYX3tKF/8GkxQIHzV0gVFND/FW2LxrjQ+LQsKqm FxhpavdPzaLn7BOopxFfr3FJZJLtar4= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP 
id us-mta-129-uY0QrSEKMQaDapBfBHJUag-1; Thu, 04 Feb 2021 10:02:47 -0500 X-MC-Unique: uY0QrSEKMQaDapBfBHJUag-1 Received: from smtp.corp.redhat.com (int-mx02.intmail.prod.int.phx2.redhat.com [10.5.11.12]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id E8736107ACE6; Thu, 4 Feb 2021 15:02:46 +0000 (UTC) Received: from localhost (ovpn-115-89.ams2.redhat.com [10.36.115.89]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0CF0160BE2; Thu, 4 Feb 2021 15:02:33 +0000 (UTC) From: Stefan Hajnoczi To: qemu-devel@nongnu.org Subject: [PATCH v5 2/3] virtiofsd: optionally return inode pointer from lo_do_lookup() Date: Thu, 4 Feb 2021 15:02:07 +0000 Message-Id: <20210204150208.367837-3-stefanha@redhat.com> In-Reply-To: <20210204150208.367837-1-stefanha@redhat.com> References: <20210204150208.367837-1-stefanha@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.12 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=stefanha@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=216.205.24.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -13 X-Spam_score: -1.4 X-Spam_bar: - X-Spam_report: (-1.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.351, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MIME_BASE64_TEXT=1.741, RCVD_IN_DNSWL_LOW=-0.7, RCVD_IN_MSPIKE_H3=0.001, RCVD_IN_MSPIKE_WL=0.001, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mszeredi@redhat.com, Daniel Berrange , slp@redhat.com, Greg Kurz , P J P , virtio-fs@redhat.com, vgoyal@redhat.com, Stefan Hajnoczi , Laszlo Ersek , "Dr. 
David Alan Gilbert" Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" lo_do_lookup() finds an existing inode or allocates a new one. It increments nlookup so that the inode stays alive until the client releases it. Existing callers don't need the struct lo_inode so the function doesn't return it. Extend the function to optionally return the inode. The next commit will need it. Signed-off-by: Stefan Hajnoczi Reviewed-by: Greg Kurz --- tools/virtiofsd/passthrough_ll.c | 29 +++++++++++++++++++++-------- 1 file changed, 21 insertions(+), 8 deletions(-) diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c index f14fa5124d..aa35fc6ba5 100644 --- a/tools/virtiofsd/passthrough_ll.c +++ b/tools/virtiofsd/passthrough_ll.c @@ -831,11 +831,13 @@ static int do_statx(struct lo_data *lo, int dirfd, const char *pathname, } /* - * Increments nlookup and caller must release refcount using - * lo_inode_put(&parent). + * Increments nlookup on the inode on success. unref_inode_lolocked() must be + * called eventually to decrement nlookup again. If inodep is non-NULL, the + * inode pointer is stored and the caller must call lo_inode_put(). 
*/ static int lo_do_lookup(fuse_req_t req, fuse_ino_t parent, const char *name, - struct fuse_entry_param *e) + struct fuse_entry_param *e, + struct lo_inode **inodep) { int newfd; int res; @@ -845,6 +847,10 @@ static int lo_do_lookup(fuse_req_t req, fuse_ino_t parent, const char *name, struct lo_inode *inode = NULL; struct lo_inode *dir = lo_inode(req, parent); + if (inodep) { + *inodep = NULL; + } + /* * name_to_handle_at() and open_by_handle_at() can reach here with fuse * mount point in guest, but we don't have its inode info in the @@ -913,7 +919,14 @@ static int lo_do_lookup(fuse_req_t req, fuse_ino_t parent, const char *name, pthread_mutex_unlock(&lo->mutex); } e->ino = inode->fuse_ino; - lo_inode_put(lo, &inode); + + /* Transfer ownership of inode pointer to caller or drop it */ + if (inodep) { + *inodep = inode; + } else { + lo_inode_put(lo, &inode); + } + lo_inode_put(lo, &dir); fuse_log(FUSE_LOG_DEBUG, " %lli/%s -> %lli\n", (unsigned long long)parent, @@ -948,7 +961,7 @@ static void lo_lookup(fuse_req_t req, fuse_ino_t parent, const char *name) return; } - err = lo_do_lookup(req, parent, name, &e); + err = lo_do_lookup(req, parent, name, &e, NULL); if (err) { fuse_reply_err(req, err); } else { @@ -1056,7 +1069,7 @@ static void lo_mknod_symlink(fuse_req_t req, fuse_ino_t parent, goto out; } - saverr = lo_do_lookup(req, parent, name, &e); + saverr = lo_do_lookup(req, parent, name, &e, NULL); if (saverr) { goto out; } @@ -1534,7 +1547,7 @@ static void lo_do_readdir(fuse_req_t req, fuse_ino_t ino, size_t size, if (plus) { if (!is_dot_or_dotdot(name)) { - err = lo_do_lookup(req, ino, name, &e); + err = lo_do_lookup(req, ino, name, &e, NULL); if (err) { goto error; } @@ -1732,7 +1745,7 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name, } fi->fh = fh; - err = lo_do_lookup(req, parent, name, &e); + err = lo_do_lookup(req, parent, name, &e, NULL); } if (lo->cache == CACHE_NONE) { fi->direct_io = 1; From patchwork Thu Feb 4 15:02:08 2021 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Stefan Hajnoczi X-Patchwork-Id: 12067511 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-11.8 required=3.0 tests=BAYES_00,DKIM_INVALID, DKIM_SIGNED,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER,INCLUDES_PATCH, MAILING_LIST_MULTI,MIME_BASE64_TEXT,SPF_HELO_NONE,SPF_PASS,URIBL_BLOCKED autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id C3AA8C433E0 for ; Thu, 4 Feb 2021 15:05:24 +0000 (UTC) Received: from lists.gnu.org (lists.gnu.org [209.51.188.17]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by mail.kernel.org (Postfix) with ESMTPS id 5CB1964F4D for ; Thu, 4 Feb 2021 15:05:24 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 5CB1964F4D Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=redhat.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Received: from localhost ([::1]:55128 helo=lists1p.gnu.org) by lists.gnu.org with esmtp (Exim 4.90_1) (envelope-from ) id 1l7gCB-0004qM-I7 for qemu-devel@archiver.kernel.org; Thu, 04 Feb 2021 10:05:23 -0500 Received: from eggs.gnu.org ([2001:470:142:3::10]:40670) by lists.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_GCM_SHA384:256) (Exim 4.90_1) (envelope-from ) id 1l7gA2-00036n-0I for qemu-devel@nongnu.org; Thu, 04 Feb 2021 10:03:13 -0500 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:26977) by eggs.gnu.org with esmtps (TLS1.2:ECDHE_RSA_AES_256_CBC_SHA1:256) (Exim 4.90_1) (envelope-from ) id 1l7g9v-0002h1-II for qemu-devel@nongnu.org; Thu, 04 Feb 2021 10:03:09 -0500 DKIM-Signature: v=1; 
a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1612450981; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version:content-type:content-type: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=CKcll4oBdHGSZ93sW+jglJXU08GQbDsj3OlkZvBHJog=; b=NuP79yeAVybDEkkYfNI+EiqiZcZ/1aifCAXskgsfSI9H09wKGlnrkuWhFrytdE7NTQDDw0 1d7mcZfEEdgPoRgk/Stc6t8w5rClEBmjqWNmAzN9LvZvapxus3bTLReY3Pc2JDPVq2dGjT FcIw/KSXwcvUv7Szx6Jo9I1m/cOaNHk= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-28-dOr8N0-mP72SAfS5LS7OFw-1; Thu, 04 Feb 2021 10:02:59 -0500 X-MC-Unique: dOr8N0-mP72SAfS5LS7OFw-1 Received: from smtp.corp.redhat.com (int-mx06.intmail.prod.int.phx2.redhat.com [10.5.11.16]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id F2471801962; Thu, 4 Feb 2021 15:02:57 +0000 (UTC) Received: from localhost (ovpn-115-89.ams2.redhat.com [10.36.115.89]) by smtp.corp.redhat.com (Postfix) with ESMTP id 5C9AF5C648; Thu, 4 Feb 2021 15:02:48 +0000 (UTC) From: Stefan Hajnoczi To: qemu-devel@nongnu.org Subject: [PATCH v5 3/3] virtiofsd: prevent opening of special files (CVE-2020-35517) Date: Thu, 4 Feb 2021 15:02:08 +0000 Message-Id: <20210204150208.367837-4-stefanha@redhat.com> In-Reply-To: <20210204150208.367837-1-stefanha@redhat.com> References: <20210204150208.367837-1-stefanha@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.16 Authentication-Results: relay.mimecast.com; auth=pass smtp.auth=CUSA124A263 smtp.mailfrom=stefanha@redhat.com X-Mimecast-Spam-Score: 0 X-Mimecast-Originator: redhat.com Received-SPF: pass client-ip=170.10.133.124; envelope-from=stefanha@redhat.com; helo=us-smtp-delivery-124.mimecast.com X-Spam_score_int: -13 X-Spam_score: -1.4 X-Spam_bar: - 
X-Spam_report: (-1.4 / 5.0 requ) BAYES_00=-1.9, DKIMWL_WL_HIGH=-0.351, DKIM_SIGNED=0.1, DKIM_VALID=-0.1, DKIM_VALID_AU=-0.1, DKIM_VALID_EF=-0.1, MIME_BASE64_TEXT=1.741, RCVD_IN_DNSWL_LOW=-0.7, SPF_HELO_NONE=0.001, SPF_PASS=-0.001 autolearn=ham autolearn_force=no X-Spam_action: no action X-BeenThere: qemu-devel@nongnu.org X-Mailman-Version: 2.1.23 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Cc: mszeredi@redhat.com, Daniel Berrange , slp@redhat.com, Greg Kurz , P J P , virtio-fs@redhat.com, Alex Xu , vgoyal@redhat.com, Stefan Hajnoczi , Laszlo Ersek , "Dr. David Alan Gilbert" Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org Sender: "Qemu-devel" A well-behaved FUSE client does not attempt to open special files with FUSE_OPEN because they are handled on the client side (e.g. device nodes are handled by client-side device drivers). The check to prevent virtiofsd from opening special files is missing in a few cases, most notably FUSE_OPEN. A malicious client can cause virtiofsd to open a device node, potentially allowing the guest to escape. This can be exploited by a modified guest device driver. It is not exploitable from guest userspace since the guest kernel will handle special files inside the guest instead of sending FUSE requests. This patch fixes this issue by introducing the lo_inode_open() function to check the file type before opening it. This is a short-term solution because it does not prevent a compromised virtiofsd process from opening device nodes on the host. Restructure lo_create() to try O_CREAT | O_EXCL first. Note that O_CREAT | O_EXCL does not follow symlinks, so O_NOFOLLOW masking is not necessary here. If the file exists and the user did not specify O_EXCL, open it via lo_do_open(). Reported-by: Alex Xu Fixes: CVE-2020-35517 Reviewed-by: Dr. 
David Alan Gilbert
Reviewed-by: Vivek Goyal
Signed-off-by: Stefan Hajnoczi
Reviewed-by: Greg Kurz
---
v4:
 * Return -fd instead of -errno after lo_inode_open() in lo_do_open() [Greg]
 * Use De Morgan's Law to simplify the boolean expression in lo_create()
   [Vivek]
 * Add missing errno = -truncfd after lo_inode_open() call in lo_setattr
v3:
 * Restructure lo_create() to handle externally-created files (we need to
   allocate an inode for them) [Greg]
v3:
 * Protect lo_create() [Greg]
v2:
 * Add doc comment clarifying that symlinks are traversed client-side
   [Daniel]

This issue was diagnosed on public IRC and is therefore already known and
not embargoed.

A stronger fix, and the long-term solution, is for users to mount the
shared directory and any sub-mounts with nodev, as well as nosuid and
noexec. Unfortunately virtiofsd cannot do this automatically because bind
mounts added by the user after virtiofsd has launched would not be
detected. I suggest the following:

1. Modify libvirt and Kata Containers to explicitly set these mount
   options.
2. Then modify virtiofsd to check that the shared directory has the
   necessary options at startup. Refuse to start if the options are
   missing so that the user is aware of the security requirements.

As a bonus this also increases the likelihood that other host processes
besides virtiofsd will be protected by nosuid/noexec/nodev so that a
malicious guest cannot drop these files in place and then arrange for a
host process to come across them.

Additionally, user namespaces have been discussed. They seem like a
worthwhile addition as an unprivileged or privilege-separated mode
although there are limitations with respect to security xattrs and the
actual uid/gid stored on the host file system not corresponding to the
guest uid/gid.
Signed-off-by: Stefan Hajnoczi --- tools/virtiofsd/passthrough_ll.c | 144 ++++++++++++++++++++----------- 1 file changed, 92 insertions(+), 52 deletions(-) diff --git a/tools/virtiofsd/passthrough_ll.c b/tools/virtiofsd/passthrough_ll.c index aa35fc6ba5..147b59338a 100644 --- a/tools/virtiofsd/passthrough_ll.c +++ b/tools/virtiofsd/passthrough_ll.c @@ -555,6 +555,38 @@ static int lo_fd(fuse_req_t req, fuse_ino_t ino) return fd; } +/* + * Open a file descriptor for an inode. Returns -EBADF if the inode is not a + * regular file or a directory. + * + * Use this helper function instead of raw openat(2) to prevent security issues + * when a malicious client opens special files such as block device nodes. + * Symlink inodes are also rejected since symlinks must already have been + * traversed on the client side. + */ +static int lo_inode_open(struct lo_data *lo, struct lo_inode *inode, + int open_flags) +{ + g_autofree char *fd_str = g_strdup_printf("%d", inode->fd); + int fd; + + if (!S_ISREG(inode->filetype) && !S_ISDIR(inode->filetype)) { + return -EBADF; + } + + /* + * The file is a symlink so O_NOFOLLOW must be ignored. We checked earlier + * that the inode is not a special file but if an external process races + * with us then symlinks are traversed here. It is not possible to escape + * the shared directory since it is mounted as "/" though. 
+ */ + fd = openat(lo->proc_self_fd, fd_str, open_flags & ~O_NOFOLLOW); + if (fd < 0) { + return -errno; + } + return fd; +} + static void lo_init(void *userdata, struct fuse_conn_info *conn) { struct lo_data *lo = (struct lo_data *)userdata; @@ -684,9 +716,9 @@ static void lo_setattr(fuse_req_t req, fuse_ino_t ino, struct stat *attr, if (fi) { truncfd = fd; } else { - sprintf(procname, "%i", ifd); - truncfd = openat(lo->proc_self_fd, procname, O_RDWR); + truncfd = lo_inode_open(lo, inode, O_RDWR); if (truncfd < 0) { + errno = -truncfd; goto out_err; } } @@ -848,7 +880,7 @@ static int lo_do_lookup(fuse_req_t req, fuse_ino_t parent, const char *name, struct lo_inode *dir = lo_inode(req, parent); if (inodep) { - *inodep = NULL; + *inodep = NULL; /* in case there is an error */ } /* @@ -1664,19 +1696,26 @@ static void update_open_flags(int writeback, int allow_direct_io, } } +/* + * Open a regular file, set up an fd mapping, and fill out the struct + * fuse_file_info for it. If existing_fd is not negative, use that fd instead + * of opening a new one. Takes ownership of existing_fd. + * + * Returns 0 on success or a positive errno. 
+ */ static int lo_do_open(struct lo_data *lo, struct lo_inode *inode, - struct fuse_file_info *fi) + int existing_fd, struct fuse_file_info *fi) { - char buf[64]; ssize_t fh; - int fd; + int fd = existing_fd; update_open_flags(lo->writeback, lo->allow_direct_io, fi); - sprintf(buf, "%i", inode->fd); - fd = openat(lo->proc_self_fd, buf, fi->flags & ~O_NOFOLLOW); - if (fd == -1) { - return errno; + if (fd < 0) { + fd = lo_inode_open(lo, inode, fi->flags); + if (fd < 0) { + return -fd; + } } pthread_mutex_lock(&lo->mutex); @@ -1699,9 +1738,10 @@ static int lo_do_open(struct lo_data *lo, struct lo_inode *inode, static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name, mode_t mode, struct fuse_file_info *fi) { - int fd; + int fd = -1; struct lo_data *lo = lo_data(req); struct lo_inode *parent_inode; + struct lo_inode *inode = NULL; struct fuse_entry_param e; int err; struct lo_cred old = {}; @@ -1727,36 +1767,38 @@ static void lo_create(fuse_req_t req, fuse_ino_t parent, const char *name, update_open_flags(lo->writeback, lo->allow_direct_io, fi); - fd = openat(parent_inode->fd, name, (fi->flags | O_CREAT) & ~O_NOFOLLOW, - mode); + /* Try to create a new file but don't open existing files */ + fd = openat(parent_inode->fd, name, fi->flags | O_CREAT | O_EXCL, mode); err = fd == -1 ? 
errno : 0; + lo_restore_cred(&old); - if (!err) { - ssize_t fh; - - pthread_mutex_lock(&lo->mutex); - fh = lo_add_fd_mapping(lo, fd); - pthread_mutex_unlock(&lo->mutex); - if (fh == -1) { - close(fd); - err = ENOMEM; - goto out; - } + /* Ignore the error if file exists and O_EXCL was not given */ + if (err && (err != EEXIST || (fi->flags & O_EXCL))) { + goto out; + } - fi->fh = fh; - err = lo_do_lookup(req, parent, name, &e, NULL); + err = lo_do_lookup(req, parent, name, &e, &inode); + if (err) { + goto out; } - if (lo->cache == CACHE_NONE) { - fi->direct_io = 1; - } else if (lo->cache == CACHE_ALWAYS) { - fi->keep_cache = 1; + + err = lo_do_open(lo, inode, fd, fi); + fd = -1; /* lo_do_open() takes ownership of fd */ + if (err) { + /* Undo lo_do_lookup() nlookup ref */ + unref_inode_lolocked(lo, inode, 1); } out: + lo_inode_put(lo, &inode); lo_inode_put(lo, &parent_inode); if (err) { + if (fd >= 0) { + close(fd); + } + fuse_reply_err(req, err); } else { fuse_reply_create(req, &e, fi); @@ -1770,7 +1812,6 @@ static struct lo_inode_plock *lookup_create_plock_ctx(struct lo_data *lo, pid_t pid, int *err) { struct lo_inode_plock *plock; - char procname[64]; int fd; plock = @@ -1787,12 +1828,10 @@ static struct lo_inode_plock *lookup_create_plock_ctx(struct lo_data *lo, } /* Open another instance of file which can be used for ofd locks. */ - sprintf(procname, "%i", inode->fd); - /* TODO: What if file is not writable? 
*/ - fd = openat(lo->proc_self_fd, procname, O_RDWR); - if (fd == -1) { - *err = errno; + fd = lo_inode_open(lo, inode, O_RDWR); + if (fd < 0) { + *err = -fd; free(plock); return NULL; } @@ -1949,7 +1988,7 @@ static void lo_open(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info *fi) return; } - err = lo_do_open(lo, inode, fi); + err = lo_do_open(lo, inode, -1, fi); lo_inode_put(lo, &inode); if (err) { fuse_reply_err(req, err); @@ -2014,39 +2053,40 @@ static void lo_flush(fuse_req_t req, fuse_ino_t ino, struct fuse_file_info *fi) static void lo_fsync(fuse_req_t req, fuse_ino_t ino, int datasync, struct fuse_file_info *fi) { + struct lo_inode *inode = lo_inode(req, ino); + struct lo_data *lo = lo_data(req); int res; int fd; - char *buf; fuse_log(FUSE_LOG_DEBUG, "lo_fsync(ino=%" PRIu64 ", fi=0x%p)\n", ino, (void *)fi); + if (!inode) { + fuse_reply_err(req, EBADF); + return; + } + if (!fi) { - struct lo_data *lo = lo_data(req); - - res = asprintf(&buf, "%i", lo_fd(req, ino)); - if (res == -1) { - return (void)fuse_reply_err(req, errno); - } - - fd = openat(lo->proc_self_fd, buf, O_RDWR); - free(buf); - if (fd == -1) { - return (void)fuse_reply_err(req, errno); + fd = lo_inode_open(lo, inode, O_RDWR); + if (fd < 0) { + res = -fd; + goto out; } } else { fd = lo_fi_fd(req, fi); } if (datasync) { - res = fdatasync(fd); + res = fdatasync(fd) == -1 ? errno : 0; } else { - res = fsync(fd); + res = fsync(fd) == -1 ? errno : 0; } if (!fi) { close(fd); } - fuse_reply_err(req, res == -1 ? errno : 0); +out: + lo_inode_put(lo, &inode); + fuse_reply_err(req, res); } static void lo_read(fuse_req_t req, fuse_ino_t ino, size_t size, off_t offset,
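
The restructured lo_create() in this patch tries O_CREAT | O_EXCL first and
only falls back to opening an existing file when the caller did not ask for
O_EXCL. The control flow can be sketched outside virtiofsd as follows.
create_or_open() and its `created` out-parameter are hypothetical names
used for illustration. In the real patch the fallback goes through
lo_do_lookup() and lo_do_open(), which also apply the file-type check;
this sketch simply reopens the name with O_NOFOLLOW.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (illustration only, not virtiofsd code).
 *
 * Tries to create `name` exclusively. If the file already exists and the
 * caller did not pass O_EXCL, opens the existing file instead. On success
 * returns a file descriptor and sets *created to 1 (new file) or 0
 * (existing file). On failure returns -errno.
 */
int create_or_open(int dirfd, const char *name, int flags, mode_t mode,
                   int *created)
{
    int fd;

    assert(name != NULL && created != NULL);

    /* Try to create a new file but don't open existing files */
    fd = openat(dirfd, name, flags | O_CREAT | O_EXCL, mode);
    if (fd >= 0) {
        *created = 1;
        return fd;
    }

    /* Ignore the error only if the file exists and O_EXCL was not given */
    if (errno != EEXIST || (flags & O_EXCL)) {
        return -errno;
    }

    /* Open the existing file without creating it or following a symlink */
    *created = 0;
    fd = openat(dirfd, name, (flags & ~O_CREAT) | O_NOFOLLOW);
    if (fd < 0) {
        return -errno;
    }
    return fd;
}
```

The condition on the error path is the same De Morgan form used in the
patch: the error is propagated unless it is EEXIST and O_EXCL is absent.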