From patchwork Sat Nov 12 19:36:04 2016 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Amir Goldstein X-Patchwork-Id: 9424433 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id F39EC60233 for ; Sat, 12 Nov 2016 19:36:49 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id E451A28E1B for ; Sat, 12 Nov 2016 19:36:49 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id D928B28E1E; Sat, 12 Nov 2016 19:36:49 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.3 required=2.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED, DKIM_SIGNED, FREEMAIL_FROM, RCVD_IN_DNSWL_HI, RCVD_IN_SORBS_SPAM, T_DKIM_INVALID autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 5B27928E1B for ; Sat, 12 Nov 2016 19:36:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S966867AbcKLTgs (ORCPT ); Sat, 12 Nov 2016 14:36:48 -0500 Received: from mail-wm0-f65.google.com ([74.125.82.65]:34069 "EHLO mail-wm0-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S966902AbcKLTgp (ORCPT ); Sat, 12 Nov 2016 14:36:45 -0500 Received: by mail-wm0-f65.google.com with SMTP id g23so4800799wme.1; Sat, 12 Nov 2016 11:36:44 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20120113; h=from:to:cc:subject:date:message-id:in-reply-to:references; bh=VvbBBk8H39gtizSz3DWq+TZuYQEHMdSaLRD9JQNOLoY=; b=OBK+xLrzp2AZQeJ+f23pukjA1nR8L8y97vB5gk5lTps+yh5sanDJTums4K9ye1KjSS YsF6iaOslq+U8FfmFzmGHQpaKHUdQbo/BFxjHiap4Yp9G0XfRGV4uWYwZwed335ZiULE lg89lR6LwzCVMS0pClNEBVfCagqeR3KXEQUkFBzcqQhAZe/3FLldNBOwhQMFx9HNlP1u 5giHrl6RDFdx81dZs82cZ6x44ADUtCaCloZGmE3CkJKIxtu7Ycqq7G3NRJhVzHoIqDdl +aWBLxmqXySmQYvQVNxfApRXb8HXLlz2YXVltUwhjJTniylIoUEFEAq0wBB/2azotUCc zN1w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20130820; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references; bh=VvbBBk8H39gtizSz3DWq+TZuYQEHMdSaLRD9JQNOLoY=; b=XGPNyk3a4zIDYlBuGyEz+0prSKMlH7wcHWhPlpJ4ASLIrIQnNuaNB5n6hXFLNd9j6J RKgJrwboCpL1FIGvMf787rInkmaJM6/oHrGvMGpX3/HuFAmDxQbreyV/hm3boTCuCl2I tRrIdedVWMdGfMcNy5AKQE0qz4wC5/pWO4UcihI1zJevUrJsaS8+VwBi0p7DLmzFSsqW pbAmKiRcsdZxgmEt5a0FZZIso7Lo8+7NefBXai1cP7oN7Ujd092ZVqusssOIpN0U1ruo MLy2MV6H7X3S1dZ3/kz8jLw+BORyAlBB5IWCRc7qCzZAG6Uak6FyUIgz2pcVlw7a9VZo p2KA== X-Gm-Message-State: ABUngvc283AJwT/r/MQN0w3pKVfWJU4y0BXiyHluqZZqKap5NI2sxeZdb+Tc4tCOvjUIIw== X-Received: by 10.28.68.195 with SMTP id r186mr3396855wma.105.1478979403411; Sat, 12 Nov 2016 11:36:43 -0800 (PST) Received: from amir-VirtualBox.Home (bzq-79-177-89-54.red.bezeqint.net. [79.177.89.54]) by smtp.gmail.com with ESMTPSA id b7sm1527142wjm.39.2016.11.12.11.36.41 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Sat, 12 Nov 2016 11:36:42 -0800 (PST) From: Amir Goldstein To: Miklos Szeredi Cc: Konstantin Khlebnikov , Al Viro , linux-unionfs@vger.kernel.org, linux-fsdevel@vger.kernel.org Subject: [RFC][PATCH 4/4] ovl: allow concurrent copy up data of different files Date: Sat, 12 Nov 2016 21:36:04 +0200 Message-Id: <1478979364-10355-5-git-send-email-amir73il@gmail.com> X-Mailer: git-send-email 2.7.4 In-Reply-To: <1478979364-10355-1-git-send-email-amir73il@gmail.com> References: <1478979364-10355-1-git-send-email-amir73il@gmail.com> Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP Currently, all copy operations are serialized by taking the lock_rename(workdir, upperdir) locks for the entire copy up operation, including the data copy up. Copying up data may take a long time with large files and holding s_vfs_rename_mutex during that time blocks all rename and copy up operations on the entire underlying fs. This change addresses this problem by changing the copy up locking scheme for different types of files as follows. Directories: inode_lock(ovl_inode) lock_rename(workdir, upperdir) copy up dir to workdir move dir to upperdir Special and zero size files: inode_lock(ovl_inode) lock_rename(workdir, upperdir) copy up file to workdir (no data) move file to upperdir Regular files with data: inode_lock(ovl_inode) lock_rename(workdir, upperdir) copy up file to workdir (no data) unlock_rename(workdir, upperdir) copy data to file in workdir lock_rename(workdir, upperdir) move file to upperdir Signed-off-by: Amir Goldstein --- fs/overlayfs/copy_up.c | 70 +++++++++++++++++++++++++++++++++++++++++++++++--- fs/overlayfs/inode.c | 3 +++ 2 files changed, 69 insertions(+), 4 deletions(-) diff --git a/fs/overlayfs/copy_up.c b/fs/overlayfs/copy_up.c index a16127b..1b9705e 100644 --- a/fs/overlayfs/copy_up.c +++ b/fs/overlayfs/copy_up.c @@ -230,6 +230,44 @@ int ovl_set_attr(struct dentry *upperdentry, struct kstat *stat) return err; } +/* + * Called with dentry->d_inode lock held only for the last (leaf) copy up + * from __ovl_copy_up(), so it is NOT held when called for ancestor + * directory from __ovl_copy_up() + * + * Called with lock_rename(workdir, upperdir) locks held. + * + * lock_rename() locks remain locked throughout the copy up + * of non regular files and zero sized regular files. + * + * lock_rename() locks are released during ovl_copy_up_data() + * of non zero sized regular files. During this time, the overlay + * dentry->d_inode lock is still held to avoid concurrent + * copy up of files with data. + * + * Maybe a better description of this locking scheme: + * + * Directories: + * inode_lock(ovl_inode) + * lock_rename(workdir, upperdir) + * copy up dir to workdir + * move dir to upperdir + * + * Special and zero size files: + * inode_lock(ovl_inode) + * lock_rename(workdir, upperdir) + * copy up file to workdir (no data) + * move file to upperdir + * + * Regular files with data: + * inode_lock(ovl_inode) + * lock_rename(workdir, upperdir) + * copy up file to workdir (no data) + * unlock_rename(workdir, upperdir) + * copy data to file in workdir + * lock_rename(workdir, upperdir) + * move file to upperdir + */ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir, struct dentry *dentry, struct path *lowerpath, struct kstat *stat, const char *link) @@ -274,16 +312,39 @@ static int ovl_copy_up_locked(struct dentry *workdir, struct dentry *upperdir, if (err) goto out2; - if (S_ISREG(stat->mode)) { + if (S_ISREG(stat->mode) && stat->size > 0) { struct path upperpath; ovl_path_upper(dentry, &upperpath); BUG_ON(upperpath.dentry != NULL); upperpath.dentry = newdentry; + /* + * Release rename locks, because copy data may take a long time, + * and holding s_vfs_rename_mutex will block all rename and + * copy up operations on the entire underlying fs. + * We still hold the overlay inode lock to avoid concurrent + * copy up of this file. + */ + unlock_rename(workdir, upperdir); + err = ovl_copy_up_data(lowerpath, &upperpath, stat->size); + + /* + * Re-aquire rename locks, before moving copied file into place. + */ + if (unlikely(lock_rename(workdir, upperdir) != NULL)) { + pr_err("overlayfs: failed to re-aquire lock_rename\n"); + err = -EIO; + } + if (err) goto out_cleanup; + + if (WARN_ON(ovl_dentry_is_upper(dentry))) { + /* Raced with another copy-up? This shouldn't happen */ + goto out_cleanup; + } } err = ovl_copy_xattr(lowerpath->dentry, newdentry); @@ -366,15 +427,14 @@ static int ovl_copy_up_one(struct dentry *parent, struct dentry *dentry, return PTR_ERR(link); } - err = -EIO; - if (lock_rename(workdir, upperdir) != NULL) { + if (unlikely(lock_rename(workdir, upperdir) != NULL)) { pr_err("overlayfs: failed to lock workdir+upperdir\n"); + err = -EIO; goto out_unlock; } if (ovl_dentry_is_upper(dentry)) { /* Raced with another copy-up? Nothing to do, then... */ - err = 0; goto out_unlock; } @@ -433,11 +493,13 @@ static int __ovl_copy_up(struct dentry *dentry, int flags) return err; } +/* Called with dentry->d_inode lock held */ int ovl_copy_up(struct dentry *dentry) { return __ovl_copy_up(dentry, 0); } +/* Called with dentry->d_inode lock held */ int ovl_copy_up_open(struct dentry *dentry, int flags) { return __ovl_copy_up(dentry, flags); diff --git a/fs/overlayfs/inode.c b/fs/overlayfs/inode.c index 7abae00..532b0d5 100644 --- a/fs/overlayfs/inode.c +++ b/fs/overlayfs/inode.c @@ -251,11 +251,14 @@ int ovl_open_maybe_copy_up(struct dentry *dentry, unsigned int file_flags) type = ovl_path_real(dentry, &realpath); if (ovl_open_need_copy_up(file_flags, type, realpath.dentry)) { + /* Take the overlay inode lock to avoid concurrent copy-up */ + inode_lock(dentry->d_inode); err = ovl_want_write(dentry); if (!err) { err = ovl_copy_up_open(dentry, file_flags); ovl_drop_write(dentry); } + inode_unlock(dentry->d_inode); } return err;