From patchwork Mon Mar 13 22:28:36 2017 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Eric Biggers X-Patchwork-Id: 9622247 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 4F943604CC for ; Mon, 13 Mar 2017 22:30:36 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 3E44928503 for ; Mon, 13 Mar 2017 22:30:36 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id 2F9A328514; Mon, 13 Mar 2017 22:30:36 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-7.0 required=2.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID, DKIM_VALID_AU, FREEMAIL_FROM, RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id A2DD428503 for ; Mon, 13 Mar 2017 22:30:35 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1753559AbdCMWad (ORCPT ); Mon, 13 Mar 2017 18:30:33 -0400 Received: from mail-pg0-f66.google.com ([74.125.83.66]:33634 "EHLO mail-pg0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751150AbdCMWab (ORCPT ); Mon, 13 Mar 2017 18:30:31 -0400 Received: by mail-pg0-f66.google.com with SMTP id 77so19921890pgc.0; Mon, 13 Mar 2017 15:30:30 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id; bh=njA8P7KtUHi7S/RJDJLTvxDhDSRSWXme4f6OvQ0vC/Y=; b=J0TO3evpIfuuzEwfi71trnNJvhiW1aWG95tuozj/0PXSe0VTifXY26sVWGAEF3Gdwy mxL27K7nqgQPXamTtdtFIF7IuM4OF3wTHU92M01gIOAG/8Yz5kDcutS4RlM5cHEevVd8 V82X9WkfO2sQgqogtF7UPgxJtWp6f/Kts0ZX3sadQJetIBg3tAvhcY/L3GhqVJMHF6Za F8ATQzI9p7KBvNzWq23ta0Kmhzyoj8346KVOEzhwlsN9XqXQLcaLHoHMqLNb0rdZWvPe Y7PIWaY98LTXB2bQ63TtjQOKuEz1XCdL8O0i++HDQou7/aNlKI6y6Fq63XXWZGtCXEx2 Vc7g== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id; bh=njA8P7KtUHi7S/RJDJLTvxDhDSRSWXme4f6OvQ0vC/Y=; b=m6O5mLSBR9Z5lk8aFS2lYm/Ak6VQLgZdwXCpQGwMjuPe1HQk4+NB7o+c1L022DD2bQ FnWAHFv9n0U7GDNhfqx1CSsAxl3c1Sq7bUFjJSnBlzap8AvLOCbf6vgI8rsgNmEnguZD GyJ58l0ww7YBQDoCMcNnSS6B9ldJs8B0yZ4uLA4bz3Rt3JtaUfHxALI2QHJZEJP73MYa quE3Q8mc+p2RqPdRcSGADpdC3hE7eU+/yxQ8nB/3xhtlGCe4US8Ncnf2gNQdDj/WN5w1 318TwvTs+g3hzIk9AKHN1Kvf/k9KHQIo5AL6bTECzzR4u3ZqI8nbUe/VmqV/NO4FyRUk 3+lg== X-Gm-Message-State: AMke39nv89603NWFk1u4kjxVaLPTTYSFRkcZgqxcgAkoqVljj1+ZHSZbTI3cs7arDC1E9Q== X-Received: by 10.99.56.85 with SMTP id h21mr39684323pgn.108.1489444229956; Mon, 13 Mar 2017 15:30:29 -0700 (PDT) Received: from ebiggers-linuxstation.kir.corp.google.com ([100.119.30.131]) by smtp.gmail.com with ESMTPSA id w29sm34404678pfi.131.2017.03.13.15.30.28 (version=TLS1_2 cipher=ECDHE-RSA-AES128-SHA bits=128/128); Mon, 13 Mar 2017 15:30:29 -0700 (PDT) From: Eric Biggers To: linux-fsdevel@vger.kernel.org Cc: Al Viro , David Howells , linux-kernel@vger.kernel.org, Eric Biggers Subject: [PATCH v4] statx: optimize copy of struct statx to userspace Date: Mon, 13 Mar 2017 15:28:36 -0700 Message-Id: <20170313222836.105526-1-ebiggers3@gmail.com> X-Mailer: git-send-email 2.12.0.246.ga2ecc84866-goog Sender: linux-fsdevel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-fsdevel@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP From: Eric Biggers I found that statx() was significantly slower than stat(). As a microbenchmark, I compared 10,000,000 invocations of fstat() on a tmpfs file to the same with statx() passed a NULL path: $ time ./stat_benchmark real 0m1.464s user 0m0.275s sys 0m1.187s $ time ./statx_benchmark real 0m5.530s user 0m0.281s sys 0m5.247s statx is expected to be a little slower than stat because struct statx is larger than struct stat, but not by *that* much. It turns out that most of the overhead was in copying struct statx to userspace, mostly in all the stac/clac instructions that got generated for each __put_user() call. (This was on x86_64, but some other architectures, e.g. arm64, have something similar now too.) stat() instead initializes its struct on the stack and copies it to userspace with a single call to copy_to_user(). This turns out to be much faster, and changing statx to do this makes it almost as fast as stat: $ time ./statx_benchmark real 0m1.624s user 0m0.270s sys 0m1.354s For zeroing the reserved fields, start by zeroing the full struct with memset. This makes it clear that every byte copied to userspace is initialized, even implicit padding bytes (though there are none currently). In the scenarios I tested, it also performed the same as a designated initializer. Manually initializing each field was still slightly faster, but would have been more error-prone and less verifiable. Also rename statx_set_result() to cp_statx() for consistency with cp_old_stat() et al., and make it noinline so that struct statx doesn't add to the stack usage during the main portion of the syscall execution. Signed-off-by: Eric Biggers --- fs/stat.c | 74 +++++++++++++++++++++++++++------------------------------------ 1 file changed, 32 insertions(+), 42 deletions(-) diff --git a/fs/stat.c b/fs/stat.c index fa0be59340cc..1c5d5c11bddc 100644 --- a/fs/stat.c +++ b/fs/stat.c @@ -509,46 +509,37 @@ SYSCALL_DEFINE4(fstatat64, int, dfd, const char __user *, filename, } #endif /* __ARCH_WANT_STAT64 || __ARCH_WANT_COMPAT_STAT64 */ -static inline int __put_timestamp(struct timespec *kts, - struct statx_timestamp __user *uts) +static noinline_for_stack int +cp_statx(const struct kstat *stat, struct statx __user *buffer) { - return (__put_user(kts->tv_sec, &uts->tv_sec ) || - __put_user(kts->tv_nsec, &uts->tv_nsec ) || - __put_user(0, &uts->__reserved )); -} - -/* - * Set the statx results. - */ -static long statx_set_result(struct kstat *stat, struct statx __user *buffer) -{ - uid_t uid = from_kuid_munged(current_user_ns(), stat->uid); - gid_t gid = from_kgid_munged(current_user_ns(), stat->gid); - - if (__put_user(stat->result_mask, &buffer->stx_mask ) || - __put_user(stat->mode, &buffer->stx_mode ) || - __clear_user(&buffer->__spare0, sizeof(buffer->__spare0)) || - __put_user(stat->nlink, &buffer->stx_nlink ) || - __put_user(uid, &buffer->stx_uid ) || - __put_user(gid, &buffer->stx_gid ) || - __put_user(stat->attributes, &buffer->stx_attributes ) || - __put_user(stat->blksize, &buffer->stx_blksize ) || - __put_user(MAJOR(stat->rdev), &buffer->stx_rdev_major ) || - __put_user(MINOR(stat->rdev), &buffer->stx_rdev_minor ) || - __put_user(MAJOR(stat->dev), &buffer->stx_dev_major ) || - __put_user(MINOR(stat->dev), &buffer->stx_dev_minor ) || - __put_timestamp(&stat->atime, &buffer->stx_atime ) || - __put_timestamp(&stat->btime, &buffer->stx_btime ) || - __put_timestamp(&stat->ctime, &buffer->stx_ctime ) || - __put_timestamp(&stat->mtime, &buffer->stx_mtime ) || - __put_user(stat->ino, &buffer->stx_ino ) || - __put_user(stat->size, &buffer->stx_size ) || - __put_user(stat->blocks, &buffer->stx_blocks ) || - __clear_user(&buffer->__spare1, sizeof(buffer->__spare1)) || - __clear_user(&buffer->__spare2, sizeof(buffer->__spare2))) - return -EFAULT; - - return 0; + struct statx tmp; + + memset(&tmp, 0, sizeof(tmp)); + + tmp.stx_mask = stat->result_mask; + tmp.stx_blksize = stat->blksize; + tmp.stx_attributes = stat->attributes; + tmp.stx_nlink = stat->nlink; + tmp.stx_uid = from_kuid_munged(current_user_ns(), stat->uid); + tmp.stx_gid = from_kgid_munged(current_user_ns(), stat->gid); + tmp.stx_mode = stat->mode; + tmp.stx_ino = stat->ino; + tmp.stx_size = stat->size; + tmp.stx_blocks = stat->blocks; + tmp.stx_atime.tv_sec = stat->atime.tv_sec; + tmp.stx_atime.tv_nsec = stat->atime.tv_nsec; + tmp.stx_btime.tv_sec = stat->btime.tv_sec; + tmp.stx_btime.tv_nsec = stat->btime.tv_nsec; + tmp.stx_ctime.tv_sec = stat->ctime.tv_sec; + tmp.stx_ctime.tv_nsec = stat->ctime.tv_nsec; + tmp.stx_mtime.tv_sec = stat->mtime.tv_sec; + tmp.stx_mtime.tv_nsec = stat->mtime.tv_nsec; + tmp.stx_rdev_major = MAJOR(stat->rdev); + tmp.stx_rdev_minor = MINOR(stat->rdev); + tmp.stx_dev_major = MAJOR(stat->dev); + tmp.stx_dev_minor = MINOR(stat->dev); + + return copy_to_user(buffer, &tmp, sizeof(tmp)) ? -EFAULT : 0; } /** @@ -572,8 +563,6 @@ SYSCALL_DEFINE5(statx, if ((flags & AT_STATX_SYNC_TYPE) == AT_STATX_SYNC_TYPE) return -EINVAL; - if (!access_ok(VERIFY_WRITE, buffer, sizeof(*buffer))) - return -EFAULT; if (filename) error = vfs_statx(dfd, filename, flags, &stat, mask); @@ -581,7 +570,8 @@ SYSCALL_DEFINE5(statx, error = vfs_statx_fd(dfd, &stat, mask, flags); if (error) return error; - return statx_set_result(&stat, buffer); + + return cp_statx(&stat, buffer); } /* Caller is here responsible for sufficient locking (ie. inode->i_lock) */