From patchwork Thu Nov 4 05:52:40 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12602439
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from mail.kernel.org (mail.kernel.org [198.145.29.99])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 1C366C433F5
	for ; Thu, 4 Nov 2021 05:53:31 +0000 (UTC)
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by mail.kernel.org (Postfix) with ESMTP id EAE81611AE
	for ; Thu, 4 Nov 2021 05:53:30 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S230319AbhKDF4H (ORCPT ); Thu, 4 Nov 2021 01:56:07 -0400
Received: from us-smtp-delivery-124.mimecast.com ([216.205.24.124]:31235
	"EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK)
	by vger.kernel.org with ESMTP id S230011AbhKDF4G (ORCPT );
	Thu, 4 Nov 2021 01:56:06 -0400
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com;
	s=mimecast20190719; t=1636005208;
	h=from:from:reply-to:subject:subject:date:date:message-id:message-id:
	 to:to:cc:cc:mime-version:mime-version:
	 content-transfer-encoding:content-transfer-encoding:
	 in-reply-to:in-reply-to:references:references;
	bh=JFsazhmTKNLFL+SfYDlDrurjwOXOXwKwmC6D4crdwR8=;
	b=bVev/i032OolRQeBTg0RrECERfNRkUfJGpIUpZ8ja1AJd08RT+hBsXqKNFpjxtfAt+ByUo
	 fSzdOufw/nV+JBd0GlSwub79EIT1aqW+3X1XS6UvgWVlCX6ka6O8SYnq0nUwKubhYZlqo1
	 KMQRtEOCks+VrgnZHfjQTJPLZhVmqhI=
Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com
	[209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id
	us-mta-489-2m6tHpslNdmxDTeYH-0Jsg-1; Thu, 04 Nov 2021 01:53:25 -0400
X-MC-Unique: 2m6tHpslNdmxDTeYH-0Jsg-1
Received: from smtp.corp.redhat.com
	(int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13])
	(using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits))
	(No client certificate requested)
	by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4AB681006AA6;
	Thu, 4 Nov 2021 05:53:24 +0000 (UTC)
Received: from lxbceph1.gsslab.pek2.redhat.com (unknown [10.72.47.117])
	by smtp.corp.redhat.com (Postfix) with ESMTP id 4BC0E694B6;
	Thu, 4 Nov 2021 05:53:14 +0000 (UTC)
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, pdonnell@redhat.com,
	khiremat@redhat.com, ceph-devel@vger.kernel.org
Subject: [PATCH v6 1/9] libceph: add CEPH_OSD_OP_ASSERT_VER support
Date: Thu, 4 Nov 2021 13:52:40 +0800
Message-Id: <20211104055248.190987-2-xiubli@redhat.com>
In-Reply-To: <20211104055248.190987-1-xiubli@redhat.com>
References: <20211104055248.190987-1-xiubli@redhat.com>
MIME-Version: 1.0
X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

From: Jeff Layton

...and record the user_version in the reply in a new field in
ceph_osd_request, so we can populate the assert_ver appropriately.
Shuffle the fields a bit too so that the new field fits in an
existing hole on x86_64.
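
For anyone who wants to sanity-check the byte layout, here is a minimal
userspace sketch of the assert_ver payload this patch encodes: eight zero
bytes for the unused field, then the version, both little-endian. The
helper names put_le64() and encode_assert_ver() are hypothetical and are
not part of libceph; in the kernel the same job is done with cpu_to_le64()
in osd_req_encode_op().

```c
#include <stddef.h>
#include <stdint.h>

/* Store a 64-bit value in little-endian byte order on any host. */
static void put_le64(uint8_t *dst, uint64_t v)
{
	for (size_t i = 0; i < 8; i++)
		dst[i] = (uint8_t)(v >> (8 * i));
}

/*
 * Hypothetical stand-in for the CEPH_OSD_OP_ASSERT_VER case:
 * fills the 16-byte packed assert_ver member of the op union.
 */
static void encode_assert_ver(uint8_t out[16], uint64_t ver)
{
	put_le64(out, 0);	/* assert_ver.unused */
	put_le64(out + 8, ver);	/* assert_ver.ver */
}
```

Going by the commit message, the value passed as ver would normally be
the user_version that handle_reply() stored in r_version for an earlier
request.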

Signed-off-by: Jeff Layton
---
 include/linux/ceph/osd_client.h | 6 +++++-
 include/linux/ceph/rados.h      | 4 ++++
 net/ceph/osd_client.c           | 5 +++++
 3 files changed, 14 insertions(+), 1 deletion(-)

diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h
index 83fa08a06507..7ee1684d3edc 100644
--- a/include/linux/ceph/osd_client.h
+++ b/include/linux/ceph/osd_client.h
@@ -145,6 +145,9 @@ struct ceph_osd_req_op {
 			u32 src_fadvise_flags;
 			struct ceph_osd_data osd_data;
 		} copy_from;
+		struct {
+			u64 ver;
+		} assert_ver;
 	};
 };
 
@@ -199,6 +202,7 @@ struct ceph_osd_request {
 	struct ceph_osd_client *r_osdc;
 	struct kref r_kref;
 	bool r_mempool;
+	bool r_linger; /* don't resend on failure */
 	struct completion r_completion; /* private to osd_client.c */
 	ceph_osdc_callback_t r_callback;
 
@@ -211,9 +215,9 @@ struct ceph_osd_request {
 	struct ceph_snap_context *r_snapc; /* for writes */
 	struct timespec64 r_mtime; /* ditto */
 	u64 r_data_offset; /* ditto */
-	bool r_linger; /* don't resend on failure */
 
 	/* internal */
+	u64 r_version; /* data version sent in reply */
 	unsigned long r_stamp; /* jiffies, send or check time */
 	unsigned long r_start_stamp; /* jiffies */
 	ktime_t r_start_latency; /* ktime_t */
diff --git a/include/linux/ceph/rados.h b/include/linux/ceph/rados.h
index 43a7a1573b51..73c3efbec36c 100644
--- a/include/linux/ceph/rados.h
+++ b/include/linux/ceph/rados.h
@@ -523,6 +523,10 @@ struct ceph_osd_op {
 		struct {
 			__le64 cookie;
 		} __attribute__ ((packed)) notify;
+		struct {
+			__le64 unused;
+			__le64 ver;
+		} __attribute__ ((packed)) assert_ver;
 		struct {
 			__le64 offset, length;
 			__le64 src_offset;
diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index ff8624a7c964..f3a9af012123 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1038,6 +1038,10 @@ static u32 osd_req_encode_op(struct ceph_osd_op *dst,
 		dst->copy_from.src_fadvise_flags =
 			cpu_to_le32(src->copy_from.src_fadvise_flags);
 		break;
+	case CEPH_OSD_OP_ASSERT_VER:
+		dst->assert_ver.unused = cpu_to_le64(0);
+		dst->assert_ver.ver = cpu_to_le64(src->assert_ver.ver);
+		break;
 	default:
 		pr_err("unsupported osd opcode %s\n",
 			ceph_osd_op_name(src->op));
@@ -3763,6 +3767,7 @@ static void handle_reply(struct ceph_osd *osd, struct ceph_msg *msg)
 	 * one (type of) reply back.
 	 */
 	WARN_ON(!(m.flags & CEPH_OSD_FLAG_ONDISK));
+	req->r_version = m.user_version;
 	req->r_result = m.result ?: data_len;
 	finish_request(req);
 	mutex_unlock(&osd->lock);

From patchwork Thu Nov 4 05:52:41 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12602441
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, pdonnell@redhat.com,
	khiremat@redhat.com, ceph-devel@vger.kernel.org
Subject: [PATCH v6 2/9] ceph: size handling for encrypted inodes in cap updates
Date: Thu, 4 Nov 2021 13:52:41 +0800
Message-Id: <20211104055248.190987-3-xiubli@redhat.com>
In-Reply-To: <20211104055248.190987-1-xiubli@redhat.com>
References: <20211104055248.190987-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Jeff Layton

Transmit the rounded-up size as the normal size, and fill out the
fscrypt_file field with the real file size.
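
As an illustration of the two sizes a cap update ends up carrying, the
userspace sketch below rounds the size up to the 4096-byte crypto block
(CEPH_FSCRYPT_BLOCK_SHIFT is 12 in this patch) for encrypted inodes and
keeps the real i_size alongside it. struct cap_sizes, round_up_block()
and cap_sizes_for() are hypothetical names used only for this example;
the kernel code uses round_up() directly in encode_cap_msg().

```c
#include <stdint.h>

#define FSCRYPT_BLOCK_SHIFT 12
#define FSCRYPT_BLOCK_SIZE  (UINT64_C(1) << FSCRYPT_BLOCK_SHIFT)

/* The two sizes sent for one inode in a cap message. */
struct cap_sizes {
	uint64_t size;		/* traditional size field */
	uint64_t fscrypt_file;	/* real i_size */
};

/* Round up to a multiple of the block size (a power of two). */
static uint64_t round_up_block(uint64_t n)
{
	return (n + FSCRYPT_BLOCK_SIZE - 1) & ~(FSCRYPT_BLOCK_SIZE - 1);
}

/* Hypothetical helper: pick both sizes from the in-memory i_size. */
static struct cap_sizes cap_sizes_for(uint64_t i_size, int encrypted)
{
	struct cap_sizes s;

	s.size = encrypted ? round_up_block(i_size) : i_size;
	s.fscrypt_file = i_size;
	return s;
}
```

For example, an encrypted file of 5000 bytes would be reported with a
size of 8192 and an fscrypt_file value of 5000, while an unencrypted
file of the same length keeps 5000 in both.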

Signed-off-by: Jeff Layton
---
 fs/ceph/caps.c   | 43 +++++++++++++++++++++++++------------------
 fs/ceph/crypto.h |  4 ++++
 2 files changed, 29 insertions(+), 18 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index 80f521dd7254..fc367f42536a 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -1215,10 +1215,9 @@ struct cap_msg_args {
 	umode_t mode;
 	bool inline_data;
 	bool wake;
+	bool encrypted;
 	u32 fscrypt_auth_len;
-	u32 fscrypt_file_len;
 	u8 fscrypt_auth[sizeof(struct ceph_fscrypt_auth)]; // for context
-	u8 fscrypt_file[sizeof(u64)]; // for size
 };
 
 /* Marshal up the cap msg to the MDS */
@@ -1253,7 +1252,12 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg)
 	fc->ino = cpu_to_le64(arg->ino);
 	fc->snap_follows = cpu_to_le64(arg->follows);
 
-	fc->size = cpu_to_le64(arg->size);
+#if IS_ENABLED(CONFIG_FS_ENCRYPTION)
+	if (arg->encrypted)
+		fc->size = cpu_to_le64(round_up(arg->size, CEPH_FSCRYPT_BLOCK_SIZE));
+	else
+#endif
+		fc->size = cpu_to_le64(arg->size);
 	fc->max_size = cpu_to_le64(arg->max_size);
 	ceph_encode_timespec64(&fc->mtime, &arg->mtime);
 	ceph_encode_timespec64(&fc->atime, &arg->atime);
@@ -1313,11 +1317,17 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg)
 	ceph_encode_64(&p, 0);
 
 #if IS_ENABLED(CONFIG_FS_ENCRYPTION)
-	/* fscrypt_auth and fscrypt_file (version 12) */
+	/*
+	 * fscrypt_auth and fscrypt_file (version 12)
+	 *
+	 * fscrypt_auth holds the crypto context (if any). fscrypt_file
+	 * tracks the real i_size as an __le64 field (and we use a rounded-up
+	 * i_size in the traditional size field).
+	 */
 	ceph_encode_32(&p, arg->fscrypt_auth_len);
 	ceph_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len);
-	ceph_encode_32(&p, arg->fscrypt_file_len);
-	ceph_encode_copy(&p, arg->fscrypt_file, arg->fscrypt_file_len);
+	ceph_encode_32(&p, sizeof(__le64));
+	ceph_encode_64(&p, arg->size);
 #else /* CONFIG_FS_ENCRYPTION */
 	ceph_encode_32(&p, 0);
 	ceph_encode_32(&p, 0);
@@ -1389,7 +1399,6 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap,
 	arg->follows = flushing ? ci->i_head_snapc->seq : 0;
 	arg->flush_tid = flush_tid;
 	arg->oldest_flush_tid = oldest_flush_tid;
-
 	arg->size = i_size_read(inode);
 	ci->i_reported_size = arg->size;
 	arg->max_size = ci->i_wanted_max_size;
@@ -1443,6 +1452,7 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap,
 		}
 	}
 	arg->flags = flags;
+	arg->encrypted = IS_ENCRYPTED(inode);
 #if IS_ENABLED(CONFIG_FS_ENCRYPTION)
 	if (ci->fscrypt_auth_len &&
 	    WARN_ON_ONCE(ci->fscrypt_auth_len != sizeof(struct ceph_fscrypt_auth))) {
@@ -1453,21 +1463,21 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap,
 		memcpy(arg->fscrypt_auth, ci->fscrypt_auth,
 			min_t(size_t, ci->fscrypt_auth_len, sizeof(arg->fscrypt_auth)));
 	}
-	/* FIXME: use this to track "real" size */
-	arg->fscrypt_file_len = 0;
 #endif /* CONFIG_FS_ENCRYPTION */
 }
 
+#if IS_ENABLED(CONFIG_FS_ENCRYPTION)
 #define CAP_MSG_FIXED_FIELDS (sizeof(struct ceph_mds_caps) + \
-		4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4)
+		4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4 + 8)
 
-#if IS_ENABLED(CONFIG_FS_ENCRYPTION)
 static inline int cap_msg_size(struct cap_msg_args *arg)
 {
-	return CAP_MSG_FIXED_FIELDS + arg->fscrypt_auth_len +
-			arg->fscrypt_file_len;
+	return CAP_MSG_FIXED_FIELDS + arg->fscrypt_auth_len;
 }
 #else
+#define CAP_MSG_FIXED_FIELDS (sizeof(struct ceph_mds_caps) + \
+		4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4)
+
 static inline int cap_msg_size(struct cap_msg_args *arg)
 {
 	return CAP_MSG_FIXED_FIELDS;
@@ -1546,13 +1556,10 @@ static inline int __send_flush_snap(struct inode *inode,
 	arg.inline_data = capsnap->inline_data;
 	arg.flags = 0;
 	arg.wake = false;
+	arg.encrypted = IS_ENCRYPTED(inode);
 
-	/*
-	 * No fscrypt_auth changes from a capsnap. It will need
-	 * to update fscrypt_file on size changes (TODO).
-	 */
+	/* No fscrypt_auth changes from a capsnap. */
 	arg.fscrypt_auth_len = 0;
-	arg.fscrypt_file_len = 0;
 
 	msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, cap_msg_size(&arg),
 			   GFP_NOFS, false);
diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h
index c2e0cbb5667b..ab27a7ed62c3 100644
--- a/fs/ceph/crypto.h
+++ b/fs/ceph/crypto.h
@@ -9,6 +9,10 @@
 #include
 #include
 
+#define CEPH_FSCRYPT_BLOCK_SHIFT 12
+#define CEPH_FSCRYPT_BLOCK_SIZE (_AC(1,UL) << CEPH_FSCRYPT_BLOCK_SHIFT)
+#define CEPH_FSCRYPT_BLOCK_MASK (~(CEPH_FSCRYPT_BLOCK_SIZE-1))
+
 struct ceph_fs_client;
 struct ceph_acl_sec_ctx;
 struct ceph_mds_request;

From patchwork Thu Nov 4 05:52:42 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12602443
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, pdonnell@redhat.com,
	khiremat@redhat.com, ceph-devel@vger.kernel.org
Subject: [PATCH v6 3/9] ceph: fscrypt_file field handling in MClientRequest messages
Date: Thu, 4 Nov 2021 13:52:42 +0800
Message-Id: <20211104055248.190987-4-xiubli@redhat.com>
In-Reply-To: <20211104055248.190987-1-xiubli@redhat.com>
References: <20211104055248.190987-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Jeff Layton

For encrypted inodes, transmit a rounded-up size to the MDS as the
normal file size and send the real inode size in the fscrypt_file
field. Also fix up creates and truncates to transmit fscrypt_file.
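
The fscrypt_file field at the tail of the request is a length-prefixed
optional value: a 32-bit length, followed by a 64-bit size only when
the request flag is set. The userspace sketch below shows that shape
and the matching length accounting. The names put_le32(), put_le64(),
fscrypt_file_len() and encode_fscrypt_file() are hypothetical; the
kernel code uses ceph_encode_32() and ceph_encode_64() and tests
CEPH_MDS_R_FSCRYPT_FILE instead of taking a have_field argument.

```c
#include <stddef.h>
#include <stdint.h>

/* Write a little-endian u32 and return the advanced pointer. */
static uint8_t *put_le32(uint8_t *p, uint32_t v)
{
	for (size_t i = 0; i < 4; i++)
		p[i] = (uint8_t)(v >> (8 * i));
	return p + 4;
}

/* Write a little-endian u64 and return the advanced pointer. */
static uint8_t *put_le64(uint8_t *p, uint64_t v)
{
	for (size_t i = 0; i < 8; i++)
		p[i] = (uint8_t)(v >> (8 * i));
	return p + 8;
}

/* Bytes the field occupies: the length word plus an optional u64. */
static size_t fscrypt_file_len(int have_field)
{
	return sizeof(uint32_t) + (have_field ? sizeof(uint64_t) : 0);
}

/* Hypothetical encoder for the fscrypt_file tail field. */
static uint8_t *encode_fscrypt_file(uint8_t *p, int have_field,
				    uint64_t real_size)
{
	if (!have_field)
		return put_le32(p, 0);	/* empty field */
	p = put_le32(p, sizeof(uint64_t));
	return put_le64(p, real_size);
}
```

Keeping the size computation and the encoder in step is the point of
the matching change to create_request_message(): the buffer must grow
by eight bytes exactly when the flag is set.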

Signed-off-by: Jeff Layton
---
 fs/ceph/dir.c        |  3 +++
 fs/ceph/file.c       |  2 ++
 fs/ceph/inode.c      | 18 ++++++++++++++++--
 fs/ceph/mds_client.c |  9 ++++++++-
 fs/ceph/mds_client.h |  2 ++
 5 files changed, 31 insertions(+), 3 deletions(-)

diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c
index 37c9c589ee27..987c1579614c 100644
--- a/fs/ceph/dir.c
+++ b/fs/ceph/dir.c
@@ -916,6 +916,9 @@ static int ceph_mknod(struct user_namespace *mnt_userns, struct inode *dir,
 		goto out_req;
 	}
 
+	if (S_ISREG(mode) && IS_ENCRYPTED(dir))
+		set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags);
+
 	req->r_dentry = dget(dentry);
 	req->r_num_caps = 2;
 	req->r_parent = dir;
diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 126d2d80686c..8c0b9ed7f48b 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -715,6 +715,8 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry,
 	req->r_args.open.mask = cpu_to_le32(mask);
 	req->r_parent = dir;
 	ihold(dir);
+	if (IS_ENCRYPTED(dir))
+		set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags);
 
 	if (flags & O_CREAT) {
 		struct ceph_file_layout lo;
diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index d24d42c94d43..4a7b2b0d88f7 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -2383,11 +2383,25 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c
 			}
 		} else if ((issued & CEPH_CAP_FILE_SHARED) == 0 ||
 			   attr->ia_size != isize) {
-			req->r_args.setattr.size = cpu_to_le64(attr->ia_size);
-			req->r_args.setattr.old_size = cpu_to_le64(isize);
 			mask |= CEPH_SETATTR_SIZE;
 			release |= CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL |
 				   CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR;
+			if (IS_ENCRYPTED(inode)) {
+				set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags);
+				mask |= CEPH_SETATTR_FSCRYPT_FILE;
+				req->r_args.setattr.size =
+					cpu_to_le64(round_up(attr->ia_size,
+							     CEPH_FSCRYPT_BLOCK_SIZE));
+				req->r_args.setattr.old_size =
+					cpu_to_le64(round_up(isize,
+							     CEPH_FSCRYPT_BLOCK_SIZE));
+				req->r_fscrypt_file = attr->ia_size;
+				/* FIXME: client must zero out any partial blocks! */
+			} else {
+				req->r_args.setattr.size = cpu_to_le64(attr->ia_size);
+				req->r_args.setattr.old_size = cpu_to_le64(isize);
+				req->r_fscrypt_file = 0;
+			}
 		}
 	}
 	if (ia_valid & ATTR_MTIME) {
diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 69caea1d2444..e2d1b98c61fc 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -2653,7 +2653,12 @@ static void encode_mclientrequest_tail(void **p, const struct ceph_mds_request *
 	} else {
 		ceph_encode_32(p, 0);
 	}
-	ceph_encode_32(p, 0); // fscrypt_file for now
+	if (test_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags)) {
+		ceph_encode_32(p, sizeof(__le64));
+		ceph_encode_64(p, req->r_fscrypt_file);
+	} else {
+		ceph_encode_32(p, 0);
+	}
 }
 
 /*
@@ -2739,6 +2744,8 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session,
 
 	/* fscrypt_file */
 	len += sizeof(u32);
+	if (test_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags))
+		len += sizeof(__le64);
 
 	msg = ceph_msg_new2(CEPH_MSG_CLIENT_REQUEST, len, 1, GFP_NOFS, false);
 	if (!msg) {
diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h
index 6a2ac489e06e..d64ff1bd2f5d 100644
--- a/fs/ceph/mds_client.h
+++ b/fs/ceph/mds_client.h
@@ -276,6 +276,7 @@ struct ceph_mds_request {
 #define CEPH_MDS_R_DID_PREPOPULATE (6) /* prepopulated readdir */
 #define CEPH_MDS_R_PARENT_LOCKED (7) /* is r_parent->i_rwsem wlocked? */
 #define CEPH_MDS_R_ASYNC (8) /* async request */
+#define CEPH_MDS_R_FSCRYPT_FILE (9) /* must marshal fscrypt_file field */
 	unsigned long r_req_flags;
 
 	struct mutex r_fill_mutex;
@@ -283,6 +284,7 @@ struct ceph_mds_request {
 	union ceph_mds_request_args r_args;
 
 	struct ceph_fscrypt_auth *r_fscrypt_auth;
+	u64 r_fscrypt_file; /* CPU-endian; ceph_encode_64() converts on the wire */
 
 	u8 *r_altname; /* fscrypt binary crypttext for long filenames */
 	u32 r_altname_len; /* length of r_altname */

From patchwork Thu Nov 4 05:52:43 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12602445
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, pdonnell@redhat.com,
	khiremat@redhat.com, ceph-devel@vger.kernel.org
Subject: [PATCH v6 4/9] ceph: get file size from fscrypt_file when present in inode traces
Date: Thu, 4 Nov 2021 13:52:43 +0800
Message-Id: <20211104055248.190987-5-xiubli@redhat.com>
In-Reply-To: <20211104055248.190987-1-xiubli@redhat.com>
References: <20211104055248.190987-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Jeff Layton

Signed-off-by: Jeff Layton
---
 fs/ceph/inode.c | 30 +++++++++++++++++++-----------
 1 file changed, 19 insertions(+), 11 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index 4a7b2b0d88f7..15c2fb1e2c8a 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -978,6 +978,16 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
 			from_kgid(&init_user_ns, inode->i_gid));
 		ceph_decode_timespec64(&ci->i_btime, &iinfo->btime);
 		ceph_decode_timespec64(&ci->i_snap_btime, &iinfo->snap_btime);
+
+#ifdef CONFIG_FS_ENCRYPTION
+		if (iinfo->fscrypt_auth_len && !ci->fscrypt_auth) {
+			ci->fscrypt_auth_len = iinfo->fscrypt_auth_len;
+			ci->fscrypt_auth = iinfo->fscrypt_auth;
+			iinfo->fscrypt_auth = NULL;
+			iinfo->fscrypt_auth_len = 0;
+			inode_set_flags(inode, S_ENCRYPTED, S_ENCRYPTED);
+		}
+#endif
 	}
 
 	if ((new_version || (new_issued & CEPH_CAP_LINK_SHARED)) &&
@@ -1001,6 +1011,7 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
 
 	if (new_version ||
 	    (new_issued & (CEPH_CAP_ANY_FILE_RD | CEPH_CAP_ANY_FILE_WR))) {
+		u64 size = le64_to_cpu(info->size);
 		s64 old_pool = ci->i_layout.pool_id;
 		struct ceph_string *old_ns;
 
@@ -1014,10 +1025,17 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
 
 		pool_ns = old_ns;
 
+		if (IS_ENCRYPTED(inode) && size &&
+		    (iinfo->fscrypt_file_len == sizeof(__le64))) {
+			size = __le64_to_cpu(*(__le64 *)iinfo->fscrypt_file);
+			if (le64_to_cpu(info->size) != round_up(size, CEPH_FSCRYPT_BLOCK_SIZE))
+				pr_warn("size=%llu fscrypt_file=%llu\n", le64_to_cpu(info->size), size);
+		}
+
 		queue_trunc = ceph_fill_file_size(inode, issued,
 					le32_to_cpu(info->truncate_seq),
 					le64_to_cpu(info->truncate_size),
-					le64_to_cpu(info->size));
+					size);
 		/* only update max_size on auth cap */
 		if ((info->cap.flags & CEPH_CAP_FLAG_AUTH) &&
 		    ci->i_max_size != le64_to_cpu(info->max_size)) {
@@ -1057,16 +1075,6 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
 		xattr_blob = NULL;
 	}
 
-#ifdef CONFIG_FS_ENCRYPTION
-	if (iinfo->fscrypt_auth_len && !ci->fscrypt_auth) {
-		ci->fscrypt_auth_len = iinfo->fscrypt_auth_len;
-		ci->fscrypt_auth = iinfo->fscrypt_auth;
-		iinfo->fscrypt_auth = NULL;
-		iinfo->fscrypt_auth_len = 0;
-		inode_set_flags(inode, S_ENCRYPTED, S_ENCRYPTED);
-	}
-#endif
-
 	/* finally update i_version */
 	if (le64_to_cpu(info->version) > ci->i_version)
 		ci->i_version = le64_to_cpu(info->version);

From patchwork Thu Nov 4 05:52:44 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12602447
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, pdonnell@redhat.com,
	khiremat@redhat.com, ceph-devel@vger.kernel.org
Subject: [PATCH v6 5/9] ceph: handle fscrypt fields in cap messages from MDS
Date: Thu, 4 Nov 2021 13:52:44 +0800
Message-Id: <20211104055248.190987-6-xiubli@redhat.com>
In-Reply-To: <20211104055248.190987-1-xiubli@redhat.com>
References: <20211104055248.190987-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Jeff Layton

Signed-off-by: Jeff Layton
---
 fs/ceph/caps.c | 74 ++++++++++++++++++++++++++++++++++++++++++++++++--
 1 file changed, 72 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index fc367f42536a..c9f1ac3ad2f3 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -3329,6 +3329,9 @@ struct cap_extra_info {
 	/* currently issued */
 	int issued;
 	struct timespec64 btime;
+	u8 *fscrypt_auth;
+	u32 fscrypt_auth_len;
+	u64 fscrypt_file_size;
 };
 
 /*
@@ -3361,6 +3364,14 @@ static void handle_cap_grant(struct inode *inode,
 	bool deleted_inode = false;
 	bool fill_inline = false;
 
+	/*
+	 * If there is at least one crypto block then we'll trust fscrypt_file_size.
+	 * If the real length of the file is 0, then ignore it (it has probably been
+	 * truncated down to 0 by the MDS).
+	 */
+	if (IS_ENCRYPTED(inode) && size)
+		size = extra_info->fscrypt_file_size;
+
 	dout("handle_cap_grant inode %p cap %p mds%d seq %d %s\n",
 	     inode, cap, session->s_mds, seq, ceph_cap_string(newcaps));
 	dout(" size %llu max_size %llu, i_size %llu\n", size, max_size,
@@ -3839,7 +3850,8 @@ static void handle_cap_flushsnap_ack(struct inode *inode, u64 flush_tid,
  */
 static bool handle_cap_trunc(struct inode *inode,
 			     struct ceph_mds_caps *trunc,
-			     struct ceph_mds_session *session)
+			     struct ceph_mds_session *session,
+			     struct cap_extra_info *extra_info)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	int mds = session->s_mds;
@@ -3856,6 +3868,14 @@ static bool handle_cap_trunc(struct inode *inode,
 
 	issued |= implemented | dirty;
 
+	/*
+	 * If there is at least one crypto block then we'll trust fscrypt_file_size.
+	 * If the real length of the file is 0, then ignore it (it has probably been
+	 * truncated down to 0 by the MDS).
+	 */
+	if (IS_ENCRYPTED(inode) && size)
+		size = extra_info->fscrypt_file_size;
+
 	dout("handle_cap_trunc inode %p mds%d seq %d to %lld seq %d\n",
 	     inode, mds, seq, truncate_size, truncate_seq);
 	queue_trunc = ceph_fill_file_size(inode, issued,
@@ -4074,6 +4094,48 @@ static void handle_cap_import(struct ceph_mds_client *mdsc,
 	*target_cap = cap;
 }
 
+#ifdef CONFIG_FS_ENCRYPTION
+static int parse_fscrypt_fields(void **p, void *end, struct cap_extra_info *extra)
+{
+	u32 len;
+
+	ceph_decode_32_safe(p, end, extra->fscrypt_auth_len, bad);
+	if (extra->fscrypt_auth_len) {
+		ceph_decode_need(p, end, extra->fscrypt_auth_len, bad);
+		extra->fscrypt_auth = kmalloc(extra->fscrypt_auth_len, GFP_KERNEL);
+		if (!extra->fscrypt_auth)
+			return -ENOMEM;
+		ceph_decode_copy_safe(p, end, extra->fscrypt_auth,
+					extra->fscrypt_auth_len, bad);
+	}
+
+	ceph_decode_32_safe(p, end, len, bad);
+	if (len == sizeof(u64))
+		ceph_decode_64_safe(p, end, extra->fscrypt_file_size, bad);
+	else
+		ceph_decode_skip_n(p, end, len, bad);
+	return 0;
+bad:
+	return -EIO;
+}
+#else
+static int parse_fscrypt_fields(void **p, void *end, struct cap_extra_info *extra)
+{
+	u32 len;
+
+	/* Don't care about these fields unless we're encryption-capable */
+	ceph_decode_32_safe(p, end, len, bad);
+	if (len)
+		ceph_decode_skip_n(p, end, len, bad);
+	ceph_decode_32_safe(p, end, len, bad);
+	if (len)
+		ceph_decode_skip_n(p, end, len, bad);
+	return 0;
+bad:
+	return -EIO;
+}
+#endif
+
 /*
  * Handle a caps message from the MDS.
  *
@@ -4192,6 +4254,12 @@ void ceph_handle_caps(struct ceph_mds_session *session,
 		ceph_decode_64_safe(&p, end, extra_info.nsubdirs, bad);
 	}
 
+	if (msg_version >= 12) {
+		int ret = parse_fscrypt_fields(&p, end, &extra_info);
+		if (ret)
+			goto bad;
+	}
+
 	/* lookup ino */
 	inode = ceph_find_inode(mdsc->fsc->sb, vino);
 	ci = ceph_inode(inode);
@@ -4288,7 +4356,8 @@ void ceph_handle_caps(struct ceph_mds_session *session,
 		break;
 
 	case CEPH_CAP_OP_TRUNC:
-		queue_trunc = handle_cap_trunc(inode, h, session);
+		queue_trunc = handle_cap_trunc(inode, h, session,
+						&extra_info);
 		spin_unlock(&ci->i_ceph_lock);
 		if (queue_trunc)
 			ceph_queue_vmtruncate(inode);
@@ -4306,6 +4375,7 @@ void ceph_handle_caps(struct ceph_mds_session *session,
 	iput(inode);
 out:
 	ceph_put_string(extra_info.pool_ns);
+	kfree(extra_info.fscrypt_auth);
 	return;
 
 flush_cap_releases:

From patchwork Thu Nov 4 05:52:45 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12602449
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, pdonnell@redhat.com,
	khiremat@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li
Subject: [PATCH v6 6/9] ceph: add __ceph_get_caps helper support
Date: Thu, 4 Nov 2021 13:52:45 +0800
Message-Id: <20211104055248.190987-7-xiubli@redhat.com>
In-Reply-To: <20211104055248.190987-1-xiubli@redhat.com>
References: <20211104055248.190987-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Xiubo Li

Signed-off-by: Xiubo Li
---
 fs/ceph/caps.c  | 19 +++++++++++++------
 fs/ceph/super.h |  2 ++
 2 files changed, 15 insertions(+), 6 deletions(-)

diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c
index c9f1ac3ad2f3..c15c5dd36747 100644
--- a/fs/ceph/caps.c
+++ b/fs/ceph/caps.c
@@ -2911,10 +2911,9 @@ int ceph_try_get_caps(struct inode *inode, int need, int want,
  * due to a small max_size, make sure we check_max_size (and possibly
  * ask the mds) so we don't get hung up indefinitely.
  */
-int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got)
+int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi, int need,
+		    int want, loff_t endoff, int *got)
 {
-	struct ceph_file_info *fi = filp->private_data;
-	struct inode *inode = file_inode(filp);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	int ret, _got, flags;
@@ -2923,7 +2922,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 	if (ret < 0)
 		return ret;
 
-	if ((fi->fmode & CEPH_FILE_MODE_WR) &&
+	if (fi && (fi->fmode & CEPH_FILE_MODE_WR) &&
 	    fi->filp_gen != READ_ONCE(fsc->filp_gen))
 		return -EBADF;
 
@@ -2931,7 +2930,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 
 	while (true) {
 		flags &= CEPH_FILE_MODE_MASK;
-		if (atomic_read(&fi->num_locks))
+		if (fi && atomic_read(&fi->num_locks))
 			flags |= CHECK_FILELOCK;
 		_got = 0;
 		ret = try_get_cap_refs(inode, need, want, endoff,
@@ -2976,7 +2975,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 			continue;
 		}
 
-		if ((fi->fmode & CEPH_FILE_MODE_WR) &&
+		if (fi && (fi->fmode & CEPH_FILE_MODE_WR) &&
 		    fi->filp_gen != READ_ONCE(fsc->filp_gen)) {
 			if (ret >= 0 && _got)
 				ceph_put_cap_refs(ci, _got);
@@ -3039,6 +3038,14 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got
 	return 0;
 }
 
+int 
ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got) +{ + struct ceph_file_info *fi = filp->private_data; + struct inode *inode = file_inode(filp); + + return __ceph_get_caps(inode, fi, need, want, endoff, got); +} + /* * Take cap refs. Caller must already know we hold at least one ref * on the caps in question or we don't know this is safe. diff --git a/fs/ceph/super.h b/fs/ceph/super.h index ea95c958202f..403918a4cdb3 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1225,6 +1225,8 @@ extern int ceph_encode_dentry_release(void **p, struct dentry *dn, struct inode *dir, int mds, int drop, int unless); +extern int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi, + int need, int want, loff_t endoff, int *got); extern int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got); extern int ceph_try_get_caps(struct inode *inode, From patchwork Thu Nov 4 05:52:46 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 12602451 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 98263C433EF for ; Thu, 4 Nov 2021 05:53:51 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 7FD35611AE for ; Thu, 4 Nov 2021 05:53:51 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230495AbhKDF42 (ORCPT ); Thu, 4 Nov 2021 01:56:28 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:38566 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S231129AbhKDF4W (ORCPT ); Thu, 4 Nov 2021 01:56:22 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; 
t=1636005225; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=PUYbmL6CclEet7vAjscxCDgBYxpZp9y1hVGL8/bxh5c=; b=RhS8lPc4U4lox8eMu2SUtgDvND6sp5sWrVeTTkW+Iph27ZdJ303vrKB2cPopja09vcwP8n BP+jRb36VKrCQ2C4iwm4gwejM2qvGuBfrA73/nqbebURkymUDR9MTVJCQfjsTe5/Qx9ZYq r2qSFETbb1MR34qdjjek7MBS39/o8Eo= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-270-o_VHAMPuPB2DxTcMLpswqg-1; Thu, 04 Nov 2021 01:53:41 -0400 X-MC-Unique: o_VHAMPuPB2DxTcMLpswqg-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 7AB1619200C0; Thu, 4 Nov 2021 05:53:40 +0000 (UTC) Received: from lxbceph1.gsslab.pek2.redhat.com (unknown [10.72.47.117]) by smtp.corp.redhat.com (Postfix) with ESMTP id 1FCB21B472; Thu, 4 Nov 2021 05:53:37 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org Cc: idryomov@gmail.com, vshankar@redhat.com, pdonnell@redhat.com, khiremat@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v6 7/9] ceph: add __ceph_sync_read helper support Date: Thu, 4 Nov 2021 13:52:46 +0800 Message-Id: <20211104055248.190987-8-xiubli@redhat.com> In-Reply-To: <20211104055248.190987-1-xiubli@redhat.com> References: <20211104055248.190987-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Signed-off-by: Xiubo Li --- fs/ceph/file.c | 34 ++++++++++++++++++++++------------ fs/ceph/super.h | 2 ++ 2 files changed, 24 insertions(+), 12 deletions(-) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 8c0b9ed7f48b..129f6a642f8e 100644 --- a/fs/ceph/file.c 
+++ b/fs/ceph/file.c @@ -870,21 +870,18 @@ enum { * If we get a short result from the OSD, check against i_size; we need to * only return a short read to the caller if we hit EOF. */ -static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, - int *retry_op) +ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, + struct iov_iter *to, int *retry_op) { - struct file *file = iocb->ki_filp; - struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_osd_client *osdc = &fsc->client->osdc; ssize_t ret; - u64 off = iocb->ki_pos; + u64 off = *ki_pos; u64 len = iov_iter_count(to); u64 i_size; - dout("sync_read on file %p %llu~%u %s\n", file, off, (unsigned)len, - (file->f_flags & O_DIRECT) ? "O_DIRECT" : ""); + dout("sync_read on inode %p %llu~%u\n", inode, *ki_pos, (unsigned)len); if (!len) return 0; @@ -986,14 +983,14 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, break; } - if (off > iocb->ki_pos) { + if (off > *ki_pos) { if (off >= i_size) { *retry_op = CHECK_EOF; - ret = i_size - iocb->ki_pos; - iocb->ki_pos = i_size; + ret = i_size - *ki_pos; + *ki_pos = i_size; } else { - ret = off - iocb->ki_pos; - iocb->ki_pos = off; + ret = off - *ki_pos; + *ki_pos = off; } } @@ -1001,6 +998,19 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, return ret; } +static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, + int *retry_op) +{ + struct file *file = iocb->ki_filp; + struct inode *inode = file_inode(file); + + dout("sync_read on file %p %llu~%u %s\n", file, iocb->ki_pos, + (unsigned)iov_iter_count(to), + (file->f_flags & O_DIRECT) ? 
"O_DIRECT" : ""); + + return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op); +} + struct ceph_aio_request { struct kiocb *iocb; size_t total_len; diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 403918a4cdb3..2362d758af97 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1253,6 +1253,8 @@ extern int ceph_renew_caps(struct inode *inode, int fmode); extern int ceph_open(struct inode *inode, struct file *file); extern int ceph_atomic_open(struct inode *dir, struct dentry *dentry, struct file *file, unsigned flags, umode_t mode); +extern ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, + struct iov_iter *to, int *retry_op); extern int ceph_release(struct inode *inode, struct file *filp); extern void ceph_fill_inline_data(struct inode *inode, struct page *locked_page, char *data, size_t len); From patchwork Thu Nov 4 05:52:47 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 12602453 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id D85A8C433FE for ; Thu, 4 Nov 2021 05:53:52 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id C38D5611AE for ; Thu, 4 Nov 2021 05:53:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231137AbhKDF43 (ORCPT ); Thu, 4 Nov 2021 01:56:29 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.133.124]:41166 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230410AbhKDF4Z (ORCPT ); Thu, 4 Nov 2021 01:56:25 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1636005228; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: 
to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=ZAzzRKRRwxkB4D9xzhRB2vzaeLtUmnNO03clKL79CVw=; b=foqw2lt2t3wDAb6HXCJPYEqs2QgyVQBNkK6Xnv0FWES/RVY/f3Jnq+yMLY8YzLsJsu8WOl o444jU0vuFgow6KdXjsCZVHoQSXx4a1IpQbxFdSoCMAlQNPvjyNM9nO9LxHVW1eXwntZbD krxQJszjkpm+IJ9GIoh0t37oe4eNes0= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-344-VBlhLYqXPAaEAtw-LwRnAw-1; Thu, 04 Nov 2021 01:53:44 -0400 X-MC-Unique: VBlhLYqXPAaEAtw-LwRnAw-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 5A29FEC1A0; Thu, 4 Nov 2021 05:53:43 +0000 (UTC) Received: from lxbceph1.gsslab.pek2.redhat.com (unknown [10.72.47.117]) by smtp.corp.redhat.com (Postfix) with ESMTP id 0180D2B399; Thu, 4 Nov 2021 05:53:40 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org Cc: idryomov@gmail.com, vshankar@redhat.com, pdonnell@redhat.com, khiremat@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v6 8/9] ceph: add object version support for sync read Date: Thu, 4 Nov 2021 13:52:47 +0800 Message-Id: <20211104055248.190987-9-xiubli@redhat.com> In-Reply-To: <20211104055248.190987-1-xiubli@redhat.com> References: <20211104055248.190987-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li A sync read may be split into several osdc requests, and each request may target a different RADOS object, so record the object version returned for each request. 
Signed-off-by: Xiubo Li --- fs/ceph/file.c | 44 ++++++++++++++++++++++++++++++++++++++++++-- fs/ceph/super.h | 18 +++++++++++++++++- 2 files changed, 59 insertions(+), 3 deletions(-) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 129f6a642f8e..cedd86a6058d 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -871,7 +871,8 @@ enum { * only return a short read to the caller if we hit EOF. */ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, - struct iov_iter *to, int *retry_op) + struct iov_iter *to, int *retry_op, + struct ceph_object_vers *objvers) { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); @@ -880,6 +881,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, u64 off = *ki_pos; u64 len = iov_iter_count(to); u64 i_size; + u32 object_count = 8; dout("sync_read on inode %p %llu~%u\n", inode, *ki_pos, (unsigned)len); @@ -896,6 +898,15 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, if (ret < 0) return ret; + if (objvers) { + objvers->count = 0; + objvers->objvers = kcalloc(object_count, + sizeof(struct ceph_object_ver), + GFP_KERNEL); + if (!objvers->objvers) + return -ENOMEM; + } + ret = 0; while ((len = iov_iter_count(to)) > 0) { struct ceph_osd_request *req; @@ -938,6 +949,30 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, req->r_end_latency, len, ret); + if (objvers) { + u32 ind = objvers->count; + + if (objvers->count >= object_count) { + int ov_size; + + object_count *= 2; + ov_size = sizeof(struct ceph_object_ver); + objvers->objvers = krealloc_array(objvers->objvers, + object_count, + ov_size, + GFP_KERNEL); + if (!objvers->objvers) { + objvers->count = 0; + ret = -ENOMEM; + break; + } + } + + objvers->objvers[ind].offset = off; + objvers->objvers[ind].length = len; + objvers->objvers[ind].objver = req->r_version; + objvers->count++; + } ceph_osdc_put_request(req); i_size = i_size_read(inode); @@ -995,6 +1030,11 @@ ssize_t __ceph_sync_read(struct 
inode *inode, loff_t *ki_pos, } dout("sync_read result %zd retry_op %d\n", ret, *retry_op); + if (ret < 0 && objvers) { + objvers->count = 0; + kfree(objvers->objvers); + objvers->objvers = NULL; + } return ret; } @@ -1008,7 +1048,7 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, (unsigned)iov_iter_count(to), (file->f_flags & O_DIRECT) ? "O_DIRECT" : ""); - return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op); + return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op, NULL); } struct ceph_aio_request { diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 2362d758af97..b347b12e86a9 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -451,6 +451,21 @@ struct ceph_inode_info { struct inode vfs_inode; /* at end */ }; +/* + * The version of an object which contains the + * file range of [offset, offset + length). + */ +struct ceph_object_ver { + u64 offset; + u64 length; + u64 objver; +}; + +struct ceph_object_vers { + u32 count; + struct ceph_object_ver *objvers; +}; + static inline struct ceph_inode_info * ceph_inode(const struct inode *inode) { @@ -1254,7 +1269,8 @@ extern int ceph_open(struct inode *inode, struct file *file); extern int ceph_atomic_open(struct inode *dir, struct dentry *dentry, struct file *file, unsigned flags, umode_t mode); extern ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, - struct iov_iter *to, int *retry_op); + struct iov_iter *to, int *retry_op, + struct ceph_object_vers *objvers); extern int ceph_release(struct inode *inode, struct file *filp); extern void ceph_fill_inline_data(struct inode *inode, struct page *locked_page, char *data, size_t len); From patchwork Thu Nov 4 05:52:48 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 12602455 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from mail.kernel.org 
(mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4EFD4C433EF for ; Thu, 4 Nov 2021 05:53:54 +0000 (UTC) Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by mail.kernel.org (Postfix) with ESMTP id 3ABDF601FC for ; Thu, 4 Nov 2021 05:53:54 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231215AbhKDF4a (ORCPT ); Thu, 4 Nov 2021 01:56:30 -0400 Received: from us-smtp-delivery-124.mimecast.com ([170.10.129.124]:60885 "EHLO us-smtp-delivery-124.mimecast.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230497AbhKDF42 (ORCPT ); Thu, 4 Nov 2021 01:56:28 -0400 DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1636005230; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=HRGyRTKZTIMUP7K9iWX0B1cSXKPaJ22jWjkCEOkyd14=; b=E6vKlKgJsoyt3aFz9xMM1bu+Kavxw/RPtp44ixfgcfZJwEbDI4Hw2P8QWP093tLB88oL12 F/ujdQ1RzUw8+kkZmYAbBBPigyzZMNdp5xW8Ic0a1oGxESIgihGbSYziX69IHF6Mr1NOd1 B/aiLwqru4arAsvSgxKQrq5VnjmgI/k= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) (Using TLS) by relay.mimecast.com with ESMTP id us-mta-41-pblQgD35MbKoR41SV0S6SQ-1; Thu, 04 Nov 2021 01:53:47 -0400 X-MC-Unique: pblQgD35MbKoR41SV0S6SQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 59C4B87D541; Thu, 4 Nov 2021 05:53:46 +0000 (UTC) Received: from lxbceph1.gsslab.pek2.redhat.com (unknown [10.72.47.117]) by smtp.corp.redhat.com (Postfix) with ESMTP id D56005BAF0; Thu, 4 Nov 2021 05:53:43 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org Cc: idryomov@gmail.com, vshankar@redhat.com, 
pdonnell@redhat.com, khiremat@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v6 9/9] ceph: add truncate size handling support for fscrypt Date: Thu, 4 Nov 2021 13:52:48 +0800 Message-Id: <20211104055248.190987-10-xiubli@redhat.com> In-Reply-To: <20211104055248.190987-1-xiubli@redhat.com> References: <20211104055248.190987-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Transfer the encrypted contents of the last block to the MDS along with the truncate request, but only when the new size is smaller than the current size and is not aligned to the fscrypt block size. When the last block is located in a file hole, the truncate request contains only the header. The MDS can fail the truncate with -EAGAIN if another client or process has already updated the RADOS object that contains the last block; the kclient then needs to retry. The RMW takes around 50ms, so retry up to 20 times for now. 
Signed-off-by: Xiubo Li --- fs/ceph/inode.c | 205 ++++++++++++++++++++++++++++++++++-- fs/ceph/super.h | 5 + include/linux/ceph/crypto.h | 28 +++++ 3 files changed, 227 insertions(+), 11 deletions(-) create mode 100644 include/linux/ceph/crypto.h diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 15c2fb1e2c8a..5817685ea9a5 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -21,6 +21,7 @@ #include "cache.h" #include "crypto.h" #include +#include /* * Ceph inode operations @@ -586,6 +587,7 @@ struct inode *ceph_alloc_inode(struct super_block *sb) ci->i_truncate_seq = 0; ci->i_truncate_size = 0; ci->i_truncate_pending = 0; + ci->i_truncate_pagecache_size = 0; ci->i_max_size = 0; ci->i_reported_size = 0; @@ -751,6 +753,10 @@ int ceph_fill_file_size(struct inode *inode, int issued, dout("truncate_size %lld -> %llu\n", ci->i_truncate_size, truncate_size); ci->i_truncate_size = truncate_size; + if (IS_ENCRYPTED(inode)) + ci->i_truncate_pagecache_size = size; + else + ci->i_truncate_pagecache_size = truncate_size; } if (queue_trunc) @@ -1026,10 +1032,14 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, pool_ns = old_ns; if (IS_ENCRYPTED(inode) && size && - (iinfo->fscrypt_file_len == sizeof(__le64))) { - size = __le64_to_cpu(*(__le64 *)iinfo->fscrypt_file); - if (info->size != round_up(size, CEPH_FSCRYPT_BLOCK_SIZE)) - pr_warn("size=%llu fscrypt_file=%llu\n", info->size, size); + (iinfo->fscrypt_file_len >= sizeof(__le64))) { + u64 fsize = __le64_to_cpu(*(__le64 *)iinfo->fscrypt_file); + if (fsize) { + size = fsize; + if (info->size != round_up(size, CEPH_FSCRYPT_BLOCK_SIZE)) + pr_warn("size=%llu fscrypt_file=%llu\n", + info->size, size); + } } queue_trunc = ceph_fill_file_size(inode, issued, @@ -2142,7 +2152,7 @@ void __ceph_do_pending_vmtruncate(struct inode *inode) /* there should be no reader or writer */ WARN_ON_ONCE(ci->i_rd_ref || ci->i_wr_ref); - to = ci->i_truncate_size; + to = ci->i_truncate_pagecache_size; wrbuffer_refs = 
ci->i_wrbuffer_ref; dout("__do_pending_vmtruncate %p (%d) to %lld\n", inode, ci->i_truncate_pending, to); @@ -2151,7 +2161,7 @@ void __ceph_do_pending_vmtruncate(struct inode *inode) truncate_pagecache(inode, to); spin_lock(&ci->i_ceph_lock); - if (to == ci->i_truncate_size) { + if (to == ci->i_truncate_pagecache_size) { ci->i_truncate_pending = 0; finish = 1; } @@ -2232,6 +2242,141 @@ static const struct inode_operations ceph_encrypted_symlink_iops = { .listxattr = ceph_listxattr, }; +/* + * Transfer the encrypted last block to the MDS and the MDS + * will help update it when truncating a smaller size. + * + * We don't support a PAGE_SIZE that is smaller than the + * CEPH_FSCRYPT_BLOCK_SIZE. + */ +static int fill_fscrypt_truncate(struct inode *inode, + struct ceph_mds_request *req, + struct iattr *attr) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + int boff = attr->ia_size % CEPH_FSCRYPT_BLOCK_SIZE; + loff_t pos, orig_pos = round_down(attr->ia_size, CEPH_FSCRYPT_BLOCK_SIZE); +#if 0 + u64 block = orig_pos >> CEPH_FSCRYPT_BLOCK_SHIFT; +#endif + struct ceph_pagelist *pagelist = NULL; + struct kvec iov; + struct iov_iter iter; + struct page *page = NULL; + struct ceph_fscrypt_truncate_size_header header; + int retry_op = 0; + int len = CEPH_FSCRYPT_BLOCK_SIZE; + loff_t i_size = i_size_read(inode); + struct ceph_object_vers objvers = {0, NULL}; + int got, ret, issued; + + ret = __ceph_get_caps(inode, NULL, CEPH_CAP_FILE_RD, 0, -1, &got); + if (ret < 0) + return ret; + + issued = __ceph_caps_issued(ci, NULL); + + dout("%s size %lld -> %lld got cap refs on %s, issued %s\n", __func__, + i_size, attr->ia_size, ceph_cap_string(got), + ceph_cap_string(issued)); + + /* Try to writeback the dirty pagecaches */ + if (issued & (CEPH_CAP_FILE_BUFFER)) + filemap_fdatawrite(&inode->i_data); + + page = __page_cache_alloc(GFP_KERNEL); + if (page == NULL) { + ret = -ENOMEM; + goto out; + } + + pagelist = ceph_pagelist_alloc(GFP_KERNEL); + if (!pagelist) { + ret = -ENOMEM; + 
goto out; + } + + iov.iov_base = kmap_local_page(page); + iov.iov_len = len; + iov_iter_kvec(&iter, READ, &iov, 1, len); + + pos = orig_pos; + ret = __ceph_sync_read(inode, &pos, &iter, &retry_op, &objvers); + ceph_put_cap_refs(ci, got); + if (ret < 0) + goto out; + + WARN_ON_ONCE(objvers.count != 1); + + /* Insert the header first */ + header.ver = 1; + header.compat = 1; + + /* + * If we hit a hole here, we should just skip filling + * the fscrypt for the request, because once the fscrypt + * is enabled, the file will be split into many blocks + * with the size of CEPH_FSCRYPT_BLOCK_SIZE, if there + * has a hole, the hole size should be multiple of block + * size. + * + * If the Rados object doesn't exist, it will be set 0. + */ + if (!objvers.objvers[0].objver) { + dout("%s hit hole, ppos %lld < size %lld\n", __func__, + pos, i_size); + + header.data_len = cpu_to_le32(8 + 8 + 4); + header.assert_ver = cpu_to_le64(0); + header.file_offset = cpu_to_le64(0); + header.block_size = cpu_to_le64(0); + ret = 0; + } else { + header.data_len = cpu_to_le32(8 + 8 + 4 + CEPH_FSCRYPT_BLOCK_SIZE); + header.assert_ver = objvers.objvers[0].objver; + header.file_offset = cpu_to_le64(orig_pos); + header.block_size = cpu_to_le64(CEPH_FSCRYPT_BLOCK_SIZE); + + /* truncate and zero out the extra contents for the last block */ + memset(iov.iov_base + boff, 0, PAGE_SIZE - boff); + +#if 0 // Uncomment this when the fscrypt is enabled globally in kceph + + /* encrypt the last block */ + ret = fscrypt_encrypt_block_inplace(inode, page, + CEPH_FSCRYPT_BLOCK_SIZE, + 0, block, + GFP_KERNEL); + if (ret) + goto out; +#endif + } + + /* Insert the header */ + ret = ceph_pagelist_append(pagelist, &header, sizeof(header)); + if (ret) + goto out; + + if (header.block_size) { + /* Append the last block contents to pagelist */ + ret = ceph_pagelist_append(pagelist, iov.iov_base, + CEPH_FSCRYPT_BLOCK_SIZE); + if (ret) + goto out; + } + req->r_pagelist = pagelist; +out: + dout("%s %p size dropping cap 
refs on %s\n", __func__, + inode, ceph_cap_string(got)); + kunmap_local(iov.iov_base); + if (page) + __free_pages(page, 0); + if (ret && pagelist) + ceph_pagelist_release(pagelist); + kfree(objvers.objvers); + return ret; +} + int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *cia) { struct ceph_inode_info *ci = ceph_inode(inode); @@ -2239,12 +2384,15 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c struct ceph_mds_request *req; struct ceph_mds_client *mdsc = ceph_sb_to_client(inode->i_sb)->mdsc; struct ceph_cap_flush *prealloc_cf; + loff_t isize = i_size_read(inode); int issued; int release = 0, dirtied = 0; int mask = 0; int err = 0; int inode_dirty_flags = 0; bool lock_snap_rwsem = false; + bool fill_fscrypt; + int truncate_retry = 20; /* The RMW will take around 50ms */ prealloc_cf = ceph_alloc_cap_flush(); if (!prealloc_cf) @@ -2257,6 +2405,8 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c return PTR_ERR(req); } +retry: + fill_fscrypt = false; spin_lock(&ci->i_ceph_lock); issued = __ceph_caps_issued(ci, NULL); @@ -2378,10 +2528,27 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c } } if (ia_valid & ATTR_SIZE) { - loff_t isize = i_size_read(inode); - dout("setattr %p size %lld -> %lld\n", inode, isize, attr->ia_size); - if ((issued & CEPH_CAP_FILE_EXCL) && attr->ia_size >= isize) { + /* + * Only when the new size is smaller and not aligned to + * CEPH_FSCRYPT_BLOCK_SIZE will the RMW is needed. 
+ */ + if (IS_ENCRYPTED(inode) && attr->ia_size < isize && + (attr->ia_size % CEPH_FSCRYPT_BLOCK_SIZE)) { + mask |= CEPH_SETATTR_SIZE; + release |= CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL | + CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR; + set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags); + mask |= CEPH_SETATTR_FSCRYPT_FILE; + req->r_args.setattr.size = + cpu_to_le64(round_up(attr->ia_size, + CEPH_FSCRYPT_BLOCK_SIZE)); + req->r_args.setattr.old_size = + cpu_to_le64(round_up(isize, + CEPH_FSCRYPT_BLOCK_SIZE)); + req->r_fscrypt_file = attr->ia_size; + fill_fscrypt = true; + } else if ((issued & CEPH_CAP_FILE_EXCL) && attr->ia_size >= isize) { if (attr->ia_size > isize) { i_size_write(inode, attr->ia_size); inode->i_blocks = calc_inode_blocks(attr->ia_size); @@ -2404,7 +2571,6 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c cpu_to_le64(round_up(isize, CEPH_FSCRYPT_BLOCK_SIZE)); req->r_fscrypt_file = attr->ia_size; - /* FIXME: client must zero out any partial blocks! */ } else { req->r_args.setattr.size = cpu_to_le64(attr->ia_size); req->r_args.setattr.old_size = cpu_to_le64(isize); @@ -2476,7 +2642,6 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c if (inode_dirty_flags) __mark_inode_dirty(inode, inode_dirty_flags); - if (mask) { req->r_inode = inode; ihold(inode); @@ -2484,7 +2649,25 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c req->r_args.setattr.mask = cpu_to_le32(mask); req->r_num_caps = 1; req->r_stamp = attr->ia_ctime; + if (fill_fscrypt) { + err = fill_fscrypt_truncate(inode, req, attr); + if (err) + goto out; + } + + /* + * The truncate request will return -EAGAIN when the + * last block has been updated just before the MDS + * successfully gets the xlock for the FILE lock. To + * avoid corrupting the file contents we need to retry + * it. 
+ */ err = ceph_mdsc_do_request(mdsc, NULL, req); + if (err == -EAGAIN && truncate_retry--) { + dout("setattr %p result=%d (%s locally, %d remote), retry it!\n", + inode, err, ceph_cap_string(dirtied), mask); + goto retry; + } } out: dout("setattr %p result=%d (%s locally, %d remote)\n", inode, err, diff --git a/fs/ceph/super.h b/fs/ceph/super.h index b347b12e86a9..071857bb59d8 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -408,6 +408,11 @@ struct ceph_inode_info { u32 i_truncate_seq; /* last truncate to smaller size */ u64 i_truncate_size; /* and the size we last truncated down to */ int i_truncate_pending; /* still need to call vmtruncate */ + /* + * For the non-fscrypt case this equals i_truncate_size; for the + * fscrypt case it equals fscrypt_file_size. + */ + u64 i_truncate_pagecache_size; u64 i_max_size; /* max file size authorized by mds */ u64 i_reported_size; /* (max_)size reported to or requested of mds */ diff --git a/include/linux/ceph/crypto.h b/include/linux/ceph/crypto.h new file mode 100644 index 000000000000..2b0961902887 --- /dev/null +++ b/include/linux/ceph/crypto.h @@ -0,0 +1,28 @@ +/* SPDX-License-Identifier: GPL-2.0 */ +#ifndef _FS_CEPH_CRYPTO_H +#define _FS_CEPH_CRYPTO_H + +#include + +/* + * Header sent to the MDS when truncating the size of an encrypted + * file. The MDS will update the encrypted last block and then + * truncate the size. + */ +struct ceph_fscrypt_truncate_size_header { + __u8 ver; + __u8 compat; + + /* + * The total size of assert_ver, file_offset and block_size when + * the last block is empty because it lies in a file hole; + * otherwise that size plus CEPH_FSCRYPT_BLOCK_SIZE. + */ + __le32 data_len; + + __le64 assert_ver; + __le64 file_offset; + __le32 block_size; +} __packed; + +#endif
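
For readers following the size arithmetic in patch 9/9, here is a minimal user-space sketch of when the RMW path is taken and which sizes are sent. It is not part of the series and is not kernel code. The names SKETCH_BLOCK_SIZE, struct truncate_plan, plan_truncate() and header_data_len() are hypothetical, and the 4096-byte block size is an assumption standing in for CEPH_FSCRYPT_BLOCK_SIZE, whose value is not shown in this patch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption: 4 KiB crypto blocks, standing in for CEPH_FSCRYPT_BLOCK_SIZE. */
#define SKETCH_BLOCK_SIZE 4096ULL

/* Hypothetical summary of what the setattr path derives from the sizes. */
struct truncate_plan {
	bool needs_rmw;         /* last block must be read, zeroed, and sent */
	uint64_t block_start;   /* offset of the block holding the new EOF */
	uint64_t boff;          /* offset of the new EOF inside that block */
	uint64_t wire_size;     /* block-aligned new size in the request */
	uint64_t wire_old_size; /* block-aligned old size in the request */
};

static uint64_t sketch_round_down(uint64_t v)
{
	return v - (v % SKETCH_BLOCK_SIZE);
}

static uint64_t sketch_round_up(uint64_t v)
{
	return sketch_round_down(v + SKETCH_BLOCK_SIZE - 1);
}

/*
 * Mirrors the condition in the patch: the RMW is needed only when the
 * file shrinks and the new size is not block aligned.
 */
static struct truncate_plan plan_truncate(uint64_t old_size, uint64_t new_size)
{
	struct truncate_plan p;

	p.boff = new_size % SKETCH_BLOCK_SIZE;
	p.needs_rmw = new_size < old_size && p.boff != 0;
	p.block_start = sketch_round_down(new_size);
	p.wire_size = sketch_round_up(new_size);
	p.wire_old_size = sketch_round_up(old_size);
	return p;
}

/*
 * data_len as set in fill_fscrypt_truncate(): 8 + 8 + 4 bytes for
 * assert_ver, file_offset and block_size, plus one block unless the
 * last block sits in a hole.
 */
static uint32_t header_data_len(bool hit_hole)
{
	uint32_t fixed = 8 + 8 + 4;

	return hit_hole ? fixed : fixed + (uint32_t)SKETCH_BLOCK_SIZE;
}
```

Under these assumptions, shrinking a 10000-byte file to 5000 bytes takes the RMW path: the block at offset 4096 is read, everything from byte 904 of that block onward is zeroed, and the request carries the aligned sizes 8192 and 12288 while the real size travels in fscrypt_file.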