From patchwork Wed Mar 2 12:13:18 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12765889
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li
Subject: [PATCH v3 1/6] ceph: fail the request when failing to decode dentry names
Date: Wed, 2 Mar 2022 20:13:18 +0800
Message-Id: <20220302121323.240432-2-xiubli@redhat.com>
In-Reply-To: <20220302121323.240432-1-xiubli@redhat.com>
References: <20220302121323.240432-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Xiubo Li

If we just skip the corrupt dentry names without setting the rde's
offset, ceph_readdir() will crash:

  ------------[ cut here ]------------
  kernel BUG at fs/ceph/dir.c:537!
  invalid opcode: 0000 [#1] PREEMPT SMP KASAN NOPTI
  CPU: 16 PID: 21641 Comm: ls Tainted: G E 5.17.0-rc2+ #92
  Hardware name: Red Hat RHEV Hypervisor, BIOS 1.11.0-2.el7 04/01/2014

The corresponding code in ceph_readdir() is:

  BUG_ON(rde->offset < ctx->pos);

For now, just fail the readdir request, since handling this case
properly is nasty. Better error handling can be added later.

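As a rough userspace illustration of the behaviour change (not the
kernel code itself): the loop below stops and returns the error on the
first entry whose name cannot be decoded, instead of skipping that entry
and carrying on. The names struct entry, decode_name() and
emit_entries() are hypothetical stand-ins, and treating a name that
starts with '?' as corrupt is an assumption made only for this sketch.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one directory entry from a readdir reply. */
struct entry {
	const char *raw_name;
};

/*
 * Hypothetical decoder: copies raw_name to out. For this sketch only,
 * a name starting with '?' is treated as corrupt.
 */
static int decode_name(const struct entry *e, char *out, size_t outsz)
{
	size_t len = strlen(e->raw_name);

	assert(out != NULL);
	if (e->raw_name[0] == '?' || len >= outsz)
		return -EIO;
	memcpy(out, e->raw_name, len + 1);
	return 0;
}

/*
 * Returns the number of entries emitted, or a negative errno as soon as
 * one name fails to decode. The failure is propagated, not skipped.
 */
static int emit_entries(const struct entry *entries, int nr)
{
	char name[256];
	int i, err;

	for (i = 0; i < nr; i++) {
		err = decode_name(&entries[i], name, sizeof(name));
		if (err)
			return err;	/* fail the whole request */
		/* a real caller would hand 'name' to its consumer here */
	}
	return nr;
}
```

With a corrupt entry in the middle of the array, emit_entries() returns
-EIO rather than a partial count, which mirrors replacing "continue"
with "goto out" in the patch.
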
Signed-off-by: Xiubo Li --- fs/ceph/dir.c | 13 +++++++------ fs/ceph/inode.c | 5 +++-- fs/ceph/mds_client.c | 2 +- 3 files changed, 11 insertions(+), 9 deletions(-) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 44395aae7259..fa3924959537 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -537,6 +537,13 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) .ctext_len = rde->altname_len }; u32 olen = oname.len; + err = ceph_fname_to_usr(&fname, &tname, &oname, NULL); + if (err) { + pr_err("%s unable to decode %.*s, got %d\n", __func__, + rde->name_len, rde->name, err); + goto out; + } + BUG_ON(rde->offset < ctx->pos); BUG_ON(!rde->inode.in); @@ -545,12 +552,6 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) i, rinfo->dir_nr, ctx->pos, rde->name_len, rde->name, &rde->inode.in); - err = ceph_fname_to_usr(&fname, &tname, &oname, NULL); - if (err) { - dout("Unable to decode %.*s. Skipping it.\n", rde->name_len, rde->name); - continue; - } - if (!dir_emit(ctx, oname.name, oname.len, ceph_present_ino(inode->i_sb, le64_to_cpu(rde->inode.in->ino)), le32_to_cpu(rde->inode.in->mode) >> 12)) { diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index cbeba8a93a07..b573a0f33450 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -1904,8 +1904,9 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, err = ceph_fname_to_usr(&fname, &tname, &oname, &is_nokey); if (err) { - dout("Unable to decode %.*s. 
Skipping it.", rde->name_len, rde->name); - continue; + pr_err("%s unable to decode %.*s, got %d\n", __func__, + rde->name_len, rde->name, err); + goto out; } dname.name = oname.name; diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 31c1b441c0a1..8d704ddd7291 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -3477,7 +3477,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) if (err == 0) { if (result == 0 && (req->r_op == CEPH_MDS_OP_READDIR || req->r_op == CEPH_MDS_OP_LSSNAP)) - ceph_readdir_prepopulate(req, req->r_session); + err = ceph_readdir_prepopulate(req, req->r_session); } current->journal_info = NULL; mutex_unlock(&req->r_fill_mutex);

From patchwork Wed Mar 2 12:13:19 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12765890
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li
Subject: [PATCH v3 2/6] ceph: do not decrypt the dentry name twice for readdir
Date: Wed, 2 Mar 2022 20:13:19 +0800
Message-Id: <20220302121323.240432-3-xiubli@redhat.com>
In-Reply-To: <20220302121323.240432-1-xiubli@redhat.com>
References: <20220302121323.240432-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Xiubo Li

For a readdir request, the dentries are parsed and decrypted in
ceph_readdir_prepopulate(). In ceph_readdir() we can then take the
dentry names from the dentry cache instead of parsing and decrypting
them again, which could improve performance.

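The intent can be sketched in plain userspace C: decode each name once
while filling a cache, then let the later pass read the cached copy.
This is only an analogy for the dentry cache, with no locking or
reference counting. struct cached_entry, prepopulate(),
emit_from_cache() and the trivial decode_name() are hypothetical names,
not kernel APIs.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define NAME_MAX_LEN 255

/* Hypothetical cache slot: holds the already-decoded name. */
struct cached_entry {
	const char *raw_name;		/* name as received */
	char name[NAME_MAX_LEN + 1];	/* decoded once, reused later */
	size_t len;
};

static int decode_calls;	/* counts how often decoding really happens */

/* Stand-in for an expensive decode/decrypt step; here a plain copy. */
static int decode_name(const char *raw, char *out, size_t outsz)
{
	size_t len = strlen(raw);

	assert(out != NULL);
	decode_calls++;
	if (len >= outsz)
		return -1;
	memcpy(out, raw, len + 1);
	return (int)len;
}

/* First pass: decode every name once and store it in its cache slot. */
static int prepopulate(struct cached_entry *entries, int nr)
{
	int i, len;

	for (i = 0; i < nr; i++) {
		len = decode_name(entries[i].raw_name, entries[i].name,
				  sizeof(entries[i].name));
		if (len < 0)
			return -1;
		entries[i].len = (size_t)len;
	}
	return 0;
}

/* Second pass: total up the cached lengths without decoding again. */
static size_t emit_from_cache(const struct cached_entry *entries, int nr)
{
	size_t total = 0;
	int i;

	for (i = 0; i < nr; i++)
		total += entries[i].len;
	return total;
}
```

After prepopulate() and emit_from_cache() have both run over the same
array, decode_calls equals the number of entries, so the second pass
adds no decoding work.
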
Signed-off-by: Xiubo Li --- fs/ceph/crypto.h | 8 +++++ fs/ceph/dir.c | 74 +++++++++++++++++++++++--------------------- fs/ceph/inode.c | 15 +++++++++ fs/ceph/mds_client.h | 1 + 4 files changed, 62 insertions(+), 36 deletions(-) diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 1e08f8a64ad6..9a00c60b8535 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -83,6 +83,14 @@ static inline u32 ceph_fscrypt_auth_len(struct ceph_fscrypt_auth *fa) */ #define CEPH_NOHASH_NAME_MAX (189 - SHA256_DIGEST_SIZE) +/* + * The encrypted long snap name will be in format of + * "_${ENCRYPTED-LONG-SNAP-NAME}_${INODE-NUM}". And will set the max longth + * to sizeof('_') + NAME_MAX + sizeof('_') + max of sizeof(${INO}) + extra 7 + * bytes to align the total size to 8 bytes. + */ +#define CEPH_ENCRPTED_LONG_SNAP_NAME_MAX (1 + 255 + 1 + 16 + 7) + void ceph_fscrypt_set_ops(struct super_block *sb); void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc); diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index fa3924959537..e7cbb97df662 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -316,8 +316,7 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) int err; unsigned frag = -1; struct ceph_mds_reply_info_parsed *rinfo; - struct fscrypt_str tname = FSTR_INIT(NULL, 0); - struct fscrypt_str oname = FSTR_INIT(NULL, 0); + char *dentry_name = NULL; dout("readdir %p file %p pos %llx\n", inode, file, ctx->pos); if (dfi->file_info.flags & CEPH_F_ATEND) @@ -345,10 +344,6 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) ctx->pos = 2; } - err = fscrypt_prepare_readdir(inode); - if (err) - goto out; - spin_lock(&ci->i_ceph_lock); /* request Fx cap. if have Fx, we don't need to release Fs cap * for later create/unlink. 
*/ @@ -369,14 +364,6 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) spin_unlock(&ci->i_ceph_lock); } - err = ceph_fname_alloc_buffer(inode, &tname); - if (err < 0) - goto out; - - err = ceph_fname_alloc_buffer(inode, &oname); - if (err < 0) - goto out; - /* proceed with a normal readdir */ more: /* do we have the correct frag content buffered? */ @@ -528,41 +515,56 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) } } } + + dentry_name = kmalloc(CEPH_ENCRPTED_LONG_SNAP_NAME_MAX, GFP_KERNEL); + if (!dentry_name) { + err = -ENOMEM; + ceph_mdsc_put_request(dfi->last_readdir); + goto out; + } + for (; i < rinfo->dir_nr; i++) { struct ceph_mds_reply_dir_entry *rde = rinfo->dir_entries + i; - struct ceph_fname fname = { .dir = inode, - .name = rde->name, - .name_len = rde->name_len, - .ctext = rde->altname, - .ctext_len = rde->altname_len }; - u32 olen = oname.len; - - err = ceph_fname_to_usr(&fname, &tname, &oname, NULL); - if (err) { - pr_err("%s unable to decode %.*s, got %d\n", __func__, - rde->name_len, rde->name, err); - goto out; - } + struct dentry *dn = rde->dentry; + int name_len; BUG_ON(rde->offset < ctx->pos); BUG_ON(!rde->inode.in); + BUG_ON(!rde->dentry); ctx->pos = rde->offset; - dout("readdir (%d/%d) -> %llx '%.*s' %p\n", - i, rinfo->dir_nr, ctx->pos, - rde->name_len, rde->name, &rde->inode.in); - if (!dir_emit(ctx, oname.name, oname.len, + spin_lock(&dn->d_lock); + memcpy(dentry_name, dn->d_name.name, dn->d_name.len); + name_len = dn->d_name.len; + spin_unlock(&dn->d_lock); + + dentry_name[name_len] = '\0'; + dout("readdir (%d/%d) -> %llx '%s' %p\n", + i, rinfo->dir_nr, ctx->pos, dentry_name, &rde->inode.in); + + dput(dn); + rde->dentry = NULL; + + if (!dir_emit(ctx, dentry_name, name_len, ceph_present_ino(inode->i_sb, le64_to_cpu(rde->inode.in->ino)), le32_to_cpu(rde->inode.in->mode) >> 12)) { dout("filldir stopping us...\n"); - ceph_mdsc_put_request(dfi->last_readdir); err = 0; + + /* + * dput the rest 
dentries. Must do this before + * releasing the request. + */ + for (++i; i < rinfo->dir_nr; i++) { + rde = rinfo->dir_entries + i; + dput(rde->dentry); + rde->dentry = NULL; + } + ceph_mdsc_put_request(dfi->last_readdir); goto out; } - /* Reset the lengths to their original allocated vals */ - oname.len = olen; ctx->pos++; } @@ -620,8 +622,8 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) err = 0; dout("readdir %p file %p done.\n", inode, file); out: - ceph_fname_free_buffer(inode, &tname); - ceph_fname_free_buffer(inode, &oname); + if (dentry_name) + kfree(dentry_name); return err; } diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index b573a0f33450..3ef8d9ae01dc 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -1909,6 +1909,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, goto out; } + rde->dentry = NULL; dname.name = oname.name; dname.len = oname.len; dname.hash = full_name_hash(parent, dname.name, dname.len); @@ -1969,6 +1970,12 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, goto retry_lookup; } + /* + * ceph_readdir will use the dentry to get the name + * to avoid doing the dencrypt again there. 
+ */ + rde->dentry = dget(dn); + /* inode */ if (d_really_is_positive(dn)) { in = d_inode(dn); @@ -2031,6 +2038,14 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, dput(dn); } out: + if (err) { + for (; i >= 0; i--) { + struct ceph_mds_reply_dir_entry *rde; + + rde = rinfo->dir_entries + i; + dput(rde->dentry); + } + } if (err == 0 && skipped == 0) { set_bit(CEPH_MDS_R_DID_PREPOPULATE, &req->r_req_flags); req->r_readdir_cache_idx = cache_ctl.index; diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 0dfe24f94567..663d7754d57d 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -96,6 +96,7 @@ struct ceph_mds_reply_info_in { }; struct ceph_mds_reply_dir_entry { + struct dentry *dentry; char *name; u8 *altname; u32 name_len;

From patchwork Wed Mar 2 12:13:20 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12765892
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li
Subject: [PATCH v3 3/6] ceph: add ceph_get_snap_parent_inode() support
Date: Wed, 2 Mar 2022 20:13:20 +0800
Message-Id: <20220302121323.240432-4-xiubli@redhat.com>
In-Reply-To: <20220302121323.240432-1-xiubli@redhat.com>
References: <20220302121323.240432-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Xiubo Li

Get the parent inode of the snap directory ".snap". If the inode is not
a snap directory, just return it with its reference count increased.

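The calling convention described above (the caller always receives a
referenced object and must drop it afterwards) can be sketched in
userspace C. This is a simplified analogy: the kernel helper looks the
parent up by inode number, whereas the sketch assumes a direct parent
pointer. struct node, node_get(), node_put() and get_snap_parent() are
hypothetical names.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical refcounted object standing in for an inode. */
struct node {
	int refcount;
	bool is_snapdir;
	struct node *parent;	/* owner of the snap dir; sketch-only field */
};

static struct node *node_get(struct node *n)
{
	n->refcount++;
	return n;
}

static void node_put(struct node *n)
{
	assert(n->refcount > 0);
	n->refcount--;
}

/*
 * Returns the owning directory for a snap dir, or the node itself
 * otherwise. Either way the result carries an extra reference that the
 * caller must drop with node_put().
 */
static struct node *get_snap_parent(struct node *n)
{
	if (n->is_snapdir) {
		assert(n->parent != NULL);
		return node_get(n->parent);
	}
	return node_get(n);
}
```

Because both branches return a referenced object, a caller can pair
every get_snap_parent() with exactly one node_put() without checking
which branch was taken.
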
Signed-off-by: Xiubo Li --- fs/ceph/snap.c | 24 ++++++++++++++++++++++++ fs/ceph/super.h | 1 + 2 files changed, 25 insertions(+) diff --git a/fs/ceph/snap.c b/fs/ceph/snap.c index 322ee5add942..b62c1ace2ee9 100644 --- a/fs/ceph/snap.c +++ b/fs/ceph/snap.c @@ -1268,3 +1268,27 @@ void ceph_cleanup_snapid_map(struct ceph_mds_client *mdsc) kfree(sm); } } + +/* + * Get the parent inode for the snap directory ".snap", + * if the inode is not a snap directory just return it + * with the reference increased. + */ +struct inode *ceph_get_snap_parent_inode(struct inode *inode) +{ + struct inode *pinode; + + if (ceph_snap(inode) == CEPH_SNAPDIR) { + struct ceph_vino vino = { + .ino = ceph_ino(inode), + .snap = CEPH_NOSNAP, + }; + pinode = ceph_find_inode(inode->i_sb, vino); + BUG_ON(!pinode); + } else { + ihold(inode); + pinode = inode; + } + + return pinode; +} diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 6d41a69f5d86..f0268a571621 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -969,6 +969,7 @@ extern void ceph_put_snapid_map(struct ceph_mds_client* mdsc, struct ceph_snapid_map *sm); extern void ceph_trim_snapid_map(struct ceph_mds_client *mdsc); extern void ceph_cleanup_snapid_map(struct ceph_mds_client *mdsc); +extern struct inode *ceph_get_snap_parent_inode(struct inode *inode); void ceph_umount_begin(struct super_block *sb);

From patchwork Wed Mar 2 12:13:21 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12765893
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li
Subject: [PATCH v3 4/6] ceph: use the parent inode of '.snap' to decrypt the names for readdir
Date: Wed, 2 Mar 2022 20:13:21 +0800
Message-Id: <20220302121323.240432-5-xiubli@redhat.com>
In-Reply-To: <20220302121323.240432-1-xiubli@redhat.com>
References: <20220302121323.240432-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Xiubo Li

The inode of the '.snap' directory never has a key set up, so use the
parent inode to decrypt the names instead.

Signed-off-by: Xiubo Li --- fs/ceph/inode.c | 15 +++++++++------ 1 file changed, 9 insertions(+), 6 deletions(-) diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 3ef8d9ae01dc..2d4e5ee9a373 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -1823,7 +1823,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, struct ceph_mds_reply_info_parsed *rinfo = &req->r_reply_info; struct qstr dname; struct dentry *dn; - struct inode *in; + struct inode *in, *pinode; int err = 0, skipped = 0, ret, i; u32 frag = le32_to_cpu(req->r_args.readdir.frag); u32 last_hash = 0; @@ -1882,11 +1882,13 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, cache_ctl.index = req->r_readdir_cache_idx; fpos_offset = req->r_readdir_offset; - err = ceph_fname_alloc_buffer(inode, &tname); + pinode = ceph_get_snap_parent_inode(inode); + + err = ceph_fname_alloc_buffer(pinode, &tname); if (err < 0) goto out; - err = ceph_fname_alloc_buffer(inode, &oname); + err = ceph_fname_alloc_buffer(pinode, &oname); if (err < 0) goto out; @@ -1896,7 +1898,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, struct ceph_mds_reply_dir_entry *rde = rinfo->dir_entries + i; struct ceph_vino tvino; u32 olen = oname.len; - struct ceph_fname fname = { .dir = inode, + struct ceph_fname fname = { .dir = pinode, .name = rde->name, .name_len = rde->name_len, .ctext = rde->altname, @@ -2051,8 +2053,9 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, req->r_readdir_cache_idx = cache_ctl.index; } ceph_readdir_cache_release(&cache_ctl); - ceph_fname_free_buffer(inode, &tname); -
ceph_fname_free_buffer(inode, &oname); + ceph_fname_free_buffer(pinode, &tname); + ceph_fname_free_buffer(pinode, &oname); + iput(pinode); dout("readdir_prepopulate done\n"); return err; }

From patchwork Wed Mar 2 12:13:22 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Xiubo Li
X-Patchwork-Id: 12765891
From: xiubli@redhat.com
To: jlayton@kernel.org
Cc: idryomov@gmail.com, vshankar@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li
Subject: [PATCH v3 5/6] ceph: use the parent inode of '.snap' to encrypt name to build path
Date: Wed, 2 Mar 2022 20:13:22 +0800
Message-Id: <20220302121323.240432-6-xiubli@redhat.com>
In-Reply-To: <20220302121323.240432-1-xiubli@redhat.com>
References: <20220302121323.240432-1-xiubli@redhat.com>
X-Mailing-List: ceph-devel@vger.kernel.org

From: Xiubo Li

The inode of the '.snap' directory never has a key set up, so use the
parent inode to encrypt the name when building the path.

Signed-off-by: Xiubo Li --- fs/ceph/mds_client.c | 33 ++++++++++++++++++++------------- 1 file changed, 20 insertions(+), 13 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 8d704ddd7291..f8fd474f80cf 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2466,8 +2466,8 @@ static u8 *get_fscrypt_altname(const struct ceph_mds_request *req, u32 *plen) */ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int for_wire) { - struct dentry *cur; - struct inode *inode; + struct dentry *cur, *parent; + struct inode *inode, *pinode; char *path; int pos; unsigned seq; @@ -2480,13 +2480,16 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int for if (!path) return ERR_PTR(-ENOMEM); retry: + pinode = NULL; + parent = NULL; pos = PATH_MAX - 1; path[pos] = '\0'; seq = read_seqbegin(&rename_lock); cur = dget(dentry); for (;;) { - struct dentry *parent; + parent = dget_parent(cur); + pinode = ceph_get_snap_parent_inode(d_inode(parent)); spin_lock(&cur->d_lock); inode = d_inode(cur); @@ -2494,12 +2497,11 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int for dout("build_path path+%d: %p SNAPDIR\n", pos, cur); spin_unlock(&cur->d_lock); - parent = dget_parent(cur); } else if (for_wire && inode && dentry != cur && ceph_snap(inode) == CEPH_NOSNAP) { spin_unlock(&cur->d_lock); pos++; /* get rid of any prepended '/' */ break; - } else if (!for_wire || !IS_ENCRYPTED(d_inode(cur->d_parent))) { + } else if (!for_wire || !IS_ENCRYPTED(pinode)) { pos -= cur->d_name.len; if (pos < 0) { spin_unlock(&cur->d_lock); @@ -2507,7 +2509,6 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int for } memcpy(path + pos, cur->d_name.name, cur->d_name.len); spin_unlock(&cur->d_lock); - parent = dget_parent(cur); } else { int len, ret; char buf[FSCRYPT_BASE64URL_CHARS(NAME_MAX)]; @@ -2519,32 +2520,32 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 
*pbase, int for memcpy(buf, cur->d_name.name, cur->d_name.len); len = cur->d_name.len; spin_unlock(&cur->d_lock); - parent = dget_parent(cur); - ret = __fscrypt_prepare_readdir(d_inode(parent)); + ret = __fscrypt_prepare_readdir(pinode); if (ret < 0) { dput(parent); dput(cur); + iput(pinode); return ERR_PTR(ret); } - if (fscrypt_has_encryption_key(d_inode(parent))) { - len = ceph_encode_encrypted_fname(d_inode(parent), cur, buf); + if (fscrypt_has_encryption_key(pinode)) { + len = ceph_encode_encrypted_fname(pinode, cur, buf); if (len < 0) { dput(parent); dput(cur); + iput(pinode); return ERR_PTR(len); } } pos -= len; - if (pos < 0) { - dput(parent); + if (pos < 0) break; - } memcpy(path + pos, buf, len); } dput(cur); cur = parent; + parent = NULL; /* Are we at the root? */ if (IS_ROOT(cur)) @@ -2555,7 +2556,13 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int for break; path[pos] = '/'; + iput(pinode); + pinode = NULL; } + if (pinode) + iput(pinode); + if (parent) + dput(parent); inode = d_inode(cur); base = inode ? 
ceph_ino(inode) : 0; dput(cur); From patchwork Wed Mar 2 12:13:23 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Xiubo Li X-Patchwork-Id: 12765894 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6ED90C433FE for ; Wed, 2 Mar 2022 12:13:53 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S241687AbiCBMOe (ORCPT ); Wed, 2 Mar 2022 07:14:34 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41864 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S241692AbiCBMOb (ORCPT ); Wed, 2 Mar 2022 07:14:31 -0500 Received: from us-smtp-delivery-124.mimecast.com (us-smtp-delivery-124.mimecast.com [170.10.129.124]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 7E6C053E38 for ; Wed, 2 Mar 2022 04:13:47 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=redhat.com; s=mimecast20190719; t=1646223226; h=from:from:reply-to:subject:subject:date:date:message-id:message-id: to:to:cc:cc:mime-version:mime-version: content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references; bh=VcjOoiVC+H6rIcvT5vXhjXR5H6mSmUQtnXv97enbKuk=; b=UvCZ+NSJ0wi7TfFtn8nDx4H+ynCKN3iXV3972kYr41DDZnIMeID0c5Wo5saETwfLUx+LfZ csDnmwLREp1e1TP9SUSPZMrSOD30GH/44t0MpzD3G6M4k9aBG+nMBi5bZs+Qi3euBVHp9Q G/5V3um1x+OGZSMYACqb4197W43iiLA= Received: from mimecast-mx01.redhat.com (mimecast-mx01.redhat.com [209.132.183.4]) by relay.mimecast.com with ESMTP with STARTTLS (version=TLSv1.2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id us-mta-141-PgIF1ygzMhSGdbzvlS8PeQ-1; Wed, 02 Mar 2022 07:13:45 -0500 X-MC-Unique: PgIF1ygzMhSGdbzvlS8PeQ-1 Received: from smtp.corp.redhat.com (int-mx03.intmail.prod.int.phx2.redhat.com [10.5.11.13]) (using 
TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mimecast-mx01.redhat.com (Postfix) with ESMTPS id 4A24E835DE0; Wed, 2 Mar 2022 12:13:44 +0000 (UTC) Received: from lxbceph1.gsslab.pek2.redhat.com (unknown [10.72.47.117]) by smtp.corp.redhat.com (Postfix) with ESMTP id 46AE778203; Wed, 2 Mar 2022 12:13:42 +0000 (UTC) From: xiubli@redhat.com To: jlayton@kernel.org Cc: idryomov@gmail.com, vshankar@redhat.com, ceph-devel@vger.kernel.org, Xiubo Li Subject: [PATCH v3 6/6] ceph: try to encrypt/decrypt long snap name Date: Wed, 2 Mar 2022 20:13:23 +0800 Message-Id: <20220302121323.240432-7-xiubli@redhat.com> In-Reply-To: <20220302121323.240432-1-xiubli@redhat.com> References: <20220302121323.240432-1-xiubli@redhat.com> MIME-Version: 1.0 X-Scanned-By: MIMEDefang 2.79 on 10.5.11.13 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li

A child realm inherits its parents' snapshots, and the inherited snapshot names use the long name format:

  "_${ENCRYPTED-SNAP-NAME}_${PARENT-INO}"

For readdir we need to parse out ${ENCRYPTED-SNAP-NAME} and decrypt it. For a snapshot lookup we need to encrypt the real snapshot name and then convert it to the long snap name before doing the lookup in the MDS.

We always use the parent inode of the ".snap" directory for encryption and decryption.

When looking up a snapshot we must retry if the first try fails. Consider the path "/dir1/dir2/", where encryption is not enabled for the root '/' directory but is enabled for 'dir1/' and 'dir2/'. Then we can get something like:

  $ fscrypt lock dir1
  $ ls dir1/.snap/
  7o5HEjlwwsWevZctd6Xtwpq8yg9fkLMOrl59PEaPQd0 EbsAkJnyJbFVMtfvGmxNaY6hchRNEyDDFQBjW8r669g _root_snap1_1
  $ fscrypt unlock dir1
  $ ls dir1/.snap/
  dir1_snap1 dir1_snap2 _root_snap1_1
  $ ls dir1/dir2/.snap/
  _dir1_snap1_1099511640069 _dir1_snap2_1099511640069 dir2_snap1 _root_snap1_1

For the '_root_snap1_1' snap name, the lookup in 'dir1' will try twice. Since encryption is enabled on 'dir1', the first try encrypts the name and fails. The second try, with long_snap_name=true, uses the root inode '1' to determine whether the name needs to be encrypted.

Signed-off-by: Xiubo Li
---
 fs/ceph/crypto.c | 95 +++++++++++++++++++++++++++++++++++++++++---
 fs/ceph/crypto.h | 2 +-
 fs/ceph/dir.c | 27 +++++++++++--
 fs/ceph/inode.c | 86 ++++++++++++++++++++++++++++++++++++---
 fs/ceph/mds_client.c | 22 ++++++----
 fs/ceph/mds_client.h | 2 +
 fs/ceph/super.h | 1 +
 7 files changed, 211 insertions(+), 24 deletions(-)

diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 5a87e7385d3f..ef682d81a78e 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -128,14 +128,79 @@ void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_se swap(req->r_fscrypt_auth, as->fscrypt_auth); } -int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf) +int ceph_encode_encrypted_fname(struct inode *parent, struct dentry *dentry, char *buf) { + struct ceph_dentry_info *di = ceph_dentry(dentry); + struct qstr d_name = {.len = 0, .name = NULL}; + struct inode *pinode = parent; u32 len; int elen; int ret; u8 *cryptbuf; + char *p; + unsigned char *last = NULL; + + // The long snap name format is "_${SNAP-NAME}_{INO}" + if (di->long_snap_name) { + struct ceph_vino vino = { .snap = CEPH_NOSNAP }; + + /* The last will be "_${INO}" */ + last = strrchr(dentry->d_name.name, '_'); + if (!last) + return -EINVAL; + + /* Parse the ino from the "${INO}" */ + ret = kstrtou64(last + 1, 0, 
&vino.ino); + if (ret) + return ret; + + /* Get the parent inode with "${INO}" */ + pinode = ceph_get_inode(parent->i_sb, vino, NULL); + BUG_ON(!pinode); + + /* + * If the encrypt is not enabled just return + * the dentry name. + */ + if (!IS_ENCRYPTED(pinode)) { + memcpy(buf, dentry->d_name.name, dentry->d_name.len); + buf[dentry->d_name.len] = '\0'; + iput(pinode); + return dentry->d_name.len; + } + + ret = __fscrypt_prepare_readdir(pinode); + if (ret < 0) { + iput(pinode); + return ret; + } - WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)); + /* If no key just copy the original dentry name back */ + if (!fscrypt_has_encryption_key(pinode)) { + memcpy(buf, dentry->d_name.name, dentry->d_name.len); + buf[dentry->d_name.len] = '\0'; + iput(pinode); + return dentry->d_name.len; + } + + /* Parse the "${SNAP-NAME}" and the length */ + d_name.len = last - dentry->d_name.name - 1; + d_name.name = kstrndup(dentry->d_name.name + 1, + d_name.len, GFP_KERNEL); + if (!d_name.name) + return -ENOMEM; + p = buf + 1; + buf[0] = '_'; + dout(" long_snap_name real snap name: %s, ino: %s\n", + d_name.name, last + 1); + } else { + p = buf; + d_name.name = dentry->d_name.name; + d_name.len = dentry->d_name.len; + ihold(parent); + } + + WARN_ON_ONCE(!fscrypt_has_encryption_key(pinode)); /* * convert cleartext dentry name to ciphertext @@ -144,20 +209,31 @@ int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentr * * See: fscrypt_setup_filename */ - if (!fscrypt_fname_encrypted_size(parent, dentry->d_name.len, NAME_MAX, &len)) + if (!fscrypt_fname_encrypted_size(pinode, d_name.len, NAME_MAX, &len)) { + iput(pinode); return -ENAMETOOLONG; + } /* Allocate a buffer appropriate to hold the result */ cryptbuf = kmalloc(len > CEPH_NOHASH_NAME_MAX ? 
NAME_MAX : len, GFP_KERNEL); - if (!cryptbuf) + if (!cryptbuf) { + iput(pinode); + if (di->long_snap_name) + kfree(d_name.name); return -ENOMEM; + } - ret = fscrypt_fname_encrypt(parent, &dentry->d_name, cryptbuf, len); + ret = fscrypt_fname_encrypt(pinode, &d_name, cryptbuf, len); if (ret) { + iput(pinode); kfree(cryptbuf); + if (di->long_snap_name) + kfree(d_name.name); return ret; } + iput(pinode); + /* hash the end if the name is long enough */ if (len > CEPH_NOHASH_NAME_MAX) { u8 hash[SHA256_DIGEST_SIZE]; @@ -170,8 +246,15 @@ int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentr } /* base64 encode the encrypted name */ - elen = fscrypt_base64url_encode(cryptbuf, len, buf); + elen = fscrypt_base64url_encode(cryptbuf, len, p); kfree(cryptbuf); + + /* The final name will be "_${ENCRYPTED-SNAP-NAME}_${INO}" */ + if (di->long_snap_name) { + kfree(d_name.name); + strcpy(p + elen, last); + elen += 1 + strlen(last); + } dout("base64-encoded ciphertext name = %.*s\n", elen, buf); return elen; } diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 9a00c60b8535..fe4065f8da53 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -98,7 +98,7 @@ void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc); int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, struct ceph_acl_sec_ctx *as); void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as); -int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf); +int ceph_encode_encrypted_fname(struct inode *parent, struct dentry *dentry, char *buf); static inline int ceph_fname_alloc_buffer(struct inode *parent, struct fscrypt_str *fname) { diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index e7cbb97df662..bb5ca23c86d9 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -50,6 +50,7 @@ static int ceph_d_init(struct dentry *dentry) di->time = jiffies; dentry->d_fsdata = di; INIT_LIST_HEAD(&di->lease_list); + 
di->long_snap_name = false; atomic64_inc(&mdsc->metric.total_dentries); @@ -785,7 +786,9 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, { struct ceph_fs_client *fsc = ceph_sb_to_client(dir->i_sb); struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); + struct ceph_dentry_info *di = ceph_dentry(dentry); struct ceph_mds_request *req; + struct inode *pinode = dir; int op; int mask; int err; @@ -796,21 +799,22 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, if (dentry->d_name.len > NAME_MAX) return ERR_PTR(-ENAMETOOLONG); - if (IS_ENCRYPTED(dir)) { - err = __fscrypt_prepare_readdir(dir); + pinode = ceph_get_snap_parent_inode(dir); + if (IS_ENCRYPTED(pinode)) { + err = __fscrypt_prepare_readdir(pinode); if (err) return ERR_PTR(err); - if (!fscrypt_has_encryption_key(dir)) { + if (!fscrypt_has_encryption_key(pinode)) { spin_lock(&dentry->d_lock); dentry->d_flags |= DCACHE_NOKEY_NAME; spin_unlock(&dentry->d_lock); } } + iput(pinode); /* can we conclude ENOENT locally? */ if (d_really_is_negative(dentry)) { struct ceph_inode_info *ci = ceph_inode(dir); - struct ceph_dentry_info *di = ceph_dentry(dentry); spin_lock(&ci->i_ceph_lock); dout(" dir %p flags are 0x%lx\n", dir, ci->i_ceph_flags); @@ -833,6 +837,7 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, op = ceph_snap(dir) == CEPH_SNAPDIR ? 
CEPH_MDS_OP_LOOKUPSNAP : CEPH_MDS_OP_LOOKUP; +retry: req = ceph_mdsc_create_request(mdsc, op, USE_ANY_MDS); if (IS_ERR(req)) return ERR_CAST(req); @@ -851,6 +856,19 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, if (err == -ENOENT) { struct dentry *res; + /* + * Try to find encrypted long snap name with the + * format "_${ENCRYPTED-SNAP-NAME}_${INO}" + */ + if (IS_ENCRYPTED(pinode) && !di->long_snap_name && + op == CEPH_MDS_OP_LOOKUPSNAP && + dentry->d_name.name[0] == '_') { + di->long_snap_name = true; + ceph_mdsc_put_request(req); + dout("lookup retry with long snap name set.\n"); + goto retry; + } + res = ceph_handle_snapdir(req, dentry); if (IS_ERR(res)) { err = PTR_ERR(res); @@ -858,6 +876,7 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, dentry = res; err = 0; } + di->long_snap_name = false; } dentry = ceph_finish_lookup(req, dentry, err); ceph_mdsc_put_request(req); /* will dput(dentry) */ diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 2d4e5ee9a373..11f4417ffb6e 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -1831,6 +1831,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, struct ceph_readdir_cache_control cache_ctl = {}; struct fscrypt_str tname = FSTR_INIT(NULL, 0); struct fscrypt_str oname = FSTR_INIT(NULL, 0); + char *long_snap_name = NULL; if (test_bit(CEPH_MDS_R_ABORTED, &req->r_req_flags)) return readdir_prepopulate_inodes_only(req, session); @@ -1892,23 +1893,93 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, if (err < 0) goto out; + long_snap_name = kzalloc(CEPH_ENCRPTED_LONG_SNAP_NAME_MAX, GFP_NOFS); + if (!long_snap_name) { + err = -ENOMEM; + goto out; + } + /* FIXME: release caps/leases if error occurs */ for (i = 0; i < rinfo->dir_nr; i++) { bool is_nokey = false; struct ceph_mds_reply_dir_entry *rde = rinfo->dir_entries + i; struct ceph_vino tvino; u32 olen = oname.len; + struct ceph_dentry_info *di; struct ceph_fname fname = { .dir = pinode, 
.name = rde->name, .name_len = rde->name_len, .ctext = rde->altname, .ctext_len = rde->altname_len }; - err = ceph_fname_to_usr(&fname, &tname, &oname, &is_nokey); - if (err) { - pr_err("%s unable to decode %.*s, got %d\n", __func__, - rde->name_len, rde->name, err); - goto out; + /* The long snap name will be "_${[ENCRYPTED-]SNAP-NAME}_${INO}" */ + if (rde->long_snap_name) { + int len; + char *lsn, *last, ino_str[20]; /* max len of "${INO}" is 16 */ + struct inode *_pinode; + struct ceph_vino vino = { + .snap = CEPH_NOSNAP, + }; + + /* Get the inode by using the "${INO}" */ + memcpy(long_snap_name, rde->name, rde->name_len); + long_snap_name[rde->name_len] = '\0'; + last = strrchr(long_snap_name, '_'); + if (!last) { + pr_err("%s long snapshot name %.*s badness\n", + __func__, rde->name_len, rde->name); + goto out; + } + last++; + len = rde->name_len - (last - long_snap_name); + memcpy(ino_str, last, len); + ino_str[len] = '\0'; + err = kstrtou64(ino_str, 0, &vino.ino); + if (err) + goto out; + _pinode = ceph_find_inode(inode->i_sb, vino); + BUG_ON(!_pinode); + + // is the inode of ${INO} encrypted ? 
+ if (IS_ENCRYPTED(_pinode)) { + len = rde->name_len - 2 - len; + fname.dir = _pinode; + fname.name = rde->name + 1; + fname.name_len = len; + + err = ceph_fname_to_usr(&fname, &tname, &oname, + &is_nokey); + if (err) { + pr_err("%s unable to decode %.*s, got %d\n", + __func__, rde->name_len, rde->name, + err); + iput(_pinode); + goto out; + } + + /* Covert it back to "_${DENCRYPTED-SNAP-NAME}_${INO}" */ + lsn = kasprintf(GFP_NOFS, "_%s_%s", oname.name, + ino_str); + if (!lsn) { + err = -ENOMEM; + iput(_pinode); + goto out; + } + len = strlen(lsn); + memcpy(oname.name, lsn, len); + oname.len = len; + } else { + memcpy(oname.name, fname.name, fname.name_len); + oname.len = fname.name_len; + } + iput(_pinode); + } else { + err = ceph_fname_to_usr(&fname, &tname, &oname, &is_nokey); + if (err) { + pr_err("%s unable to decode %.*s, got %d\n", __func__, + rde->name_len, rde->name, err); + goto out; + } } rde->dentry = NULL; @@ -1954,7 +2025,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, } else if (d_really_is_positive(dn) && (ceph_ino(d_inode(dn)) != tvino.ino || ceph_snap(d_inode(dn)) != tvino.snap)) { - struct ceph_dentry_info *di = ceph_dentry(dn); + di = ceph_dentry(dn); dout(" dn %p points to wrong inode %p\n", dn, d_inode(dn)); @@ -1977,6 +2048,8 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, * to avoid doing the dencrypt again there. 
*/ rde->dentry = dget(dn); + di = ceph_dentry(dn); + di->long_snap_name = !!rde->long_snap_name; /* inode */ if (d_really_is_positive(dn)) { @@ -2056,6 +2129,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, ceph_fname_free_buffer(pinode, &tname); ceph_fname_free_buffer(pinode, &oname); iput(pinode); + kfree(long_snap_name); dout("readdir_prepopulate done\n"); return err; } diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index f8fd474f80cf..92adfaca0914 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -309,7 +309,8 @@ static int parse_reply_info_dir(void **p, void *end, static int parse_reply_info_lease(void **p, void *end, struct ceph_mds_reply_lease **lease, - u64 features, u32 *altname_len, u8 **altname) + u64 features, u32 *altname_len, u8 **altname, + u8 *long_snap_name) { u8 struct_v; u32 struct_len; @@ -339,14 +340,16 @@ static int parse_reply_info_lease(void **p, void *end, *p += sizeof(**lease); if (features == (u64)-1) { + *altname = NULL; + *altname_len = 0; if (struct_v >= 2) { ceph_decode_32_safe(p, end, *altname_len, bad); ceph_decode_need(p, end, *altname_len, bad); *altname = *p; *p += *altname_len; - } else { - *altname = NULL; - *altname_len = 0; + } + if (struct_v >= 3) { + ceph_decode_8_safe(p, end, *long_snap_name, bad); } } *p = lend; @@ -380,7 +383,8 @@ static int parse_reply_info_trace(void **p, void *end, *p += info->dname_len; err = parse_reply_info_lease(p, end, &info->dlease, features, - &info->altname_len, &info->altname); + &info->altname_len, &info->altname, + &info->long_snap_name); if (err < 0) goto out_bad; } @@ -448,7 +452,8 @@ static int parse_reply_info_readdir(void **p, void *end, /* dentry lease */ err = parse_reply_info_lease(p, end, &rde->lease, features, - &rde->altname_len, &rde->altname); + &rde->altname_len, &rde->altname, + &rde->long_snap_name); if (err) goto out_bad; @@ -2511,7 +2516,7 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int for 
spin_unlock(&cur->d_lock); } else { int len, ret; - char buf[FSCRYPT_BASE64URL_CHARS(NAME_MAX)]; + char buf[FSCRYPT_BASE64URL_CHARS(NAME_MAX) + 20]; /* * Proactively copy name into buf, in case we need to present @@ -2784,6 +2789,9 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, if (test_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags)) len += sizeof(__le64); + /* extra chars '_' and '_${INO}' for long snap names */ + len += 60; + msg = ceph_msg_new2(CEPH_MSG_CLIENT_REQUEST, len, 1, GFP_NOFS, false); if (!msg) { msg = ERR_PTR(-ENOMEM); diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 663d7754d57d..5068f85c7505 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -104,6 +104,7 @@ struct ceph_mds_reply_dir_entry { struct ceph_mds_reply_lease *lease; struct ceph_mds_reply_info_in inode; loff_t offset; + u8 long_snap_name; }; struct ceph_mds_reply_xattr { @@ -129,6 +130,7 @@ struct ceph_mds_reply_info_parsed { u32 altname_len; struct ceph_mds_reply_lease *dlease; struct ceph_mds_reply_xattr xattr_info; + u8 long_snap_name; /* extra */ union { diff --git a/fs/ceph/super.h b/fs/ceph/super.h index f0268a571621..2a96b4524a37 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -297,6 +297,7 @@ struct ceph_dentry_info { unsigned long lease_renew_after, lease_renew_from; unsigned long time; u64 offset; + bool long_snap_name; }; #define CEPH_DENTRY_REFERENCED 1
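
For readers who want to experiment with the name handling outside the kernel, the long snap name split described in the commit message ("_${SNAP-NAME}_${INO}") can be sketched in userspace C. This is only an illustration under assumptions: `struct long_snap` and `parse_long_snap_name()` are hypothetical names that do not appear in the patch, `strtoull()` stands in for the kernel's `kstrtou64()`, and the sketch parses the inode number as base 10 whereas the patch passes base 0.

```c
#include <ctype.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical holder for the two parts of "_${SNAP-NAME}_${INO}". */
struct long_snap {
	char name[256];  /* the (possibly encrypted) snapshot name */
	uint64_t ino;    /* inode number parsed from the "${INO}" suffix */
};

/*
 * Split a long snap name into its snapshot name and inode number.
 * Returns 0 on success, or -EINVAL if the input is not in the long
 * snap name format.
 */
int parse_long_snap_name(const char *s, struct long_snap *out)
{
	const char *last;
	char *end;
	size_t len;
	unsigned long long ino;

	/* Long snap names always start with '_'. */
	if (s[0] != '_')
		return -EINVAL;

	/* The final '_' separates the snapshot name from "${INO}". */
	last = strrchr(s, '_');
	if (last == s || !isdigit((unsigned char)last[1]))
		return -EINVAL;

	/* Length of the name between the leading '_' and the final '_'. */
	len = (size_t)(last - s) - 1;
	if (len == 0 || len >= sizeof(out->name))
		return -EINVAL;

	/* Parse "${INO}"; reject trailing garbage and overflow. */
	errno = 0;
	ino = strtoull(last + 1, &end, 10);
	if (errno || *end != '\0')
		return -EINVAL;

	memcpy(out->name, s + 1, len);
	out->name[len] = '\0';
	out->ino = (uint64_t)ino;
	return 0;
}
```

With the names from the commit message, "_dir1_snap1_1099511640069" would split into the name "dir1_snap1" and inode 1099511640069, while a short name such as "dir2_snap1" is rejected because it lacks the leading '_'. Unlike the kernel code, the sketch copies the name into a fixed-size buffer instead of allocating it, purely to keep the example short.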