From patchwork Tue Apr 5 19:19:32 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802348 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9C855C433EF for ; Wed, 6 Apr 2022 04:16:23 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1380568AbiDFEMp (ORCPT ); Wed, 6 Apr 2022 00:12:45 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35094 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573549AbiDETWd (ORCPT ); Tue, 5 Apr 2022 15:22:33 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B03638DB9; Tue, 5 Apr 2022 12:20:34 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B0E03616C5; Tue, 5 Apr 2022 19:20:33 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7792EC385A3; Tue, 5 Apr 2022 19:20:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186433; bh=KPIMaUrv3ahVDlTZhvKlbu0p7ZyDL3VrkfH0rCRjpXk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ifF09hyIaGUTiC7bnjvc9J8UYXHxi2eZKOPa+cWWIGInOKj6kdGzAV65rtAvzN8G7 /ZT1UExzLMCQJikWl4O9Ga3D5x39CE9fnFPXBEoHk+6lK4/bCaCNVRvDO1cls43kOZ OvmShbiUM7hNU2Op2QytzzWoVfj1b63pE2G81W0KKovHXp4emOXc2EC+EJK+fVMaQj EZ/YVtZUVtEYO74b0R89kL4XmHjsUS8PFnqDJn4AOsEAbLrTeO9GeVmfrukjlIFQNU 7NGbldoE85LoiViOuZ8SEjExCaSw4HaQ0FUa5SWtd4Kt47R7g4PhPflpGGN5wKEpq0 DfLefq6jHJTFQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: 
ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 01/59] libceph: add spinlock around osd->o_requests Date: Tue, 5 Apr 2022 15:19:32 -0400 Message-Id: <20220405192030.178326-2-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org In a later patch, we're going to need to search for a request in the rbtree, but taking the osd->lock mutex is inconvenient as we already hold the con mutex at the point where we need it. Add a new spinlock that we take when inserting and erasing entries from the o_requests tree. Search of the rbtree can be done with either the mutex or the spinlock, but insertion and removal require both. Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- include/linux/ceph/osd_client.h | 8 +++++++- net/ceph/osd_client.c | 5 +++++ 2 files changed, 12 insertions(+), 1 deletion(-) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 3431011f364d..3122c1a3205f 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -29,7 +29,12 @@ typedef void (*ceph_osdc_callback_t)(struct ceph_osd_request *); #define CEPH_HOMELESS_OSD -1 -/* a given osd we're communicating with */ +/* + * A given osd we're communicating with. + * + * Note that the o_requests tree can be searched while holding the "lock" mutex + * or the "o_requests_lock" spinlock. Insertion or removal requires both! 
+ */ struct ceph_osd { refcount_t o_ref; struct ceph_osd_client *o_osdc; @@ -37,6 +42,7 @@ struct ceph_osd { int o_incarnation; struct rb_node o_node; struct ceph_connection o_con; + spinlock_t o_requests_lock; struct rb_root o_requests; struct rb_root o_linger_requests; struct rb_root o_backoff_mappings; diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index 83eb97c94e83..17c792b32343 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -1198,6 +1198,7 @@ static void osd_init(struct ceph_osd *osd) { refcount_set(&osd->o_ref, 1); RB_CLEAR_NODE(&osd->o_node); + spin_lock_init(&osd->o_requests_lock); osd->o_requests = RB_ROOT; osd->o_linger_requests = RB_ROOT; osd->o_backoff_mappings = RB_ROOT; @@ -1427,7 +1428,9 @@ static void link_request(struct ceph_osd *osd, struct ceph_osd_request *req) atomic_inc(&osd->o_osdc->num_homeless); get_osd(osd); + spin_lock(&osd->o_requests_lock); insert_request(&osd->o_requests, req); + spin_unlock(&osd->o_requests_lock); req->r_osd = osd; } @@ -1439,7 +1442,9 @@ static void unlink_request(struct ceph_osd *osd, struct ceph_osd_request *req) req, req->r_tid); req->r_osd = NULL; + spin_lock(&osd->o_requests_lock); erase_request(&osd->o_requests, req); + spin_unlock(&osd->o_requests_lock); put_osd(osd); if (!osd_homeless(osd)) From patchwork Tue Apr 5 19:19:33 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802326 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7FF4CC433FE for ; Wed, 6 Apr 2022 04:02:03 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S230188AbiDFED4 (ORCPT ); Wed, 6 Apr 2022 00:03:56 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35276 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573551AbiDETWf (ORCPT ); Tue, 5 Apr 2022 15:22:35 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C65443B2B0; Tue, 5 Apr 2022 12:20:36 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 59D9EB81F6B; Tue, 5 Apr 2022 19:20:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 611E1C385A0; Tue, 5 Apr 2022 19:20:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186434; bh=WUMUKws9bRjZTrkHwsGynA7tUQA6j+UO+nipvx8/Pxc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ViqzZ1njQjHURsQFB+bQws/EXFJXBVZ0713Vy6dx/7ZW2JLu/xJiUvM59gv19WtuH jfUpil6gly2i2pNknMzzqQLgdSP/9zSp1jWAsMzq31/yBk1ZmtSHwQo8W93YF+gpoH q6gA5ImURC+5w6E7ZP7KxFXbq5yPmvGsDpR55YX9rpAnUorWYz8YyWWxca89/CYEGb dCmAQsLRhI+xEKB9lYpcxr7hUR7Dp1hQy01Plzo+DQB9eu66rhzwF1tqLuh0RVcn/E 8dUtp6FjBKqDcfNtEjqOTl56bgrKgfjt0ySudygn2hIFJ1pt3dBuGefss3n3vS2CxR 5uhc00PJzXdlQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 02/59] libceph: define struct ceph_sparse_extent and add some helpers Date: Tue, 5 Apr 2022 15:19:33 -0400 Message-Id: <20220405192030.178326-3-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org When the OSD sends back a sparse read reply, it contains an array of these structures. 
Define the structure and add a couple of helpers for dealing with them. Also add a place in struct ceph_osd_req_op to store the extent buffer, and code to free it if it's populated when the req is torn down. Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- include/linux/ceph/osd_client.h | 43 ++++++++++++++++++++++++++++++++- net/ceph/osd_client.c | 13 ++++++++++ 2 files changed, 55 insertions(+), 1 deletion(-) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 3122c1a3205f..1dd02240d00d 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -29,6 +29,17 @@ typedef void (*ceph_osdc_callback_t)(struct ceph_osd_request *); #define CEPH_HOMELESS_OSD -1 +/* + * A single extent in a SPARSE_READ reply. + * + * Note that these come from the OSD as little-endian values. On BE arches, + * we convert them in-place after receipt. + */ +struct ceph_sparse_extent { + u64 off; + u64 len; +} __packed; + /* * A given osd we're communicating with. * @@ -104,6 +115,8 @@ struct ceph_osd_req_op { u64 offset, length; u64 truncate_size; u32 truncate_seq; + int sparse_ext_cnt; + struct ceph_sparse_extent *sparse_ext; struct ceph_osd_data osd_data; } extent; struct { @@ -507,6 +520,20 @@ extern struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *, u32 truncate_seq, u64 truncate_size, bool use_mempool); +int __ceph_alloc_sparse_ext_map(struct ceph_osd_req_op *op, int cnt); + +/* + * How big an extent array should we preallocate for a sparse read? This is + * just a starting value. If we get more than this back from the OSD, the + * receiver will reallocate. 
+ */ +#define CEPH_SPARSE_EXT_ARRAY_INITIAL 16 + +static inline int ceph_alloc_sparse_ext_map(struct ceph_osd_req_op *op) +{ + return __ceph_alloc_sparse_ext_map(op, CEPH_SPARSE_EXT_ARRAY_INITIAL); +} + extern void ceph_osdc_get_request(struct ceph_osd_request *req); extern void ceph_osdc_put_request(struct ceph_osd_request *req); @@ -562,5 +589,19 @@ int ceph_osdc_list_watchers(struct ceph_osd_client *osdc, struct ceph_object_locator *oloc, struct ceph_watch_item **watchers, u32 *num_watchers); -#endif +/* Find offset into the buffer of the end of the extent map */ +static inline u64 ceph_sparse_ext_map_end(struct ceph_osd_req_op *op) +{ + struct ceph_sparse_extent *ext; + + /* No extents? No data */ + if (op->extent.sparse_ext_cnt == 0) + return 0; + + ext = &op->extent.sparse_ext[op->extent.sparse_ext_cnt - 1]; + + return ext->off + ext->len - op->extent.offset; +} + +#endif diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index 17c792b32343..c150683f2a2f 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -378,6 +378,7 @@ static void osd_req_op_data_release(struct ceph_osd_request *osd_req, case CEPH_OSD_OP_READ: case CEPH_OSD_OP_WRITE: case CEPH_OSD_OP_WRITEFULL: + kfree(op->extent.sparse_ext); ceph_osd_data_release(&op->extent.osd_data); break; case CEPH_OSD_OP_CALL: @@ -1141,6 +1142,18 @@ struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *osdc, } EXPORT_SYMBOL(ceph_osdc_new_request); +int __ceph_alloc_sparse_ext_map(struct ceph_osd_req_op *op, int cnt) +{ + op->extent.sparse_ext_cnt = cnt; + op->extent.sparse_ext = kmalloc_array(cnt, + sizeof(*op->extent.sparse_ext), + GFP_NOFS); + if (!op->extent.sparse_ext) + return -ENOMEM; + return 0; +} +EXPORT_SYMBOL(__ceph_alloc_sparse_ext_map); + /* * We keep osd requests in an rbtree, sorted by ->r_tid. 
*/ From patchwork Tue Apr 5 19:19:34 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802355 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 915BCC4321E for ; Wed, 6 Apr 2022 04:16:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1388309AbiDFENs (ORCPT ); Wed, 6 Apr 2022 00:13:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35266 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573550AbiDETWf (ORCPT ); Tue, 5 Apr 2022 15:22:35 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D1053A70D; Tue, 5 Apr 2022 12:20:36 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8D09061899; Tue, 5 Apr 2022 19:20:35 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4A4A3C385A1; Tue, 5 Apr 2022 19:20:34 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186435; bh=nG8CJQgJdJ6uZY6N3B9ES84bNWIgTJ4luknQQSBEL7g=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PfD92Pe5GDQkkwdKIR07YoHmc/+qwtcF5k1GyDiZ9NSsNfFNXtdEoZhAEs+0kHO7f D87/tGhmK7Yay22hndvV1yjkAybmgu1rMZFmYzZpq3cIAQGC72SsSAy+vv6maCSADL o8rGvMWFiGjZcsUBKzAB5GaJ3915DNz8VXltDe/0mT55LsZL4o3CCkKtMedI5utFyC BevPuBXPwxuqqL5WKGSYZLbRnyHa0shBl8Kj29n6oyeBaIUK31dzyaO+QjQtfUVMjm H9wF04lZy+TXYUBek0FLlwOZKB8HGLOppRZsoFvf6wOHgqp+6FHtEX1TyeJQ5DUEWO oaHoFo/yL2vGg== From: Jeff Layton To: idryomov@gmail.com, 
xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 03/59] libceph: add sparse read support to msgr2 crc state machine Date: Tue, 5 Apr 2022 15:19:34 -0400 Message-Id: <20220405192030.178326-4-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add support for a new sparse_read ceph_connection operation. The idea is that the client driver can define this operation and use it to do special handling for incoming reads. The alloc_msg routine will look at the request and determine whether the reply is expected to be sparse. If it is, then we'll dispatch to a different set of state machine states that will repeatedly call the driver's sparse_read op to get length and placement info for reading the extent map, and the extents themselves. This necessitates adding some new fields to some other structs: - The msg gets a new bool to track whether it's a sparse_read request. - A new field is added to the cursor to track the amount remaining in the current extent. This is used to cap the read from the socket into the msg_data. - Handling a revoke with all of this is particularly difficult, so I've added a new data_len_remain field to the v2 connection info, and then use that to skip that much on a revoke. We may want to expand the use of that to the normal read path as well, just for consistency's sake. 
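For illustration only, the calling convention described above can be sketched in plain userspace C. This is not code from this series: every toy_* name is hypothetical, the wire image uses host byte order, and the 32-bit data length placed between the extent map and the data is an assumption based on the CEPH_SPARSE_READ_DATA_LEN state added later in the series. The toy op always hands back a buffer, whereas the real op may leave @buf unset so that the data lands in the message cursor.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical userspace stand-ins; none of these names exist in libceph. */
#define TOY_MAX_EXT 4

enum toy_state { TOY_HDR, TOY_EXTENTS, TOY_DATA_LEN, TOY_DATA };

struct toy_extent { uint64_t off, len; };

struct toy_reader {
	enum toy_state state;
	uint64_t req_off, req_len;	/* range of the original request */
	uint64_t pos;			/* file offset received so far */
	uint32_t count, datalen, index;
	struct toy_extent ext[TOY_MAX_EXT];
	unsigned char *dst;		/* req_len bytes of destination */
};

/*
 * Toy version of the op: >0 is the number of bytes to read into *buf,
 * 0 means the reply is complete, <0 is an error.
 */
static int toy_sparse_read(struct toy_reader *r, char **buf)
{
	struct toy_extent *e;

	switch (r->state) {
	case TOY_HDR:
		r->pos = r->req_off;
		r->index = 0;
		r->state = TOY_EXTENTS;
		*buf = (char *)&r->count;
		return (int)sizeof(r->count);
	case TOY_EXTENTS:
		if (r->count > TOY_MAX_EXT)
			return -1;
		r->state = TOY_DATA_LEN;
		if (r->count) {
			*buf = (char *)r->ext;
			return (int)(r->count * sizeof(r->ext[0]));
		}
		/* fall through */
	case TOY_DATA_LEN:
		r->state = TOY_DATA;
		*buf = (char *)&r->datalen;
		return (int)sizeof(r->datalen);
	case TOY_DATA:
		if (r->index >= r->count)
			return 0;	/* stop at the end of the last extent */
		e = &r->ext[r->index++];
		if (!e->len || e->off < r->pos ||
		    e->off + e->len > r->req_off + r->req_len)
			return -1;
		/* zero the hole in front of this extent */
		memset(r->dst + (r->pos - r->req_off), 0, e->off - r->pos);
		*buf = (char *)r->dst + (e->off - r->req_off);
		r->pos = e->off + e->len;
		return (int)e->len;
	}
	return -1;
}

/* Toy "messenger": keep asking the op where the next bytes should go. */
static int toy_receive(struct toy_reader *r, const unsigned char *wire,
		       size_t wire_len)
{
	size_t consumed = 0;
	char *buf = NULL;
	int ret;

	while ((ret = toy_sparse_read(r, &buf)) > 0) {
		if ((size_t)ret > wire_len - consumed)
			return -1;	/* short reply */
		memcpy(buf, wire + consumed, (size_t)ret);
		consumed += (size_t)ret;
	}
	return ret;
}
```

The point of the split is that only the op knows how to interpret the reply, while the caller only moves bytes. In the toy, bytes past the end of the last extent are left untouched in the destination.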
Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- include/linux/ceph/messenger.h | 28 ++++++ net/ceph/messenger.c | 1 + net/ceph/messenger_v2.c | 168 +++++++++++++++++++++++++++++++-- 3 files changed, 188 insertions(+), 9 deletions(-) diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h index e7f2fb2fc207..7f09a4213834 100644 --- a/include/linux/ceph/messenger.h +++ b/include/linux/ceph/messenger.h @@ -17,6 +17,7 @@ struct ceph_msg; struct ceph_connection; +struct ceph_msg_data_cursor; /* * Ceph defines these callbacks for handling connection events. @@ -70,6 +71,30 @@ struct ceph_connection_operations { int used_proto, int result, const int *allowed_protos, int proto_cnt, const int *allowed_modes, int mode_cnt); + + /** + * sparse_read: read sparse data + * @con: connection we're reading from + * @cursor: data cursor for reading extents + * @buf: optional buffer to read into + * + * This should be called more than once, each time setting up to + * receive an extent into the current cursor position, and zeroing + * the holes between them. + * + * Returns amount of data to be read (in bytes), 0 if reading is + * complete, or -errno if there was an error. + * + * If @buf is set on a >0 return, then the data should be read into + * the provided buffer. Otherwise, it should be read into the cursor. + * + * The sparse read operation is expected to initialize the cursor + * with a length covering up to the end of the last extent. 
+ */ + int (*sparse_read)(struct ceph_connection *con, + struct ceph_msg_data_cursor *cursor, + char **buf); + }; /* use format string %s%lld */ @@ -207,6 +232,7 @@ struct ceph_msg_data_cursor { struct ceph_msg_data *data; /* current data item */ size_t resid; /* bytes not yet consumed */ + int sr_resid; /* residual sparse_read len */ bool last_piece; /* current is last piece */ bool need_crc; /* crc update needed */ union { @@ -252,6 +278,7 @@ struct ceph_msg { struct kref kref; bool more_to_follow; bool needs_out_seq; + bool sparse_read; int front_alloc_len; struct ceph_msgpool *pool; @@ -396,6 +423,7 @@ struct ceph_connection_v2_info { void *conn_bufs[16]; int conn_buf_cnt; + int data_len_remain; struct kvec in_sign_kvecs[8]; struct kvec out_sign_kvecs[8]; diff --git a/net/ceph/messenger.c b/net/ceph/messenger.c index d3bb656308b4..bf4e7f5751ee 100644 --- a/net/ceph/messenger.c +++ b/net/ceph/messenger.c @@ -1034,6 +1034,7 @@ void ceph_msg_data_cursor_init(struct ceph_msg_data_cursor *cursor, cursor->total_resid = length; cursor->data = msg->data; + cursor->sr_resid = 0; __ceph_msg_data_cursor_init(cursor); } diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c index c6e5bfc717d5..d527777af584 100644 --- a/net/ceph/messenger_v2.c +++ b/net/ceph/messenger_v2.c @@ -52,14 +52,16 @@ #define FRAME_LATE_STATUS_COMPLETE 0xe #define FRAME_LATE_STATUS_ABORTED_MASK 0xf -#define IN_S_HANDLE_PREAMBLE 1 -#define IN_S_HANDLE_CONTROL 2 -#define IN_S_HANDLE_CONTROL_REMAINDER 3 -#define IN_S_PREPARE_READ_DATA 4 -#define IN_S_PREPARE_READ_DATA_CONT 5 -#define IN_S_PREPARE_READ_ENC_PAGE 6 -#define IN_S_HANDLE_EPILOGUE 7 -#define IN_S_FINISH_SKIP 8 +#define IN_S_HANDLE_PREAMBLE 1 +#define IN_S_HANDLE_CONTROL 2 +#define IN_S_HANDLE_CONTROL_REMAINDER 3 +#define IN_S_PREPARE_READ_DATA 4 +#define IN_S_PREPARE_READ_DATA_CONT 5 +#define IN_S_PREPARE_READ_ENC_PAGE 6 +#define IN_S_PREPARE_SPARSE_DATA 7 +#define IN_S_PREPARE_SPARSE_DATA_CONT 8 +#define IN_S_HANDLE_EPILOGUE 9 
+#define IN_S_FINISH_SKIP 10 #define OUT_S_QUEUE_DATA 1 #define OUT_S_QUEUE_DATA_CONT 2 @@ -1819,6 +1821,124 @@ static void prepare_read_data_cont(struct ceph_connection *con) con->v2.in_state = IN_S_HANDLE_EPILOGUE; } +static int prepare_sparse_read_cont(struct ceph_connection *con) +{ + int ret; + struct bio_vec bv; + char *buf = NULL; + struct ceph_msg_data_cursor *cursor = &con->v2.in_cursor; + + WARN_ON(con->v2.in_state != IN_S_PREPARE_SPARSE_DATA_CONT); + + if (iov_iter_is_bvec(&con->v2.in_iter)) { + if (ceph_test_opt(from_msgr(con->msgr), RXBOUNCE)) { + con->in_data_crc = crc32c(con->in_data_crc, + page_address(con->bounce_page), + con->v2.in_bvec.bv_len); + get_bvec_at(cursor, &bv); + memcpy_to_page(bv.bv_page, bv.bv_offset, + page_address(con->bounce_page), + con->v2.in_bvec.bv_len); + } else { + con->in_data_crc = ceph_crc32c_page(con->in_data_crc, + con->v2.in_bvec.bv_page, + con->v2.in_bvec.bv_offset, + con->v2.in_bvec.bv_len); + } + + ceph_msg_data_advance(cursor, con->v2.in_bvec.bv_len); + cursor->sr_resid -= con->v2.in_bvec.bv_len; + dout("%s: advance by 0x%x sr_resid 0x%x\n", __func__, + con->v2.in_bvec.bv_len, cursor->sr_resid); + WARN_ON_ONCE(cursor->sr_resid > cursor->total_resid); + if (cursor->sr_resid) { + get_bvec_at(cursor, &bv); + if (bv.bv_len > cursor->sr_resid) + bv.bv_len = cursor->sr_resid; + if (ceph_test_opt(from_msgr(con->msgr), RXBOUNCE)) { + bv.bv_page = con->bounce_page; + bv.bv_offset = 0; + } + set_in_bvec(con, &bv); + con->v2.data_len_remain -= bv.bv_len; + return 0; + } + } else if (iov_iter_is_kvec(&con->v2.in_iter)) { + /* On first call, we have no kvec so don't compute crc */ + if (con->v2.in_kvec_cnt) { + WARN_ON_ONCE(con->v2.in_kvec_cnt > 1); + con->in_data_crc = crc32c(con->in_data_crc, + con->v2.in_kvecs[0].iov_base, + con->v2.in_kvecs[0].iov_len); + } + } else { + return -EIO; + } + + /* get next extent */ + ret = con->ops->sparse_read(con, cursor, &buf); + if (ret <= 0) { + if (ret < 0) + return ret; + + 
reset_in_kvecs(con); + add_in_kvec(con, con->v2.in_buf, CEPH_EPILOGUE_PLAIN_LEN); + con->v2.in_state = IN_S_HANDLE_EPILOGUE; + return 0; + } + + if (buf) { + /* receive into buffer */ + reset_in_kvecs(con); + add_in_kvec(con, buf, ret); + con->v2.data_len_remain -= ret; + return 0; + } + + if (ret > cursor->total_resid) { + pr_warn("%s: ret 0x%x total_resid 0x%zx resid 0x%zx last %d\n", + __func__, ret, cursor->total_resid, cursor->resid, + cursor->last_piece); + return -EIO; + } + get_bvec_at(cursor, &bv); + if (bv.bv_len > cursor->sr_resid) + bv.bv_len = cursor->sr_resid; + if (ceph_test_opt(from_msgr(con->msgr), RXBOUNCE)) { + if (unlikely(!con->bounce_page)) { + con->bounce_page = alloc_page(GFP_NOIO); + if (!con->bounce_page) { + pr_err("failed to allocate bounce page\n"); + return -ENOMEM; + } + } + + bv.bv_page = con->bounce_page; + bv.bv_offset = 0; + } + set_in_bvec(con, &bv); + con->v2.data_len_remain -= ret; + return ret; +} + +static int prepare_sparse_read_data(struct ceph_connection *con) +{ + struct ceph_msg *msg = con->in_msg; + + dout("%s: starting sparse read\n", __func__); + + if (WARN_ON_ONCE(!con->ops->sparse_read)) + return -EOPNOTSUPP; + + if (!con_secure(con)) + con->in_data_crc = -1; + + reset_in_kvecs(con); + con->v2.in_state = IN_S_PREPARE_SPARSE_DATA_CONT; + con->v2.data_len_remain = data_len(msg); + return prepare_sparse_read_cont(con); +} + static int prepare_read_tail_plain(struct ceph_connection *con) { struct ceph_msg *msg = con->in_msg; @@ -1839,7 +1959,10 @@ static int prepare_read_tail_plain(struct ceph_connection *con) } if (data_len(msg)) { - con->v2.in_state = IN_S_PREPARE_READ_DATA; + if (msg->sparse_read) + con->v2.in_state = IN_S_PREPARE_SPARSE_DATA; + else + con->v2.in_state = IN_S_PREPARE_READ_DATA; } else { add_in_kvec(con, con->v2.in_buf, CEPH_EPILOGUE_PLAIN_LEN); con->v2.in_state = IN_S_HANDLE_EPILOGUE; @@ -2893,6 +3016,12 @@ static int populate_in_iter(struct ceph_connection *con) prepare_read_enc_page(con); ret = 0; 
break; + case IN_S_PREPARE_SPARSE_DATA: + ret = prepare_sparse_read_data(con); + break; + case IN_S_PREPARE_SPARSE_DATA_CONT: + ret = prepare_sparse_read_cont(con); + break; case IN_S_HANDLE_EPILOGUE: ret = handle_epilogue(con); break; @@ -3485,6 +3614,23 @@ static void revoke_at_prepare_read_enc_page(struct ceph_connection *con) con->v2.in_state = IN_S_FINISH_SKIP; } +static void revoke_at_prepare_sparse_data(struct ceph_connection *con) +{ + int resid; /* current piece of data */ + int remaining; + + WARN_ON(con_secure(con)); + WARN_ON(!data_len(con->in_msg)); + WARN_ON(!iov_iter_is_bvec(&con->v2.in_iter)); + resid = iov_iter_count(&con->v2.in_iter); + dout("%s con %p resid %d\n", __func__, con, resid); + + remaining = CEPH_EPILOGUE_PLAIN_LEN + con->v2.data_len_remain; + con->v2.in_iter.count -= resid; + set_in_skip(con, resid + remaining); + con->v2.in_state = IN_S_FINISH_SKIP; +} + static void revoke_at_handle_epilogue(struct ceph_connection *con) { int resid; @@ -3501,6 +3647,7 @@ static void revoke_at_handle_epilogue(struct ceph_connection *con) void ceph_con_v2_revoke_incoming(struct ceph_connection *con) { switch (con->v2.in_state) { + case IN_S_PREPARE_SPARSE_DATA: case IN_S_PREPARE_READ_DATA: revoke_at_prepare_read_data(con); break; @@ -3510,6 +3657,9 @@ void ceph_con_v2_revoke_incoming(struct ceph_connection *con) case IN_S_PREPARE_READ_ENC_PAGE: revoke_at_prepare_read_enc_page(con); break; + case IN_S_PREPARE_SPARSE_DATA_CONT: + revoke_at_prepare_sparse_data(con); + break; case IN_S_HANDLE_EPILOGUE: revoke_at_handle_epilogue(con); break; From patchwork Tue Apr 5 19:19:35 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802337 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 
F3A78C4332F for ; Wed, 6 Apr 2022 04:05:14 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1442049AbiDFEGa (ORCPT ); Wed, 6 Apr 2022 00:06:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35484 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573553AbiDETWj (ORCPT ); Tue, 5 Apr 2022 15:22:39 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4FD563CFCD; Tue, 5 Apr 2022 12:20:39 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 8C61ECE1FB6; Tue, 5 Apr 2022 19:20:37 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3390FC385A3; Tue, 5 Apr 2022 19:20:35 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186435; bh=wH0z00USvCwH2GBLz1DogypY6kWWSEH0EiOI5h83q+E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=oDFS5/7KRE0XqcPAHpuET2uJiz4GUz9GppVW23XdlTF0keZLn360VmyYr92jDCPQM xCbwN5+qERVgewEUadLDsiIc+0ryn3t9apRrRsrqpuZbVOefi2rRxGt/XTkvddFtCY aHLeW0IMZau2nHp7kCW/OIDbJzcwaT2CZOUx9KMiFuJMcR4a2ApwJLTdcan5QBQ4wo 8tyYjlbhbsacgMRNcSLdTDkToJ29K55QR77jM+yQSB1+OAPMkpOFPRZKgvHHPxRvn9 yTOXvpGgtwWT3w12rcCKS1yUIyj99Qxgyt18j3M10A1hxgQEjLLoMNRvdKgcjzoWCW jSZkBdE/H1Uyw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 04/59] libceph: add sparse read support to OSD client Date: Tue, 5 Apr 2022 15:19:35 -0400 Message-Id: <20220405192030.178326-5-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: 
<20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Have get_reply check for the presence of sparse read ops in the request and set the sparse_read boolean in the msg. That will queue the messenger layer to use the sparse read codepath instead of the normal data receive. Add a new sparse_read operation for the OSD client, driven by its own state machine. The messenger will repeatedly call the sparse_read operation, and it will pass back the necessary info to set up to read the next extent of data, while zero-filling the sparse regions. The state machine will stop at the end of the last extent, and will attach the extent map buffer to the ceph_osd_req_op so that the caller can use it. Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- include/linux/ceph/osd_client.h | 32 ++++ net/ceph/osd_client.c | 256 +++++++++++++++++++++++++++++++- 2 files changed, 284 insertions(+), 4 deletions(-) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 1dd02240d00d..4088601beacc 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -40,6 +40,36 @@ struct ceph_sparse_extent { u64 len; } __packed; +/* Sparse read state machine state values */ +enum ceph_sparse_read_state { + CEPH_SPARSE_READ_HDR = 0, + CEPH_SPARSE_READ_EXTENTS, + CEPH_SPARSE_READ_DATA_LEN, + CEPH_SPARSE_READ_DATA, +}; + +/* + * A SPARSE_READ reply is a 32-bit count of extents, followed by an array of + * 64-bit offset/length pairs, and then all of the actual file data + * concatenated after it (sans holes). + * + * Unfortunately, we don't know how long the extent array is until we've + * started reading the data section of the reply. The caller should send down + * a destination buffer for the array, but we'll alloc one if it's too small + * or if the caller doesn't. 
+ */ +struct ceph_sparse_read { + enum ceph_sparse_read_state sr_state; /* state machine state */ + u64 sr_req_off; /* orig request offset */ + u64 sr_req_len; /* orig request length */ + u64 sr_pos; /* current pos in buffer */ + int sr_index; /* current extent index */ + __le32 sr_datalen; /* length of actual data */ + u32 sr_count; /* extent count in reply */ + int sr_ext_len; /* length of extent array */ + struct ceph_sparse_extent *sr_extent; /* extent array */ +}; + /* * A given osd we're communicating with. * @@ -48,6 +78,7 @@ struct ceph_sparse_extent { */ struct ceph_osd { refcount_t o_ref; + int o_sparse_op_idx; struct ceph_osd_client *o_osdc; int o_osd; int o_incarnation; @@ -63,6 +94,7 @@ struct ceph_osd { unsigned long lru_ttl; struct list_head o_keepalive_item; struct mutex lock; + struct ceph_sparse_read o_sparse_read; }; #define CEPH_OSD_SLAB_OPS 2 diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index c150683f2a2f..acf6a19b6677 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -376,6 +376,7 @@ static void osd_req_op_data_release(struct ceph_osd_request *osd_req, switch (op->op) { case CEPH_OSD_OP_READ: + case CEPH_OSD_OP_SPARSE_READ: case CEPH_OSD_OP_WRITE: case CEPH_OSD_OP_WRITEFULL: kfree(op->extent.sparse_ext); @@ -707,6 +708,7 @@ static void get_num_data_items(struct ceph_osd_request *req, /* reply */ case CEPH_OSD_OP_STAT: case CEPH_OSD_OP_READ: + case CEPH_OSD_OP_SPARSE_READ: case CEPH_OSD_OP_LIST_WATCHERS: *num_reply_data_items += 1; break; @@ -776,7 +778,7 @@ void osd_req_op_extent_init(struct ceph_osd_request *osd_req, BUG_ON(opcode != CEPH_OSD_OP_READ && opcode != CEPH_OSD_OP_WRITE && opcode != CEPH_OSD_OP_WRITEFULL && opcode != CEPH_OSD_OP_ZERO && - opcode != CEPH_OSD_OP_TRUNCATE); + opcode != CEPH_OSD_OP_TRUNCATE && opcode != CEPH_OSD_OP_SPARSE_READ); op->extent.offset = offset; op->extent.length = length; @@ -985,6 +987,7 @@ static u32 osd_req_encode_op(struct ceph_osd_op *dst, case CEPH_OSD_OP_STAT: break; 
case CEPH_OSD_OP_READ: + case CEPH_OSD_OP_SPARSE_READ: case CEPH_OSD_OP_WRITE: case CEPH_OSD_OP_WRITEFULL: case CEPH_OSD_OP_ZERO: @@ -1081,7 +1084,8 @@ struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *osdc, BUG_ON(opcode != CEPH_OSD_OP_READ && opcode != CEPH_OSD_OP_WRITE && opcode != CEPH_OSD_OP_ZERO && opcode != CEPH_OSD_OP_TRUNCATE && - opcode != CEPH_OSD_OP_CREATE && opcode != CEPH_OSD_OP_DELETE); + opcode != CEPH_OSD_OP_CREATE && opcode != CEPH_OSD_OP_DELETE && + opcode != CEPH_OSD_OP_SPARSE_READ); req = ceph_osdc_alloc_request(osdc, snapc, num_ops, use_mempool, GFP_NOFS); @@ -1222,6 +1226,13 @@ static void osd_init(struct ceph_osd *osd) mutex_init(&osd->lock); } +static void ceph_init_sparse_read(struct ceph_sparse_read *sr) +{ + kfree(sr->sr_extent); + memset(sr, '\0', sizeof(*sr)); + sr->sr_state = CEPH_SPARSE_READ_HDR; +} + static void osd_cleanup(struct ceph_osd *osd) { WARN_ON(!RB_EMPTY_NODE(&osd->o_node)); @@ -1232,6 +1243,8 @@ static void osd_cleanup(struct ceph_osd *osd) WARN_ON(!list_empty(&osd->o_osd_lru)); WARN_ON(!list_empty(&osd->o_keepalive_item)); + ceph_init_sparse_read(&osd->o_sparse_read); + if (osd->o_auth.authorizer) { WARN_ON(osd_homeless(osd)); ceph_auth_destroy_authorizer(osd->o_auth.authorizer); @@ -1251,6 +1264,9 @@ static struct ceph_osd *create_osd(struct ceph_osd_client *osdc, int onum) osd_init(osd); osd->o_osdc = osdc; osd->o_osd = onum; + osd->o_sparse_op_idx = -1; + + ceph_init_sparse_read(&osd->o_sparse_read); ceph_con_init(&osd->o_con, osd, &osd_con_ops, &osdc->client->msgr); @@ -2055,6 +2071,7 @@ static void setup_request_data(struct ceph_osd_request *req) &op->raw_data_in); break; case CEPH_OSD_OP_READ: + case CEPH_OSD_OP_SPARSE_READ: ceph_osdc_msg_data_add(reply_msg, &op->extent.osd_data); break; @@ -2474,8 +2491,10 @@ static void finish_request(struct ceph_osd_request *req) req->r_end_latency = ktime_get(); - if (req->r_osd) + if (req->r_osd) { + ceph_init_sparse_read(&req->r_osd->o_sparse_read); 
unlink_request(req->r_osd, req); + } atomic_dec(&osdc->num_requests); /* @@ -5420,6 +5439,24 @@ static void osd_dispatch(struct ceph_connection *con, struct ceph_msg *msg) ceph_msg_put(msg); } +/* How much sparse data was requested? */ +static u64 sparse_data_requested(struct ceph_osd_request *req) +{ + u64 len = 0; + + if (req->r_flags & CEPH_OSD_FLAG_READ) { + int i; + + for (i = 0; i < req->r_num_ops; ++i) { + struct ceph_osd_req_op *op = &req->r_ops[i]; + + if (op->op == CEPH_OSD_OP_SPARSE_READ) + len += op->extent.length; + } + } + return len; +} + /* * Lookup and return message for incoming reply. Don't try to do * anything about a larger than preallocated data portion of the @@ -5436,6 +5473,7 @@ static struct ceph_msg *get_reply(struct ceph_connection *con, int front_len = le32_to_cpu(hdr->front_len); int data_len = le32_to_cpu(hdr->data_len); u64 tid = le64_to_cpu(hdr->tid); + u64 srlen; down_read(&osdc->lock); if (!osd_registered(osd)) { @@ -5468,7 +5506,8 @@ static struct ceph_msg *get_reply(struct ceph_connection *con, req->r_reply = m; } - if (data_len > req->r_reply->data_length) { + srlen = sparse_data_requested(req); + if (!srlen && data_len > req->r_reply->data_length) { pr_warn("%s osd%d tid %llu data %d > preallocated %zu, skipping\n", __func__, osd->o_osd, req->r_tid, data_len, req->r_reply->data_length); @@ -5478,6 +5517,8 @@ static struct ceph_msg *get_reply(struct ceph_connection *con, } m = ceph_msg_get(req->r_reply); + m->sparse_read = (bool)srlen; + dout("get_reply tid %lld %p\n", tid, m); out_unlock_session: @@ -5710,9 +5751,216 @@ static int osd_check_message_signature(struct ceph_msg *msg) return ceph_auth_check_message_signature(auth, msg); } +static void advance_cursor(struct ceph_msg_data_cursor *cursor, size_t len, bool zero) +{ + while (len) { + struct page *page; + size_t poff, plen; + bool last = false; + + page = ceph_msg_data_next(cursor, &poff, &plen, &last); + if (plen > len) + plen = len; + if (zero) + 
zero_user_segment(page, poff, poff + plen); + len -= plen; + ceph_msg_data_advance(cursor, plen); + } +} + +static int prep_next_sparse_read(struct ceph_connection *con, + struct ceph_msg_data_cursor *cursor) +{ + struct ceph_osd *o = con->private; + struct ceph_sparse_read *sr = &o->o_sparse_read; + struct ceph_osd_request *req; + struct ceph_osd_req_op *op; + + spin_lock(&o->o_requests_lock); + req = lookup_request(&o->o_requests, le64_to_cpu(con->in_msg->hdr.tid)); + if (!req) { + spin_unlock(&o->o_requests_lock); + return -EBADR; + } + + if (o->o_sparse_op_idx < 0) { + u64 srlen = sparse_data_requested(req); + + dout("%s: [%d] starting new sparse read req. srlen=0x%llx\n", + __func__, o->o_osd, srlen); + ceph_msg_data_cursor_init(cursor, con->in_msg, srlen); + } else { + u64 end; + + op = &req->r_ops[o->o_sparse_op_idx]; + + WARN_ON_ONCE(op->extent.sparse_ext); + + /* hand back buffer we took earlier */ + op->extent.sparse_ext = sr->sr_extent; + sr->sr_extent = NULL; + op->extent.sparse_ext_cnt = sr->sr_count; + sr->sr_ext_len = 0; + dout("%s: [%d] completed extent array len %d cursor->resid %zd\n", + __func__, o->o_osd, op->extent.sparse_ext_cnt, cursor->resid); + /* Advance to end of data for this operation */ + end = ceph_sparse_ext_map_end(op); + if (end < sr->sr_req_len) + advance_cursor(cursor, sr->sr_req_len - end, false); + } + + ceph_init_sparse_read(sr); + + /* find next op in this request (if any) */ + while (++o->o_sparse_op_idx < req->r_num_ops) { + op = &req->r_ops[o->o_sparse_op_idx]; + if (op->op == CEPH_OSD_OP_SPARSE_READ) + goto found; + } + + /* reset for next sparse read request */ + spin_unlock(&o->o_requests_lock); + o->o_sparse_op_idx = -1; + return 0; +found: + sr->sr_req_off = op->extent.offset; + sr->sr_req_len = op->extent.length; + sr->sr_pos = sr->sr_req_off; + dout("%s: [%d] new sparse read op at idx %d 0x%llx~0x%llx\n", __func__, + o->o_osd, o->o_sparse_op_idx, sr->sr_req_off, sr->sr_req_len); + + /* hand off request's sparse 
extent map buffer */ + sr->sr_ext_len = op->extent.sparse_ext_cnt; + op->extent.sparse_ext_cnt = 0; + sr->sr_extent = op->extent.sparse_ext; + op->extent.sparse_ext = NULL; + + spin_unlock(&o->o_requests_lock); + return 1; +} + +#ifdef __BIG_ENDIAN +static inline void convert_extent_map(struct ceph_sparse_read *sr) +{ + int i; + + for (i = 0; i < sr->sr_count; i++) { + struct ceph_sparse_extent *ext = &sr->sr_extent[i]; + + ext->off = le64_to_cpu((__force __le64)ext->off); + ext->len = le64_to_cpu((__force __le64)ext->len); + } +} +#else +static inline void convert_extent_map(struct ceph_sparse_read *sr) +{ +} +#endif + +#define MAX_EXTENTS 4096 + +static int osd_sparse_read(struct ceph_connection *con, + struct ceph_msg_data_cursor *cursor, + char **pbuf) +{ + struct ceph_osd *o = con->private; + struct ceph_sparse_read *sr = &o->o_sparse_read; + u32 count = sr->sr_count; + u64 eoff, elen; + int ret; + + switch (sr->sr_state) { + case CEPH_SPARSE_READ_HDR: +next_op: + ret = prep_next_sparse_read(con, cursor); + if (ret <= 0) + return ret; + + /* number of extents */ + ret = sizeof(sr->sr_count); + *pbuf = (char *)&sr->sr_count; + sr->sr_state = CEPH_SPARSE_READ_EXTENTS; + break; + case CEPH_SPARSE_READ_EXTENTS: + /* Convert sr_count to host-endian */ + count = le32_to_cpu((__force __le32)sr->sr_count); + sr->sr_count = count; + dout("[%d] got %u extents\n", o->o_osd, count); + + if (count > 0) { + if (!sr->sr_extent || count > sr->sr_ext_len) { + /* + * Apply a hard cap to the number of extents. + * If we have more, assume something is wrong. 
+ */ + if (count > MAX_EXTENTS) { + dout("%s: OSD returned 0x%x extents in a single reply!\n", + __func__, count); + return -EREMOTEIO; + } + + /* no extent array provided, or too short */ + kfree(sr->sr_extent); + sr->sr_extent = kmalloc_array(count, + sizeof(*sr->sr_extent), + GFP_NOIO); + if (!sr->sr_extent) + return -ENOMEM; + sr->sr_ext_len = count; + } + ret = count * sizeof(*sr->sr_extent); + *pbuf = (char *)sr->sr_extent; + sr->sr_state = CEPH_SPARSE_READ_DATA_LEN; + break; + } + /* No extents? Read data len */ + fallthrough; + case CEPH_SPARSE_READ_DATA_LEN: + convert_extent_map(sr); + ret = sizeof(sr->sr_datalen); + *pbuf = (char *)&sr->sr_datalen; + sr->sr_state = CEPH_SPARSE_READ_DATA; + break; + case CEPH_SPARSE_READ_DATA: + if (sr->sr_index >= count) { + sr->sr_state = CEPH_SPARSE_READ_HDR; + goto next_op; + } + + eoff = sr->sr_extent[sr->sr_index].off; + elen = sr->sr_extent[sr->sr_index].len; + + dout("[%d] ext %d off 0x%llx len 0x%llx\n", + o->o_osd, sr->sr_index, eoff, elen); + + if (elen > INT_MAX) { + dout("Sparse read extent length too long (0x%llx)\n", elen); + return -EREMOTEIO; + } + + /* zero out anything from sr_pos to start of extent */ + if (sr->sr_pos < eoff) + advance_cursor(cursor, eoff - sr->sr_pos, true); + + /* Set position to end of extent */ + sr->sr_pos = eoff + elen; + + /* send back the new length and nullify the ptr */ + cursor->sr_resid = elen; + ret = elen; + *pbuf = NULL; + + /* Bump the array index */ + ++sr->sr_index; + break; + } + return ret; +} + static const struct ceph_connection_operations osd_con_ops = { .get = osd_get_con, .put = osd_put_con, + .sparse_read = osd_sparse_read, .alloc_msg = osd_alloc_msg, .dispatch = osd_dispatch, .fault = osd_fault, From patchwork Tue Apr 5 19:19:36 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802364 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on 
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4D286C433F5 for ; Wed, 6 Apr 2022 04:16:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1447774AbiDFEPP (ORCPT ); Wed, 6 Apr 2022 00:15:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35486 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573552AbiDETWj (ORCPT ); Tue, 5 Apr 2022 15:22:39 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 57AC73CFDB; Tue, 5 Apr 2022 12:20:39 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 0A6DDB81FA4; Tue, 5 Apr 2022 19:20:38 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1BD42C385A0; Tue, 5 Apr 2022 19:20:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186436; bh=y1CHM8STK8OTbioMqhyja91DwJu2qLFOOxUQqODp/bA=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=KKgdKxatD/6nBp3Zy2wQE3CVP9KcHcBNn3W7yivtMJdaCRwY4ooZp1XV0xJHqWBfz qbz8sDTdhukY/sGPLUBHvA/x4w7zwvzJD87Vgs0KTidTFbLURAnWNVSjYY/psun+Ur mnYvFyN7EC8TfEIkpHNRNatHCvQ8rC2pgfPj4AvDpxYKy4ErUqq7/rf66/P5S7mSWT v/Gh/s2vgrIccinHt98aPCL1EJrR6T2kOZlgJ6xa3xQpMa5bQlX9HtfmPkWlZ8gMdQ On3g+1VHlFRp7+t1nTzcug2oxPr5KbQbmcFVMqtTnOF1lwDgjE8a+9MC23FOk+A7yV UxMGTrGm9J3og== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 05/59] libceph: support sparse reads on msgr2 secure codepath Date: Tue, 5 Apr 2022 15:19:36 -0400 Message-Id: 
<20220405192030.178326-6-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add a new init_sgs_pages helper that populates the scatterlist from an arbitrary point in an array of pages. Change setup_message_sgs to take an optional pointer to an array of pages. If that's set, then the scatterlist will be set using that array instead of the cursor. When given a sparse read on a secure connection, decrypt the data in-place rather than into the final destination, by passing it the in_enc_pages array. After decrypting, run the sparse_read state machine in a loop, copying data from the decrypted pages until it's complete. Signed-off-by: Jeff Layton --- net/ceph/messenger_v2.c | 119 ++++++++++++++++++++++++++++++++++++---- 1 file changed, 109 insertions(+), 10 deletions(-) diff --git a/net/ceph/messenger_v2.c b/net/ceph/messenger_v2.c index d527777af584..3dcaee6f8903 100644 --- a/net/ceph/messenger_v2.c +++ b/net/ceph/messenger_v2.c @@ -963,12 +963,48 @@ static void init_sgs_cursor(struct scatterlist **sg, } } +/** + * init_sgs_pages: set up scatterlist on an array of page pointers + * @sg: scatterlist to populate + * @pages: pointer to page array + * @dpos: position in the array to start (bytes) + * @dlen: len to add to sg (bytes) + * @pad: pointer to pad destination (if any) + * + * Populate the scatterlist from the page array, starting at an arbitrary + * byte in the array and running for a specified length. 
+ */ +static void init_sgs_pages(struct scatterlist **sg, struct page **pages, + int dpos, int dlen, u8 *pad) +{ + int idx = dpos >> PAGE_SHIFT; + int off = offset_in_page(dpos); + int resid = dlen; + + do { + int len = min(resid, (int)PAGE_SIZE - off); + + sg_set_page(*sg, pages[idx], len, off); + *sg = sg_next(*sg); + off = 0; + ++idx; + resid -= len; + } while (resid); + + if (need_padding(dlen)) { + sg_set_buf(*sg, pad, padding_len(dlen)); + *sg = sg_next(*sg); + } +} + static int setup_message_sgs(struct sg_table *sgt, struct ceph_msg *msg, u8 *front_pad, u8 *middle_pad, u8 *data_pad, - void *epilogue, bool add_tag) + void *epilogue, struct page **pages, int dpos, + bool add_tag) { struct ceph_msg_data_cursor cursor; struct scatterlist *cur_sg; + int dlen = data_len(msg); int sg_cnt; int ret; @@ -982,9 +1018,15 @@ static int setup_message_sgs(struct sg_table *sgt, struct ceph_msg *msg, if (middle_len(msg)) sg_cnt += calc_sg_cnt(msg->middle->vec.iov_base, middle_len(msg)); - if (data_len(msg)) { - ceph_msg_data_cursor_init(&cursor, msg, data_len(msg)); - sg_cnt += calc_sg_cnt_cursor(&cursor); + if (dlen) { + if (pages) { + sg_cnt += calc_pages_for(dpos, dlen); + if (need_padding(dlen)) + sg_cnt++; + } else { + ceph_msg_data_cursor_init(&cursor, msg, dlen); + sg_cnt += calc_sg_cnt_cursor(&cursor); + } } ret = sg_alloc_table(sgt, sg_cnt, GFP_NOIO); @@ -998,9 +1040,13 @@ static int setup_message_sgs(struct sg_table *sgt, struct ceph_msg *msg, if (middle_len(msg)) init_sgs(&cur_sg, msg->middle->vec.iov_base, middle_len(msg), middle_pad); - if (data_len(msg)) { - ceph_msg_data_cursor_init(&cursor, msg, data_len(msg)); - init_sgs_cursor(&cur_sg, &cursor, data_pad); + if (dlen) { + if (pages) { + init_sgs_pages(&cur_sg, pages, dpos, dlen, data_pad); + } else { + ceph_msg_data_cursor_init(&cursor, msg, dlen); + init_sgs_cursor(&cur_sg, &cursor, data_pad); + } } WARN_ON(!sg_is_last(cur_sg)); @@ -1035,10 +1081,52 @@ static int decrypt_control_remainder(struct 
ceph_connection *con) padded_len(rem_len) + CEPH_GCM_TAG_LEN); } +/* Process sparse read data that lives in a buffer */ +static int process_v2_sparse_read(struct ceph_connection *con, struct page **pages, int spos) +{ + struct ceph_msg_data_cursor *cursor = &con->v2.in_cursor; + int ret; + + for (;;) { + char *buf = NULL; + + ret = con->ops->sparse_read(con, cursor, &buf); + if (ret <= 0) + return ret; + + dout("%s: sparse_read return %x buf %p\n", __func__, ret, buf); + + do { + int idx = spos >> PAGE_SHIFT; + int soff = offset_in_page(spos); + struct page *spage = con->v2.in_enc_pages[idx]; + int len = min_t(int, ret, PAGE_SIZE - soff); + + if (buf) { + memcpy_from_page(buf, spage, soff, len); + buf += len; + } else { + struct bio_vec bv; + + get_bvec_at(cursor, &bv); + len = min_t(int, len, bv.bv_len); + memcpy_page(bv.bv_page, bv.bv_offset, + spage, soff, len); + ceph_msg_data_advance(cursor, len); + } + spos += len; + ret -= len; + } while (ret); + } +} + static int decrypt_tail(struct ceph_connection *con) { struct sg_table enc_sgt = {}; struct sg_table sgt = {}; + struct page **pages = NULL; + bool sparse = con->in_msg->sparse_read; + int dpos = 0; int tail_len; int ret; @@ -1049,9 +1137,14 @@ static int decrypt_tail(struct ceph_connection *con) if (ret) goto out; + if (sparse) { + dpos = padded_len(front_len(con->in_msg) + padded_len(middle_len(con->in_msg))); + pages = con->v2.in_enc_pages; + } + ret = setup_message_sgs(&sgt, con->in_msg, FRONT_PAD(con->v2.in_buf), - MIDDLE_PAD(con->v2.in_buf), DATA_PAD(con->v2.in_buf), - con->v2.in_buf, true); + MIDDLE_PAD(con->v2.in_buf), DATA_PAD(con->v2.in_buf), + con->v2.in_buf, pages, dpos, true); if (ret) goto out; @@ -1061,6 +1154,12 @@ static int decrypt_tail(struct ceph_connection *con) if (ret) goto out; + if (sparse && data_len(con->in_msg)) { + ret = process_v2_sparse_read(con, con->v2.in_enc_pages, dpos); + if (ret) + goto out; + } + WARN_ON(!con->v2.in_enc_page_cnt); 
ceph_release_page_vector(con->v2.in_enc_pages, con->v2.in_enc_page_cnt); @@ -1584,7 +1683,7 @@ static int prepare_message_secure(struct ceph_connection *con) encode_epilogue_secure(con, false); ret = setup_message_sgs(&sgt, con->out_msg, zerop, zerop, zerop, - &con->v2.out_epil, false); + &con->v2.out_epil, NULL, 0, false); if (ret) goto out; From patchwork Tue Apr 5 19:19:37 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802360 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1049EC433F5 for ; Wed, 6 Apr 2022 04:16:42 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1445687AbiDFEOs (ORCPT ); Wed, 6 Apr 2022 00:14:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35496 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573554AbiDETWj (ORCPT ); Tue, 5 Apr 2022 15:22:39 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 632473E0DB; Tue, 5 Apr 2022 12:20:40 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 0AB56B81F6B; Tue, 5 Apr 2022 19:20:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0503FC385A1; Tue, 5 Apr 2022 19:20:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186437; bh=gFp0s2+yOcXsxdx+8FwxESNn106vn66cOvnAIMvKYsk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VsI1yGO5xbYAMRP+i5E4Cubrv3Faw4rfQwY+423rZ6IyILXWi/UVtF9IXBaZ74/cY 
YZEUqMyQk4EMV9V3S3jSi2IAM7DqaPSbNcDfW0Ejr/z/0dJdfQAQ0ixi+kvNt514kE bgAahZ/++1hP+6TyivNQ8X4bEpOxYz4r8ptcfSeE+kHI56pwrnkiLWYe1MXkJcO6Xu VSlJPqyYzNm+VMd+4EQX7NUfEiruPnZ+46ahMdmpGqjGjqmaCijz55i2ea1VlGKnjF jyTCJPP6/RZac+VTbaqI8Kv149wFgV6jpJ8trlqdSOYGunrAt5aD4ANDjKiu8jyDVw Q3NtEK2o+UCZg== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 06/59] libceph: add sparse read support to msgr1 Date: Tue, 5 Apr 2022 15:19:37 -0400 Message-Id: <20220405192030.178326-7-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add 2 new fields to ceph_connection_v1_info to track the necessary info in sparse reads. Skip initializing the cursor for a sparse read. Break out read_partial_message_section into a wrapper around a new read_partial_message_chunk function that doesn't zero out the crc first. Add new helper functions to drive receiving into the destinations provided by the sparse_read state machine. 
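To illustrate why the chunk helper must not reset the crc, here is a minimal user-space sketch, not the kernel code. It assumes a plain bitwise CRC32C with no pre- or post-inversion; crc32c_update(), chunk_crc() and section_crc() are hypothetical names standing in for crc32c(), read_partial_message_chunk() and read_partial_message_section(). Feeding the previous crc back in as the seed lets a checksum span several separately received pieces, which is what the sparse read path relies on.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC32C (Castagnoli, reflected polynomial 0x82F63B78).
 * The caller supplies the running value; nothing is inverted here.
 */
static uint32_t crc32c_update(uint32_t crc, const void *data, size_t len)
{
	const unsigned char *p = data;

	while (len--) {
		crc ^= *p++;
		for (int k = 0; k < 8; k++)
			crc = (crc >> 1) ^ (0x82F63B78u & -(crc & 1u));
	}
	return crc;
}

/* Continue a running checksum (hypothetical analogue of the chunk helper). */
static void chunk_crc(uint32_t *crc, const void *buf, size_t len)
{
	*crc = crc32c_update(*crc, buf, len);
}

/* Start a fresh checksum (hypothetical analogue of the section wrapper). */
static void section_crc(uint32_t *crc, const void *buf, size_t len)
{
	*crc = 0;
	chunk_crc(crc, buf, len);
}
```

Calling chunk_crc() on consecutive pieces of a buffer should give the same value as one section_crc() call over the whole buffer, so a caller that receives an extent in several steps only needs to carry the crc forward between calls.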
Signed-off-by: Jeff Layton --- include/linux/ceph/messenger.h | 4 ++ net/ceph/messenger_v1.c | 98 +++++++++++++++++++++++++++++++--- 2 files changed, 94 insertions(+), 8 deletions(-) diff --git a/include/linux/ceph/messenger.h b/include/linux/ceph/messenger.h index 7f09a4213834..f4adbfee56d5 100644 --- a/include/linux/ceph/messenger.h +++ b/include/linux/ceph/messenger.h @@ -337,6 +337,10 @@ struct ceph_connection_v1_info { int in_base_pos; /* bytes read */ + /* sparse reads */ + struct kvec in_sr_kvec; /* current location to receive into */ + u64 in_sr_len; /* amount of data in this extent */ + /* message in temps */ u8 in_tag; /* protocol control byte */ struct ceph_msg_header in_hdr; diff --git a/net/ceph/messenger_v1.c b/net/ceph/messenger_v1.c index 6b014eca3a13..bf385e458a01 100644 --- a/net/ceph/messenger_v1.c +++ b/net/ceph/messenger_v1.c @@ -160,9 +160,9 @@ static size_t sizeof_footer(struct ceph_connection *con) static void prepare_message_data(struct ceph_msg *msg, u32 data_len) { - /* Initialize data cursor */ - - ceph_msg_data_cursor_init(&msg->cursor, msg, data_len); + /* Initialize data cursor if it's not a sparse read */ + if (!msg->sparse_read) + ceph_msg_data_cursor_init(&msg->cursor, msg, data_len); } /* @@ -967,9 +967,9 @@ static void process_ack(struct ceph_connection *con) prepare_read_tag(con); } -static int read_partial_message_section(struct ceph_connection *con, - struct kvec *section, - unsigned int sec_len, u32 *crc) +static int read_partial_message_chunk(struct ceph_connection *con, + struct kvec *section, + unsigned int sec_len, u32 *crc) { int ret, left; @@ -985,11 +985,91 @@ static int read_partial_message_section(struct ceph_connection *con, section->iov_len += ret; } if (section->iov_len == sec_len) - *crc = crc32c(0, section->iov_base, section->iov_len); + *crc = crc32c(*crc, section->iov_base, section->iov_len); return 1; } +static inline int read_partial_message_section(struct ceph_connection *con, + struct kvec *section, + 
unsigned int sec_len, u32 *crc) +{ + *crc = 0; + return read_partial_message_chunk(con, section, sec_len, crc); +} + +static int read_sparse_msg_extent(struct ceph_connection *con, u32 *crc) +{ + struct ceph_msg_data_cursor *cursor = &con->in_msg->cursor; + bool do_bounce = ceph_test_opt(from_msgr(con->msgr), RXBOUNCE); + + if (do_bounce && unlikely(!con->bounce_page)) { + con->bounce_page = alloc_page(GFP_NOIO); + if (!con->bounce_page) { + pr_err("failed to allocate bounce page\n"); + return -ENOMEM; + } + } + + while (cursor->sr_resid > 0) { + struct page *page, *rpage; + size_t off, len; + int ret; + + page = ceph_msg_data_next(cursor, &off, &len, NULL); + rpage = do_bounce ? con->bounce_page : page; + + /* clamp to what remains in extent */ + len = min_t(int, len, cursor->sr_resid); + ret = ceph_tcp_recvpage(con->sock, rpage, (int)off, len); + if (ret <= 0) + return ret; + *crc = ceph_crc32c_page(*crc, rpage, off, ret); + ceph_msg_data_advance(cursor, (size_t)ret); + cursor->sr_resid -= ret; + if (do_bounce) + memcpy_page(page, off, rpage, off, ret); + } + return 1; +} + +static int read_sparse_msg_data(struct ceph_connection *con) +{ + struct ceph_msg_data_cursor *cursor = &con->in_msg->cursor; + bool do_datacrc = !ceph_test_opt(from_msgr(con->msgr), NOCRC); + u32 crc = 0; + int ret = 1; + + if (do_datacrc) + crc = con->in_data_crc; + + do { + if (con->v1.in_sr_kvec.iov_base) + ret = read_partial_message_chunk(con, + &con->v1.in_sr_kvec, + con->v1.in_sr_len, + &crc); + else if (cursor->sr_resid > 0) + ret = read_sparse_msg_extent(con, &crc); + + if (ret <= 0) { + if (do_datacrc) + con->in_data_crc = crc; + return ret; + } + + memset(&con->v1.in_sr_kvec, 0, sizeof(con->v1.in_sr_kvec)); + ret = con->ops->sparse_read(con, cursor, + (char **)&con->v1.in_sr_kvec.iov_base); + con->v1.in_sr_len = ret; + } while (ret > 0); + + if (do_datacrc) + con->in_data_crc = crc; + + return ret < 0 ? 
ret : 1; /* must return > 0 to indicate success */ +} + static int read_partial_msg_data(struct ceph_connection *con) { struct ceph_msg_data_cursor *cursor = &con->in_msg->cursor; @@ -1180,7 +1260,9 @@ static int read_partial_message(struct ceph_connection *con) if (!m->num_data_items) return -EIO; - if (ceph_test_opt(from_msgr(con->msgr), RXBOUNCE)) + if (m->sparse_read) + ret = read_sparse_msg_data(con); + else if (ceph_test_opt(from_msgr(con->msgr), RXBOUNCE)) ret = read_partial_msg_data_bounce(con); else ret = read_partial_msg_data(con); From patchwork Tue Apr 5 19:19:38 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802354 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 72320C43217 for ; Wed, 6 Apr 2022 04:16:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1385435AbiDFENb (ORCPT ); Wed, 6 Apr 2022 00:13:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35636 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573556AbiDETWl (ORCPT ); Tue, 5 Apr 2022 15:22:41 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 180DB3CFDB; Tue, 5 Apr 2022 12:20:41 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id CCC7BB81FA8; Tue, 5 Apr 2022 19:20:39 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E21C4C385A5; Tue, 5 Apr 2022 19:20:37 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=kernel.org; s=k20201202; t=1649186438; bh=IDUUHHm/lC7WrXghg2b9EU8N6n8Ce+tfjO41Cd6bRik=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dhktCP+xu/cqLvctOE/rx1gdsCr5pDA7kMHOPUawFFzNQfiFv9eku0wq7nUhNfN4H H6IE85bgpbP9jy75hpLr21nKoUiYGMrGmBsH+4tl414GeR/UEbCpJe9BgHZ7xqGh10 vAfE1pUQ8eYuFRy2AeCj7Y7GV1+iuBI0Leh7W8pMMMbGTCWi9pNi5xcpPoEjxAMnR6 4PApdZBTvZig6xzbiLkHQqBQR7wC9Wkcb0qAD18S82yLidU9dPRcTy4+WDlWQsi0DL IHkkYnatKiF9238Rx8z2NA+4zIN6jVPWzxRHmpao2ZZQScYKkeEy6r0PrKNat8eBU4 bvEIRishb0cCQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 07/59] ceph: add new mount option to enable sparse reads Date: Tue, 5 Apr 2022 15:19:38 -0400 Message-Id: <20220405192030.178326-8-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add a new mount option that has the client issue sparse reads instead of normal ones. The callers now preallocate a sparse extent buffer that the libceph receive code can populate and hand back after the operation completes. After a successful sparse read, we can't use the req->r_result value to determine the amount of data "read", so instead we derive the received length from the end of the last extent in the buffer. Any interstitial holes will have been filled by the receive code. 
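As a rough illustration of the length calculation described above, here is a user-space sketch under stated assumptions. struct extent and ext_map_end() are hypothetical stand-ins; the in-kernel helper that this patch calls, ceph_sparse_ext_map_end(), is defined elsewhere in the series and may differ in detail. The sketch assumes extents are sorted by offset and that the length is measured from the start of the original request.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the sparse extent map. */
struct extent {
	uint64_t off;	/* file offset where the extent starts */
	uint64_t len;	/* number of data bytes in the extent */
};

/*
 * Length "read", measured from the request offset to the end of the
 * last extent.  An empty map means no data came back at all.
 */
static uint64_t ext_map_end(uint64_t req_off, const struct extent *ext,
			    size_t cnt)
{
	if (cnt == 0)
		return 0;
	return ext[cnt - 1].off + ext[cnt - 1].len - req_off;
}
```

For a request starting at offset 4096 that returns extents (4096, 512) and (8192, 1024), this gives 8192 + 1024 - 4096 = 5120 bytes. The hole between the two extents counts toward that length because the receive code has already zero-filled it.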
Reviewed-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/addr.c | 17 +++++++++++++++-- fs/ceph/file.c | 51 +++++++++++++++++++++++++++++++++++++++++-------- fs/ceph/super.c | 16 +++++++++++++++- fs/ceph/super.h | 1 + 4 files changed, 74 insertions(+), 11 deletions(-) diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c index aa25bffd4823..99021431a391 100644 --- a/fs/ceph/addr.c +++ b/fs/ceph/addr.c @@ -219,8 +219,10 @@ static void finish_netfs_read(struct ceph_osd_request *req) struct ceph_fs_client *fsc = ceph_inode_to_client(req->r_inode); struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0); struct netfs_io_subrequest *subreq = req->r_priv; + struct ceph_osd_req_op *op = &req->r_ops[0]; int num_pages; int err = req->r_result; + bool sparse = (op->op == CEPH_OSD_OP_SPARSE_READ); ceph_update_read_metrics(&fsc->mdsc->metric, req->r_start_latency, req->r_end_latency, osd_data->length, err); @@ -229,7 +231,9 @@ static void finish_netfs_read(struct ceph_osd_request *req) subreq->len, i_size_read(req->r_inode)); /* no object means success but no data */ - if (err == -ENOENT) + if (sparse && err >= 0) + err = ceph_sparse_ext_map_end(op); + else if (err == -ENOENT) err = 0; else if (err == -EBLOCKLISTED) fsc->blocklisted = true; @@ -310,13 +314,14 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) size_t page_off; int err = 0; u64 len = subreq->len; + bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); if (ci->i_inline_version != CEPH_INLINE_NONE && ceph_netfs_issue_op_inline(subreq)) return; req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, subreq->start, &len, - 0, 1, CEPH_OSD_OP_READ, + 0, 1, sparse ? 
CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ | fsc->client->osdc.client->options->read_from_replica, NULL, ci->i_truncate_seq, ci->i_truncate_size, false); if (IS_ERR(req)) { @@ -325,6 +330,14 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq) goto out; } + if (sparse) { + err = ceph_alloc_sparse_ext_map(&req->r_ops[0]); + if (err) { + ceph_osdc_put_request(req); + goto out; + } + } + dout("%s: pos=%llu orig_len=%zu len=%llu\n", __func__, subreq->start, subreq->len, len); iov_iter_xarray(&iter, READ, &rreq->mapping->i_pages, subreq->start, len); err = iov_iter_get_pages_alloc(&iter, &pages, len, &page_off); diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 6c9e837aa1d3..9ee6e92bfed0 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -905,6 +905,7 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, u64 off = iocb->ki_pos; u64 len = iov_iter_count(to); u64 i_size = i_size_read(inode); + bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); dout("sync_read on file %p %llu~%u %s\n", file, off, (unsigned)len, (file->f_flags & O_DIRECT) ? "O_DIRECT" : ""); @@ -931,10 +932,12 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, bool more; int idx; size_t left; + struct ceph_osd_req_op *op; req = ceph_osdc_new_request(osdc, &ci->i_layout, ci->i_vino, off, &len, 0, 1, - CEPH_OSD_OP_READ, CEPH_OSD_FLAG_READ, + sparse ? 
CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ, + CEPH_OSD_FLAG_READ, NULL, ci->i_truncate_seq, ci->i_truncate_size, false); if (IS_ERR(req)) { @@ -955,6 +958,16 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, osd_req_op_extent_osd_data_pages(req, 0, pages, len, page_off, false, false); + + op = &req->r_ops[0]; + if (sparse) { + ret = ceph_alloc_sparse_ext_map(op); + if (ret) { + ceph_osdc_put_request(req); + break; + } + } + ret = ceph_osdc_start_request(osdc, req, false); if (!ret) ret = ceph_osdc_wait_request(osdc, req); @@ -964,19 +977,24 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, req->r_end_latency, len, ret); - ceph_osdc_put_request(req); - i_size = i_size_read(inode); dout("sync_read %llu~%llu got %zd i_size %llu%s\n", off, len, ret, i_size, (more ? " MORE" : "")); - if (ret == -ENOENT) + /* Fix it to go to end of extent map */ + if (sparse && ret >= 0) + ret = ceph_sparse_ext_map_end(op); + else if (ret == -ENOENT) ret = 0; + + ceph_osdc_put_request(req); + if (ret >= 0 && ret < len && (off + ret < i_size)) { int zlen = min(len - ret, i_size - off - ret); int zoff = page_off + ret; + dout("sync_read zero gap %llu~%llu\n", - off + ret, off + ret + zlen); + off + ret, off + ret + zlen); ceph_zero_page_vector_range(zoff, zlen, pages); ret += zlen; } @@ -1095,8 +1113,10 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req) struct inode *inode = req->r_inode; struct ceph_aio_request *aio_req = req->r_priv; struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0); + struct ceph_osd_req_op *op = &req->r_ops[0]; struct ceph_client_metric *metric = &ceph_sb_to_mdsc(inode->i_sb)->metric; unsigned int len = osd_data->bvec_pos.iter.bi_size; + bool sparse = (op->op == CEPH_OSD_OP_SPARSE_READ); BUG_ON(osd_data->type != CEPH_OSD_DATA_TYPE_BVECS); BUG_ON(!osd_data->num_bvecs); @@ -1117,6 +1137,8 @@ static void ceph_aio_complete_req(struct ceph_osd_request *req) } rc = -ENOMEM; } else if 
(!aio_req->write) { + if (sparse && rc >= 0) + rc = ceph_sparse_ext_map_end(op); if (rc == -ENOENT) rc = 0; if (rc >= 0 && len > rc) { @@ -1253,6 +1275,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, loff_t pos = iocb->ki_pos; bool write = iov_iter_rw(iter) == WRITE; bool should_dirty = !write && iter_is_iovec(iter); + bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); if (write && ceph_snap(file_inode(file)) != CEPH_NOSNAP) return -EROFS; @@ -1280,6 +1303,8 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, while (iov_iter_count(iter) > 0) { u64 size = iov_iter_count(iter); ssize_t len; + struct ceph_osd_req_op *op; + int readop = sparse ? CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ; if (write) size = min_t(u64, size, fsc->mount_options->wsize); @@ -1290,8 +1315,7 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, pos, &size, 0, 1, - write ? CEPH_OSD_OP_WRITE : - CEPH_OSD_OP_READ, + write ? 
CEPH_OSD_OP_WRITE : readop, flags, snapc, ci->i_truncate_seq, ci->i_truncate_size, @@ -1342,6 +1366,14 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, } osd_req_op_extent_osd_data_bvecs(req, 0, bvecs, num_pages, len); + op = &req->r_ops[0]; + if (sparse) { + ret = ceph_alloc_sparse_ext_map(op); + if (ret) { + ceph_osdc_put_request(req); + break; + } + } if (aio_req) { aio_req->total_len += len; @@ -1370,8 +1402,11 @@ ceph_direct_read_write(struct kiocb *iocb, struct iov_iter *iter, size = i_size_read(inode); if (!write) { - if (ret == -ENOENT) + if (sparse && ret >= 0) + ret = ceph_sparse_ext_map_end(op); + else if (ret == -ENOENT) ret = 0; + if (ret >= 0 && ret < len && pos + ret < size) { struct iov_iter i; int zlen = min_t(size_t, len - ret, diff --git a/fs/ceph/super.c b/fs/ceph/super.c index e6987d295079..26d924dda721 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -163,6 +163,7 @@ enum { Opt_copyfrom, Opt_wsync, Opt_pagecache, + Opt_sparseread, }; enum ceph_recover_session_mode { @@ -205,6 +206,7 @@ static const struct fs_parameter_spec ceph_mount_parameters[] = { fsparam_u32 ("wsize", Opt_wsize), fsparam_flag_no ("wsync", Opt_wsync), fsparam_flag_no ("pagecache", Opt_pagecache), + fsparam_flag_no ("sparseread", Opt_sparseread), {} }; @@ -574,6 +576,12 @@ static int ceph_parse_mount_param(struct fs_context *fc, else fsopt->flags &= ~CEPH_MOUNT_OPT_NOPAGECACHE; break; + case Opt_sparseread: + if (result.negated) + fsopt->flags &= ~CEPH_MOUNT_OPT_SPARSEREAD; + else + fsopt->flags |= CEPH_MOUNT_OPT_SPARSEREAD; + break; default: BUG(); } @@ -708,9 +716,10 @@ static int ceph_show_options(struct seq_file *m, struct dentry *root) if (!(fsopt->flags & CEPH_MOUNT_OPT_ASYNC_DIROPS)) seq_puts(m, ",wsync"); - if (fsopt->flags & CEPH_MOUNT_OPT_NOPAGECACHE) seq_puts(m, ",nopagecache"); + if (fsopt->flags & CEPH_MOUNT_OPT_SPARSEREAD) + seq_puts(m, ",sparseread"); if (fsopt->wsize != CEPH_MAX_WRITE_SIZE) seq_printf(m, ",wsize=%u", fsopt->wsize); @@ 
-1290,6 +1299,11 @@ static int ceph_reconfigure_fc(struct fs_context *fc) else ceph_clear_mount_opt(fsc, ASYNC_DIROPS); + if (fsopt->flags & CEPH_MOUNT_OPT_SPARSEREAD) + ceph_set_mount_opt(fsc, SPARSEREAD); + else + ceph_clear_mount_opt(fsc, SPARSEREAD); + if (strcmp_null(fsc->mount_options->mon_addr, fsopt->mon_addr)) { kfree(fsc->mount_options->mon_addr); fsc->mount_options->mon_addr = fsopt->mon_addr; diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 73db7f6021f3..80a2399f2878 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -41,6 +41,7 @@ #define CEPH_MOUNT_OPT_NOCOPYFROM (1<<14) /* don't use RADOS 'copy-from' op */ #define CEPH_MOUNT_OPT_ASYNC_DIROPS (1<<15) /* allow async directory ops */ #define CEPH_MOUNT_OPT_NOPAGECACHE (1<<16) /* bypass pagecache altogether */ +#define CEPH_MOUNT_OPT_SPARSEREAD (1<<17) /* always do sparse reads */ #define CEPH_MOUNT_OPT_DEFAULT \ (CEPH_MOUNT_OPT_DCACHE | \ From patchwork Tue Apr 5 19:19:39 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802338 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id A01DDC433FE for ; Wed, 6 Apr 2022 04:05:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S234327AbiDFEH0 (ORCPT ); Wed, 6 Apr 2022 00:07:26 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35638 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573557AbiDETWl (ORCPT ); Tue, 5 Apr 2022 15:22:41 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B844F3F899; Tue, 5 Apr 2022 12:20:42 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 
with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 2623ECE1FB8; Tue, 5 Apr 2022 19:20:41 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id CA51CC385A0; Tue, 5 Apr 2022 19:20:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186439; bh=aGm/yYftJrMv8Wm0z7akYN9TTrAtUh82JqGdEA2ekKU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=NKs1Yjzs98L8zGRRD0FwzP0xrAX0EHlKhwSsFbQXrX9BBZav9La1egDgcBTZ/jvwi NK2BWI0205CvUtsRTaOKah9SErtfgrKJby9ltId9RE8vBdZLKVVkNiZXr7HifE+w7i /3OgKwab9ZpJH2mHMUfrLu59chPIn6/KRr9Ae85+Qpz41FGwqurBQYvkDDWITlUFQ9 G2i3w9IMdIS+wkv/c9dcmPAEg8ksaNnbS1ywDpp3BrW/AXFJQDhgYIdK+Wts5KG0tq I4bbEgTFPNoo4wOa3asis/p4fI/3Rp0XZNu8ZRPP6WBwfOr6rUkLNJbcXhXBq0bq3p jEMuvqeP5AhzQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de, Al Viro Subject: [PATCH v13 08/59] fs: change test in inode_insert5 for adding to the sb list Date: Tue, 5 Apr 2022 15:19:39 -0400 Message-Id: <20220405192030.178326-9-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org The inode_insert5 currently looks at I_CREATING to decide whether to insert the inode into the sb list. This test is a bit ambiguous though as I_CREATING state is not directly related to that list. This test is also problematic for some upcoming ceph changes to add fscrypt support. We need to be able to allocate an inode using new_inode and insert it into the hash later if we end up using it, and doing that now means that we double add it and corrupt the list. 
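The list behaviour at issue can be sketched in userspace. This is only an illustration under stated assumptions: the struct and helper names below (init_list_head, list_is_empty, list_add_node, list_del_init_node, add_once) are hypothetical stand-ins that mimic the kernel's circular doubly linked list, not the kernel's own code. A node that was self-initialized, or removed with a "delete and reinitialize" helper, reports as empty, so an "add only if empty" check cannot add the same node twice.

```c
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's struct list_head. */
struct list_head {
	struct list_head *next, *prev;
};

/* A node that points at itself is "empty", i.e. on no list. */
static void init_list_head(struct list_head *h)
{
	h->next = h;
	h->prev = h;
}

static bool list_is_empty(const struct list_head *h)
{
	return h->next == h;
}

/* Insert n right after head. */
static void list_add_node(struct list_head *n, struct list_head *head)
{
	n->next = head->next;
	n->prev = head;
	head->next->prev = n;
	head->next = n;
}

/* Unlink n and leave it self-pointing, so list_is_empty(n) is true again. */
static void list_del_init_node(struct list_head *n)
{
	n->prev->next = n->next;
	n->next->prev = n->prev;
	init_list_head(n);
}

/* Add n to sb_list only if it is not already on a list. */
static bool add_once(struct list_head *n, struct list_head *sb_list)
{
	if (!list_is_empty(n))
		return false;
	list_add_node(n, sb_list);
	return true;
}
```

Calling add_once twice on the same node links it only once. The check is only meaningful if the node is always initialized before first use and only ever removed with the reinitializing delete.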
What we really want to know in this test is whether the inode is already in its superblock list, and then add it if it isn't. Have it test for list_empty instead and ensure that we always initialize the list by doing it in inode_init_once. It's only ever removed from the list with list_del_init, so that should be sufficient. Suggested-by: Al Viro Signed-off-by: Jeff Layton --- fs/inode.c | 11 ++++++++--- 1 file changed, 8 insertions(+), 3 deletions(-) diff --git a/fs/inode.c b/fs/inode.c index 9d9b422504d1..743420a55e5f 100644 --- a/fs/inode.c +++ b/fs/inode.c @@ -422,6 +422,7 @@ void inode_init_once(struct inode *inode) INIT_LIST_HEAD(&inode->i_io_list); INIT_LIST_HEAD(&inode->i_wb_list); INIT_LIST_HEAD(&inode->i_lru); + INIT_LIST_HEAD(&inode->i_sb_list); __address_space_init_once(&inode->i_data); i_size_ordered_init(inode); } @@ -1021,7 +1022,6 @@ struct inode *new_inode_pseudo(struct super_block *sb) spin_lock(&inode->i_lock); inode->i_state = 0; spin_unlock(&inode->i_lock); - INIT_LIST_HEAD(&inode->i_sb_list); } return inode; } @@ -1165,7 +1165,6 @@ struct inode *inode_insert5(struct inode *inode, unsigned long hashval, { struct hlist_head *head = inode_hashtable + hash(inode->i_sb, hashval); struct inode *old; - bool creating = inode->i_state & I_CREATING; again: spin_lock(&inode_hash_lock); @@ -1199,7 +1198,13 @@ struct inode *inode_insert5(struct inode *inode, unsigned long hashval, inode->i_state |= I_NEW; hlist_add_head_rcu(&inode->i_hash, head); spin_unlock(&inode->i_lock); - if (!creating) + + /* + * Add it to the list if it wasn't already in, + * e.g. new_inode. We hold I_NEW at this point, so + * we should be safe to test i_sb_list locklessly. 
+ */ + if (list_empty(&inode->i_sb_list)) inode_sb_list_add(inode); unlock: spin_unlock(&inode_hash_lock); From patchwork Tue Apr 5 19:19:40 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802336 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 21C78C43217 for ; Wed, 6 Apr 2022 04:05:15 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S233134AbiDFEGz (ORCPT ); Wed, 6 Apr 2022 00:06:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35578 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573555AbiDETWl (ORCPT ); Tue, 5 Apr 2022 15:22:41 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 477353CFCD; Tue, 5 Apr 2022 12:20:42 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id D4AA7B81FA5; Tue, 5 Apr 2022 19:20:40 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C5A20C385A1; Tue, 5 Apr 2022 19:20:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186440; bh=u6cpxI3/hvGaiyZFCeu55d6yaSo4kUHVnzeT4mNgI6I=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nxmTSsYbQna7Zx3YoDwuUWRwNf5AdGwzzK96nh63/QltKFEnpr2PwhwuQyaEUKrfI VmN1uf1EXU+xTBMcM0xQIkGQnrfebS6qxczBYaXfBy7+dgrogjZbb+oJsQZ8G99/zt m1UAohAn5Z6Lt9yEoSz50mbQaNjV+zuK6QWz+GhsADdTJyysdJ5gR2cH+b4og7vRV1 SdLTIIHrHWnG9boNnJWQSHXaeMf+A5N7IhyWG8cQShe7U6MSY4t2bUF3whg6EmHHvW 
6G0JUFKiVb0hUoWhBOiPiVVyu7yXfDEKf196oeE7VlGUSoDz1KKw9riH3VkX/NJ/HF mVoj8euy/4W5Q== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de, Eric Biggers Subject: [PATCH v13 09/59] fscrypt: export fscrypt_base64url_encode and fscrypt_base64url_decode Date: Tue, 5 Apr 2022 15:19:40 -0400 Message-Id: <20220405192030.178326-10-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Ceph is going to add fscrypt support, but we still want encrypted filenames to be composed of printable characters, so we can maintain compatibility with clients that don't support fscrypt. We could just adopt fscrypt's current nokey name format, but that is subject to change in the future, and it also contains dirhash fields that we don't need for cephfs. Because of this, we're going to concoct our own scheme for encoding encrypted filenames. It's very similar to fscrypt's current scheme, but doesn't bother with the dirhash fields. The ceph encoding scheme will use base64 encoding as well, and we also want it to avoid characters that are illegal in filenames. Export the fscrypt base64 encoding/decoding routines so we can use them in ceph's fscrypt implementation. 
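As a rough illustration of why base64url output is safe to use in filenames, here is a userspace sketch of an unpadded base64url encoder. It is modelled on the fragments visible in the diff below, but it is not the kernel function: sketch_base64url_encode and SKETCH_BASE64URL_CHARS are hypothetical names, and the loop body is an assumption filled in around the lines the diff context shows.

```c
#include <stdint.h>

/* URL-safe alphabet: '-' and '_' replace '+' and '/', so no '/' can appear. */
static const char sketch_b64url_table[65] =
	"ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_";

/* Output length (sans NUL) for nbytes of input: ceil(nbytes * 4 / 3). */
#define SKETCH_BASE64URL_CHARS(nbytes) (((nbytes) * 4 + 2) / 3)

/*
 * Encode srclen bytes from src into dst without '=' padding.
 * dst must have room for SKETCH_BASE64URL_CHARS(srclen) bytes.
 * Returns the number of characters written; dst is not NUL-terminated.
 */
static int sketch_base64url_encode(const uint8_t *src, int srclen, char *dst)
{
	uint32_t ac = 0;	/* bit accumulator */
	int bits = 0;		/* number of not-yet-emitted bits in ac */
	char *cp = dst;
	int i;

	for (i = 0; i < srclen; i++) {
		ac = (ac << 8) | src[i];
		bits += 8;
		/* Emit every complete 6-bit group. */
		do {
			bits -= 6;
			*cp++ = sketch_b64url_table[(ac >> bits) & 0x3f];
		} while (bits >= 6);
	}
	/* Emit the leftover bits, left-aligned in a final 6-bit group. */
	if (bits)
		*cp++ = sketch_b64url_table[(ac << (6 - bits)) & 0x3f];
	return cp - dst;
}
```

For a 3-byte input the sketch emits 4 characters, and for 1 or 2 bytes it emits 2 or 3, which matches the DIV_ROUND_UP((nbytes) * 4, 3) sizing macro that the patch moves into the public header.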
Acked-by: Eric Biggers Signed-off-by: Jeff Layton --- fs/crypto/fname.c | 8 ++++---- include/linux/fscrypt.h | 5 +++++ 2 files changed, 9 insertions(+), 4 deletions(-) diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c index a9be4bc74a94..1e4233c95005 100644 --- a/fs/crypto/fname.c +++ b/fs/crypto/fname.c @@ -182,8 +182,6 @@ static int fname_decrypt(const struct inode *inode, static const char base64url_table[65] = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-_"; -#define FSCRYPT_BASE64URL_CHARS(nbytes) DIV_ROUND_UP((nbytes) * 4, 3) - /** * fscrypt_base64url_encode() - base64url-encode some binary data * @src: the binary data to encode @@ -198,7 +196,7 @@ static const char base64url_table[65] = * Return: the length of the resulting base64url-encoded string in bytes. * This will be equal to FSCRYPT_BASE64URL_CHARS(srclen). */ -static int fscrypt_base64url_encode(const u8 *src, int srclen, char *dst) +int fscrypt_base64url_encode(const u8 *src, int srclen, char *dst) { u32 ac = 0; int bits = 0; @@ -217,6 +215,7 @@ static int fscrypt_base64url_encode(const u8 *src, int srclen, char *dst) *cp++ = base64url_table[(ac << (6 - bits)) & 0x3f]; return cp - dst; } +EXPORT_SYMBOL_GPL(fscrypt_base64url_encode); /** * fscrypt_base64url_decode() - base64url-decode a string @@ -233,7 +232,7 @@ static int fscrypt_base64url_encode(const u8 *src, int srclen, char *dst) * Return: the length of the resulting decoded binary data in bytes, * or -1 if the string isn't a valid base64url string. 
*/ -static int fscrypt_base64url_decode(const char *src, int srclen, u8 *dst) +int fscrypt_base64url_decode(const char *src, int srclen, u8 *dst) { u32 ac = 0; int bits = 0; @@ -256,6 +255,7 @@ static int fscrypt_base64url_decode(const char *src, int srclen, u8 *dst) return -1; return bp - dst; } +EXPORT_SYMBOL_GPL(fscrypt_base64url_decode); bool fscrypt_fname_encrypted_size(const union fscrypt_policy *policy, u32 orig_len, u32 max_len, diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h index 50d92d805bd8..629ccd09e095 100644 --- a/include/linux/fscrypt.h +++ b/include/linux/fscrypt.h @@ -46,6 +46,9 @@ struct fscrypt_name { /* Maximum value for the third parameter of fscrypt_operations.set_context(). */ #define FSCRYPT_SET_CONTEXT_MAX_SIZE 40 +/* len of resulting string (sans NUL terminator) after base64 encoding nbytes */ +#define FSCRYPT_BASE64URL_CHARS(nbytes) DIV_ROUND_UP((nbytes) * 4, 3) + #ifdef CONFIG_FS_ENCRYPTION /* @@ -305,6 +308,8 @@ void fscrypt_free_inode(struct inode *inode); int fscrypt_drop_inode(struct inode *inode); /* fname.c */ +int fscrypt_base64url_encode(const u8 *src, int len, char *dst); +int fscrypt_base64url_decode(const char *src, int len, u8 *dst); int fscrypt_setup_filename(struct inode *inode, const struct qstr *iname, int lookup, struct fscrypt_name *fname); From patchwork Tue Apr 5 19:19:41 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802356 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4B99BC433F5 for ; Wed, 6 Apr 2022 04:16:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1390601AbiDFEOG (ORCPT ); Wed, 6 Apr 2022 00:14:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35780 "EHLO 
lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573558AbiDETWn (ORCPT ); Tue, 5 Apr 2022 15:22:43 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A87644091B; Tue, 5 Apr 2022 12:20:44 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 123EACE1FB7; Tue, 5 Apr 2022 19:20:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C385AC385A5; Tue, 5 Apr 2022 19:20:40 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186441; bh=VavI1xtUqjkJNyDAit+vwUdU1iO1gYw4vnwxHnE4eZo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=F02a4y98R/WYeJZOnGcn1AJn7z5GbL1pSyDrlpN2Ut7UTvTP0SiWeqP97gKYizkFq +qUwBrNobxeg/XX4c8ccrrXr2R+3AtwFDmKZfNgW8xAktn9HPGBtfOwx4nhKWcTtFx +nx/1zZ5gpudtUCWEKpdARBi8ASsoQjDbak0arHoYHDKd2jjOL/cz1luz6vyCYq9cr e1cox2Nl1Sd8IkjulzRYRwTO8yrbk3mM5X0402aJklwBuuiHb1yQ0eOmB+zkp93tiD A30F01ofcAiJsTRGONWR/e1G6zNqWGmlzILQsqcXyVDBi6hqRke6+P5lah8DON0iVK hXHYHmGRMSUfA== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de, Eric Biggers Subject: [PATCH v13 10/59] fscrypt: export fscrypt_fname_encrypt and fscrypt_fname_encrypted_size Date: Tue, 5 Apr 2022 15:19:41 -0400 Message-Id: <20220405192030.178326-11-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org For ceph, we want to use our own scheme for handling filenames that are longer than NAME_MAX after
encryption and Base64 encoding. This allows us to have a consistent view of the encrypted filenames for clients that don't support fscrypt and clients that do but that don't have the key. Currently, fs/crypto only supports encrypting filenames using fscrypt_setup_filename, but that also handles encoding nokey names. Ceph can't use that because it handles nokey names in a different way. Export fscrypt_fname_encrypt. Rename fscrypt_fname_encrypted_size to __fscrypt_fname_encrypted_size and add a new wrapper called fscrypt_fname_encrypted_size that takes an inode argument rather than a pointer to a fscrypt_policy union. Acked-by: Eric Biggers Signed-off-by: Jeff Layton --- fs/crypto/fname.c | 36 ++++++++++++++++++++++++++++++------ fs/crypto/fscrypt_private.h | 9 +++------ fs/crypto/hooks.c | 6 +++--- include/linux/fscrypt.h | 4 ++++ 4 files changed, 40 insertions(+), 15 deletions(-) diff --git a/fs/crypto/fname.c b/fs/crypto/fname.c index 1e4233c95005..77d38188a168 100644 --- a/fs/crypto/fname.c +++ b/fs/crypto/fname.c @@ -79,7 +79,8 @@ static inline bool fscrypt_is_dot_dotdot(const struct qstr *str) /** * fscrypt_fname_encrypt() - encrypt a filename * @inode: inode of the parent directory (for regular filenames) - * or of the symlink (for symlink targets) + * or of the symlink (for symlink targets). Key must already be + * set up. * @iname: the filename to encrypt * @out: (output) the encrypted filename * @olen: size of the encrypted filename. It must be at least @iname->len. 
@@ -130,6 +131,7 @@ int fscrypt_fname_encrypt(const struct inode *inode, const struct qstr *iname, return 0; } +EXPORT_SYMBOL_GPL(fscrypt_fname_encrypt); /** * fname_decrypt() - decrypt a filename @@ -257,9 +259,9 @@ int fscrypt_base64url_decode(const char *src, int srclen, u8 *dst) } EXPORT_SYMBOL_GPL(fscrypt_base64url_decode); -bool fscrypt_fname_encrypted_size(const union fscrypt_policy *policy, - u32 orig_len, u32 max_len, - u32 *encrypted_len_ret) +bool __fscrypt_fname_encrypted_size(const union fscrypt_policy *policy, + u32 orig_len, u32 max_len, + u32 *encrypted_len_ret) { int padding = 4 << (fscrypt_policy_flags(policy) & FSCRYPT_POLICY_FLAGS_PAD_MASK); @@ -273,6 +275,29 @@ bool fscrypt_fname_encrypted_size(const union fscrypt_policy *policy, return true; } +/** + * fscrypt_fname_encrypted_size() - calculate length of encrypted filename + * @inode: parent inode of dentry name being encrypted. Key must + * already be set up. + * @orig_len: length of the original filename + * @max_len: maximum length to return + * @encrypted_len_ret: where calculated length should be returned (on success) + * + * Filenames that are shorter than the maximum length may have their lengths + * increased slightly by encryption, due to padding that is applied. + * + * Return: false if the orig_len is greater than max_len. Otherwise, true and + * fill out encrypted_len_ret with the length (up to max_len). 
+ */ +bool fscrypt_fname_encrypted_size(const struct inode *inode, u32 orig_len, + u32 max_len, u32 *encrypted_len_ret) +{ + return __fscrypt_fname_encrypted_size(&inode->i_crypt_info->ci_policy, + orig_len, max_len, + encrypted_len_ret); +} +EXPORT_SYMBOL_GPL(fscrypt_fname_encrypted_size); + /** * fscrypt_fname_alloc_buffer() - allocate a buffer for presented filenames * @max_encrypted_len: maximum length of encrypted filenames the buffer will be @@ -428,8 +453,7 @@ int fscrypt_setup_filename(struct inode *dir, const struct qstr *iname, return ret; if (fscrypt_has_encryption_key(dir)) { - if (!fscrypt_fname_encrypted_size(&dir->i_crypt_info->ci_policy, - iname->len, NAME_MAX, + if (!fscrypt_fname_encrypted_size(dir, iname->len, NAME_MAX, &fname->crypto_buf.len)) return -ENAMETOOLONG; fname->crypto_buf.name = kmalloc(fname->crypto_buf.len, diff --git a/fs/crypto/fscrypt_private.h b/fs/crypto/fscrypt_private.h index 5b0a9e6478b5..f3e6e566daff 100644 --- a/fs/crypto/fscrypt_private.h +++ b/fs/crypto/fscrypt_private.h @@ -297,14 +297,11 @@ void fscrypt_generate_iv(union fscrypt_iv *iv, u64 lblk_num, const struct fscrypt_info *ci); /* fname.c */ -int fscrypt_fname_encrypt(const struct inode *inode, const struct qstr *iname, - u8 *out, unsigned int olen); -bool fscrypt_fname_encrypted_size(const union fscrypt_policy *policy, - u32 orig_len, u32 max_len, - u32 *encrypted_len_ret); +bool __fscrypt_fname_encrypted_size(const union fscrypt_policy *policy, + u32 orig_len, u32 max_len, + u32 *encrypted_len_ret); /* hkdf.c */ - struct fscrypt_hkdf { struct crypto_shash *hmac_tfm; }; diff --git a/fs/crypto/hooks.c b/fs/crypto/hooks.c index af74599ae1cf..7c01025879b3 100644 --- a/fs/crypto/hooks.c +++ b/fs/crypto/hooks.c @@ -228,9 +228,9 @@ int fscrypt_prepare_symlink(struct inode *dir, const char *target, * counting it (even though it is meaningless for ciphertext) is simpler * for now since filesystems will assume it is there and subtract it. 
*/ - if (!fscrypt_fname_encrypted_size(policy, len, - max_len - sizeof(struct fscrypt_symlink_data), - &disk_link->len)) + if (!__fscrypt_fname_encrypted_size(policy, len, + max_len - sizeof(struct fscrypt_symlink_data), + &disk_link->len)) return -ENAMETOOLONG; disk_link->len += sizeof(struct fscrypt_symlink_data); diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h index 629ccd09e095..84b363665162 100644 --- a/include/linux/fscrypt.h +++ b/include/linux/fscrypt.h @@ -308,8 +308,12 @@ void fscrypt_free_inode(struct inode *inode); int fscrypt_drop_inode(struct inode *inode); /* fname.c */ +int fscrypt_fname_encrypt(const struct inode *inode, const struct qstr *iname, + u8 *out, unsigned int olen); int fscrypt_base64url_encode(const u8 *src, int len, char *dst); int fscrypt_base64url_decode(const char *src, int len, u8 *dst); +bool fscrypt_fname_encrypted_size(const struct inode *inode, u32 orig_len, + u32 max_len, u32 *encrypted_len_ret); int fscrypt_setup_filename(struct inode *inode, const struct qstr *iname, int lookup, struct fscrypt_name *fname); From patchwork Tue Apr 5 19:19:42 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802379 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1471FC433FE for ; Wed, 6 Apr 2022 04:18:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1577452AbiDFEQe (ORCPT ); Wed, 6 Apr 2022 00:16:34 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:35810 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573559AbiDETWo (ORCPT ); Tue, 5 Apr 2022 15:22:44 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by 
lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3C63C40E57; Tue, 5 Apr 2022 12:20:45 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id D78C1B81FA4; Tue, 5 Apr 2022 19:20:43 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C017CC385A3; Tue, 5 Apr 2022 19:20:41 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186442; bh=nqFudejBHmm/2Fp5Uj5NdfNTUWq0CnNt7cKaEXokUYs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FLxcgBdAWUbRTTpmAjUx8v0lQkRY6jnTwtCXdCvJyrI/XXY31bVPvFqi9Xy66gnYf k+M0EePbpEBO8T56DvW8q1fKX1cC8PhwNOcCEIb9UXPSzZP1/Iql7gnrsvFZdQriKS agVrU3LfpuUNIeTcpYN5UU25Nyq7Mw6mY91V7AP2rN3g5nARITZYo8o1xrWNgWgMdn 6CepwNihicbFKr59JY7DYihoOow4Y1Jqpi3eaQGGaeMj9pZgLjNAEXBwhZ5oAZnWW1 6GLraAFALTUIV/ZKk8xH8RXwcBze58X+eY/3/f0s0ZGMXph0dW95yuYyg7K0DtpjIy KtxC5MvhOWUxw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de, Eric Biggers Subject: [PATCH v13 11/59] fscrypt: add fscrypt_context_for_new_inode Date: Tue, 5 Apr 2022 15:19:42 -0400 Message-Id: <20220405192030.178326-12-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Most filesystems just call fscrypt_set_context on new inodes, which usually causes a setxattr. That's a bit late for ceph, which can send along a full set of attributes with the create request. Doing so allows it to avoid race windows where the new inode could be seen by other clients without the crypto context attached.
It also avoids the separate round trip to the server. Refactor the fscrypt code a bit to allow us to create a new crypto context, attach it to the inode, and write it to the buffer, but without calling set_context on it. ceph can later use this to marshal the context into the attributes we send along with the create request. Acked-by: Eric Biggers Signed-off-by: Jeff Layton --- fs/crypto/policy.c | 35 +++++++++++++++++++++++++++++------ include/linux/fscrypt.h | 1 + 2 files changed, 30 insertions(+), 6 deletions(-) diff --git a/fs/crypto/policy.c b/fs/crypto/policy.c index ed3d623724cd..ec861af96252 100644 --- a/fs/crypto/policy.c +++ b/fs/crypto/policy.c @@ -664,6 +664,32 @@ const union fscrypt_policy *fscrypt_policy_to_inherit(struct inode *dir) return fscrypt_get_dummy_policy(dir->i_sb); } +/** + * fscrypt_context_for_new_inode() - create an encryption context for a new inode + * @ctx: where context should be written + * @inode: inode from which to fetch policy and nonce + * + * Given an in-core "prepared" (via fscrypt_prepare_new_inode) inode, + * generate a new context and write it to ctx. ctx _must_ be at least + * FSCRYPT_SET_CONTEXT_MAX_SIZE bytes. + * + * Return: size of the resulting context or a negative error code. + */ +int fscrypt_context_for_new_inode(void *ctx, struct inode *inode) +{ + struct fscrypt_info *ci = inode->i_crypt_info; + + BUILD_BUG_ON(sizeof(union fscrypt_context) != + FSCRYPT_SET_CONTEXT_MAX_SIZE); + + /* fscrypt_prepare_new_inode() should have set up the key already. */ + if (WARN_ON_ONCE(!ci)) + return -ENOKEY; + + return fscrypt_new_context(ctx, &ci->ci_policy, ci->ci_nonce); +} +EXPORT_SYMBOL_GPL(fscrypt_context_for_new_inode); + /** * fscrypt_set_context() - Set the fscrypt context of a new inode * @inode: a new inode @@ -680,12 +706,9 @@ int fscrypt_set_context(struct inode *inode, void *fs_data) union fscrypt_context ctx; int ctxsize; - /* fscrypt_prepare_new_inode() should have set up the key already. 
*/ - if (WARN_ON_ONCE(!ci)) - return -ENOKEY; - - BUILD_BUG_ON(sizeof(ctx) != FSCRYPT_SET_CONTEXT_MAX_SIZE); - ctxsize = fscrypt_new_context(&ctx, &ci->ci_policy, ci->ci_nonce); + ctxsize = fscrypt_context_for_new_inode(&ctx, inode); + if (ctxsize < 0) + return ctxsize; /* * This may be the first time the inode number is available, so do any diff --git a/include/linux/fscrypt.h b/include/linux/fscrypt.h index 84b363665162..ebe908b20d94 100644 --- a/include/linux/fscrypt.h +++ b/include/linux/fscrypt.h @@ -276,6 +276,7 @@ int fscrypt_ioctl_get_policy(struct file *filp, void __user *arg); int fscrypt_ioctl_get_policy_ex(struct file *filp, void __user *arg); int fscrypt_ioctl_get_nonce(struct file *filp, void __user *arg); int fscrypt_has_permitted_context(struct inode *parent, struct inode *child); +int fscrypt_context_for_new_inode(void *ctx, struct inode *inode); int fscrypt_set_context(struct inode *inode, void *fs_data); struct fscrypt_dummy_policy { From patchwork Tue Apr 5 19:19:43 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802376 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 667CEC433F5 for ; Wed, 6 Apr 2022 04:16:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1456964AbiDFEQP (ORCPT ); Wed, 6 Apr 2022 00:16:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36006 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573560AbiDETWq (ORCPT ); Tue, 5 Apr 2022 15:22:46 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [IPv6:2604:1380:40e1:4800::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C6E9141996; Tue, 5 Apr 2022 12:20:46 -0700 (PDT) Received: from 
smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 097C9CE1FB8; Tue, 5 Apr 2022 19:20:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id BC894C385A0; Tue, 5 Apr 2022 19:20:42 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186443; bh=+az3mwUrqSnm/XRcPrB9Xt+1CPtxgYV7Op3czPVtpCQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WQlqfEUeQjx8S7NI5nl0PmYwrxyvgdhdQV4cnMzeM7+v1iB90TB2eJG7WpvHFQCdP o1Yr+5PifY6qUNjEHy60UWCqwpelTvYKVMvTussUa/mTr2A/U001O6Y62mSsSD196C t7BIbh+TQnj3GJe7Wc1cTlhBGaOU4HOPxmee3uUKT3nfru9tC9MASjkB/lE94Zmb/c zx78MPeS2ennBcr1vx/cfBQ0joyG0TsnP0hOaf1i6UZOJRsOfsFs++TwLfosOSwNng TW0aPgEE8H+4pmuR7pt04lRCTrdiAEROaFLx7iEe2SGpYu+qPLSIXafkZIJYujPDyh qM7l4Uezwa7Fw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 12/59] ceph: preallocate inode for ops that may create one Date: Tue, 5 Apr 2022 15:19:43 -0400 Message-Id: <20220405192030.178326-13-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org When creating a new inode, we need to determine the crypto context before we can transmit the RPC. The fscrypt API has a routine for getting a crypto context before a create occurs, but it requires an inode. Change the ceph code to preallocate an inode in advance of a create of any sort (open(), mknod(), symlink(), etc). Move the existing code that generates the ACL and SELinux blobs into this routine since that's mostly common across all the different codepaths. 
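The shape of "preallocate the inode, attach it to the request, and unwind through the request on failure" can be sketched in userspace. Everything in this sketch is a hypothetical stand-in invented for illustration: err_ptr, ptr_err and is_err imitate the kernel's error-pointer helpers, and fake_inode, fake_req, fake_new_inode, fake_put_request and fake_create do not exist in ceph. It only shows the control flow, not the real allocation or RPC logic.

```c
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>

#define SKETCH_MAX_ERRNO 4095

/* Userspace imitations of the kernel's error-pointer helpers. */
static void *err_ptr(long err)
{
	return (void *)err;
}

static long ptr_err(const void *p)
{
	return (long)p;
}

static int is_err(const void *p)
{
	return (uintptr_t)p >= (uintptr_t)-SKETCH_MAX_ERRNO;
}

struct fake_inode {
	int mode;
};

struct fake_req {
	struct fake_inode *r_new_inode;	/* preallocated inode, or NULL */
};

/* Preallocate an inode; "fail" forces the error path for demonstration. */
static struct fake_inode *fake_new_inode(int mode, int fail)
{
	struct fake_inode *in;

	if (fail)
		return err_ptr(-ENOMEM);
	in = malloc(sizeof(*in));
	if (!in)
		return err_ptr(-ENOMEM);
	in->mode = mode;
	return in;
}

/* Dropping the request also drops any inode still attached to it. */
static void fake_put_request(struct fake_req *req)
{
	free(req->r_new_inode);	/* free(NULL) is a no-op */
	free(req);
}

static int fake_create(int mode, int fail)
{
	struct fake_req *req = calloc(1, sizeof(*req));
	int err = 0;

	if (!req)
		return -ENOMEM;

	req->r_new_inode = fake_new_inode(mode, fail);
	if (is_err(req->r_new_inode)) {
		err = ptr_err(req->r_new_inode);
		/* Never leave an error pointer where cleanup will see it. */
		req->r_new_inode = NULL;
		goto out_req;
	}

	/* ... the request would be filled in and sent here ... */

out_req:
	fake_put_request(req);
	return err;
}
```

Resetting the field to NULL before jumping to the cleanup label matters because the cleanup path treats any non-NULL value as a real object to release.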
In most cases, we just want to allow ceph_fill_trace to use that inode after the reply comes in, so add a new field to the MDS request for it (r_new_inode). The async create codepath is a bit different though. In that case, we want to hash the inode in advance of the RPC so that it can be used before the reply comes in. If the call subsequently fails with -EJUKEBOX, then just put the references and clean up the as_ctx. Note that with this change, we now need to regenerate the as_ctx when this occurs, but it's quite rare for it to happen. Signed-off-by: Jeff Layton --- fs/ceph/dir.c | 70 ++++++++++++++++++++----------------- fs/ceph/file.c | 62 ++++++++++++++++++++------------- fs/ceph/inode.c | 82 ++++++++++++++++++++++++++++++++++++++++---- fs/ceph/mds_client.c | 13 +++++-- fs/ceph/mds_client.h | 1 + fs/ceph/super.h | 7 +++- 6 files changed, 169 insertions(+), 66 deletions(-) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index eae417d71136..8cc7a49ee508 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -861,13 +861,6 @@ static int ceph_mknod(struct user_namespace *mnt_userns, struct inode *dir, goto out; } - err = ceph_pre_init_acls(dir, &mode, &as_ctx); - if (err < 0) - goto out; - err = ceph_security_init_secctx(dentry, mode, &as_ctx); - if (err < 0) - goto out; - dout("mknod in dir %p dentry %p mode 0%ho rdev %d\n", dir, dentry, mode, rdev); req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_MKNOD, USE_AUTH_MDS); @@ -875,6 +868,14 @@ static int ceph_mknod(struct user_namespace *mnt_userns, struct inode *dir, err = PTR_ERR(req); goto out; } + + req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx); + if (IS_ERR(req->r_new_inode)) { + err = PTR_ERR(req->r_new_inode); + req->r_new_inode = NULL; + goto out_req; + } + req->r_dentry = dget(dentry); req->r_num_caps = 2; req->r_parent = dir; @@ -884,13 +885,13 @@ static int ceph_mknod(struct user_namespace *mnt_userns, struct inode *dir, req->r_args.mknod.rdev = cpu_to_le32(rdev); req->r_dentry_drop = 
CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL; req->r_dentry_unless = CEPH_CAP_FILE_EXCL; - if (as_ctx.pagelist) { - req->r_pagelist = as_ctx.pagelist; - as_ctx.pagelist = NULL; - } + + ceph_as_ctx_to_req(req, &as_ctx); + err = ceph_mdsc_do_request(mdsc, dir, req); if (!err && !req->r_reply_info.head->is_dentry) err = ceph_handle_notrace_create(dir, dentry); +out_req: ceph_mdsc_put_request(req); out: if (!err) @@ -913,6 +914,7 @@ static int ceph_symlink(struct user_namespace *mnt_userns, struct inode *dir, struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); struct ceph_mds_request *req; struct ceph_acl_sec_ctx as_ctx = {}; + umode_t mode = S_IFLNK | 0777; int err; if (ceph_snap(dir) != CEPH_NOSNAP) @@ -923,21 +925,24 @@ static int ceph_symlink(struct user_namespace *mnt_userns, struct inode *dir, goto out; } - err = ceph_security_init_secctx(dentry, S_IFLNK | 0777, &as_ctx); - if (err < 0) - goto out; - dout("symlink in dir %p dentry %p to '%s'\n", dir, dentry, dest); req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_SYMLINK, USE_AUTH_MDS); if (IS_ERR(req)) { err = PTR_ERR(req); goto out; } + + req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx); + if (IS_ERR(req->r_new_inode)) { + err = PTR_ERR(req->r_new_inode); + req->r_new_inode = NULL; + goto out_req; + } + req->r_path2 = kstrdup(dest, GFP_KERNEL); if (!req->r_path2) { err = -ENOMEM; - ceph_mdsc_put_request(req); - goto out; + goto out_req; } req->r_parent = dir; ihold(dir); @@ -947,13 +952,13 @@ static int ceph_symlink(struct user_namespace *mnt_userns, struct inode *dir, req->r_num_caps = 2; req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL; req->r_dentry_unless = CEPH_CAP_FILE_EXCL; - if (as_ctx.pagelist) { - req->r_pagelist = as_ctx.pagelist; - as_ctx.pagelist = NULL; - } + + ceph_as_ctx_to_req(req, &as_ctx); + err = ceph_mdsc_do_request(mdsc, dir, req); if (!err && !req->r_reply_info.head->is_dentry) err = ceph_handle_notrace_create(dir, dentry); +out_req: 
ceph_mdsc_put_request(req); out: if (err) @@ -989,13 +994,6 @@ static int ceph_mkdir(struct user_namespace *mnt_userns, struct inode *dir, goto out; } - mode |= S_IFDIR; - err = ceph_pre_init_acls(dir, &mode, &as_ctx); - if (err < 0) - goto out; - err = ceph_security_init_secctx(dentry, mode, &as_ctx); - if (err < 0) - goto out; req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS); if (IS_ERR(req)) { @@ -1003,6 +1001,14 @@ static int ceph_mkdir(struct user_namespace *mnt_userns, struct inode *dir, goto out; } + mode |= S_IFDIR; + req->r_new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx); + if (IS_ERR(req->r_new_inode)) { + err = PTR_ERR(req->r_new_inode); + req->r_new_inode = NULL; + goto out_req; + } + req->r_dentry = dget(dentry); req->r_num_caps = 2; req->r_parent = dir; @@ -1011,15 +1017,15 @@ static int ceph_mkdir(struct user_namespace *mnt_userns, struct inode *dir, req->r_args.mkdir.mode = cpu_to_le32(mode); req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL; req->r_dentry_unless = CEPH_CAP_FILE_EXCL; - if (as_ctx.pagelist) { - req->r_pagelist = as_ctx.pagelist; - as_ctx.pagelist = NULL; - } + + ceph_as_ctx_to_req(req, &as_ctx); + err = ceph_mdsc_do_request(mdsc, dir, req); if (!err && !req->r_reply_info.head->is_target && !req->r_reply_info.head->is_dentry) err = ceph_handle_notrace_create(dir, dentry); +out_req: ceph_mdsc_put_request(req); out: if (!err) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index 9ee6e92bfed0..dd183d12a3bd 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -601,7 +601,8 @@ static void ceph_async_create_cb(struct ceph_mds_client *mdsc, ceph_mdsc_release_dir_caps(req); } -static int ceph_finish_async_create(struct inode *dir, struct dentry *dentry, +static int ceph_finish_async_create(struct inode *dir, struct inode *inode, + struct dentry *dentry, struct file *file, umode_t mode, struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as_ctx, @@ -612,7 +613,6 @@ static int ceph_finish_async_create(struct inode 
*dir, struct dentry *dentry, struct ceph_mds_reply_inode in = { }; struct ceph_mds_reply_info_in iinfo = { .in = &in }; struct ceph_inode_info *ci = ceph_inode(dir); - struct inode *inode; struct timespec64 now; struct ceph_string *pool_ns; struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(dir->i_sb); @@ -621,10 +621,6 @@ static int ceph_finish_async_create(struct inode *dir, struct dentry *dentry, ktime_get_real_ts64(&now); - inode = ceph_get_inode(dentry->d_sb, vino); - if (IS_ERR(inode)) - return PTR_ERR(inode); - iinfo.inline_version = CEPH_INLINE_NONE; iinfo.change_attr = 1; ceph_encode_timespec64(&iinfo.btime, &now); @@ -680,8 +676,7 @@ static int ceph_finish_async_create(struct inode *dir, struct dentry *dentry, ceph_dir_clear_complete(dir); if (!d_unhashed(dentry)) d_drop(dentry); - if (inode->i_state & I_NEW) - discard_new_inode(inode); + discard_new_inode(inode); } else { struct dentry *dn; @@ -721,6 +716,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, struct ceph_fs_client *fsc = ceph_sb_to_client(dir->i_sb); struct ceph_mds_client *mdsc = fsc->mdsc; struct ceph_mds_request *req; + struct inode *new_inode = NULL; struct dentry *dn; struct ceph_acl_sec_ctx as_ctx = {}; bool try_async = ceph_test_mount_opt(fsc, ASYNC_DIROPS); @@ -733,21 +729,21 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, if (dentry->d_name.len > NAME_MAX) return -ENAMETOOLONG; - +retry: if (flags & O_CREAT) { if (ceph_quota_is_max_files_exceeded(dir)) return -EDQUOT; - err = ceph_pre_init_acls(dir, &mode, &as_ctx); - if (err < 0) - return err; - err = ceph_security_init_secctx(dentry, mode, &as_ctx); - if (err < 0) + + new_inode = ceph_new_inode(dir, dentry, &mode, &as_ctx); + if (IS_ERR(new_inode)) { + err = PTR_ERR(new_inode); goto out_ctx; + } } else if (!d_in_lookup(dentry)) { /* If it's not being looked up, it's negative */ return -ENOENT; } -retry: + /* do the open */ req = prepare_open_request(dir->i_sb, flags, mode); if (IS_ERR(req)) { @@ 
-768,25 +764,40 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, req->r_dentry_drop = CEPH_CAP_FILE_SHARED | CEPH_CAP_AUTH_EXCL; req->r_dentry_unless = CEPH_CAP_FILE_EXCL; - if (as_ctx.pagelist) { - req->r_pagelist = as_ctx.pagelist; - as_ctx.pagelist = NULL; - } - if (try_async && - (req->r_dir_caps = - try_prep_async_create(dir, dentry, &lo, - &req->r_deleg_ino))) { + + ceph_as_ctx_to_req(req, &as_ctx); + + if (try_async && (req->r_dir_caps = + try_prep_async_create(dir, dentry, &lo, &req->r_deleg_ino))) { + struct ceph_vino vino = { .ino = req->r_deleg_ino, + .snap = CEPH_NOSNAP }; + set_bit(CEPH_MDS_R_ASYNC, &req->r_req_flags); req->r_args.open.flags |= cpu_to_le32(CEPH_O_EXCL); req->r_callback = ceph_async_create_cb; + + /* Hash inode before RPC */ + new_inode = ceph_get_inode(dir->i_sb, vino, new_inode); + if (IS_ERR(new_inode)) { + err = PTR_ERR(new_inode); + new_inode = NULL; + goto out_req; + } + WARN_ON_ONCE(!(new_inode->i_state & I_NEW)); + err = ceph_mdsc_submit_request(mdsc, dir, req); if (!err) { - err = ceph_finish_async_create(dir, dentry, + err = ceph_finish_async_create(dir, new_inode, dentry, file, mode, req, &as_ctx, &lo); + new_inode = NULL; } else if (err == -EJUKEBOX) { restore_deleg_ino(dir, req->r_deleg_ino); ceph_mdsc_put_request(req); + discard_new_inode(new_inode); + ceph_release_acl_sec_ctx(&as_ctx); + memset(&as_ctx, 0, sizeof(as_ctx)); + new_inode = NULL; try_async = false; ceph_put_string(rcu_dereference_raw(lo.pool_ns)); goto retry; @@ -797,6 +808,8 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, } set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags); + req->r_new_inode = new_inode; + new_inode = NULL; err = ceph_mdsc_do_request(mdsc, (flags & (O_CREAT|O_TRUNC)) ? 
dir : NULL, req); @@ -839,6 +852,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, } out_req: ceph_mdsc_put_request(req); + iput(new_inode); out_ctx: ceph_release_acl_sec_ctx(&as_ctx); dout("atomic_open result=%d\n", err); diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index ebc936231ea2..6aba0391070d 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -52,17 +52,85 @@ static int ceph_set_ino_cb(struct inode *inode, void *data) return 0; } -struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino) +/** + * ceph_new_inode - allocate a new inode in advance of an expected create + * @dir: parent directory for new inode + * @dentry: dentry that may eventually point to new inode + * @mode: mode of new inode + * @as_ctx: pointer to inherited security context + * + * Allocate a new inode in advance of an operation to create a new inode. + * This allocates the inode and sets up the acl_sec_ctx with appropriate + * info for the new inode. + * + * Returns a pointer to the new inode or an ERR_PTR. 
+ */ +struct inode *ceph_new_inode(struct inode *dir, struct dentry *dentry, + umode_t *mode, struct ceph_acl_sec_ctx *as_ctx) +{ + int err; + struct inode *inode; + + inode = new_inode(dir->i_sb); + if (!inode) + return ERR_PTR(-ENOMEM); + + if (!S_ISLNK(*mode)) { + err = ceph_pre_init_acls(dir, mode, as_ctx); + if (err < 0) + goto out_err; + } + + err = ceph_security_init_secctx(dentry, *mode, as_ctx); + if (err < 0) + goto out_err; + + inode->i_state = 0; + inode->i_mode = *mode; + return inode; +out_err: + iput(inode); + return ERR_PTR(err); +} + +void ceph_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as_ctx) +{ + if (as_ctx->pagelist) { + req->r_pagelist = as_ctx->pagelist; + as_ctx->pagelist = NULL; + } +} + +/** + * ceph_get_inode - find or create/hash a new inode + * @sb: superblock to search and allocate in + * @vino: vino to search for + * @newino: optional new inode to insert if one isn't found (may be NULL) + * + * Search for or insert a new inode into the hash for the given vino, and return a + * reference to it. If new is non-NULL, its reference is consumed. 
+ */ +struct inode *ceph_get_inode(struct super_block *sb, struct ceph_vino vino, struct inode *newino) { struct inode *inode; if (ceph_vino_is_reserved(vino)) return ERR_PTR(-EREMOTEIO); - inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare, - ceph_set_ino_cb, &vino); - if (!inode) + if (newino) { + inode = inode_insert5(newino, (unsigned long)vino.ino, ceph_ino_compare, + ceph_set_ino_cb, &vino); + if (inode != newino) + iput(newino); + } else { + inode = iget5_locked(sb, (unsigned long)vino.ino, ceph_ino_compare, + ceph_set_ino_cb, &vino); + } + + if (!inode) { + dout("No inode found for %llx.%llx\n", vino.ino, vino.snap); return ERR_PTR(-ENOMEM); + } dout("get_inode on %llu=%llx.%llx got %p new %d\n", ceph_present_inode(inode), ceph_vinop(inode), inode, !!(inode->i_state & I_NEW)); @@ -78,7 +146,7 @@ struct inode *ceph_get_snapdir(struct inode *parent) .ino = ceph_ino(parent), .snap = CEPH_SNAPDIR, }; - struct inode *inode = ceph_get_inode(parent->i_sb, vino); + struct inode *inode = ceph_get_inode(parent->i_sb, vino, NULL); struct ceph_inode_info *ci = ceph_inode(inode); if (IS_ERR(inode)) @@ -1552,7 +1620,7 @@ static int readdir_prepopulate_inodes_only(struct ceph_mds_request *req, vino.ino = le64_to_cpu(rde->inode.in->ino); vino.snap = le64_to_cpu(rde->inode.in->snapid); - in = ceph_get_inode(req->r_dentry->d_sb, vino); + in = ceph_get_inode(req->r_dentry->d_sb, vino, NULL); if (IS_ERR(in)) { err = PTR_ERR(in); dout("new_inode badness got %d\n", err); @@ -1754,7 +1822,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, if (d_really_is_positive(dn)) { in = d_inode(dn); } else { - in = ceph_get_inode(parent->d_sb, tvino); + in = ceph_get_inode(parent->d_sb, tvino, NULL); if (IS_ERR(in)) { dout("new_inode badness\n"); d_drop(dn); diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index f476c65fb985..7aa253be7edc 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -868,6 +868,7 @@ void ceph_mdsc_release_request(struct 
kref *kref) iput(req->r_parent); } iput(req->r_target_inode); + iput(req->r_new_inode); if (req->r_dentry) dput(req->r_dentry); if (req->r_old_dentry) @@ -3208,13 +3209,21 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) /* Must find target inode outside of mutexes to avoid deadlocks */ if ((err >= 0) && rinfo->head->is_target) { - struct inode *in; + struct inode *in = xchg(&req->r_new_inode, NULL); struct ceph_vino tvino = { .ino = le64_to_cpu(rinfo->targeti.in->ino), .snap = le64_to_cpu(rinfo->targeti.in->snapid) }; - in = ceph_get_inode(mdsc->fsc->sb, tvino); + /* If we ended up opening an existing inode, discard r_new_inode */ + if (req->r_op == CEPH_MDS_OP_CREATE && !req->r_reply_info.has_create_ino) { + /* This should never happen on an async create */ + WARN_ON_ONCE(req->r_deleg_ino); + iput(in); + in = NULL; + } + + in = ceph_get_inode(mdsc->fsc->sb, tvino, in); if (IS_ERR(in)) { err = PTR_ERR(in); mutex_lock(&session->s_mutex); diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 33497846e47e..2e945979a2e0 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -265,6 +265,7 @@ struct ceph_mds_request { struct inode *r_parent; /* parent dir inode */ struct inode *r_target_inode; /* resulting inode */ + struct inode *r_new_inode; /* new inode (for creates) */ #define CEPH_MDS_R_DIRECT_IS_HASH (1) /* r_direct_hash is valid */ #define CEPH_MDS_R_ABORTED (2) /* call was aborted */ diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 80a2399f2878..9eaaab34baae 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -967,6 +967,7 @@ static inline bool __ceph_have_pending_cap_snap(struct ceph_inode_info *ci) /* inode.c */ struct ceph_mds_reply_info_in; struct ceph_mds_reply_dirfrag; +struct ceph_acl_sec_ctx; extern const struct inode_operations ceph_file_iops; @@ -974,8 +975,12 @@ extern struct inode *ceph_alloc_inode(struct super_block *sb); extern void ceph_evict_inode(struct inode *inode); extern void 
ceph_free_inode(struct inode *inode); +struct inode *ceph_new_inode(struct inode *dir, struct dentry *dentry, + umode_t *mode, struct ceph_acl_sec_ctx *as_ctx); +void ceph_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as_ctx); + extern struct inode *ceph_get_inode(struct super_block *sb, - struct ceph_vino vino); + struct ceph_vino vino, struct inode *newino); extern struct inode *ceph_get_snapdir(struct inode *parent); extern int ceph_fill_file_size(struct inode *inode, int issued, u32 truncate_seq, u64 truncate_size, u64 size); From patchwork Tue Apr 5 19:19:44 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802333 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DD0E3C433F5 for ; Wed, 6 Apr 2022 04:05:00 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S236993AbiDFEEs (ORCPT ); Wed, 6 Apr 2022 00:04:48 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36046 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573562AbiDETWr (ORCPT ); Tue, 5 Apr 2022 15:22:47 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 16B9641F8E; Tue, 5 Apr 2022 12:20:47 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id A12BDB81FA8; Tue, 5 Apr 2022 19:20:45 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B8913C385A5; Tue, 5 Apr 2022 19:20:43 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; 
d=kernel.org; s=k20201202; t=1649186444; bh=Xo1CqM/m1X0Vn5f0F5v27LeQtnut8KTTWWfaOHTaWOM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=t3VpB5aNzT1CC7ZDu0JZEN4+bIZMNItmwgKVe8wE+K7cAdlna1208D8npqq7dPRjG DGOMrJT93IUEi3ntzN09ioD+Hk0bY/1Xk6xpEkuhk6+z7po2Td1uynTEE0H59v6nvN ukQKLoErAW/rVG1pwtZJZVf4uI+inu0k2JBuMsOQsiQmIXIAYlQoBeXphsx8JKD7Rg OT7RZZkVIulM5lbIiM5n+LXzCEsxD0/rNhVPNo7gCGPVu/llGgPg+VmfOnKofdUc+2 c6p4zeFyydyZHiVaiNpdiCFTzB3gQiWbZTaVNiIC2wnTHyD56d10zMH5Yi99Wxzwy3 s5DvMBGZb21Xw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 13/59] ceph: fscrypt_auth handling for ceph Date: Tue, 5 Apr 2022 15:19:44 -0400 Message-Id: <20220405192030.178326-14-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Most fscrypt-enabled filesystems store the crypto context in an xattr, but that's problematic for ceph as xattrs are governed by the XATTR cap, while we really want the crypto context as part of the AUTH cap. Because of this, the MDS has added two new inode metadata fields: fscrypt_auth and fscrypt_file. The former is used to hold the crypto context, and the latter is used to track the real file size. Parse new fscrypt_auth and fscrypt_file fields in inode traces. For now, we don't use fscrypt_file, but fscrypt_auth is used to hold the fscrypt context. Allow the client to use a setattr request for setting the fscrypt_auth field. Since this is not a standard setattr request from the VFS, we add a new field to __ceph_setattr that carries ceph-specific inode attrs. 
Have the set_context op do a setattr that sets the fscrypt_auth value, and get_context just return the contents of that field (since it should always be available). Signed-off-by: Jeff Layton --- fs/ceph/Makefile | 1 + fs/ceph/acl.c | 4 +- fs/ceph/crypto.c | 76 ++++++++++++++++++++++++ fs/ceph/crypto.h | 36 ++++++++++++ fs/ceph/inode.c | 62 ++++++++++++++++++- fs/ceph/mds_client.c | 111 ++++++++++++++++++++++++++++++++--- fs/ceph/mds_client.h | 7 +++ fs/ceph/super.c | 3 + fs/ceph/super.h | 14 ++++- include/linux/ceph/ceph_fs.h | 21 ++++--- 10 files changed, 313 insertions(+), 22 deletions(-) create mode 100644 fs/ceph/crypto.c create mode 100644 fs/ceph/crypto.h diff --git a/fs/ceph/Makefile b/fs/ceph/Makefile index 50c635dc7f71..1f77ca04c426 100644 --- a/fs/ceph/Makefile +++ b/fs/ceph/Makefile @@ -12,3 +12,4 @@ ceph-y := super.o inode.o dir.o file.o locks.o addr.o ioctl.o \ ceph-$(CONFIG_CEPH_FSCACHE) += cache.o ceph-$(CONFIG_CEPH_FS_POSIX_ACL) += acl.o +ceph-$(CONFIG_FS_ENCRYPTION) += crypto.o diff --git a/fs/ceph/acl.c b/fs/ceph/acl.c index f4fc8e0b847c..427724c36316 100644 --- a/fs/ceph/acl.c +++ b/fs/ceph/acl.c @@ -139,7 +139,7 @@ int ceph_set_acl(struct user_namespace *mnt_userns, struct inode *inode, newattrs.ia_ctime = current_time(inode); newattrs.ia_mode = new_mode; newattrs.ia_valid = ATTR_MODE | ATTR_CTIME; - ret = __ceph_setattr(inode, &newattrs); + ret = __ceph_setattr(inode, &newattrs, NULL); if (ret) goto out_free; } @@ -150,7 +150,7 @@ int ceph_set_acl(struct user_namespace *mnt_userns, struct inode *inode, newattrs.ia_ctime = old_ctime; newattrs.ia_mode = old_mode; newattrs.ia_valid = ATTR_MODE | ATTR_CTIME; - __ceph_setattr(inode, &newattrs); + __ceph_setattr(inode, &newattrs, NULL); } goto out_free; } diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c new file mode 100644 index 000000000000..a513ff373b13 --- /dev/null +++ b/fs/ceph/crypto.c @@ -0,0 +1,76 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include + +#include "super.h" 
+#include "crypto.h" + +static int ceph_crypt_get_context(struct inode *inode, void *ctx, size_t len) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + struct ceph_fscrypt_auth *cfa = (struct ceph_fscrypt_auth *)ci->fscrypt_auth; + u32 ctxlen; + + /* Non existent or too short? */ + if (!cfa || (ci->fscrypt_auth_len < (offsetof(struct ceph_fscrypt_auth, cfa_blob) + 1))) + return -ENOBUFS; + + /* Some format we don't recognize? */ + if (le32_to_cpu(cfa->cfa_version) != CEPH_FSCRYPT_AUTH_VERSION) + return -ENOBUFS; + + ctxlen = le32_to_cpu(cfa->cfa_blob_len); + if (len < ctxlen) + return -ERANGE; + + memcpy(ctx, cfa->cfa_blob, ctxlen); + return ctxlen; +} + +static int ceph_crypt_set_context(struct inode *inode, const void *ctx, size_t len, void *fs_data) +{ + int ret; + struct iattr attr = { }; + struct ceph_iattr cia = { }; + struct ceph_fscrypt_auth *cfa; + + WARN_ON_ONCE(fs_data); + + if (len > FSCRYPT_SET_CONTEXT_MAX_SIZE) + return -EINVAL; + + cfa = kzalloc(sizeof(*cfa), GFP_KERNEL); + if (!cfa) + return -ENOMEM; + + cfa->cfa_version = cpu_to_le32(CEPH_FSCRYPT_AUTH_VERSION); + cfa->cfa_blob_len = cpu_to_le32(len); + memcpy(cfa->cfa_blob, ctx, len); + + cia.fscrypt_auth = cfa; + + ret = __ceph_setattr(inode, &attr, &cia); + if (ret == 0) + inode_set_flags(inode, S_ENCRYPTED, S_ENCRYPTED); + kfree(cia.fscrypt_auth); + return ret; +} + +static bool ceph_crypt_empty_dir(struct inode *inode) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + + return ci->i_rsubdirs + ci->i_rfiles == 1; +} + +static struct fscrypt_operations ceph_fscrypt_ops = { + .get_context = ceph_crypt_get_context, + .set_context = ceph_crypt_set_context, + .empty_dir = ceph_crypt_empty_dir, +}; + +void ceph_fscrypt_set_ops(struct super_block *sb) +{ + fscrypt_set_ops(sb, &ceph_fscrypt_ops); +} diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h new file mode 100644 index 000000000000..6dca674f79b8 --- /dev/null +++ b/fs/ceph/crypto.h @@ -0,0 +1,36 @@ +/* SPDX-License-Identifier: GPL-2.0 */ 
+/* + * Ceph fscrypt functionality + */ + +#ifndef _CEPH_CRYPTO_H +#define _CEPH_CRYPTO_H + +#include + +struct ceph_fscrypt_auth { + __le32 cfa_version; + __le32 cfa_blob_len; + u8 cfa_blob[FSCRYPT_SET_CONTEXT_MAX_SIZE]; +} __packed; + +#define CEPH_FSCRYPT_AUTH_VERSION 1 +static inline u32 ceph_fscrypt_auth_len(struct ceph_fscrypt_auth *fa) +{ + u32 ctxsize = le32_to_cpu(fa->cfa_blob_len); + + return offsetof(struct ceph_fscrypt_auth, cfa_blob) + ctxsize; +} + +#ifdef CONFIG_FS_ENCRYPTION +void ceph_fscrypt_set_ops(struct super_block *sb); + +#else /* CONFIG_FS_ENCRYPTION */ + +static inline void ceph_fscrypt_set_ops(struct super_block *sb) +{ +} + +#endif /* CONFIG_FS_ENCRYPTION */ + +#endif diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 6aba0391070d..2d9bade892cc 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -14,10 +14,12 @@ #include #include #include +#include #include "super.h" #include "mds_client.h" #include "cache.h" +#include "crypto.h" #include /* @@ -616,6 +618,10 @@ struct inode *ceph_alloc_inode(struct super_block *sb) INIT_WORK(&ci->i_work, ceph_inode_work); ci->i_work_mask = 0; memset(&ci->i_btime, '\0', sizeof(ci->i_btime)); +#ifdef CONFIG_FS_ENCRYPTION + ci->fscrypt_auth = NULL; + ci->fscrypt_auth_len = 0; +#endif return &ci->vfs_inode; } @@ -624,6 +630,9 @@ void ceph_free_inode(struct inode *inode) struct ceph_inode_info *ci = ceph_inode(inode); kfree(ci->i_symlink); +#ifdef CONFIG_FS_ENCRYPTION + kfree(ci->fscrypt_auth); +#endif kmem_cache_free(ceph_inode_cachep, ci); } @@ -644,6 +653,7 @@ void ceph_evict_inode(struct inode *inode) clear_inode(inode); ceph_fscache_unregister_inode_cookie(ci); + fscrypt_put_encryption_info(inode); __ceph_remove_caps(ci); @@ -1023,6 +1033,16 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, xattr_blob = NULL; } +#ifdef CONFIG_FS_ENCRYPTION + if (iinfo->fscrypt_auth_len && !ci->fscrypt_auth) { + ci->fscrypt_auth_len = iinfo->fscrypt_auth_len; + ci->fscrypt_auth = 
iinfo->fscrypt_auth; + iinfo->fscrypt_auth = NULL; + iinfo->fscrypt_auth_len = 0; + inode_set_flags(inode, S_ENCRYPTED, S_ENCRYPTED); + } +#endif + /* finally update i_version */ if (le64_to_cpu(info->version) > ci->i_version) ci->i_version = le64_to_cpu(info->version); @@ -2079,7 +2099,7 @@ static const struct inode_operations ceph_symlink_iops = { .listxattr = ceph_listxattr, }; -int __ceph_setattr(struct inode *inode, struct iattr *attr) +int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *cia) { struct ceph_inode_info *ci = ceph_inode(inode); unsigned int ia_valid = attr->ia_valid; @@ -2119,6 +2139,43 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr) } dout("setattr %p issued %s\n", inode, ceph_cap_string(issued)); +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) + if (cia && cia->fscrypt_auth) { + u32 len = ceph_fscrypt_auth_len(cia->fscrypt_auth); + + if (len > sizeof(*cia->fscrypt_auth)) { + err = -EINVAL; + spin_unlock(&ci->i_ceph_lock); + goto out; + } + + dout("setattr %llx:%llx fscrypt_auth len %u to %u)\n", + ceph_vinop(inode), ci->fscrypt_auth_len, len); + + /* It should never be re-set once set */ + WARN_ON_ONCE(ci->fscrypt_auth); + + if (issued & CEPH_CAP_AUTH_EXCL) { + dirtied |= CEPH_CAP_AUTH_EXCL; + kfree(ci->fscrypt_auth); + ci->fscrypt_auth = (u8 *)cia->fscrypt_auth; + ci->fscrypt_auth_len = len; + } else if ((issued & CEPH_CAP_AUTH_SHARED) == 0 || + ci->fscrypt_auth_len != len || + memcmp(ci->fscrypt_auth, cia->fscrypt_auth, len)) { + req->r_fscrypt_auth = cia->fscrypt_auth; + mask |= CEPH_SETATTR_FSCRYPT_AUTH; + release |= CEPH_CAP_AUTH_SHARED; + } + cia->fscrypt_auth = NULL; + } +#else + if (cia && cia->fscrypt_auth) { + err = -EINVAL; + spin_unlock(&ci->i_ceph_lock); + goto out; + } +#endif /* CONFIG_FS_ENCRYPTION */ if (ia_valid & ATTR_UID) { dout("setattr %p uid %d -> %d\n", inode, @@ -2281,6 +2338,7 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr) req->r_stamp = attr->ia_ctime; err = 
ceph_mdsc_do_request(mdsc, NULL, req); } +out: dout("setattr %p result=%d (%s locally, %d remote)\n", inode, err, ceph_cap_string(dirtied), mask); @@ -2321,7 +2379,7 @@ int ceph_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, ceph_quota_is_max_bytes_exceeded(inode, attr->ia_size)) return -EDQUOT; - err = __ceph_setattr(inode, attr); + err = __ceph_setattr(inode, attr, NULL); if (err >= 0 && (attr->ia_valid & ATTR_MODE)) err = posix_acl_chmod(&init_user_ns, inode, attr->ia_mode); diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 7aa253be7edc..dcb800675dec 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -15,6 +15,7 @@ #include "super.h" #include "mds_client.h" +#include "crypto.h" #include #include @@ -184,8 +185,52 @@ static int parse_reply_info_in(void **p, void *end, info->rsnaps = 0; } + if (struct_v >= 5) { + u32 alen; + + ceph_decode_32_safe(p, end, alen, bad); + + while (alen--) { + u32 len; + + /* key */ + ceph_decode_32_safe(p, end, len, bad); + ceph_decode_skip_n(p, end, len, bad); + /* value */ + ceph_decode_32_safe(p, end, len, bad); + ceph_decode_skip_n(p, end, len, bad); + } + } + + /* fscrypt flag -- ignore */ + if (struct_v >= 6) + ceph_decode_skip_8(p, end, bad); + + info->fscrypt_auth = NULL; + info->fscrypt_auth_len = 0; + info->fscrypt_file = NULL; + info->fscrypt_file_len = 0; + if (struct_v >= 7) { + ceph_decode_32_safe(p, end, info->fscrypt_auth_len, bad); + if (info->fscrypt_auth_len) { + info->fscrypt_auth = kmalloc(info->fscrypt_auth_len, GFP_KERNEL); + if (!info->fscrypt_auth) + return -ENOMEM; + ceph_decode_copy_safe(p, end, info->fscrypt_auth, + info->fscrypt_auth_len, bad); + } + ceph_decode_32_safe(p, end, info->fscrypt_file_len, bad); + if (info->fscrypt_file_len) { + info->fscrypt_file = kmalloc(info->fscrypt_file_len, GFP_KERNEL); + if (!info->fscrypt_file) + return -ENOMEM; + ceph_decode_copy_safe(p, end, info->fscrypt_file, + info->fscrypt_file_len, bad); + } + } *p = end; } else { + 
/* legacy (unversioned) struct */ if (features & CEPH_FEATURE_MDS_INLINE_DATA) { ceph_decode_64_safe(p, end, info->inline_version, bad); ceph_decode_32_safe(p, end, info->inline_len, bad); @@ -650,8 +695,21 @@ static int parse_reply_info(struct ceph_mds_session *s, struct ceph_msg *msg, static void destroy_reply_info(struct ceph_mds_reply_info_parsed *info) { + int i; + + kfree(info->diri.fscrypt_auth); + kfree(info->diri.fscrypt_file); + kfree(info->targeti.fscrypt_auth); + kfree(info->targeti.fscrypt_file); if (!info->dir_entries) return; + + for (i = 0; i < info->dir_nr; i++) { + struct ceph_mds_reply_dir_entry *rde = info->dir_entries + i; + + kfree(rde->inode.fscrypt_auth); + kfree(rde->inode.fscrypt_file); + } free_pages((unsigned long)info->dir_entries, get_order(info->dir_buf_size)); } @@ -889,6 +947,7 @@ void ceph_mdsc_release_request(struct kref *kref) put_cred(req->r_cred); if (req->r_pagelist) ceph_pagelist_release(req->r_pagelist); + kfree(req->r_fscrypt_auth); put_request_session(req); ceph_unreserve_caps(req->r_mdsc, &req->r_caps_reservation); WARN_ON_ONCE(!list_empty(&req->r_wait)); @@ -2469,8 +2528,7 @@ static int set_request_path_attr(struct inode *rinode, struct dentry *rdentry, return r; } -static void encode_timestamp_and_gids(void **p, - const struct ceph_mds_request *req) +static void encode_mclientrequest_tail(void **p, const struct ceph_mds_request *req) { struct ceph_timespec ts; int i; @@ -2483,6 +2541,20 @@ static void encode_timestamp_and_gids(void **p, for (i = 0; i < req->r_cred->group_info->ngroups; i++) ceph_encode_64(p, from_kgid(&init_user_ns, req->r_cred->group_info->gid[i])); + + /* v5: altname (TODO: skip for now) */ + ceph_encode_32(p, 0); + + /* v6: fscrypt_auth and fscrypt_file */ + if (req->r_fscrypt_auth) { + u32 authlen = ceph_fscrypt_auth_len(req->r_fscrypt_auth); + + ceph_encode_32(p, authlen); + ceph_encode_copy(p, req->r_fscrypt_auth, authlen); + } else { + ceph_encode_32(p, 0); + } + ceph_encode_32(p, 0); // 
fscrypt_file for now } /* @@ -2527,12 +2599,14 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, goto out_free1; } + /* head */ len = legacy ? sizeof(*head) : sizeof(struct ceph_mds_request_head); - len += pathlen1 + pathlen2 + 2*(1 + sizeof(u32) + sizeof(u64)) + - sizeof(struct ceph_timespec); - len += sizeof(u32) + (sizeof(u64) * req->r_cred->group_info->ngroups); - /* calculate (max) length for cap releases */ + /* filepaths */ + len += 2 * (1 + sizeof(u32) + sizeof(u64)); + len += pathlen1 + pathlen2; + + /* cap releases */ len += sizeof(struct ceph_mds_request_release) * (!!req->r_inode_drop + !!req->r_dentry_drop + !!req->r_old_inode_drop + !!req->r_old_dentry_drop); @@ -2542,6 +2616,25 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, if (req->r_old_dentry_drop) len += pathlen2; + /* MClientRequest tail */ + + /* req->r_stamp */ + len += sizeof(struct ceph_timespec); + + /* gid list */ + len += sizeof(u32) + (sizeof(u64) * req->r_cred->group_info->ngroups); + + /* alternate name */ + len += sizeof(u32); // TODO + + /* fscrypt_auth */ + len += sizeof(u32); // fscrypt_auth + if (req->r_fscrypt_auth) + len += ceph_fscrypt_auth_len(req->r_fscrypt_auth); + + /* fscrypt_file */ + len += sizeof(u32); + msg = ceph_msg_new2(CEPH_MSG_CLIENT_REQUEST, len, 1, GFP_NOFS, false); if (!msg) { msg = ERR_PTR(-ENOMEM); @@ -2561,7 +2654,7 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, } else { struct ceph_mds_request_head *new_head = msg->front.iov_base; - msg->hdr.version = cpu_to_le16(4); + msg->hdr.version = cpu_to_le16(6); new_head->version = cpu_to_le16(CEPH_MDS_REQUEST_HEAD_VERSION); head = (struct ceph_mds_request_head_old *)&new_head->oldest_client_tid; p = msg->front.iov_base + sizeof(*new_head); @@ -2612,7 +2705,7 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, head->num_releases = cpu_to_le16(releases); - 
encode_timestamp_and_gids(&p, req); + encode_mclientrequest_tail(&p, req); if (WARN_ON_ONCE(p > end)) { ceph_msg_put(msg); @@ -2742,7 +2835,7 @@ static int __prepare_send_request(struct ceph_mds_session *session, rhead->num_releases = 0; p = msg->front.iov_base + req->r_request_release_offset; - encode_timestamp_and_gids(&p, req); + encode_mclientrequest_tail(&p, req); msg->front.iov_len = p - msg->front.iov_base; msg->hdr.front_len = cpu_to_le32(msg->front.iov_len); diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 2e945979a2e0..aab3ab284fce 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -88,6 +88,10 @@ struct ceph_mds_reply_info_in { s32 dir_pin; struct ceph_timespec btime; struct ceph_timespec snap_btime; + u8 *fscrypt_auth; + u8 *fscrypt_file; + u32 fscrypt_auth_len; + u32 fscrypt_file_len; u64 rsnaps; u64 change_attr; }; @@ -280,6 +284,9 @@ struct ceph_mds_request { struct mutex r_fill_mutex; union ceph_mds_request_args r_args; + + struct ceph_fscrypt_auth *r_fscrypt_auth; + int r_fmode; /* file mode, if expecting cap */ int r_request_release_offset; const struct cred *r_cred; diff --git a/fs/ceph/super.c b/fs/ceph/super.c index 26d924dda721..1f52fc5b89bd 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -20,6 +20,7 @@ #include "super.h" #include "mds_client.h" #include "cache.h" +#include "crypto.h" #include #include @@ -1129,6 +1130,8 @@ static int ceph_set_super(struct super_block *s, struct fs_context *fc) s->s_time_min = 0; s->s_time_max = U32_MAX; + ceph_fscrypt_set_ops(s); + ret = set_anon_super_fc(s, fc); if (ret != 0) fsc->sb = NULL; diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 9eaaab34baae..a4dc7b81a6a4 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -434,6 +434,13 @@ struct ceph_inode_info { struct work_struct i_work; unsigned long i_work_mask; + +#ifdef CONFIG_FS_ENCRYPTION + u32 fscrypt_auth_len; + u32 fscrypt_file_len; + u8 *fscrypt_auth; + u8 *fscrypt_file; +#endif }; static inline struct 
ceph_inode_info * @@ -1038,7 +1045,12 @@ static inline int ceph_do_getattr(struct inode *inode, int mask, bool force) } extern int ceph_permission(struct user_namespace *mnt_userns, struct inode *inode, int mask); -extern int __ceph_setattr(struct inode *inode, struct iattr *attr); + +struct ceph_iattr { + struct ceph_fscrypt_auth *fscrypt_auth; +}; + +extern int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *cia); extern int ceph_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, struct iattr *attr); extern int ceph_getattr(struct user_namespace *mnt_userns, diff --git a/include/linux/ceph/ceph_fs.h b/include/linux/ceph/ceph_fs.h index 86bf82dbd8b8..2810b214fa3f 100644 --- a/include/linux/ceph/ceph_fs.h +++ b/include/linux/ceph/ceph_fs.h @@ -359,14 +359,19 @@ enum { extern const char *ceph_mds_op_name(int op); - -#define CEPH_SETATTR_MODE 1 -#define CEPH_SETATTR_UID 2 -#define CEPH_SETATTR_GID 4 -#define CEPH_SETATTR_MTIME 8 -#define CEPH_SETATTR_ATIME 16 -#define CEPH_SETATTR_SIZE 32 -#define CEPH_SETATTR_CTIME 64 +#define CEPH_SETATTR_MODE (1 << 0) +#define CEPH_SETATTR_UID (1 << 1) +#define CEPH_SETATTR_GID (1 << 2) +#define CEPH_SETATTR_MTIME (1 << 3) +#define CEPH_SETATTR_ATIME (1 << 4) +#define CEPH_SETATTR_SIZE (1 << 5) +#define CEPH_SETATTR_CTIME (1 << 6) +#define CEPH_SETATTR_MTIME_NOW (1 << 7) +#define CEPH_SETATTR_ATIME_NOW (1 << 8) +#define CEPH_SETATTR_BTIME (1 << 9) +#define CEPH_SETATTR_KILL_SGUID (1 << 10) +#define CEPH_SETATTR_FSCRYPT_AUTH (1 << 11) +#define CEPH_SETATTR_FSCRYPT_FILE (1 << 12) /* * Ceph setxattr request flags. 
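Illustrative note, not part of the series: the length handling around struct ceph_fscrypt_auth in the patch above (a fixed header followed by a variable-length blob, with ceph_fscrypt_auth_len() returning header plus blob length) may be easier to follow as a small userspace sketch. Everything prefixed sketch_ or SKETCH_ is hypothetical and exists only for this illustration. Byte arrays stand in for the kernel's __le32 fields, and the maximum context size of 40 is an assumption about FSCRYPT_SET_CONTEXT_MAX_SIZE. The check that blob_len fits inside the received buffer is an extra guard added in the sketch; the patch as posted does not perform it.

```c
/* Userspace sketch only; names prefixed sketch_/SKETCH_ are hypothetical. */
#include <errno.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumption: stands in for FSCRYPT_SET_CONTEXT_MAX_SIZE. */
#define SKETCH_CONTEXT_MAX_SIZE 40
#define SKETCH_AUTH_VERSION 1

/* Byte arrays replace the kernel's __le32 fields, so there is no padding. */
struct sketch_fscrypt_auth {
	uint8_t version[4];
	uint8_t blob_len[4];
	uint8_t blob[SKETCH_CONTEXT_MAX_SIZE];
};

static void sketch_put_le32(uint8_t *p, uint32_t v)
{
	p[0] = (uint8_t)(v & 0xff);
	p[1] = (uint8_t)((v >> 8) & 0xff);
	p[2] = (uint8_t)((v >> 16) & 0xff);
	p[3] = (uint8_t)((v >> 24) & 0xff);
}

static uint32_t sketch_get_le32(const uint8_t *p)
{
	return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
	       ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Header plus the blob bytes in use, mirroring ceph_fscrypt_auth_len(). */
static size_t sketch_auth_len(const struct sketch_fscrypt_auth *fa)
{
	return offsetof(struct sketch_fscrypt_auth, blob) +
	       sketch_get_le32(fa->blob_len);
}

/* Build the structure from a raw context, roughly as set_context does. */
static int sketch_auth_fill(struct sketch_fscrypt_auth *fa,
			    const void *ctx, size_t len)
{
	if (len > SKETCH_CONTEXT_MAX_SIZE)
		return -EINVAL;

	memset(fa, 0, sizeof(*fa));
	sketch_put_le32(fa->version, SKETCH_AUTH_VERSION);
	sketch_put_le32(fa->blob_len, (uint32_t)len);
	memcpy(fa->blob, ctx, len);
	return 0;
}

/* Copy the context back out, roughly as get_context does. */
static int sketch_auth_get_context(const struct sketch_fscrypt_auth *fa,
				   size_t fa_len, void *out, size_t out_len)
{
	size_t hdr = offsetof(struct sketch_fscrypt_auth, blob);
	uint32_t ctxlen;

	/* Nonexistent or too short to hold even one blob byte? */
	if (!fa || fa_len < hdr + 1)
		return -ENOBUFS;

	/* Some format we don't recognize? */
	if (sketch_get_le32(fa->version) != SKETCH_AUTH_VERSION)
		return -ENOBUFS;

	ctxlen = sketch_get_le32(fa->blob_len);

	/* Extra guard, not in the patch: blob_len must fit the buffer. */
	if (ctxlen > fa_len - hdr)
		return -ENOBUFS;

	if (out_len < ctxlen)
		return -ERANGE;

	memcpy(out, fa->blob, ctxlen);
	return (int)ctxlen;
}
```

A caller would fill the structure with sketch_auth_fill(), send sketch_auth_len() bytes rather than sizeof(struct sketch_fscrypt_auth), and recover the context on the other side with sketch_auth_get_context(). Sending only the bytes in use is what lets the setattr path reject any length greater than the full structure size.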
From patchwork Tue Apr 5 19:19:45 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802370 From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: 
ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 14/59] ceph: ensure that we accept a new context from MDS for new inodes Date: Tue, 5 Apr 2022 15:19:45 -0400 Message-Id: <20220405192030.178326-15-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Signed-off-by: Jeff Layton --- fs/ceph/inode.c | 21 +++++++++++---------- 1 file changed, 11 insertions(+), 10 deletions(-) diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 2d9bade892cc..9a5641b37978 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -944,6 +944,17 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, __ceph_update_quota(ci, iinfo->max_bytes, iinfo->max_files); +#ifdef CONFIG_FS_ENCRYPTION + if (iinfo->fscrypt_auth_len && (inode->i_state & I_NEW)) { + kfree(ci->fscrypt_auth); + ci->fscrypt_auth_len = iinfo->fscrypt_auth_len; + ci->fscrypt_auth = iinfo->fscrypt_auth; + iinfo->fscrypt_auth = NULL; + iinfo->fscrypt_auth_len = 0; + inode_set_flags(inode, S_ENCRYPTED, S_ENCRYPTED); + } +#endif + if ((new_version || (new_issued & CEPH_CAP_AUTH_SHARED)) && (issued & CEPH_CAP_AUTH_EXCL) == 0) { inode->i_mode = mode; @@ -1033,16 +1044,6 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, xattr_blob = NULL; } -#ifdef CONFIG_FS_ENCRYPTION - if (iinfo->fscrypt_auth_len && !ci->fscrypt_auth) { - ci->fscrypt_auth_len = iinfo->fscrypt_auth_len; - ci->fscrypt_auth = iinfo->fscrypt_auth; - iinfo->fscrypt_auth = NULL; - iinfo->fscrypt_auth_len = 0; - inode_set_flags(inode, S_ENCRYPTED, S_ENCRYPTED); - } -#endif - /* finally update i_version */ if (le64_to_cpu(info->version) > ci->i_version) ci->i_version = le64_to_cpu(info->version); From patchwork Tue Apr 5 19:19:46 2022 
Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802324 From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, 
linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 15/59] ceph: add support for fscrypt_auth/fscrypt_file to cap messages Date: Tue, 5 Apr 2022 15:19:46 -0400 Message-Id: <20220405192030.178326-16-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add support for new version 12 cap messages that carry the new fscrypt_auth and fscrypt_file fields from the inode. Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 76 +++++++++++++++++++++++++++++++++++++++++--------- 1 file changed, 63 insertions(+), 13 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 042d4ca75253..3b31d77eb1ea 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -13,6 +13,7 @@ #include "super.h" #include "mds_client.h" #include "cache.h" +#include "crypto.h" #include #include @@ -1214,15 +1215,12 @@ struct cap_msg_args { umode_t mode; bool inline_data; bool wake; + u32 fscrypt_auth_len; + u32 fscrypt_file_len; + u8 fscrypt_auth[sizeof(struct ceph_fscrypt_auth)]; // for context + u8 fscrypt_file[sizeof(u64)]; // for size }; -/* - * cap struct size + flock buffer size + inline version + inline data size + - * osd_epoch_barrier + oldest_flush_tid - */ -#define CAP_MSG_SIZE (sizeof(struct ceph_mds_caps) + \ - 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4) - /* Marshal up the cap msg to the MDS */ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) { @@ -1238,7 +1236,7 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) arg->size, arg->max_size, arg->xattr_version, arg->xattr_buf ? 
(int)arg->xattr_buf->vec.iov_len : 0); - msg->hdr.version = cpu_to_le16(10); + msg->hdr.version = cpu_to_le16(12); msg->hdr.tid = cpu_to_le64(arg->flush_tid); fc = msg->front.iov_base; @@ -1309,6 +1307,21 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) /* Advisory flags (version 10) */ ceph_encode_32(&p, arg->flags); + + /* dirstats (version 11) - these are r/o on the client */ + ceph_encode_64(&p, 0); + ceph_encode_64(&p, 0); + +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) + /* fscrypt_auth and fscrypt_file (version 12) */ + ceph_encode_32(&p, arg->fscrypt_auth_len); + ceph_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len); + ceph_encode_32(&p, arg->fscrypt_file_len); + ceph_encode_copy(&p, arg->fscrypt_file, arg->fscrypt_file_len); +#else /* CONFIG_FS_ENCRYPTION */ + ceph_encode_32(&p, 0); + ceph_encode_32(&p, 0); +#endif /* CONFIG_FS_ENCRYPTION */ } /* @@ -1430,8 +1443,37 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, } } arg->flags = flags; +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) + if (ci->fscrypt_auth_len && + WARN_ON_ONCE(ci->fscrypt_auth_len > sizeof(struct ceph_fscrypt_auth))) { + /* Don't set this if it's too big */ + arg->fscrypt_auth_len = 0; + } else { + arg->fscrypt_auth_len = ci->fscrypt_auth_len; + memcpy(arg->fscrypt_auth, ci->fscrypt_auth, + min_t(size_t, ci->fscrypt_auth_len, sizeof(arg->fscrypt_auth))); + } + /* FIXME: use this to track "real" size */ + arg->fscrypt_file_len = 0; +#endif /* CONFIG_FS_ENCRYPTION */ } +#define CAP_MSG_FIXED_FIELDS (sizeof(struct ceph_mds_caps) + \ + 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4) + +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) +static inline int cap_msg_size(struct cap_msg_args *arg) +{ + return CAP_MSG_FIXED_FIELDS + arg->fscrypt_auth_len + + arg->fscrypt_file_len; +} +#else +static inline int cap_msg_size(struct cap_msg_args *arg) +{ + return CAP_MSG_FIXED_FIELDS; +} +#endif /* CONFIG_FS_ENCRYPTION */ + /* * Send a cap msg on the given 
inode. * @@ -1442,7 +1484,7 @@ static void __send_cap(struct cap_msg_args *arg, struct ceph_inode_info *ci) struct ceph_msg *msg; struct inode *inode = &ci->vfs_inode; - msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, CAP_MSG_SIZE, GFP_NOFS, false); + msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, cap_msg_size(arg), GFP_NOFS, false); if (!msg) { pr_err("error allocating cap msg: ino (%llx.%llx) flushing %s tid %llu, requeuing cap.\n", ceph_vinop(inode), ceph_cap_string(arg->dirty), @@ -1468,10 +1510,6 @@ static inline int __send_flush_snap(struct inode *inode, struct cap_msg_args arg; struct ceph_msg *msg; - msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, CAP_MSG_SIZE, GFP_NOFS, false); - if (!msg) - return -ENOMEM; - arg.session = session; arg.ino = ceph_vino(inode).ino; arg.cid = 0; @@ -1509,6 +1547,18 @@ static inline int __send_flush_snap(struct inode *inode, arg.flags = 0; arg.wake = false; + /* + * No fscrypt_auth changes from a capsnap. It will need + * to update fscrypt_file on size changes (TODO). 
+ */ + arg.fscrypt_auth_len = 0; + arg.fscrypt_file_len = 0; + + msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, cap_msg_size(&arg), + GFP_NOFS, false); + if (!msg) + return -ENOMEM; + encode_cap_msg(msg, &arg); ceph_con_send(&arg.session->s_con, msg); return 0; From patchwork Tue Apr 5 19:19:47 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802365 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D05AC43217 for ; Wed, 6 Apr 2022 04:16:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1448196AbiDFEPZ (ORCPT ); Wed, 6 Apr 2022 00:15:25 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36088 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573563AbiDETWr (ORCPT ); Tue, 5 Apr 2022 15:22:47 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2CFE5433A4; Tue, 5 Apr 2022 12:20:48 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id AADBD617EE; Tue, 5 Apr 2022 19:20:47 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 73957C385A5; Tue, 5 Apr 2022 19:20:46 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186447; bh=r0OtV7zrO0G8qThQ0L1dySiLYozkfYUKRkAcdp3oD4Q=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=OKqjUXCvrijoB+jiisckXjEYsGo54MhlaqlroMgX62byZwQ6+KxdTDv5IXYKK+KRh JRwQtheqXKv4T9xYIMCmbP5RhGKe2MaS1sfB2htFFUpGDzMB22m27kqJ/CZlygbCnl 
agSOhn8lujj4wLAsC5vhF3OcCVoeTqbiCcji3JmBqMjoKxLU+dtv/rQyDVOs+qd9q/ h4Cp7k1gurXeDDs/OKNtfYs+Lkld+l5X89acKM2UC6XQNTMrKYX3uPPCyDq2NY6SXh t3rTJ6IP9zhVG3QbVOJb8ULK0F2RpvD9ne6sLKiZRmUqS4s8V6haiYBKpNYdaKUTMB O6onAI+3MmLGw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 16/59] ceph: implement -o test_dummy_encryption mount option Date: Tue, 5 Apr 2022 15:19:47 -0400 Message-Id: <20220405192030.178326-17-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add support for the test_dummy_encryption mount option. This allows us to test the encrypted codepaths in ceph without having to manually set keys, etc. Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 53 ++++++++++++++++++++++++++++++++ fs/ceph/crypto.h | 26 ++++++++++++++++ fs/ceph/inode.c | 10 ++++-- fs/ceph/super.c | 80 ++++++++++++++++++++++++++++++++++++++++++++++-- fs/ceph/super.h | 10 +++++- fs/ceph/xattr.c | 3 ++ 6 files changed, 176 insertions(+), 6 deletions(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index a513ff373b13..1c34b8ed1266 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -4,6 +4,7 @@ #include #include "super.h" +#include "mds_client.h" #include "crypto.h" static int ceph_crypt_get_context(struct inode *inode, void *ctx, size_t len) @@ -64,9 +65,15 @@ static bool ceph_crypt_empty_dir(struct inode *inode) return ci->i_rsubdirs + ci->i_rfiles == 1; } +static const union fscrypt_policy *ceph_get_dummy_policy(struct super_block *sb) +{ + return ceph_sb_to_client(sb)->dummy_enc_policy.policy; +} + static struct fscrypt_operations ceph_fscrypt_ops = { .get_context = ceph_crypt_get_context, .set_context = 
ceph_crypt_set_context, + .get_dummy_policy = ceph_get_dummy_policy, .empty_dir = ceph_crypt_empty_dir, }; @@ -74,3 +81,49 @@ void ceph_fscrypt_set_ops(struct super_block *sb) { fscrypt_set_ops(sb, &ceph_fscrypt_ops); } + +void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc) +{ + fscrypt_free_dummy_policy(&fsc->dummy_enc_policy); +} + +int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, + struct ceph_acl_sec_ctx *as) +{ + int ret, ctxsize; + bool encrypted = false; + struct ceph_inode_info *ci = ceph_inode(inode); + + ret = fscrypt_prepare_new_inode(dir, inode, &encrypted); + if (ret) + return ret; + if (!encrypted) + return 0; + + as->fscrypt_auth = kzalloc(sizeof(*as->fscrypt_auth), GFP_KERNEL); + if (!as->fscrypt_auth) + return -ENOMEM; + + ctxsize = fscrypt_context_for_new_inode(as->fscrypt_auth->cfa_blob, inode); + if (ctxsize < 0) + return ctxsize; + + as->fscrypt_auth->cfa_version = cpu_to_le32(CEPH_FSCRYPT_AUTH_VERSION); + as->fscrypt_auth->cfa_blob_len = cpu_to_le32(ctxsize); + + WARN_ON_ONCE(ci->fscrypt_auth); + kfree(ci->fscrypt_auth); + ci->fscrypt_auth_len = ceph_fscrypt_auth_len(as->fscrypt_auth); + ci->fscrypt_auth = kmemdup(as->fscrypt_auth, ci->fscrypt_auth_len, GFP_KERNEL); + if (!ci->fscrypt_auth) + return -ENOMEM; + + inode->i_flags |= S_ENCRYPTED; + + return 0; +} + +void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as) +{ + swap(req->r_fscrypt_auth, as->fscrypt_auth); +} diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 6dca674f79b8..cb00fe42d5b7 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -8,6 +8,10 @@ #include +struct ceph_fs_client; +struct ceph_acl_sec_ctx; +struct ceph_mds_request; + struct ceph_fscrypt_auth { __le32 cfa_version; __le32 cfa_blob_len; @@ -25,12 +29,34 @@ static inline u32 ceph_fscrypt_auth_len(struct ceph_fscrypt_auth *fa) #ifdef CONFIG_FS_ENCRYPTION void ceph_fscrypt_set_ops(struct super_block *sb); +void 
ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc); + +int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, + struct ceph_acl_sec_ctx *as); +void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as); + #else /* CONFIG_FS_ENCRYPTION */ static inline void ceph_fscrypt_set_ops(struct super_block *sb) { } +static inline void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc) +{ +} + +static inline int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, + struct ceph_acl_sec_ctx *as) +{ + if (IS_ENCRYPTED(dir)) + return -EOPNOTSUPP; + return 0; +} + +static inline void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, + struct ceph_acl_sec_ctx *as_ctx) +{ +} #endif /* CONFIG_FS_ENCRYPTION */ #endif diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 9a5641b37978..4cbc303730ef 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -83,12 +83,17 @@ struct inode *ceph_new_inode(struct inode *dir, struct dentry *dentry, goto out_err; } + inode->i_state = 0; + inode->i_mode = *mode; + err = ceph_security_init_secctx(dentry, *mode, as_ctx); if (err < 0) goto out_err; - inode->i_state = 0; - inode->i_mode = *mode; + err = ceph_fscrypt_prepare_context(dir, inode, as_ctx); + if (err) + goto out_err; + return inode; out_err: iput(inode); @@ -101,6 +106,7 @@ void ceph_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *a req->r_pagelist = as_ctx->pagelist; as_ctx->pagelist = NULL; } + ceph_fscrypt_as_ctx_to_req(req, as_ctx); } /** diff --git a/fs/ceph/super.c b/fs/ceph/super.c index 1f52fc5b89bd..a1f921d5675d 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -47,6 +47,7 @@ static void ceph_put_super(struct super_block *s) struct ceph_fs_client *fsc = ceph_sb_to_client(s); dout("put_super\n"); + ceph_fscrypt_free_dummy_policy(fsc); ceph_mdsc_close_sessions(fsc->mdsc); } @@ -150,6 +151,7 @@ enum { Opt_recover_session, Opt_source, Opt_mon_addr, + 
Opt_test_dummy_encryption, /* string args above */ Opt_dirstat, Opt_rbytes, @@ -192,6 +194,7 @@ static const struct fs_parameter_spec ceph_mount_parameters[] = { fsparam_string ("fsc", Opt_fscache), // fsc=... fsparam_flag_no ("ino32", Opt_ino32), fsparam_string ("mds_namespace", Opt_mds_namespace), + fsparam_string ("mon_addr", Opt_mon_addr), fsparam_flag_no ("poolperm", Opt_poolperm), fsparam_flag_no ("quotadf", Opt_quotadf), fsparam_u32 ("rasize", Opt_rasize), @@ -203,7 +206,8 @@ static const struct fs_parameter_spec ceph_mount_parameters[] = { fsparam_u32 ("rsize", Opt_rsize), fsparam_string ("snapdirname", Opt_snapdirname), fsparam_string ("source", Opt_source), - fsparam_string ("mon_addr", Opt_mon_addr), + fsparam_flag ("test_dummy_encryption", Opt_test_dummy_encryption), + fsparam_string ("test_dummy_encryption", Opt_test_dummy_encryption), fsparam_u32 ("wsize", Opt_wsize), fsparam_flag_no ("wsync", Opt_wsync), fsparam_flag_no ("pagecache", Opt_pagecache), @@ -583,6 +587,17 @@ static int ceph_parse_mount_param(struct fs_context *fc, else fsopt->flags |= CEPH_MOUNT_OPT_SPARSEREAD; break; + case Opt_test_dummy_encryption: +#ifdef CONFIG_FS_ENCRYPTION + kfree(fsopt->test_dummy_encryption); + fsopt->test_dummy_encryption = param->string; + param->string = NULL; + fsopt->flags |= CEPH_MOUNT_OPT_TEST_DUMMY_ENC; +#else + warnfc(fc, + "FS encryption not supported: test_dummy_encryption mount option ignored"); +#endif + break; default: BUG(); } @@ -603,6 +618,7 @@ static void destroy_mount_options(struct ceph_mount_options *args) kfree(args->server_path); kfree(args->fscache_uniq); kfree(args->mon_addr); + kfree(args->test_dummy_encryption); kfree(args); } @@ -722,6 +738,8 @@ static int ceph_show_options(struct seq_file *m, struct dentry *root) if (fsopt->flags & CEPH_MOUNT_OPT_SPARSEREAD) seq_puts(m, ",sparseread"); + fscrypt_show_test_dummy_encryption(m, ',', root->d_sb); + if (fsopt->wsize != CEPH_MAX_WRITE_SIZE) seq_printf(m, ",wsize=%u", fsopt->wsize); if 
(fsopt->rsize != CEPH_MAX_READ_SIZE) @@ -1057,6 +1075,52 @@ static struct dentry *open_root_dentry(struct ceph_fs_client *fsc, return root; } +#ifdef CONFIG_FS_ENCRYPTION +static int ceph_set_test_dummy_encryption(struct super_block *sb, struct fs_context *fc, + struct ceph_mount_options *fsopt) +{ + /* + * No changing encryption context on remount. Note that + * fscrypt_set_test_dummy_encryption will validate the version + * string passed in (if any). + */ + if (fsopt->flags & CEPH_MOUNT_OPT_TEST_DUMMY_ENC) { + struct ceph_fs_client *fsc = sb->s_fs_info; + int err = 0; + + if (fc->purpose == FS_CONTEXT_FOR_RECONFIGURE && !fsc->dummy_enc_policy.policy) { + errorfc(fc, "Can't set test_dummy_encryption on remount"); + return -EEXIST; + } + + err = fscrypt_set_test_dummy_encryption(sb, + fsc->mount_options->test_dummy_encryption, + &fsc->dummy_enc_policy); + if (err) { + if (err == -EEXIST) + errorfc(fc, "Can't change test_dummy_encryption on remount"); + else if (err == -EINVAL) + errorfc(fc, "Value of option \"%s\" is unrecognized", + fsc->mount_options->test_dummy_encryption); + else + errorfc(fc, "Error processing option \"%s\" [%d]", + fsc->mount_options->test_dummy_encryption, err); + return err; + } + warnfc(fc, "test_dummy_encryption mode enabled"); + } + return 0; +} +#else +static inline int ceph_set_test_dummy_encryption(struct super_block *sb, struct fs_context *fc, + struct ceph_mount_options *fsopt) +{ + if (fsopt->flags & CEPH_MOUNT_OPT_TEST_DUMMY_ENC) + warnfc(fc, "test_dummy_encryption mode ignored"); + return 0; +} +#endif + /* * mount: join the ceph cluster, and open root directory. 
*/ @@ -1085,6 +1149,10 @@ static struct dentry *ceph_real_mount(struct ceph_fs_client *fsc, goto out; } + err = ceph_set_test_dummy_encryption(fsc->sb, fc, fsc->mount_options); + if (err) + goto out; + dout("mount opening path '%s'\n", path); ceph_fs_debugfs_init(fsc); @@ -1293,9 +1361,15 @@ static void ceph_free_fc(struct fs_context *fc) static int ceph_reconfigure_fc(struct fs_context *fc) { + int err; struct ceph_parse_opts_ctx *pctx = fc->fs_private; struct ceph_mount_options *fsopt = pctx->opts; - struct ceph_fs_client *fsc = ceph_sb_to_client(fc->root->d_sb); + struct super_block *sb = fc->root->d_sb; + struct ceph_fs_client *fsc = ceph_sb_to_client(sb); + + err = ceph_set_test_dummy_encryption(sb, fc, fsopt); + if (err) + return err; if (fsopt->flags & CEPH_MOUNT_OPT_ASYNC_DIROPS) ceph_set_mount_opt(fsc, ASYNC_DIROPS); @@ -1314,7 +1388,7 @@ static int ceph_reconfigure_fc(struct fs_context *fc) pr_notice("ceph: monitor addresses recorded, but not used for reconnection"); } - sync_filesystem(fc->root->d_sb); + sync_filesystem(sb); return 0; } diff --git a/fs/ceph/super.h b/fs/ceph/super.h index a4dc7b81a6a4..a97a6f6f3089 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -21,6 +21,7 @@ #include #include +#include "crypto.h" /* large granularity for statfs utilization stats to facilitate * large volume sizes on 32-bit machines. 
*/ @@ -42,6 +43,7 @@ #define CEPH_MOUNT_OPT_ASYNC_DIROPS (1<<15) /* allow async directory ops */ #define CEPH_MOUNT_OPT_NOPAGECACHE (1<<16) /* bypass pagecache altogether */ #define CEPH_MOUNT_OPT_SPARSEREAD (1<<17) /* always do sparse reads */ +#define CEPH_MOUNT_OPT_TEST_DUMMY_ENC (1<<18) /* enable dummy encryption (for testing) */ #define CEPH_MOUNT_OPT_DEFAULT \ (CEPH_MOUNT_OPT_DCACHE | \ @@ -98,6 +100,7 @@ struct ceph_mount_options { char *server_path; /* default NULL (means "/") */ char *fscache_uniq; /* default NULL */ char *mon_addr; + char *test_dummy_encryption; /* default NULL */ }; struct ceph_fs_client { @@ -138,9 +141,11 @@ struct ceph_fs_client { #ifdef CONFIG_CEPH_FSCACHE struct fscache_volume *fscache; #endif +#ifdef CONFIG_FS_ENCRYPTION + struct fscrypt_dummy_policy dummy_enc_policy; +#endif }; - /* * File i/o capability. This tracks shared state with the metadata * server that allows us to cache or writeback attributes or to read @@ -1084,6 +1089,9 @@ struct ceph_acl_sec_ctx { #ifdef CONFIG_CEPH_FS_SECURITY_LABEL void *sec_ctx; u32 sec_ctxlen; +#endif +#ifdef CONFIG_FS_ENCRYPTION + struct ceph_fscrypt_auth *fscrypt_auth; #endif struct ceph_pagelist *pagelist; }; diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index 8c2dc2c762a4..58628cef4207 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -1397,6 +1397,9 @@ void ceph_release_acl_sec_ctx(struct ceph_acl_sec_ctx *as_ctx) #endif #ifdef CONFIG_CEPH_FS_SECURITY_LABEL security_release_secctx(as_ctx->sec_ctx, as_ctx->sec_ctxlen); +#endif +#ifdef CONFIG_FS_ENCRYPTION + kfree(as_ctx->fscrypt_auth); #endif if (as_ctx->pagelist) ceph_pagelist_release(as_ctx->pagelist); From patchwork Tue Apr 5 19:19:48 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802387 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org 
(vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 9B7FCC433EF for ; Wed, 6 Apr 2022 04:18:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1840363AbiDFEQx (ORCPT ); Wed, 6 Apr 2022 00:16:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36104 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573565AbiDETWr (ORCPT ); Tue, 5 Apr 2022 15:22:47 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0685E41996; Tue, 5 Apr 2022 12:20:49 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 98746616C5; Tue, 5 Apr 2022 19:20:48 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5C95FC385A0; Tue, 5 Apr 2022 19:20:47 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186448; bh=ShhHqYfBpf9JsbMiXI8G2iZFROdQeYH5+nM3VSH0nCI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=dKHViOiZAF8ltnCy75v1geQHPUV3YPHu0rqq8dpG7YpkQs61r73lUO9MHKA/dA5Ko vrpOoJP7fgGvLjlJTaYGbR7CQ8IRQvPC+N3DVZd6guEK8PmgfVB8DocSpjjPx0krPa R0DBwmFqMBBu75AWDTpNRyG2oxYljfYQ62sWfo+AXOEWuzgqqIIEVjKvEuv7h8Z9hR lp7eIdcCsZh6wZjN1B61iDKAwVkAAi+Gv/WOGRfdeCkynnTulGP0cWJufpuHaXI2gg Gl1IQRunszg/NxN8kWmKpLZIZByHcPEO5qh1of3y7yVa8gEQc65CigL426u+LAkK6E y1OLcnVKDWgSw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 17/59] ceph: decode alternate_name in lease info Date: Tue, 5 Apr 2022 15:19:48 -0400 Message-Id: <20220405192030.178326-18-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: 
<20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Ceph differs from local filesystems in that we don't want to store filenames as raw binary data, since we may also be dealing with clients that don't support fscrypt. We could just base64-encode the encrypted filenames, but that could leave us with filenames longer than NAME_MAX. It turns out that the MDS doesn't care much about filename length, but the clients do. To manage this, we've added a new "alternate name" field that can optionally be attached to any dentry. We'll use it to store the binary ciphertext of the filename if its base64-encoded value would be longer than NAME_MAX. When a dentry has one of these names attached, the MDS will send it along in the lease info, which we can then store for later use. Signed-off-by: Jeff Layton Signed-off-by: Xiubo Li --- fs/ceph/mds_client.c | 43 +++++++++++++++++++++++++++++++++---------- fs/ceph/mds_client.h | 11 +++++++---- 2 files changed, 40 insertions(+), 14 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index dcb800675dec..2f2f8221eb25 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -308,27 +308,47 @@ static int parse_reply_info_dir(void **p, void *end, static int parse_reply_info_lease(void **p, void *end, struct ceph_mds_reply_lease **lease, - u64 features) + u64 features, u32 *altname_len, u8 **altname) { + u8 struct_v; + u32 struct_len; + void *lend; + if (features == (u64)-1) { - u8 struct_v, struct_compat; - u32 struct_len; + u8 struct_compat; + ceph_decode_8_safe(p, end, struct_v, bad); ceph_decode_8_safe(p, end, struct_compat, bad); + /* struct_v is expected to be >= 1. we only understand * encoding whose struct_compat == 1. 
*/ if (!struct_v || struct_compat != 1) goto bad; + ceph_decode_32_safe(p, end, struct_len, bad); - ceph_decode_need(p, end, struct_len, bad); - end = *p + struct_len; + } else { + struct_len = sizeof(**lease); + *altname_len = 0; + *altname = NULL; } - ceph_decode_need(p, end, sizeof(**lease), bad); + lend = *p + struct_len; + ceph_decode_need(p, end, struct_len, bad); *lease = *p; *p += sizeof(**lease); - if (features == (u64)-1) - *p = end; + + if (features == (u64)-1) { + if (struct_v >= 2) { + ceph_decode_32_safe(p, end, *altname_len, bad); + ceph_decode_need(p, end, *altname_len, bad); + *altname = *p; + *p += *altname_len; + } else { + *altname = NULL; + *altname_len = 0; + } + } + *p = lend; return 0; bad: return -EIO; @@ -358,7 +378,8 @@ static int parse_reply_info_trace(void **p, void *end, info->dname = *p; *p += info->dname_len; - err = parse_reply_info_lease(p, end, &info->dlease, features); + err = parse_reply_info_lease(p, end, &info->dlease, features, + &info->altname_len, &info->altname); if (err < 0) goto out_bad; } @@ -425,9 +446,11 @@ static int parse_reply_info_readdir(void **p, void *end, dout("parsed dir dname '%.*s'\n", rde->name_len, rde->name); /* dentry lease */ - err = parse_reply_info_lease(p, end, &rde->lease, features); + err = parse_reply_info_lease(p, end, &rde->lease, features, + &rde->altname_len, &rde->altname); if (err) goto out_bad; + /* inode */ err = parse_reply_info_in(p, end, &rde->inode, features); if (err < 0) diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index aab3ab284fce..2cc75f9ae7c7 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -29,8 +29,8 @@ enum ceph_feature_type { CEPHFS_FEATURE_MULTI_RECONNECT, CEPHFS_FEATURE_DELEG_INO, CEPHFS_FEATURE_METRIC_COLLECT, - - CEPHFS_FEATURE_MAX = CEPHFS_FEATURE_METRIC_COLLECT, + CEPHFS_FEATURE_ALTERNATE_NAME, + CEPHFS_FEATURE_MAX = CEPHFS_FEATURE_ALTERNATE_NAME, }; /* @@ -45,8 +45,7 @@ enum ceph_feature_type { CEPHFS_FEATURE_MULTI_RECONNECT, \ 
CEPHFS_FEATURE_DELEG_INO, \ CEPHFS_FEATURE_METRIC_COLLECT, \ - \ - CEPHFS_FEATURE_MAX, \ + CEPHFS_FEATURE_ALTERNATE_NAME, \ } #define CEPHFS_FEATURES_CLIENT_REQUIRED {} @@ -98,7 +97,9 @@ struct ceph_mds_reply_info_in { struct ceph_mds_reply_dir_entry { char *name; + u8 *altname; u32 name_len; + u32 altname_len; struct ceph_mds_reply_lease *lease; struct ceph_mds_reply_info_in inode; loff_t offset; @@ -122,7 +123,9 @@ struct ceph_mds_reply_info_parsed { struct ceph_mds_reply_info_in diri, targeti; struct ceph_mds_reply_dirfrag *dirfrag; char *dname; + u8 *altname; u32 dname_len; + u32 altname_len; struct ceph_mds_reply_lease *dlease; struct ceph_mds_reply_xattr xattr_info; From patchwork Tue Apr 5 19:19:49 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802359 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1D10AC433EF for ; Wed, 6 Apr 2022 04:16:37 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1445098AbiDFEOg (ORCPT ); Wed, 6 Apr 2022 00:14:36 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36310 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573567AbiDETWu (ORCPT ); Tue, 5 Apr 2022 15:22:50 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A758D45510; Tue, 5 Apr 2022 12:20:51 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 518B4B81FA5; Tue, 5 Apr 2022 19:20:50 +0000 (UTC) Received: by smtp.kernel.org 
(Postfix) with ESMTPSA id 45723C385A3; Tue, 5 Apr 2022 19:20:48 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186448; bh=qCvm12JIx5v8brhsRRqgAaDUmWNtAuToWb1JQhAYbW8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=SwxrTA3uCIbL4tA8yMfT0cPD7wl3GuI5sbVrqhRtENSJOcP1NYHkJRszgzmrzPND6 ve67RItoO9TPfHiRp/4p8EPqGMD+sDixjhWof0O7CbrcwBep/lPQCaae6wq/ha7EaH 8oKX6xtGJa6077OXqCSpnBvHuQZf1yfXj5QlKVzih2pTctRh5gZrsc83lk5QcxaLfD 9iK81wj+QBSLsLRf4FvTPm9ssgjs7El6nAgpvI6697BL1KVKPPrXcNMfqG2ph8a2bT EY4hKqlmPksCW1f08QkuvY+ZRTBZcFjlHDd9zdppS5JRB1i/xHv7hLpKxgwAySTcOB 1Bmgy3s3OV5RQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 18/59] ceph: add fscrypt ioctls Date: Tue, 5 Apr 2022 15:19:49 -0400 Message-Id: <20220405192030.178326-19-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org We gate most of the ioctls on MDS feature support. The exceptions are the key removal and key status ioctls, which we still want to work if the MDSs were to (inexplicably) lose the feature. For the set_policy ioctl, we take Fs caps to ensure that nothing can create files in the directory while the ioctl is running. That should be enough to ensure that the "empty_dir" check is reliable.
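For illustration only, here is a userspace sketch of how a caller might tell these cases apart. It is not part of this patch: describe_policy_errno() and query_policy() are hypothetical helper names, and the errno meanings are assumed from the generic fscrypt documentation together with the -EOPNOTSUPP that vet_mds_for_fscrypt() returns in this patch.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/fscrypt.h>

/* Hypothetical helper: map the errno from a policy query to a short label. */
static const char *describe_policy_errno(int err)
{
	switch (err) {
	case 0:
		return "encrypted";
	case ENODATA:
		return "not encrypted";
	case EOPNOTSUPP:
		/* e.g. the MDS session lacks the alternate-name feature */
		return "not supported";
	case ENOTTY:
		/* the filesystem does not implement the fscrypt ioctls */
		return "no fscrypt ioctls";
	default:
		return "other error";
	}
}

/* Hypothetical helper: returns 0 or a positive errno value. */
static int query_policy(const char *path)
{
	struct fscrypt_get_policy_ex_arg arg;
	int fd, err = 0;

	assert(path != NULL);
	fd = open(path, O_RDONLY);
	if (fd < 0)
		return errno;

	memset(&arg, 0, sizeof(arg));
	arg.policy_size = sizeof(arg.policy);
	if (ioctl(fd, FS_IOC_GET_ENCRYPTION_POLICY_EX, &arg) < 0)
		err = errno;
	close(fd);
	return err;
}
```

A caller would pass the result of query_policy() on a directory to describe_policy_errno(). The key removal and key status ioctls are not gated, so they would not be expected to hit the "not supported" case because of a missing MDS feature.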
Signed-off-by: Jeff Layton --- fs/ceph/ioctl.c | 83 +++++++++++++++++++++++++++++++++++++++++++++++++ 1 file changed, 83 insertions(+) diff --git a/fs/ceph/ioctl.c b/fs/ceph/ioctl.c index 6e061bf62ad4..477ecc667aee 100644 --- a/fs/ceph/ioctl.c +++ b/fs/ceph/ioctl.c @@ -6,6 +6,7 @@ #include "mds_client.h" #include "ioctl.h" #include +#include /* * ioctls @@ -268,8 +269,54 @@ static long ceph_ioctl_syncio(struct file *file) return 0; } +static int vet_mds_for_fscrypt(struct file *file) +{ + int i, ret = -EOPNOTSUPP; + struct ceph_mds_client *mdsc = ceph_sb_to_mdsc(file_inode(file)->i_sb); + + mutex_lock(&mdsc->mutex); + for (i = 0; i < mdsc->max_sessions; i++) { + struct ceph_mds_session *s = mdsc->sessions[i]; + + if (!s) + continue; + if (test_bit(CEPHFS_FEATURE_ALTERNATE_NAME, &s->s_features)) + ret = 0; + break; + } + mutex_unlock(&mdsc->mutex); + return ret; +} + +static long ceph_set_encryption_policy(struct file *file, unsigned long arg) +{ + int ret, got = 0; + struct inode *inode = file_inode(file); + struct ceph_inode_info *ci = ceph_inode(inode); + + ret = vet_mds_for_fscrypt(file); + if (ret) + return ret; + + /* + * Ensure we hold these caps so that we _know_ that the rstats check + * in the empty_dir check is reliable. 
+ */ + ret = ceph_get_caps(file, CEPH_CAP_FILE_SHARED, 0, -1, &got); + if (ret) + return ret; + + ret = fscrypt_ioctl_set_policy(file, (const void __user *)arg); + if (got) + ceph_put_cap_refs(ci, got); + + return ret; +} + long ceph_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { + int ret; + dout("ioctl file %p cmd %u arg %lu\n", file, cmd, arg); switch (cmd) { case CEPH_IOC_GET_LAYOUT: @@ -289,6 +336,42 @@ long ceph_ioctl(struct file *file, unsigned int cmd, unsigned long arg) case CEPH_IOC_SYNCIO: return ceph_ioctl_syncio(file); + + case FS_IOC_SET_ENCRYPTION_POLICY: + return ceph_set_encryption_policy(file, arg); + + case FS_IOC_GET_ENCRYPTION_POLICY: + ret = vet_mds_for_fscrypt(file); + if (ret) + return ret; + return fscrypt_ioctl_get_policy(file, (void __user *)arg); + + case FS_IOC_GET_ENCRYPTION_POLICY_EX: + ret = vet_mds_for_fscrypt(file); + if (ret) + return ret; + return fscrypt_ioctl_get_policy_ex(file, (void __user *)arg); + + case FS_IOC_ADD_ENCRYPTION_KEY: + ret = vet_mds_for_fscrypt(file); + if (ret) + return ret; + return fscrypt_ioctl_add_key(file, (void __user *)arg); + + case FS_IOC_REMOVE_ENCRYPTION_KEY: + return fscrypt_ioctl_remove_key(file, (void __user *)arg); + + case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS: + return fscrypt_ioctl_remove_key_all_users(file, (void __user *)arg); + + case FS_IOC_GET_ENCRYPTION_KEY_STATUS: + return fscrypt_ioctl_get_key_status(file, (void __user *)arg); + + case FS_IOC_GET_ENCRYPTION_NONCE: + ret = vet_mds_for_fscrypt(file); + if (ret) + return ret; + return fscrypt_ioctl_get_nonce(file, (void __user *)arg); } return -ENOTTY; From patchwork Tue Apr 5 19:19:50 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802349 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) 
by smtp.lore.kernel.org (Postfix) with ESMTP id 0FE99C433F5 for ; Wed, 6 Apr 2022 04:16:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1381361AbiDFEM6 (ORCPT ); Wed, 6 Apr 2022 00:12:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36234 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573566AbiDETWt (ORCPT ); Tue, 5 Apr 2022 15:22:49 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id CDA414475E; Tue, 5 Apr 2022 12:20:50 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 6AC90616C5; Tue, 5 Apr 2022 19:20:50 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2F7E8C385A0; Tue, 5 Apr 2022 19:20:49 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186449; bh=Xr5WwpOdJDLAwJ1HnaAMUsOMgwdZ2fjReZF8iNdzeus=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=t6ktD3Gf8m3isqykAADxjB4s7e/ZmGzmA1FScx37me5fgYi0+ZbEBJ8tJ8FfeF6cw eaazxKuv0r6cplyinrSmRVjmIRNo1dV3DKgvXiiYtJfZIMgohom5VSu7DLIBy5l07v Q6hZLEMyHXF77eePB7xumNy7eF8TIvywCpWcCWqjlYtxc4QzUgUodtFLSLg1U3Z1RV X4fZXPuwnwLcCiOy+PVrt7o0Na88BpZNuQJ8XgoBbIx/17rvFz2H3Mx13SjEdqt5dD 5zo/LKcQa1h1BfFlnbiy2aDHZLbWtyhuZ9nVYV35LA3O8RNchBhb7KcUfJaEpUGilz hPr8q0VEzR44A== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 19/59] ceph: make the ioctl cmd more readable in debug log Date: Tue, 5 Apr 2022 15:19:50 -0400 Message-Id: <20220405192030.178326-20-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: 
<20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li ioctl file 0000000004e6b054 cmd 2148296211 arg 824635143532 The numerical cmd value in the ioctl debug log message is too hard to understand, even when you look it up in the code. Make it more readable. Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/ioctl.c | 39 ++++++++++++++++++++++++++++++++++++++- 1 file changed, 38 insertions(+), 1 deletion(-) diff --git a/fs/ceph/ioctl.c b/fs/ceph/ioctl.c index 477ecc667aee..b9f0f4e460ab 100644 --- a/fs/ceph/ioctl.c +++ b/fs/ceph/ioctl.c @@ -313,11 +313,48 @@ static long ceph_set_encryption_policy(struct file *file, unsigned long arg) return ret; } +static const char *ceph_ioctl_cmd_name(const unsigned int cmd) +{ + switch (cmd) { + case CEPH_IOC_GET_LAYOUT: + return "get_layout"; + case CEPH_IOC_SET_LAYOUT: + return "set_layout"; + case CEPH_IOC_SET_LAYOUT_POLICY: + return "set_layout_policy"; + case CEPH_IOC_GET_DATALOC: + return "get_dataloc"; + case CEPH_IOC_LAZYIO: + return "lazyio"; + case CEPH_IOC_SYNCIO: + return "syncio"; + case FS_IOC_SET_ENCRYPTION_POLICY: + return "set_encryption_policy"; + case FS_IOC_GET_ENCRYPTION_POLICY: + return "get_encryption_policy"; + case FS_IOC_GET_ENCRYPTION_POLICY_EX: + return "get_encryption_policy_ex"; + case FS_IOC_ADD_ENCRYPTION_KEY: + return "add_encryption_key"; + case FS_IOC_REMOVE_ENCRYPTION_KEY: + return "remove_encryption_key"; + case FS_IOC_REMOVE_ENCRYPTION_KEY_ALL_USERS: + return "remove_encryption_key_all_users"; + case FS_IOC_GET_ENCRYPTION_KEY_STATUS: + return "get_encryption_key_status"; + case FS_IOC_GET_ENCRYPTION_NONCE: + return "get_encryption_nonce"; + default: + return "unknown"; + } +} + long ceph_ioctl(struct file *file, unsigned int cmd, unsigned long arg) { int ret; - dout("ioctl file %p cmd %u arg %lu\n", file, cmd, arg); + dout("ioctl file 
%p cmd %s arg %lu\n", file, + ceph_ioctl_cmd_name(cmd), arg); switch (cmd) { case CEPH_IOC_GET_LAYOUT: return ceph_ioctl_get_layout(file, (void __user *)arg); From patchwork Tue Apr 5 19:19:51 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802325 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4255AC433F5 for ; Wed, 6 Apr 2022 04:01:59 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S245197AbiDFEDu (ORCPT ); Wed, 6 Apr 2022 00:03:50 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36512 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573570AbiDETWw (ORCPT ); Tue, 5 Apr 2022 15:22:52 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [145.40.73.55]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 0D4AA47050; Tue, 5 Apr 2022 12:20:54 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 4AA77CE1FB6; Tue, 5 Apr 2022 19:20:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 18778C385A1; Tue, 5 Apr 2022 19:20:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186450; bh=4gK8ZAn+TacGcAjwCDK5lhYYnjJ7LL+dT0yZHFaadd0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=h3vPiIxfkK6OiYb/165G98js1mP4pade1hlyPqwCLOvWG47or0gv3JuZz2PzrhdeW LEGz361/FC7O/xfSPkIV/SJG0WEwEk98VAFZ3Tvk2MFbkwJsljtP1Tueh4ji9g/UpZ WfOKEqy9nSA4+mDWZl+mGCxzFyeYrQSYLvFbhEtBnk2ctUsiJsC+N63QUwbmi9YTdr C7kv5ohRf44g1IoriEV3OgQVzt7/5bWnQ2RpSESLGg1L9erv90p18H0cJ33Sm+/FyI 
LuATH4VtQJt47boufQaIYAOWq3jdd/BnQHnHQdHz8hD3byTOgfG5bC61qqfGM+MnQ3 aaDVyC2xxMFgw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 20/59] ceph: make ceph_mdsc_build_path use ref-walk Date: Tue, 5 Apr 2022 15:19:51 -0400 Message-Id: <20220405192030.178326-21-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Encryption potentially requires allocation, at which point we'll need to be in a non-atomic context. Convert ceph_mdsc_build_path to take dentry spinlocks and references instead of using rcu_read_lock to walk the path. This is slightly less efficient, and we may want to eventually allow using RCU when the leaf dentry isn't encrypted.
Signed-off-by: Jeff Layton --- fs/ceph/mds_client.c | 35 +++++++++++++++++++---------------- 1 file changed, 19 insertions(+), 16 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 2f2f8221eb25..28315053e116 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2398,7 +2398,8 @@ static inline u64 __get_oldest_tid(struct ceph_mds_client *mdsc) char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int stop_on_nosnap) { - struct dentry *temp; + struct dentry *cur; + struct inode *inode; char *path; int pos; unsigned seq; @@ -2415,34 +2416,35 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, path[pos] = '\0'; seq = read_seqbegin(&rename_lock); - rcu_read_lock(); - temp = dentry; + cur = dget(dentry); for (;;) { - struct inode *inode; + struct dentry *temp; - spin_lock(&temp->d_lock); - inode = d_inode(temp); + spin_lock(&cur->d_lock); + inode = d_inode(cur); if (inode && ceph_snap(inode) == CEPH_SNAPDIR) { dout("build_path path+%d: %p SNAPDIR\n", - pos, temp); - } else if (stop_on_nosnap && inode && dentry != temp && + pos, cur); + } else if (stop_on_nosnap && inode && dentry != cur && ceph_snap(inode) == CEPH_NOSNAP) { - spin_unlock(&temp->d_lock); + spin_unlock(&cur->d_lock); pos++; /* get rid of any prepended '/' */ break; } else { - pos -= temp->d_name.len; + pos -= cur->d_name.len; if (pos < 0) { - spin_unlock(&temp->d_lock); + spin_unlock(&cur->d_lock); break; } - memcpy(path + pos, temp->d_name.name, temp->d_name.len); + memcpy(path + pos, cur->d_name.name, cur->d_name.len); } + temp = cur; spin_unlock(&temp->d_lock); - temp = READ_ONCE(temp->d_parent); + cur = dget_parent(temp); + dput(temp); /* Are we at the root? */ - if (IS_ROOT(temp)) + if (IS_ROOT(cur)) break; /* Are we out of buffer? 
*/ @@ -2451,8 +2453,9 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, path[pos] = '/'; } - base = ceph_ino(d_inode(temp)); - rcu_read_unlock(); + inode = d_inode(cur); + base = inode ? ceph_ino(inode) : 0; + dput(cur); if (read_seqretry(&rename_lock, seq)) goto retry; From patchwork Tue Apr 5 19:19:52 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802335 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8E1E1C41535 for ; Wed, 6 Apr 2022 04:05:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1377099AbiDFEFx (ORCPT ); Wed, 6 Apr 2022 00:05:53 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36426 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573568AbiDETWw (ORCPT ); Tue, 5 Apr 2022 15:22:52 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AAF0445AD9; Tue, 5 Apr 2022 12:20:52 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 3968160A5F; Tue, 5 Apr 2022 19:20:52 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 01B42C385A5; Tue, 5 Apr 2022 19:20:50 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186451; bh=hNFGUfj55nllaWQ3ffWTuSz9rOdSBiQUHvBKtre/cSc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=k01V5pqKZL9goEyuGcolCqpsH19i/LdNG2E7kBRI8Ml3o2S+07jyDymxOdd97FKmd 
4rLU4b0z+4awqMpLInkBLU+sE1gCh+MQ39exnwBaXJJv5d+2yjYhEXSBK+xW94P6dB s3V+7Dux4NrxHtcO8v+MGs/aagenkNM0rXNUZjL8JZgpfbaC2voIUarh6ohVBoWeOc d0p/8lP2gND9o9jWH96nsPe1gf/KMtT7UD2LYejlrqE8M5UhEMNYUE1/QpgqR5nhyF Ya0hCvLzeBtKB0zBbn5OvDXWzSrPEy0AmEQHM3/3uSVkyf8O1UkgMhWq8fFgPOACTO UGaFWWAYB4nhA== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 21/59] ceph: add encrypted fname handling to ceph_mdsc_build_path Date: Tue, 5 Apr 2022 15:19:52 -0400 Message-Id: <20220405192030.178326-22-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Allow ceph_mdsc_build_path to encrypt and base64 encode the filename when the parent is encrypted and we're sending the path to the MDS. In most cases, we just encrypt the filenames and base64 encode them, but when the encrypted name is longer than CEPH_NOHASH_NAME_MAX, we use a similar scheme to fscrypt proper, and hash the remaining bytes with sha256. When doing this, we then send along the full crypttext of the name in the new alternate_name field of the MClientRequest. The MDS can then send that along in readdir responses and traces.
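As a rough illustration of the length arithmetic described above, here is a userspace sketch. It is not code from this series: the helper names base64url_len(), wire_name_len() and altname_len() are made up for the example, NAME_MAX_BYTES stands in for NAME_MAX, and unpadded base64url output is assumed.

```c
#include <assert.h>
#include <stddef.h>

#define NAME_MAX_BYTES       255	/* stands in for NAME_MAX */
#define SHA256_DIGEST_SIZE   32
#define CEPH_NOHASH_NAME_MAX (189 - SHA256_DIGEST_SIZE)	/* 157 */

/* Length of unpadded base64url output for n input bytes: ceil(4n / 3). */
static size_t base64url_len(size_t n)
{
	return (4 * n + 2) / 3;
}

/* Length of the encoded name placed in the path sent to the MDS. */
static size_t wire_name_len(size_t ciphertext_len)
{
	size_t raw = ciphertext_len;

	assert(ciphertext_len <= NAME_MAX_BYTES);
	/* Long names keep the first 157 bytes and replace the tail by a hash. */
	if (raw > CEPH_NOHASH_NAME_MAX)
		raw = CEPH_NOHASH_NAME_MAX + SHA256_DIGEST_SIZE;
	return base64url_len(raw);
}

/* Length of the alternate_name payload: only sent when the tail is hashed. */
static size_t altname_len(size_t ciphertext_len)
{
	return ciphertext_len > CEPH_NOHASH_NAME_MAX ? ciphertext_len : 0;
}
```

Under these assumptions, any ciphertext longer than 157 bytes encodes to a 252-character name, which stays within NAME_MAX (255), while shorter ciphertexts are encoded whole and carry no alternate name.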
Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 47 ++++++++++++++++++++++++++ fs/ceph/crypto.h | 32 ++++++++++++++++++ fs/ceph/mds_client.c | 80 ++++++++++++++++++++++++++++++++++---------- 3 files changed, 141 insertions(+), 18 deletions(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 1c34b8ed1266..a6ee2e3160ca 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -127,3 +127,50 @@ void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_se { swap(req->r_fscrypt_auth, as->fscrypt_auth); } + +int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf) +{ + u32 len; + int elen; + int ret; + u8 *cryptbuf; + + WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)); + + /* + * Convert cleartext d_name to ciphertext. If result is longer than + * CEPH_NOHASH_NAME_MAX, sha256 the remaining bytes + * + * See: fscrypt_setup_filename + */ + if (!fscrypt_fname_encrypted_size(parent, dentry->d_name.len, NAME_MAX, &len)) + return -ENAMETOOLONG; + + /* Allocate a buffer appropriate to hold the result */ + cryptbuf = kmalloc(len > CEPH_NOHASH_NAME_MAX ? 
NAME_MAX : len, GFP_KERNEL); + if (!cryptbuf) + return -ENOMEM; + + ret = fscrypt_fname_encrypt(parent, &dentry->d_name, cryptbuf, len); + if (ret) { + kfree(cryptbuf); + return ret; + } + + /* hash the end if the name is long enough */ + if (len > CEPH_NOHASH_NAME_MAX) { + u8 hash[SHA256_DIGEST_SIZE]; + u8 *extra = cryptbuf + CEPH_NOHASH_NAME_MAX; + + /* hash the extra bytes and overwrite crypttext beyond that point with it */ + sha256(extra, len - CEPH_NOHASH_NAME_MAX, hash); + memcpy(extra, hash, SHA256_DIGEST_SIZE); + len = CEPH_NOHASH_NAME_MAX + SHA256_DIGEST_SIZE; + } + + /* base64 encode the encrypted name */ + elen = fscrypt_base64url_encode(cryptbuf, len, buf); + kfree(cryptbuf); + dout("base64-encoded ciphertext name = %.*s\n", elen, buf); + return elen; +} diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index cb00fe42d5b7..9a66a29d5c8b 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -6,6 +6,7 @@ #ifndef _CEPH_CRYPTO_H #define _CEPH_CRYPTO_H +#include #include struct ceph_fs_client; @@ -27,6 +28,30 @@ static inline u32 ceph_fscrypt_auth_len(struct ceph_fscrypt_auth *fa) } #ifdef CONFIG_FS_ENCRYPTION +/* + * We want to encrypt filenames when creating them, but the encrypted + * versions of those names may have illegal characters in them. To mitigate + * that, we base64 encode them, but that gives us a result that can exceed + * NAME_MAX. + * + * Follow a similar scheme to fscrypt itself, and cap the filename to a + * smaller size. If the ciphertext name is longer than the value below, then + * sha256 hash the remaining bytes. 
+ * + * For the fscrypt_nokey_name struct the dirhash[2] member is useless in ceph + * so the corresponding struct will be: + * + * struct fscrypt_ceph_nokey_name { + * u8 bytes[157]; + * u8 sha256[SHA256_DIGEST_SIZE]; + * }; // 189 bytes => 252 bytes base64-encoded, which is <= NAME_MAX (255) + * + * Note that for long names that end up having their tail portion hashed, we + * must also store the full encrypted name (in the dentry's alternate_name + * field). + */ +#define CEPH_NOHASH_NAME_MAX (189 - SHA256_DIGEST_SIZE) + void ceph_fscrypt_set_ops(struct super_block *sb); void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc); @@ -34,6 +59,7 @@ void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc); int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, struct ceph_acl_sec_ctx *as); void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as); +int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf); #else /* CONFIG_FS_ENCRYPTION */ @@ -57,6 +83,12 @@ static inline void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as_ctx) { } + +static inline int ceph_encode_encrypted_fname(const struct inode *parent, + struct dentry *dentry, char *buf) +{ + return -EOPNOTSUPP; +} #endif /* CONFIG_FS_ENCRYPTION */ #endif diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 28315053e116..13367a358a85 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -14,6 +14,7 @@ #include #include "super.h" +#include "crypto.h" #include "mds_client.h" #include "crypto.h" @@ -2385,18 +2386,27 @@ static inline u64 __get_oldest_tid(struct ceph_mds_client *mdsc) return mdsc->oldest_tid; } -/* - * Build a dentry's path. Allocate on heap; caller must kfree. Based - * on build_path_from_dentry in fs/cifs/dir.c. 
+/** + * ceph_mdsc_build_path - build a path string to a given dentry + * @dentry: dentry to which path should be built + * @plen: returned length of string + * @pbase: returned base inode number + * @for_wire: is this path going to be sent to the MDS? + * + * Build a string that represents the path to the dentry. This is mostly called + * for two different purposes: * - * If @stop_on_nosnap, generate path relative to the first non-snapped - * inode. + * 1) we need to build a path string to send to the MDS (for_wire == true) + * 2) we need a path string for local presentation (e.g. debugfs) (for_wire == false) + * + * The path is built in reverse, starting with the dentry. Walk back up toward + * the root, building the path until the first non-snapped inode is reached (for_wire) + * or the root inode is reached (!for_wire). * * Encode hidden .snap dirs as a double /, i.e. * foo/.snap/bar -> foo//bar */ -char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, - int stop_on_nosnap) +char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, int for_wire) { struct dentry *cur; struct inode *inode; @@ -2418,30 +2428,65 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, seq = read_seqbegin(&rename_lock); cur = dget(dentry); for (;;) { - struct dentry *temp; + struct dentry *parent; spin_lock(&cur->d_lock); inode = d_inode(cur); if (inode && ceph_snap(inode) == CEPH_SNAPDIR) { dout("build_path path+%d: %p SNAPDIR\n", pos, cur); - } else if (stop_on_nosnap && inode && dentry != cur && - ceph_snap(inode) == CEPH_NOSNAP) { + spin_unlock(&cur->d_lock); + parent = dget_parent(cur); + } else if (for_wire && inode && dentry != cur && ceph_snap(inode) == CEPH_NOSNAP) { spin_unlock(&cur->d_lock); pos++; /* get rid of any prepended '/' */ break; - } else { + } else if (!for_wire || !IS_ENCRYPTED(d_inode(cur->d_parent))) { pos -= cur->d_name.len; if (pos < 0) { spin_unlock(&cur->d_lock); break; } memcpy(path + pos, 
cur->d_name.name, cur->d_name.len); + spin_unlock(&cur->d_lock); + parent = dget_parent(cur); + } else { + int len, ret; + char buf[NAME_MAX]; + + /* + * Proactively copy name into buf, in case we need to present + * it as-is. + */ + memcpy(buf, cur->d_name.name, cur->d_name.len); + len = cur->d_name.len; + spin_unlock(&cur->d_lock); + parent = dget_parent(cur); + + ret = __fscrypt_prepare_readdir(d_inode(parent)); + if (ret < 0) { + dput(parent); + dput(cur); + return ERR_PTR(ret); + } + + if (fscrypt_has_encryption_key(d_inode(parent))) { + len = ceph_encode_encrypted_fname(d_inode(parent), cur, buf); + if (len < 0) { + dput(parent); + dput(cur); + return ERR_PTR(len); + } + } + pos -= len; + if (pos < 0) { + dput(parent); + break; + } + memcpy(path + pos, buf, len); } - temp = cur; - spin_unlock(&temp->d_lock); - cur = dget_parent(temp); - dput(temp); + dput(cur); + cur = parent; /* Are we at the root? */ if (IS_ROOT(cur)) @@ -2465,8 +2510,7 @@ char *ceph_mdsc_build_path(struct dentry *dentry, int *plen, u64 *pbase, * A rename didn't occur, but somehow we didn't end up where * we thought we would. Throw a warning and try again. 
*/ - pr_warn("build_path did not end path lookup where " - "expected, pos is %d\n", pos); + pr_warn("build_path did not end path lookup where expected (pos = %d)\n", pos); goto retry; } @@ -2486,7 +2530,7 @@ static int build_dentry_path(struct dentry *dentry, struct inode *dir, rcu_read_lock(); if (!dir) dir = d_inode_rcu(dentry->d_parent); - if (dir && parent_locked && ceph_snap(dir) == CEPH_NOSNAP) { + if (dir && parent_locked && ceph_snap(dir) == CEPH_NOSNAP && !IS_ENCRYPTED(dir)) { *pino = ceph_ino(dir); rcu_read_unlock(); *ppath = dentry->d_name.name; From patchwork Tue Apr 5 19:19:53 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802344 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 34434C433F5 for ; Wed, 6 Apr 2022 04:16:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1351051AbiDFELC (ORCPT ); Wed, 6 Apr 2022 00:11:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36428 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573569AbiDETWw (ORCPT ); Tue, 5 Apr 2022 15:22:52 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8616C46652; Tue, 5 Apr 2022 12:20:53 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 22BA5617EE; Tue, 5 Apr 2022 19:20:53 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id DFF06C385A3; Tue, 5 Apr 2022 19:20:51 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; 
c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186452; bh=kyYkU/Dchft4ACwgY94oRFa28SIAQLGaoAiYc1wr4Js=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=iAOjDvpPE74286q/P5TJcdZcpkEp7FNy6KNHZRseCCJLqb1zTmDAIV+SZQm+hBWi4 Vb68EptgER1RBKv1zpSJD5VAtA5IdIu+Qcv0APwLOyszhQa8/Dsqe7w2boNUKpeLod 7cct4kv6h4EeH0dIkAtquhK1V2JYekqb4FFFhApgr/ZbyazLIR2OI1x9mVT5xdRsgq XW1gFIf7PG6AZ2Oxeggv6B6LUtvlnE+SaqEXKekUjsYEpaYQqHW0YItW4gxwMz5eHp sCvWhm17YWahQP9feMct/iu50nDt46RQ7YmTjhslNesBLlccl9IJUEfw1HlzYIAFUN hsx7ahqBtkHBQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 22/59] ceph: send altname in MClientRequest Date: Tue, 5 Apr 2022 15:19:53 -0400 Message-Id: <20220405192030.178326-23-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org In the event that we have a filename longer than CEPH_NOHASH_NAME_MAX, we'll need to hash the tail of the filename. The client however will still need to know the full name of the file if it has a key. To support this, the MClientRequest message has grown a new alternate_name field that we populate with the full (binary) crypttext of the filename. This is then transmitted to the clients in readdir or traces as part of the dentry lease. Add support for populating this field when the filenames are very long.
Signed-off-by: Jeff Layton --- fs/ceph/mds_client.c | 75 +++++++++++++++++++++++++++++++++++++++++--- fs/ceph/mds_client.h | 3 ++ 2 files changed, 73 insertions(+), 5 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 13367a358a85..0be1668b2c32 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -972,6 +972,7 @@ void ceph_mdsc_release_request(struct kref *kref) if (req->r_pagelist) ceph_pagelist_release(req->r_pagelist); kfree(req->r_fscrypt_auth); + kfree(req->r_altname); put_request_session(req); ceph_unreserve_caps(req->r_mdsc, &req->r_caps_reservation); WARN_ON_ONCE(!list_empty(&req->r_wait)); @@ -2386,6 +2387,63 @@ static inline u64 __get_oldest_tid(struct ceph_mds_client *mdsc) return mdsc->oldest_tid; } +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) +static u8 *get_fscrypt_altname(const struct ceph_mds_request *req, u32 *plen) +{ + struct inode *dir = req->r_parent; + struct dentry *dentry = req->r_dentry; + u8 *cryptbuf = NULL; + u32 len = 0; + int ret = 0; + + /* only encode if we have parent and dentry */ + if (!dir || !dentry) + goto success; + + /* No-op unless this is encrypted */ + if (!IS_ENCRYPTED(dir)) + goto success; + + ret = __fscrypt_prepare_readdir(dir); + if (ret) + return ERR_PTR(ret); + + /* No key? Just ignore it. 
*/ + if (!fscrypt_has_encryption_key(dir)) + goto success; + + if (!fscrypt_fname_encrypted_size(dir, dentry->d_name.len, NAME_MAX, &len)) { + WARN_ON_ONCE(1); + return ERR_PTR(-ENAMETOOLONG); + } + + /* No need to append altname if name is short enough */ + if (len <= CEPH_NOHASH_NAME_MAX) { + len = 0; + goto success; + } + + cryptbuf = kmalloc(len, GFP_KERNEL); + if (!cryptbuf) + return ERR_PTR(-ENOMEM); + + ret = fscrypt_fname_encrypt(dir, &dentry->d_name, cryptbuf, len); + if (ret) { + kfree(cryptbuf); + return ERR_PTR(ret); + } +success: + *plen = len; + return cryptbuf; +} +#else +static u8 *get_fscrypt_altname(const struct ceph_mds_request *req, u32 *plen) +{ + *plen = 0; + return NULL; +} +#endif + /** * ceph_mdsc_build_path - build a path string to a given dentry * @dentry: dentry to which path should be built @@ -2606,14 +2664,15 @@ static void encode_mclientrequest_tail(void **p, const struct ceph_mds_request * ceph_encode_timespec64(&ts, &req->r_stamp); ceph_encode_copy(p, &ts, sizeof(ts)); - /* gid_list */ + /* v4: gid_list */ ceph_encode_32(p, req->r_cred->group_info->ngroups); for (i = 0; i < req->r_cred->group_info->ngroups; i++) ceph_encode_64(p, from_kgid(&init_user_ns, req->r_cred->group_info->gid[i])); - /* v5: altname (TODO: skip for now) */ - ceph_encode_32(p, 0); + /* v5: altname */ + ceph_encode_32(p, req->r_altname_len); + ceph_encode_copy(p, req->r_altname, req->r_altname_len); /* v6: fscrypt_auth and fscrypt_file */ if (req->r_fscrypt_auth) { @@ -2669,7 +2728,13 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, goto out_free1; } - /* head */ + req->r_altname = get_fscrypt_altname(req, &req->r_altname_len); + if (IS_ERR(req->r_altname)) { + msg = ERR_CAST(req->r_altname); + req->r_altname = NULL; + goto out_free2; + } + len = legacy ? 
sizeof(*head) : sizeof(struct ceph_mds_request_head); /* filepaths */ @@ -2695,7 +2760,7 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, len += sizeof(u32) + (sizeof(u64) * req->r_cred->group_info->ngroups); /* alternate name */ - len += sizeof(u32); // TODO + len += sizeof(u32) + req->r_altname_len; /* fscrypt_auth */ len += sizeof(u32); // fscrypt_auth diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 2cc75f9ae7c7..cd719691a86d 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -290,6 +290,9 @@ struct ceph_mds_request { struct ceph_fscrypt_auth *r_fscrypt_auth; + u8 *r_altname; /* fscrypt binary crypttext for long filenames */ + u32 r_altname_len; /* length of r_altname */ + int r_fmode; /* file mode, if expecting cap */ int r_request_release_offset; const struct cred *r_cred; From patchwork Tue Apr 5 19:19:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802361 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2D034C433FE for ; Wed, 6 Apr 2022 04:16:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1446398AbiDFEPB (ORCPT ); Wed, 6 Apr 2022 00:15:01 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36526 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573571AbiDETWx (ORCPT ); Tue, 5 Apr 2022 15:22:53 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 7DD38473A6; Tue, 5 Apr 2022 12:20:54 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No 
client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 0EBED616C5; Tue, 5 Apr 2022 19:20:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C8763C385A0; Tue, 5 Apr 2022 19:20:52 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186453; bh=PU5u4snQwa0Fwp4GmEqydtyCCMY0ChecXgE22UELz48=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VMO3jsZ6ER6o742l7qQ9XxwghEoYIPEQ1yIwI3X+LKmcnZLAHqBidj2U/mczA5pUN YCMhngbC2z7ELti+GWe8K4zN2RqUOtmLtaHA1ieT2tUJwrV93Ud8VJwaOICOXXFytW DVUv6GgkLiusm41ulTinFXCK5fIplzu9p+w9UaiRwjNZgsvyACWCLAeRDVsb36y6fG rJ0kWgfbnvboH4JyLnLHAmGj4mmondgJyPHxv3pHn9hpGWfFCKUDWd75oDBSWQe+Ty +Gv7knwN6bc7jVRh7B9wY9XIEFgNL9fzOUKjCKpCV3Rj9HLKu8L9c8CJx4N4IJfUL7 YJ9h5yiP8BWuQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 23/59] ceph: encode encrypted name in dentry release Date: Tue, 5 Apr 2022 15:19:54 -0400 Message-Id: <20220405192030.178326-24-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Encode encrypted dentry names when sending a dentry release request. Also add a more helpful comment over ceph_encode_dentry_release. 
Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 32 ++++++++++++++++++++++++++++---- fs/ceph/mds_client.c | 20 ++++++++++++++++---- 2 files changed, 44 insertions(+), 8 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 3b31d77eb1ea..22bf3e2696cb 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -4618,6 +4618,18 @@ int ceph_encode_inode_release(void **p, struct inode *inode, return ret; } +/** + * ceph_encode_dentry_release - encode a dentry release into an outgoing request + * @p: outgoing request buffer + * @dentry: dentry to release + * @dir: dir to release it from + * @mds: mds that we're speaking to + * @drop: caps being dropped + * @unless: unless we have these caps + * + * Encode a dentry release into an outgoing request buffer. Returns 1 if the + * thing was released, or a negative error code otherwise. + */ int ceph_encode_dentry_release(void **p, struct dentry *dentry, struct inode *dir, int mds, int drop, int unless) @@ -4650,13 +4662,25 @@ int ceph_encode_dentry_release(void **p, struct dentry *dentry, if (ret && di->lease_session && di->lease_session->s_mds == mds) { dout("encode_dentry_release %p mds%d seq %d\n", dentry, mds, (int)di->lease_seq); - rel->dname_len = cpu_to_le32(dentry->d_name.len); - memcpy(*p, dentry->d_name.name, dentry->d_name.len); - *p += dentry->d_name.len; rel->dname_seq = cpu_to_le32(di->lease_seq); __ceph_mdsc_drop_dentry_lease(dentry); + spin_unlock(&dentry->d_lock); + if (IS_ENCRYPTED(dir) && fscrypt_has_encryption_key(dir)) { + int ret2 = ceph_encode_encrypted_fname(dir, dentry, *p); + + if (ret2 < 0) + return ret2; + + rel->dname_len = cpu_to_le32(ret2); + *p += ret2; + } else { + rel->dname_len = cpu_to_le32(dentry->d_name.len); + memcpy(*p, dentry->d_name.name, dentry->d_name.len); + *p += dentry->d_name.len; + } + } else { + spin_unlock(&dentry->d_lock); } - spin_unlock(&dentry->d_lock); return ret; } diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 0be1668b2c32..750a67643850 100644 --- 
a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2819,15 +2819,23 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, req->r_inode ? req->r_inode : d_inode(req->r_dentry), mds, req->r_inode_drop, req->r_inode_unless, req->r_op == CEPH_MDS_OP_READDIR); - if (req->r_dentry_drop) - releases += ceph_encode_dentry_release(&p, req->r_dentry, + if (req->r_dentry_drop) { + ret = ceph_encode_dentry_release(&p, req->r_dentry, req->r_parent, mds, req->r_dentry_drop, req->r_dentry_unless); - if (req->r_old_dentry_drop) - releases += ceph_encode_dentry_release(&p, req->r_old_dentry, + if (ret < 0) + goto out_err; + releases += ret; + } + if (req->r_old_dentry_drop) { + ret = ceph_encode_dentry_release(&p, req->r_old_dentry, req->r_old_dentry_dir, mds, req->r_old_dentry_drop, req->r_old_dentry_unless); + if (ret < 0) + goto out_err; + releases += ret; + } if (req->r_old_inode_drop) releases += ceph_encode_inode_release(&p, d_inode(req->r_old_dentry), @@ -2869,6 +2877,10 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, ceph_mdsc_free_path((char *)path1, pathlen1); out: return msg; +out_err: + ceph_msg_put(msg); + msg = ERR_PTR(ret); + goto out_free2; } /* From patchwork Tue Apr 5 19:19:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802382 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 42744C433F5 for ; Wed, 6 Apr 2022 04:18:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1836138AbiDFEQr (ORCPT ); Wed, 6 Apr 2022 00:16:47 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36574 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id 
S1573572AbiDETWx (ORCPT ); Tue, 5 Apr 2022 15:22:53 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 5BE7847551; Tue, 5 Apr 2022 12:20:55 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id F09E260A5F; Tue, 5 Apr 2022 19:20:54 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B1686C385A1; Tue, 5 Apr 2022 19:20:53 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186454; bh=40bz7fgcSkfmSJmWJDKxd9JZxineBbUvX30XOIb7qbk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ScUndJsO1iIF8TZnZGyatLX8D1Sre9qw03pcugwOP94pa2shH71KCYubeRtwLcte8 q9RYFCVt+6ZMZu+QFNAqmtzeaZqPzTNzF+0MMe1HSjLzlKAUfQJOCwSn5IPQyvFZwA Vih4e+llhlS//shafvtvaB0Z5W9Dp2d+KmN/zlOSYwX9QOC2RhLZzyzN8RTJeMrYOv hpyzzA2FW6n9d9mp4bD/8vGRDgPflyQ83mZ5r+BySLJASgExrd54G46yrVdy9Ui5KN DXSaIWV0SsLsQ8K4bfs044u75kf+1m/5jdjNBj+fbaDli3Pysk4GHIOYY/Jhud32+3 2fSg1Jk/m2S2Q== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 24/59] ceph: properly set DCACHE_NOKEY_NAME flag in lookup Date: Tue, 5 Apr 2022 15:19:55 -0400 Message-Id: <20220405192030.178326-25-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org This is required so that we know to invalidate these dentries when the directory is unlocked. 
Signed-off-by: Jeff Layton --- fs/ceph/dir.c | 11 +++++++++++ 1 file changed, 11 insertions(+) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 8cc7a49ee508..897f8618151b 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -760,6 +760,17 @@ static struct dentry *ceph_lookup(struct inode *dir, struct dentry *dentry, if (dentry->d_name.len > NAME_MAX) return ERR_PTR(-ENAMETOOLONG); + if (IS_ENCRYPTED(dir)) { + err = __fscrypt_prepare_readdir(dir); + if (err) + return ERR_PTR(err); + if (!fscrypt_has_encryption_key(dir)) { + spin_lock(&dentry->d_lock); + dentry->d_flags |= DCACHE_NOKEY_NAME; + spin_unlock(&dentry->d_lock); + } + } + /* can we conclude ENOENT locally? */ if (d_really_is_negative(dentry)) { struct ceph_inode_info *ci = ceph_inode(dir); From patchwork Tue Apr 5 19:19:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 8bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802327 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 06A6FC433FE for ; Wed, 6 Apr 2022 04:02:32 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S231449AbiDFEEC (ORCPT ); Wed, 6 Apr 2022 00:04:02 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36664 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573573AbiDETWy (ORCPT ); Tue, 5 Apr 2022 15:22:54 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4489347ACB; Tue, 5 Apr 2022 12:20:56 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id D6299617EE; Tue, 5 Apr 
2022 19:20:55 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 9B7EBC385A3; Tue, 5 Apr 2022 19:20:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186455; bh=wuel0yRpvqoH42Shg0phMCGGxQASot3eqVV2HwOD7gc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VKJuTRiHJZSsnavBh4ApUesE7dAteXIDTW0TI5t1JbNUsVsjzWFNsDdsKvPP3855c M6i7YsE8mD9NgZCGl0hlLG8U6iqmjGturZFM09UvpopaTwhFFGH1I/lofFP2ZT9wPf 6lp/9beYBIE4Cnn9PzAJM+wjpb9vyD92BuSyD6y9lsf9nWOJ3R83BJoFbbLaKnp9KL SeVlDrYh6+U1FnmXz0nvlBM7bCJyX8Y2fw2dVzQpJzVnpVC3X5PuUetkuxxUrMePhm cAcSlYaHNPqJ5TaulfkqLMyQUcCJbYUAnIPHjkur3pplzKxsySPTxKyGnB5ImO6R61 XKkHmM/apDlug== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 25/59] ceph: set DCACHE_NOKEY_NAME in atomic open Date: Tue, 5 Apr 2022 15:19:56 -0400 Message-Id: <20220405192030.178326-26-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Atomic open can act as a lookup if handed a dentry that is negative on the MDS. Ensure that we set DCACHE_NOKEY_NAME on the dentry in atomic_open, if we don't have the key for the parent. Otherwise, we can end up validating the dentry inappropriately if someone later adds a key. 
Reviewed-by: Xiubo Li Reviewed-by: Luís Henriques Signed-off-by: Jeff Layton --- fs/ceph/file.c | 7 +++++++ 1 file changed, 7 insertions(+) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index dd183d12a3bd..dfc02caf4229 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -758,6 +758,13 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, req->r_args.open.mask = cpu_to_le32(mask); req->r_parent = dir; ihold(dir); + if (IS_ENCRYPTED(dir)) { + if (!fscrypt_has_encryption_key(dir)) { + spin_lock(&dentry->d_lock); + dentry->d_flags |= DCACHE_NOKEY_NAME; + spin_unlock(&dentry->d_lock); + } + } if (flags & O_CREAT) { struct ceph_file_layout lo; From patchwork Tue Apr 5 19:19:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802357 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id D4551C433F5 for ; Wed, 6 Apr 2022 04:16:36 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1391898AbiDFEOP (ORCPT ); Wed, 6 Apr 2022 00:14:15 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36884 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573574AbiDETW5 (ORCPT ); Tue, 5 Apr 2022 15:22:57 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [IPv6:2604:1380:40e1:4800::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3EA0A483AF; Tue, 5 Apr 2022 12:20:59 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id B1A48CE1FB7; Tue, 5 Apr 2022 19:20:57 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 
851C8C385A1; Tue, 5 Apr 2022 19:20:55 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186456; bh=QSQJwXnQ26aopltanKUAuklulNeW880EC3N09Fc3UKo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PULmi/MSmvm3KcMyeoe4q7YjAY098w+k3YDDbZvoCb07KoISMwzAzmUIRz696LhBs pYSQPt9mNUe61J+L5+HTXwFeXVJbeZ8l/tA8ymKxDPWtszKcck5fFTpWfZ3uO3vhSV rRifdEkJqKR/4Tt7DTGtZoYUFVQDL9yOU8/tXEIEzU0yKWZ2okSz/pdAis59+28Rlq RoyKD479znZzAwV1WWeSdclRsEH6LHjCLZ3feSK5jCP/sM2tXxBA7oG/jsuxOZSCO3 uP2zzNYUGUnii1yr7Zwv+YXS9c8gVocK35GWnqe6fwKupAErvZuhdhcBm0zMfbdDnf iI/a6kQk4aBDw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 26/59] ceph: make d_revalidate call fscrypt revalidator for encrypted dentries Date: Tue, 5 Apr 2022 15:19:57 -0400 Message-Id: <20220405192030.178326-27-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org If we have a dentry which represents a no-key name, then we need to test whether the parent directory's encryption key has since been added. Do that before we test anything else about the dentry. 
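The ordering described above can be modelled in user-space C. This is only a sketch, not the kernel code: `sketch_dentry`, `sketch_fscrypt_revalidate()` and `sketch_ceph_revalidate()` are hypothetical stand-ins, and the rule that a no-key dentry stays valid only while the parent's key is still absent is an assumption about how the fscrypt revalidator behaves, not something this patch states.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the dentry state that matters here. */
struct sketch_dentry {
	bool nokey_name;	/* models DCACHE_NOKEY_NAME */
	bool dir_has_key;	/* models fscrypt_has_encryption_key(dir) */
	bool lease_valid;	/* models the remaining ceph checks */
};

/* Assumed behaviour: a no-key dentry goes stale once the key shows up. */
static int sketch_fscrypt_revalidate(const struct sketch_dentry *d)
{
	if (!d->nokey_name)
		return 1;
	return d->dir_has_key ? 0 : 1;
}

/* Run the fscrypt test first; only then look at anything else. */
static int sketch_ceph_revalidate(const struct sketch_dentry *d)
{
	int valid = sketch_fscrypt_revalidate(d);

	if (valid <= 0)
		return valid;	/* stale no-key name: skip the other tests */
	return d->lease_valid ? 1 : 0;
}
```

In this model a no-key dentry whose parent has since gained a key is rejected even if its lease would otherwise still be good, which is the reason for running the fscrypt test first.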
Signed-off-by: Jeff Layton --- fs/ceph/dir.c | 8 ++++++-- 1 file changed, 6 insertions(+), 2 deletions(-) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 897f8618151b..caf2547c3fe1 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1709,6 +1709,10 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags) struct inode *dir, *inode; struct ceph_mds_client *mdsc; + valid = fscrypt_d_revalidate(dentry, flags); + if (valid <= 0) + return valid; + if (flags & LOOKUP_RCU) { parent = READ_ONCE(dentry->d_parent); dir = d_inode_rcu(parent); @@ -1721,8 +1725,8 @@ static int ceph_d_revalidate(struct dentry *dentry, unsigned int flags) inode = d_inode(dentry); } - dout("d_revalidate %p '%pd' inode %p offset 0x%llx\n", dentry, - dentry, inode, ceph_dentry(dentry)->offset); + dout("d_revalidate %p '%pd' inode %p offset 0x%llx nokey %d\n", dentry, + dentry, inode, ceph_dentry(dentry)->offset, !!(dentry->d_flags & DCACHE_NOKEY_NAME)); mdsc = ceph_sb_to_client(dir->i_sb)->mdsc; From patchwork Tue Apr 5 19:19:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802363 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C0B8C4332F for ; Wed, 6 Apr 2022 04:16:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1447012AbiDFEPG (ORCPT ); Wed, 6 Apr 2022 00:15:06 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36930 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573576AbiDETW6 (ORCPT ); Tue, 5 Apr 2022 15:22:58 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id B7AFE4889A; Tue, 5 Apr 2022 12:20:59 -0700 
(PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 64A13B81F6B; Tue, 5 Apr 2022 19:20:58 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 6ECA2C385A0; Tue, 5 Apr 2022 19:20:56 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186457; bh=N/e9LH0P+9ERaRrv/8h6/+bh1AE/58FUtmgR2K0k6AM=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=BkQMgVqtVVcmOn4SFTLXp3eqkkwWqgM01LaqugEEMa/Z0E0WZalB85lXln5VOofgB +VBdjL4P7czpUrGKJVDm1xZ+1oZmuvnJ46SYhmQUIZ5g2EW0yvZ6qJhBzcW8F6QNQf D0yCi9XOazIZlOj8STdqNAqXwPOrdU8NjJN+au0qdEwtwWbtsLLx0II80FBvOTjWZr b65yavQVgOdFdsEUUB/jxHoAYGtscnB9HxnUyEoPTKj9MGxGP9Fmeas6RVorIARxIH CSFEc3E9RnQNY2aLxYwTFnZHGbJpMEXOUI5RRHR1GNnItbCpbgksWlovrub6LfHRzZ m/j+NL0amlssg== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 27/59] ceph: add helpers for converting names for userland presentation Date: Tue, 5 Apr 2022 15:19:58 -0400 Message-Id: <20220405192030.178326-28-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Define a new ceph_fname struct that we can use to carry information about encrypted dentry names. Add helpers for working with these objects, including ceph_fname_to_usr which formats an encrypted filename for userland presentation. 
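The choice ceph_fname_to_usr makes about which bytes to present can be sketched as a small user-space model. The enum and `sketch_pick_source()` are hypothetical names used only for illustration; the real helper also performs the base64 decode and the fscrypt conversion, which are left out here.

```c
#include <stdbool.h>
#include <stddef.h>

enum sketch_name_source {
	SKETCH_PLAIN,		/* directory not encrypted: pass name through */
	SKETCH_RAW_NOKEY,	/* no key: present the name from the MDS as-is */
	SKETCH_DECODE_NAME,	/* key present, no crypttext: base64-decode name */
	SKETCH_USE_CTEXT,	/* key present, binary crypttext was supplied */
};

/* Mirrors the branch order of the helper; is_nokey may be NULL. */
static enum sketch_name_source
sketch_pick_source(bool dir_encrypted, bool have_key, size_t ctext_len,
		   bool *is_nokey)
{
	if (!dir_encrypted)
		return SKETCH_PLAIN;
	if (!have_key) {
		if (is_nokey)
			*is_nokey = true;
		return SKETCH_RAW_NOKEY;
	}
	return ctext_len == 0 ? SKETCH_DECODE_NAME : SKETCH_USE_CTEXT;
}
```

Only the last two cases go on to ask fscrypt for a userland name; the first two hand back the name unchanged.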
Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 76 ++++++++++++++++++++++++++++++++++++++++++++++++ fs/ceph/crypto.h | 41 ++++++++++++++++++++++++++ 2 files changed, 117 insertions(+) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index a6ee2e3160ca..eefeaa721b9d 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -174,3 +174,79 @@ int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentr dout("base64-encoded ciphertext name = %.*s\n", elen, buf); return elen; } + +/** + * ceph_fname_to_usr - convert a filename for userland presentation + * @fname: ceph_fname to be converted + * @tname: temporary name buffer to use for conversion (may be NULL) + * @oname: where converted name should be placed + * @is_nokey: set to true if key wasn't available during conversion (may be NULL) + * + * Given a filename (usually from the MDS), format it for presentation to + * userland. If @parent is not encrypted, just pass it back as-is. + * + * Otherwise, base64 decode the string, and then ask fscrypt to format it + * for userland presentation. + * + * Returns 0 on success or negative error code on error. + */ +int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, + struct fscrypt_str *oname, bool *is_nokey) +{ + int ret; + struct fscrypt_str _tname = FSTR_INIT(NULL, 0); + struct fscrypt_str iname; + + if (!IS_ENCRYPTED(fname->dir)) { + oname->name = fname->name; + oname->len = fname->name_len; + return 0; + } + + /* Sanity check that the resulting name will fit in the buffer */ + if (fname->name_len > FSCRYPT_BASE64URL_CHARS(NAME_MAX)) + return -EIO; + + ret = __fscrypt_prepare_readdir(fname->dir); + if (ret) + return ret; + + /* + * Use the raw dentry name as sent by the MDS instead of + * generating a nokey name via fscrypt. 
+ */ + if (!fscrypt_has_encryption_key(fname->dir)) { + memcpy(oname->name, fname->name, fname->name_len); + oname->len = fname->name_len; + if (is_nokey) + *is_nokey = true; + return 0; + } + + if (fname->ctext_len == 0) { + int declen; + + if (!tname) { + ret = fscrypt_fname_alloc_buffer(NAME_MAX, &_tname); + if (ret) + return ret; + tname = &_tname; + } + + declen = fscrypt_base64url_decode(fname->name, fname->name_len, tname->name); + if (declen <= 0) { + ret = -EIO; + goto out; + } + iname.name = tname->name; + iname.len = declen; + } else { + iname.name = fname->ctext; + iname.len = fname->ctext_len; + } + + ret = fscrypt_fname_disk_to_usr(fname->dir, 0, 0, &iname, oname); +out: + fscrypt_fname_free_buffer(&_tname); + return ret; +} diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 9a66a29d5c8b..7e56aded5124 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -13,6 +13,14 @@ struct ceph_fs_client; struct ceph_acl_sec_ctx; struct ceph_mds_request; +struct ceph_fname { + struct inode *dir; + char *name; // b64 encoded, possibly hashed + unsigned char *ctext; // binary crypttext (if any) + u32 name_len; // length of name buffer + u32 ctext_len; // length of crypttext +}; + struct ceph_fscrypt_auth { __le32 cfa_version; __le32 cfa_blob_len; @@ -61,6 +69,22 @@ int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as); int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf); +static inline int ceph_fname_alloc_buffer(struct inode *parent, struct fscrypt_str *fname) +{ + if (!IS_ENCRYPTED(parent)) + return 0; + return fscrypt_fname_alloc_buffer(NAME_MAX, fname); +} + +static inline void ceph_fname_free_buffer(struct inode *parent, struct fscrypt_str *fname) +{ + if (IS_ENCRYPTED(parent)) + fscrypt_fname_free_buffer(fname); +} + +int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, + 
struct fscrypt_str *oname, bool *is_nokey); + #else /* CONFIG_FS_ENCRYPTION */ static inline void ceph_fscrypt_set_ops(struct super_block *sb) @@ -89,6 +113,23 @@ static inline int ceph_encode_encrypted_fname(const struct inode *parent, { return -EOPNOTSUPP; } + +static inline int ceph_fname_alloc_buffer(struct inode *parent, struct fscrypt_str *fname) +{ + return 0; +} + +static inline void ceph_fname_free_buffer(struct inode *parent, struct fscrypt_str *fname) +{ +} + +static inline int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, + struct fscrypt_str *oname, bool *is_nokey) +{ + oname->name = fname->name; + oname->len = fname->name_len; + return 0; +} #endif /* CONFIG_FS_ENCRYPTION */ #endif From patchwork Tue Apr 5 19:19:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802385 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id BAD75C43217 for ; Wed, 6 Apr 2022 04:18:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352285AbiDFEL2 (ORCPT ); Wed, 6 Apr 2022 00:11:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36958 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573577AbiDETW6 (ORCPT ); Tue, 5 Apr 2022 15:22:58 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 81452488B9; Tue, 5 Apr 2022 12:21:00 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 36D4CB81FA7; Tue, 5 Apr 2022 19:20:59 +0000 
(UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 57B37C385A5; Tue, 5 Apr 2022 19:20:57 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186458; bh=MU+8cQqFTauv6+tNkq0uHJULiqNrfH78K6L73eZ6818=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=JN5SAAGEHLYlUgFg+9TZNj/pebWMMrIA7EgYMoxx0fGzRk3ZuJwNlY/j1Hy4hNJFJ oqXkWbkDXIwbOaG4bHTSjJsnembnIYBwp5EluI+ya77SzSPNbST068KzZKUQ1lhzqr V25e6w7C3Ccw32l8ATlFrMkcMROYMMz0iCNdBJB4B/ctCVZaUYnt+ZdKSPhm+ENdhJ 4ZesVgdu0NRDiKeNRn9hrMsIvhzyMArL9lNy222vZjoWBUcKe4hA9MIN0IcFLbA2oR XB3/gMNAneF1+v0WbgSoXU9vAqsL/PMep7JT7pDe2+3ozXmBB0C+nloCBAfmDndN6N t+1SU2pehcyBg== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 28/59] ceph: fix base64 encoded name's length check in ceph_fname_to_usr() Date: Tue, 5 Apr 2022 15:19:59 -0400 Message-Id: <20220405192030.178326-29-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li fname->name holds a base64-encoded name, and its maximum length shouldn't exceed NAME_MAX. FSCRYPT_BASE64URL_CHARS(NAME_MAX) evaluates to 255 * 4 / 3 = 340, which is larger than NAME_MAX.
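The arithmetic behind this check can be worked through in a few lines of C. This is a user-space sketch: the `SKETCH_` macros are hypothetical, and the round-up form of the base64 length macro is an assumption about how fscrypt defines FSCRYPT_BASE64URL_CHARS.

```c
#include <stdbool.h>
#include <stddef.h>

#define SKETCH_NAME_MAX 255
/* Assumed to match fscrypt's macro: 4 output chars per 3 input bytes, rounded up. */
#define SKETCH_BASE64URL_CHARS(nbytes) (((nbytes) * 4 + 2) / 3)

/* Bound used by the check before this patch. */
static size_t sketch_old_limit(void)
{
	return SKETCH_BASE64URL_CHARS(SKETCH_NAME_MAX);
}

/* Check in the style of this patch: both lengths must fit in NAME_MAX. */
static bool sketch_lengths_ok(size_t name_len, size_t ctext_len)
{
	return name_len <= SKETCH_NAME_MAX && ctext_len <= SKETCH_NAME_MAX;
}
```

Under these assumptions the old bound works out to 340, so a 300-byte name would have passed the earlier test while failing the new one.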
Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index eefeaa721b9d..d63e4a583413 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -204,7 +204,7 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, } /* Sanity check that the resulting name will fit in the buffer */ - if (fname->name_len > FSCRYPT_BASE64URL_CHARS(NAME_MAX)) + if (fname->name_len > NAME_MAX || fname->ctext_len > NAME_MAX) return -EIO; ret = __fscrypt_prepare_readdir(fname->dir); From patchwork Tue Apr 5 19:20:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802328 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C4A3DC433EF for ; Wed, 6 Apr 2022 04:02:44 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237830AbiDFEEa (ORCPT ); Wed, 6 Apr 2022 00:04:30 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:36932 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573575AbiDETW6 (ORCPT ); Tue, 5 Apr 2022 15:22:58 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DFDBE4889E; Tue, 5 Apr 2022 12:20:59 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7CD35617EE; Tue, 5 Apr 2022 19:20:59 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 40754C385A1; Tue, 5 Apr 2022 19:20:58 +0000 (UTC) 
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186458; bh=twu3E557bdskprsD4Fr6VDO4t+mWqQiy+j+WCNte8Sw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=ZsvE035ygkqMuo6WSFvyNQL7hM6B4SFQMns+aN2FKQuEDNrQ8sK6+MK/v2pbhDvmB hWXnxsP7f4dSp0oPlV51sHq0TP1Fwp+s7icdvPR/6fd5tBbnNW+wrDKWHVzurlAPGu oEuPNCV3HI2TnQiBKbgJjQT7McqN+6V9WAcsey2+pq2T/2Ml+y5Jy7epWS1agBFpeV O2cdZ1MyoUnF/UHyErGewmMhtLI+v3JRkRa1fMBds6if7AZ+wyOipGriZXtDfF3YCH JeGQ9Vwqg+Fo6VTRzFyHthKBhlMoizgJPmPW2VRBKMiUNUl3Hv9g1lIhq2TGEDZpa3 eRzXrOzQ7N4/g== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 29/59] ceph: add fscrypt support to ceph_fill_trace Date: Tue, 5 Apr 2022 15:20:00 -0400 Message-Id: <20220405192030.178326-30-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org When we get a dentry in a trace, decrypt the name so we can properly instantiate the dentry. 
Signed-off-by: Jeff Layton --- fs/ceph/inode.c | 30 ++++++++++++++++++++++++++++-- 1 file changed, 28 insertions(+), 2 deletions(-) diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 4cbc303730ef..37c2c2977235 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -1405,8 +1405,15 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) if (dir && req->r_op == CEPH_MDS_OP_LOOKUPNAME && test_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags) && !test_bit(CEPH_MDS_R_ABORTED, &req->r_req_flags)) { + bool is_nokey = false; struct qstr dname; struct dentry *dn, *parent; + struct fscrypt_str oname = FSTR_INIT(NULL, 0); + struct ceph_fname fname = { .dir = dir, + .name = rinfo->dname, + .ctext = rinfo->altname, + .name_len = rinfo->dname_len, + .ctext_len = rinfo->altname_len }; BUG_ON(!rinfo->head->is_target); BUG_ON(req->r_dentry); @@ -1414,8 +1421,20 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) parent = d_find_any_alias(dir); BUG_ON(!parent); - dname.name = rinfo->dname; - dname.len = rinfo->dname_len; + err = ceph_fname_alloc_buffer(dir, &oname); + if (err < 0) { + dput(parent); + goto done; + } + + err = ceph_fname_to_usr(&fname, NULL, &oname, &is_nokey); + if (err < 0) { + dput(parent); + ceph_fname_free_buffer(dir, &oname); + goto done; + } + dname.name = oname.name; + dname.len = oname.len; dname.hash = full_name_hash(parent, dname.name, dname.len); tvino.ino = le64_to_cpu(rinfo->targeti.in->ino); tvino.snap = le64_to_cpu(rinfo->targeti.in->snapid); @@ -1430,9 +1449,15 @@ int ceph_fill_trace(struct super_block *sb, struct ceph_mds_request *req) dname.len, dname.name, dn); if (!dn) { dput(parent); + ceph_fname_free_buffer(dir, &oname); err = -ENOMEM; goto done; } + if (is_nokey) { + spin_lock(&dn->d_lock); + dn->d_flags |= DCACHE_NOKEY_NAME; + spin_unlock(&dn->d_lock); + } err = 0; } else if (d_really_is_positive(dn) && (ceph_ino(d_inode(dn)) != tvino.ino || @@ -1444,6 +1469,7 @@ int ceph_fill_trace(struct 
super_block *sb, struct ceph_mds_request *req) dput(dn); goto retry_lookup; } + ceph_fname_free_buffer(dir, &oname); req->r_dentry = dn; dput(parent); From patchwork Tue Apr 5 19:20:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802386 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id DC5A4C43219 for ; Wed, 6 Apr 2022 04:18:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1585306AbiDFEQm (ORCPT ); Wed, 6 Apr 2022 00:16:42 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37008 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573578AbiDETW7 (ORCPT ); Tue, 5 Apr 2022 15:22:59 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C0F4748E4C; Tue, 5 Apr 2022 12:21:00 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 608E861899; Tue, 5 Apr 2022 19:21:00 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 2943AC385A3; Tue, 5 Apr 2022 19:20:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186459; bh=VuAetinbMIJ3jeBAT3F1PDyYDvpSu4GO847Nw/6AauI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=EV9WI8Ws5j3ZBLTudp20Epsy/TrTwGPL8CNzVCuRB/h6ET739GafztHdfWyE63F+G QMkCTG8y8xslilsSfeVFmZb+NeKEvILYI9xZoVtZ0o0bPBAYbV/usa+jfOd1bTRCII RqTCFexLXjQs50YQzwG3fKRxVqF/RZTTJn2jWrHt1p86OuBQKfe/JZtBH3s7TxvCyR 
q8mZSr8ZYBi+zbHhWNDEMk4/fL8NWA0e3fVbEhDI5u3uh8Fe5W0TyCrlvZtq7UFRTE 4+6I1l+DEaBdgpeeGEgatDnXiWLsESQe8ozUJAUNJjhRxp9EaKCgeebofDmB9w7tEQ WzTUdijRPHRyw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 30/59] ceph: pass the request to parse_reply_info_readdir() Date: Tue, 5 Apr 2022 15:20:01 -0400 Message-Id: <20220405192030.178326-31-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Instead of passing just the r_reply_info to the readdir reply parser, pass the request pointer directly instead. This will facilitate implementing readdir on fscrypted directories. Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/mds_client.c | 22 ++++++++++++---------- 1 file changed, 12 insertions(+), 10 deletions(-) diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 750a67643850..0a7f18d4df73 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -406,9 +406,10 @@ static int parse_reply_info_trace(void **p, void *end, * parse readdir results */ static int parse_reply_info_readdir(void **p, void *end, - struct ceph_mds_reply_info_parsed *info, - u64 features) + struct ceph_mds_request *req, + u64 features) { + struct ceph_mds_reply_info_parsed *info = &req->r_reply_info; u32 num, i = 0; int err; @@ -650,15 +651,16 @@ static int parse_reply_info_getvxattr(void **p, void *end, * parse extra results */ static int parse_reply_info_extra(void **p, void *end, - struct ceph_mds_reply_info_parsed *info, + struct ceph_mds_request *req, u64 features, struct ceph_mds_session *s) { + struct ceph_mds_reply_info_parsed *info = &req->r_reply_info; u32 op = 
le32_to_cpu(info->head->op); if (op == CEPH_MDS_OP_GETFILELOCK) return parse_reply_info_filelock(p, end, info, features); else if (op == CEPH_MDS_OP_READDIR || op == CEPH_MDS_OP_LSSNAP) - return parse_reply_info_readdir(p, end, info, features); + return parse_reply_info_readdir(p, end, req, features); else if (op == CEPH_MDS_OP_CREATE) return parse_reply_info_create(p, end, info, features, s); else if (op == CEPH_MDS_OP_GETVXATTR) @@ -671,9 +673,9 @@ static int parse_reply_info_extra(void **p, void *end, * parse entire mds reply */ static int parse_reply_info(struct ceph_mds_session *s, struct ceph_msg *msg, - struct ceph_mds_reply_info_parsed *info, - u64 features) + struct ceph_mds_request *req, u64 features) { + struct ceph_mds_reply_info_parsed *info = &req->r_reply_info; void *p, *end; u32 len; int err; @@ -695,7 +697,7 @@ static int parse_reply_info(struct ceph_mds_session *s, struct ceph_msg *msg, ceph_decode_32_safe(&p, end, len, bad); if (len > 0) { ceph_decode_need(&p, end, len, bad); - err = parse_reply_info_extra(&p, p+len, info, features, s); + err = parse_reply_info_extra(&p, p+len, req, features, s); if (err < 0) goto out_bad; } @@ -3440,14 +3442,14 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) } dout("handle_reply tid %lld result %d\n", tid, result); - rinfo = &req->r_reply_info; if (test_bit(CEPHFS_FEATURE_REPLY_ENCODING, &session->s_features)) - err = parse_reply_info(session, msg, rinfo, (u64)-1); + err = parse_reply_info(session, msg, req, (u64)-1); else - err = parse_reply_info(session, msg, rinfo, session->s_con.peer_features); + err = parse_reply_info(session, msg, req, session->s_con.peer_features); mutex_unlock(&mdsc->mutex); /* Must find target inode outside of mutexes to avoid deadlocks */ + rinfo = &req->r_reply_info; if ((err >= 0) && rinfo->head->is_target) { struct inode *in = xchg(&req->r_new_inode, NULL); struct ceph_vino tvino = { From patchwork Tue Apr 5 19:20:02 2022 Content-Type: text/plain; 
charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802384 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B7D4C433F5 for ; Wed, 6 Apr 2022 04:18:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1837422AbiDFEQt (ORCPT ); Wed, 6 Apr 2022 00:16:49 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37212 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573580AbiDETXC (ORCPT ); Tue, 5 Apr 2022 15:23:02 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3758949C83; Tue, 5 Apr 2022 12:21:03 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E7C37B81FA4; Tue, 5 Apr 2022 19:21:01 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 122C4C385A1; Tue, 5 Apr 2022 19:20:59 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186460; bh=8AAlGpHGzqWHPpapHY63mYkU0suyOZJZSALRjDSUPpw=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=TxpDmE10vteztA7ZWpt+hd/qmOIFT2GmrQJw7iHgtjpQrZlDrDEziDOIoqFyr4/YL a2DOzfqYLKxXpmRzvT4PWy2ufrpnOKbi3sak1f82AyhU9n6PjJv53n3cNLC4aPiF0i cC+HeVPsCMZjTFW/r2ZCzwDMzqkDR/0NBlGp44KCD8pEMl16rykOcbplMqGJU4qpbC nJWv4EiprvDeEip5Gdl97oBJJqJ/cAWZuleyhe8VBdabjQQt3iCofBfXOjW3iXcTQ7 groQMQ3UF6+9gf4qfpg6ewlyCNxyzfhIItlCNZ0FCVKV2b/8jaW+PyPF4vq4xuTc/+ nfgrH0vmuUSUg== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, 
linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 31/59] ceph: add ceph_encode_encrypted_dname() helper Date: Tue, 5 Apr 2022 15:20:02 -0400 Message-Id: <20220405192030.178326-32-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Add a new helper that basically calls ceph_encode_encrypted_fname, but with a qstr pointer instead of a dentry pointer. This will make it simpler to decrypt names in a readdir reply, before we have a dentry. Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 11 ++++++++--- fs/ceph/crypto.h | 8 ++++++++ 2 files changed, 16 insertions(+), 3 deletions(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index d63e4a583413..84a48c230bd7 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -128,7 +128,7 @@ void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_se swap(req->r_fscrypt_auth, as->fscrypt_auth); } -int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf) +int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, char *buf) { u32 len; int elen; @@ -143,7 +143,7 @@ int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentr * * See: fscrypt_setup_filename */ - if (!fscrypt_fname_encrypted_size(parent, dentry->d_name.len, NAME_MAX, &len)) + if (!fscrypt_fname_encrypted_size(parent, d_name->len, NAME_MAX, &len)) return -ENAMETOOLONG; /* Allocate a buffer appropriate to hold the result */ @@ -151,7 +151,7 @@ int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentr if (!cryptbuf) return -ENOMEM; - ret = fscrypt_fname_encrypt(parent, &dentry->d_name, cryptbuf, len); + ret = fscrypt_fname_encrypt(parent, d_name, cryptbuf, len); 
if (ret) { kfree(cryptbuf); return ret; @@ -175,6 +175,11 @@ int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentr return elen; } +int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf) +{ + return ceph_encode_encrypted_dname(parent, &dentry->d_name, buf); +} + /** * ceph_fname_to_usr - convert a filename for userland presentation * @fname: ceph_fname to be converted diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 7e56aded5124..e54150260eba 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -67,6 +67,7 @@ void ceph_fscrypt_free_dummy_policy(struct ceph_fs_client *fsc); int ceph_fscrypt_prepare_context(struct inode *dir, struct inode *inode, struct ceph_acl_sec_ctx *as); void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, struct ceph_acl_sec_ctx *as); +int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, char *buf); int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf); static inline int ceph_fname_alloc_buffer(struct inode *parent, struct fscrypt_str *fname) @@ -108,6 +109,13 @@ static inline void ceph_fscrypt_as_ctx_to_req(struct ceph_mds_request *req, { } +static inline int ceph_encode_encrypted_dname(const struct inode *parent, + struct qstr *d_name, char *buf) +{ + memcpy(buf, d_name->name, d_name->len); + return d_name->len; +} + static inline int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf) { From patchwork Tue Apr 5 19:20:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802347 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 62FA1C433FE for ; Wed, 6 Apr 2022 04:16:16 +0000 (UTC) 
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1355793AbiDFEMN (ORCPT ); Wed, 6 Apr 2022 00:12:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37304 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573583AbiDETXD (ORCPT ); Tue, 5 Apr 2022 15:23:03 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8A07E49FA0; Tue, 5 Apr 2022 12:21:04 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id F00B6B81F6B; Tue, 5 Apr 2022 19:21:02 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id EF496C385A0; Tue, 5 Apr 2022 19:21:00 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186461; bh=VqF6Qdkp4d5JLS4cAySm7dR1WjPzpUymO66OZidflG0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=rIY5eLpOF+W3AUlmHNCfcMyWz42sO4tlm7wjW+C0c0fKeD3AK5MYDI+izb3M3BzsK udKrnqESjLqN9vnKPKmLEOQAbMcx4lPy5AWeRoq7yQ+WO1PyqIh9MxC01UHOn70mGX rhZQSH/06+37ZiRO+L6fw4yqhgIIE5v7YEtz3JhPx6B27w7VTOZADIboNTK0PBZ9DQ xMt12hJ+9Sh/d44NW1M/xd6sIY+qmkKEMDx6ybg9epMKQJ4mIbWciU+a/ZxhhO6o0C mb0Q6VXtMiYW3TduyFV03AUQ1WEvyYwuA78KwSUuJpvfR6YknHjxsgagp+nH+pl4OT zK+RjTbLL5gSQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 32/59] ceph: add support to readdir for encrypted filenames Date: Tue, 5 Apr 2022 15:20:03 -0400 Message-Id: <20220405192030.178326-33-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> 
MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Once we've decrypted the names in a readdir reply, we no longer need the crypttext, so overwrite them in ceph_mds_reply_dir_entry with the unencrypted names. Then in both ceph_readdir_prepopulate() and ceph_readdir() we will use the decrypted name directly. [ jlayton: convert some BUG_ONs into error returns ] Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 12 +++++-- fs/ceph/crypto.h | 1 + fs/ceph/dir.c | 35 +++++++++++++++---- fs/ceph/inode.c | 12 ++++--- fs/ceph/mds_client.c | 81 ++++++++++++++++++++++++++++++++++++++++---- fs/ceph/mds_client.h | 4 +-- 6 files changed, 124 insertions(+), 21 deletions(-) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 84a48c230bd7..19c113afb400 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -135,7 +135,10 @@ int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, int ret; u8 *cryptbuf; - WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)); + if (!fscrypt_has_encryption_key(parent)) { + memcpy(buf, d_name->name, d_name->len); + return d_name->len; + } /* * Convert cleartext d_name to ciphertext. If result is longer than @@ -177,6 +180,8 @@ int ceph_encode_encrypted_dname(const struct inode *parent, struct qstr *d_name, int ceph_encode_encrypted_fname(const struct inode *parent, struct dentry *dentry, char *buf) { + WARN_ON_ONCE(!fscrypt_has_encryption_key(parent)); + return ceph_encode_encrypted_dname(parent, &dentry->d_name, buf); } @@ -221,7 +226,10 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, * generating a nokey name via fscrypt. 
*/ if (!fscrypt_has_encryption_key(fname->dir)) { - memcpy(oname->name, fname->name, fname->name_len); + if (fname->no_copy) + oname->name = fname->name; + else + memcpy(oname->name, fname->name, fname->name_len); oname->len = fname->name_len; if (is_nokey) *is_nokey = true; diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index e54150260eba..080905b0c73c 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -19,6 +19,7 @@ struct ceph_fname { unsigned char *ctext; // binary crypttext (if any) u32 name_len; // length of name buffer u32 ctext_len; // length of crypttext + bool no_copy; }; struct ceph_fscrypt_auth { diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index caf2547c3fe1..5ce2a6384e55 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -9,6 +9,7 @@ #include "super.h" #include "mds_client.h" +#include "crypto.h" /* * Directory operations: readdir, lookup, create, link, unlink, @@ -241,7 +242,9 @@ static int __dcache_readdir(struct file *file, struct dir_context *ctx, di = ceph_dentry(dentry); if (d_unhashed(dentry) || d_really_is_negative(dentry) || - di->lease_shared_gen != shared_gen) { + di->lease_shared_gen != shared_gen || + ((dentry->d_flags & DCACHE_NOKEY_NAME) && + fscrypt_has_encryption_key(dir))) { spin_unlock(&dentry->d_lock); dput(dentry); err = -EAGAIN; @@ -340,6 +343,10 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) ctx->pos = 2; } + err = fscrypt_prepare_readdir(inode); + if (err) + return err; + spin_lock(&ci->i_ceph_lock); /* request Fx cap. if have Fx, we don't need to release Fs cap * for later create/unlink. 
*/ @@ -389,6 +396,7 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS); if (IS_ERR(req)) return PTR_ERR(req); + err = ceph_alloc_readdir_reply_buffer(req, inode); if (err) { ceph_mdsc_put_request(req); @@ -402,11 +410,20 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) req->r_inode_drop = CEPH_CAP_FILE_EXCL; } if (dfi->last_name) { - req->r_path2 = kstrdup(dfi->last_name, GFP_KERNEL); + struct qstr d_name = { .name = dfi->last_name, + .len = strlen(dfi->last_name) }; + + req->r_path2 = kzalloc(NAME_MAX + 1, GFP_KERNEL); if (!req->r_path2) { ceph_mdsc_put_request(req); return -ENOMEM; } + + err = ceph_encode_encrypted_dname(inode, &d_name, req->r_path2); + if (err < 0) { + ceph_mdsc_put_request(req); + return err; + } } else if (is_hash_order(ctx->pos)) { req->r_args.readdir.offset_hash = cpu_to_le32(fpos_hash(ctx->pos)); @@ -511,15 +528,20 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) for (; i < rinfo->dir_nr; i++) { struct ceph_mds_reply_dir_entry *rde = rinfo->dir_entries + i; - BUG_ON(rde->offset < ctx->pos); + if (rde->offset < ctx->pos) { + pr_warn("%s: rde->offset 0x%llx ctx->pos 0x%llx\n", + __func__, rde->offset, ctx->pos); + return -EIO; + } + + if (WARN_ON_ONCE(!rde->inode.in)) + return -EIO; ctx->pos = rde->offset; dout("readdir (%d/%d) -> %llx '%.*s' %p\n", i, rinfo->dir_nr, ctx->pos, rde->name_len, rde->name, &rde->inode.in); - BUG_ON(!rde->inode.in); - if (!dir_emit(ctx, rde->name, rde->name_len, ceph_present_ino(inode->i_sb, le64_to_cpu(rde->inode.in->ino)), le32_to_cpu(rde->inode.in->mode) >> 12)) { @@ -532,6 +554,8 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) dout("filldir stopping us...\n"); return 0; } + + /* Reset the lengths to their original allocated vals */ ctx->pos++; } @@ -586,7 +610,6 @@ static int ceph_readdir(struct file *file, struct dir_context *ctx) dfi->dir_ordered_count); 
spin_unlock(&ci->i_ceph_lock); } - dout("readdir %p file %p done.\n", inode, file); return 0; } diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 37c2c2977235..d1ade1651214 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -1751,7 +1751,8 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, struct ceph_mds_session *session) { struct dentry *parent = req->r_dentry; - struct ceph_inode_info *ci = ceph_inode(d_inode(parent)); + struct inode *inode = d_inode(parent); + struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_reply_info_parsed *rinfo = &req->r_reply_info; struct qstr dname; struct dentry *dn; @@ -1825,9 +1826,7 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, tvino.snap = le64_to_cpu(rde->inode.in->snapid); if (rinfo->hash_order) { - u32 hash = ceph_str_hash(ci->i_dir_layout.dl_dir_hash, - rde->name, rde->name_len); - hash = ceph_frag_value(hash); + u32 hash = ceph_frag_value(rde->raw_hash); if (hash != last_hash) fpos_offset = 2; last_hash = hash; @@ -1850,6 +1849,11 @@ int ceph_readdir_prepopulate(struct ceph_mds_request *req, err = -ENOMEM; goto out; } + if (rde->is_nokey) { + spin_lock(&dn->d_lock); + dn->d_flags |= DCACHE_NOKEY_NAME; + spin_unlock(&dn->d_lock); + } } else if (d_really_is_positive(dn) && (ceph_ino(d_inode(dn)) != tvino.ino || ceph_snap(d_inode(dn)) != tvino.snap)) { diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 0a7f18d4df73..50fe77768295 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -439,20 +439,87 @@ static int parse_reply_info_readdir(void **p, void *end, info->dir_nr = num; while (num) { + struct inode *inode = d_inode(req->r_dentry); + struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_mds_reply_dir_entry *rde = info->dir_entries + i; + struct fscrypt_str tname = FSTR_INIT(NULL, 0); + struct fscrypt_str oname = FSTR_INIT(NULL, 0); + struct ceph_fname fname; + u32 altname_len, _name_len; + u8 *altname, *_name; + /* dentry */ - ceph_decode_32_safe(p, 
end, rde->name_len, bad); - ceph_decode_need(p, end, rde->name_len, bad); - rde->name = *p; - *p += rde->name_len; - dout("parsed dir dname '%.*s'\n", rde->name_len, rde->name); + ceph_decode_32_safe(p, end, _name_len, bad); + ceph_decode_need(p, end, _name_len, bad); + _name = *p; + *p += _name_len; + dout("parsed dir dname '%.*s'\n", _name_len, _name); + + if (info->hash_order) + rde->raw_hash = ceph_str_hash(ci->i_dir_layout.dl_dir_hash, + _name, _name_len); /* dentry lease */ err = parse_reply_info_lease(p, end, &rde->lease, features, - &rde->altname_len, &rde->altname); + &altname_len, &altname); if (err) goto out_bad; + /* + * Try to decrypt the dentry names and update them + * in the ceph_mds_reply_dir_entry struct. + */ + fname.dir = inode; + fname.name = _name; + fname.name_len = _name_len; + fname.ctext = altname; + fname.ctext_len = altname_len; + /* + * The _name_len may be larger than altname_len, such as + * when the human readable name length is in the range + * (CEPH_NOHASH_NAME_MAX, CEPH_NOHASH_NAME_MAX + SHA256_DIGEST_SIZE); + * in that case the copy in ceph_fname_to_usr would corrupt + * the data if there is no encryption key. + * + * Just set the no_copy flag; then, if there is no + * encryption key, oname.name will always be assigned + * to _name. + */ + fname.no_copy = true; + if (altname_len == 0) { + /* + * Set tname to _name, and this will be used + * to do the base64_decode in-place. It's + * safe because the decoded string should + * always be shorter, about 3/4 of the original + * string. + */ + tname.name = _name; + + /* + * Set oname to _name too, and this will be + * used to do the decryption in-place. + */ + oname.name = _name; + oname.len = _name_len; + } else { + /* + * This will do the decryption only in-place + * from altname crypttext directly. 
+ */ + oname.name = altname; + oname.len = altname_len; + } + rde->is_nokey = false; + err = ceph_fname_to_usr(&fname, &tname, &oname, &rde->is_nokey); + if (err) { + pr_err("%s unable to decode %.*s, got %d\n", __func__, + _name_len, _name, err); + goto out_bad; + } + rde->name = oname.name; + rde->name_len = oname.len; + /* inode */ err = parse_reply_info_in(p, end, &rde->inode, features); if (err < 0) @@ -3501,7 +3568,7 @@ static void handle_reply(struct ceph_mds_session *session, struct ceph_msg *msg) if (err == 0) { if (result == 0 && (req->r_op == CEPH_MDS_OP_READDIR || req->r_op == CEPH_MDS_OP_LSSNAP)) - ceph_readdir_prepopulate(req, req->r_session); + err = ceph_readdir_prepopulate(req, req->r_session); } current->journal_info = NULL; mutex_unlock(&req->r_fill_mutex); diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index cd719691a86d..046a9368c4a9 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -96,10 +96,10 @@ struct ceph_mds_reply_info_in { }; struct ceph_mds_reply_dir_entry { + bool is_nokey; char *name; - u8 *altname; u32 name_len; - u32 altname_len; + u32 raw_hash; struct ceph_mds_reply_lease *lease; struct ceph_mds_reply_info_in inode; loff_t offset; From patchwork Tue Apr 5 19:20:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802367 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D1A4C4321E for ; Wed, 6 Apr 2022 04:16:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1448775AbiDFEPb (ORCPT ); Wed, 6 Apr 2022 00:15:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37214 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573579AbiDETXC (ORCPT ); 
Tue, 5 Apr 2022 15:23:02 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 97D4D49C84; Tue, 5 Apr 2022 12:21:03 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 1EE80618CD; Tue, 5 Apr 2022 19:21:03 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D8933C385A3; Tue, 5 Apr 2022 19:21:01 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186462; bh=AWzsTg+UTHKEk8QehJZJCF5BnvPHMbE6gVQ/WVlXD74=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=CDMkMuVHglb+rWTsUJfqPzgEB06jnyqqRgnZaA2bbXB+XHGmjqnPXsuSIq+jsHJrJ AKrKgtP9E9nEkZFLSQSlKUxn20XBSRcyCYZVhYb0Iax96PfpmmtkHSb4CL4VHM89vw vtTSoBSvPZseTRdOiwO2mqAW81J17pOwso4FtP5T4vYxWzKxXS+1Uv0OJ4o7nn8s0L wV0CDY/jvORblSCm1Nn+Ndx8yWlArMcdPHSw452smhJKbX5ra7/lQojjfroP99ORqA QePVt1bw8vzWpLEquTxh3YhVe+IlHcsUp63IEM45NU0kmv18OWTGNzRaXX9f0b8qCB 5Qe5zpNlRbM/w== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 33/59] ceph: create symlinks with encrypted and base64-encoded targets Date: Tue, 5 Apr 2022 15:20:04 -0400 Message-Id: <20220405192030.178326-34-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org When creating symlinks in encrypted directories, encrypt and base64-encode the target with the new inode's key before sending to the MDS. 
When filling a symlinked inode, base64-decode it into a buffer that we'll keep in ci->i_symlink. When get_link is called, decrypt the buffer into a new one that will hang off i_link. Signed-off-by: Jeff Layton --- fs/ceph/dir.c | 51 ++++++++++++++++++++--- fs/ceph/inode.c | 107 ++++++++++++++++++++++++++++++++++++++++++------ 2 files changed, 141 insertions(+), 17 deletions(-) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 5ce2a6384e55..82a5f37e9d4a 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -942,6 +942,40 @@ static int ceph_create(struct user_namespace *mnt_userns, struct inode *dir, return ceph_mknod(mnt_userns, dir, dentry, mode, 0); } +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) +static int prep_encrypted_symlink_target(struct ceph_mds_request *req, const char *dest) +{ + int err; + int len = strlen(dest); + struct fscrypt_str osd_link = FSTR_INIT(NULL, 0); + + err = fscrypt_prepare_symlink(req->r_parent, dest, len, PATH_MAX, &osd_link); + if (err) + goto out; + + err = fscrypt_encrypt_symlink(req->r_new_inode, dest, len, &osd_link); + if (err) + goto out; + + req->r_path2 = kmalloc(FSCRYPT_BASE64URL_CHARS(osd_link.len) + 1, GFP_KERNEL); + if (!req->r_path2) { + err = -ENOMEM; + goto out; + } + + len = fscrypt_base64url_encode(osd_link.name, osd_link.len, req->r_path2); + req->r_path2[len] = '\0'; +out: + fscrypt_fname_free_buffer(&osd_link); + return err; +} +#else +static int prep_encrypted_symlink_target(struct ceph_mds_request *req, const char *dest) +{ + return -EOPNOTSUPP; +} +#endif + static int ceph_symlink(struct user_namespace *mnt_userns, struct inode *dir, struct dentry *dentry, const char *dest) { @@ -973,14 +1007,21 @@ static int ceph_symlink(struct user_namespace *mnt_userns, struct inode *dir, goto out_req; } - req->r_path2 = kstrdup(dest, GFP_KERNEL); - if (!req->r_path2) { - err = -ENOMEM; - goto out_req; - } req->r_parent = dir; ihold(dir); + if (IS_ENCRYPTED(req->r_new_inode)) { + err = prep_encrypted_symlink_target(req, dest); + if (err) 
+ goto out_req; + } else { + req->r_path2 = kstrdup(dest, GFP_KERNEL); + if (!req->r_path2) { + err = -ENOMEM; + goto out_req; + } + } + set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags); req->r_dentry = dget(dentry); req->r_num_caps = 2; diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index d1ade1651214..bb1b1a57970c 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -35,6 +35,7 @@ */ static const struct inode_operations ceph_symlink_iops; +static const struct inode_operations ceph_encrypted_symlink_iops; static void ceph_inode_work(struct work_struct *work); @@ -639,6 +640,7 @@ void ceph_free_inode(struct inode *inode) #ifdef CONFIG_FS_ENCRYPTION kfree(ci->fscrypt_auth); #endif + fscrypt_free_inode(inode); kmem_cache_free(ceph_inode_cachep, ci); } @@ -836,6 +838,34 @@ void ceph_fill_file_time(struct inode *inode, int issued, inode, time_warp_seq, ci->i_time_warp_seq); } +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) +static int decode_encrypted_symlink(const char *encsym, int enclen, u8 **decsym) +{ + int declen; + u8 *sym; + + sym = kmalloc(enclen + 1, GFP_NOFS); + if (!sym) + return -ENOMEM; + + declen = fscrypt_base64url_decode(encsym, enclen, sym); + if (declen < 0) { + pr_err("%s: can't decode symlink (%d). Content: %.*s\n", + __func__, declen, enclen, encsym); + kfree(sym); + return -EIO; + } + sym[declen + 1] = '\0'; + *decsym = sym; + return declen; +} +#else +static int decode_encrypted_symlink(const char *encsym, int symlen, u8 **decsym) +{ + return -EOPNOTSUPP; +} +#endif + /* * Populate an inode based on info from mds. May be called on new or * existing inodes. 
@@ -1070,26 +1100,39 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, inode->i_fop = &ceph_file_fops; break; case S_IFLNK: - inode->i_op = &ceph_symlink_iops; if (!ci->i_symlink) { u32 symlen = iinfo->symlink_len; char *sym; spin_unlock(&ci->i_ceph_lock); - if (symlen != i_size_read(inode)) { - pr_err("%s %llx.%llx BAD symlink " - "size %lld\n", __func__, - ceph_vinop(inode), - i_size_read(inode)); + if (IS_ENCRYPTED(inode)) { + if (symlen != i_size_read(inode)) + pr_err("%s %llx.%llx BAD symlink size %lld\n", + __func__, ceph_vinop(inode), i_size_read(inode)); + + err = decode_encrypted_symlink(iinfo->symlink, symlen, (u8 **)&sym); + if (err < 0) { + pr_err("%s decoding encrypted symlink failed: %d\n", + __func__, err); + goto out; + } + symlen = err; i_size_write(inode, symlen); inode->i_blocks = calc_inode_blocks(symlen); - } + } else { + if (symlen != i_size_read(inode)) { + pr_err("%s %llx.%llx BAD symlink size %lld\n", + __func__, ceph_vinop(inode), i_size_read(inode)); + i_size_write(inode, symlen); + inode->i_blocks = calc_inode_blocks(symlen); + } - err = -ENOMEM; - sym = kstrndup(iinfo->symlink, symlen, GFP_NOFS); - if (!sym) - goto out; + err = -ENOMEM; + sym = kstrndup(iinfo->symlink, symlen, GFP_NOFS); + if (!sym) + goto out; + } spin_lock(&ci->i_ceph_lock); if (!ci->i_symlink) @@ -1097,7 +1140,17 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, else kfree(sym); /* lost a race */ } - inode->i_link = ci->i_symlink; + + if (IS_ENCRYPTED(inode)) { + /* + * Encrypted symlinks need to be decrypted before we can + * cache their targets in i_link. Don't touch it here. 
+ */ + inode->i_op = &ceph_encrypted_symlink_iops; + } else { + inode->i_link = ci->i_symlink; + inode->i_op = &ceph_symlink_iops; + } break; case S_IFDIR: inode->i_op = &ceph_dir_iops; @@ -2126,6 +2179,29 @@ static void ceph_inode_work(struct work_struct *work) iput(inode); } +static const char *ceph_encrypted_get_link(struct dentry *dentry, struct inode *inode, + struct delayed_call *done) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + + if (!dentry) + return ERR_PTR(-ECHILD); + + return fscrypt_get_symlink(inode, ci->i_symlink, i_size_read(inode), done); +} + +static int ceph_encrypted_symlink_getattr(struct user_namespace *mnt_userns, + const struct path *path, struct kstat *stat, + u32 request_mask, unsigned int query_flags) +{ + int ret; + + ret = ceph_getattr(mnt_userns, path, stat, request_mask, query_flags); + if (ret) + return ret; + return fscrypt_symlink_getattr(path, stat); +} + /* * symlinks */ @@ -2136,6 +2212,13 @@ static const struct inode_operations ceph_symlink_iops = { .listxattr = ceph_listxattr, }; +static const struct inode_operations ceph_encrypted_symlink_iops = { + .get_link = ceph_encrypted_get_link, + .setattr = ceph_setattr, + .getattr = ceph_encrypted_symlink_getattr, + .listxattr = ceph_listxattr, +}; + int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *cia) { struct ceph_inode_info *ci = ceph_inode(inode); From patchwork Tue Apr 5 19:20:05 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802371 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7B94BC4167D for ; Wed, 6 Apr 2022 04:16:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1452848AbiDFEQD (ORCPT ); Wed, 6 Apr 2022 00:16:03 
-0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37258 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573581AbiDETXD (ORCPT ); Tue, 5 Apr 2022 15:23:03 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 65A4549F92; Tue, 5 Apr 2022 12:21:04 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 03FDF618A0; Tue, 5 Apr 2022 19:21:04 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id C13BEC385A1; Tue, 5 Apr 2022 19:21:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186463; bh=cDPy0xI578WfkDIktMt8wJvIbPGQCoBzzMq2Fpkpaf4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=O58UFv1Xdgf5DMHZT5h+btU7dWQqw0aMYUPwmFYCQ177ViMlNAkPvRYNs2VrT2VMB 8Mljj04e8YcFXzFo2t2PaNyunOFVx/R1Wc0hPzMy86axMHWVmIlWtkEfuSwkvLL8hE kwmXQgSqsd3DsSeYqz7PIf6w1aEl4K0GRzUGgEak5/5HB84YBomMs+o7D+0DoBajIv N4UgUQGJUZu0oGvdp61sTfpLziUZcaLxbQ7xXa4Y8Kf9Lhm39z5TRIzEwz/9SEF2DT vtblc3YJofL+1otmKbBz/whZEc2vI7ojdqvSHusBhkDPl8eCm0g0Z/nW07kURihgOm UJM+I/hw93PWg== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 34/59] ceph: make ceph_get_name decrypt filenames Date: Tue, 5 Apr 2022 15:20:05 -0400 Message-Id: <20220405192030.178326-35-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org When we do a lookupino to the MDS, we get a filename in the 
trace. ceph_get_name uses that name directly, so we must properly decrypt it before copying it to the name buffer. Signed-off-by: Jeff Layton --- fs/ceph/export.c | 44 ++++++++++++++++++++++++++++++++------------ 1 file changed, 32 insertions(+), 12 deletions(-) diff --git a/fs/ceph/export.c b/fs/ceph/export.c index e0fa66ac8b9f..0ebf2bd93055 100644 --- a/fs/ceph/export.c +++ b/fs/ceph/export.c @@ -7,6 +7,7 @@ #include "super.h" #include "mds_client.h" +#include "crypto.h" /* * Basic fh @@ -534,7 +535,9 @@ static int ceph_get_name(struct dentry *parent, char *name, { struct ceph_mds_client *mdsc; struct ceph_mds_request *req; + struct inode *dir = d_inode(parent); struct inode *inode = d_inode(child); + struct ceph_mds_reply_info_parsed *rinfo; int err; if (ceph_snap(inode) != CEPH_NOSNAP) @@ -546,30 +549,47 @@ static int ceph_get_name(struct dentry *parent, char *name, if (IS_ERR(req)) return PTR_ERR(req); - inode_lock(d_inode(parent)); - + inode_lock(dir); req->r_inode = inode; ihold(inode); req->r_ino2 = ceph_vino(d_inode(parent)); - req->r_parent = d_inode(parent); - ihold(req->r_parent); + req->r_parent = dir; + ihold(dir); set_bit(CEPH_MDS_R_PARENT_LOCKED, &req->r_req_flags); req->r_num_caps = 2; err = ceph_mdsc_do_request(mdsc, NULL, req); + inode_unlock(dir); - inode_unlock(d_inode(parent)); + if (err) + goto out; - if (!err) { - struct ceph_mds_reply_info_parsed *rinfo = &req->r_reply_info; + rinfo = &req->r_reply_info; + if (!IS_ENCRYPTED(dir)) { memcpy(name, rinfo->dname, rinfo->dname_len); name[rinfo->dname_len] = 0; - dout("get_name %p ino %llx.%llx name %s\n", - child, ceph_vinop(inode), name); } else { - dout("get_name %p ino %llx.%llx err %d\n", - child, ceph_vinop(inode), err); - } + struct fscrypt_str oname = FSTR_INIT(NULL, 0); + struct ceph_fname fname = { .dir = dir, + .name = rinfo->dname, + .ctext = rinfo->altname, + .name_len = rinfo->dname_len, + .ctext_len = rinfo->altname_len }; + + err = ceph_fname_alloc_buffer(dir, &oname); + if (err < 
0) + goto out; + err = ceph_fname_to_usr(&fname, NULL, &oname, NULL); + if (!err) { + memcpy(name, oname.name, oname.len); + name[oname.len] = 0; + } + ceph_fname_free_buffer(dir, &oname); + } +out: + dout("get_name %p ino %llx.%llx err %d %s%s\n", + child, ceph_vinop(inode), err, + err ? "" : "name ", err ? "" : name); ceph_mdsc_put_request(req); return err; } From patchwork Tue Apr 5 19:20:06 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802378 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05695C433EF for ; Wed, 6 Apr 2022 04:18:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1576288AbiDFEQb (ORCPT ); Wed, 6 Apr 2022 00:16:31 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37306 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573582AbiDETXD (ORCPT ); Tue, 5 Apr 2022 15:23:03 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 667794A3E2; Tue, 5 Apr 2022 12:21:05 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id E8BDB616C5; Tue, 5 Apr 2022 19:21:04 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A93FDC385A0; Tue, 5 Apr 2022 19:21:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186464; bh=lVmRbo/Yqwtaeh9CAL98bk9QWDa/T4bzwxEXkoS7RR8=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; 
b=U3k5xKk6McwExr1pInRLBBwT0PSd1ja7gfw4fXGYf3G9rG8Ff0FQihQw5WlIJ6hdh eilnR53Iy/8He6MnHF9HXc/rOlO3CTktZJ96PYQ6+HfKDGEjcv/AwAp0Yib1Wu3UZb oBDt31SstSUj/fbCPYrGAI99ydS05zt19JY8y2t8WwmZ2l72D/zPU/XdeyrRLGEBK7 Scljqp3FqosbSdOpfTalLCkMJ6XebyLgJnzAExnhcaPg1E3ho/VALdQ0FwvMS7jIKX Nv6N+eh8OIiISDWvPabMqwvqr3gE674oMCwPh5aQhtiZ/1NVxG/qiQGEYSTpw1ASOf qA6lOUAaTe+IQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 35/59] ceph: add a new ceph.fscrypt.auth vxattr Date: Tue, 5 Apr 2022 15:20:06 -0400 Message-Id: <20220405192030.178326-36-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Give the client a way to get at the xattr from userland, mostly for future debugging purposes. Signed-off-by: Jeff Layton --- fs/ceph/xattr.c | 26 ++++++++++++++++++++++++++ 1 file changed, 26 insertions(+) diff --git a/fs/ceph/xattr.c b/fs/ceph/xattr.c index 58628cef4207..e080116608b2 100644 --- a/fs/ceph/xattr.c +++ b/fs/ceph/xattr.c @@ -352,6 +352,23 @@ static ssize_t ceph_vxattrcb_auth_mds(struct ceph_inode_info *ci, return ret; } +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) +static bool ceph_vxattrcb_fscrypt_auth_exists(struct ceph_inode_info *ci) +{ + return ci->fscrypt_auth_len; +} + +static ssize_t ceph_vxattrcb_fscrypt_auth(struct ceph_inode_info *ci, char *val, size_t size) +{ + if (size) { + if (size < ci->fscrypt_auth_len) + return -ERANGE; + memcpy(val, ci->fscrypt_auth, ci->fscrypt_auth_len); + } + return ci->fscrypt_auth_len; +} +#endif /* CONFIG_FS_ENCRYPTION */ + #define CEPH_XATTR_NAME(_type, _name) XATTR_CEPH_PREFIX #_type "." #_name #define CEPH_XATTR_NAME2(_type, _name, _name2) \ XATTR_CEPH_PREFIX #_type "." 
#_name "." #_name2 @@ -500,6 +517,15 @@ static struct ceph_vxattr ceph_common_vxattrs[] = { .exists_cb = NULL, .flags = VXATTR_FLAG_READONLY, }, +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) + { + .name = "ceph.fscrypt.auth", + .name_size = sizeof("ceph.fscrypt.auth"), + .getxattr_cb = ceph_vxattrcb_fscrypt_auth, + .exists_cb = ceph_vxattrcb_fscrypt_auth_exists, + .flags = VXATTR_FLAG_READONLY, + }, +#endif /* CONFIG_FS_ENCRYPTION */ { .name = NULL, 0 } /* Required table terminator */ }; From patchwork Tue Apr 5 19:20:07 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802374 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 931C1C4332F for ; Wed, 6 Apr 2022 04:16:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1574923AbiDFEQ2 (ORCPT ); Wed, 6 Apr 2022 00:16:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37376 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573584AbiDETXF (ORCPT ); Tue, 5 Apr 2022 15:23:05 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 3584A4A910; Tue, 5 Apr 2022 12:21:06 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id C9B9C61899; Tue, 5 Apr 2022 19:21:05 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 91C0BC385A3; Tue, 5 Apr 2022 19:21:04 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186465; 
bh=z6i7+1naGRJ17oSoDdDmMlOtJ7LJFiXxSfJ3S7KoaqU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RiQdlvTzhjRwJP6Mzz9TVIY8iE/q/hyXdo9g0rYao26aQZ1bdfKyHo0pqS6F14EzI ZXz4qCi4F2Gq9w+c/tqyiN3WVqfsLbQnm5lb6Ahhu5O34fp1IQ/uZhnjUmIJtYFVc2 6DYDq4dqvi57kRsSE6E3RKRqP6EzO3tdm9tIQoxo2ypRSlnf58h2UXm3/aVDKuQjUr 2f9my0R6nk11CuVC6dGcHEWAfex93Y/5WoVfxiBprna+/TwpIIa84GLl6HL2qr1mke XF8yl3o6S+ZwgNGiRIbVP1awnZnjxEVq64YkI6shrd0CtpjCZ+IANo8V+1G1rBu3JP 53BgivWs/Bwlw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 36/59] ceph: add some fscrypt guardrails Date: Tue, 5 Apr 2022 15:20:07 -0400 Message-Id: <20220405192030.178326-37-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Add the appropriate calls into fscrypt for various actions, including link, rename, setattr, and the open codepaths. 
Signed-off-by: Jeff Layton --- fs/ceph/dir.c | 8 ++++++++ fs/ceph/file.c | 14 +++++++++++++- fs/ceph/inode.c | 4 ++++ 3 files changed, 25 insertions(+), 1 deletion(-) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 82a5f37e9d4a..8a9f916bfc6c 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -1121,6 +1121,10 @@ static int ceph_link(struct dentry *old_dentry, struct inode *dir, if (ceph_snap(dir) != CEPH_NOSNAP) return -EROFS; + err = fscrypt_prepare_link(old_dentry, dir, dentry); + if (err) + return err; + dout("link in dir %p old_dentry %p dentry %p\n", dir, old_dentry, dentry); req = ceph_mdsc_create_request(mdsc, CEPH_MDS_OP_LINK, USE_AUTH_MDS); @@ -1318,6 +1322,10 @@ static int ceph_rename(struct user_namespace *mnt_userns, struct inode *old_dir, (!ceph_quota_is_same_realm(old_dir, new_dir))) return -EXDEV; + err = fscrypt_prepare_rename(old_dir, old_dentry, new_dir, new_dentry, flags); + if (err) + return err; + dout("rename dir %p dentry %p to dir %p dentry %p\n", old_dir, old_dentry, new_dir, new_dentry); req = ceph_mdsc_create_request(mdsc, op, USE_AUTH_MDS); diff --git a/fs/ceph/file.c b/fs/ceph/file.c index dfc02caf4229..a3afdb9cfddb 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -372,8 +372,13 @@ int ceph_open(struct inode *inode, struct file *file) /* filter out O_CREAT|O_EXCL; vfs did that already. yuck. 
*/ flags = file->f_flags & ~(O_CREAT|O_EXCL); - if (S_ISDIR(inode->i_mode)) + if (S_ISDIR(inode->i_mode)) { flags = O_DIRECTORY; /* mds likes to know */ + } else if (S_ISREG(inode->i_mode)) { + err = fscrypt_file_open(inode, file); + if (err) + return err; + } dout("open inode %p ino %llx.%llx file %p flags %d (%d)\n", inode, ceph_vinop(inode), file, flags, file->f_flags); @@ -847,6 +852,13 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, dout("atomic_open finish_no_open on dn %p\n", dn); err = finish_no_open(file, dn); } else { + if (IS_ENCRYPTED(dir) && + !fscrypt_has_permitted_context(dir, d_inode(dentry))) { + pr_warn("Inconsistent encryption context (parent %llx:%llx child %llx:%llx)\n", + ceph_vinop(dir), ceph_vinop(d_inode(dentry))); + goto out_req; + } + dout("atomic_open finish_open on dn %p\n", dn); if (req->r_op == CEPH_MDS_OP_CREATE && req->r_reply_info.has_create_ino) { struct inode *newino = d_inode(dentry); diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index bb1b1a57970c..183b9f52dc7d 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -2487,6 +2487,10 @@ int ceph_setattr(struct user_namespace *mnt_userns, struct dentry *dentry, if (ceph_inode_is_shutdown(inode)) return -ESTALE; + err = fscrypt_prepare_setattr(dentry, attr); + if (err) + return err; + err = setattr_prepare(&init_user_ns, dentry, attr); if (err != 0) return err; From patchwork Tue Apr 5 19:20:08 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802342 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id C2385C433EF for ; Wed, 6 Apr 2022 04:15:12 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1349820AbiDFEK2 (ORCPT ); Wed, 6 Apr 2022 00:10:28 -0400 
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37518 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573585AbiDETXH (ORCPT ); Tue, 5 Apr 2022 15:23:07 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id C143D4B1EB; Tue, 5 Apr 2022 12:21:08 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 7B4A8B81F6B; Tue, 5 Apr 2022 19:21:07 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 7B24EC385A1; Tue, 5 Apr 2022 19:21:05 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186466; bh=btEIS1j3Ond3ZP9qah9C8gQBs8ZC16Vm5hgbF/uIHr4=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=eFlKOuSw0cSgBPQB0RlcnyLlamcccO007IB5K9wTb5VNPisX/4Cf3Pn9gEdDRsw4H IY6r5l6CAnWZaSsl3esf9kK0VkAeG8GRhNm2OS4K9hMBKVTH+vm3COFoGjOVmaK2gH v2nWQIT8n5qcckt58lWop10s/ccjos9RPzvMYEz1fViJMrxArI0Lw2oqLwOlvXY7pC bok/BFjrPvYmL53exoHlzMaeKWRmoW08zwFYle2DISIeqRGojpE6Sz5u4TcTtzc5zR qj0ZW0HO9T/vz/ztELK/WSc8YwMHUwhjQjqfa85/WOuvoFKIqRlLEryzCyjW5STrfG s0Ji8x71s/r4A== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 37/59] ceph: don't allow changing layout on encrypted files/directories Date: Tue, 5 Apr 2022 15:20:08 -0400 Message-Id: <20220405192030.178326-38-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Luis Henriques Encryption is currently 
only supported on files/directories with layouts where stripe_count=1. Forbid changing layouts when encryption is involved. Signed-off-by: Luis Henriques Signed-off-by: Jeff Layton --- fs/ceph/ioctl.c | 4 ++++ 1 file changed, 4 insertions(+) diff --git a/fs/ceph/ioctl.c b/fs/ceph/ioctl.c index b9f0f4e460ab..9675ef3a6c47 100644 --- a/fs/ceph/ioctl.c +++ b/fs/ceph/ioctl.c @@ -294,6 +294,10 @@ static long ceph_set_encryption_policy(struct file *file, unsigned long arg) struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); + /* encrypted directories can't have striped layout */ + if (ci->i_layout.stripe_count > 1) + return -EINVAL; + ret = vet_mds_for_fscrypt(file); if (ret) return ret; From patchwork Tue Apr 5 19:20:09 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802341 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6D4EBC433EF for ; Wed, 6 Apr 2022 04:07:56 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239518AbiDFEJN (ORCPT ); Wed, 6 Apr 2022 00:09:13 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37640 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573587AbiDETXJ (ORCPT ); Tue, 5 Apr 2022 15:23:09 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 95F234B842; Tue, 5 Apr 2022 12:21:09 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 43BAEB81FA5; Tue, 5 Apr 2022 19:21:08 +0000 (UTC) 
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 64BBCC385A0; Tue, 5 Apr 2022 19:21:06 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186467; bh=X85CZobPXWulW9mgO7+FK15JdkLRqeyY4fhr5BUdygo=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=FroQz6bZzBiDzl4/Y8m8ivQwBBM15kgJDRQkBcOis27RIWqJXdO/Up+Ja0+ps5+1G lybMxpr+MB0gRbaNM9Op70KXPihW9axLWkrm04rpRfRDHuT1Mp28xgJehOxjpBPHnQ vmi+J8QmBbzd5p39I7Dv75ece+Qil/41XLC4pjfDJajYKp7hAnfjbXHyRlGETbSqAt LoQYbDx2D33rSSgYWZsRVZrlcKs/KF9q8kvoMHh1IDXh5+c0i1N4DIPZs4QZpHPP4g a56POnYOgmaSA8+WsW93I3VRAPJV1W8n7s7nIKHB8s+g7L95PiezKBoDInvKh7JoEK 9Mnv85xTgyntg== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 38/59] libceph: add CEPH_OSD_OP_ASSERT_VER support Date: Tue, 5 Apr 2022 15:20:09 -0400 Message-Id: <20220405192030.178326-39-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org ...and record the user_version in the reply in a new field in ceph_osd_request, so we can populate the assert_ver appropriately. Shuffle the fields a bit too so that the new field fits in an existing hole on x86_64. 
Signed-off-by: Jeff Layton --- include/linux/ceph/osd_client.h | 6 +++++- include/linux/ceph/rados.h | 4 ++++ net/ceph/osd_client.c | 5 +++++ 3 files changed, 14 insertions(+), 1 deletion(-) diff --git a/include/linux/ceph/osd_client.h b/include/linux/ceph/osd_client.h index 4088601beacc..8c7f34df66d3 100644 --- a/include/linux/ceph/osd_client.h +++ b/include/linux/ceph/osd_client.h @@ -196,6 +196,9 @@ struct ceph_osd_req_op { u32 src_fadvise_flags; struct ceph_osd_data osd_data; } copy_from; + struct { + u64 ver; + } assert_ver; }; }; @@ -250,6 +253,7 @@ struct ceph_osd_request { struct ceph_osd_client *r_osdc; struct kref r_kref; bool r_mempool; + bool r_linger; /* don't resend on failure */ struct completion r_completion; /* private to osd_client.c */ ceph_osdc_callback_t r_callback; @@ -262,9 +266,9 @@ struct ceph_osd_request { struct ceph_snap_context *r_snapc; /* for writes */ struct timespec64 r_mtime; /* ditto */ u64 r_data_offset; /* ditto */ - bool r_linger; /* don't resend on failure */ /* internal */ + u64 r_version; /* data version sent in reply */ unsigned long r_stamp; /* jiffies, send or check time */ unsigned long r_start_stamp; /* jiffies */ ktime_t r_start_latency; /* ktime_t */ diff --git a/include/linux/ceph/rados.h b/include/linux/ceph/rados.h index 43a7a1573b51..73c3efbec36c 100644 --- a/include/linux/ceph/rados.h +++ b/include/linux/ceph/rados.h @@ -523,6 +523,10 @@ struct ceph_osd_op { struct { __le64 cookie; } __attribute__ ((packed)) notify; + struct { + __le64 unused; + __le64 ver; + } __attribute__ ((packed)) assert_ver; struct { __le64 offset, length; __le64 src_offset; diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c index acf6a19b6677..febdd728b2fb 100644 --- a/net/ceph/osd_client.c +++ b/net/ceph/osd_client.c @@ -1042,6 +1042,10 @@ static u32 osd_req_encode_op(struct ceph_osd_op *dst, dst->copy_from.src_fadvise_flags = cpu_to_le32(src->copy_from.src_fadvise_flags); break; + case CEPH_OSD_OP_ASSERT_VER: + 
dst->assert_ver.unused = cpu_to_le64(0); + dst->assert_ver.ver = cpu_to_le64(src->assert_ver.ver); + break; default: pr_err("unsupported osd opcode %s\n", ceph_osd_op_name(src->op)); @@ -3804,6 +3808,7 @@ static void handle_reply(struct ceph_osd *osd, struct ceph_msg *msg) * one (type of) reply back. */ WARN_ON(!(m.flags & CEPH_OSD_FLAG_ONDISK)); + req->r_version = m.user_version; req->r_result = m.result ?: data_len; finish_request(req); mutex_unlock(&osd->lock); From patchwork Tue Apr 5 19:20:10 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802334 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 32C56C43217 for ; Wed, 6 Apr 2022 04:05:01 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1348687AbiDFEFY (ORCPT ); Wed, 6 Apr 2022 00:05:24 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37560 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573586AbiDETXI (ORCPT ); Tue, 5 Apr 2022 15:23:08 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F0AE64B1FC; Tue, 5 Apr 2022 12:21:08 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8549D61899; Tue, 5 Apr 2022 19:21:08 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4DC48C385A3; Tue, 5 Apr 2022 19:21:07 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186468; bh=14o5AyS753fcTCZmC+HM1t/hgGn4Rq2wXNi53+bcw+U=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nLl3vhRhPAHlOR3xbsvQlr7NLV0rPQPFXFI9GXMe9Zg0qy/SlZenSfJF7B+9+HPmQ 1pZwso5rDirMGVn95yxypey863Alt8ShGJQVCcStBUiV364309s5ITLlxZ9Dwh15vt UbuZCtw9pQcW5JtzqfJh+cXHEQaxnEYD5oJaDQd+p60AJyy6BNqRFbeuLuTsv+pWGY aFXF+c9f5fXZYvryRSBwjX4g/0Ix+lPuH4LiT8Qu7kuzrwAAe+SPlTH75pzWbetq9f RcKBFKp12vNCtUVkpjnnTm993fnIexkUN03o9tTyP35JCkXSmWow8QUbx2GOs8ud14 VqFOdHABr/1pw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 39/59] ceph: size handling for encrypted inodes in cap updates Date: Tue, 5 Apr 2022 15:20:10 -0400 Message-Id: <20220405192030.178326-40-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Transmit the rounded-up size as the normal size, and fill out the fscrypt_file field with the real file size. 
Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 43 +++++++++++++++++++++++++------------------ fs/ceph/crypto.h | 4 ++++ 2 files changed, 29 insertions(+), 18 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 22bf3e2696cb..cb5cdf2260ad 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -1215,10 +1215,9 @@ struct cap_msg_args { umode_t mode; bool inline_data; bool wake; + bool encrypted; u32 fscrypt_auth_len; - u32 fscrypt_file_len; u8 fscrypt_auth[sizeof(struct ceph_fscrypt_auth)]; // for context - u8 fscrypt_file[sizeof(u64)]; // for size }; /* Marshal up the cap msg to the MDS */ @@ -1253,7 +1252,12 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) fc->ino = cpu_to_le64(arg->ino); fc->snap_follows = cpu_to_le64(arg->follows); - fc->size = cpu_to_le64(arg->size); +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) + if (arg->encrypted) + fc->size = cpu_to_le64(round_up(arg->size, CEPH_FSCRYPT_BLOCK_SIZE)); + else +#endif + fc->size = cpu_to_le64(arg->size); fc->max_size = cpu_to_le64(arg->max_size); ceph_encode_timespec64(&fc->mtime, &arg->mtime); ceph_encode_timespec64(&fc->atime, &arg->atime); @@ -1313,11 +1317,17 @@ static void encode_cap_msg(struct ceph_msg *msg, struct cap_msg_args *arg) ceph_encode_64(&p, 0); #if IS_ENABLED(CONFIG_FS_ENCRYPTION) - /* fscrypt_auth and fscrypt_file (version 12) */ + /* + * fscrypt_auth and fscrypt_file (version 12) + * + * fscrypt_auth holds the crypto context (if any). fscrypt_file + * tracks the real i_size as an __le64 field (and we use a rounded-up + * i_size in the traditional size field).
+ */ ceph_encode_32(&p, arg->fscrypt_auth_len); ceph_encode_copy(&p, arg->fscrypt_auth, arg->fscrypt_auth_len); - ceph_encode_32(&p, arg->fscrypt_file_len); - ceph_encode_copy(&p, arg->fscrypt_file, arg->fscrypt_file_len); + ceph_encode_32(&p, sizeof(__le64)); + ceph_encode_64(&p, arg->size); #else /* CONFIG_FS_ENCRYPTION */ ceph_encode_32(&p, 0); ceph_encode_32(&p, 0); @@ -1389,7 +1399,6 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, arg->follows = flushing ? ci->i_head_snapc->seq : 0; arg->flush_tid = flush_tid; arg->oldest_flush_tid = oldest_flush_tid; - arg->size = i_size_read(inode); ci->i_reported_size = arg->size; arg->max_size = ci->i_wanted_max_size; @@ -1443,6 +1452,7 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, } } arg->flags = flags; + arg->encrypted = IS_ENCRYPTED(inode); #if IS_ENABLED(CONFIG_FS_ENCRYPTION) if (ci->fscrypt_auth_len && WARN_ON_ONCE(ci->fscrypt_auth_len > sizeof(struct ceph_fscrypt_auth))) { @@ -1453,21 +1463,21 @@ static void __prep_cap(struct cap_msg_args *arg, struct ceph_cap *cap, memcpy(arg->fscrypt_auth, ci->fscrypt_auth, min_t(size_t, ci->fscrypt_auth_len, sizeof(arg->fscrypt_auth))); } - /* FIXME: use this to track "real" size */ - arg->fscrypt_file_len = 0; #endif /* CONFIG_FS_ENCRYPTION */ } +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) #define CAP_MSG_FIXED_FIELDS (sizeof(struct ceph_mds_caps) + \ - 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4) + 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4 + 8) -#if IS_ENABLED(CONFIG_FS_ENCRYPTION) static inline int cap_msg_size(struct cap_msg_args *arg) { - return CAP_MSG_FIXED_FIELDS + arg->fscrypt_auth_len + - arg->fscrypt_file_len; + return CAP_MSG_FIXED_FIELDS + arg->fscrypt_auth_len; } #else +#define CAP_MSG_FIXED_FIELDS (sizeof(struct ceph_mds_caps) + \ + 4 + 8 + 4 + 4 + 8 + 4 + 4 + 4 + 8 + 8 + 4 + 8 + 8 + 4 + 4) + static inline int cap_msg_size(struct cap_msg_args *arg) { return CAP_MSG_FIXED_FIELDS; @@ 
-1546,13 +1556,10 @@ static inline int __send_flush_snap(struct inode *inode, arg.inline_data = capsnap->inline_data; arg.flags = 0; arg.wake = false; + arg.encrypted = IS_ENCRYPTED(inode); - /* - * No fscrypt_auth changes from a capsnap. It will need - * to update fscrypt_file on size changes (TODO). - */ + /* No fscrypt_auth changes from a capsnap.*/ arg.fscrypt_auth_len = 0; - arg.fscrypt_file_len = 0; msg = ceph_msg_new(CEPH_MSG_CLIENT_CAPS, cap_msg_size(&arg), GFP_NOFS, false); diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 080905b0c73c..56a61ba64edc 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -9,6 +9,10 @@ #include #include +#define CEPH_FSCRYPT_BLOCK_SHIFT 12 +#define CEPH_FSCRYPT_BLOCK_SIZE (_AC(1, UL) << CEPH_FSCRYPT_BLOCK_SHIFT) +#define CEPH_FSCRYPT_BLOCK_MASK (~(CEPH_FSCRYPT_BLOCK_SIZE-1)) + struct ceph_fs_client; struct ceph_acl_sec_ctx; struct ceph_mds_request; From patchwork Tue Apr 5 19:20:11 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802345 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 52CE2C4332F for ; Wed, 6 Apr 2022 04:16:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1354207AbiDFEL6 (ORCPT ); Wed, 6 Apr 2022 00:11:58 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37684 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573589AbiDETXK (ORCPT ); Tue, 5 Apr 2022 15:23:10 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 561974BFDA; Tue, 5 Apr 2022 12:21:11 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using 
TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 1A1C9B81FA4; Tue, 5 Apr 2022 19:21:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 370CEC385A5; Tue, 5 Apr 2022 19:21:08 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186468; bh=NYgOnMz5LZxYWvR6hLfX0fO6hW+FJGG+1J/GyHdvGCU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=qEutv6DRBKJn+5NCqUwPm5Ts5efaMJsSTp2TVUO6W0J7YXM/liJhtrgL27Ff2bcnH mhk2XOgZazXiVBJfTtdxXuJnky6XlICRtZUhIfl2wOmE4tEiMQKn8mKGZVcoEAo5SI +qzZ8yDSJK3jRuJE8rOIabwsLmpevbsjtprBGeooXH/0aF0De3eAvH06/bCnmoQSdr Anysj4jVywi+ww7WQOlAVTFiIYYWq1X54OTDvQP6HxWRb2oz/9ZwxLVoVZGcyvcqHU iztyrL2X9Ho3O7YhYB39ET/gBzY3iNSw759Y+CEIgQbUcM/997keGzz5sfX8Ai54FZ 9fPQmpiCuXEpw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 40/59] ceph: fscrypt_file field handling in MClientRequest messages Date: Tue, 5 Apr 2022 15:20:11 -0400 Message-Id: <20220405192030.178326-41-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org For encrypted inodes, transmit a rounded-up size to the MDS as the normal file size and send the real inode size in fscrypt_file field. Also, fix up creates and truncates to also transmit fscrypt_file. 
Signed-off-by: Jeff Layton --- fs/ceph/dir.c | 3 +++ fs/ceph/file.c | 1 + fs/ceph/inode.c | 18 ++++++++++++++++-- fs/ceph/mds_client.c | 9 ++++++++- fs/ceph/mds_client.h | 2 ++ 5 files changed, 30 insertions(+), 3 deletions(-) diff --git a/fs/ceph/dir.c b/fs/ceph/dir.c index 8a9f916bfc6c..5ccf6453f02f 100644 --- a/fs/ceph/dir.c +++ b/fs/ceph/dir.c @@ -910,6 +910,9 @@ static int ceph_mknod(struct user_namespace *mnt_userns, struct inode *dir, goto out_req; } + if (S_ISREG(mode) && IS_ENCRYPTED(dir)) + set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags); + req->r_dentry = dget(dentry); req->r_num_caps = 2; req->r_parent = dir; diff --git a/fs/ceph/file.c b/fs/ceph/file.c index a3afdb9cfddb..b7e2594cc296 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -764,6 +764,7 @@ int ceph_atomic_open(struct inode *dir, struct dentry *dentry, req->r_parent = dir; ihold(dir); if (IS_ENCRYPTED(dir)) { + set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags); if (!fscrypt_has_encryption_key(dir)) { spin_lock(&dentry->d_lock); dentry->d_flags |= DCACHE_NOKEY_NAME; diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index 183b9f52dc7d..b9454721c976 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -2378,11 +2378,25 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c } } else if ((issued & CEPH_CAP_FILE_SHARED) == 0 || attr->ia_size != isize) { - req->r_args.setattr.size = cpu_to_le64(attr->ia_size); - req->r_args.setattr.old_size = cpu_to_le64(isize); mask |= CEPH_SETATTR_SIZE; release |= CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL | CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR; + if (IS_ENCRYPTED(inode) && attr->ia_size) { + set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags); + mask |= CEPH_SETATTR_FSCRYPT_FILE; + req->r_args.setattr.size = + cpu_to_le64(round_up(attr->ia_size, + CEPH_FSCRYPT_BLOCK_SIZE)); + req->r_args.setattr.old_size = + cpu_to_le64(round_up(isize, + CEPH_FSCRYPT_BLOCK_SIZE)); + req->r_fscrypt_file = attr->ia_size; + /* FIXME: client must zero 
out any partial blocks! */ + } else { + req->r_args.setattr.size = cpu_to_le64(attr->ia_size); + req->r_args.setattr.old_size = cpu_to_le64(isize); + req->r_fscrypt_file = 0; + } } } if (ia_valid & ATTR_MTIME) { diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c index 50fe77768295..0da85c9ce73a 100644 --- a/fs/ceph/mds_client.c +++ b/fs/ceph/mds_client.c @@ -2752,7 +2752,12 @@ static void encode_mclientrequest_tail(void **p, const struct ceph_mds_request * } else { ceph_encode_32(p, 0); } - ceph_encode_32(p, 0); // fscrypt_file for now + if (test_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags)) { + ceph_encode_32(p, sizeof(__le64)); + ceph_encode_64(p, req->r_fscrypt_file); + } else { + ceph_encode_32(p, 0); + } } /* @@ -2838,6 +2843,8 @@ static struct ceph_msg *create_request_message(struct ceph_mds_session *session, /* fscrypt_file */ len += sizeof(u32); + if (test_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags)) + len += sizeof(__le64); msg = ceph_msg_new2(CEPH_MSG_CLIENT_REQUEST, len, 1, GFP_NOFS, false); if (!msg) { diff --git a/fs/ceph/mds_client.h b/fs/ceph/mds_client.h index 046a9368c4a9..e297bf98c39f 100644 --- a/fs/ceph/mds_client.h +++ b/fs/ceph/mds_client.h @@ -282,6 +282,7 @@ struct ceph_mds_request { #define CEPH_MDS_R_DID_PREPOPULATE (6) /* prepopulated readdir */ #define CEPH_MDS_R_PARENT_LOCKED (7) /* is r_parent->i_rwsem wlocked? 
*/ #define CEPH_MDS_R_ASYNC (8) /* async request */ +#define CEPH_MDS_R_FSCRYPT_FILE (9) /* must marshal fscrypt_file field */ unsigned long r_req_flags; struct mutex r_fill_mutex; @@ -289,6 +290,7 @@ struct ceph_mds_request { union ceph_mds_request_args r_args; struct ceph_fscrypt_auth *r_fscrypt_auth; + u64 r_fscrypt_file; u8 *r_altname; /* fscrypt binary crypttext for long filenames */ u32 r_altname_len; /* length of r_altname */ From patchwork Tue Apr 5 19:20:12 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802383 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8AAE1C433FE for ; Wed, 6 Apr 2022 04:18:50 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1838376AbiDFEQv (ORCPT ); Wed, 6 Apr 2022 00:16:51 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37682 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573588AbiDETXK (ORCPT ); Tue, 5 Apr 2022 15:23:10 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id F2AC14BB9E; Tue, 5 Apr 2022 12:21:10 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 5AB8761899; Tue, 5 Apr 2022 19:21:10 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 1FD4CC385A0; Tue, 5 Apr 2022 19:21:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186469; bh=PDf3viSS4Mrr2b1LxuraoL8yqmsI0zQ6OYJs90Hla08=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=oHkKb0zOXEGuXDRiw+PFtURhv5EVKSUUB/pEPOSwoTvrDKMMkparT8Yo9oxfzEdL7 bvqrG7PYDMEL/jj4n/I1QjdBPH5Iz5sKnujn5kmgM+FsE1GbkM4ubK5HYbiB5i5GRt 2GcheSsP2jMI0mhKOO25nWlz8HuwFSYd4XZYRKqNMBMfZji8WIC2CWZPrnd+ZW5RDK 4Pr63v2KKhS0GKMKuMiM7gSXx76ok2+wQrXigURfS7r4uJ3XEuk8lChHl/M0dM9RFe lBusEbXaxGN9jNBzLuuXFfJasFxbyPQbDFs7VY/qr/PCYY6ZIgcswU24BwK29WyORq YTE1HezVsEIQQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 41/59] ceph: get file size from fscrypt_file when present in inode traces Date: Tue, 5 Apr 2022 15:20:12 -0400 Message-Id: <20220405192030.178326-42-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org When we get an inode trace from the MDS, grab the fscrypt_file field if the inode is encrypted, and use it to populate the i_size field instead of the regular inode size field. 
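The receive-side check can be sketched in userspace roughly as follows. This is an illustration under stated assumptions, not the kernel code: the rx_* names are hypothetical, the block size is taken to be 4096 bytes as in CEPH_FSCRYPT_BLOCK_SIZE, and the fscrypt_file payload is assumed to be a single little-endian 64-bit value.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed to match CEPH_FSCRYPT_BLOCK_SIZE (4096 bytes). */
#define RX_BLOCK_SIZE UINT64_C(4096)

/* Hypothetical stand-in for round_up() on a power-of-two block size. */
static uint64_t rx_round_up(uint64_t v)
{
	return (v + RX_BLOCK_SIZE - 1) & ~(RX_BLOCK_SIZE - 1);
}

/* Decode a little-endian 64-bit value byte by byte (no alignment assumed). */
static uint64_t rx_get_le64(const uint8_t *p)
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | p[i];
	return v;
}

/*
 * Pick the size to store in i_size. The fscrypt_file value is trusted only
 * when the inode is encrypted, the wire size is non-zero, the payload is
 * exactly 8 bytes long, and the wire size equals the real size rounded up
 * to a crypto block. Otherwise the wire size is used as-is.
 */
static uint64_t rx_pick_size(int encrypted, uint64_t wire_size,
			     const uint8_t *fscrypt_file,
			     size_t fscrypt_file_len)
{
	if (encrypted && wire_size && fscrypt_file_len == sizeof(uint64_t)) {
		uint64_t fsize = rx_get_le64(fscrypt_file);

		if (wire_size == rx_round_up(fsize))
			return fsize;
		/* Mismatch: the patch warns and discards fscrypt_file. */
	}
	return wire_size;
}
```

For example, a wire size of 8192 with an fscrypt_file value of 5000 yields an i_size of 5000, whereas a wire size of 4096 with the same fscrypt_file value fails the round-up check and falls back to 4096.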
Signed-off-by: Jeff Layton --- fs/ceph/inode.c | 18 +++++++++++++++--- 1 file changed, 15 insertions(+), 3 deletions(-) diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index b9454721c976..f2a59306e4a6 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -1024,6 +1024,7 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, if (new_version || (new_issued & (CEPH_CAP_ANY_FILE_RD | CEPH_CAP_ANY_FILE_WR))) { + u64 size = le64_to_cpu(info->size); s64 old_pool = ci->i_layout.pool_id; struct ceph_string *old_ns; @@ -1037,10 +1038,21 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page, pool_ns = old_ns; + if (IS_ENCRYPTED(inode) && size && (iinfo->fscrypt_file_len == sizeof(__le64))) { + u64 fsize = __le64_to_cpu(*(__le64 *)iinfo->fscrypt_file); + + if (size == round_up(fsize, CEPH_FSCRYPT_BLOCK_SIZE)) { + size = fsize; + } else { + pr_warn("fscrypt size mismatch: size=%llu fscrypt_file=%llu, discarding fscrypt_file size.\n", + size, fsize); + } + } + queue_trunc = ceph_fill_file_size(inode, issued, - le32_to_cpu(info->truncate_seq), - le64_to_cpu(info->truncate_size), - le64_to_cpu(info->size)); + le32_to_cpu(info->truncate_seq), + le64_to_cpu(info->truncate_size), + size); /* only update max_size on auth cap */ if ((info->cap.flags & CEPH_CAP_FLAG_AUTH) && ci->i_max_size != le64_to_cpu(info->max_size)) { From patchwork Tue Apr 5 19:20:13 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802346 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 723C5C43219 for ; Wed, 6 Apr 2022 04:16:16 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1379658AbiDFEMa (ORCPT ); Wed, 6 Apr 2022 00:12:30 -0400 Received: from
lindbergh.monkeyblade.net ([23.128.96.19]:37708 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573590AbiDETXK (ORCPT ); Tue, 5 Apr 2022 15:23:10 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A608E4B842; Tue, 5 Apr 2022 12:21:11 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 41453616C5; Tue, 5 Apr 2022 19:21:11 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 08F6FC385A1; Tue, 5 Apr 2022 19:21:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186470; bh=Zg3zMWSG6TvHZGGBtCLfiftQlhpR76ntlCWktCjM/rI=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=r75syTBou8Z4Hpkp1mjTt3Zb/e6dtZS5ttMdnxnJhie0jKOGCDuge3ZxVSfoBOsMc +gKS+1NFp+FiiSYHDBQ1i6MWknoPjnydDRMD6g/zSnBrD/8232h2sqkZ9NGYdGaw+R IL7f9tgOku4IbdByviN8/DQHHycj3UIXqrfeiGG9+pL8Q6fhnOx/2u/vPSeg+hrDKx lTHljOLnjx8L4AZ2B8hH22tp84lnuDK8n9G6XRXhWg6WLYXyL8Sndrhe5zWq9E6BWz 6D6H6/5+N6Yb2brGnTvWtleWj2DejPFMZHihMz65HD78XkOpUZOr+aess+q9AojYln czMgCmam9EC4g== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 42/59] ceph: handle fscrypt fields in cap messages from MDS Date: Tue, 5 Apr 2022 15:20:13 -0400 Message-Id: <20220405192030.178326-43-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Handle the new fscrypt_file and fscrypt_auth fields in cap messages. 
Use them to populate new fields in cap_extra_info and update the inode with those values. Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 78 ++++++++++++++++++++++++++++++++++++++++++++++++-- 1 file changed, 76 insertions(+), 2 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index cb5cdf2260ad..1f3a2135214c 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -3364,6 +3364,9 @@ struct cap_extra_info { /* currently issued */ int issued; struct timespec64 btime; + u8 *fscrypt_auth; + u32 fscrypt_auth_len; + u64 fscrypt_file_size; }; /* @@ -3396,6 +3399,14 @@ static void handle_cap_grant(struct inode *inode, bool deleted_inode = false; bool fill_inline = false; + /* + * If there is at least one crypto block then we'll trust fscrypt_file_size. + * If the real length of the file is 0, then ignore it (it has probably been + * truncated down to 0 by the MDS). + */ + if (IS_ENCRYPTED(inode) && size) + size = extra_info->fscrypt_file_size; + dout("handle_cap_grant inode %p cap %p mds%d seq %d %s\n", inode, cap, session->s_mds, seq, ceph_cap_string(newcaps)); dout(" size %llu max_size %llu, i_size %llu\n", size, max_size, @@ -3462,6 +3473,10 @@ static void handle_cap_grant(struct inode *inode, dout("%p mode 0%o uid.gid %d.%d\n", inode, inode->i_mode, from_kuid(&init_user_ns, inode->i_uid), from_kgid(&init_user_ns, inode->i_gid)); + + WARN_ON_ONCE(ci->fscrypt_auth_len != extra_info->fscrypt_auth_len || + memcmp(ci->fscrypt_auth, extra_info->fscrypt_auth, + ci->fscrypt_auth_len)); } if ((newcaps & CEPH_CAP_LINK_SHARED) && @@ -3872,7 +3887,8 @@ static void handle_cap_flushsnap_ack(struct inode *inode, u64 flush_tid, */ static bool handle_cap_trunc(struct inode *inode, struct ceph_mds_caps *trunc, - struct ceph_mds_session *session) + struct ceph_mds_session *session, + struct cap_extra_info *extra_info) { struct ceph_inode_info *ci = ceph_inode(inode); int mds = session->s_mds; @@ -3889,6 +3905,14 @@ static bool handle_cap_trunc(struct inode *inode, issued |= 
implemented | dirty; + /* + * If there is at least one crypto block then we'll trust fscrypt_file_size. + * If the real length of the file is 0, then ignore it (it has probably been + * truncated down to 0 by the MDS). + */ + if (IS_ENCRYPTED(inode) && size) + size = extra_info->fscrypt_file_size; + dout("handle_cap_trunc inode %p mds%d seq %d to %lld seq %d\n", inode, mds, seq, truncate_size, truncate_seq); queue_trunc = ceph_fill_file_size(inode, issued, @@ -4110,6 +4134,49 @@ static void handle_cap_import(struct ceph_mds_client *mdsc, *target_cap = cap; } +#ifdef CONFIG_FS_ENCRYPTION +static int parse_fscrypt_fields(void **p, void *end, struct cap_extra_info *extra) +{ + u32 len; + + ceph_decode_32_safe(p, end, extra->fscrypt_auth_len, bad); + if (extra->fscrypt_auth_len) { + ceph_decode_need(p, end, extra->fscrypt_auth_len, bad); + extra->fscrypt_auth = kmalloc(extra->fscrypt_auth_len, GFP_KERNEL); + if (!extra->fscrypt_auth) + return -ENOMEM; + ceph_decode_copy_safe(p, end, extra->fscrypt_auth, + extra->fscrypt_auth_len, bad); + } + + ceph_decode_32_safe(p, end, len, bad); + if (len >= sizeof(u64)) { + ceph_decode_64_safe(p, end, extra->fscrypt_file_size, bad); + len -= sizeof(u64); + } + ceph_decode_skip_n(p, end, len, bad); + return 0; +bad: + return -EIO; +} +#else +static int parse_fscrypt_fields(void **p, void *end, struct cap_extra_info *extra) +{ + u32 len; + + /* Don't care about these fields unless we're encryption-capable */ + ceph_decode_32_safe(p, end, len, bad); + if (len) + ceph_decode_skip_n(p, end, len, bad); + ceph_decode_32_safe(p, end, len, bad); + if (len) + ceph_decode_skip_n(p, end, len, bad); + return 0; +bad: + return -EIO; +} +#endif + /* * Handle a caps message from the MDS. 
* @@ -4228,6 +4295,11 @@ void ceph_handle_caps(struct ceph_mds_session *session, ceph_decode_64_safe(&p, end, extra_info.nsubdirs, bad); } + if (msg_version >= 12) { + if (parse_fscrypt_fields(&p, end, &extra_info)) + goto bad; + } + /* lookup ino */ inode = ceph_find_inode(mdsc->fsc->sb, vino); dout(" op %s ino %llx.%llx inode %p\n", ceph_cap_op_name(op), vino.ino, @@ -4324,7 +4396,8 @@ void ceph_handle_caps(struct ceph_mds_session *session, break; case CEPH_CAP_OP_TRUNC: - queue_trunc = handle_cap_trunc(inode, h, session); + queue_trunc = handle_cap_trunc(inode, h, session, + &extra_info); spin_unlock(&ci->i_ceph_lock); if (queue_trunc) ceph_queue_vmtruncate(inode); @@ -4342,6 +4415,7 @@ void ceph_handle_caps(struct ceph_mds_session *session, iput(inode); out: ceph_put_string(extra_info.pool_ns); + kfree(extra_info.fscrypt_auth); return; flush_cap_releases: From patchwork Tue Apr 5 19:20:14 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802380 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 35211C4332F for ; Wed, 6 Apr 2022 04:18:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1584743AbiDFEQk (ORCPT ); Wed, 6 Apr 2022 00:16:40 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37710 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573591AbiDETXL (ORCPT ); Tue, 5 Apr 2022 15:23:11 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [139.178.84.217]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 9D7E64BB9E; Tue, 5 Apr 2022 12:21:12 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 
(256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 2A8C9618CD; Tue, 5 Apr 2022 19:21:12 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E6F01C385A3; Tue, 5 Apr 2022 19:21:10 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186471; bh=jKwrnkur/UZMWNKLAXoBHYovRwFcYZx33jY+OjyVk3E=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=RPdd6h+UwAmG5ptt3N35Odgi0Sooj/kK0G16AHRqHv8TZlXi147IQ4ahU1akYUygP Vqr5OVJt0PsE/dtEc68wxF6mXwNpdMLn7vKQImajKbCCdNw3gtsyoDWVOXzKU10OnM IMXtG1oQ1+LpDIs4zFpHdt1IqTgJ8ullm1HdOE0GlHqN1ISu6dwNvbe+d2u9A99uPx XHSRP8R4wTkdJ6nb4xVoAXfZBRMXmk9kYEhdiCf60k4jbr9CNPH3iddgP22SjkQ8XE lVS96ElhxKCb6rfdRV/2wiT/Tcy7JjNfVPkAOPaBhLyo/exVWmrPwwQRxRDkcCTlOx zbho0Ojtgx3Xw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 43/59] ceph: update WARN_ON message to pr_warn Date: Tue, 5 Apr 2022 15:20:14 -0400 Message-Id: <20220405192030.178326-44-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Replace the WARN_ON_ONCE in handle_cap_grant() with a rate-limited pr_warn that reports the old and new fscrypt_auth lengths when a cap grant tries to change fscrypt_auth. Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 11 +++++++---- 1 file changed, 7 insertions(+), 4 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index 1f3a2135214c..cb2c18d43946 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -3473,10 +3473,13 @@ static void handle_cap_grant(struct inode *inode, dout("%p mode 0%o uid.gid %d.%d\n", inode, inode->i_mode, from_kuid(&init_user_ns, inode->i_uid), from_kgid(&init_user_ns, inode->i_gid)); - - WARN_ON_ONCE(ci->fscrypt_auth_len != extra_info->fscrypt_auth_len || - memcmp(ci->fscrypt_auth,
extra_info->fscrypt_auth, - ci->fscrypt_auth_len)); +#if IS_ENABLED(CONFIG_FS_ENCRYPTION) + if (ci->fscrypt_auth_len != extra_info->fscrypt_auth_len || + memcmp(ci->fscrypt_auth, extra_info->fscrypt_auth, + ci->fscrypt_auth_len)) + pr_warn_ratelimited("%s: cap grant attempt to change fscrypt_auth on non-I_NEW inode (old len %d new len %d)\n", + __func__, ci->fscrypt_auth_len, extra_info->fscrypt_auth_len); +#endif } if ((newcaps & CEPH_CAP_LINK_SHARED) && From patchwork Tue Apr 5 19:20:15 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802375 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 7D5ABC433FE for ; Wed, 6 Apr 2022 04:16:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1574054AbiDFEQV (ORCPT ); Wed, 6 Apr 2022 00:16:21 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37940 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573593AbiDETXO (ORCPT ); Tue, 5 Apr 2022 15:23:14 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2EFF14C432; Tue, 5 Apr 2022 12:21:15 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id D6F69B81FA4; Tue, 5 Apr 2022 19:21:13 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D0B88C385A0; Tue, 5 Apr 2022 19:21:11 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186472; bh=BuPDaFFHn9DI+S2jJNJ0UstMiP6cacU6yo5802iH8rs=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=tdMW82DnD5bTmA5CqP5EzSXRVOMUszinsLxMfqiNz9OVYhh54+e8ZFVqi+XAj7QXE nnR0bjnC3+DpVo5VV7EcHVGJ6AQHQnB/J82A4cVarwg5fdzyyvDVGV0R5uaJWhUXBo z15clcrY0/kbyQqt9DzJQrDFrDllwUk77es0gIP0TydIoiUMEF+VXWE7BVRLEyCSgO ohRW0JyTukIOAkSmNL5P3vyRUh2c4qIi4IjoQziLPetustN+IVkcNtwBRFWrhWku0v dH14LoanPrI4wBObU98yR2Ie45WWFMwQVV+i7nUn8A2bB25A+i4Px1m0X5aJyJ0zCe w8AfxTrsKRwVA== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 44/59] ceph: add __ceph_get_caps helper support Date: Tue, 5 Apr 2022 15:20:15 -0400 Message-Id: <20220405192030.178326-45-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Break out the guts of ceph_get_caps into a helper that takes an inode and ceph_file_info instead of a file pointer. Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/caps.c | 19 +++++++++++++------ fs/ceph/super.h | 2 ++ 2 files changed, 15 insertions(+), 6 deletions(-) diff --git a/fs/ceph/caps.c b/fs/ceph/caps.c index cb2c18d43946..69af17df59be 100644 --- a/fs/ceph/caps.c +++ b/fs/ceph/caps.c @@ -2947,10 +2947,9 @@ int ceph_try_get_caps(struct inode *inode, int need, int want, * due to a small max_size, make sure we check_max_size (and possibly * ask the mds) so we don't get hung up indefinitely. 
*/ -int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got) +int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi, int need, + int want, loff_t endoff, int *got) { - struct ceph_file_info *fi = filp->private_data; - struct inode *inode = file_inode(filp); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); int ret, _got, flags; @@ -2959,7 +2958,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got if (ret < 0) return ret; - if ((fi->fmode & CEPH_FILE_MODE_WR) && + if (fi && (fi->fmode & CEPH_FILE_MODE_WR) && fi->filp_gen != READ_ONCE(fsc->filp_gen)) return -EBADF; @@ -2967,7 +2966,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got while (true) { flags &= CEPH_FILE_MODE_MASK; - if (atomic_read(&fi->num_locks)) + if (fi && atomic_read(&fi->num_locks)) flags |= CHECK_FILELOCK; _got = 0; ret = try_get_cap_refs(inode, need, want, endoff, @@ -3012,7 +3011,7 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got continue; } - if ((fi->fmode & CEPH_FILE_MODE_WR) && + if (fi && (fi->fmode & CEPH_FILE_MODE_WR) && fi->filp_gen != READ_ONCE(fsc->filp_gen)) { if (ret >= 0 && _got) ceph_put_cap_refs(ci, _got); @@ -3075,6 +3074,14 @@ int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got return 0; } +int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got) +{ + struct ceph_file_info *fi = filp->private_data; + struct inode *inode = file_inode(filp); + + return __ceph_get_caps(inode, fi, need, want, endoff, got); +} + /* * Take cap refs. Caller must already know we hold at least one ref * on the caps in question or we don't know this is safe. 
diff --git a/fs/ceph/super.h b/fs/ceph/super.h index a97a6f6f3089..752bc3c820ca 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1229,6 +1229,8 @@ extern int ceph_encode_dentry_release(void **p, struct dentry *dn, struct inode *dir, int mds, int drop, int unless); +extern int __ceph_get_caps(struct inode *inode, struct ceph_file_info *fi, + int need, int want, loff_t endoff, int *got); extern int ceph_get_caps(struct file *filp, int need, int want, loff_t endoff, int *got); extern int ceph_try_get_caps(struct inode *inode, From patchwork Tue Apr 5 19:20:16 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802369 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 6BC5EC4167E for ; Wed, 6 Apr 2022 04:16:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1451910AbiDFEQA (ORCPT ); Wed, 6 Apr 2022 00:16:00 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37852 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573592AbiDETXN (ORCPT ); Tue, 5 Apr 2022 15:23:13 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 687EC4C420; Tue, 5 Apr 2022 12:21:14 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 05A53618A0; Tue, 5 Apr 2022 19:21:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id B9DECC385A3; Tue, 5 Apr 2022 19:21:12 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; 
s=k20201202; t=1649186473; bh=QuApRqh0fnTOdYhO0fpJUYwLgpx0svD4Wr/+e96NUFE=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=V85DtXBr19bL+lTEDoSEjYy0ajvW1aTrwCTptYwEHBNr64twKPsNY58LaE9PgGkBz 5fkatuYc5XIVY8nIkIy6NqIV3NBQrzOJ7Q3xeX7KoXb4UybNGQiYZaMRtu301iLFX2 o21Le4mQtxYecUZoJJek9iT+fWw+/wBQCN1Qax4df0VBEIfokFhmgUfAfE4QnJYsSJ Rcd5EcbdYgldUTUG8p9j6NIF8g7PdJAAVe+FDbmo1rgknfaVkB0Px/Tm5QYnXpqVbf zMnqHiS4VVHJNkw86fLoofCyYH7QysAXVtyUnI8OF1JhKtYamVaQM3XNEDv3ZFwZwg +ox9LNGTbZZ6w== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 45/59] ceph: add __ceph_sync_read helper support Date: Tue, 5 Apr 2022 15:20:16 -0400 Message-Id: <20220405192030.178326-46-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Turn the guts of ceph_sync_read into a new helper that takes an inode and an offset instead of a kiocb struct, and make ceph_sync_read call the helper as a wrapper. Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/file.c | 33 +++++++++++++++++++++------------ fs/ceph/super.h | 2 ++ 2 files changed, 23 insertions(+), 12 deletions(-) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index b7e2594cc296..c4300381851e 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -927,22 +927,19 @@ enum { * If we get a short result from the OSD, check against i_size; we need to * only return a short read to the caller if we hit EOF. 
*/ -static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, - int *retry_op) +ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, + struct iov_iter *to, int *retry_op) { - struct file *file = iocb->ki_filp; - struct inode *inode = file_inode(file); struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); struct ceph_osd_client *osdc = &fsc->client->osdc; ssize_t ret; - u64 off = iocb->ki_pos; + u64 off = *ki_pos; u64 len = iov_iter_count(to); u64 i_size = i_size_read(inode); bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); - dout("sync_read on file %p %llu~%u %s\n", file, off, (unsigned)len, - (file->f_flags & O_DIRECT) ? "O_DIRECT" : ""); + dout("sync_read on inode %p %llx~%llx\n", inode, *ki_pos, len); if (!len) return 0; @@ -1061,14 +1058,14 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, break; } - if (off > iocb->ki_pos) { + if (off > *ki_pos) { if (off >= i_size) { *retry_op = CHECK_EOF; - ret = i_size - iocb->ki_pos; - iocb->ki_pos = i_size; + ret = i_size - *ki_pos; + *ki_pos = i_size; } else { - ret = off - iocb->ki_pos; - iocb->ki_pos = off; + ret = off - *ki_pos; + *ki_pos = off; } } @@ -1076,6 +1073,18 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, return ret; } +static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, + int *retry_op) +{ + struct file *file = iocb->ki_filp; + struct inode *inode = file_inode(file); + + dout("sync_read on file %p %llx~%zx %s\n", file, iocb->ki_pos, + iov_iter_count(to), (file->f_flags & O_DIRECT) ? 
"O_DIRECT" : ""); + + return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op); +} + struct ceph_aio_request { struct kiocb *iocb; size_t total_len; diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 752bc3c820ca..d7ab820aed34 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1258,6 +1258,8 @@ extern int ceph_renew_caps(struct inode *inode, int fmode); extern int ceph_open(struct inode *inode, struct file *file); extern int ceph_atomic_open(struct inode *dir, struct dentry *dentry, struct file *file, unsigned flags, umode_t mode); +extern ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, + struct iov_iter *to, int *retry_op); extern int ceph_release(struct inode *inode, struct file *filp); extern void ceph_fill_inline_data(struct inode *inode, struct page *locked_page, char *data, size_t len); From patchwork Tue Apr 5 19:20:17 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802381 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 242D4C433F5 for ; Wed, 6 Apr 2022 04:18:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1580154AbiDFEQh (ORCPT ); Wed, 6 Apr 2022 00:16:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:37942 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573594AbiDETXO (ORCPT ); Tue, 5 Apr 2022 15:23:14 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 4CB3B4C788; Tue, 5 Apr 2022 12:21:15 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client 
certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id DE896618CD; Tue, 5 Apr 2022 19:21:14 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id A36F0C385A5; Tue, 5 Apr 2022 19:21:13 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186474; bh=X1L6/wSKaDTpim1XpoJnpAgGMETaa+8EmWzAIXZ00gU=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=WAbgfrhtoy5Sg+CRvEU3jPw5dlTccrhLVEjqBezzHxrwoRfdA4Sugg7I5/we2/5qh VjYiVx+SkN6deDnpy0CG/hTJ1D+OKnWhYD4PZ8/UjNQ66ynQ+QgAuKhx3sTXRBeJ/m 6eO+VQsKQMYYNaWLSpcocP93NoL3TE+hDJLx4uZT/NDQkLnxHCBEC+Mrw4ncARdX09 YYQK3BfJs0lsBHx7P/esthLyd9mG/NTAtIpGUc8c4y7jF6rT4JUpFWvO2CMiqg8iPR nAvtMobnrz25BZm33+RP4eG7bwJr3SQ2w0nGpqPAnQ/dmiS5kcj2Ma4UDtX7CdIZtH UrOzn7R1rr4Jg== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 46/59] ceph: add object version support for sync read Date: Tue, 5 Apr 2022 15:20:17 -0400 Message-Id: <20220405192030.178326-47-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li Always return the last object's version. Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/file.c | 12 ++++++++++-- fs/ceph/super.h | 3 ++- 2 files changed, 12 insertions(+), 3 deletions(-) diff --git a/fs/ceph/file.c b/fs/ceph/file.c index c4300381851e..175a59277726 100644 --- a/fs/ceph/file.c +++ b/fs/ceph/file.c @@ -928,7 +928,8 @@ enum { * only return a short read to the caller if we hit EOF. 
*/ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, - struct iov_iter *to, int *retry_op) + struct iov_iter *to, int *retry_op, + u64 *last_objver) { struct ceph_inode_info *ci = ceph_inode(inode); struct ceph_fs_client *fsc = ceph_inode_to_client(inode); @@ -938,6 +939,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, u64 len = iov_iter_count(to); u64 i_size = i_size_read(inode); bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD); + u64 objver = 0; dout("sync_read on inode %p %llx~%llx\n", inode, *ki_pos, len); @@ -1008,6 +1010,9 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, req->r_end_latency, len, ret); + if (ret > 0) + objver = req->r_version; + i_size = i_size_read(inode); dout("sync_read %llu~%llu got %zd i_size %llu%s\n", off, len, ret, i_size, (more ? " MORE" : "")); @@ -1069,6 +1074,9 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, } } + if (last_objver && ret > 0) + *last_objver = objver; + dout("sync_read result %zd retry_op %d\n", ret, *retry_op); return ret; } @@ -1082,7 +1090,7 @@ static ssize_t ceph_sync_read(struct kiocb *iocb, struct iov_iter *to, dout("sync_read on file %p %llx~%zx %s\n", file, iocb->ki_pos, iov_iter_count(to), (file->f_flags & O_DIRECT) ? 
"O_DIRECT" : ""); - return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op); + return __ceph_sync_read(inode, &iocb->ki_pos, to, retry_op, NULL); } struct ceph_aio_request { diff --git a/fs/ceph/super.h b/fs/ceph/super.h index d7ab820aed34..9809bc97b89e 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -1259,7 +1259,8 @@ extern int ceph_open(struct inode *inode, struct file *file); extern int ceph_atomic_open(struct inode *dir, struct dentry *dentry, struct file *file, unsigned flags, umode_t mode); extern ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos, - struct iov_iter *to, int *retry_op); + struct iov_iter *to, int *retry_op, + u64 *last_objver); extern int ceph_release(struct inode *inode, struct file *filp); extern void ceph_fill_inline_data(struct inode *inode, struct page *locked_page, char *data, size_t len); From patchwork Tue Apr 5 19:20:18 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802351 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5D3F3C433FE for ; Wed, 6 Apr 2022 04:16:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1384197AbiDFENU (ORCPT ); Wed, 6 Apr 2022 00:13:20 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38116 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573596AbiDETXQ (ORCPT ); Tue, 5 Apr 2022 15:23:16 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id AFFFF4DF6F; Tue, 5 Apr 2022 12:21:17 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 
bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 6DF64B81FA5; Tue, 5 Apr 2022 19:21:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8C99DC385A0; Tue, 5 Apr 2022 19:21:14 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186475; bh=6YU8kP1jnlCntdm06abZO3qzfSycP7nQ8wCDSx1OVxc=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=YJWjxWIc7MwDcyTJU6n6fM7VbWsg1dMosSAG12Mu3MRch4LXpWwtgBSkfZF+Qt2HM iF6Yqp/2mDmHYbbyAWTqPz5IirvTm7BqSnDhPmIIfQTvkSknr2RTDL1i2zZFu657BC Vx0DjtqG5D+xF0A5FMOMnooBewjUC8yJAZuwXLWZZMUo5i9zdMWpvEUIt4sEPMKGJr byzjBMZfUVd/Hc0MKsovfOGtAl+c1PNXRc/yY7rSKcEw+8cRgVgw6qe/ihuKLGJunI YswR8yFyMjBF2cq5PxecesCqdQ1taczYugFfjaDYSTRfq0n4++hRPWp1A0k6jxS8Yy xSvqZpZXB3Uig== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 47/59] ceph: add infrastructure for file encryption and decryption Date: Tue, 5 Apr 2022 15:20:18 -0400 Message-Id: <20220405192030.178326-48-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org ...and allow test_dummy_encryption to bypass content encryption if mounted with test_dummy_encryption=clear. 
Signed-off-by: Jeff Layton --- fs/ceph/crypto.c | 177 +++++++++++++++++++++++++++++++++++++++++++++++ fs/ceph/crypto.h | 71 +++++++++++++++++++ fs/ceph/super.c | 8 +++ fs/ceph/super.h | 1 + 4 files changed, 257 insertions(+) diff --git a/fs/ceph/crypto.c b/fs/ceph/crypto.c index 19c113afb400..e24e61c51118 100644 --- a/fs/ceph/crypto.c +++ b/fs/ceph/crypto.c @@ -2,6 +2,7 @@ #include #include #include +#include #include "super.h" #include "mds_client.h" @@ -263,3 +264,179 @@ int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, fscrypt_fname_free_buffer(&_tname); return ret; } + +int ceph_fscrypt_decrypt_block_inplace(const struct inode *inode, + struct page *page, unsigned int len, + unsigned int offs, u64 lblk_num) +{ + struct ceph_mount_options *opt = ceph_inode_to_client(inode)->mount_options; + + if (opt->flags & CEPH_MOUNT_OPT_DUMMY_ENC_CLEAR) + return 0; + + dout("%s: len %u offs %u blk %llu\n", __func__, len, offs, lblk_num); + return fscrypt_decrypt_block_inplace(inode, page, len, offs, lblk_num); +} + +int ceph_fscrypt_encrypt_block_inplace(const struct inode *inode, + struct page *page, unsigned int len, + unsigned int offs, u64 lblk_num, gfp_t gfp_flags) +{ + struct ceph_mount_options *opt = ceph_inode_to_client(inode)->mount_options; + + if (opt->flags & CEPH_MOUNT_OPT_DUMMY_ENC_CLEAR) + return 0; + + dout("%s: len %u offs %u blk %llu\n", __func__, len, offs, lblk_num); + return fscrypt_encrypt_block_inplace(inode, page, len, offs, lblk_num, gfp_flags); +} + +/** + * ceph_fscrypt_decrypt_pages - decrypt an array of pages + * @inode: pointer to inode associated with these pages + * @page: pointer to page array + * @off: offset into the file that the read data starts + * @len: max length to decrypt + * + * Decrypt an array of fscrypt'ed pages and return the amount of + * data decrypted. Any data in the page prior to the start of the + * first complete block in the read is ignored. 
Any incomplete + * crypto blocks at the end of the array are ignored (and should + * probably be zeroed by the caller). + * + * Returns the length of the decrypted data or a negative errno. + */ +int ceph_fscrypt_decrypt_pages(struct inode *inode, struct page **page, u64 off, int len) +{ + int i, num_blocks; + u64 baseblk = off >> CEPH_FSCRYPT_BLOCK_SHIFT; + int ret = 0; + + /* + * We can't deal with partial blocks on an encrypted file, so mask off + * the last bit. + */ + num_blocks = ceph_fscrypt_blocks(off, len & CEPH_FSCRYPT_BLOCK_MASK); + + /* Decrypt each block */ + for (i = 0; i < num_blocks; ++i) { + int blkoff = i << CEPH_FSCRYPT_BLOCK_SHIFT; + int pgidx = blkoff >> PAGE_SHIFT; + unsigned int pgoffs = offset_in_page(blkoff); + int fret; + + fret = ceph_fscrypt_decrypt_block_inplace(inode, page[pgidx], + CEPH_FSCRYPT_BLOCK_SIZE, pgoffs, + baseblk + i); + if (fret < 0) { + if (ret == 0) + ret = fret; + break; + } + ret += CEPH_FSCRYPT_BLOCK_SIZE; + } + return ret; +} + +/** + * ceph_fscrypt_decrypt_extents: decrypt received extents in given buffer + * @inode: inode associated with pages being decrypted + * @page: pointer to page array + * @off: offset into the file that the data in page[0] starts + * @map: pointer to extent array + * @ext_cnt: length of extent array + * + * Given an extent map and a page array, decrypt the received data in-place, + * skipping holes. Returns the offset into buffer of end of last decrypted + * block. 
+ */ +int ceph_fscrypt_decrypt_extents(struct inode *inode, struct page **page, u64 off, + struct ceph_sparse_extent *map, u32 ext_cnt) +{ + int i, ret = 0; + struct ceph_inode_info *ci = ceph_inode(inode); + u64 objno, objoff; + u32 xlen; + + /* Nothing to do for empty array */ + if (ext_cnt == 0) { + dout("%s: empty array, ret 0\n", __func__); + return 0; + } + + ceph_calc_file_object_mapping(&ci->i_layout, off, map[0].len, + &objno, &objoff, &xlen); + + for (i = 0; i < ext_cnt; ++i) { + struct ceph_sparse_extent *ext = &map[i]; + int pgsoff = ext->off - objoff; + int pgidx = pgsoff >> PAGE_SHIFT; + int fret; + + if ((ext->off | ext->len) & ~CEPH_FSCRYPT_BLOCK_MASK) { + pr_warn("%s: bad encrypted sparse extent idx %d off %llx len %llx\n", + __func__, i, ext->off, ext->len); + return -EIO; + } + fret = ceph_fscrypt_decrypt_pages(inode, &page[pgidx], + off + pgsoff, ext->len); + dout("%s: [%d] 0x%llx~0x%llx fret %d\n", __func__, i, + ext->off, ext->len, fret); + if (fret < 0) { + if (ret == 0) + ret = fret; + break; + } + ret = pgsoff + fret; + } + dout("%s: ret %d\n", __func__, ret); + return ret; +} + +/** + * ceph_fscrypt_encrypt_pages - encrypt an array of pages + * @inode: pointer to inode associated with these pages + * @page: pointer to page array + * @off: offset into the file that the data starts + * @len: max length to encrypt + * @gfp: gfp flags to use for allocation + * + * Encrypt an array of cleartext pages and return the amount of + * data encrypted. Any data in the page prior to the start of the + * first complete block in the read is ignored. Any incomplete + * crypto blocks at the end of the array are ignored. + * + * Returns the length of the encrypted data or a negative errno. 
+ */ +int ceph_fscrypt_encrypt_pages(struct inode *inode, struct page **page, u64 off, + int len, gfp_t gfp) +{ + int i, num_blocks; + u64 baseblk = off >> CEPH_FSCRYPT_BLOCK_SHIFT; + int ret = 0; + + /* + * We can't deal with partial blocks on an encrypted file, so mask off + * the last bit. + */ + num_blocks = ceph_fscrypt_blocks(off, len & CEPH_FSCRYPT_BLOCK_MASK); + + /* Encrypt each block */ + for (i = 0; i < num_blocks; ++i) { + int blkoff = i << CEPH_FSCRYPT_BLOCK_SHIFT; + int pgidx = blkoff >> PAGE_SHIFT; + unsigned int pgoffs = offset_in_page(blkoff); + int fret; + + fret = ceph_fscrypt_encrypt_block_inplace(inode, page[pgidx], + CEPH_FSCRYPT_BLOCK_SIZE, pgoffs, + baseblk + i, gfp); + if (fret < 0) { + if (ret == 0) + ret = fret; + break; + } + ret += CEPH_FSCRYPT_BLOCK_SIZE; + } + return ret; +} diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index 56a61ba64edc..fdd73c50487f 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -91,6 +91,40 @@ static inline void ceph_fname_free_buffer(struct inode *parent, struct fscrypt_s int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscrypt_str *tname, struct fscrypt_str *oname, bool *is_nokey); +static inline unsigned int ceph_fscrypt_blocks(u64 off, u64 len) +{ + /* crypto blocks cannot span more than one page */ + BUILD_BUG_ON(CEPH_FSCRYPT_BLOCK_SHIFT > PAGE_SHIFT); + + return ((off+len+CEPH_FSCRYPT_BLOCK_SIZE-1) >> CEPH_FSCRYPT_BLOCK_SHIFT) - + (off >> CEPH_FSCRYPT_BLOCK_SHIFT); +} + +/* + * If we have an encrypted inode then we must adjust the offset and + * range of the on-the-wire read to cover an entire encryption block. + * The copy will be done using the original offset and length, after + * we've decrypted the result. 
+ */ +static inline void ceph_fscrypt_adjust_off_and_len(struct inode *inode, u64 *off, u64 *len) +{ + if (IS_ENCRYPTED(inode)) { + *len = ceph_fscrypt_blocks(*off, *len) * CEPH_FSCRYPT_BLOCK_SIZE; + *off &= CEPH_FSCRYPT_BLOCK_MASK; + } +} + +int ceph_fscrypt_decrypt_block_inplace(const struct inode *inode, + struct page *page, unsigned int len, + unsigned int offs, u64 lblk_num); +int ceph_fscrypt_encrypt_block_inplace(const struct inode *inode, + struct page *page, unsigned int len, + unsigned int offs, u64 lblk_num, gfp_t gfp_flags); +int ceph_fscrypt_decrypt_pages(struct inode *inode, struct page **page, u64 off, int len); +int ceph_fscrypt_decrypt_extents(struct inode *inode, struct page **page, u64 off, + struct ceph_sparse_extent *map, u32 ext_cnt); +int ceph_fscrypt_encrypt_pages(struct inode *inode, struct page **page, u64 off, + int len, gfp_t gfp); #else /* CONFIG_FS_ENCRYPTION */ static inline void ceph_fscrypt_set_ops(struct super_block *sb) @@ -143,6 +177,43 @@ static inline int ceph_fname_to_usr(const struct ceph_fname *fname, struct fscry oname->len = fname->name_len; return 0; } + +static inline void ceph_fscrypt_adjust_off_and_len(struct inode *inode, u64 *off, u64 *len) +{ +} + +static inline int ceph_fscrypt_decrypt_block_inplace(const struct inode *inode, + struct page *page, unsigned int len, + unsigned int offs, u64 lblk_num) +{ + return 0; +} + +static inline int ceph_fscrypt_encrypt_block_inplace(const struct inode *inode, + struct page *page, unsigned int len, + unsigned int offs, u64 lblk_num, gfp_t gfp_flags) +{ + return 0; +} + +static inline int ceph_fscrypt_decrypt_pages(struct inode *inode, struct page **page, + u64 off, int len) +{ + return 0; +} + +static inline int ceph_fscrypt_decrypt_extents(struct inode *inode, struct page **page, + u64 off, struct ceph_sparse_extent *map, + u32 ext_cnt) +{ + return 0; +} + +static inline int ceph_fscrypt_encrypt_pages(struct inode *inode, struct page **page, + u64 off, int len, gfp_t gfp) +{ + 
return 0; +} #endif /* CONFIG_FS_ENCRYPTION */ #endif diff --git a/fs/ceph/super.c b/fs/ceph/super.c index a1f921d5675d..70cd1dcad645 100644 --- a/fs/ceph/super.c +++ b/fs/ceph/super.c @@ -1093,6 +1093,14 @@ static int ceph_set_test_dummy_encryption(struct super_block *sb, struct fs_cont return -EEXIST; } + /* HACK: allow for cleartext "encryption" in files for testing */ + if (fsc->mount_options->test_dummy_encryption && + !strcmp(fsc->mount_options->test_dummy_encryption, "clear")) { + fsopt->flags |= CEPH_MOUNT_OPT_DUMMY_ENC_CLEAR; + kfree(fsc->mount_options->test_dummy_encryption); + fsc->mount_options->test_dummy_encryption = NULL; + } + err = fscrypt_set_test_dummy_encryption(sb, fsc->mount_options->test_dummy_encryption, &fsc->dummy_enc_policy); diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 9809bc97b89e..9c205c6967b7 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -44,6 +44,7 @@ #define CEPH_MOUNT_OPT_NOPAGECACHE (1<<16) /* bypass pagecache altogether */ #define CEPH_MOUNT_OPT_SPARSEREAD (1<<17) /* always do sparse reads */ #define CEPH_MOUNT_OPT_TEST_DUMMY_ENC (1<<18) /* enable dummy encryption (for testing) */ +#define CEPH_MOUNT_OPT_DUMMY_ENC_CLEAR (1<<19) /* don't actually encrypt content */ #define CEPH_MOUNT_OPT_DEFAULT \ (CEPH_MOUNT_OPT_DCACHE | \ From patchwork Tue Apr 5 19:20:19 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802372 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 4F562C433EF for ; Wed, 6 Apr 2022 04:16:49 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1450738AbiDFEP5 (ORCPT ); Wed, 6 Apr 2022 00:15:57 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38080 "EHLO lindbergh.monkeyblade.net" 
rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573595AbiDETXQ (ORCPT ); Tue, 5 Apr 2022 15:23:16 -0400 Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 2B02F4D248; Tue, 5 Apr 2022 12:21:17 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id B066C616C5; Tue, 5 Apr 2022 19:21:16 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 76100C385A1; Tue, 5 Apr 2022 19:21:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186476; bh=1zqIrI3g+MJ1zUHiN6pzRuPNEEpwUbADMsCTJsz6hB0=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=nRk5CTi3KlF02Irw1XktR7e9+uY2OqjAget0xIulgjLfqps7Mid3sU0xbkrNNRrS+ /LdiW5tbVGF5eejnmQTWMN32kKj1lmgaDkDCigZei4Sjmf7DvLrp5onw3ERmwVXa08 9umYzatTzw+eV+P019rBbCh4Sustxi1D9lmh+pzNgmQOoGtz/hX3bg0JGqxEUysWBk C3b0v7MDC+NIqPcV+33j9owIAESQSoR1RwynsBUt3UjGGbFpBBjFXtejS9hUM70jCv dBFXU5nw3JMnVw/iCDCCYDgURGONyyheMMMEWyKaNEiY0mcyXRLqGopJ8xmbonWNUM QaSnFleiLagFw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 48/59] ceph: add truncate size handling support for fscrypt Date: Tue, 5 Apr 2022 15:20:19 -0400 Message-Id: <20220405192030.178326-49-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org From: Xiubo Li This will transfer the encrypted last block contents to the MDS along with the truncate request only when the new size is smaller and not 
aligned to the fscrypt BLOCK size. When the last block is located in a file hole, the truncate request will only contain the header. The MDS could fail to do the truncate if another client or process has already updated the RADOS object which contains the last block, in which case it will return -EAGAIN and the kclient needs to retry. The RMW takes around 50ms, so let it retry 20 times for now. Signed-off-by: Xiubo Li Signed-off-by: Jeff Layton --- fs/ceph/crypto.h | 21 ++++++ fs/ceph/inode.c | 192 +++++++++++++++++++++++++++++++++++++++++++++-- fs/ceph/super.h | 5 ++ 3 files changed, 211 insertions(+), 7 deletions(-) diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h index fdd73c50487f..92a7b221a975 100644 --- a/fs/ceph/crypto.h +++ b/fs/ceph/crypto.h @@ -26,6 +26,27 @@ struct ceph_fname { bool no_copy; }; +/* + * Header for the encrypted file when truncating the size. This + * will be sent to the MDS, and the MDS will update the encrypted + * last block and then truncate the size. + */ +struct ceph_fscrypt_truncate_size_header { + __u8 ver; + __u8 compat; + + /* + * It will be sizeof(assert_ver + file_offset + block_size) + * if the last block is empty when it's located in a file + * hole. Otherwise data_len also includes CEPH_FSCRYPT_BLOCK_SIZE. 
+ */ + __le32 data_len; + + __le64 change_attr; + __le64 file_offset; + __le32 block_size; +} __packed; + struct ceph_fscrypt_auth { __le32 cfa_version; __le32 cfa_blob_len; diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c index f2a59306e4a6..eb8f066975a8 100644 --- a/fs/ceph/inode.c +++ b/fs/ceph/inode.c @@ -595,6 +595,7 @@ struct inode *ceph_alloc_inode(struct super_block *sb) ci->i_truncate_seq = 0; ci->i_truncate_size = 0; ci->i_truncate_pending = 0; + ci->i_truncate_pagecache_size = 0; ci->i_max_size = 0; ci->i_reported_size = 0; @@ -766,6 +767,10 @@ int ceph_fill_file_size(struct inode *inode, int issued, dout("truncate_size %lld -> %llu\n", ci->i_truncate_size, truncate_size); ci->i_truncate_size = truncate_size; + if (IS_ENCRYPTED(inode)) + ci->i_truncate_pagecache_size = size; + else + ci->i_truncate_pagecache_size = truncate_size; } return queue_trunc; } @@ -2140,7 +2145,7 @@ void __ceph_do_pending_vmtruncate(struct inode *inode) /* there should be no reader or writer */ WARN_ON_ONCE(ci->i_rd_ref || ci->i_wr_ref); - to = ci->i_truncate_size; + to = ci->i_truncate_pagecache_size; wrbuffer_refs = ci->i_wrbuffer_ref; dout("__do_pending_vmtruncate %p (%d) to %lld\n", inode, ci->i_truncate_pending, to); @@ -2150,7 +2155,7 @@ void __ceph_do_pending_vmtruncate(struct inode *inode) truncate_pagecache(inode, to); spin_lock(&ci->i_ceph_lock); - if (to == ci->i_truncate_size) { + if (to == ci->i_truncate_pagecache_size) { ci->i_truncate_pending = 0; finish = 1; } @@ -2231,6 +2236,136 @@ static const struct inode_operations ceph_encrypted_symlink_iops = { .listxattr = ceph_listxattr, }; +/* + * Transfer the encrypted last block to the MDS and the MDS + * will help update it when truncating a smaller size. + * + * We don't support a PAGE_SIZE that is smaller than the + * CEPH_FSCRYPT_BLOCK_SIZE. 
+ */ +static int fill_fscrypt_truncate(struct inode *inode, + struct ceph_mds_request *req, + struct iattr *attr) +{ + struct ceph_inode_info *ci = ceph_inode(inode); + int boff = attr->ia_size % CEPH_FSCRYPT_BLOCK_SIZE; + loff_t pos, orig_pos = round_down(attr->ia_size, CEPH_FSCRYPT_BLOCK_SIZE); + u64 block = orig_pos >> CEPH_FSCRYPT_BLOCK_SHIFT; + struct ceph_pagelist *pagelist = NULL; + struct kvec iov; + struct iov_iter iter; + struct page *page = NULL; + struct ceph_fscrypt_truncate_size_header header; + int retry_op = 0; + int len = CEPH_FSCRYPT_BLOCK_SIZE; + loff_t i_size = i_size_read(inode); + int got, ret, issued; + u64 objver; + + ret = __ceph_get_caps(inode, NULL, CEPH_CAP_FILE_RD, 0, -1, &got); + if (ret < 0) + return ret; + + issued = __ceph_caps_issued(ci, NULL); + + dout("%s size %lld -> %lld got cap refs on %s, issued %s\n", __func__, + i_size, attr->ia_size, ceph_cap_string(got), + ceph_cap_string(issued)); + + /* Try to writeback the dirty pagecaches */ + if (issued & (CEPH_CAP_FILE_BUFFER)) + filemap_write_and_wait(inode->i_mapping); + + page = __page_cache_alloc(GFP_KERNEL); + if (page == NULL) { + ret = -ENOMEM; + goto out; + } + + pagelist = ceph_pagelist_alloc(GFP_KERNEL); + if (!pagelist) { + ret = -ENOMEM; + goto out; + } + + iov.iov_base = kmap_local_page(page); + iov.iov_len = len; + iov_iter_kvec(&iter, READ, &iov, 1, len); + + pos = orig_pos; + ret = __ceph_sync_read(inode, &pos, &iter, &retry_op, &objver); + ceph_put_cap_refs(ci, got); + if (ret < 0) + goto out; + + /* Insert the header first */ + header.ver = 1; + header.compat = 1; + header.change_attr = cpu_to_le64(inode_peek_iversion_raw(inode)); + + /* + * Always set the block_size to CEPH_FSCRYPT_BLOCK_SIZE, + * because in MDS it may need this to do the truncate. 
+ */ + header.block_size = cpu_to_le32(CEPH_FSCRYPT_BLOCK_SIZE); + + /* + * If we hit a hole here, we should just skip filling + * the fscrypt for the request, because once fscrypt + * is enabled, the file will be split into many blocks + * of size CEPH_FSCRYPT_BLOCK_SIZE. If there + * is a hole, the hole size should be a multiple of the block + * size. + * + * If the RADOS object doesn't exist, objver will be set to 0. + */ + if (!objver) { + dout("%s hit hole, ppos %lld < size %lld\n", __func__, + pos, i_size); + + header.data_len = cpu_to_le32(8 + 8 + 4); + header.file_offset = 0; + ret = 0; + } else { + header.data_len = cpu_to_le32(8 + 8 + 4 + CEPH_FSCRYPT_BLOCK_SIZE); + header.file_offset = cpu_to_le64(orig_pos); + + /* truncate and zero out the extra contents for the last block */ + memset(iov.iov_base + boff, 0, PAGE_SIZE - boff); + + /* encrypt the last block */ + ret = ceph_fscrypt_encrypt_block_inplace(inode, page, + CEPH_FSCRYPT_BLOCK_SIZE, + 0, block, + GFP_KERNEL); + if (ret) + goto out; + } + + /* Insert the header */ + ret = ceph_pagelist_append(pagelist, &header, sizeof(header)); + if (ret) + goto out; + + if (header.block_size) { + /* Append the last block contents to pagelist */ + ret = ceph_pagelist_append(pagelist, iov.iov_base, + CEPH_FSCRYPT_BLOCK_SIZE); + if (ret) + goto out; + } + req->r_pagelist = pagelist; +out: + dout("%s %p size dropping cap refs on %s\n", __func__, + inode, ceph_cap_string(got)); + kunmap_local(iov.iov_base); + if (page) + __free_pages(page, 0); + if (ret && pagelist) + ceph_pagelist_release(pagelist); + return ret; +} + int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *cia) { struct ceph_inode_info *ci = ceph_inode(inode); @@ -2238,13 +2373,17 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c struct ceph_mds_request *req; struct ceph_mds_client *mdsc = ceph_sb_to_client(inode->i_sb)->mdsc; struct ceph_cap_flush *prealloc_cf; + loff_t isize = 
i_size_read(inode); int issued; int release = 0, dirtied = 0; int mask = 0; int err = 0; int inode_dirty_flags = 0; bool lock_snap_rwsem = false; + bool fill_fscrypt; + int truncate_retry = 20; /* The RMW will take around 50ms */ +retry: prealloc_cf = ceph_alloc_cap_flush(); if (!prealloc_cf) return -ENOMEM; @@ -2256,6 +2395,7 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c return PTR_ERR(req); } + fill_fscrypt = false; spin_lock(&ci->i_ceph_lock); issued = __ceph_caps_issued(ci, NULL); @@ -2377,10 +2517,27 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c } } if (ia_valid & ATTR_SIZE) { - loff_t isize = i_size_read(inode); - dout("setattr %p size %lld -> %lld\n", inode, isize, attr->ia_size); - if ((issued & CEPH_CAP_FILE_EXCL) && attr->ia_size >= isize) { + /* + * Only when the new size is smaller and not aligned to + * CEPH_FSCRYPT_BLOCK_SIZE will the RMW be needed. + */ + if (IS_ENCRYPTED(inode) && attr->ia_size < isize && + (attr->ia_size % CEPH_FSCRYPT_BLOCK_SIZE)) { + mask |= CEPH_SETATTR_SIZE; + release |= CEPH_CAP_FILE_SHARED | CEPH_CAP_FILE_EXCL | + CEPH_CAP_FILE_RD | CEPH_CAP_FILE_WR; + set_bit(CEPH_MDS_R_FSCRYPT_FILE, &req->r_req_flags); + mask |= CEPH_SETATTR_FSCRYPT_FILE; + req->r_args.setattr.size = + cpu_to_le64(round_up(attr->ia_size, + CEPH_FSCRYPT_BLOCK_SIZE)); + req->r_args.setattr.old_size = + cpu_to_le64(round_up(isize, + CEPH_FSCRYPT_BLOCK_SIZE)); + req->r_fscrypt_file = attr->ia_size; + fill_fscrypt = true; + } else if ((issued & CEPH_CAP_FILE_EXCL) && attr->ia_size >= isize) { if (attr->ia_size > isize) { i_size_write(inode, attr->ia_size); inode->i_blocks = calc_inode_blocks(attr->ia_size); @@ -2403,7 +2560,6 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c cpu_to_le64(round_up(isize, CEPH_FSCRYPT_BLOCK_SIZE)); req->r_fscrypt_file = attr->ia_size; - /* FIXME: client must zero out any partial blocks! 
*/ } else { req->r_args.setattr.size = cpu_to_le64(attr->ia_size); req->r_args.setattr.old_size = cpu_to_le64(isize); @@ -2469,8 +2625,10 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c release &= issued; spin_unlock(&ci->i_ceph_lock); - if (lock_snap_rwsem) + if (lock_snap_rwsem) { up_read(&mdsc->snap_rwsem); + lock_snap_rwsem = false; + } if (inode_dirty_flags) __mark_inode_dirty(inode, inode_dirty_flags); @@ -2482,7 +2640,27 @@ int __ceph_setattr(struct inode *inode, struct iattr *attr, struct ceph_iattr *c req->r_args.setattr.mask = cpu_to_le32(mask); req->r_num_caps = 1; req->r_stamp = attr->ia_ctime; + if (fill_fscrypt) { + err = fill_fscrypt_truncate(inode, req, attr); + if (err) + goto out; + } + + /* + * The truncate request will return -EAGAIN when the + * last block has been updated just before the MDS + * successfully gets the xlock for the FILE lock. To + * avoid corrupting the file contents we need to retry + * it. + */ err = ceph_mdsc_do_request(mdsc, NULL, req); + if (err == -EAGAIN && truncate_retry--) { + dout("setattr %p result=%d (%s locally, %d remote), retry it!\n", + inode, err, ceph_cap_string(dirtied), mask); + ceph_mdsc_put_request(req); + ceph_free_cap_flush(prealloc_cf); + goto retry; + } } out: dout("setattr %p result=%d (%s locally, %d remote)\n", inode, err, diff --git a/fs/ceph/super.h b/fs/ceph/super.h index 9c205c6967b7..a2e1c83ab29a 100644 --- a/fs/ceph/super.h +++ b/fs/ceph/super.h @@ -410,6 +410,11 @@ struct ceph_inode_info { u32 i_truncate_seq; /* last truncate to smaller size */ u64 i_truncate_size; /* and the size we last truncated down to */ int i_truncate_pending; /* still need to call vmtruncate */ + /* + * For the non-fscrypt case this equals i_truncate_size; otherwise it + * equals fscrypt_file_size. + */ + u64 i_truncate_pagecache_size; u64 i_max_size; /* max file size authorized by mds */ u64 i_reported_size; /* (max_)size reported to or requested of mds */ From patchwork Tue Apr 5 
19:20:20 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802340 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 05274C433EF for ; Wed, 6 Apr 2022 04:07:34 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S242068AbiDFEIh (ORCPT ); Wed, 6 Apr 2022 00:08:37 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38256 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573598AbiDETXS (ORCPT ); Tue, 5 Apr 2022 15:23:18 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id BF0664F45F; Tue, 5 Apr 2022 12:21:19 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 666ADB81FAA; Tue, 5 Apr 2022 19:21:18 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 5EA4AC385A3; Tue, 5 Apr 2022 19:21:16 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186477; bh=PJPyqDK6DNMW9F+5wYxGCw3ywIqwNAhCuh2XTGubQow=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=VHTfVfFNYI9LHTiG9Bs4cHcpNr/9s3fqIqPmOWNKMB1wC8auyB8ll+7SfvqJDzli0 p7l6gtLT6OhJIUYKuj4e7XSoW89v0ZaG99pNkDGtaDadVuh8y00onEOiEp4vMEAITt 8V5ko5nOrVYSOYoov7SNqVS+wbgtmhJc+ZCWcarhMC3wmHeFlWoAnjK3e5UJfCInxr 6VlQumLqW7w+HwyzsO1tOPPsCXvZArTVj9Vsi9eRrEq/OAy2W9CSl6RBTeh17jRHvm K/QhiYNL3uFmt6BGHGtIjAQ3C8zCztTq+KBh1CjsEqVU7o6TqPUsNOwcEGtGZF1Oi6 c5bPPW9kHlBEw== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, 
 linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org,
 linux-kernel@vger.kernel.org, lhenriques@suse.de
Subject: [PATCH v13 49/59] libceph: allow ceph_osdc_new_request to accept a multi-op read
Date: Tue, 5 Apr 2022 15:20:20 -0400
Message-Id: <20220405192030.178326-50-jlayton@kernel.org>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org>
References: <20220405192030.178326-1-jlayton@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Currently we have some special-casing for multi-op writes, but in the
case of a read, we can't really handle it. All of the current multi-op
callers call it with CEPH_OSD_FLAG_WRITE set.

Have ceph_osdc_new_request check for CEPH_OSD_FLAG_READ and if it's set,
allocate multiple reply ops instead of multiple request ops. If neither
flag is set, return -EINVAL.

Signed-off-by: Jeff Layton
---
 net/ceph/osd_client.c | 27 +++++++++++++++++++++------
 1 file changed, 21 insertions(+), 6 deletions(-)

diff --git a/net/ceph/osd_client.c b/net/ceph/osd_client.c
index febdd728b2fb..39d38b69a953 100644
--- a/net/ceph/osd_client.c
+++ b/net/ceph/osd_client.c
@@ -1130,15 +1130,30 @@ struct ceph_osd_request *ceph_osdc_new_request(struct ceph_osd_client *osdc,
 	if (flags & CEPH_OSD_FLAG_WRITE)
 		req->r_data_offset = off;
 
-	if (num_ops > 1)
+	if (num_ops > 1) {
+		int num_req_ops, num_rep_ops;
+
 		/*
-		 * This is a special case for ceph_writepages_start(), but it
-		 * also covers ceph_uninline_data(). If more multi-op request
-		 * use cases emerge, we will need a separate helper.
+		 * If this is a multi-op write request, assume that we'll need
+		 * request ops. If it's a multi-op read then assume we'll need
+		 * reply ops. Anything else and call it -EINVAL.
 		 */
-		r = __ceph_osdc_alloc_messages(req, GFP_NOFS, num_ops, 0);
-	else
+		if (flags & CEPH_OSD_FLAG_WRITE) {
+			num_req_ops = num_ops;
+			num_rep_ops = 0;
+		} else if (flags & CEPH_OSD_FLAG_READ) {
+			num_req_ops = 0;
+			num_rep_ops = num_ops;
+		} else {
+			r = -EINVAL;
+			goto fail;
+		}
+
+		r = __ceph_osdc_alloc_messages(req, GFP_NOFS, num_req_ops,
+					       num_rep_ops);
+	} else {
 		r = ceph_osdc_alloc_messages(req, GFP_NOFS);
+	}
 	if (r)
 		goto fail;
 

From patchwork Tue Apr 5 19:20:21 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 12802368
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 2A02EC433FE for ; Wed, 6 Apr 2022 04:16:49 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1449298AbiDFEPs (ORCPT ); Wed, 6 Apr 2022 00:15:48 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38178 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573597AbiDETXR (ORCPT ); Tue, 5 Apr 2022 15:23:17 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EED174EA31; Tue, 5 Apr 2022 12:21:18 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 8B3BB616C5; Tue, 5 Apr 2022 19:21:18 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 4707FC385A5; Tue, 5 Apr 2022 19:21:17 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186478; bh=SRirXv6SjAaxHj9D42MDkbHJyqzjojeWbzUc3GlPdFI=;
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bMJ9UC+uMVUgQ9XDVmV67ithHJZhOUeNimcvXXlRYJv7dSBOr8tDIzQVPFOHSmuQz D9EfAD4BsUZYE9K36AoPR5uE8d8Cr7sFXI1oKvBr2uBGxNij1ICRH/FXWpmvcPDzHZ ZFFJ1gIvj3iZGUZoTcosy9SJpxqgX2LYrbhaclW+I8DuThSkuIJYeFtS6atPBr+CFl uqrVKYolaomVNhdyb7phcSqnKK2i9QxK3DSSKE+3mkhrDoQ3RH+M9+cwrA61JizQ7b 2ou149hHAvxKOJUL3cUNtUNog5ROmrLOZfpqjIGcavFIUxXjkKtGLa32zulA/28X/E f3vGHJ2sTEqAg== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 50/59] ceph: disable fallocate for encrypted inodes Date: Tue, 5 Apr 2022 15:20:21 -0400 Message-Id: <20220405192030.178326-51-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org ...hopefully, just for now. 
Signed-off-by: Jeff Layton
---
 fs/ceph/file.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 175a59277726..f74563e11058 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -2207,6 +2207,9 @@ static long ceph_fallocate(struct file *file, int mode,
 	if (!S_ISREG(inode->i_mode))
 		return -EOPNOTSUPP;
 
+	if (IS_ENCRYPTED(inode))
+		return -EOPNOTSUPP;
+
 	prealloc_cf = ceph_alloc_cap_flush();
 	if (!prealloc_cf)
 		return -ENOMEM;

From patchwork Tue Apr 5 19:20:22 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 12802358
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 01469C433FE for ; Wed, 6 Apr 2022 04:16:36 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1444661AbiDFEOZ (ORCPT ); Wed, 6 Apr 2022 00:14:25 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38258 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573599AbiDETXS (ORCPT ); Tue, 5 Apr 2022 15:23:18 -0400
Received: from dfw.source.kernel.org (dfw.source.kernel.org [IPv6:2604:1380:4641:c500::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id DBD254FC42; Tue, 5 Apr 2022 12:21:19 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by dfw.source.kernel.org (Postfix) with ESMTPS id 7787A617EE; Tue, 5 Apr 2022 19:21:19 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 3969DC385A0; Tue, 5 Apr 2022 19:21:18 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186478; bh=wNbrR0kMy5CjuXO377tAi8kaHSrkxQZLX+CwiBwZIn8=;
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=buHJBAvpQVbNR5sdtUe3WDquSf5XOnCge4HUUepr6yH3XluOPD6iLItil3DElv6XV 0pa7PyLtdjgwftcXxbomXqUm+1B7aNAm6YzUnJA04BiO256/TvYB2Jo6diCYxSgK+9 wyTMxEuOgVx4BZ7zaxMUCV4hVS4KqdUg2aq1qNSa7ahzOEBdqVnBageJUDdCGvmryc SXMjAV+Ftk4H+ORGeEvZxYv7rPCnlxuOENyArB8cYd8VScSzlsaYNxj+J2362jGuBO VYfRc/+PTLMulviro8dTuJwzm7P2oTCSm8fGbWJOwEdD68jpEMGcAqWC5nhWqHnVgq VDqSrlTRgNDUA== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 51/59] ceph: disable copy offload on encrypted inodes Date: Tue, 5 Apr 2022 15:20:22 -0400 Message-Id: <20220405192030.178326-52-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org If we have an encrypted inode, then the client will need to re-encrypt the contents of the new object. Disable copy offload to or from encrypted inodes. 
Signed-off-by: Jeff Layton
---
 fs/ceph/file.c | 4 ++++
 1 file changed, 4 insertions(+)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index f74563e11058..f9e775d6cdf0 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -2526,6 +2526,10 @@ static ssize_t __ceph_copy_file_range(struct file *src_file, loff_t src_off,
 		return -EOPNOTSUPP;
 	}
 
+	/* Every encrypted inode gets its own key, so we can't offload them */
+	if (IS_ENCRYPTED(src_inode) || IS_ENCRYPTED(dst_inode))
+		return -EOPNOTSUPP;
+
 	if (len < src_ci->i_layout.object_size)
 		return -EOPNOTSUPP; /* no remote copy will be done */
 

From patchwork Tue Apr 5 19:20:23 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 12802350
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 20F58C4332F for ; Wed, 6 Apr 2022 04:16:25 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1382243AbiDFENI (ORCPT ); Wed, 6 Apr 2022 00:13:08 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38448 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573600AbiDETXV (ORCPT ); Tue, 5 Apr 2022 15:23:21 -0400
Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6C0ED50E37; Tue, 5 Apr 2022 12:21:22 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 1E282B81F6B; Tue, 5 Apr 2022 19:21:21 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 215F3C385A3; Tue, 5 Apr 2022 19:21:19 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256;
c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186479; bh=+2Vkh7etWPpq3ECuOQUyhhfQFepRWAk8lFgAlse9LJs=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=PYEUv2DlDfZZodloCX6EfAoA5aOT1NjpSCd8by9nr3AoYfa/qqd2jWJy7MvrgisjA e0iX+toF52FPTgwjRUm6M08XzAL5//L+htGwTBqeEeu4Tgb8i/GxYV4F+32iftBGnk lMLl+5HM7kTcU85qP5jbqIxAmyrefEOiOjfAWWT+/ZW0GKr8lL1SYpbT1Jbr+oRdxN 6lUvuaR/gDyfZqQLDgk/0HHlrJv/rYF0OhbAsuCywUnIGGh+TU9Koq3zGzXEUJsH8j o3eBR8LmDhH4rGh7NRhlRfEmXcN1hzVAsjAXHHNIlmc0akk1cHgZGsWBRAoZA2mOFZ NH361eFDpuA9g== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 52/59] ceph: don't use special DIO path for encrypted inodes Date: Tue, 5 Apr 2022 15:20:23 -0400 Message-Id: <20220405192030.178326-53-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Eventually I want to merge the synchronous and direct read codepaths, possibly via new netfs infrastructure. For now, the direct path is not crypto-enabled, so use the sync read/write paths instead. 
Signed-off-by: Jeff Layton
---
 fs/ceph/file.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index f9e775d6cdf0..41b97d32dfcf 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1709,7 +1709,9 @@ static ssize_t ceph_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	     ceph_cap_string(got));
 
 	if (ci->i_inline_version == CEPH_INLINE_NONE) {
-		if (!retry_op && (iocb->ki_flags & IOCB_DIRECT)) {
+		if (!retry_op &&
+		    (iocb->ki_flags & IOCB_DIRECT) &&
+		    !IS_ENCRYPTED(inode)) {
 			ret = ceph_direct_read_write(iocb, to,
 						     NULL, NULL);
 			if (ret >= 0 && ret < len)
@@ -1935,7 +1937,7 @@ static ssize_t ceph_write_iter(struct kiocb *iocb, struct iov_iter *from)
 
 		/* we might need to revert back to that point */
 		data = *from;
-		if (iocb->ki_flags & IOCB_DIRECT)
+		if ((iocb->ki_flags & IOCB_DIRECT) && !IS_ENCRYPTED(inode))
 			written = ceph_direct_read_write(iocb, &data, snapc,
 							 &prealloc_cf);
 		else

From patchwork Tue Apr 5 19:20:24 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 12802343
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 43A04C433EF for ; Wed, 6 Apr 2022 04:16:16 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1352985AbiDFELn (ORCPT ); Wed, 6 Apr 2022 00:11:43 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38514 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573601AbiDETXV (ORCPT ); Tue, 5 Apr 2022 15:23:21 -0400
Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 704B4515BC; Tue, 5 Apr 2022 12:21:23 -0700 (PDT)
Received: from smtp.kernel.org
(relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id 2046BB81FA5; Tue, 5 Apr 2022 19:21:22 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id 0B237C385A1; Tue, 5 Apr 2022 19:21:19 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186480; bh=aBwSCl+nnd5kUW2zQQtaeshWmlSvmWxl99UF09aY6AQ=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=bQB7CfxjNrExI1aJTNY6nAaIMDhbZN1vzkObuQ+dIoqtT1qCNZEnZfSj5ZEdz9DC1 zu2WuAhBT1TK+p+0MhGSIDnhIh/hiVRcbH/nuQZuMli8zJ1RZUoXBgK+G9v0uol0IR xeFjXw9fKar9XABvQvUJT+dXKYh+XVFFEN7NHgKr9lu7tDvwTibMvq3AM+M0bwkKiL k75o5H3n3SUxDz4yKncozFnh8S0O9L9gvm/NMSdaJ8A6NuAZQuvuYAPXP5z/xr07gm 80bkN+tZUO74+oHRnUJ02bMDhXRX8tzyY2P3JTpBBdFMH3UpmW++cjDBPll8Rr+Pbd kmr7qW1P+vZ1A== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 53/59] ceph: align data in pages in ceph_sync_write Date: Tue, 5 Apr 2022 15:20:24 -0400 Message-Id: <20220405192030.178326-54-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Encrypted files will need to be dealt with in block-sized chunks and once we do that, the way that ceph_sync_write aligns the data in the bounce buffer won't be acceptable. Change it to align the data the same way it would be aligned in the pagecache. 
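
As a rough illustration of the alignment change, here is a user-space
sketch (not kernel code). demo_calc_pages_for() mirrors the arithmetic of
libceph's calc_pages_for(); the demo_* names and the 4k page size are
assumptions made for the example.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed for illustration; the kernel gets these from the architecture. */
#define DEMO_PAGE_SHIFT 12
#define DEMO_PAGE_SIZE  ((uint64_t)1 << DEMO_PAGE_SHIFT)

/* Old scheme: the data always starts at offset 0 of the first bounce page. */
static inline uint64_t demo_pages_from_page_start(uint64_t len)
{
	return (len + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT;
}

/* New scheme: count the pages that [pos, pos + len) touches in the file. */
static inline uint64_t demo_calc_pages_for(uint64_t pos, uint64_t len)
{
	assert(len > 0);
	return ((pos + len + DEMO_PAGE_SIZE - 1) >> DEMO_PAGE_SHIFT) -
	       (pos >> DEMO_PAGE_SHIFT);
}

/* Bytes copied into the first page once the in-page offset is honoured. */
static inline uint64_t demo_first_chunk(uint64_t pos, uint64_t len)
{
	uint64_t off = pos & (DEMO_PAGE_SIZE - 1);	/* offset_in_page(pos) */
	uint64_t room = DEMO_PAGE_SIZE - off;

	return len < room ? len : room;
}
```

A 200 byte write at pos 4000 fits in one page under the old scheme, but
straddles a page boundary in the file, so the pagecache-style layout needs
two pages with only 96 bytes landing in the first one.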
Signed-off-by: Jeff Layton
---
 fs/ceph/file.c | 20 ++++++++++----------
 1 file changed, 10 insertions(+), 10 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 41b97d32dfcf..69ac67c93552 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1551,6 +1551,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
 	bool check_caps = false;
 	struct timespec64 mtime = current_time(inode);
 	size_t count = iov_iter_count(from);
+	size_t off;
 
 	if (ceph_snap(file_inode(file)) != CEPH_NOSNAP)
 		return -EROFS;
@@ -1588,12 +1589,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
 			break;
 		}
 
-		/*
-		 * write from beginning of first page,
-		 * regardless of io alignment
-		 */
-		num_pages = (len + PAGE_SIZE - 1) >> PAGE_SHIFT;
-
+		num_pages = calc_pages_for(pos, len);
 		pages = ceph_alloc_page_vector(num_pages, GFP_KERNEL);
 		if (IS_ERR(pages)) {
 			ret = PTR_ERR(pages);
@@ -1601,9 +1597,12 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
 		}
 
 		left = len;
+		off = offset_in_page(pos);
 		for (n = 0; n < num_pages; n++) {
-			size_t plen = min_t(size_t, left, PAGE_SIZE);
-			ret = copy_page_from_iter(pages[n], 0, plen, from);
+			size_t plen = min_t(size_t, left, PAGE_SIZE - off);
+
+			ret = copy_page_from_iter(pages[n], off, plen, from);
+			off = 0;
 			if (ret != plen) {
 				ret = -EFAULT;
 				break;
@@ -1618,8 +1617,9 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
 
 		req->r_inode = inode;
 
-		osd_req_op_extent_osd_data_pages(req, 0, pages, len, 0,
-						false, true);
+		osd_req_op_extent_osd_data_pages(req, 0, pages, len,
+						 offset_in_page(pos),
+						 false, true);
 
 		req->r_mtime = mtime;
 		ret = ceph_osdc_start_request(&fsc->client->osdc, req, false);

From patchwork Tue Apr 5 19:20:25 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 12802362
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 15DD6C433EF for ; Wed, 6 Apr 2022 04:16:46 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1446044AbiDFEOz (ORCPT ); Wed, 6 Apr 2022 00:14:55 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38656 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573603AbiDETXY (ORCPT ); Tue, 5 Apr 2022 15:23:24 -0400 Received: from ams.source.kernel.org (ams.source.kernel.org [145.40.68.75]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 6AD7051E65; Tue, 5 Apr 2022 12:21:24 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ams.source.kernel.org (Postfix) with ESMTPS id E949FB81FAA; Tue, 5 Apr 2022 19:21:22 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id E89D5C385A0; Tue, 5 Apr 2022 19:21:20 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186481; bh=3k0F7k/flOmPH0AUjNjEF/MgnBNUhqWecKWs+ePIhNk=; h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=P5FnZITw0s3+Sx4qA/1VH9O15GScBDQIXZ8nCsBNvFSg+Vdhw4RJy65/ZWY3F26tN +4wgSvFIxpRxMqVvwbddLycn5sm2fBqSX5enOuz3/nMZgjvkBRpfQRrYgzu933ABhM bUK5j/4dRoKo1aXvdEQ2xwqAH6w6EtGLxLAWsiyaqei8j1uOyT26UhA7GPFBwnCiRG AZhTCWYQdeUim6IHONerJIOdP5WGysPM9ji91K1ZdR8dlYKo0QlOQA4XDLf2epbYld WMc4FowZckPvSYyRxcnzH2Oqmsgyxbw6BKk8z5M7VgDtLUaArwCmukYNRcf/Vk/woY 5qWycATSrxwYQ== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 54/59] ceph: add read/modify/write to ceph_sync_write Date: Tue, 5 Apr 2022 15:20:25 -0400 Message-Id: 
 <20220405192030.178326-55-jlayton@kernel.org>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org>
References: <20220405192030.178326-1-jlayton@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

When doing a synchronous write on an encrypted inode, we have no
guarantee that the caller is writing crypto block-aligned data. When the
write is not block-aligned, we must do a read/modify/write cycle.

First, expand the range to cover complete blocks. If we had to change
the original pos or length, issue a read to fill the first and/or last
pages, and fetch the version of the object from the result.

We then copy data into the pages as usual, encrypt the result and issue
a write prefixed by an assertion that the version hasn't changed. If it
has changed then we restart the whole cycle.

If there is no object at that position in the file (-ENOENT), we prefix
the write with an exclusive create of the object instead.

Signed-off-by: Jeff Layton
---
 fs/ceph/file.c | 319 ++++++++++++++++++++++++++++++++++++++++++++-----
 1 file changed, 290 insertions(+), 29 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 69ac67c93552..522189ed6642 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -1540,18 +1540,16 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
 	struct inode *inode = file_inode(file);
 	struct ceph_inode_info *ci = ceph_inode(inode);
 	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
-	struct ceph_vino vino;
+	struct ceph_osd_client *osdc = &fsc->client->osdc;
 	struct ceph_osd_request *req;
 	struct page **pages;
 	u64 len;
 	int num_pages;
 	int written = 0;
-	int flags;
 	int ret;
 	bool check_caps = false;
 	struct timespec64 mtime = current_time(inode);
 	size_t count = iov_iter_count(from);
-	size_t off;
 
 	if (ceph_snap(file_inode(file)) != CEPH_NOSNAP)
 		return -EROFS;
@@ -1571,29 +1569,236 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos,
 	if (ret < 0)
dout("invalidate_inode_pages2_range returned %d\n", ret); - flags = /* CEPH_OSD_FLAG_ORDERSNAP | */ CEPH_OSD_FLAG_WRITE; - while ((len = iov_iter_count(from)) > 0) { size_t left; int n; + u64 write_pos = pos; + u64 write_len = len; + u64 objnum, objoff; + u32 xlen; + u64 assert_ver; + bool rmw; + bool first, last; + struct iov_iter saved_iter = *from; + size_t off; + + ceph_fscrypt_adjust_off_and_len(inode, &write_pos, &write_len); + + /* clamp the length to the end of first object */ + ceph_calc_file_object_mapping(&ci->i_layout, write_pos, + write_len, &objnum, &objoff, + &xlen); + write_len = xlen; + + /* adjust len downward if it goes beyond current object */ + if (pos + len > write_pos + write_len) + len = write_pos + write_len - pos; - vino = ceph_vino(inode); - req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, - vino, pos, &len, 0, 1, - CEPH_OSD_OP_WRITE, flags, snapc, - ci->i_truncate_seq, - ci->i_truncate_size, - false); - if (IS_ERR(req)) { - ret = PTR_ERR(req); - break; - } + /* + * If we had to adjust the length or position to align with a + * crypto block, then we must do a read/modify/write cycle. We + * use a version assertion to redrive the thing if something + * changes in between. + */ + first = pos != write_pos; + last = (pos + len) != (write_pos + write_len); + rmw = first || last; - num_pages = calc_pages_for(pos, len); + dout("sync_write ino %llx %lld~%llu adjusted %lld~%llu -- %srmw\n", + ci->i_vino.ino, pos, len, write_pos, write_len, rmw ? "" : "no "); + + /* + * The data is emplaced into the page as it would be if it were in + * an array of pagecache pages. + */ + num_pages = calc_pages_for(write_pos, write_len); pages = ceph_alloc_page_vector(num_pages, GFP_KERNEL); if (IS_ERR(pages)) { ret = PTR_ERR(pages); - goto out; + break; + } + + /* Do we need to preload the pages? 
*/ + if (rmw) { + u64 first_pos = write_pos; + u64 last_pos = (write_pos + write_len) - CEPH_FSCRYPT_BLOCK_SIZE; + u64 read_len = CEPH_FSCRYPT_BLOCK_SIZE; + struct ceph_osd_req_op *op; + + /* We should only need to do this for encrypted inodes */ + WARN_ON_ONCE(!IS_ENCRYPTED(inode)); + + /* No need to do two reads if first and last blocks are same */ + if (first && last_pos == first_pos) + last = false; + + /* + * Allocate a read request for one or two extents, depending + * on how the request was aligned. + */ + req = ceph_osdc_new_request(osdc, &ci->i_layout, + ci->i_vino, first ? first_pos : last_pos, + &read_len, 0, (first && last) ? 2 : 1, + CEPH_OSD_OP_SPARSE_READ, CEPH_OSD_FLAG_READ, + NULL, ci->i_truncate_seq, + ci->i_truncate_size, false); + if (IS_ERR(req)) { + ceph_release_page_vector(pages, num_pages); + ret = PTR_ERR(req); + break; + } + + /* Something is misaligned! */ + if (read_len != CEPH_FSCRYPT_BLOCK_SIZE) { + ceph_osdc_put_request(req); + ceph_release_page_vector(pages, num_pages); + ret = -EIO; + break; + } + + /* Add extent for first block? */ + op = &req->r_ops[0]; + + if (first) { + osd_req_op_extent_osd_data_pages(req, 0, pages, + CEPH_FSCRYPT_BLOCK_SIZE, + offset_in_page(first_pos), + false, false); + /* We only expect a single extent here */ + ret = __ceph_alloc_sparse_ext_map(op, 1); + if (ret) { + ceph_osdc_put_request(req); + ceph_release_page_vector(pages, num_pages); + break; + } + } + + /* Add extent for last block */ + if (last) { + /* Init the other extent if first extent has been used */ + if (first) { + op = &req->r_ops[1]; + osd_req_op_extent_init(req, 1, CEPH_OSD_OP_SPARSE_READ, + last_pos, CEPH_FSCRYPT_BLOCK_SIZE, + ci->i_truncate_size, + ci->i_truncate_seq); + } + + ret = __ceph_alloc_sparse_ext_map(op, 1); + if (ret) { + ceph_osdc_put_request(req); + ceph_release_page_vector(pages, num_pages); + break; + } + + osd_req_op_extent_osd_data_pages(req, first ? 
1 : 0, + &pages[num_pages - 1], + CEPH_FSCRYPT_BLOCK_SIZE, + offset_in_page(last_pos), + false, false); + } + + ret = ceph_osdc_start_request(osdc, req, false); + if (!ret) + ret = ceph_osdc_wait_request(osdc, req); + + /* FIXME: length field is wrong if there are 2 extents */ + ceph_update_read_metrics(&fsc->mdsc->metric, + req->r_start_latency, + req->r_end_latency, + read_len, ret); + + /* Ok if object is not already present */ + if (ret == -ENOENT) { + /* + * If there is no object, then we can't assert + * on its version. Set it to 0, and we'll use an + * exclusive create instead. + */ + ceph_osdc_put_request(req); + assert_ver = 0; + ret = 0; + + /* + * zero out the soon-to-be uncopied parts of the + * first and last pages. + */ + if (first) + zero_user_segment(pages[0], 0, + offset_in_page(first_pos)); + if (last) + zero_user_segment(pages[num_pages - 1], + offset_in_page(last_pos), + PAGE_SIZE); + } else { + if (ret < 0) { + ceph_osdc_put_request(req); + ceph_release_page_vector(pages, num_pages); + break; + } + + op = &req->r_ops[0]; + if (op->extent.sparse_ext_cnt == 0) { + if (first) + zero_user_segment(pages[0], 0, + offset_in_page(first_pos)); + else + zero_user_segment(pages[num_pages - 1], + offset_in_page(last_pos), + PAGE_SIZE); + } else if (op->extent.sparse_ext_cnt != 1 || + ceph_sparse_ext_map_end(op) != + CEPH_FSCRYPT_BLOCK_SIZE) { + ret = -EIO; + ceph_osdc_put_request(req); + ceph_release_page_vector(pages, num_pages); + break; + } + + if (first && last) { + op = &req->r_ops[1]; + if (op->extent.sparse_ext_cnt == 0) { + zero_user_segment(pages[num_pages - 1], + offset_in_page(last_pos), + PAGE_SIZE); + } else if (op->extent.sparse_ext_cnt != 1 || + ceph_sparse_ext_map_end(op) != + CEPH_FSCRYPT_BLOCK_SIZE) { + ret = -EIO; + ceph_osdc_put_request(req); + ceph_release_page_vector(pages, num_pages); + break; + } + } + + /* Grab assert version. It must be non-zero. 
*/ + assert_ver = req->r_version; + WARN_ON_ONCE(ret > 0 && assert_ver == 0); + + ceph_osdc_put_request(req); + if (first) { + ret = ceph_fscrypt_decrypt_block_inplace(inode, + pages[0], + CEPH_FSCRYPT_BLOCK_SIZE, + offset_in_page(first_pos), + first_pos >> CEPH_FSCRYPT_BLOCK_SHIFT); + if (ret < 0) { + ceph_release_page_vector(pages, num_pages); + break; + } + } + if (last) { + ret = ceph_fscrypt_decrypt_block_inplace(inode, + pages[num_pages - 1], + CEPH_FSCRYPT_BLOCK_SIZE, + offset_in_page(last_pos), + last_pos >> CEPH_FSCRYPT_BLOCK_SHIFT); + if (ret < 0) { + ceph_release_page_vector(pages, num_pages); + break; + } + } + } } left = len; @@ -1601,43 +1806,98 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, for (n = 0; n < num_pages; n++) { size_t plen = min_t(size_t, left, PAGE_SIZE - off); + /* copy the data */ ret = copy_page_from_iter(pages[n], off, plen, from); - off = 0; if (ret != plen) { ret = -EFAULT; break; } + off = 0; left -= ret; } - if (ret < 0) { + dout("sync_write write failed with %d\n", ret); ceph_release_page_vector(pages, num_pages); - goto out; + break; } - req->r_inode = inode; + if (IS_ENCRYPTED(inode)) { + ret = ceph_fscrypt_encrypt_pages(inode, pages, + write_pos, write_len, + GFP_KERNEL); + if (ret < 0) { + dout("encryption failed with %d\n", ret); + ceph_release_page_vector(pages, num_pages); + break; + } + } - osd_req_op_extent_osd_data_pages(req, 0, pages, len, - offset_in_page(pos), - false, true); + req = ceph_osdc_new_request(osdc, &ci->i_layout, + ci->i_vino, write_pos, &write_len, + rmw ? 1 : 0, rmw ? 2 : 1, + CEPH_OSD_OP_WRITE, + CEPH_OSD_FLAG_WRITE, + snapc, ci->i_truncate_seq, + ci->i_truncate_size, false); + if (IS_ERR(req)) { + ret = PTR_ERR(req); + ceph_release_page_vector(pages, num_pages); + break; + } + dout("sync_write write op %lld~%llu\n", write_pos, write_len); + osd_req_op_extent_osd_data_pages(req, rmw ? 
1 : 0, pages, write_len, + offset_in_page(write_pos), false, + true); + req->r_inode = inode; req->r_mtime = mtime; - ret = ceph_osdc_start_request(&fsc->client->osdc, req, false); + + /* Set up the assertion */ + if (rmw) { + /* + * Set up the assertion. If we don't have a version number, + * then the object doesn't exist yet. Use an exclusive create + * instead of a version assertion in that case. + */ + if (assert_ver) { + osd_req_op_init(req, 0, CEPH_OSD_OP_ASSERT_VER, 0); + req->r_ops[0].assert_ver.ver = assert_ver; + } else { + osd_req_op_init(req, 0, CEPH_OSD_OP_CREATE, + CEPH_OSD_OP_FLAG_EXCL); + } + } + + ret = ceph_osdc_start_request(osdc, req, false); if (!ret) - ret = ceph_osdc_wait_request(&fsc->client->osdc, req); + ret = ceph_osdc_wait_request(osdc, req); ceph_update_write_metrics(&fsc->mdsc->metric, req->r_start_latency, req->r_end_latency, len, ret); -out: ceph_osdc_put_request(req); if (ret != 0) { + dout("sync_write osd write returned %d\n", ret); + /* Version changed! Must re-do the rmw cycle */ + if ((assert_ver && (ret == -ERANGE || ret == -EOVERFLOW)) || + (!assert_ver && ret == -EEXIST)) { + /* We should only ever see this on a rmw */ + WARN_ON_ONCE(!rmw); + + /* The version should never go backward */ + WARN_ON_ONCE(ret == -EOVERFLOW); + + *from = saved_iter; + + /* FIXME: limit number of times we loop? 
*/ + continue; + } ceph_set_error_write(ci); break; } - ceph_clear_error_write(ci); pos += len; written += len; + dout("sync_write written %d\n", written); if (pos > i_size_read(inode)) { check_caps = ceph_inode_set_size(inode, pos); if (check_caps) @@ -1652,6 +1912,7 @@ ceph_sync_write(struct kiocb *iocb, struct iov_iter *from, loff_t pos, ret = written; iocb->ki_pos = pos; } + dout("sync_write returning %d\n", ret); return ret; } From patchwork Tue Apr 5 19:20:26 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Jeff Layton X-Patchwork-Id: 12802352 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by smtp.lore.kernel.org (Postfix) with ESMTP id 81B86C43219 for ; Wed, 6 Apr 2022 04:16:25 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1386507AbiDFENj (ORCPT ); Wed, 6 Apr 2022 00:13:39 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38702 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1573604AbiDETXZ (ORCPT ); Tue, 5 Apr 2022 15:23:25 -0400 Received: from sin.source.kernel.org (sin.source.kernel.org [IPv6:2604:1380:40e1:4800::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A622752E44; Tue, 5 Apr 2022 12:21:25 -0700 (PDT) Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sin.source.kernel.org (Postfix) with ESMTPS id 1BB1ACE1D71; Tue, 5 Apr 2022 19:21:24 +0000 (UTC) Received: by smtp.kernel.org (Postfix) with ESMTPSA id D216DC385A5; Tue, 5 Apr 2022 19:21:21 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1649186482; bh=FJBkwUgyhXk/5NkW1/W+Deix1TBTinD9Cdy4DVDu0x4=; 
h=From:To:Cc:Subject:Date:In-Reply-To:References:From; b=GzfD7WsZm2TcrLhQPrzqyBGMwXnYvyuVLq3sITKkFZGDD2XjkZXLF1f1WIvEyt2mk 49BMjClt8PDRhIaL5godjcGz0XEjtO1hM/sDmvEufUCexokZekigsD7kBU7F6gAqB8 Ym5ZpCWJg/B2y2L26q2LdC+R6M3ZXarU8yIZsneiPqIVDbtmGF45ILkgdC1raK9By2 riLuJetycYMAWFpV2GdMERzN6DfcUwCSXVYyUa2GxIyrG5nf64YK8BINcvGDYM8k/e RJnHT3f1T8sBG6wvLgF+yNWQQtGxreg2LC16B68MHEq2J634qd7q/3M1BJO7CF8VlZ z4zPvBhnjyZXg== From: Jeff Layton To: idryomov@gmail.com, xiubli@redhat.com Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org, linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org, lhenriques@suse.de Subject: [PATCH v13 55/59] ceph: plumb in decryption during sync reads Date: Tue, 5 Apr 2022 15:20:26 -0400 Message-Id: <20220405192030.178326-56-jlayton@kernel.org> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org> References: <20220405192030.178326-1-jlayton@kernel.org> MIME-Version: 1.0 Precedence: bulk List-ID: X-Mailing-List: ceph-devel@vger.kernel.org Switch to using sparse reads when the inode is encrypted. Note that the crypto block may be smaller than a page, but the reverse cannot be true. 
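
To make the offset bookkeeping concrete, here is a user-space sketch
under stated assumptions. demo_adjust_off_and_len() is a guess at what
ceph_fscrypt_adjust_off_and_len() does for an encrypted inode (round the
start down and the end up to a crypto block boundary), and the 4k block
size is assumed. demo_usable_bytes() follows the trimming this patch
applies to the decrypted byte count.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed stand-ins for CEPH_FSCRYPT_BLOCK_SHIFT and friends. */
#define DEMO_BLOCK_SHIFT 12
#define DEMO_BLOCK_SIZE  ((uint64_t)1 << DEMO_BLOCK_SHIFT)
#define DEMO_BLOCK_MASK  (~(DEMO_BLOCK_SIZE - 1))

/* Expand [*off, *off + *len) so that it covers whole crypto blocks. */
static inline void demo_adjust_off_and_len(uint64_t *off, uint64_t *len)
{
	uint64_t end;

	assert(*len > 0);
	end = (*off + *len + DEMO_BLOCK_SIZE - 1) & DEMO_BLOCK_MASK;
	*off &= DEMO_BLOCK_MASK;
	*len = end - *off;
}

/*
 * Given the number of decrypted bytes starting at read_off, work out how
 * many belong to the caller's original request of len bytes at off.
 */
static inline int64_t demo_usable_bytes(int64_t decrypted, uint64_t off,
					uint64_t read_off, uint64_t len)
{
	/* account for any partial block at the beginning */
	int64_t fret = decrypted - (int64_t)(off - read_off);

	/* short read that ends before off: nothing is usable */
	if (fret < 0)
		fret = 0;

	/* account for any partial block at the end */
	return fret < (int64_t)len ? fret : (int64_t)len;
}
```

For a 100 byte read at offset 5000, the request grows to one full block
at 4096, and of the 4096 decrypted bytes only the 100 starting 904 bytes
into the block are handed back to the caller.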
Signed-off-by: Jeff Layton
---
 fs/ceph/file.c | 89 ++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 65 insertions(+), 24 deletions(-)

diff --git a/fs/ceph/file.c b/fs/ceph/file.c
index 522189ed6642..5d39d8e54273 100644
--- a/fs/ceph/file.c
+++ b/fs/ceph/file.c
@@ -938,7 +938,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 	u64 off = *ki_pos;
 	u64 len = iov_iter_count(to);
 	u64 i_size = i_size_read(inode);
-	bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD);
+	bool sparse = IS_ENCRYPTED(inode) || ceph_test_mount_opt(fsc, SPARSEREAD);
 	u64 objver = 0;
 
 	dout("sync_read on inode %p %llx~%llx\n", inode, *ki_pos, len);
@@ -966,10 +966,19 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 		int idx;
 		size_t left;
 		struct ceph_osd_req_op *op;
+		u64 read_off = off;
+		u64 read_len = len;
+
+		/* determine new offset/length if encrypted */
+		ceph_fscrypt_adjust_off_and_len(inode, &read_off, &read_len);
+
+		dout("sync_read orig %llu~%llu reading %llu~%llu",
+		     off, len, read_off, read_len);
 
 		req = ceph_osdc_new_request(osdc, &ci->i_layout,
-					ci->i_vino, off, &len, 0, 1,
-					sparse ? CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ,
+					ci->i_vino, read_off, &read_len, 0, 1,
+					sparse ? CEPH_OSD_OP_SPARSE_READ :
+						 CEPH_OSD_OP_READ,
 					CEPH_OSD_FLAG_READ,
 					NULL, ci->i_truncate_seq,
 					ci->i_truncate_size, false);
@@ -978,10 +987,13 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 			break;
 		}
 
+		/* adjust len downward if the request truncated the len */
+		if (off + len > read_off + read_len)
+			len = read_off + read_len - off;
 		more = len < iov_iter_count(to);
 
-		num_pages = calc_pages_for(off, len);
-		page_off = off & ~PAGE_MASK;
+		num_pages = calc_pages_for(read_off, read_len);
+		page_off = offset_in_page(off);
 		pages = ceph_alloc_page_vector(num_pages, GFP_KERNEL);
 		if (IS_ERR(pages)) {
 			ceph_osdc_put_request(req);
@@ -989,7 +1001,8 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 			break;
 		}
 
-		osd_req_op_extent_osd_data_pages(req, 0, pages, len, page_off,
+		osd_req_op_extent_osd_data_pages(req, 0, pages, read_len,
+						 offset_in_page(read_off),
 						 false, false);
 
 		op = &req->r_ops[0];
@@ -1008,7 +1021,7 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 		ceph_update_read_metrics(&fsc->mdsc->metric,
 					 req->r_start_latency,
 					 req->r_end_latency,
-					 len, ret);
+					 read_len, ret);
 
 		if (ret > 0)
 			objver = req->r_version;
@@ -1023,8 +1036,34 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 		else if (ret == -ENOENT)
 			ret = 0;
 
+		if (ret > 0 && IS_ENCRYPTED(inode)) {
+			int fret;
+
+			fret = ceph_fscrypt_decrypt_extents(inode, pages, read_off,
+					op->extent.sparse_ext, op->extent.sparse_ext_cnt);
+			if (fret < 0) {
+				ret = fret;
+				ceph_osdc_put_request(req);
+				break;
+			}
+
+			/* account for any partial block at the beginning */
+			fret -= (off - read_off);
+
+			/*
+			 * Short read after big offset adjustment?
+			 * Nothing is usable, just call it a zero
+			 * len read.
+			 */
+			fret = max(fret, 0);
+
+			/* account for partial block at the end */
+			ret = min_t(ssize_t, fret, len);
+		}
+
 		ceph_osdc_put_request(req);
 
+		/* Short read but not EOF? Zero out the remainder. */
 		if (ret >= 0 && ret < len && (off + ret < i_size)) {
 			int zlen = min(len - ret, i_size - off - ret);
 			int zoff = page_off + ret;
@@ -1038,15 +1077,16 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 		idx = 0;
 		left = ret > 0 ? ret : 0;
 		while (left > 0) {
-			size_t len, copied;
-			page_off = off & ~PAGE_MASK;
-			len = min_t(size_t, left, PAGE_SIZE - page_off);
+			size_t plen, copied;
+
+			plen = min_t(size_t, left, PAGE_SIZE - page_off);
 			SetPageUptodate(pages[idx]);
 			copied = copy_page_to_iter(pages[idx++],
-						   page_off, len, to);
+						   page_off, plen, to);
 			off += copied;
 			left -= copied;
-			if (copied < len) {
+			page_off = 0;
+			if (copied < plen) {
 				ret = -EFAULT;
 				break;
 			}
@@ -1063,20 +1103,21 @@ ssize_t __ceph_sync_read(struct inode *inode, loff_t *ki_pos,
 			break;
 	}
 
-	if (off > *ki_pos) {
-		if (off >= i_size) {
-			*retry_op = CHECK_EOF;
-			ret = i_size - *ki_pos;
-			*ki_pos = i_size;
-		} else {
-			ret = off - *ki_pos;
-			*ki_pos = off;
+	if (ret > 0) {
+		if (off > *ki_pos) {
+			if (off >= i_size) {
+				*retry_op = CHECK_EOF;
+				ret = i_size - *ki_pos;
+				*ki_pos = i_size;
+			} else {
+				ret = off - *ki_pos;
+				*ki_pos = off;
+			}
 		}
-	}
-
-	if (last_objver && ret > 0)
-		*last_objver = objver;
 
+		if (last_objver)
+			*last_objver = objver;
+	}
 	dout("sync_read result %zd retry_op %d\n", ret, *retry_op);
 	return ret;
 }

From patchwork Tue Apr 5 19:20:27 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 12802388
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id DC44DC433F5
	for ; Wed, 6 Apr 2022 04:18:56 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1586460AbiDFEQp (ORCPT ); Wed, 6 Apr 2022 00:16:45 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38658 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1573602AbiDETXY (ORCPT ); Tue, 5 Apr 2022 15:23:24 -0400
Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 17F2E52B16;
	Tue, 5 Apr 2022 12:21:25 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id B8121B81FA4;
	Tue, 5 Apr 2022 19:21:23 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id BAEA5C385A1;
	Tue, 5 Apr 2022 19:21:22 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1649186483;
	bh=heaEs/fAbEMBmP6Wk4yEQLXufpxlZz2bCArq5qEYbdg=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=TNvJEiPAWRH4l6rhv/NWdlIf2sqqMXOf2HdAwv7gGpL2AKEwMGhZ5wxe1P/9gkMw/
	 uXorRYG/Rpde59wq8ReMnH8EZRTxd+awAhqhS4BVuRVJHClZpeBA031+daEqKa5Eue
	 yAYiAw0gNNgSE2cZL3tgVeMKbV5SAPfMm3Zm9xQwLRgUX/yz+wlqlTr/8tX08QCZHh
	 3Aue8bC2obiyqHSAp3PfTPw/OJvlfDlGyPD61nQup9vQdVEB8Smi6fAHQ33V+egR4R
	 0kcAKIaoZJp1jFdHtI6E6Qu2mnZfOC49aXd/ZqYvo1jyNaanN+iOAJ1BZIqEQc6iXP
	 o3MttLytL+6Zg==
From: Jeff Layton
To: idryomov@gmail.com, xiubli@redhat.com
Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org,
	lhenriques@suse.de
Subject: [PATCH v13 56/59] ceph: add fscrypt decryption support to ceph_netfs_issue_op
Date: Tue, 5 Apr 2022 15:20:27 -0400
Message-Id: <20220405192030.178326-57-jlayton@kernel.org>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org>
References: <20220405192030.178326-1-jlayton@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Force the use of sparse reads when the inode is encrypted, and add the
appropriate code to decrypt the extent map after receiving.

Signed-off-by: Jeff Layton
---
 fs/ceph/addr.c | 32 +++++++++++++++++++++++---------
 1 file changed, 23 insertions(+), 9 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index 99021431a391..bcb74b1d46bb 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -18,6 +18,7 @@
 #include "mds_client.h"
 #include "cache.h"
 #include "metric.h"
+#include "crypto.h"
 
 #include
 #include
@@ -216,7 +217,8 @@ static bool ceph_netfs_clamp_length(struct netfs_io_subrequest *subreq)
 
 static void finish_netfs_read(struct ceph_osd_request *req)
 {
-	struct ceph_fs_client *fsc = ceph_inode_to_client(req->r_inode);
+	struct inode *inode = req->r_inode;
+	struct ceph_fs_client *fsc = ceph_inode_to_client(inode);
 	struct ceph_osd_data *osd_data = osd_req_op_extent_osd_data(req, 0);
 	struct netfs_io_subrequest *subreq = req->r_priv;
 	struct ceph_osd_req_op *op = &req->r_ops[0];
@@ -231,15 +233,24 @@ static void finish_netfs_read(struct ceph_osd_request *req)
 	     subreq->len, i_size_read(req->r_inode));
 
 	/* no object means success but no data */
-	if (sparse && err >= 0)
-		err = ceph_sparse_ext_map_end(op);
-	else if (err == -ENOENT)
+	if (err == -ENOENT)
 		err = 0;
 	else if (err == -EBLOCKLISTED)
 		fsc->blocklisted = true;
 
-	if (err >= 0 && err < subreq->len)
-		__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
+	if (err >= 0) {
+		if (sparse && err > 0)
+			err = ceph_sparse_ext_map_end(op);
+		if (err < subreq->len)
+			__set_bit(NETFS_SREQ_CLEAR_TAIL, &subreq->flags);
+		if (IS_ENCRYPTED(inode) && err > 0) {
+			err = ceph_fscrypt_decrypt_extents(inode, osd_data->pages,
+					subreq->start, op->extent.sparse_ext,
+					op->extent.sparse_ext_cnt);
+			if (err > subreq->len)
+				err = subreq->len;
+		}
+	}
 
 	netfs_subreq_terminated(subreq, err, true);
 
@@ -314,13 +325,16 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq)
 	size_t page_off;
 	int err = 0;
 	u64 len = subreq->len;
-	bool sparse = ceph_test_mount_opt(fsc, SPARSEREAD);
+	bool sparse = IS_ENCRYPTED(inode) || ceph_test_mount_opt(fsc, SPARSEREAD);
+	u64 off = subreq->start;
 
 	if (ci->i_inline_version != CEPH_INLINE_NONE &&
 	    ceph_netfs_issue_op_inline(subreq))
 		return;
 
-	req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, subreq->start, &len,
+	ceph_fscrypt_adjust_off_and_len(inode, &off, &len);
+
+	req = ceph_osdc_new_request(&fsc->client->osdc, &ci->i_layout, vino, off, &len,
 			0, 1, sparse ? CEPH_OSD_OP_SPARSE_READ : CEPH_OSD_OP_READ,
 			CEPH_OSD_FLAG_READ | fsc->client->osdc.client->options->read_from_replica,
 			NULL, ci->i_truncate_seq, ci->i_truncate_size, false);
@@ -339,7 +353,7 @@ static void ceph_netfs_issue_read(struct netfs_io_subrequest *subreq)
 	}
 
 	dout("%s: pos=%llu orig_len=%zu len=%llu\n", __func__, subreq->start, subreq->len, len);
-	iov_iter_xarray(&iter, READ, &rreq->mapping->i_pages, subreq->start, len);
+	iov_iter_xarray(&iter, READ, &rreq->mapping->i_pages, off, len);
 	err = iov_iter_get_pages_alloc(&iter, &pages, len, &page_off);
 	if (err < 0) {
 		dout("%s: iov_ter_get_pages_alloc returned %d\n", __func__, err);

From patchwork Tue Apr 5 19:20:28 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 12802353
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id A8AFCC433EF
	for ; Wed, 6 Apr 2022 04:16:25 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1389606AbiDFEOA (ORCPT ); Wed, 6 Apr 2022 00:14:00 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38704 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1573605AbiDETXZ (ORCPT ); Tue, 5 Apr 2022 15:23:25 -0400
Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id ED25452E58;
	Tue, 5 Apr 2022 12:21:25 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id A0ED6B81F6B;
	Tue, 5 Apr 2022 19:21:24 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id A35DAC385A0;
	Tue, 5 Apr 2022 19:21:23 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1649186484;
	bh=NzNofCR/iPD4n8NmCcwDkvJcDdRZrg0H4oSDZpMwm5g=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=I2PqqTco0eMmgV7MjGFlRaVDoaN5RCLPnK+jV42sF+/aQKvyPvPzv0AWOaarOYh5A
	 zrxsUfI1R79VQvVjmBronkc9hmuB/SaHF9DqGcWZxVEO4suT33IazwZlfEwegj7PzP
	 OBuasJ7pb+YWAi64TvsadzW4/bLHVHzv+wIJe+qPOytJRjQgtEUBgymDbnAeYWDMIV
	 M2ObCdNL9NWbYJFvDDPrUTuM/kwwS+4EEkYcCGLiPTPMDjfY9bWFm0D21GoKhhuXfS
	 9sa7DSt4wMMrh+U6D1MBu+Txl9CoPyvY2nfHKCIpYBbsJEd7tGPpScMUsm6NzeOG54
	 Se4mOcQXBNZig==
From: Jeff Layton
To: idryomov@gmail.com, xiubli@redhat.com
Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org,
	lhenriques@suse.de
Subject: [PATCH v13 57/59] ceph: set i_blkbits to crypto block size for encrypted inodes
Date: Tue, 5 Apr 2022 15:20:28 -0400
Message-Id: <20220405192030.178326-58-jlayton@kernel.org>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org>
References: <20220405192030.178326-1-jlayton@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Some of the underlying infrastructure for fscrypt relies on i_blkbits
being aligned to the crypto blocksize.

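The selection order this patch sets up can be sketched in plain C as
follows. This is a hedged illustration only: the shift values (12 for the
crypto block, 22 for the default) and the names pick_blkbits() and fls_u32()
are assumptions for the example, with fls_u32() standing in for the kernel's
fls().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define FSCRYPT_BLOCK_SHIFT 12	/* assumed: 4 KiB crypto blocks */
#define DEFAULT_BLOCK_SHIFT 22	/* assumed: 4 MiB default block */

static_assert(DEFAULT_BLOCK_SHIFT < 32, "shift must describe a 32-bit size");

/* 1-based position of the highest set bit; 0 when x == 0 (same contract as fls()). */
static unsigned int fls_u32(uint32_t x)
{
	unsigned int pos = 0;

	while (x) {
		pos++;
		x >>= 1;
	}
	return pos;
}

static unsigned int pick_blkbits(bool encrypted, uint32_t stripe_unit)
{
	/* encrypted inodes always use the crypto block size */
	if (encrypted)
		return FSCRYPT_BLOCK_SHIFT;
	/* otherwise derive the shift from the layout's stripe unit */
	if (stripe_unit)
		return fls_u32(stripe_unit) - 1;
	/* directories report a stripe unit of zero */
	return DEFAULT_BLOCK_SHIFT;
}
```

With these assumed values, an encrypted file with a 4 MiB stripe unit gets a
shift of 12 rather than 22.
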
Signed-off-by: Jeff Layton
---
 fs/ceph/inode.c | 16 +++++++++-------
 1 file changed, 9 insertions(+), 7 deletions(-)

diff --git a/fs/ceph/inode.c b/fs/ceph/inode.c
index eb8f066975a8..45ca4e598ef0 100644
--- a/fs/ceph/inode.c
+++ b/fs/ceph/inode.c
@@ -976,13 +976,6 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
 	issued |= __ceph_caps_dirty(ci);
 	new_issued = ~issued & info_caps;
 
-	/* directories have fl_stripe_unit set to zero */
-	if (le32_to_cpu(info->layout.fl_stripe_unit))
-		inode->i_blkbits =
-			fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
-	else
-		inode->i_blkbits = CEPH_BLOCK_SHIFT;
-
 	__ceph_update_quota(ci, iinfo->max_bytes, iinfo->max_files);
 
 #ifdef CONFIG_FS_ENCRYPTION
@@ -1008,6 +1001,15 @@ int ceph_fill_inode(struct inode *inode, struct page *locked_page,
 		ceph_decode_timespec64(&ci->i_snap_btime, &iinfo->snap_btime);
 	}
 
+	/* directories have fl_stripe_unit set to zero */
+	if (IS_ENCRYPTED(inode))
+		inode->i_blkbits = CEPH_FSCRYPT_BLOCK_SHIFT;
+	else if (le32_to_cpu(info->layout.fl_stripe_unit))
+		inode->i_blkbits =
+			fls(le32_to_cpu(info->layout.fl_stripe_unit)) - 1;
+	else
+		inode->i_blkbits = CEPH_BLOCK_SHIFT;
+
 	if ((new_version || (new_issued & CEPH_CAP_LINK_SHARED)) &&
 	    (issued & CEPH_CAP_LINK_EXCL) == 0)
 		set_nlink(inode, le32_to_cpu(info->nlink));

From patchwork Tue Apr 5 19:20:29 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 12802373
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 9D616C41535
	for ; Wed, 6 Apr 2022 04:16:49 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1455680AbiDFEQK (ORCPT ); Wed, 6 Apr 2022 00:16:10 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38752 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1573606AbiDETX0 (ORCPT ); Tue, 5 Apr 2022 15:23:26 -0400
Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id EF12353719;
	Tue, 5 Apr 2022 12:21:26 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id 8E28FB81FA5;
	Tue, 5 Apr 2022 19:21:26 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 8C286C385A1;
	Tue, 5 Apr 2022 19:21:24 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1649186485;
	bh=UyhoaEfN26kChKGPTQMMPNGQiUm6otQmJCvQT6vOeVI=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=Hy9oPgfHZ178DRYZNlzmpYwSt9a7y7tuEeZdQVKXo6BFXh3FbqxXieJG9rJpHZbwr
	 P7x8hxMfmdHFV5RiqEFM5rWcfmudlsqLssv687p6QjMUu6SwsCoLxbsuHLErRslSnZ
	 kiWAqbjGV1ZDBSX67BSOYZBCKhklrUjmubSfNvSrG0Onj+kL6BvAnXVhlWoW5SzZBh
	 vXlbUu1zhEDHiX8nLpbU+Dek3h0Na6KtomU1EMElYgdL3yjKG5xTQJp2Chzd7lHCwB
	 ZAd7u8J3JnCl6w+hr0UUpt+CXrWJOVnC71hgzPzvmx9mAAHpROTHnCeBPpr14vfcOR
	 qR/RLSYhkRy9g==
From: Jeff Layton
To: idryomov@gmail.com, xiubli@redhat.com
Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org,
	lhenriques@suse.de
Subject: [PATCH v13 58/59] ceph: add encryption support to writepage
Date: Tue, 5 Apr 2022 15:20:29 -0400
Message-Id: <20220405192030.178326-59-jlayton@kernel.org>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org>
References: <20220405192030.178326-1-jlayton@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Allow writepage to issue encrypted writes.
Extend out the requested size and offset to cover complete blocks, and
then encrypt and write them to the OSDs.

Signed-off-by: Jeff Layton
---
 fs/ceph/addr.c | 34 +++++++++++++++++++++++++++-------
 1 file changed, 27 insertions(+), 7 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index bcb74b1d46bb..ff015f251fcc 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -584,10 +584,12 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 	loff_t page_off = page_offset(page);
 	int err;
 	loff_t len = thp_size(page);
+	loff_t wlen;
 	struct ceph_writeback_ctl ceph_wbc;
 	struct ceph_osd_client *osdc = &fsc->client->osdc;
 	struct ceph_osd_request *req;
 	bool caching = ceph_is_cache_enabled(inode);
+	struct page *bounce_page = NULL;
 
 	dout("writepage %p idx %lu\n", page, page->index);
 
@@ -619,6 +621,8 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 
 	if (ceph_wbc.i_size < page_off + len)
 		len = ceph_wbc.i_size - page_off;
+	wlen = IS_ENCRYPTED(inode) ?
+		round_up(len, CEPH_FSCRYPT_BLOCK_SIZE) : len;
 
 	dout("writepage %p page %p index %lu on %llu~%llu snapc %p seq %lld\n",
 	     inode, page, page->index, page_off, len, snapc, snapc->seq);
@@ -627,22 +631,37 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 	    CONGESTION_ON_THRESH(fsc->mount_options->congestion_kb))
 		fsc->write_congested = true;
 
-	req = ceph_osdc_new_request(osdc, &ci->i_layout, ceph_vino(inode), page_off, &len, 0, 1,
-				    CEPH_OSD_OP_WRITE, CEPH_OSD_FLAG_WRITE, snapc,
-				    ceph_wbc.truncate_seq, ceph_wbc.truncate_size,
-				    true);
+	req = ceph_osdc_new_request(osdc, &ci->i_layout, ceph_vino(inode),
+				    page_off, &wlen, 0, 1, CEPH_OSD_OP_WRITE,
+				    CEPH_OSD_FLAG_WRITE, snapc,
+				    ceph_wbc.truncate_seq,
+				    ceph_wbc.truncate_size, true);
 	if (IS_ERR(req))
 		return PTR_ERR(req);
 
+	if (wlen < len)
+		len = wlen;
+
 	set_page_writeback(page);
 	if (caching)
 		ceph_set_page_fscache(page);
 	ceph_fscache_write_to_cache(inode, page_off, len, caching);
 
+	if (IS_ENCRYPTED(inode)) {
+		bounce_page = fscrypt_encrypt_pagecache_blocks(page, CEPH_FSCRYPT_BLOCK_SIZE,
+								0, GFP_NOFS);
+		if (IS_ERR(bounce_page)) {
+			err = PTR_ERR(bounce_page);
+			goto out;
+		}
+	}
 	/* it may be a short write due to an object boundary */
 	WARN_ON_ONCE(len > thp_size(page));
-	osd_req_op_extent_osd_data_pages(req, 0, &page, len, 0, false, false);
-	dout("writepage %llu~%llu (%llu bytes)\n", page_off, len, len);
+	osd_req_op_extent_osd_data_pages(req, 0,
+			bounce_page ? &bounce_page : &page, wlen, 0,
+			false, false);
+	dout("writepage %llu~%llu (%llu bytes, %sencrypted)\n",
+	     page_off, len, wlen, IS_ENCRYPTED(inode) ? "" : "not ");
 
 	req->r_mtime = inode->i_mtime;
 	err = ceph_osdc_start_request(osdc, req, true);
@@ -651,7 +670,8 @@ static int writepage_nounlock(struct page *page, struct writeback_control *wbc)
 
 	ceph_update_write_metrics(&fsc->mdsc->metric, req->r_start_latency,
 				  req->r_end_latency, len, err);
-
+	fscrypt_free_bounce_page(bounce_page);
+out:
 	ceph_osdc_put_request(req);
 	if (err == 0)
 		err = len;

From patchwork Tue Apr 5 19:20:30 2022
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Jeff Layton
X-Patchwork-Id: 12802339
Return-Path:
X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on
	aws-us-west-2-korg-lkml-1.web.codeaurora.org
Received: from vger.kernel.org (vger.kernel.org [23.128.96.18])
	by smtp.lore.kernel.org (Postfix) with ESMTP id 375CEC433EF
	for ; Wed, 6 Apr 2022 04:06:39 +0000 (UTC)
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S244794AbiDFEIF (ORCPT ); Wed, 6 Apr 2022 00:08:05 -0400
Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38822 "EHLO
	lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1573607AbiDETX1 (ORCPT ); Tue, 5 Apr 2022 15:23:27 -0400
Received: from ams.source.kernel.org (ams.source.kernel.org [IPv6:2604:1380:4601:e00::1])
	by lindbergh.monkeyblade.net (Postfix) with ESMTPS id
	DF22653732; Tue, 5 Apr 2022 12:21:27 -0700 (PDT)
Received: from smtp.kernel.org (relay.kernel.org [52.25.139.140])
	(using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits))
	(No client certificate requested)
	by ams.source.kernel.org (Postfix) with ESMTPS id 80596B81F6B;
	Tue, 5 Apr 2022 19:21:27 +0000 (UTC)
Received: by smtp.kernel.org (Postfix) with ESMTPSA id 74EFAC385A3;
	Tue, 5 Apr 2022 19:21:25 +0000 (UTC)
DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org;
	s=k20201202; t=1649186486;
	bh=1HTi64VCd1Ogi3wR7tPqfrc9qpnlV1qUcf0emucjZvI=;
	h=From:To:Cc:Subject:Date:In-Reply-To:References:From;
	b=heaSs3yLTTKz8Q6XQgWZcMqi2Gv8r6shJTd9gP+sVI2tTw0pUrhm3Y0dNox/Ze+Ql
	 A84yG8eq2JyhO2dO4U2kfTaYxJSOU+lb0zg42760/buCJVAYkLbRQ/PHnte83KAQjk
	 krqBHw48PntS95l9keAXe9d1+U60mQ+wVle+PGY30AQgKuyyIugC1+MlFjf5Lcf+6g
	 osK8aF0k1BU21pd7LrGoKbv+1hn+8QB5iWrJMbGk85kTiCvXM6YyTYfWPn7yCrdZsX
	 rt+7HQOW7hlv+c2Q5se3S/tBYZazhxRY/PZGbgmsb/58yeqtJvRdccmG+HgHFKqtks
	 7s5wGK1T0zPBg==
From: Jeff Layton
To: idryomov@gmail.com, xiubli@redhat.com
Cc: ceph-devel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-fscrypt@vger.kernel.org, linux-kernel@vger.kernel.org,
	lhenriques@suse.de
Subject: [PATCH v13 59/59] ceph: fscrypt support for writepages
Date: Tue, 5 Apr 2022 15:20:30 -0400
Message-Id: <20220405192030.178326-60-jlayton@kernel.org>
X-Mailer: git-send-email 2.35.1
In-Reply-To: <20220405192030.178326-1-jlayton@kernel.org>
References: <20220405192030.178326-1-jlayton@kernel.org>
MIME-Version: 1.0
Precedence: bulk
List-ID:
X-Mailing-List: ceph-devel@vger.kernel.org

Add the appropriate machinery to write back dirty data with encryption.

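One small piece of that machinery is keeping the final write extent aligned
to the crypto block: the length is rounded up to a whole block and a
misaligned offset or length is flagged. A minimal userspace sketch of that
check follows; the 4 KiB block size and the names round_up_block() and
extent_is_block_aligned() are assumptions for illustration, not the kernel
helpers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define CRYPT_BLOCK_SIZE UINT64_C(4096)	/* assumed crypto block size */
#define CRYPT_BLOCK_MASK (~(CRYPT_BLOCK_SIZE - 1))

static_assert((CRYPT_BLOCK_SIZE & (CRYPT_BLOCK_SIZE - 1)) == 0,
	      "block size must be a power of two");

/* Round len up to the next multiple of the block size. */
static uint64_t round_up_block(uint64_t len)
{
	return (len + CRYPT_BLOCK_SIZE - 1) & CRYPT_BLOCK_MASK;
}

/* True when both offset and len are multiples of the block size. */
static bool extent_is_block_aligned(uint64_t offset, uint64_t len)
{
	/* OR-ing the values first lets one mask test cover both */
	return ((offset | len) & ~CRYPT_BLOCK_MASK) == 0;
}
```

A caller would round the length first and then treat a failed alignment
check as a warning condition rather than silently writing a partial block.
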
Signed-off-by: Jeff Layton
---
 fs/ceph/addr.c   | 63 +++++++++++++++++++++++++++++++++++++++---------
 fs/ceph/crypto.h | 18 +++++++++++++-
 2 files changed, 68 insertions(+), 13 deletions(-)

diff --git a/fs/ceph/addr.c b/fs/ceph/addr.c
index ff015f251fcc..939819a1cf41 100644
--- a/fs/ceph/addr.c
+++ b/fs/ceph/addr.c
@@ -545,10 +545,12 @@ static u64 get_writepages_data_length(struct inode *inode,
 				      struct page *page, u64 start)
 {
 	struct ceph_inode_info *ci = ceph_inode(inode);
-	struct ceph_snap_context *snapc = page_snap_context(page);
+	struct ceph_snap_context *snapc;
 	struct ceph_cap_snap *capsnap = NULL;
 	u64 end = i_size_read(inode);
+	u64 ret;
 
+	snapc = page_snap_context(ceph_fscrypt_pagecache_page(page));
 	if (snapc != ci->i_head_snapc) {
 		bool found = false;
 		spin_lock(&ci->i_ceph_lock);
@@ -563,9 +565,12 @@ static u64 get_writepages_data_length(struct inode *inode,
 		spin_unlock(&ci->i_ceph_lock);
 		WARN_ON(!found);
 	}
-	if (end > page_offset(page) + thp_size(page))
-		end = page_offset(page) + thp_size(page);
-	return end > start ? end - start : 0;
+	if (end > ceph_fscrypt_page_offset(page) + thp_size(page))
+		end = ceph_fscrypt_page_offset(page) + thp_size(page);
+	ret = end > start ? end - start : 0;
+	if (ret && fscrypt_is_bounce_page(page))
+		ret = round_up(ret, CEPH_FSCRYPT_BLOCK_SIZE);
+	return ret;
 }
 
 /*
@@ -787,6 +792,11 @@ static void writepages_finish(struct ceph_osd_request *req)
 		total_pages += num_pages;
 		for (j = 0; j < num_pages; j++) {
 			page = osd_data->pages[j];
+			if (fscrypt_is_bounce_page(page)) {
+				page = fscrypt_pagecache_page(page);
+				fscrypt_free_bounce_page(osd_data->pages[j]);
+				osd_data->pages[j] = page;
+			}
 			BUG_ON(!page);
 			WARN_ON(!PageUptodate(page));
 
@@ -1048,9 +1058,28 @@ static int ceph_writepages_start(struct address_space *mapping,
 					fsc->mount_options->congestion_kb))
 				fsc->write_congested = true;
 
-			pages[locked_pages++] = page;
-			pvec.pages[i] = NULL;
+			if (IS_ENCRYPTED(inode)) {
+				pages[locked_pages] =
+					fscrypt_encrypt_pagecache_blocks(page,
+						PAGE_SIZE, 0,
+						locked_pages ? GFP_NOWAIT : GFP_NOFS);
+				if (IS_ERR(pages[locked_pages])) {
+					if (PTR_ERR(pages[locked_pages]) == -EINVAL)
+						pr_err("%s: inode->i_blkbits=%hhu\n",
+							__func__, inode->i_blkbits);
+					/* better not fail on first page! */
+					BUG_ON(locked_pages == 0);
+					pages[locked_pages] = NULL;
+					redirty_page_for_writepage(wbc, page);
+					unlock_page(page);
+					break;
+				}
+				++locked_pages;
+			} else {
+				pages[locked_pages++] = page;
+			}
 
+			pvec.pages[i] = NULL;
 			len += thp_size(page);
 		}
 
@@ -1078,7 +1107,7 @@ static int ceph_writepages_start(struct address_space *mapping,
 		}
 
 new_request:
-		offset = page_offset(pages[0]);
+		offset = ceph_fscrypt_page_offset(pages[0]);
 		len = wsize;
 
 		req = ceph_osdc_new_request(&fsc->client->osdc,
@@ -1099,8 +1128,8 @@ static int ceph_writepages_start(struct address_space *mapping,
 						ceph_wbc.truncate_size, true);
 			BUG_ON(IS_ERR(req));
 		}
-		BUG_ON(len < page_offset(pages[locked_pages - 1]) +
-			     thp_size(page) - offset);
+		BUG_ON(len < ceph_fscrypt_page_offset(pages[locked_pages - 1]) +
+			     thp_size(pages[locked_pages - 1]) - offset);
 
 		req->r_callback = writepages_finish;
 		req->r_inode = inode;
@@ -1110,7 +1139,9 @@ static int ceph_writepages_start(struct address_space *mapping,
 		data_pages = pages;
 		op_idx = 0;
 		for (i = 0; i < locked_pages; i++) {
-			u64 cur_offset = page_offset(pages[i]);
+			struct page *page = ceph_fscrypt_pagecache_page(pages[i]);
+
+			u64 cur_offset = page_offset(page);
 			/*
 			 * Discontinuity in page range? Ceph can handle that by just passing
 			 * multiple extents in the write op.
@@ -1139,9 +1170,9 @@ static int ceph_writepages_start(struct address_space *mapping,
 				op_idx++;
 			}
 
-			set_page_writeback(pages[i]);
+			set_page_writeback(page);
 			if (caching)
-				ceph_set_page_fscache(pages[i]);
+				ceph_set_page_fscache(page);
 			len += thp_size(page);
 		}
 		ceph_fscache_write_to_cache(inode, offset, len, caching);
@@ -1157,8 +1188,16 @@ static int ceph_writepages_start(struct address_space *mapping,
 							 offset);
 			len = max(len, min_len);
 		}
+		if (IS_ENCRYPTED(inode))
+			len = round_up(len, CEPH_FSCRYPT_BLOCK_SIZE);
+
 		dout("writepages got pages at %llu~%llu\n", offset, len);
 
+		if (IS_ENCRYPTED(inode) &&
+		    ((offset | len) & ~CEPH_FSCRYPT_BLOCK_MASK))
+			pr_warn("%s: bad encrypted write offset=%lld len=%llu\n",
+				__func__, offset, len);
+
 		osd_req_op_extent_osd_data_pages(req, op_idx, data_pages, len,
 						 0, from_pool, false);
 		osd_req_op_extent_update(req, op_idx, len);
diff --git a/fs/ceph/crypto.h b/fs/ceph/crypto.h
index 92a7b221a975..0cf526f07567 100644
--- a/fs/ceph/crypto.h
+++ b/fs/ceph/crypto.h
@@ -146,6 +146,12 @@ int ceph_fscrypt_decrypt_extents(struct inode *inode, struct page **page, u64 of
 				 struct ceph_sparse_extent *map, u32 ext_cnt);
 int ceph_fscrypt_encrypt_pages(struct inode *inode, struct page **page, u64 off,
 				int len, gfp_t gfp);
+
+static inline struct page *ceph_fscrypt_pagecache_page(struct page *page)
+{
+	return fscrypt_is_bounce_page(page) ? fscrypt_pagecache_page(page) : page;
+}
+
 #else /* CONFIG_FS_ENCRYPTION */
 
 static inline void ceph_fscrypt_set_ops(struct super_block *sb)
@@ -235,6 +241,16 @@ static inline int ceph_fscrypt_encrypt_pages(struct inode *inode, struct page **
 {
 	return 0;
 }
+
+static inline struct page *ceph_fscrypt_pagecache_page(struct page *page)
+{
+	return page;
+}
 #endif /* CONFIG_FS_ENCRYPTION */
 
-#endif
+static inline loff_t ceph_fscrypt_page_offset(struct page *page)
+{
+	return page_offset(ceph_fscrypt_pagecache_page(page));
+}
+
+#endif /* _CEPH_CRYPTO_H */