diff mbox series

NFSD: nfsctl: remove read permission of filehandle

Message ID 20240319104714.1178-1-chenhx.fnst@fujitsu.com (mailing list archive)
State New
Headers show
Series NFSD: nfsctl: remove read permission of filehandle | expand

Commit Message

Chen Hanxiao March 19, 2024, 10:47 a.m. UTC
As write_filehandle can't accept zero-bytes writting, 
remove read permission of /proc/fs/nfsd/filehandle

Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
---
 fs/nfsd/nfsctl.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Jeffrey Layton March 19, 2024, 1:09 p.m. UTC | #1
On Tue, 2024-03-19 at 18:47 +0800, Chen Hanxiao wrote:
> As write_filehandle can't accept zero-bytes writting, 
> remove read permission of /proc/fs/nfsd/filehandle
> 
> Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
> ---
>  fs/nfsd/nfsctl.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
> index ecd18bffeebc..da1387f1e30a 100644
> --- a/fs/nfsd/nfsctl.c
> +++ b/fs/nfsd/nfsctl.c
> @@ -1346,7 +1346,7 @@ static int nfsd_fill_super(struct super_block *sb, struct fs_context *fc)
>  					&transaction_ops, S_IWUSR|S_IRUSR},
>  		[NFSD_FO_UnlockFS] = {"unlock_filesystem",
>  					&transaction_ops, S_IWUSR|S_IRUSR},
> -		[NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR|S_IRUSR},
> +		[NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR},
>  		[NFSD_Threads] = {"threads", &transaction_ops, S_IWUSR|S_IRUSR},
>  		[NFSD_Pool_Threads] = {"pool_threads", &transaction_ops, S_IWUSR|S_IRUSR},
>  		[NFSD_Pool_Stats] = {"pool_stats", &pool_stats_operations, S_IRUGO},


I don't think we can do this. mountd does this to get the rootfh for an
exported fs and it'll probably start failing if it can't open that file
for read:

    f = open("/proc/fs/nfsd/filehandle", O_RDWR);

That write_filehandle interface sure is weird though. Could we get rid
of it by teaching cache_get_filehandle how to use name_to_handle_at
instead? That would be a lot cleaner and would get rid of yet another
dependency on /proc/fs/nfsd.
Chen Hanxiao March 20, 2024, 3:49 p.m. UTC | #2
> -----邮件原件-----
> 发件人: Jeff Layton <jlayton@kernel.org>
> 发送时间: 2024年3月19日 21:09
> 收件人: Chen, Hanxiaochenhx.fnst@fujitsu.com>; Chuck Lever
> <chuck.lever@oracle.com>; Neil Brown <neilb@suse.de>; Olga Kornievskaia
> <kolga@netapp.com>; Dai Ngo <Dai.Ngo@oracle.com>; Tom Talpey
> <tom@talpey.com>
> 抄送: linux-nfs@vger.kernel.org
> 主题: Re: [PATCH] NFSD: nfsctl: remove read permission of filehandle
> 
> On Tue, 2024-03-19 at 18:47 +0800, Chen Hanxiao wrote:
> > As write_filehandle can't accept zero-bytes writting,
> > remove read permission of /proc/fs/nfsd/filehandle
> >
> > Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
...
> 
> 
> I don't think we can do this. mountd does this to get the rootfh for an
> exported fs and it'll probably start failing if it can't open that file
> for read:
> 
>     f = open("/proc/fs/nfsd/filehandle", O_RDWR);
> 
> That write_filehandle interface sure is weird though. Could we get rid
> of it by teaching cache_get_filehandle how to use name_to_handle_at
> instead? That would be a lot cleaner and would get rid of yet another
> dependency on /proc/fs/nfsd.

name_to_handle_at return struct file_handle
cache_get_filehandle return hexed struct knfsd_fh

Could you please give some hints on how to convert from file_handle. f_handle to knfsd_fh.fh_raw?

Regards,
- Chen



> --
> Jeff Layton <jlayton@kernel.org>
Jeffrey Layton March 20, 2024, 4:27 p.m. UTC | #3
On Wed, 2024-03-20 at 15:49 +0000, Hanxiao Chen (Fujitsu) wrote:
> 
> > -----邮件原件-----
> > 发件人: Jeff Layton <jlayton@kernel.org>
> > 发送时间: 2024年3月19日 21:09
> > 收件人: Chen, Hanxiaochenhx.fnst@fujitsu.com>; Chuck Lever
> > <chuck.lever@oracle.com>; Neil Brown <neilb@suse.de>; Olga Kornievskaia
> > <kolga@netapp.com>; Dai Ngo <Dai.Ngo@oracle.com>; Tom Talpey
> > <tom@talpey.com>
> > 抄送: linux-nfs@vger.kernel.org
> > 主题: Re: [PATCH] NFSD: nfsctl: remove read permission of filehandle
> > 
> > On Tue, 2024-03-19 at 18:47 +0800, Chen Hanxiao wrote:
> > > As write_filehandle can't accept zero-bytes writting,
> > > remove read permission of /proc/fs/nfsd/filehandle
> > > 
> > > Signed-off-by: Chen Hanxiao <chenhx.fnst@fujitsu.com>
> ...
> > 
> > 
> > I don't think we can do this. mountd does this to get the rootfh for an
> > exported fs and it'll probably start failing if it can't open that file
> > for read:
> > 
> >     f = open("/proc/fs/nfsd/filehandle", O_RDWR);
> > 
> > That write_filehandle interface sure is weird though. Could we get rid
> > of it by teaching cache_get_filehandle how to use name_to_handle_at
> > instead? That would be a lot cleaner and would get rid of yet another
> > dependency on /proc/fs/nfsd.
> 
> name_to_handle_at return struct file_handle
> cache_get_filehandle return hexed struct knfsd_fh
> 
> Could you please give some hints on how to convert from file_handle. f_handle to knfsd_fh.fh_raw?
> 

I started looking at this yesterday. knfsd_fh is mostly constructed in
fh_compose. The exportfs_encode_fh call is in _fh_update. If we look at
the comments over knfsd_fh, it explains pretty well what we need to fill
out:

fh_version : should always be 1
fh_auth_type: always 0
fh_fsid_type: this is the tricky part (see below)
fh_fileid_type: returned by name_to_handle_at
fh_fsid: fsid concatenated with the fileid for the inode

name_to_handle_at should be able to fill out the fileid portion of the
knfsd_fh. The harder part is: how do we get the fh_fsid_type and its
content?

The types are shown here:

* The fsid_type identifies how the filesystem (or export point) is
 *    encoded.
 *  Current values:
 *     0  - 4 byte device id (ms-2-bytes major, ls-2-bytes minor), 4byte inode number
 *        NOTE: we cannot use the kdev_t device id value, because kdev_t.h
 *              says we mustn't.  We must break it up and reassemble.
 *     1  - 4 byte user specified identifier
 *     2  - 4 byte major, 4 byte minor, 4 byte inode number - DEPRECATED
 *     3  - 4 byte device id, encoded for user-space, 4 byte inode number
 *     4  - 4 byte inode number and 4 byte uuid
 *     5  - 8 byte uuid
 *     6  - 16 byte uuid
 *     7  - 8 byte inode number and 16 byte uuid

The kernel decides this in set_version_and_fsid_type, so I suppose we
could try to replicate that logic in userland.

There is the problem though of what to do about exports with the fsid=
option set. That's tougher to determine from userland, but maybe we can
somehow get that from the export record?

That's about as far as I got with it yesterday...
Chen Hanxiao March 27, 2024, 6:40 a.m. UTC | #4
Hi, Jeff

I wrote a POC patch, use name_to_handle_at to get nfs root filehandle
instead of /proc/fs/nfsd/filehandle.

I did some simple tests and it works.

Pls check the POC patch below:
---
 support/export/cache.c | 95 +++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 93 insertions(+), 2 deletions(-)

diff --git a/support/export/cache.c b/support/export/cache.c
index 6c0a44a3..adecac2e 100644
--- a/support/export/cache.c
+++ b/support/export/cache.c
@@ -43,6 +43,25 @@
 #include "blkid/blkid.h"
 #endif
 
+#define NFS4_FHSIZE           128
+
+struct knfsd_fh {
+	unsigned int	fh_size;	/*
+					 * Points to the current size while
+					 * building a new file handle.
+					 */
+	union {
+		char			fh_raw[NFS4_FHSIZE];
+		struct {
+			uint8_t		fh_version;	/* == 1 */
+			uint8_t		fh_auth_type;	/* deprecated */
+			uint8_t		fh_fsid_type;
+			uint8_t		fh_fileid_type;
+			uint32_t	fh_fsid[]; /* flexible-array member */
+		};
+	};
+};
+
 enum nfsd_fsid {
 	FSID_DEV = 0,
 	FSID_NUM,
@@ -1827,8 +1846,8 @@ int cache_export(nfs_export *exp, char *path)
  *   read filehandle <&0
  * } <> /proc/fs/nfsd/filehandle
  */
-struct nfs_fh_len *
-cache_get_filehandle(nfs_export *exp, int len, char *p)
+static struct nfs_fh_len *
+cache_get_filehandle_by_proc(nfs_export *exp, int len, char *p)
 {
 	static struct nfs_fh_len fh;
 	char buf[RPC_CHAN_BUF_SIZE], *bp;
@@ -1862,6 +1881,78 @@ cache_get_filehandle(nfs_export *exp, int len, char *p)
 	return &fh;
 }
 
+static struct nfs_fh_len *
+cache_get_filehandle_by_name(nfs_export *exp, char *name)
+{
+	static struct nfs_fh_len fh;
+	struct {
+		struct file_handle fh;
+		unsigned char handle[128];
+	} file_fh;
+	char buf[RPC_CHAN_BUF_SIZE] = {0};
+	char *mesg = buf;
+	int len, mnt_id;
+	unsigned int e_fsid;
+	struct knfsd_fh kfh;
+	char u[16];
+
+	memset(fh.fh_handle, 0, sizeof(fh.fh_handle));
+	
+	file_fh.fh.handle_bytes = 128;
+	if (name_to_handle_at(AT_FDCWD, name, &file_fh.fh, &mnt_id, 0) < 0)
+		return NULL;
+
+	memset(fh.fh_handle, 0, sizeof(fh.fh_handle));
+	memset(&kfh, 0, sizeof(struct knfsd_fh));
+	kfh.fh_version = 1;
+	kfh.fh_auth_type = 0;
+	e_fsid = exp->m_export.e_fsid;
+
+	if (e_fsid > 0) {
+		len = 12;
+		fh.fh_size = 8;
+		kfh.fh_size = 12;
+		kfh.fh_fsid_type = 1;
+		kfh.fh_fsid[0] = e_fsid;
+	} else {
+		len = file_fh.fh.handle_bytes + 8;
+		fh.fh_size = file_fh.fh.handle_bytes;
+		kfh.fh_size = file_fh.fh.handle_bytes + sizeof(kfh.fh_size);
+		kfh.fh_fsid_type = FSID_UUID16_INUM;
+		if (file_fh.fh.handle_bytes <= 12) {
+			kfh.fh_fsid[0] = *(uint32_t *)file_fh.fh.f_handle;
+			kfh.fh_fsid[1] = 0;
+		} else {
+			kfh.fh_fsid[0] = *(uint32_t *)file_fh.fh.f_handle;
+			kfh.fh_fsid[1] = *((uint32_t *)file_fh.fh.f_handle + 1);
+		}
+	}
+	kfh.fh_fileid_type = 0; // FILEID_ROOT
+
+	qword_addhex(&mesg, &len, kfh.fh_raw, kfh.fh_size);
+	mesg = buf;
+	len = qword_get(&mesg, (char *)fh.fh_handle, NFS3_FHSIZE);
+	if (e_fsid == 0) {
+		len = 16;
+		uuid_by_path(name, 0, 16, u);
+		memcpy((char *)fh.fh_handle+12, u, 16);
+		fh.fh_size += 16;
+	}
+	
+	return &fh;
+}
+
+struct nfs_fh_len *
+cache_get_filehandle(nfs_export *exp, int len, char *p)
+{
+	struct nfs_fh_len *fh;
+	fh = cache_get_filehandle_by_name(exp, p);
+	if (!fh)
+		fh = cache_get_filehandle_by_proc(exp, len, p);
+
+	return fh;
+}
+
 /* Wait for all worker child processes to exit and reap them */
 void
 cache_wait_for_workers(char *prog)
Jeffrey Layton March 27, 2024, 10:50 a.m. UTC | #5
On Wed, 2024-03-27 at 14:40 +0800, Chen Hanxiao wrote:
> Hi, Jeff
> 
> I wrote a POC patch, use name_to_handle_at to get nfs root filehandle
> instead of /proc/fs/nfsd/filehandle.
> 
> I did some simple tests and it works.
> 
> Pls check the POC patch below:
> ---
>  support/export/cache.c | 95 +++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 93 insertions(+), 2 deletions(-)
> 

Very cool!

I do think this is the basic direction we want to go with revising how
mountd gets its filehandles. 


> diff --git a/support/export/cache.c b/support/export/cache.c
> index 6c0a44a3..adecac2e 100644
> --- a/support/export/cache.c
> +++ b/support/export/cache.c
> @@ -43,6 +43,25 @@
>  #include "blkid/blkid.h"
>  #endif
>  
> +#define NFS4_FHSIZE           128
> +
> +struct knfsd_fh {
> +	unsigned int	fh_size;	/*
> +					 * Points to the current size while
> +					 * building a new file handle.
> +					 */
> +	union {
> +		char			fh_raw[NFS4_FHSIZE];
> +		struct {
> +			uint8_t		fh_version;	/* == 1 */
> +			uint8_t		fh_auth_type;	/* deprecated */
> +			uint8_t		fh_fsid_type;
> +			uint8_t		fh_fileid_type;
> +			uint32_t	fh_fsid[]; /* flexible-array member */
> +		};
> +	};
> +};
> +

This is a large departure from how this has worked in the past. Before
this, userland could always treat a knfsd_fh as opaque. With this change
though, knfsd_fh will need to be part of the kernel's ABI, and we'll
need to be very careful about future changes to the FH format (not that
we're planning any).

I think we probably ought to start by moving the definitions of knfsd_fh
and its component parts into a UAPI header that nfs-utils could then
use. For now, we can add duplicate definitions like above for when the
case where they aren't defined in UAPI headers.

>  enum nfsd_fsid {
>  	FSID_DEV = 0,
>  	FSID_NUM,

Side note: I wonder if some of the nfsd_fsid values can (effectively) be
deprecated these days? I don't think we really use FSID_DEV anymore at
all, do we?

> @@ -1827,8 +1846,8 @@ int cache_export(nfs_export *exp, char *path)
>   *   read filehandle <&0
>   * } <> /proc/fs/nfsd/filehandle
>   */
> -struct nfs_fh_len *
> -cache_get_filehandle(nfs_export *exp, int len, char *p)
> +static struct nfs_fh_len *
> +cache_get_filehandle_by_proc(nfs_export *exp, int len, char *p)
>  {
>  	static struct nfs_fh_len fh;
>  	char buf[RPC_CHAN_BUF_SIZE], *bp;
> @@ -1862,6 +1881,78 @@ cache_get_filehandle(nfs_export *exp, int len, char *p)
>  	return &fh;
>  }
>  
> +static struct nfs_fh_len *
> +cache_get_filehandle_by_name(nfs_export *exp, char *name)
> +{
> +	static struct nfs_fh_len fh;
> +	struct {
> +		struct file_handle fh;
> +		unsigned char handle[128];
> +	} file_fh;
> +	char buf[RPC_CHAN_BUF_SIZE] = {0};
> +	char *mesg = buf;
> +	int len, mnt_id;
> +	unsigned int e_fsid;
> +	struct knfsd_fh kfh;
> +	char u[16];
> +
> +	memset(fh.fh_handle, 0, sizeof(fh.fh_handle));
> +	
> +	file_fh.fh.handle_bytes = 128;
> +	if (name_to_handle_at(AT_FDCWD, name, &file_fh.fh, &mnt_id, 0) < 0)
> +		return NULL;
> +
> +	memset(fh.fh_handle, 0, sizeof(fh.fh_handle));
> +	memset(&kfh, 0, sizeof(struct knfsd_fh));
> +	kfh.fh_version = 1;
> +	kfh.fh_auth_type = 0;
> +	e_fsid = exp->m_export.e_fsid;
> +
> +	if (e_fsid > 0) {
> +		len = 12;
> +		fh.fh_size = 8;
> +		kfh.fh_size = 12;
> +		kfh.fh_fsid_type = 1;
> +		kfh.fh_fsid[0] = e_fsid;
> +	} else {
> +		len = file_fh.fh.handle_bytes + 8;
> +		fh.fh_size = file_fh.fh.handle_bytes;
> +		kfh.fh_size = file_fh.fh.handle_bytes + sizeof(kfh.fh_size);
> +		kfh.fh_fsid_type = FSID_UUID16_INUM;

Note that we will need to deal with NFSv2 here too. It has 32-byte
filehandles, and so we likely can't use FSID_UUID16_INUM for those.

> +		if (file_fh.fh.handle_bytes <= 12) {
> +			kfh.fh_fsid[0] = *(uint32_t *)file_fh.fh.f_handle;
> +			kfh.fh_fsid[1] = 0;
> +		} else {
> +			kfh.fh_fsid[0] = *(uint32_t *)file_fh.fh.f_handle;
> +			kfh.fh_fsid[1] = *((uint32_t *)file_fh.fh.f_handle + 1);
> +		}
> +	}
> +	kfh.fh_fileid_type = 0; // FILEID_ROOT
> +
> +	qword_addhex(&mesg, &len, kfh.fh_raw, kfh.fh_size);
> +	mesg = buf;
> +	len = qword_get(&mesg, (char *)fh.fh_handle, NFS3_FHSIZE);

We may need to pass to this function what version of NFS MNT request was
sent so that we know what sort of FH is needed here.

> +	if (e_fsid == 0) {
> +		len = 16;
> +		uuid_by_path(name, 0, 16, u);
> +		memcpy((char *)fh.fh_handle+12, u, 16);
> +		fh.fh_size += 16;
> +	}
> +	
> +	return &fh;
> +}
> +
> +struct nfs_fh_len *
> +cache_get_filehandle(nfs_export *exp, int len, char *p)
> +{
> +	struct nfs_fh_len *fh;
> +	fh = cache_get_filehandle_by_name(exp, p);
> +	if (!fh)
> +		fh = cache_get_filehandle_by_proc(exp, len, p);
> +
> +	return fh;
> +}
> +
>  /* Wait for all worker child processes to exit and reap them */
>  void
>  cache_wait_for_workers(char *prog)

This looks like a great start -- nice work! There are a number of more
obscure corner-cases that we'll need to deal with, but this does look
like a promising direction.
diff mbox series

Patch

diff --git a/fs/nfsd/nfsctl.c b/fs/nfsd/nfsctl.c
index ecd18bffeebc..da1387f1e30a 100644
--- a/fs/nfsd/nfsctl.c
+++ b/fs/nfsd/nfsctl.c
@@ -1346,7 +1346,7 @@  static int nfsd_fill_super(struct super_block *sb, struct fs_context *fc)
 					&transaction_ops, S_IWUSR|S_IRUSR},
 		[NFSD_FO_UnlockFS] = {"unlock_filesystem",
 					&transaction_ops, S_IWUSR|S_IRUSR},
-		[NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR|S_IRUSR},
+		[NFSD_Fh] = {"filehandle", &transaction_ops, S_IWUSR},
 		[NFSD_Threads] = {"threads", &transaction_ops, S_IWUSR|S_IRUSR},
 		[NFSD_Pool_Threads] = {"pool_threads", &transaction_ops, S_IWUSR|S_IRUSR},
 		[NFSD_Pool_Stats] = {"pool_stats", &pool_stats_operations, S_IRUGO},