diff mbox series

[v2] ceph: fix memory leak in ceph_mds_auth_match()

Message ID 20250114224514.2399813-1-antoine@lesviallon.fr (mailing list archive)
State New
Headers show
Series [v2] ceph: fix memory leak in ceph_mds_auth_match() | expand

Commit Message

Antoine Viallon Jan. 14, 2025, 10:45 p.m. UTC
We now free the temporary target path substring allocation on every possible branch, instead of omitting the default branch.
In some cases, a memory leak occured, which could rapidly crash the system (depending on how many file accesses were attempted).

This was detected in production because it caused a continuous memory growth,
eventually triggering kernel OOM and completely hard-locking the kernel.

Relevant kmemleak stacktrace:

    unreferenced object 0xffff888131e69900 (size 128):
      comm "git", pid 66104, jiffies 4295435999
      hex dump (first 32 bytes):
        76 6f 6c 75 6d 65 73 2f 63 6f 6e 74 61 69 6e 65  volumes/containe
        72 73 2f 67 69 74 65 61 2f 67 69 74 65 61 2f 67  rs/gitea/gitea/g
      backtrace (crc 2f3bb450):
        [<ffffffffaa68fb49>] __kmalloc_noprof+0x359/0x510
        [<ffffffffc32bf1df>] ceph_mds_check_access+0x5bf/0x14e0 [ceph]
        [<ffffffffc3235722>] ceph_open+0x312/0xd80 [ceph]
        [<ffffffffaa7dd786>] do_dentry_open+0x456/0x1120
        [<ffffffffaa7e3729>] vfs_open+0x79/0x360
        [<ffffffffaa832875>] path_openat+0x1de5/0x4390
        [<ffffffffaa834fcc>] do_filp_open+0x19c/0x3c0
        [<ffffffffaa7e44a1>] do_sys_openat2+0x141/0x180
        [<ffffffffaa7e4945>] __x64_sys_open+0xe5/0x1a0
        [<ffffffffac2cc2f7>] do_syscall_64+0xb7/0x210
        [<ffffffffac400130>] entry_SYSCALL_64_after_hwframe+0x77/0x7f

It can be triggered by mouting a subdirectory of a CephFS filesystem, and then trying to access files on this subdirectory with an auth token using a path-scoped capability:

    $ ceph auth get client.services
    [client.services]
            key = REDACTED
            caps mds = "allow rw fsname=cephfs path=/volumes/"
            caps mon = "allow r fsname=cephfs"
            caps osd = "allow rw tag cephfs data=cephfs"

    $ cat /proc/self/mounts
    services@00000000-0000-0000-0000-000000000000.cephfs=/volumes/containers /ceph/containers ceph rw,noatime,name=services,secret=<hidden>,ms_mode=prefer-crc,mount_timeout=300,acl,mon_addr=[REDACTED]:3300,recover_session=clean 0 0

    $ seq 1 1000000 | xargs -P32 --replace={} touch /ceph/containers/file-{} && \
    seq 1 1000000 | xargs -P32 --replace={} cat /ceph/containers/file-{}

Signed-off-by: Antoine Viallon <antoine@lesviallon.fr>
---
 fs/ceph/mds_client.c | 15 +++++++++------
 1 file changed, 9 insertions(+), 6 deletions(-)

Comments

Viacheslav Dubeyko Jan. 14, 2025, 11:27 p.m. UTC | #1
On Tue, 2025-01-14 at 23:45 +0100, Antoine Viallon wrote:
> We now free the temporary target path substring allocation on every
> possible branch, instead of omitting the default branch.
> In some cases, a memory leak occured, which could rapidly crash the
> system (depending on how many file accesses were attempted).
> 
> This was detected in production because it caused a continuous memory
> growth,
> eventually triggering kernel OOM and completely hard-locking the
> kernel.
> 
> Relevant kmemleak stacktrace:
> 
>     unreferenced object 0xffff888131e69900 (size 128):
>       comm "git", pid 66104, jiffies 4295435999
>       hex dump (first 32 bytes):
>         76 6f 6c 75 6d 65 73 2f 63 6f 6e 74 61 69 6e 65 
> volumes/containe
>         72 73 2f 67 69 74 65 61 2f 67 69 74 65 61 2f 67 
> rs/gitea/gitea/g
>       backtrace (crc 2f3bb450):
>         [<ffffffffaa68fb49>] __kmalloc_noprof+0x359/0x510
>         [<ffffffffc32bf1df>] ceph_mds_check_access+0x5bf/0x14e0
> [ceph]
>         [<ffffffffc3235722>] ceph_open+0x312/0xd80 [ceph]
>         [<ffffffffaa7dd786>] do_dentry_open+0x456/0x1120
>         [<ffffffffaa7e3729>] vfs_open+0x79/0x360
>         [<ffffffffaa832875>] path_openat+0x1de5/0x4390
>         [<ffffffffaa834fcc>] do_filp_open+0x19c/0x3c0
>         [<ffffffffaa7e44a1>] do_sys_openat2+0x141/0x180
>         [<ffffffffaa7e4945>] __x64_sys_open+0xe5/0x1a0
>         [<ffffffffac2cc2f7>] do_syscall_64+0xb7/0x210
>         [<ffffffffac400130>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> It can be triggered by mouting a subdirectory of a CephFS filesystem,
> and then trying to access files on this subdirectory with an auth
> token using a path-scoped capability:
> 
>     $ ceph auth get client.services
>     [client.services]
>             key = REDACTED
>             caps mds = "allow rw fsname=cephfs path=/volumes/"
>             caps mon = "allow r fsname=cephfs"
>             caps osd = "allow rw tag cephfs data=cephfs"
> 
>     $ cat /proc/self/mounts
>    
> services@00000000-0000-0000-0000-000000000000.cephfs=/volumes/contain
> ers /ceph/containers ceph
> rw,noatime,name=services,secret=<hidden>,ms_mode=prefer-
> crc,mount_timeout=300,acl,mon_addr=[REDACTED]:3300,recover_session=cl
> ean 0 0
> 
>     $ seq 1 1000000 | xargs -P32 --replace={} touch
> /ceph/containers/file-{} && \
>     seq 1 1000000 | xargs -P32 --replace={} cat
> /ceph/containers/file-{}
> 
> Signed-off-by: Antoine Viallon <antoine@lesviallon.fr>
> ---
>  fs/ceph/mds_client.c | 15 +++++++++------
>  1 file changed, 9 insertions(+), 6 deletions(-)
> 
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 785fe489ef4b..c3b63243c2dd 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -5690,18 +5690,21 @@ static int ceph_mds_auth_match(struct
> ceph_mds_client *mdsc,
>  			 *
>  			 * All the other cases                      
> --> mismatch
>  			 */
> +			int rc = 1;
>  			char *first = strstr(_tpath, auth-
> >match.path);
>  			if (first != _tpath) {
> -				if (free_tpath)
> -					kfree(_tpath);
> -				return 0;
> +				rc = 0;
>  			}
>  
>  			if (tlen > len && _tpath[len] != '/') {
> -				if (free_tpath)
> -					kfree(_tpath);
> -				return 0;
> +				rc = 0;
>  			}
> +
> +			if (free_tpath)
> +			  kfree(_tpath);

I have some worry here. Maybe, I am wrong. Initially, we receive tpath
pointer as function argument:

https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5605

Then, we assign tpath to _tpath:

https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5651

We allocate memory by condition:

			if (spath && (m = strlen(spath)) != 1) {
				/* mount path + '/' + tpath + an extra
space */
				n = m + 1 + tlen + 1;
				_tpath = kmalloc(n, GFP_NOFS);
				if (!_tpath)
					return -ENOMEM;
				/* remove the leading '/' */
				snprintf(_tpath, n, "%s/%s", spath +
1, tpath);
				free_tpath = true;
				tlen = strlen(_tpath);
			}

https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5660

What if condition is not true and we don't allocate memory? We still
have _tpath keeping the pointer on tpath and kfree() will be called. It
sounds for me that we can free tpath and caller of
ceph_mds_auth_match() will have use-after-free issue. Am I right here?
Do I miss something here?

Thanks,
Slava.


> +
> +			if (!rc)
> +			  return 0;
>  		}
>  	}
>
Antoine Viallon Jan. 15, 2025, 7:32 a.m. UTC | #2
On 15/01/2025 00:27, Viacheslav Dubeyko wrote:
> I have some worry here. Maybe, I am wrong. Initially, we receive tpath
> pointer as function argument:
> 
> https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5605
> 
> Then, we assign tpath to _tpath:
> 
> https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5651
> 
> We allocate memory by condition:
> 
> 			if (spath && (m = strlen(spath)) != 1) {
> 				/* mount path + '/' + tpath + an extra
> space */
> 				n = m + 1 + tlen + 1;
> 				_tpath = kmalloc(n, GFP_NOFS);
> 				if (!_tpath)
> 					return -ENOMEM;
> 				/* remove the leading '/' */
> 				snprintf(_tpath, n, "%s/%s", spath +
> 1, tpath);
> 				free_tpath = true;
> 				tlen = strlen(_tpath);
> 			}
> 
> https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5660
> 
> What if condition is not true and we don't allocate memory? We still
> have _tpath keeping the pointer on tpath and kfree() will be called. It
> sounds for me that we can free tpath and caller of
> ceph_mds_auth_match() will have use-after-free issue. Am I right here?
> Do I miss something here?

Hello Slava,
actually, we check that free_tpath is set to true before trying to free 
_tpath, and the only time free_tpath is set to true is after a 
successful kmalloc assigned to _tpath.

+			if (free_tpath)
+			  kfree(_tpath);

> 
> Thanks,
> Slava.

Antoine
diff mbox series

Patch

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 785fe489ef4b..c3b63243c2dd 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -5690,18 +5690,21 @@  static int ceph_mds_auth_match(struct ceph_mds_client *mdsc,
 			 *
 			 * All the other cases                       --> mismatch
 			 */
+			int rc = 1;
 			char *first = strstr(_tpath, auth->match.path);
 			if (first != _tpath) {
-				if (free_tpath)
-					kfree(_tpath);
-				return 0;
+				rc = 0;
 			}
 
 			if (tlen > len && _tpath[len] != '/') {
-				if (free_tpath)
-					kfree(_tpath);
-				return 0;
+				rc = 0;
 			}
+
+			if (free_tpath)
+			  kfree(_tpath);
+
+			if (!rc)
+			  return 0;
 		}
 	}