diff mbox series

ceph: fix memory leak in ceph_mds_auth_match()

Message ID 20250114123806.2339159-1-antoine@lesviallon.fr (mailing list archive)
State New
Headers show
Series ceph: fix memory leak in ceph_mds_auth_match() | expand

Commit Message

Antoine Viallon Jan. 14, 2025, 12:38 p.m. UTC
This was detected in production because it caused a continuous memory growth,
eventually triggering kernel OOM and completely hard-locking the kernel.

Relevant kmemleak stacktrace:

    unreferenced object 0xffff888131e69900 (size 128):
      comm "git", pid 66104, jiffies 4295435999
      hex dump (first 32 bytes):
        76 6f 6c 75 6d 65 73 2f 63 6f 6e 74 61 69 6e 65  volumes/containe
        72 73 2f 67 69 74 65 61 2f 67 69 74 65 61 2f 67  rs/gitea/gitea/g
      backtrace (crc 2f3bb450):
        [<ffffffffaa68fb49>] __kmalloc_noprof+0x359/0x510
        [<ffffffffc32bf1df>] ceph_mds_check_access+0x5bf/0x14e0 [ceph]
        [<ffffffffc3235722>] ceph_open+0x312/0xd80 [ceph]
        [<ffffffffaa7dd786>] do_dentry_open+0x456/0x1120
        [<ffffffffaa7e3729>] vfs_open+0x79/0x360
        [<ffffffffaa832875>] path_openat+0x1de5/0x4390
        [<ffffffffaa834fcc>] do_filp_open+0x19c/0x3c0
        [<ffffffffaa7e44a1>] do_sys_openat2+0x141/0x180
        [<ffffffffaa7e4945>] __x64_sys_open+0xe5/0x1a0
        [<ffffffffac2cc2f7>] do_syscall_64+0xb7/0x210
        [<ffffffffac400130>] entry_SYSCALL_64_after_hwframe+0x77/0x7f

Signed-off-by: Antoine Viallon <antoine@lesviallon.fr>
---
 fs/ceph/mds_client.c | 3 +++
 1 file changed, 3 insertions(+)

Comments

Antoine Viallon Jan. 14, 2025, 6:41 p.m. UTC | #1
Addendum: The relevant bug report can be found here: <https://tracker.ceph.com/issues/69535>
Viacheslav Dubeyko Jan. 14, 2025, 7:17 p.m. UTC | #2
On Tue, 2025-01-14 at 13:38 +0100, Antoine Viallon wrote:
> This was detected in production because it caused a continuous memory
> growth,
> eventually triggering kernel OOM and completely hard-locking the
> kernel.
> 

Does it exist any way to reproduce the issue in stable manner? Could
you please share any steps to repeat it? It will be great to have this
description in the patch comment.

> Relevant kmemleak stacktrace:
> 
>     unreferenced object 0xffff888131e69900 (size 128):
>       comm "git", pid 66104, jiffies 4295435999
>       hex dump (first 32 bytes):
>         76 6f 6c 75 6d 65 73 2f 63 6f 6e 74 61 69 6e 65 
> volumes/containe
>         72 73 2f 67 69 74 65 61 2f 67 69 74 65 61 2f 67 
> rs/gitea/gitea/g
>       backtrace (crc 2f3bb450):
>         [<ffffffffaa68fb49>] __kmalloc_noprof+0x359/0x510
>         [<ffffffffc32bf1df>] ceph_mds_check_access+0x5bf/0x14e0
> [ceph]
>         [<ffffffffc3235722>] ceph_open+0x312/0xd80 [ceph]
>         [<ffffffffaa7dd786>] do_dentry_open+0x456/0x1120
>         [<ffffffffaa7e3729>] vfs_open+0x79/0x360
>         [<ffffffffaa832875>] path_openat+0x1de5/0x4390
>         [<ffffffffaa834fcc>] do_filp_open+0x19c/0x3c0
>         [<ffffffffaa7e44a1>] do_sys_openat2+0x141/0x180
>         [<ffffffffaa7e4945>] __x64_sys_open+0xe5/0x1a0
>         [<ffffffffac2cc2f7>] do_syscall_64+0xb7/0x210
>         [<ffffffffac400130>] entry_SYSCALL_64_after_hwframe+0x77/0x7f
> 
> Signed-off-by: Antoine Viallon <antoine@lesviallon.fr>
> ---
>  fs/ceph/mds_client.c | 3 +++
>  1 file changed, 3 insertions(+)
> 
> diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
> index 785fe489ef4b..89c69e9c03b9 100644
> --- a/fs/ceph/mds_client.c
> +++ b/fs/ceph/mds_client.c
> @@ -5702,6 +5702,9 @@ static int ceph_mds_auth_match(struct
> ceph_mds_client *mdsc,
>  					kfree(_tpath);
>  				return 0;
>  			}
> +
> +			if (free_tpath)
> +			  kfree(_tpath);

As far as I can see, we have several kfree() calls in the logic of this
method:
(1)
https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5697
(2)
https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5703

And you are adding the third call. I believe that it will be much
cleaner solution if we have only one kfree() call and goto from all
other places. Could you please rework your fix?

Thanks,
Slava.
Antoine Viallon Jan. 14, 2025, 10:13 p.m. UTC | #3
Hello Viacheslav,
thanks for your review.

On 14/01/2025 20:17, Viacheslav Dubeyko wrote:

> Does it exist any way to reproduce the issue in stable manner? Could
> you please share any steps to repeat it? It will be great to have this
> description in the patch comment.

This issue can probably be reproduced by having a CephFS subvolumegroup 
be mounted to a kernel Ceph client (version 6.12.8), where the auth 
credentials are scoped to a specific path:

client.services:
	key: REDACTED
	caps: [mds] allow rw fsname=cephfs path=/volumes/
         caps: [mon] allow r fsname=cephfs
         caps: [osd] allow rw tag cephfs data=cephfs

Then, you simply need to trigger a lot of file permission check (either 
by using the FS for long or doing LOSF I/O). This could be probably be 
done by doing:

     seq 1 100000 | xargs -P32 --replace touch 
/path/to/your/cephfs/mount/file-{}
     seq 1 100000 | xargs -P32 --replace cat 
/path/to/your/cephfs/mount/file-{}

The idea is to place yourself in a situation where the target path being 
matched by ceph_mds_auth_match does not contain your credential (auth) 
path AT ALL. This can be done when mounting a subvolumegroup, for instance:

	services@00000000-0000-0000-0000-000000000000.cephfs=/volumes/containers ceph rw,noatime,name=services,secret=<hidden>,ms_mode=prefer-crc,mount_timeout=300,acl,mon_addr=[REDACTED]

Since you never enter 
https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5697 
or 
https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5703, 
you never free _tpath (whatever free_tpath's value might be).

> As far as I can see, we have several kfree() calls in the logic of this
> method:
> (1)
> https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5697
> (2)
> https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5703
> 
> And you are adding the third call. I believe that it will be much
> cleaner solution if we have only one kfree() call and goto from all
> other places. Could you please rework your fix?

I absolutely agree, and was thinking the same thing. I'll rework my 
patch to simplify this kfree path.

> Thanks,
> Slava.

Thank you,
Antoine Viallon
Viacheslav Dubeyko Jan. 14, 2025, 10:21 p.m. UTC | #4
On Tue, 2025-01-14 at 23:13 +0100, Antoine Viallon wrote:
> Hello Viacheslav,
> thanks for your review.
> 
> On 14/01/2025 20:17, Viacheslav Dubeyko wrote:
> 
> > Does it exist any way to reproduce the issue in stable manner?
> > Could
> > you please share any steps to repeat it? It will be great to have
> > this
> > description in the patch comment.
> 
> This issue can probably be reproduced by having a CephFS
> subvolumegroup 
> be mounted to a kernel Ceph client (version 6.12.8), where the auth 
> credentials are scoped to a specific path:
> 
> client.services:
> 	key: REDACTED
> 	caps: [mds] allow rw fsname=cephfs path=/volumes/
>          caps: [mon] allow r fsname=cephfs
>          caps: [osd] allow rw tag cephfs data=cephfs
> 
> Then, you simply need to trigger a lot of file permission check
> (either 
> by using the FS for long or doing LOSF I/O). This could be probably
> be 
> done by doing:
> 
>      seq 1 100000 | xargs -P32 --replace touch 
> /path/to/your/cephfs/mount/file-{}
>      seq 1 100000 | xargs -P32 --replace cat 
> /path/to/your/cephfs/mount/file-{}
> 
> The idea is to place yourself in a situation where the target path
> being 
> matched by ceph_mds_auth_match does not contain your credential
> (auth) 
> path AT ALL. This can be done when mounting a subvolumegroup, for
> instance:
> 
> 	
> services@00000000-0000-0000-0000-000000000000.cephfs=/volumes/contain
> ers ceph rw,noatime,name=services,secret=<hidden>,ms_mode=prefer-
> crc,mount_timeout=300,acl,mon_addr=[REDACTED]
> 

Thanks for the explanation. This description is valuable to understand
the issue nature and the path to reproduce it. It really needs to be in
patch description.

> Since you never enter 
> https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5697
>  
> or 
> https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5703
> ,
> you never free _tpath (whatever free_tpath's value might be).
> 
> > As far as I can see, we have several kfree() calls in the logic of
> > this
> > method:
> > (1)
> > https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5697
> > (2)
> > https://elixir.bootlin.com/linux/v6.13-rc3/source/fs/ceph/mds_client.c#L5703
> > 
> > And you are adding the third call. I believe that it will be much
> > cleaner solution if we have only one kfree() call and goto from all
> > other places. Could you please rework your fix?
> 
> I absolutely agree, and was thinking the same thing. I'll rework my 
> patch to simplify this kfree path.
> 

Sounds great!

Thanks,
Slava.
diff mbox series

Patch

diff --git a/fs/ceph/mds_client.c b/fs/ceph/mds_client.c
index 785fe489ef4b..89c69e9c03b9 100644
--- a/fs/ceph/mds_client.c
+++ b/fs/ceph/mds_client.c
@@ -5702,6 +5702,9 @@  static int ceph_mds_auth_match(struct ceph_mds_client *mdsc,
 					kfree(_tpath);
 				return 0;
 			}
+
+			if (free_tpath)
+			  kfree(_tpath);
 		}
 	}