mbox series

[0/3] Reduce impact of overlayfs fake path files

Message ID 20230609073239.957184-1-amir73il@gmail.com (mailing list archive)
Headers show
Series Reduce impact of overlayfs fake path files | expand

Message

Amir Goldstein June 9, 2023, 7:32 a.m. UTC
Miklos,

This is the solution that we discussed for removing FMODE_NONOTIFY
from overlayfs real files.

My branch [1] has an extra patch for remove FMODE_NONOTIFY, but
I am still testing the ovl-fsnotify interaction, so we can defer
that step to later.

I wanted to post this series earlier to give more time for fsdevel
feedback and if these patches get your blessing and the blessing of
vfs maintainers, it is probably better that they will go through the
vfs tree.

I've tested that overlay "fake" path are still shown in /proc/self/maps
and in the /proc/self/exe and /proc/self/map_files/ symlinks.

The audit and tomoyo use of file_fake_path() is not tested
(CC maintainers), but they both look like user displayed paths,
so I assumed they's want to preserve the existing behavior
(i.e. displaying the fake overlayfs path).

Thanks,
Amir.

[1] https://github.com/amir73il/linux/commits/ovl_fake_path

Amir Goldstein (3):
  fs: use fake_file container for internal files with fake f_path
  fs: use file_fake_path() to get path of mapped files for display
  fs: store fake path in file_fake along with real path

 fs/cachefiles/namei.c  |  2 +-
 fs/file_table.c        | 85 ++++++++++++++++++++++++++++++++++--------
 fs/internal.h          |  5 ++-
 fs/namei.c             |  2 +-
 fs/open.c              |  9 +++--
 fs/overlayfs/file.c    |  2 +-
 fs/proc/base.c         |  8 ++--
 fs/seq_file.c          |  2 +-
 include/linux/fs.h     | 13 ++++---
 kernel/audit.c         |  3 +-
 kernel/fork.c          |  5 ++-
 security/tomoyo/util.c |  3 +-
 12 files changed, 102 insertions(+), 37 deletions(-)

Comments

Miklos Szeredi June 9, 2023, 1:15 p.m. UTC | #1
On Fri, 9 Jun 2023 at 09:32, Amir Goldstein <amir73il@gmail.com> wrote:
>
> Miklos,
>
> This is the solution that we discussed for removing FMODE_NONOTIFY
> from overlayfs real files.
>
> My branch [1] has an extra patch for remove FMODE_NONOTIFY, but
> I am still testing the ovl-fsnotify interaction, so we can defer
> that step to later.
>
> I wanted to post this series earlier to give more time for fsdevel
> feedback and if these patches get your blessing and the blessing of
> vfs maintainers, it is probably better that they will go through the
> vfs tree.
>
> I've tested that overlay "fake" path are still shown in /proc/self/maps
> and in the /proc/self/exe and /proc/self/map_files/ symlinks.
>
> The audit and tomoyo use of file_fake_path() is not tested
> (CC maintainers), but they both look like user displayed paths,
> so I assumed they's want to preserve the existing behavior
> (i.e. displaying the fake overlayfs path).

I did an audit of all ->vm_file  and found a couple of missing ones:

dump_common_audit_data
ima_file_mprotect
common_file_perm (I don't understand the code enough to know whether
it needs fake dentry or not)
aa_file_perm
__file_path_perm
print_bad_pte
file_path
seq_print_user_ip
__mnt_want_write_file
__mnt_drop_write_file
file_dentry_name

Didn't go into drivers/ and didn't follow indirect calls (e.g.
f_op->fsysnc).  I also may have missed something along the way, but my
guess is that I did catch most cases.

Thanks,
Miklos
Tetsuo Handa June 9, 2023, 1:15 p.m. UTC | #2
On 2023/06/09 16:32, Amir Goldstein wrote:
> The audit and tomoyo use of file_fake_path() is not tested
> (CC maintainers), but they both look like user displayed paths,
> so I assumed they's want to preserve the existing behavior
> (i.e. displaying the fake overlayfs path).

Since I'm not using overlayfs, I don't know the difference between
real path and fake path. Would you explain using command line example?

  mkdir what_path1
  mkdir what_path2
  mkdir what_path3
  mount -t overlayfs ...what_paths_come_here?...

what the pathname returned by wrapping with file_fake_path() is, and
what the pathname returned by not wrapping with file_fake_path() is?
Amir Goldstein June 9, 2023, 1:54 p.m. UTC | #3
On Fri, Jun 9, 2023 at 4:16 PM Tetsuo Handa
<penguin-kernel@i-love.sakura.ne.jp> wrote:
>
> On 2023/06/09 16:32, Amir Goldstein wrote:
> > The audit and tomoyo use of file_fake_path() is not tested
> > (CC maintainers), but they both look like user displayed paths,
> > so I assumed they's want to preserve the existing behavior
> > (i.e. displaying the fake overlayfs path).
>
> Since I'm not using overlayfs, I don't know the difference between
> real path and fake path. Would you explain using command line example?
>
>   mkdir what_path1
>   mkdir what_path2
>   mkdir what_path3
>   mount -t overlayfs ...what_paths_come_here?...

For example:
mount -t overlayfs overlay /mnt/ovl \
          -o lowerdir=what_path1:what_path2:what_path3

>
> what the pathname returned by wrapping with file_fake_path() is, and
> what the pathname returned by not wrapping with file_fake_path() is?
>

It depends. if you have an audit rule on /mnt/ovl the path is
always the /mnt/ovl/... path (fake_path as well).

If you have an audit rule on what_path1 the fake path is /mnt/ovl/...
and the real path is /... (the relative path from what_path1).
In both cases, the filename itself will be correct.
If the rule prints something like %pD2 then it does not really
matter which path you use.

Thanks,
Amir.
Amir Goldstein June 9, 2023, 2:28 p.m. UTC | #4
On Fri, Jun 9, 2023 at 4:15 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Fri, 9 Jun 2023 at 09:32, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > Miklos,
> >
> > This is the solution that we discussed for removing FMODE_NONOTIFY
> > from overlayfs real files.
> >
> > My branch [1] has an extra patch for remove FMODE_NONOTIFY, but
> > I am still testing the ovl-fsnotify interaction, so we can defer
> > that step to later.
> >
> > I wanted to post this series earlier to give more time for fsdevel
> > feedback and if these patches get your blessing and the blessing of
> > vfs maintainers, it is probably better that they will go through the
> > vfs tree.
> >
> > I've tested that overlay "fake" path are still shown in /proc/self/maps
> > and in the /proc/self/exe and /proc/self/map_files/ symlinks.
> >
> > The audit and tomoyo use of file_fake_path() is not tested
> > (CC maintainers), but they both look like user displayed paths,
> > so I assumed they's want to preserve the existing behavior
> > (i.e. displaying the fake overlayfs path).
>
> I did an audit of all ->vm_file  and found a couple of missing ones:

Wait, but why only ->vm_file?
We were under the assumption the fake path is only needed
for mapped files, but the list below suggests that it matters
to other file operations as well...

>
> dump_common_audit_data
> ima_file_mprotect
> common_file_perm (I don't understand the code enough to know whether
> it needs fake dentry or not)
> aa_file_perm
> __file_path_perm
> print_bad_pte
> file_path
> seq_print_user_ip
> __mnt_want_write_file
> __mnt_drop_write_file
> file_dentry_name
>
> Didn't go into drivers/ and didn't follow indirect calls (e.g.
> f_op->fsysnc).  I also may have missed something along the way, but my
> guess is that I did catch most cases.

Wow. So much for 3-4 special cases...

Confused by some of the above.

Why would we want __mnt_want_write_file on the fake path?
We'd already taken __mnt_want_write on overlay and with
real file we need __mnt_want_write on the real path.

For IMA/LSMs, I'd imagine that like fanotify, they would rather get
the real path where the real policy is stored.
If some log files end with relative path instead of full fake path
it's probably not the worst outcome.

Thoughts?

Thanks,
Amir.
Amir Goldstein June 9, 2023, 2:42 p.m. UTC | #5
On Fri, Jun 9, 2023 at 5:28 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Jun 9, 2023 at 4:15 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Fri, 9 Jun 2023 at 09:32, Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > Miklos,
> > >
> > > This is the solution that we discussed for removing FMODE_NONOTIFY
> > > from overlayfs real files.
> > >
> > > My branch [1] has an extra patch for remove FMODE_NONOTIFY, but
> > > I am still testing the ovl-fsnotify interaction, so we can defer
> > > that step to later.
> > >
> > > I wanted to post this series earlier to give more time for fsdevel
> > > feedback and if these patches get your blessing and the blessing of
> > > vfs maintainers, it is probably better that they will go through the
> > > vfs tree.
> > >
> > > I've tested that overlay "fake" path are still shown in /proc/self/maps
> > > and in the /proc/self/exe and /proc/self/map_files/ symlinks.
> > >
> > > The audit and tomoyo use of file_fake_path() is not tested
> > > (CC maintainers), but they both look like user displayed paths,
> > > so I assumed they's want to preserve the existing behavior
> > > (i.e. displaying the fake overlayfs path).
> >
> > I did an audit of all ->vm_file  and found a couple of missing ones:
>
> Wait, but why only ->vm_file?
> We were under the assumption the fake path is only needed
> for mapped files, but the list below suggests that it matters
> to other file operations as well...
>
> >
> > dump_common_audit_data
> > ima_file_mprotect
> > common_file_perm (I don't understand the code enough to know whether
> > it needs fake dentry or not)
> > aa_file_perm
> > __file_path_perm
> > print_bad_pte
> > file_path
> > seq_print_user_ip
> > __mnt_want_write_file
> > __mnt_drop_write_file
> > file_dentry_name
> >
> > Didn't go into drivers/ and didn't follow indirect calls (e.g.
> > f_op->fsysnc).  I also may have missed something along the way, but my
> > guess is that I did catch most cases.
>
> Wow. So much for 3-4 special cases...
>
> Confused by some of the above.
>
> Why would we want __mnt_want_write_file on the fake path?
> We'd already taken __mnt_want_write on overlay and with
> real file we need __mnt_want_write on the real path.
>
> For IMA/LSMs, I'd imagine that like fanotify, they would rather get
> the real path where the real policy is stored.
> If some log files end with relative path instead of full fake path
> it's probably not the worst outcome.
>
> Thoughts?

Considering the results of your audit, I think I prefer to keep
f_path fake and store real_path in struct file_fake for code
that wants the real path.

This will keep all logic unchanged, which is better for my health.
and only fsnotify (for now) will start using f_real_path() to
generate events on real fs objects.

Thanks,
Amir.
Miklos Szeredi June 9, 2023, 3 p.m. UTC | #6
On Fri, 9 Jun 2023 at 16:42, Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Jun 9, 2023 at 5:28 PM Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Fri, Jun 9, 2023 at 4:15 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > >
> > > On Fri, 9 Jun 2023 at 09:32, Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > Miklos,
> > > >
> > > > This is the solution that we discussed for removing FMODE_NONOTIFY
> > > > from overlayfs real files.
> > > >
> > > > My branch [1] has an extra patch for remove FMODE_NONOTIFY, but
> > > > I am still testing the ovl-fsnotify interaction, so we can defer
> > > > that step to later.
> > > >
> > > > I wanted to post this series earlier to give more time for fsdevel
> > > > feedback and if these patches get your blessing and the blessing of
> > > > vfs maintainers, it is probably better that they will go through the
> > > > vfs tree.
> > > >
> > > > I've tested that overlay "fake" path are still shown in /proc/self/maps
> > > > and in the /proc/self/exe and /proc/self/map_files/ symlinks.
> > > >
> > > > The audit and tomoyo use of file_fake_path() is not tested
> > > > (CC maintainers), but they both look like user displayed paths,
> > > > so I assumed they's want to preserve the existing behavior
> > > > (i.e. displaying the fake overlayfs path).
> > >
> > > I did an audit of all ->vm_file  and found a couple of missing ones:
> >
> > Wait, but why only ->vm_file?

Because we don't get to intercept vm_ops, so anything done through
mmaps will not go though overlayfs.   That would result in apparmor
missing these, for example.

> > We were under the assumption the fake path is only needed
> > for mapped files, but the list below suggests that it matters
> > to other file operations as well...
> >
> > >
> > > dump_common_audit_data
> > > ima_file_mprotect
> > > common_file_perm (I don't understand the code enough to know whether
> > > it needs fake dentry or not)
> > > aa_file_perm
> > > __file_path_perm
> > > print_bad_pte
> > > file_path
> > > seq_print_user_ip
> > > __mnt_want_write_file
> > > __mnt_drop_write_file
> > > file_dentry_name
> > >
> > > Didn't go into drivers/ and didn't follow indirect calls (e.g.
> > > f_op->fsysnc).  I also may have missed something along the way, but my
> > > guess is that I did catch most cases.
> >
> > Wow. So much for 3-4 special cases...
> >
> > Confused by some of the above.
> >
> > Why would we want __mnt_want_write_file on the fake path?
> > We'd already taken __mnt_want_write on overlay and with
> > real file we need __mnt_want_write on the real path.

It's for write faults on memory maps.   The code already branches on
file->f_mode, I don't think it would be a big performance hit to check
FMODE_FAKE_PATH.

> >
> > For IMA/LSMs, I'd imagine that like fanotify, they would rather get
> > the real path where the real policy is stored.
> > If some log files end with relative path instead of full fake path
> > it's probably not the worst outcome.
> >
> > Thoughts?
>
> Considering the results of your audit, I think I prefer to keep
> f_path fake and store real_path in struct file_fake for code
> that wants the real path.
>
> This will keep all logic unchanged, which is better for my health.
> and only fsnotify (for now) will start using f_real_path() to
> generate events on real fs objects.

That's also an option.

I think f_fake_path() would still be a move in the right direction.
We have 46 instances of file_dentry() currently and of those special
cases most are cosmetic, while missing file_dentry() ones are
crashable.

Thanks,
Miklos
Mimi Zohar June 9, 2023, 3:27 p.m. UTC | #7
On Fri, 2023-06-09 at 17:28 +0300, Amir Goldstein wrote:
> For IMA/LSMs, I'd imagine that like fanotify, they would rather get
> the real path where the real policy is stored.

Definitely!

> If some log files end with relative path instead of full fake path
> it's probably not the worst outcome.
Amir Goldstein June 9, 2023, 7:17 p.m. UTC | #8
On Fri, Jun 9, 2023 at 6:00 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Fri, 9 Jun 2023 at 16:42, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Fri, Jun 9, 2023 at 5:28 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Fri, Jun 9, 2023 at 4:15 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > >
> > > > On Fri, 9 Jun 2023 at 09:32, Amir Goldstein <amir73il@gmail.com> wrote:
> > > > >
> > > > > Miklos,
> > > > >
> > > > > This is the solution that we discussed for removing FMODE_NONOTIFY
> > > > > from overlayfs real files.
> > > > >
> > > > > My branch [1] has an extra patch for remove FMODE_NONOTIFY, but
> > > > > I am still testing the ovl-fsnotify interaction, so we can defer
> > > > > that step to later.
> > > > >
> > > > > I wanted to post this series earlier to give more time for fsdevel
> > > > > feedback and if these patches get your blessing and the blessing of
> > > > > vfs maintainers, it is probably better that they will go through the
> > > > > vfs tree.
> > > > >
> > > > > I've tested that overlay "fake" path are still shown in /proc/self/maps
> > > > > and in the /proc/self/exe and /proc/self/map_files/ symlinks.
> > > > >
> > > > > The audit and tomoyo use of file_fake_path() is not tested
> > > > > (CC maintainers), but they both look like user displayed paths,
> > > > > so I assumed they's want to preserve the existing behavior
> > > > > (i.e. displaying the fake overlayfs path).
> > > >
> > > > I did an audit of all ->vm_file  and found a couple of missing ones:
> > >
> > > Wait, but why only ->vm_file?
>
> Because we don't get to intercept vm_ops, so anything done through
> mmaps will not go though overlayfs.   That would result in apparmor
> missing these, for example.
>
> > > We were under the assumption the fake path is only needed
> > > for mapped files, but the list below suggests that it matters
> > > to other file operations as well...
> > >
> > > >
> > > > dump_common_audit_data
> > > > ima_file_mprotect
> > > > common_file_perm (I don't understand the code enough to know whether
> > > > it needs fake dentry or not)
> > > > aa_file_perm
> > > > __file_path_perm
> > > > print_bad_pte
> > > > file_path
> > > > seq_print_user_ip
> > > > __mnt_want_write_file
> > > > __mnt_drop_write_file
> > > > file_dentry_name
> > > >
> > > > Didn't go into drivers/ and didn't follow indirect calls (e.g.
> > > > f_op->fsysnc).  I also may have missed something along the way, but my
> > > > guess is that I did catch most cases.
> > >
> > > Wow. So much for 3-4 special cases...
> > >
> > > Confused by some of the above.
> > >
> > > Why would we want __mnt_want_write_file on the fake path?
> > > We'd already taken __mnt_want_write on overlay and with
> > > real file we need __mnt_want_write on the real path.
>
> It's for write faults on memory maps.   The code already branches on
> file->f_mode, I don't think it would be a big performance hit to check
> FMODE_FAKE_PATH.
>

Yes, but all those lower leven helpers are also called for read/write ops
on the same realfile object, where we do not want them to act on the
fake path. That's the reason I started with this conversion in the first
place. Maybe I am missing something in the big picture, but for now
the next steps are clear to me:

1. Store both real+fake paths in file_fake container
2. f_path remains fake now and maybe will be changed later
3. f_real_path() will be used now in fsnotify
4. Once we have a plan, we can start adding f_fake_path()
     calls for the mapped file code paths and one day, we may
     be able to let f_path be real

I will post v3 with steps 1-3.

Thanks,
Amir.
Christian Brauner June 12, 2023, 7:57 a.m. UTC | #9
On Fri, Jun 09, 2023 at 05:00:25PM +0200, Miklos Szeredi wrote:
> On Fri, 9 Jun 2023 at 16:42, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Fri, Jun 9, 2023 at 5:28 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Fri, Jun 9, 2023 at 4:15 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > >
> > > > On Fri, 9 Jun 2023 at 09:32, Amir Goldstein <amir73il@gmail.com> wrote:
> > > > >
> > > > > Miklos,
> > > > >
> > > > > This is the solution that we discussed for removing FMODE_NONOTIFY
> > > > > from overlayfs real files.
> > > > >
> > > > > My branch [1] has an extra patch for remove FMODE_NONOTIFY, but
> > > > > I am still testing the ovl-fsnotify interaction, so we can defer
> > > > > that step to later.
> > > > >
> > > > > I wanted to post this series earlier to give more time for fsdevel
> > > > > feedback and if these patches get your blessing and the blessing of
> > > > > vfs maintainers, it is probably better that they will go through the
> > > > > vfs tree.
> > > > >
> > > > > I've tested that overlay "fake" path are still shown in /proc/self/maps
> > > > > and in the /proc/self/exe and /proc/self/map_files/ symlinks.
> > > > >
> > > > > The audit and tomoyo use of file_fake_path() is not tested
> > > > > (CC maintainers), but they both look like user displayed paths,
> > > > > so I assumed they's want to preserve the existing behavior
> > > > > (i.e. displaying the fake overlayfs path).
> > > >
> > > > I did an audit of all ->vm_file  and found a couple of missing ones:
> > >
> > > Wait, but why only ->vm_file?
> 
> Because we don't get to intercept vm_ops, so anything done through
> mmaps will not go though overlayfs.   That would result in apparmor
> missing these, for example.
> 
> > > We were under the assumption the fake path is only needed
> > > for mapped files, but the list below suggests that it matters
> > > to other file operations as well...
> > >
> > > >
> > > > dump_common_audit_data
> > > > ima_file_mprotect
> > > > common_file_perm (I don't understand the code enough to know whether
> > > > it needs fake dentry or not)
> > > > aa_file_perm
> > > > __file_path_perm
> > > > print_bad_pte
> > > > file_path
> > > > seq_print_user_ip
> > > > __mnt_want_write_file
> > > > __mnt_drop_write_file
> > > > file_dentry_name
> > > >
> > > > Didn't go into drivers/ and didn't follow indirect calls (e.g.
> > > > f_op->fsysnc).  I also may have missed something along the way, but my
> > > > guess is that I did catch most cases.
> > >
> > > Wow. So much for 3-4 special cases...
> > >
> > > Confused by some of the above.
> > >
> > > Why would we want __mnt_want_write_file on the fake path?
> > > We'd already taken __mnt_want_write on overlay and with
> > > real file we need __mnt_want_write on the real path.
> 
> It's for write faults on memory maps.   The code already branches on
> file->f_mode, I don't think it would be a big performance hit to check
> FMODE_FAKE_PATH.
> 
> > >
> > > For IMA/LSMs, I'd imagine that like fanotify, they would rather get
> > > the real path where the real policy is stored.
> > > If some log files end with relative path instead of full fake path
> > > it's probably not the worst outcome.
> > >
> > > Thoughts?
> >
> > Considering the results of your audit, I think I prefer to keep
> > f_path fake and store real_path in struct file_fake for code
> > that wants the real path.
> >
> > This will keep all logic unchanged, which is better for my health.
> > and only fsnotify (for now) will start using f_real_path() to
> > generate events on real fs objects.
> 
> That's also an option.

Ideally we keep the generic file infrastructure separate from the
conversion of the various places because the latter operation is the
really sensitive part.
Amir Goldstein Oct. 2, 2023, 3:32 p.m. UTC | #10
On Fri, Jun 9, 2023 at 6:00 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
>
> On Fri, 9 Jun 2023 at 16:42, Amir Goldstein <amir73il@gmail.com> wrote:
> >
> > On Fri, Jun 9, 2023 at 5:28 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Fri, Jun 9, 2023 at 4:15 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > >
> > > > On Fri, 9 Jun 2023 at 09:32, Amir Goldstein <amir73il@gmail.com> wrote:
> > > > >
> > > > > Miklos,
> > > > >
> > > > > This is the solution that we discussed for removing FMODE_NONOTIFY
> > > > > from overlayfs real files.
> > > > >
> > > > > My branch [1] has an extra patch for remove FMODE_NONOTIFY, but
> > > > > I am still testing the ovl-fsnotify interaction, so we can defer
> > > > > that step to later.
> > > > >
> > > > > I wanted to post this series earlier to give more time for fsdevel
> > > > > feedback and if these patches get your blessing and the blessing of
> > > > > vfs maintainers, it is probably better that they will go through the
> > > > > vfs tree.
> > > > >
> > > > > I've tested that overlay "fake" path are still shown in /proc/self/maps
> > > > > and in the /proc/self/exe and /proc/self/map_files/ symlinks.
> > > > >
> > > > > The audit and tomoyo use of file_fake_path() is not tested
> > > > > (CC maintainers), but they both look like user displayed paths,
> > > > > so I assumed they's want to preserve the existing behavior
> > > > > (i.e. displaying the fake overlayfs path).
> > > >
> > > > I did an audit of all ->vm_file  and found a couple of missing ones:
> > >

Hi Miklos,

Trying to get back to this.

> > > Wait, but why only ->vm_file?
>
> Because we don't get to intercept vm_ops, so anything done through
> mmaps will not go though overlayfs.   That would result in apparmor
> missing these, for example.
>

As discussed, unless we split ovl realfile to 3 variants: lower/upper/mmap
vm_file will be a backing_file and so will the read/write realfile,
sometimes the low level helpers need the real path and sometimes
then need the fake path.

> > > We were under the assumption the fake path is only needed
> > > for mapped files, but the list below suggests that it matters
> > > to other file operations as well...
> > >
> > > >
> > > > dump_common_audit_data

This is an example of a generic helper.
There is no way of knowing if it really wants the real or fake path.
It depends if the audit rule was set on ovl path or on real path,
but if audit rule was set on real path, we should not let the fake path
avert the audit, same as we should not have let the real file avert
fsnotify events.

It seems to me the audit rules are set on inodes and compare
st_dev/st_ino, so it is more likely that for operations on the realfile,
the real path makes more sense to print.

> > > > ima_file_mprotect

From the little experience I have with overlayfs and IMA,
nothing works much wrt measuring and verifying operations
on the real files.

> > > > common_file_perm (I don't understand the code enough to know whether
> > > > it needs fake dentry or not)
> > > > aa_file_perm
> > > > __file_path_perm

Same rationale as in audit.
We do not want to avert permission hooks for rules that were set
on the real inode/path, so it makes way more sense to use real path
in these common helpers.

Printing the wrong path is not as bad as not printing an audit!

> > > > print_bad_pte

I am not very concerned about printing real path in kmsg errors.
Those errors are more likely cause by underlying fs/block issues anyway?

> > > > file_path

Too low level.
Call sites that need to print the fake path can use d_path()

> > > > seq_print_user_ip

Yes.

> > > > __mnt_want_write_file
> > > > __mnt_drop_write_file
> > > > file_dentry_name

Too low level.

> > > >
> > > > Didn't go into drivers/ and didn't follow indirect calls (e.g.
> > > > f_op->fsysnc).  I also may have missed something along the way, but my
> > > > guess is that I did catch most cases.
> > >
> > > Wow. So much for 3-4 special cases...
> > >
> > > Confused by some of the above.
> > >
> > > Why would we want __mnt_want_write_file on the fake path?
> > > We'd already taken __mnt_want_write on overlay and with
> > > real file we need __mnt_want_write on the real path.
>
> It's for write faults on memory maps.   The code already branches on
> file->f_mode, I don't think it would be a big performance hit to check
> FMODE_FAKE_PATH.
>

Again, FMODE_BACKING is set on the same realfile that is used
for read/write_iter. Unless you will be willing to allocate two realfiles.

I added a patch to take mnt_writers refcount for both fake and real path
for writable file.
Page faults only need to take freeze protection on real sb. no?

> > >
> > > For IMA/LSMs, I'd imagine that like fanotify, they would rather get
> > > the real path where the real policy is stored.
> > > If some log files end with relative path instead of full fake path
> > > it's probably not the worst outcome.
> > >
> > > Thoughts?
> >
> > Considering the results of your audit, I think I prefer to keep
> > f_path fake and store real_path in struct file_fake for code
> > that wants the real path.
> >
> > This will keep all logic unchanged, which is better for my health.
> > and only fsnotify (for now) will start using f_real_path() to
> > generate events on real fs objects.
>
> That's also an option.
>
> I think f_fake_path() would still be a move in the right direction.
> We have 46 instances of file_dentry() currently and of those special
> cases most are cosmetic, while missing file_dentry() ones are
> crashable.
>

Here is what I have so far:

https://github.com/amir73il/linux/commits/ovl_fake_path

I tested it with fstests and nothing popped up.
If this looks like a good start to you, I can throw it at linux-next and
see and anything floats.

Thanks,
Amir.
Amir Goldstein Oct. 4, 2023, 3:29 p.m. UTC | #11
On Mon, Oct 2, 2023 at 6:32 PM Amir Goldstein <amir73il@gmail.com> wrote:
>
> On Fri, Jun 9, 2023 at 6:00 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> >
> > On Fri, 9 Jun 2023 at 16:42, Amir Goldstein <amir73il@gmail.com> wrote:
> > >
> > > On Fri, Jun 9, 2023 at 5:28 PM Amir Goldstein <amir73il@gmail.com> wrote:
> > > >
> > > > On Fri, Jun 9, 2023 at 4:15 PM Miklos Szeredi <miklos@szeredi.hu> wrote:
> > > > >
> > > > > On Fri, 9 Jun 2023 at 09:32, Amir Goldstein <amir73il@gmail.com> wrote:
> > > > > >
> > > > > > Miklos,
> > > > > >
> > > > > > This is the solution that we discussed for removing FMODE_NONOTIFY
> > > > > > from overlayfs real files.
> > > > > >
> > > > > > My branch [1] has an extra patch for remove FMODE_NONOTIFY, but
> > > > > > I am still testing the ovl-fsnotify interaction, so we can defer
> > > > > > that step to later.
> > > > > >
> > > > > > I wanted to post this series earlier to give more time for fsdevel
> > > > > > feedback and if these patches get your blessing and the blessing of
> > > > > > vfs maintainers, it is probably better that they will go through the
> > > > > > vfs tree.
> > > > > >
> > > > > > I've tested that overlay "fake" path are still shown in /proc/self/maps
> > > > > > and in the /proc/self/exe and /proc/self/map_files/ symlinks.
> > > > > >
> > > > > > The audit and tomoyo use of file_fake_path() is not tested
> > > > > > (CC maintainers), but they both look like user displayed paths,
> > > > > > so I assumed they's want to preserve the existing behavior
> > > > > > (i.e. displaying the fake overlayfs path).
> > > > >
> > > > > I did an audit of all ->vm_file  and found a couple of missing ones:
> > > >
>
> Hi Miklos,
>
> Trying to get back to this.
>
> > > > Wait, but why only ->vm_file?
> >
> > Because we don't get to intercept vm_ops, so anything done through
> > mmaps will not go though overlayfs.   That would result in apparmor
> > missing these, for example.
> >
>
> As discussed, unless we split ovl realfile to 3 variants: lower/upper/mmap
> vm_file will be a backing_file and so will the read/write realfile,
> sometimes the low level helpers need the real path and sometimes
> then need the fake path.
>
> > > > We were under the assumption the fake path is only needed
> > > > for mapped files, but the list below suggests that it matters
> > > > to other file operations as well...
> > > >
> > > > >
> > > > > dump_common_audit_data
>
> This is an example of a generic helper.
> There is no way of knowing if it really wants the real or fake path.
> It depends if the audit rule was set on ovl path or on real path,
> but if audit rule was set on real path, we should not let the fake path
> avert the audit, same as we should not have let the real file avert
> fsnotify events.
>
> It seems to me the audit rules are set on inodes and compare
> st_dev/st_ino, so it is more likely that for operations on the realfile,
> the real path makes more sense to print.
>
> > > > > ima_file_mprotect
>
> From the little experience I have with overlayfs and IMA,
> nothing works much wrt measuring and verifying operations
> on the real files.
>
> > > > > common_file_perm (I don't understand the code enough to know whether
> > > > > it needs fake dentry or not)
> > > > > aa_file_perm
> > > > > __file_path_perm
>
> Same rationale as in audit.
> We do not want to avert permission hooks for rules that were set
> on the real inode/path, so it makes way more sense to use real path
> in these common helpers.
>
> Printing the wrong path is not as bad as not printing an audit!
>
> > > > > print_bad_pte
>
> I am not very concerned about printing real path in kmsg errors.
> Those errors are more likely cause by underlying fs/block issues anyway?
>
> > > > > file_path
>
> Too low level.
> Call sites that need to print the fake path can use d_path()
>
> > > > > seq_print_user_ip
>
> Yes.
>
> > > > > __mnt_want_write_file
> > > > > __mnt_drop_write_file
> > > > > file_dentry_name
>
> Too low level.
>
> > > > >
> > > > > Didn't go into drivers/ and didn't follow indirect calls (e.g.
> > > > > f_op->fsysnc).  I also may have missed something along the way, but my
> > > > > guess is that I did catch most cases.
> > > >
> > > > Wow. So much for 3-4 special cases...
> > > >
> > > > Confused by some of the above.
> > > >
> > > > Why would we want __mnt_want_write_file on the fake path?
> > > > We'd already taken __mnt_want_write on overlay and with
> > > > real file we need __mnt_want_write on the real path.
> >
> > It's for write faults on memory maps.   The code already branches on
> > file->f_mode, I don't think it would be a big performance hit to check
> > FMODE_FAKE_PATH.
> >
>
> Again, FMODE_BACKING is set on the same realfile that is used
> for read/write_iter. Unless you will be willing to allocate two realfiles.
>
> I added a patch to take mnt_writers refcount for both fake and real path
> for writable file.
> Page faults only need to take freeze protection on real sb. no?
>
> > > >
> > > > For IMA/LSMs, I'd imagine that like fanotify, they would rather get
> > > > the real path where the real policy is stored.
> > > > If some log files end with relative path instead of full fake path
> > > > it's probably not the worst outcome.
> > > >
> > > > Thoughts?
> > >
> > > Considering the results of your audit, I think I prefer to keep
> > > f_path fake and store real_path in struct file_fake for code
> > > that wants the real path.
> > >
> > > This will keep all logic unchanged, which is better for my health.
> > > and only fsnotify (for now) will start using f_real_path() to
> > > generate events on real fs objects.
> >
> > That's also an option.
> >
> > I think f_fake_path() would still be a move in the right direction.
> > We have 46 instances of file_dentry() currently and of those special
> > cases most are cosmetic, while missing file_dentry() ones are
> > crashable.
> >
>
> Here is what I have so far:
>
> https://github.com/amir73il/linux/commits/ovl_fake_path
>
> I tested it with fstests and nothing popped up.
> If this looks like a good start to you, I can throw it at linux-next and
> see and anything floats.
>

Hi Paul,

I would like to ask for your help to confirm my theory about
"overlayfs backing files" and LSM/IMA/audit hooks.

First, let me start with some background for the audience.

An overlayfs typically has lower+upper underlying layers, let's assume
for the sake of discussion that both lower and upper are on an xfs
mount at /mnt/xfs and that overlayfs is mounted at /mnt/ovl.

When a user opens a file from overlayfs mount path, an additional
internal "backing file" is opened by overlayfs on either the lower or
upper xfs path (e.g. /mnt/xfs/lower/foo).

In this example, security_file_open() will be called for the overlayfs
file at "/mnt/ovl/foo" with user credential and then backing_file_open()
will once again call security_file_open() with the backing file at
"/mnt/xfs/lower/foo" with the stored credentials of the user that mounted
overlayfs (i.e. mounter credentials). I guess you already know this part.

The problem is that the backing file's f_path is not "/mnt/xfs/lower/foo"
but rather "/mnt/ovl/foo". It has always been this way with overlayfs
files, more or less. The reason is so users will not be exposed to the
real path via /proc/<pid>/maps. backing files are not in the process
fd table, so the real path is never exposed via /proc/<pid>/fd/.

This special condition is sometimes referred to as "fake" path -
The helper open_with_fake_path() was renamed to backing_file_open()
in commit 62d53c4a1dfe ("fs: use backing_file container for internal files
with "fake" f_path").

I knew for a long time that fsnotify hooks couldn't handle files with
fake path correctly and that is the reason that I introduced the backing_file
container and file_real_path(), so that fsnotify can get to the real path
of overlayfs backing files (e.g. "/mnt/xfs/lower/foo") and fsnotify watches
on the underlying xfs filesystem could be respected.

My theory is that all the LSM hooks as well as the non-LSM IMA/audit
hooks, always need the real path of overlayfs backing files in order to
enforce their policy and not the fake path.

Mimi has know for some time that IMA does not work well in conjunction
with overlayfs and more specifically, an IMA policy that wants to measure
all files on the underlying xfs does not work well when overlayfs is using
backing files on xfs.

For example, if ima_file_free() would sometimes use the inode file->f_inode
and sometimes use d_inode(file->f_path.dentry), very bad things would
happen because it is not the same object.

For this reason I posted this IMA patch:
https://lore.kernel.org/linux-fsdevel/20230913073755.3489676-1-amir73il@gmail.com/

But I don't like this solution - there are endless places that may
require this fix - I suspect that many of the LSM modules are "broken"
in the same manner when it comes to overlayfs backing files.

I would rather have backing_file f_path always be the real path and
the few places that need the fake path would opt in for it, as I've
implemented in this patch set [1].

To prove my theory that this change is correct for all LSM/audit hooks,
I audited the hooks that accept file or path arguments.

It is important to note that the security_path_* hooks are never called
by overlayfs itself on the underlying filesystem paths.

overlayfs only ever calls the vfs helpers (e.g. vfs_mkdir()) which
call the security_inode_* and security_dentry_* hooks on the underlying
filesystem objects.

Likewise, security_mmap_file() and security_file_mprotect() are also
called at the "syscall level" and overlayfs never calls any vfs helpers
that call those hooks when mmaping the backing file.
Arguably, overlayfs should call security_mmap_file() explicitly
in ovl_mmap() on the real backing file.

Other security_file_* LSM hooks may very well be called by
overlayfs when operating on the backing file, for example
security_file_permission() from ovl_read_iter() which reads from
the backing file.

But in all those security_file_* LSM hooks, just like my examples
with ima_file_free() and fsnotify_open(), there is never going to
be a case where the LSM module would care about the overlayfs
fake path, and the code is always going to be more safe if the
assumption that d_inode(file->f_path.dentry) is the same as
file->f_inode holds for overlayfs backing files.

Was I able to explain the problem?

Was I able to explain why I think that the proposed change [1]
is safe and good for all the LSM modules, IMA and audit?

Do you agree with my theory?

Would you be able to run an audit integration test and other
LSM tests if available on my branch to let me know if it breaks
anything?

I could push this to linux-next via overlayfs tree, but because
it is a very subtle change, I wanted to get some ACK from
Miklos and from you on the concept first.

If possible to get an early audit/LSM test run that would be
awesome, because I don't really know which audit/LSM tests
are regularly run by bots on linux-next.

Thanks,
Amir.

[1] https://github.com/amir73il/linux/commits/ovl_fake_path