mbox series

[v2,0/3] fanotify support for btrfs sub-volumes

Message ID 20231026155224.129326-1-amir73il@gmail.com (mailing list archive)
Headers show
Series fanotify support for btrfs sub-volumes | expand

Message

Amir Goldstein Oct. 26, 2023, 3:52 p.m. UTC
Jan,

As agreed on the review of v1 [1], we do not need any vfs changes
to support fanotify on btrfs sub-volumes and we can enable setting
marks on btrfs sub-volumes simply by caching the fsid in the mark
object instead of the connector.

This is the would be man page update to clarify the meaning of fsid
as it is reflected in this patch set:

fsid

  This is a unique identifier of the filesystem containing the object
  associated with the event.  It is a structure of type __kernel_fsid_t
  and contains the same value reported in  f_fsid  when calling
  statfs(2) with the same pathname argument that was used for
  fanotify_mark(2).  Note that some filesystems (e.g., btrfs(5)) report
  non-uniform values of f_fsid on different objects of the same filesystem.
  In these cases, if fanotify_mark(2) is called several times with different
  pathname values, the fsid value reported in events will match f_fsid
  associated  with at least one of those pathname values.

Thanks,
Amir.

[1] https://lore.kernel.org/r/CAOQ4uxg9wjESoCFNDADbneF0-nW4xVHHV3Rhhp=gJwAs=S83dQ@mail.gmail.com/

Amir Goldstein (3):
  fanotify: store fsid in mark instead of in connector
  fanotify: report the most specific fsid for btrfs
  fanotify: support setting marks in btrfs sub-volumes

 fs/notify/fanotify/fanotify.c      | 21 ++++--------
 fs/notify/fanotify/fanotify.h      | 10 ++++++
 fs/notify/fanotify/fanotify_user.c | 31 ++++++++----------
 fs/notify/mark.c                   | 52 +++++-------------------------
 include/linux/fsnotify_backend.h   | 18 +++++------
 5 files changed, 47 insertions(+), 85 deletions(-)

Comments

Jan Kara Oct. 26, 2023, 4:09 p.m. UTC | #1
On Thu 26-10-23 18:52:21, Amir Goldstein wrote:
> As agreed on the review of v1 [1], we do not need any vfs changes
> to support fanotify on btrfs sub-volumes and we can enable setting
> marks on btrfs sub-volumes simply by caching the fsid in the mark
> object instead of the connector.
> 
> This is the would be man page update to clarify the meaning of fsid
> as it is reflected in this patch set:
> 
> fsid
> 
>   This is a unique identifier of the filesystem containing the object
>   associated with the event.  It is a structure of type __kernel_fsid_t
>   and contains the same value reported in  f_fsid  when calling
>   statfs(2) with the same pathname argument that was used for
>   fanotify_mark(2).  Note that some filesystems (e.g., btrfs(5)) report
>   non-uniform values of f_fsid on different objects of the same filesystem.
>   In these cases, if fanotify_mark(2) is called several times with different
>   pathname values, the fsid value reported in events will match f_fsid
>   associated  with at least one of those pathname values.

Thanks! The patchset looks good to me but I don't want to queue it now so
shortly before the merge window opens. So I'll queue it into my tree once
I'll push out changes for the merge window.

								Honza
Christoph Hellwig Oct. 27, 2023, 5:46 a.m. UTC | #2
As per the discussion in the last round:

Hard-NAKed-by: Christoph Hellwig <hch@lst.de>

We need to solve the whole btrfs subvolume st_dev thing out properly
and not leak these details in fanotify.
Amir Goldstein Oct. 27, 2023, 6:03 a.m. UTC | #3
On Fri, Oct 27, 2023 at 8:46 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> As per the discussion in the last round:
>
> Hard-NAKed-by: Christoph Hellwig <hch@lst.de>
>
> We need to solve the whole btrfs subvolume st_dev thing out properly
> and not leak these details in fanotify.
>

With all due respect, your NACK is uncalled for.

Did you look at the patches?
Did you actually study what they do?
Please point out a single line of code that leaks details to fanotify
as you claim.

The "details" that fanotify reports and has reported since circa v5.1
are the same details available to any unprivileged user via calls
to statfs(2) and name_to_handle_at(2).

The v2 patches do not change anything in that regard.
This is an internal fanotify detail (whether we allow setting a
watch on an inode inside a sub-volume), which does not expose
any new details to usersapce. It has nothing to do with your
campaign to fix the btrfs non-uniform st_dev/f_fsid issue.

Thanks,
Amir.
Christoph Hellwig Oct. 27, 2023, 6:08 a.m. UTC | #4
On Fri, Oct 27, 2023 at 09:03:43AM +0300, Amir Goldstein wrote:
> With all due respect, your NACK is uncalled for.

It abosolute is not.  We must not spread the broken btrfs behavior.

> The "details" that fanotify reports and has reported since circa v5.1
> are the same details available to any unprivileged user via calls
> to statfs(2) and name_to_handle_at(2).

Yes, and that's very broken.  btrfs sneaked this in and we need to
fix it.

Again:

Extra-hard-Nacked-by: Christoph Hellwig <hch@lst.de>

and I'm a little sad that you're even arguing for sneaking such broken
thing in.
Amir Goldstein Oct. 27, 2023, 6:33 a.m. UTC | #5
On Fri, Oct 27, 2023 at 9:08 AM Christoph Hellwig <hch@infradead.org> wrote:
>
> On Fri, Oct 27, 2023 at 09:03:43AM +0300, Amir Goldstein wrote:
> > With all due respect, your NACK is uncalled for.
>
> It abosolute is not.  We must not spread the broken btrfs behavior.
>
> > The "details" that fanotify reports and has reported since circa v5.1
> > are the same details available to any unprivileged user via calls
> > to statfs(2) and name_to_handle_at(2).
>
> Yes, and that's very broken.  btrfs sneaked this in and we need to
> fix it.
>
> Again:
>
> Extra-hard-Nacked-by: Christoph Hellwig <hch@lst.de>
>
> and I'm a little sad that you're even arguing for sneaking such broken
> thing in.

OK. You are blaming me for attempting to sneak in a broken feature
and I have blamed you for trying take my patches hostage to
promote your agenda.
I hope that we are both wrong.

Suppose that we follow up on your plan to fix btrfs, that it is
all done and everyone is happy.

You said it yourself, that we cannot risk regressions by changing
the existing st_dev/f_fsid without an opt-in mount/mkfs option.
Is that correct?

If that is the case, fanotify will need to continue reporting the fsid's
exactly as the user observes them on the legacy btrfs filesystems.
The v2 patches I posted are required to make that possible.

However, I think we can reach a compromise here.
My main goal with the recent f_fsid series is to allow fanotify
to be used as (unprivileged) inotify drop-in replacement.

To do that, there is no need to allow btrfs sb/mount watch
on sub-volumes, so I don't mind keeping the ban on sb/mount
watches on btrfs sub-volume and relaxing only inode watch
inside btrfs sub-volume.

IOW, on legacy btrfs sub-volumes, users could use fanotify
to watch changes/access to a file and get events with the same
fsid that the user observed with statfs(2) on that file.

Migrating to new btrfs sub-volumes (i.e. with unique sb/vfsmount)
would enable the feature of fanotify sb/mount watch, so there is
one more incentive for the community to fix btrfs fsid's and for
users to actually make this migration.

Will that address your concerns?

Thanks,
Amir.
Christoph Hellwig Oct. 27, 2023, 7:30 a.m. UTC | #6
On Fri, Oct 27, 2023 at 09:33:19AM +0300, Amir Goldstein wrote:
> OK. You are blaming me for attempting to sneak in a broken feature
> and I have blamed you for trying take my patches hostage to
> promote your agenda.

I'm not blaming you for anything.  But I absolutely reject spreading
this broken behavior to core.  That's why there is hard NAK on this
patchs. 

> 
> If that is the case, fanotify will need to continue reporting the fsid's
> exactly as the user observes them on the legacy btrfs filesystems.
> The v2 patches I posted are required to make that possible.

The point is tht you simply can't use fanotify on a btrfs file system
with the broken behavior.  That's what btrfs gets for doing this
broken behavior to start with.
Jan Kara Oct. 27, 2023, 3:47 p.m. UTC | #7
On Fri 27-10-23 00:30:29, Christoph Hellwig wrote:
> On Fri, Oct 27, 2023 at 09:33:19AM +0300, Amir Goldstein wrote:
> > OK. You are blaming me for attempting to sneak in a broken feature
> > and I have blamed you for trying take my patches hostage to
> > promote your agenda.
> 
> I'm not blaming you for anything.  But I absolutely reject spreading
> this broken behavior to core.  That's why there is hard NAK on this
> patchs. 
> 
> > 
> > If that is the case, fanotify will need to continue reporting the fsid's
> > exactly as the user observes them on the legacy btrfs filesystems.
> > The v2 patches I posted are required to make that possible.
> 
> The point is tht you simply can't use fanotify on a btrfs file system
> with the broken behavior.  That's what btrfs gets for doing this
> broken behavior to start with.

Well, fanotify was never disabled on btrfs and presumably there are users.
What we blocked (exactly out of caution to not spread questionable
behavior) is placing of marks using a path whose fsid (from statfs(2)) is
different from filesystem root fsid when using FID-mode fanotify group
(i.e. groups where we report fsid + fhandle for each event instead of open
file descriptor). So effectively people currently cannot place marks on
non-root subvolume paths of btrfs for such fanotify groups.

Now given it is uncertain how exactly will filesystems end up presenting
subvolumes to VFS (and consequently to userspace) I agree it is probably
premature to allow placing superblock or mount marks on such paths because
the meaning could change when btrfs changes its presentation and we'd be
just adding ourselves more headaches with backward compatibility for no
great reasons. But so far I see no good reason to keep forbidding adding
inode marks on such paths as Amir suggests. I'll think about that.

								Honza