diff mbox series

[5/6] fsmount: do not use legacy MS_ flags

Message ID 20180920151214.15484-6-mszeredi@redhat.com (mailing list archive)
State New, archived
Headers show
Series mount-api: fixes and cleanups | expand

Commit Message

Miklos Szeredi Sept. 20, 2018, 3:12 p.m. UTC
What happens if we introduce new flags for fsmount(2) and are already out
of flags for mount(2)?  I see a big mess that way.

So let's instead start a clean new set, to be used in the new API.

The MS_RELATIME flag was accepted but ignored.  Simply leave this out of
the new set, since "relatime" is the default.

The MNT_ prefix is already used by libmount, and we don't want any
confusion arising from that.  MOUNT_ feels a bit too long, so just go with
the short and sweet M_ prefix.

Signed-off-by: Miklos Szeredi <mszeredi@redhat.com>
---
 fs/namespace.c             | 28 +++++++++++++---------------
 include/uapi/linux/mount.h |  9 +++++++++
 2 files changed, 22 insertions(+), 15 deletions(-)

Comments

David Howells Sept. 21, 2018, 3:07 p.m. UTC | #1
Miklos Szeredi <mszeredi@redhat.com> wrote:

> What happens if we introduce new flags for fsmount(2) and are already out
> of flags for mount(2)?  I see a big mess that way.
> 
> So let's instead start a clean new set, to be used in the new API.

If we must.  But let's not call them just M_* please.  Let's call them
MOUNT_ATTR_* or something.

> The MS_RELATIME flag was accepted but ignored.  Simply leave this out of
> the new set, since "relatime" is the default.

Can we make RELATIME, STRICTATIME and NOATIME an enum rather than individual
flags?

	#define MOUNT_ATTR_RDONLY	0x01
	#define MOUNT_ATTR_NOSUID	0x02
	#define MOUNT_ATTR_NODEV	0x04
	#define MOUNT_ATTR_NOEXEC	0x08
	#define MOUNT_ATTR_RELATIME	0x00
	#define MOUNT_ATTR_NOATIME	0x10
	#define MOUNT_ATTR_STRICTATIME	0x20
	#define MOUNT_ATTR_ATIME_MASK	0x30
	#define MOUNT_ATTR_NODIRATIME	0x40

We can also use these for a mount_setattr() syscall:

	mount_setattr(int dfd, const char *path, unsigned int atflags,
		      unsigned int attr_values,
		      unsigned int attr_mask);

where atflags can potentially include AT_RECURSIVE.

David
Miklos Szeredi Sept. 21, 2018, 3:28 p.m. UTC | #2
On Fri, Sep 21, 2018 at 5:07 PM, David Howells <dhowells@redhat.com> wrote:
> Miklos Szeredi <mszeredi@redhat.com> wrote:
>
>> What happens if we introduce new flags for fsmount(2) and are already out
>> of flags for mount(2)?  I see a big mess that way.
>>
>> So let's instead start a clean new set, to be used in the new API.
>
> If we must.  But let's not call them just M_* please.  Let's call them
> MOUNT_ATTR_* or something.

Oh well.

>> The MS_RELATIME flag was accepted but ignored.  Simply leave this out of
>> the new set, since "relatime" is the default.
>
> Can we make RELATIME, STRICTATIME and NOATIME an enum rather than individual
> flags?

Sure.

>
>         #define MOUNT_ATTR_RDONLY       0x01
>         #define MOUNT_ATTR_NOSUID       0x02
>         #define MOUNT_ATTR_NODEV        0x04
>         #define MOUNT_ATTR_NOEXEC       0x08
>         #define MOUNT_ATTR_RELATIME     0x00
>         #define MOUNT_ATTR_NOATIME      0x10
>         #define MOUNT_ATTR_STRICTATIME  0x20
>         #define MOUNT_ATTR_ATIME_MASK   0x30
>         #define MOUNT_ATTR_NODIRATIME   0x40
>
> We can also use these for a mount_setattr() syscall:
>
>         mount_setattr(int dfd, const char *path, unsigned int atflags,
>                       unsigned int attr_values,
>                       unsigned int attr_mask);
>
> where atflags can potentially include AT_RECURSIVE.

Indeed.  Also, shouldn't these include the propagation flags?

Thanks,
Miklos
David Howells Sept. 21, 2018, 3:37 p.m. UTC | #3
Miklos Szeredi <miklos@szeredi.hu> wrote:

> Indeed.  Also, shouldn't these include the propagation flags?

I guess so, though I'd be tempted to put those in a separate set.  Also, I'm
not sure whether fsmount() should deal with prop flags or whether that should
be something move_mount() needs to deal with (ie. does propagation need
applying at the time of splicing, depending on the parent).

David
Christian Brauner Sept. 21, 2018, 3:54 p.m. UTC | #4
On Fri, Sep 21, 2018 at 04:07:55PM +0100, David Howells wrote:
> Miklos Szeredi <mszeredi@redhat.com> wrote:
> 
> > What happens if we introduce new flags for fsmount(2) and are already out
> > of flags for mount(2)?  I see a big mess that way.
> > 
> > So let's instead start a clean new set, to be used in the new API.

If I may chime in. I think this is a really good idea.

> 
> If we must.  But let's not call them just M_* please.  Let's call them
> MOUNT_ATTR_* or something.
> 
> > The MS_RELATIME flag was accepted but ignored.  Simply leave this out of
> > the new set, since "relatime" is the default.
> 
> Can we make RELATIME, STRICTATIME and NOATIME an enum rather than individual
> flags?
> 
> 	#define MOUNT_ATTR_RDONLY	0x01
> 	#define MOUNT_ATTR_NOSUID	0x02
> 	#define MOUNT_ATTR_NODEV	0x04
> 	#define MOUNT_ATTR_NOEXEC	0x08
> 	#define MOUNT_ATTR_RELATIME	0x00
> 	#define MOUNT_ATTR_NOATIME	0x10
> 	#define MOUNT_ATTR_STRICTATIME	0x20
> 	#define MOUNT_ATTR_ATIME_MASK	0x30
> 	#define MOUNT_ATTR_NODIRATIME	0x40
> 
> We can also use these for a mount_setattr() syscall:

So a - maybe stupid - question I had when recently working on a patch
was how to add new mount properties in the new api.
What I would expect is that once the new mount API has landed the old
mount() syscall should not see any new features added since I take it
that we want userspace to very slowly over time have sufficient
incentive to only use the new mount API.

So from reading the patch I got the impression that superblock mount
options passed via fsconfig() are passed as strings like "ro" and are
translated into approriate objects (e.g. flags etc.) by the kernel. That
seems like a future proof mechanism to extend mount options for
userspace without having to worry about exceeding any integer types in
the future. Maybe this would make sense for non-superblock options as
well? I can see that this is less performant that checking flags and
string parsing is a thing that is not particularly nice but it would be
one option to solve the running out of flags problem.

> 
> 	mount_setattr(int dfd, const char *path, unsigned int atflags,
> 		      unsigned int attr_values,
> 		      unsigned int attr_mask);

If we go with the flag arguments wouldn't it make sense to use a larger
integer type?

> 
> where atflags can potentially include AT_RECURSIVE.

Very much in favor of having this operate recursively!

Christian
David Howells Sept. 21, 2018, 4:52 p.m. UTC | #5
Christian Brauner <christian@brauner.io> wrote:

> So from reading the patch I got the impression that superblock mount
> options passed via fsconfig() are passed as strings like "ro" and are
> translated into approriate objects (e.g. flags etc.) by the kernel.

I'm having second throughts about that - at least for "ro": that one is
particularly problematic as it governs how source devices are opened.  I'm
kind of tempted to pass that as a flag to fsopen().

What I'd like to do in btrfs, ext4, etc. is open blockdevs as I get the
parameters that enumerate them - but I have to hold the looked-up paths till
the validate/get_tree stage so that I know the final ro/rw state before I can
do the opening.

> That seems like a future proof mechanism to extend mount options for
> userspace without having to worry about exceeding any integer types in the
> future. Maybe this would make sense for non-superblock options as well? I
> can see that this is less performant that checking flags and string parsing
> is a thing that is not particularly nice but it would be one option to solve
> the running out of flags problem.

Al disliked the idea of setting up a separate context to define the mount
options.

> > 	mount_setattr(int dfd, const char *path, unsigned int atflags,
> > 		      unsigned int attr_values,
> > 		      unsigned int attr_mask);
> 
> If we go with the flag arguments wouldn't it make sense to use a larger
> integer type?

You can't - at least not directly through syscall args.  They are 32-bit on a
32-bit system.

> > where atflags can potentially include AT_RECURSIVE.
> 
> Very much in favor of having this operate recursively!

It gets complicated with propagation.

David
Christian Brauner Sept. 22, 2018, 1:21 p.m. UTC | #6
On Fri, Sep 21, 2018 at 05:52:36PM +0100, David Howells wrote:
> Christian Brauner <christian@brauner.io> wrote:
> 
> > So from reading the patch I got the impression that superblock mount
> > options passed via fsconfig() are passed as strings like "ro" and are
> > translated into approriate objects (e.g. flags etc.) by the kernel.
> 
> I'm having second throughts about that - at least for "ro": that one is
> particularly problematic as it governs how source devices are opened.  I'm
> kind of tempted to pass that as a flag to fsopen().
> 
> What I'd like to do in btrfs, ext4, etc. is open blockdevs as I get the
> parameters that enumerate them - but I have to hold the looked-up paths till
> the validate/get_tree stage so that I know the final ro/rw state before I can
> do the opening.
> 
> > That seems like a future proof mechanism to extend mount options for
> > userspace without having to worry about exceeding any integer types in the
> > future. Maybe this would make sense for non-superblock options as well? I
> > can see that this is less performant that checking flags and string parsing
> > is a thing that is not particularly nice but it would be one option to solve
> > the running out of flags problem.
> 
> Al disliked the idea of setting up a separate context to define the mount
> options.

I hope thinking about mount flag exhaustion at this point is not too
hijacking this thread too much. But another option I was thinking about was to
split the mount flags into different sets where the sets can be
differentiated by a command.

enum {
        MOUNT_ATTR_PROPAGATION = 1
        MOUNT_ATTR_SECURITY
        MOUNT_ATTR_FSTIME
}

Let's say split mount propagation flags into a set identified by
MOUNT_ATTR_PROPAGATION:
#define MOUNT_ATTR_UNBINDABLE   (1<<0)  /* change to unbindable */
#define MOUNT_ATTR_PRIVATE      (1<<1)  /* change to private */
#define MOUNT_ATTR_SLAVE        (1<<2)  /* change to slave */
#define MOUNT_ATTR_SHARED       (1<<3)  /* change to shared */

and another set
MOUNT_ATTR_SECURITY:
#define MOUNT_ATTR_RDONLY        (1<<0) /* Mount read-only */
#define MOUNT_ATTR_NOSUID        (1<<1) /* Ignore suid and sgid bits */
#define MOUNT_ATTR_NODEV         (1<<2) /* Disallow access to device special files */
#define MOUNT_ATTR_NOEXEC        (1<<3) /* Disallow program execution */

and another set
MOUNT_ATTR_FSTIME:
#define MOUNT_ATTR_NOATIME      (1<<0)  /* Do not update access times. */
#define MOUNT_ATTR_NODIRATIME   (1<<1)  /* Do not update directory access times */

and so on...

So you could e.g. add another unsigned int cmd argument to
mount_setattr():

mount_setattr(int dfd, const char *path, unsigned int atflags,
              unsigned int attr_cmd,
	      unsigned int attr_values,
	      unsigned int attr_mask);

Then - similiar to fsconfig()'s FS_CONFIG_SET_{FLAG,STRING,...} - you
would pass:

/* set propagation */
mount_setattr(dfd, path, atflags,
              MOUNT_ATTR_PROPAGATION,
	      vals,
	      propagation_flagmask);

/* set security */
mount_setattr(dfd, path, atflags,
              MOUNT_ATTR_SECURITY,
	      vals,
	      security_flagmask);

/* set time */
mount_setattr(dfd, path, atflags,
              MOUNT_ATTR_FSTIME,
	      vals,
	      fstime_flagmask);

and then finally have an additional command like:

/* apply changes */
mount_setattr(dfd, path, atflags, MOUNT_ATTR_APPLY, vals, 0);

that applies all the properties.

In kernel you could then either have separate flags in the corresponding
structs of interest or bit arrays of arbitrary length or whatever.
Each of the flag setting commands before the MOUNT_ATTR_APPLY would then
perform verification whether the combination of flags makes sense and
immediately return back an error to userspace but only the
MOUNT_ATTR_APPLY call would actually commit the changes.

> 
> > > 	mount_setattr(int dfd, const char *path, unsigned int atflags,
> > > 		      unsigned int attr_values,
> > > 		      unsigned int attr_mask);
> > 
> > If we go with the flag arguments wouldn't it make sense to use a larger
> > integer type?
> 
> You can't - at least not directly through syscall args.  They are 32-bit on a
> 32-bit system.

Ah, 32bit systems. I tend to forget that they exist. :)

> 
> > > where atflags can potentially include AT_RECURSIVE.
> > 
> > Very much in favor of having this operate recursively!
> 
> It gets complicated with propagation.

Sure, I totally understand.

Christian
David Howells Sept. 22, 2018, 3:48 p.m. UTC | #7
Christian Brauner <christian@brauner.io> wrote:

> mount_setattr(int dfd, const char *path, unsigned int atflags,
>               unsigned int attr_cmd,
> 	      unsigned int attr_values,
> 	      unsigned int attr_mask);

Whilst you can have up to six arguments on a syscall, I seem to remember that
6-arg syscalls require special handlers on some systems due to register count.

David
Christian Brauner Sept. 22, 2018, 4:14 p.m. UTC | #8
On Sat, Sep 22, 2018 at 04:48:32PM +0100, David Howells wrote:
> Christian Brauner <christian@brauner.io> wrote:
> 
> > mount_setattr(int dfd, const char *path, unsigned int atflags,
> >               unsigned int attr_cmd,
> > 	      unsigned int attr_values,
> > 	      unsigned int attr_mask);
> 
> Whilst you can have up to six arguments on a syscall, I seem to remember that
> 6-arg syscalls require special handlers on some systems due to register count.

Sure, but if we wanted to we might just get the syscall down to five
arguments even with my suggestion. Of course, I'm not sure what the
reasons for all of the other arguments to this function are since it's
not yet implemented. Seems that attr_values and attr_mask could be
compacted to a single attr_mask maybe? :)

Christian
David Howells Sept. 23, 2018, 10:45 p.m. UTC | #9
Christian Brauner <christian@brauner.io> wrote:

> Of course, I'm not sure what the reasons for all of the other arguments to
> this function are since it's not yet implemented.

Well, dfd, path and atflags are pretty standard.  atflags conveys things like
AT_EMPTY_PATH or AT_SYMLINK_NOFOLLOW and dfd conveys a file descriptor
pointing to a vfs object or AT_FDCWD.

> Seems that attr_values and attr_mask could be compacted to a single
> attr_mask maybe?

If you don't have a mask, you can't really do recursion.  Without the mask,
you have to supply the entire set of options absolutely - and this would get
stamped on everything in the target range.

With a mask in combination with the set of desired values, you can turn on or
off a specific subset of the attributes without affecting the rest - without
needing to know the rest.

David
Christian Brauner Sept. 23, 2018, 11:01 p.m. UTC | #10
On Sun, Sep 23, 2018 at 11:45:17PM +0100, David Howells wrote:
> Christian Brauner <christian@brauner.io> wrote:
> 
> > Of course, I'm not sure what the reasons for all of the other arguments to
> > this function are since it's not yet implemented.
> 
> Well, dfd, path and atflags are pretty standard.  atflags conveys things like
> AT_EMPTY_PATH or AT_SYMLINK_NOFOLLOW and dfd conveys a file descriptor
> pointing to a vfs object or AT_FDCWD.
> 
> > Seems that attr_values and attr_mask could be compacted to a single
> > attr_mask maybe?
> 
> If you don't have a mask, you can't really do recursion.  Without the mask,
> you have to supply the entire set of options absolutely - and this would get
> stamped on everything in the target range.
> 
> With a mask in combination with the set of desired values, you can turn on or
> off a specific subset of the attributes without affecting the rest - without
> needing to know the rest.

Ok, understood. What about passing the different attrs as a struct?

struct mount_attr {
        unsigned int attr_cmd,
        unsigned int attr_values,
        unsigned int attr_mask,

};

mount_setattr(int dfd, const char *path, unsigned int atflags,
              struct mount_attr *attr);

I find that to be a little cleaner in all honesty.
One could also add a version argument similar to what we currently do
for vfs fcaps so that kernel and userspace can easily navigate
compabitility when a new member gets added or removed in later releases.

Christian
David Howells Sept. 24, 2018, 6:50 a.m. UTC | #11
Christian Brauner <christian@brauner.io> wrote:

> Ok, understood. What about passing the different attrs as a struct?
> 
> struct mount_attr {
>         unsigned int attr_cmd,
>         unsigned int attr_values,
>         unsigned int attr_mask,
> 
> };
> 
> mount_setattr(int dfd, const char *path, unsigned int atflags,
>               struct mount_attr *attr);
> 
> I find that to be a little cleaner in all honesty.
> One could also add a version argument similar to what we currently do
> for vfs fcaps so that kernel and userspace can easily navigate
> compabitility when a new member gets added or removed in later releases.

Yeah, we could do that - it's not like I expect mount_setattr() to have to be
particularly performant in the user interface.  I would put the attr_cmd in
the argument list, probably, so that you can use that to vary the struct in
future (say we run out of attribute bits).

David
Christian Brauner Sept. 24, 2018, 9:47 a.m. UTC | #12
On Mon, Sep 24, 2018 at 07:50:38AM +0100, David Howells wrote:
> Christian Brauner <christian@brauner.io> wrote:
> 
> > Ok, understood. What about passing the different attrs as a struct?
> > 
> > struct mount_attr {
> >         unsigned int attr_cmd,
> >         unsigned int attr_values,
> >         unsigned int attr_mask,
> > 
> > };
> > 
> > mount_setattr(int dfd, const char *path, unsigned int atflags,
> >               struct mount_attr *attr);
> > 
> > I find that to be a little cleaner in all honesty.
> > One could also add a version argument similar to what we currently do
> > for vfs fcaps so that kernel and userspace can easily navigate
> > compabitility when a new member gets added or removed in later releases.
> 
> Yeah, we could do that - it's not like I expect mount_setattr() to have to be
> particularly performant in the user interface.  I would put the attr_cmd in
> the argument list, probably, so that you can use that to vary the struct in
> future (say we run out of attribute bits).

Yes, that makes sense and mimicks standard ioctl() behavior. So

struct mount_attr {
        unsigned int attr_values,
        unsigned int attr_mask,
}

mount_setattr(int dfd, const char *path, unsigned int atflags,
              unsigned int attr_cmd, struct mount_attr *attr);

I have thought a little more about splitting up the mount flags into
sensible sets. I think the following four sets make sense:

enum {
        MOUNT_ATTR_PROPAGATION = 1,
        MOUNT_ATTR_SECURITY,
        MOUNT_ATTR_SYNC,
        MOUNT_ATTR_TIME,
};

MOUNT_ATTR_PROPAGATION:
#define MOUNT_ATTR_PRIVATE    (1<<0)
#define MOUNT_ATTR_SHARED     (1<<1)
#define MOUNT_ATTR_SLAVE      (1<<2)
#define MOUNT_ATTR_UNBINDABLE (1<<3)

MOUNT_ATTR_SECURITY:
#define MOUNT_ATTR_MANDLOCK     (1<<0)
#define MOUNT_ATTR_NODEV        (1<<1)   
#define MOUNT_ATTR_NOEXEC       (1<<2)  
#define MOUNT_ATTR_NOSUID       (1<<3)  
#define MOUNT_ATTR_NOREMOTELOCK (1<<4)
#define MOUNT_ATTR_RDONLY       (1<<5)  
#define MOUNT_ATTR_POSIXACL     (1<<6)
#define MOUNT_ATTR_SILENT       (1<<7)  

MOUNT_ATTR_SYNC
#define MOUNT_ATTR_DIRSYNC     (1<<0)
#define MOUNT_ATTR_SYNCHRONOUS (1<<1)

MOUNT_ATTR_TIME:
#define MOUNT_ATTR_LAZYTIME    (1<<0)
#define MOUNT_ATTR_NOATIME     (1<<1)
#define MOUNT_ATTR_NODIRATIME  (1<<2)
#define MOUNT_ATTR_RELATIME    (1<<3)
#define MOUNT_ATTR_STRICTATIME (1<<4)

If we ever run out of flags in a specific set I suggest to introduce a
new enum member of the same name with a version number appended and an
alias with a (obvs lower) version number for the old set. A concrete
example would be:

enum {
        MOUNT_ATTR_PROPAGATION = 1,
        MOUNT_ATTR_SECURITY,
        MOUNT_ATTR_SECURITY_1 = MOUNT_ATTR_SECURITY,
        MOUNT_ATTR_SYNC,
        MOUNT_ATTR_TIME,
        MOUNT_ATTR_SECURITY_2,
};

These flags will likely become AT_* flags or be tied to a syscall
afaict.

#define MS_REMOUNT      32
#define MS_BIND	        4096
#define MS_MOVE	        8192
#define MS_REC	        16384

Internal sb flags will not be part of the new mount attr sets. (They
should - imho - not be exposed to userspace at all.):

#define MS_KERNMOUNT    (1<<22)
#define MS_SUBMOUNT     (1<<26)
#define MS_NOREMOTELOCK (1<<27)
#define MS_NOSEC        (1<<28)
#define MS_BORN	        (1<<29)
#define MS_ACTIVE       (1<<30)
#define MS_NOUSER       (1<<31)

What remains is an odd duck that probably could be thrown into security
but also *shrug*

#define MS_I_VERSION    (1<<23)

Christian
David Howells Sept. 24, 2018, 12:37 p.m. UTC | #13
Christian Brauner <christian@brauner.io> wrote:

> I have thought a little more about splitting up the mount flags into
> sensible sets. I think the following four sets make sense:
>
> enum {
>         MOUNT_ATTR_PROPAGATION = 1,
>         MOUNT_ATTR_SECURITY,
>         MOUNT_ATTR_SYNC,
>         MOUNT_ATTR_TIME,
> };

Al (I think it was) has been against splitting them up before (I've previously
proposed splitting the topology propagation flags from the mount attributes).

> #define MOUNT_ATTR_NOATIME     (1<<1)
> #define MOUNT_ATTR_RELATIME    (1<<3)
> #define MOUNT_ATTR_STRICTATIME (1<<4)

These aren't independent, but are actually settings on the same dial, so I
would suggest that they shouldn't be separate flags.  I'm not sure about
LAZYTIME though.

> enum {
>         MOUNT_ATTR_PROPAGATION = 1,
>         MOUNT_ATTR_SECURITY,
>         MOUNT_ATTR_SECURITY_1 = MOUNT_ATTR_SECURITY,
>         MOUNT_ATTR_SYNC,
>         MOUNT_ATTR_TIME,
>         MOUNT_ATTR_SECURITY_2,
> };

In UAPI headers, always explicitly number your symbols, even in an enum, just
to make sure that the numbers don't get transparently changed by accident.

> These flags will likely become AT_* flags or be tied to a syscall
> afaict.
>
> #define MS_REMOUNT      32
> #define MS_BIND	        4096
> #define MS_MOVE	        8192
> #define MS_REC	        16384

MS_REMOUNT: fd = fspick(); fscommand(fd, FSCONFIG_CMD_RECONFIGURE);

MS_REMOUNT|MS_BIND: mount_setattr().

MS_BIND: fd = open_tree(..., OPEN_TREE_CLONE); move_mount(fd, "", ...);

MS_MOVE: fd = open_tree(..., 0); move_mount(fd, "", ...);

MS_REC: AT_RECURSIVE

> Internal sb flags will not be part of the new mount attr sets. (They
> should - imho - not be exposed to userspace at all.):

Agreed.

> What remains is an odd duck that probably could be thrown into security
> but also *shrug*
>
> #define MS_I_VERSION    (1<<23)

Um.  I think it would probably belong with atime settings.

David
Christian Brauner Sept. 24, 2018, 1:18 p.m. UTC | #14
On Mon, Sep 24, 2018 at 01:37:45PM +0100, David Howells wrote:
> Christian Brauner <christian@brauner.io> wrote:
> 
> > I have thought a little more about splitting up the mount flags into
> > sensible sets. I think the following four sets make sense:
> >
> > enum {
> >         MOUNT_ATTR_PROPAGATION = 1,
> >         MOUNT_ATTR_SECURITY,
> >         MOUNT_ATTR_SYNC,
> >         MOUNT_ATTR_TIME,
> > };
> 
> Al (I think it was) has been against splitting them up before (I've previously
> proposed splitting the topology propagation flags from the mount attributes).

Right, that request could probably be fulfilled by the first draft for
this idea that I had but didn't send out.
Basically, having a sequential enum that only ever gets bumped when we
run out of flags in a set, i.e.

enum {
        MOUNT_ATTR_SET_1 = 1,
        MOUNT_ATTR_SET_2 = 2,
        MOUNT_ATTR_SET_3 = 3,
        .
        .
        .
};

Then we would currently only define a single set
enum {
        MOUNT_ATTR_SET_1 = 1,
};

dump all the current mount flags we would like to support in there and
call it a day until we run out of flags at which point we introduce
MOUNT_ATTR_SET_2.
But honestly, I find defining cuts by forming sets of logically related
flags to be more intuitive and transparent for kernel- and userspace.
But I'm just another muppet with an opinion. :)

> 
> > #define MOUNT_ATTR_NOATIME     (1<<1)
> > #define MOUNT_ATTR_RELATIME    (1<<3)
> > #define MOUNT_ATTR_STRICTATIME (1<<4)
> 
> These aren't independent, but are actually settings on the same dial, so I
> would suggest that they shouldn't be separate flags.  I'm not sure about
> LAZYTIME though.

So what you or Miklos suggested before, namely making them an enum too?

> 
> > enum {
> >         MOUNT_ATTR_PROPAGATION = 1,
> >         MOUNT_ATTR_SECURITY,
> >         MOUNT_ATTR_SECURITY_1 = MOUNT_ATTR_SECURITY,
> >         MOUNT_ATTR_SYNC,
> >         MOUNT_ATTR_TIME,
> >         MOUNT_ATTR_SECURITY_2,
> > };
> 
> In UAPI headers, always explicitly number your symbols, even in an enum, just
> to make sure that the numbers don't get transparently changed by accident.

+1 and thanks for the tip!

> 
> > These flags will likely become AT_* flags or be tied to a syscall
> > afaict.
> >
> > #define MS_REMOUNT      32
> > #define MS_BIND	        4096
> > #define MS_MOVE	        8192
> > #define MS_REC	        16384
> 
> MS_REMOUNT: fd = fspick(); fscommand(fd, FSCONFIG_CMD_RECONFIGURE);
> 
> MS_REMOUNT|MS_BIND: mount_setattr().
> 
> MS_BIND: fd = open_tree(..., OPEN_TREE_CLONE); move_mount(fd, "", ...);
> 
> MS_MOVE: fd = open_tree(..., 0); move_mount(fd, "", ...);
> 
> MS_REC: AT_RECURSIVE
> 
> > Internal sb flags will not be part of the new mount attr sets. (They
> > should - imho - not be exposed to userspace at all.):
> 
> Agreed.
> 
> > What remains is an odd duck that probably could be thrown into security
> > but also *shrug*
> >
> > #define MS_I_VERSION    (1<<23)
> 
> Um.  I think it would probably belong with atime settings.
> 
> David
diff mbox series

Patch

diff --git a/fs/namespace.c b/fs/namespace.c
index 7671c1f6fc22..027b99b993c0 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -3332,7 +3332,7 @@  EXPORT_SYMBOL_GPL(kern_mount);
  * Create a kernel mount representation for a new, prepared superblock
  * (specified by fs_fd) and attach to an open_tree-like file descriptor.
  */
-SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags, unsigned int, ms_flags)
+SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags, unsigned int, m_flags)
 {
 	struct fs_context *fc;
 	struct file *file;
@@ -3347,30 +3347,28 @@  SYSCALL_DEFINE3(fsmount, int, fs_fd, unsigned int, flags, unsigned int, ms_flags
 	if ((flags & ~(FSMOUNT_CLOEXEC)) != 0)
 		return -EINVAL;
 
-	if (ms_flags & ~(MS_RDONLY | MS_NOSUID | MS_NODEV | MS_NOEXEC |
-			 MS_NOATIME | MS_NODIRATIME | MS_RELATIME |
-			 MS_STRICTATIME))
+	if (m_flags & ~(M_RDONLY | M_NOSUID | M_NODEV | M_NOEXEC |
+			 M_NOATIME | M_STRICTATIME | M_NODIRATIME))
 		return -EINVAL;
 
-	if (ms_flags & MS_RDONLY)
+	if ((m_flags & M_STRICTATIME) && (m_flags & M_NOATIME))
+		return -EINVAL;
+
+	if (m_flags & M_RDONLY)
 		mnt_flags |= MNT_READONLY;
-	if (ms_flags & MS_NOSUID)
+	if (m_flags & M_NOSUID)
 		mnt_flags |= MNT_NOSUID;
-	if (ms_flags & MS_NODEV)
+	if (m_flags & M_NODEV)
 		mnt_flags |= MNT_NODEV;
-	if (ms_flags & MS_NOEXEC)
+	if (m_flags & M_NOEXEC)
 		mnt_flags |= MNT_NOEXEC;
-	if (ms_flags & MS_NODIRATIME)
+	if (m_flags & M_NODIRATIME)
 		mnt_flags |= MNT_NODIRATIME;
 
-	if (ms_flags & MS_STRICTATIME) {
-		if (ms_flags & MS_NOATIME)
-			return -EINVAL;
-	} else if (ms_flags & MS_NOATIME) {
+	if (m_flags & M_NOATIME)
 		mnt_flags |= MNT_NOATIME;
-	} else {
+	else if (!(m_flags & M_STRICTATIME))
 		mnt_flags |= MNT_RELATIME;
-	}
 
 	f = fdget(fs_fd);
 	if (!f.file)
diff --git a/include/uapi/linux/mount.h b/include/uapi/linux/mount.h
index 3634e065836c..fce3513fc242 100644
--- a/include/uapi/linux/mount.h
+++ b/include/uapi/linux/mount.h
@@ -1,6 +1,15 @@ 
 #ifndef _UAPI_LINUX_MOUNT_H
 #define _UAPI_LINUX_MOUNT_H
 
+/* Mount flags passed to fsmount(2) */
+#define M_RDONLY	0x01
+#define M_NOSUID	0x02
+#define M_NODEV		0x04
+#define M_NOEXEC	0x08
+#define M_NOATIME	0x10
+#define M_STRICTATIME	0x20
+#define M_NODIRATIME	0x40
+
 /*
  * These are the fs-independent mount-flags: up to 32 flags are supported
  *