diff mbox

Read-only `slaves` with shared subtrees?

Message ID 20170921191434.GP5698@ram.oc3035372033.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Ram Pai Sept. 21, 2017, 7:14 p.m. UTC
On Wed, Sep 20, 2017 at 08:00:57PM -0700, Dawid Ciezarkiewicz wrote:
> On Wed, Sep 20, 2017 at 5:39 PM, Ram Pai <linuxram@us.ibm.com> wrote:
> > Anyway; so something like this should be possible without breaking
> > existing semantics.
> >
> > mount -o bind,remount,ro /mnt
> > mount --make-pass-on-access  /mnt
> >
> > anything that gets mounted under /mnt will inherit the
> > 'ro' attribute from its parent.  And when a mount-event propagates
> > to a read-only-slave-mount, that new mount will automatically
> > inherit the read-only attribute from its slave-parent.
> >
> > Dawid: will that work for you?
> 
> 
> Yes. It is even more universal.


Here is a patch that accomplishes the job. tested to work with
some simple use cases.  check if this works for you. If it does
than we will have to think through all the edge cases and make it
acceptable.


------------------------------------------------------
Signed-off-by: Ram Pai <linuxram@us.ibm.com>


------------------------------------------------------

Here is a small program that setsup a mount to enable inheritance
of the RW attribute of the mount.

/* pass_on_readonly.c */
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <sys/mount.h>

#define MS_STICKY_RW (1<<8)

int main(int argc, char *argv[])
{
	unsigned long flags = MS_STICKY_RW;

	if (argc < 2) {
		printf("only argc=%d\n", argc);
		exit (0);
	}
	if (mount(argv[1], argv[1], NULL, flags, NULL) == -1)
		perror("failed ");
	exit (0);
}

-------------------------------------------------------



my testcase is

a) setup a shared mount
	mkdir shared
	mount --bind shared shared
	mount --make-shared shared

b) setup a slave mount
	mkdir slave
	mount --bind shared slave
	mount --make-slave slave

c) make the slave readonly
	mount -o bind,remount,ro slave

d) setup the slave to for passing one its readonly attribute
	gcc pass_on_readonly.c -o pass_on_readonly
	./pass_on_readonly slave


e) create a small mount tree to bind
	mkdir tmpbind
	mount --bind tmpbind tmpbind
	mkdir -p tmpbind/subtmpbind
	mount --bind tmpbind/subtmpbind tmpbind/subtmpbind

f) now mount this tree on the shared mount
	mkdir shared/sub
	mount --rbind tmpbind shared/sub

e) verify if the mounts under slave/sub are all readonly.
	> slave/sub/create_a_file
	slave/sub/create_a_file	: Read-only file system

	> slave/sub/subtmpbind/create_another_file
	slave/sub/subtmpbind/create_another_file : Read-only file system


RP

Comments

Dawid Ciezarkiewicz Sept. 22, 2017, 6:43 p.m. UTC | #1
On Thu, Sep 21, 2017 at 12:14 PM, Ram Pai <linuxram@us.ibm.com> wrote:
> Here is a patch that accomplishes the job. tested to work with
> some simple use cases.  check if this works for you. If it does
> than we will have to think through all the edge cases and make it
> acceptable.

From your experiments, it looks exactly right.

I'll give it a try in the upcoming week. Thank you!
Dawid Ciezarkiewicz Sept. 29, 2017, 11:02 p.m. UTC | #2
On Fri, Sep 22, 2017 at 11:43 AM, Dawid Ciezarkiewicz
<dawid.ciezarkiewicz@rubrik.com> wrote:
> On Thu, Sep 21, 2017 at 12:14 PM, Ram Pai <linuxram@us.ibm.com> wrote:
>> Here is a patch that accomplishes the job. tested to work with
>> some simple use cases.  check if this works for you. If it does
>> than we will have to think through all the edge cases and make it
>> acceptable.
>
> From your experiments, it looks exactly right.
>
> I'll give it a try in the upcoming week. Thank you!


I've reproduced your setup and results.

I've played with it for a while, mostly checking some recursive mount scenarios.
My biggest concern is transitivity of the RO attribute. Once a root of
slave directory
is set to be RO + STICKY, it is very important that host directories
propagated there,
even recursively (rslave), or any other, are not RW. From what I was
able to test, everything
seemed OK, but I don't grok all the semantics around it, so I'm just
pointing it out, as I might
have missed something.

One thing that I don't plan to use, but might be worth thinking about is
 `slave + RW + STICKY` combination.  If `master` mounts something RO,
and `slave`
is `RW + STICKY`, should the mount appear RW inside the slave? I don't
find it particularly useful,
but still...

Another thing that popped into my head: Is it worth considering any
dynamic changes to `slave`'s
RO status? It complicates everything a lot (it seems to me), since it
adds a retroactive
dynamic propagation. I don't currently have any plans to use it, but I
could imagine scenarios
in which a slave mount with all it's sub-mounts is remounted from RO
to RW, in response to
some external authorization trigger.

Regards,
Dawid Ciezarkiewicz
Ram Pai Oct. 9, 2017, 12:15 a.m. UTC | #3
On Fri, Sep 29, 2017 at 04:02:42PM -0700, Dawid Ciezarkiewicz wrote:
> On Fri, Sep 22, 2017 at 11:43 AM, Dawid Ciezarkiewicz
> <dawid.ciezarkiewicz@rubrik.com> wrote:
> > On Thu, Sep 21, 2017 at 12:14 PM, Ram Pai <linuxram@us.ibm.com> wrote:
> >> Here is a patch that accomplishes the job. tested to work with
> >> some simple use cases.  check if this works for you. If it does
> >> than we will have to think through all the edge cases and make it
> >> acceptable.
> >
> > From your experiments, it looks exactly right.
> >
> > I'll give it a try in the upcoming week. Thank you!
> 
> 
> I've reproduced your setup and results.
> 
> I've played with it for a while, mostly checking some recursive mount scenarios.
> My biggest concern is transitivity of the RO attribute. Once a root of
> slave directory
> is set to be RO + STICKY, it is very important that host directories
> propagated there,
> even recursively (rslave), or any other, are not RW. From what I was
> able to test, everything
> seemed OK, but I don't grok all the semantics around it, so I'm just
> pointing it out, as I might
> have missed something.

yes it will be RO. the patch takes care of that.

> 
> One thing that I don't plan to use, but might be worth thinking about is
>  `slave + RW + STICKY` combination.  If `master` mounts something RO,
> and `slave`
> is `RW + STICKY`, should the mount appear RW inside the slave? I don't
> find it particularly useful,
> but still...

As per the implemented semantics it will become "RW".  Should it be "RO"
aswell?  Will that open up security holes?

> 
> Another thing that popped into my head: Is it worth considering any
> dynamic changes to `slave`'s
> RO status? It complicates everything a lot (it seems to me), since it
> adds a retroactive
> dynamic propagation. I don't currently have any plans to use it, but I
> could imagine scenarios
> in which a slave mount with all it's sub-mounts is remounted from RO
> to RW, in response to
> some external authorization trigger.

The sematics should be something like this. Check if it makes sense.

a) anything mounted under stick-mount will be a sticky-mount and will
	inherit the mount's access-attribute;i.e RO RW attribute.
b) a mount when made sticky will propagate its sticky attribute
	as well as its access-attribute recursively to its children 
c) anything mounted under non-sticky mount will not inherit the
	mount's access-attribute and will be non-sticky aswell.
d) a mount when made non-sticky will just change itself to non-sticky.
	(will NOT propagate its non-sticky attribute and its
	access-attribue recursively to its children.)


Will this work?

> Regards,
> Dawid Ciezarkiewicz
Dawid Ciezarkiewicz Oct. 9, 2017, 9:39 p.m. UTC | #4
On Sun, Oct 8, 2017 at 5:15 PM, Ram Pai <linuxram@us.ibm.com> wrote:
>>
>> One thing that I don't plan to use, but might be worth thinking about is
>>  `slave + RW + STICKY` combination.  If `master` mounts something RO,
>> and `slave`
>> is `RW + STICKY`, should the mount appear RW inside the slave? I don't
>> find it particularly useful,
>> but still...
>
> As per the implemented semantics it will become "RW".  Should it be "RO"
> aswell?  Will that open up security holes?

It is a mechanism that could be used to potentially increase the scope
of privileges, which is a fertile ground for security issues. There is
some room for using it to circumvent mechanisms that were unaware of
this new feature. I guess for this reason alone, it might be worth
limiting to RO only.l

>>
>> Another thing that popped into my head: Is it worth considering any
>> dynamic changes to `slave`'s
>> RO status? It complicates everything a lot (it seems to me), since it
>> adds a retroactive
>> dynamic propagation. I don't currently have any plans to use it, but I
>> could imagine scenarios
>> in which a slave mount with all it's sub-mounts is remounted from RO
>> to RW, in response to
>> some external authorization trigger.
>
> The sematics should be something like this. Check if it makes sense.
>
> a) anything mounted under stick-mount will be a sticky-mount and will
>         inherit the mount's access-attribute;i.e RO RW attribute.
> b) a mount when made sticky will propagate its sticky attribute
>         as well as its access-attribute recursively to its children
> c) anything mounted under non-sticky mount will not inherit the
>         mount's access-attribute and will be non-sticky aswell.
> d) a mount when made non-sticky will just change itself to non-sticky.
>         (will NOT propagate its non-sticky attribute and its
>         access-attribue recursively to its children.)

a), b) and c), seem uncontroversial. d) seems OK, but I'm unsure as it
is asymmetrical to b). Both recursive and non-recursive D seem to make
sense. I'm just unsure if any is more useful than the other.

What happens when a sticky RO slave mount is remounted as RW? Does it
loose stickiness? Does this change propagate to its children?

Another angle, that just appeared to me: If we have a double link A
(master) -> (slave) B (master) -> (slave) C

If A is RW and B is RO + sticky, does mount propagated to C will also
be RO? It seems to me it should.



Regards,
Dawid
Ram Pai Oct. 19, 2017, 6:13 p.m. UTC | #5
On Mon, Oct 09, 2017 at 02:39:49PM -0700, Dawid Ciezarkiewicz wrote:
> On Sun, Oct 8, 2017 at 5:15 PM, Ram Pai <linuxram@us.ibm.com> wrote:
> >>
> >> One thing that I don't plan to use, but might be worth thinking about is
> >>  `slave + RW + STICKY` combination.  If `master` mounts something RO,
> >> and `slave`
> >> is `RW + STICKY`, should the mount appear RW inside the slave? I don't
> >> find it particularly useful,
> >> but still...
> >
> > As per the implemented semantics it will become "RW".  Should it be "RO"
> > aswell?  Will that open up security holes?
> 
> It is a mechanism that could be used to potentially increase the scope
> of privileges, which is a fertile ground for security issues. There is
> some room for using it to circumvent mechanisms that were unaware of
> this new feature. I guess for this reason alone, it might be worth
> limiting to RO only.l


ok. makes sense. It puts a twist to what I thought would have been
straight-forward-semantics. :-(

> 
> >>
> >> Another thing that popped into my head: Is it worth considering any
> >> dynamic changes to `slave`'s
> >> RO status? It complicates everything a lot (it seems to me), since it
> >> adds a retroactive
> >> dynamic propagation. I don't currently have any plans to use it, but I
> >> could imagine scenarios
> >> in which a slave mount with all it's sub-mounts is remounted from RO
> >> to RW, in response to
> >> some external authorization trigger.
> >
> > The sematics should be something like this. Check if it makes sense.
> >
> > a) anything mounted under stick-mount will be a sticky-mount and will
> >         inherit the mount's access-attribute;i.e RO RW attribute.
> > b) a mount when made sticky will propagate its sticky attribute
> >         as well as its access-attribute recursively to its children
> > c) anything mounted under non-sticky mount will not inherit the
> >         mount's access-attribute and will be non-sticky aswell.
> > d) a mount when made non-sticky will just change itself to non-sticky.
> >         (will NOT propagate its non-sticky attribute and its
> >         access-attribue recursively to its children.)
> 
> a), b) and c), seem uncontroversial. d) seems OK, but I'm unsure as it
> is asymmetrical to b). Both recursive and non-recursive D seem to make
> sense. I'm just unsure if any is more useful than the other.
> 
> What happens when a sticky RO slave mount is remounted as RW? Does it
> loose stickiness? Does this change propagate to its children?
> 
> Another angle, that just appeared to me: If we have a double link A
> (master) -> (slave) B (master) -> (slave) C
> 
> If A is RW and B is RO + sticky, does mount propagated to C will also
> be RO? It seems to me it should.

that seems to be the right thing to do.  

Do you want to code up something and send? I can aswell.. but bit
occupied with other high-priority stuff.

@Eric:  Any thoughts on the proposed semantics? 

RP
Eric W. Biederman Oct. 20, 2017, 2:23 a.m. UTC | #6
Ram Pai <linuxram@us.ibm.com> writes:

> On Mon, Oct 09, 2017 at 02:39:49PM -0700, Dawid Ciezarkiewicz wrote:
>> On Sun, Oct 8, 2017 at 5:15 PM, Ram Pai <linuxram@us.ibm.com> wrote:
>> >>
>> >> One thing that I don't plan to use, but might be worth thinking about is
>> >>  `slave + RW + STICKY` combination.  If `master` mounts something RO,
>> >> and `slave`
>> >> is `RW + STICKY`, should the mount appear RW inside the slave? I don't
>> >> find it particularly useful,
>> >> but still...
>> >
>> > As per the implemented semantics it will become "RW".  Should it be "RO"
>> > aswell?  Will that open up security holes?
>> 
>> It is a mechanism that could be used to potentially increase the scope
>> of privileges, which is a fertile ground for security issues. There is
>> some room for using it to circumvent mechanisms that were unaware of
>> this new feature. I guess for this reason alone, it might be worth
>> limiting to RO only.l
>
>
> ok. makes sense. It puts a twist to what I thought would have been
> straight-forward-semantics. :-(

My feel is that the read-write case should allow read-write only if the
incoming mount is read-write.  If the incomming write is read-only it
should stay read-only.  Meanwhile the sticky bit would connect to the
new read-only mount as sticky read-write.

*Thinks a minute*  That takes another bit and does not seem to add
anything so a sticky bit that just forces read-only makes sense.

>> >> Another thing that popped into my head: Is it worth considering any
>> >> dynamic changes to `slave`'s
>> >> RO status? It complicates everything a lot (it seems to me), since it
>> >> adds a retroactive
>> >> dynamic propagation. I don't currently have any plans to use it, but I
>> >> could imagine scenarios
>> >> in which a slave mount with all it's sub-mounts is remounted from RO
>> >> to RW, in response to
>> >> some external authorization trigger.
>> >
>> > The sematics should be something like this. Check if it makes sense.
>> >
>> > a) anything mounted under stick-mount will be a sticky-mount and will
>> >         inherit the mount's access-attribute;i.e RO RW attribute.
>> > b) a mount when made sticky will propagate its sticky attribute
>> >         as well as its access-attribute recursively to its children
>> > c) anything mounted under non-sticky mount will not inherit the
>> >         mount's access-attribute and will be non-sticky aswell.
>> > d) a mount when made non-sticky will just change itself to non-sticky.
>> >         (will NOT propagate its non-sticky attribute and its
>> >         access-attribue recursively to its children.)
>> 
>> a), b) and c), seem uncontroversial. d) seems OK, but I'm unsure as it
>> is asymmetrical to b). Both recursive and non-recursive D seem to make
>> sense. I'm just unsure if any is more useful than the other.
>> 
>> What happens when a sticky RO slave mount is remounted as RW? Does it
>> loose stickiness? Does this change propagate to its children?
>> 
>> Another angle, that just appeared to me: If we have a double link A
>> (master) -> (slave) B (master) -> (slave) C
>> 
>> If A is RW and B is RO + sticky, does mount propagated to C will also
>> be RO? It seems to me it should.
>
> that seems to be the right thing to do.  
>
> Do you want to code up something and send? I can aswell.. but bit
> occupied with other high-priority stuff.
>
> @Eric:  Any thoughts on the proposed semantics?

Thinking.

A big part of the semantics that has not been described is how does
stickiness propagate.

My inclination would be to define stickiness this way:

a) We start with a single bit MNT_STICKY

b) stickiness propagates like any other mount attribute.
   AKA setting stickiness propgates everywhere and sets stickiness
       clearing stickiness propgates everywhere and clears stickiness

c) We add MNT_LOCK_STICKY to remember the stickiness came from a more
   privileged mount namespace.

d) If the sticky is on a read-only mount and all future child mounts
   (while the sticky remains) will be read-only and sticky

e) If the sticky is on a nodev mount all future child mounts (while the
   sticky remains) will be nodev and sticky

f) If the sticky is on a nosuid mount all future child mounts (while the
   sticky remains) will be nosuid and sticky

g) If the sticky is on a noexec mount all future child mounts (while the
   sticky remains) will be noexec and sticky.

h) A sufficiently privileged user may clear the sticky with no other
   effects than the clear of the sticky propgates.

i) A sufficiently privileged user may clear one of read-only, nodev,
   nosuid, or noexec under a sticky and it has no other effects except
   the change in mount flags propagtes like normal.

Or in short the sticky would just glom onto mount level disables and
make them the default.

Eric
diff mbox

Patch

diff --git a/fs/namespace.c b/fs/namespace.c
index f8893dc..08f63b6 100644
--- a/fs/namespace.c
+++ b/fs/namespace.c
@@ -854,6 +854,8 @@  void mnt_set_mountpoint(struct mount *mnt,
 	child_mnt->mnt_mountpoint = dget(mp->m_dentry);
 	child_mnt->mnt_parent = mnt;
 	child_mnt->mnt_mp = mp;
+	if (mnt->mnt.mnt_flags & MNT_STICKY_RW)
+		child_mnt->mnt.mnt_flags |= (mnt->mnt.mnt_flags & (MNT_READONLY | MNT_STICKY_RW));
 	hlist_add_head(&child_mnt->mnt_mp_list, &mp->m_list);
 }
 
@@ -1052,6 +1054,12 @@  static struct mount *clone_mnt(struct mount *old, struct dentry *root,
 	    (!(flag & CL_EXPIRE) || list_empty(&old->mnt_expire)))
 		mnt->mnt.mnt_flags |= MNT_LOCKED;
 
+	if (flag & CL_STICKY_RW) {
+		mnt->mnt.mnt_flags |= MNT_STICKY_RW;
+		if (flag & CL_READONLY)
+			mnt->mnt.mnt_flags |= MNT_READONLY;
+	}
+
 	atomic_inc(&sb->s_active);
 	mnt->mnt.mnt_sb = sb;
 	mnt->mnt.mnt_root = dget(root);
@@ -2078,7 +2086,7 @@  static int flags_to_propagation_type(int flags)
 	int type = flags & ~(MS_REC | MS_SILENT);
 
 	/* Fail if any non-propagation flags are set */
-	if (type & ~(MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE))
+	if (type & ~(MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE | MS_STICKY_RW))
 		return 0;
 	/* Only one propagation flag should be set */
 	if (!is_power_of_2(type))
@@ -2113,7 +2121,10 @@  static int do_change_type(struct path *path, int flag)
 
 	lock_mount_hash();
 	for (m = mnt; m; m = (recurse ? next_mnt(m, mnt) : NULL))
-		change_mnt_propagation(m, type);
+		if (type == MS_STICKY_RW)
+			set_mnt_sticky(m);
+		else
+			change_mnt_propagation(m, type);
 	unlock_mount_hash();
 
  out_unlock:
@@ -2768,7 +2779,7 @@  long do_mount(const char *dev_name, const char __user *dir_name,
 				    data_page);
 	else if (flags & MS_BIND)
 		retval = do_loopback(&path, dev_name, flags & MS_REC);
-	else if (flags & (MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE))
+	else if (flags & (MS_SHARED | MS_PRIVATE | MS_SLAVE | MS_UNBINDABLE | MS_STICKY_RW))
 		retval = do_change_type(&path, flags);
 	else if (flags & MS_MOVE)
 		retval = do_move_mount(&path, dev_name);
diff --git a/fs/pnode.c b/fs/pnode.c
index 53d411a..386105a 100644
--- a/fs/pnode.c
+++ b/fs/pnode.c
@@ -262,6 +262,13 @@  static int propagate_one(struct mount *m)
 	/* Notice when we are propagating across user namespaces */
 	if (m->mnt_ns->user_ns != user_ns)
 		type |= CL_UNPRIVILEGED;
+
+	if (m->mnt.mnt_flags & MNT_STICKY_RW) {
+		type |= CL_STICKY_RW;
+		if (m->mnt.mnt_flags & MNT_READONLY)
+			type |= CL_READONLY;
+	}
+
 	child = copy_tree(last_source, last_source->mnt.mnt_root, type);
 	if (IS_ERR(child))
 		return PTR_ERR(child);
diff --git a/fs/pnode.h b/fs/pnode.h
index dc87e65..0a4f7c2 100644
--- a/fs/pnode.h
+++ b/fs/pnode.h
@@ -29,6 +29,8 @@ 
 #define CL_SHARED_TO_SLAVE	0x20
 #define CL_UNPRIVILEGED		0x40
 #define CL_COPY_MNT_NS_FILE	0x80
+#define CL_STICKY_RW		0x100
+#define CL_READONLY		0x200
 
 #define CL_COPY_ALL		(CL_COPY_UNBINDABLE | CL_COPY_MNT_NS_FILE)
 
@@ -38,6 +40,11 @@  static inline void set_mnt_shared(struct mount *mnt)
 	mnt->mnt.mnt_flags |= MNT_SHARED;
 }
 
+static inline void set_mnt_sticky(struct mount *mnt)
+{
+	mnt->mnt.mnt_flags |= MNT_STICKY_RW;
+}
+
 void change_mnt_propagation(struct mount *, int);
 int propagate_mnt(struct mount *, struct mountpoint *, struct mount *,
 		struct hlist_head *);
diff --git a/include/linux/mount.h b/include/linux/mount.h
index 1ce85e6..85dc195 100644
--- a/include/linux/mount.h
+++ b/include/linux/mount.h
@@ -28,6 +28,7 @@ 
 #define MNT_NODIRATIME	0x10
 #define MNT_RELATIME	0x20
 #define MNT_READONLY	0x40	/* does the user want this to be r/o? */
+#define MNT_STICKY_RW	0x80	/* children inherit READONLY attr if set */
 
 #define MNT_SHRINKABLE	0x100
 #define MNT_WRITE_HOLD	0x200
diff --git a/include/uapi/linux/fs.h b/include/uapi/linux/fs.h
index b7495d0..b06b277 100644
--- a/include/uapi/linux/fs.h
+++ b/include/uapi/linux/fs.h
@@ -112,6 +112,7 @@  struct inodes_stat_t {
 #define MS_REMOUNT	32	/* Alter flags of a mounted FS */
 #define MS_MANDLOCK	64	/* Allow mandatory locks on an FS */
 #define MS_DIRSYNC	128	/* Directory modifications are synchronous */
+#define MS_STICKY_RW	(1<<8)	/* children inherit the RW flag */
 #define MS_NOATIME	1024	/* Do not update access times. */
 #define MS_NODIRATIME	2048	/* Do not update directory access times */
 #define MS_BIND		4096