Message ID | 20190926195234.bipqpw5sbk5ojcna@fiona (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] Don't propagate automount | expand |
On Thu, 2019-09-26 at 14:52 -0500, Goldwyn Rodrigues wrote: > An access to automounted filesystem can deadlock if it is a bind > mount on shared mounts. A user program should not deadlock the kernel > while automount waits for propagation of the mount. This is explained > at https://bugzilla.redhat.com/show_bug.cgi?id=1358887#c10 > I am not sure completely blocking automount is the best solution, > so please reply with what is the best course of action to do > in such a situation. > > Propagation of dentry with DCACHE_NEED_AUTOMOUNT can lead to > propagation of mount points without automount maps and not under > automount control. So, do not propagate them. Yes, I'm not sure my comments about mount propagation in that bug are accurate. This behaviour has crept into the kernel in reasonably recent times, maybe it's a bug or maybe mount propagation has been "fixed", not sure. I think I'll need to come up with a more detailed description of what is being done for Al to be able to offer advice. I'll get to that a bit later. > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > > diff --git a/fs/pnode.c b/fs/pnode.c > index 49f6d7ff2139..b960805d7954 100644 > --- a/fs/pnode.c > +++ b/fs/pnode.c > @@ -292,6 +292,9 @@ int propagate_mnt(struct mount *dest_mnt, struct > mountpoint *dest_mp, > struct mount *m, *n; > int ret = 0; > > + if (source_mnt->mnt_mountpoint->d_flags & > DCACHE_NEED_AUTOMOUNT) > + return 0; > + Possible problem with this is it will probably prevent mount propagation in both directions which will break stuff. I had originally assumed the problem was mount propagation back to the parent mount but now I'm not sure that this is actually what is meant to happen. > /* > * we don't want to bother passing tons of arguments to > * propagate_one(); everything is serialized by namespace_sem, >
On Fri, 2019-09-27 at 09:35 +0800, Ian Kent wrote: > On Thu, 2019-09-26 at 14:52 -0500, Goldwyn Rodrigues wrote: > > An access to automounted filesystem can deadlock if it is a bind > > mount on shared mounts. A user program should not deadlock the > > kernel > > while automount waits for propagation of the mount. This is > > explained > > at https://bugzilla.redhat.com/show_bug.cgi?id=1358887#c10 > > I am not sure completely blocking automount is the best solution, > > so please reply with what is the best course of action to do > > in such a situation. > > > > Propagation of dentry with DCACHE_NEED_AUTOMOUNT can lead to > > propagation of mount points without automount maps and not under > > automount control. So, do not propagate them. > > Yes, I'm not sure my comments about mount propagation in that > bug are accurate. > > This behaviour has crept into the kernel in reasonably recent > times, maybe it's a bug or maybe mount propagation has been > "fixed", not sure. > > I think I'll need to come up with a more detailed description > of what is being done for Al to be able to offer advice. > > I'll get to that a bit later. To duplicate this problem use an autofs indirect map that uses bind mounts and has offsets: test / :/exports \ /tmp :/exports/tmp \ /lib :/exports/lib and add: /bind /etc/auto.exports to /etc/auto.master. Finally create the bind mount directories: mkdir -p /exports/lib /exports/tmp Then, on a broken kernel, eg. 4.13.9-300.fc27: ls /bind/test will result in: /etc/auto.exports on /bind type autofs (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,indirect,pipe_ino=45485) /dev/mapper/fedora_f27-root on /bind/test type ext4 (rw,relatime,seclabel,data=ordered) /etc/auto.exports on /bind/test/lib type autofs (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=45485) /etc/auto.exports on /exports/lib type autofs (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=45485) /etc/auto.exports on /bind/test/tmp type autofs (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=45485) /etc/auto.exports on /exports/tmp type autofs (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=45485) these mount entries, not all of which have been mounted by autofs. Whereas on a kernel that isn't broken, eg. 4.11.8-300.fc26, the same ls command will result in: /etc/auto.exports on /bind type autofs (rw,relatime,fd=6,pgrp=2920,timeout=300,minproto=5,maxproto=5,indirect,pipe_ino=42067) /etc/auto.exports on /bind/test/lib type autofs (rw,relatime,fd=6,pgrp=2920,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=42067) /etc/auto.exports on /bind/test/tmp type autofs (rw,relatime,fd=6,pgrp=2920,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=42067) these mount entries, all of which have been mounted by autofs (and are what's needed for these offset mount constructs). If the /bind mount is made propagation slave or private at mount by automount the problem doesn't happen and that is the workaround I implemented in autofs. I initially thought this was the result of a "fix" in the mount propagation code but it occurred to me that propagation is meant to occur between mount trees not within them so this might be a bug. I probably should have worked out exactly what upstream kernel this started happening in and then done a bisect and tried to work out if the change was doing what it was supposed to. Anyway, I'll need to do that now for us to discuss this sensibly. > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > > > > diff --git a/fs/pnode.c b/fs/pnode.c > > index 49f6d7ff2139..b960805d7954 100644 > > --- a/fs/pnode.c > > +++ b/fs/pnode.c > > @@ -292,6 +292,9 @@ int propagate_mnt(struct mount *dest_mnt, > > struct > > mountpoint *dest_mp, > > struct mount *m, *n; > > int ret = 0; > > > > + if (source_mnt->mnt_mountpoint->d_flags & > > DCACHE_NEED_AUTOMOUNT) > > + return 0; > > + > > Possible problem with this is it will probably prevent mount > propagation in both directions which will break stuff. > > I had originally assumed the problem was mount propagation > back to the parent mount but now I'm not sure that this is > actually what is meant to happen. > > > /* > > * we don't want to bother passing tons of arguments to > > * propagate_one(); everything is serialized by namespace_sem, > >
On Fri, 2019-09-27 at 15:09 +0800, Ian Kent wrote: > On Fri, 2019-09-27 at 09:35 +0800, Ian Kent wrote: > > On Thu, 2019-09-26 at 14:52 -0500, Goldwyn Rodrigues wrote: > > > An access to automounted filesystem can deadlock if it is a bind > > > mount on shared mounts. A user program should not deadlock the > > > kernel > > > while automount waits for propagation of the mount. This is > > > explained > > > at https://bugzilla.redhat.com/show_bug.cgi?id=1358887#c10 > > > I am not sure completely blocking automount is the best solution, > > > so please reply with what is the best course of action to do > > > in such a situation. > > > > > > Propagation of dentry with DCACHE_NEED_AUTOMOUNT can lead to > > > propagation of mount points without automount maps and not under > > > automount control. So, do not propagate them. > > > > Yes, I'm not sure my comments about mount propagation in that > > bug are accurate. > > > > This behaviour has crept into the kernel in reasonably recent > > times, maybe it's a bug or maybe mount propagation has been > > "fixed", not sure. > > > > I think I'll need to come up with a more detailed description > > of what is being done for Al to be able to offer advice. > > > > I'll get to that a bit later. > > To duplicate this problem use an autofs indirect map > that uses bind mounts and has offsets: > > test / :/exports \ > /tmp :/exports/tmp \ > /lib :/exports/lib > > and add: > > /bind /etc/auto.exports > > to /etc/auto.master. > > Finally create the bind mount directories: > > mkdir -p /exports/lib /exports/tmp > > Then, on a broken kernel, eg. 4.13.9-300.fc27: > > ls /bind/test > > will result in: > > /etc/auto.exports on /bind type autofs > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,indirec > t,pipe_ino=45485) > /dev/mapper/fedora_f27-root on /bind/test type ext4 > (rw,relatime,seclabel,data=ordered) > /etc/auto.exports on /bind/test/lib type autofs > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offset, > pipe_ino=45485) > /etc/auto.exports on /exports/lib type autofs > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offset, > pipe_ino=45485) > /etc/auto.exports on /bind/test/tmp type autofs > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offset, > pipe_ino=45485) > /etc/auto.exports on /exports/tmp type autofs > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offset, > pipe_ino=45485) > > these mount entries, not all of which have been mounted by autofs. > > Whereas on a kernel that isn't broken, eg. 4.11.8-300.fc26, the same > ls command will result in: > > /etc/auto.exports on /bind type autofs > (rw,relatime,fd=6,pgrp=2920,timeout=300,minproto=5,maxproto=5,indirec > t,pipe_ino=42067) > /etc/auto.exports on /bind/test/lib type autofs > (rw,relatime,fd=6,pgrp=2920,timeout=300,minproto=5,maxproto=5,offset, > pipe_ino=42067) > /etc/auto.exports on /bind/test/tmp type autofs > (rw,relatime,fd=6,pgrp=2920,timeout=300,minproto=5,maxproto=5,offset, > pipe_ino=42067) > > these mount entries, all of which have been mounted by autofs (and > are what's needed for these offset mount constructs). > > If the /bind mount is made propagation slave or private at mount > by automount the problem doesn't happen and that is the workaround > I implemented in autofs. Actually that's not quite right, there should be a bind mount at /bind/test as well but it's not present. Although I expect this will happen with a rootless multi-mount as well, aka. test \ /tmp :/exports/tmp \ /lib :/exports/lib Leave it with me while I investigate further. > > I initially thought this was the result of a "fix" in the mount > propagation code but it occurred to me that propagation is meant > to occur between mount trees not within them so this might be a > bug. > > I probably should have worked out exactly what upstream kernel > this started happening in and then done a bisect and tried to > work out if the change was doing what it was supposed to. > > Anyway, I'll need to do that now for us to discuss this sensibly. > > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > > > > > > diff --git a/fs/pnode.c b/fs/pnode.c > > > index 49f6d7ff2139..b960805d7954 100644 > > > --- a/fs/pnode.c > > > +++ b/fs/pnode.c > > > @@ -292,6 +292,9 @@ int propagate_mnt(struct mount *dest_mnt, > > > struct > > > mountpoint *dest_mp, > > > struct mount *m, *n; > > > int ret = 0; > > > > > > + if (source_mnt->mnt_mountpoint->d_flags & > > > DCACHE_NEED_AUTOMOUNT) > > > + return 0; > > > + > > > > Possible problem with this is it will probably prevent mount > > propagation in both directions which will break stuff. > > > > I had originally assumed the problem was mount propagation > > back to the parent mount but now I'm not sure that this is > > actually what is meant to happen. > > > > > /* > > > * we don't want to bother passing tons of arguments to > > > * propagate_one(); everything is serialized by namespace_sem, > > >
On Fri, 2019-09-27 at 15:26 +0800, Ian Kent wrote: > On Fri, 2019-09-27 at 15:09 +0800, Ian Kent wrote: > > On Fri, 2019-09-27 at 09:35 +0800, Ian Kent wrote: > > > On Thu, 2019-09-26 at 14:52 -0500, Goldwyn Rodrigues wrote: > > > > An access to automounted filesystem can deadlock if it is a > > > > bind > > > > mount on shared mounts. A user program should not deadlock the > > > > kernel > > > > while automount waits for propagation of the mount. This is > > > > explained > > > > at https://bugzilla.redhat.com/show_bug.cgi?id=1358887#c10 > > > > I am not sure completely blocking automount is the best > > > > solution, > > > > so please reply with what is the best course of action to do > > > > in such a situation. > > > > > > > > Propagation of dentry with DCACHE_NEED_AUTOMOUNT can lead to > > > > propagation of mount points without automount maps and not > > > > under > > > > automount control. So, do not propagate them. > > > > > > Yes, I'm not sure my comments about mount propagation in that > > > bug are accurate. > > > > > > This behaviour has crept into the kernel in reasonably recent > > > times, maybe it's a bug or maybe mount propagation has been > > > "fixed", not sure. > > > > > > I think I'll need to come up with a more detailed description > > > of what is being done for Al to be able to offer advice. > > > > > > I'll get to that a bit later. > > > > To duplicate this problem use an autofs indirect map > > that uses bind mounts and has offsets: > > > > test / :/exports \ > > /tmp :/exports/tmp \ > > /lib :/exports/lib > > > > and add: > > > > /bind /etc/auto.exports > > > > to /etc/auto.master. > > > > Finally create the bind mount directories: > > > > mkdir -p /exports/lib /exports/tmp > > > > Then, on a broken kernel, eg. 4.13.9-300.fc27: > > > > ls /bind/test > > > > will result in: > > > > /etc/auto.exports on /bind type autofs > > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,indir > > ec > > t,pipe_ino=45485) > > /dev/mapper/fedora_f27-root on /bind/test type ext4 > > (rw,relatime,seclabel,data=ordered) > > /etc/auto.exports on /bind/test/lib type autofs > > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offse > > t, > > pipe_ino=45485) > > /etc/auto.exports on /exports/lib type autofs > > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offse > > t, > > pipe_ino=45485) > > /etc/auto.exports on /bind/test/tmp type autofs > > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offse > > t, > > pipe_ino=45485) > > /etc/auto.exports on /exports/tmp type autofs > > (rw,relatime,fd=5,pgrp=2981,timeout=300,minproto=5,maxproto=5,offse > > t, > > pipe_ino=45485) > > > > these mount entries, not all of which have been mounted by autofs. > > > > Whereas on a kernel that isn't broken, eg. 4.11.8-300.fc26, the > > same > > ls command will result in: > > > > /etc/auto.exports on /bind type autofs > > (rw,relatime,fd=6,pgrp=2920,timeout=300,minproto=5,maxproto=5,indir > > ec > > t,pipe_ino=42067) > > /etc/auto.exports on /bind/test/lib type autofs > > (rw,relatime,fd=6,pgrp=2920,timeout=300,minproto=5,maxproto=5,offse > > t, > > pipe_ino=42067) > > /etc/auto.exports on /bind/test/tmp type autofs > > (rw,relatime,fd=6,pgrp=2920,timeout=300,minproto=5,maxproto=5,offse > > t, > > pipe_ino=42067) > > > > these mount entries, all of which have been mounted by autofs (and > > are what's needed for these offset mount constructs). > > > > If the /bind mount is made propagation slave or private at mount > > by automount the problem doesn't happen and that is the workaround > > I implemented in autofs. > > Actually that's not quite right, there should be a bind mount at > /bind/test as well but it's not present. > > Although I expect this will happen with a rootless multi-mount > as well, aka. > > test \ > /tmp :/exports/tmp \ > /lib :/exports/lib > > Leave it with me while I investigate further. Ok, so this is actually what should have been produced on 4.11.8-300.fc26: /etc/auto.exports on /bind type autofs (rw,relatime,fd=7,pgrp=1800,timeout=300,minproto=5,maxproto=5,indirect,pipe_ino=32018) /dev/mapper/fedora_f26-root on /bind/test type ext4 (rw,relatime,seclabel,data=ordered) /etc/auto.exports on /bind/test/lib type autofs (rw,relatime,fd=7,pgrp=1800,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=32018) /etc/auto.exports on /exports/lib type autofs (rw,relatime,fd=7,pgrp=1800,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=32018) /etc/auto.exports on /bind/test/tmp type autofs (rw,relatime,fd=7,pgrp=1800,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=32018) /etc/auto.exports on /exports/tmp type autofs (rw,relatime,fd=7,pgrp=1800,timeout=300,minproto=5,maxproto=5,offset,pipe_ino=32018) which is also broken. I'll need to check more kernels. > > > I initially thought this was the result of a "fix" in the mount > > propagation code but it occurred to me that propagation is meant > > to occur between mount trees not within them so this might be a > > bug. > > > > I probably should have worked out exactly what upstream kernel > > this started happening in and then done a bisect and tried to > > work out if the change was doing what it was supposed to. > > > > Anyway, I'll need to do that now for us to discuss this sensibly. > > > > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > > > > > > > > diff --git a/fs/pnode.c b/fs/pnode.c > > > > index 49f6d7ff2139..b960805d7954 100644 > > > > --- a/fs/pnode.c > > > > +++ b/fs/pnode.c > > > > @@ -292,6 +292,9 @@ int propagate_mnt(struct mount *dest_mnt, > > > > struct > > > > mountpoint *dest_mp, > > > > struct mount *m, *n; > > > > int ret = 0; > > > > > > > > + if (source_mnt->mnt_mountpoint->d_flags & > > > > DCACHE_NEED_AUTOMOUNT) > > > > + return 0; > > > > + > > > > > > Possible problem with this is it will probably prevent mount > > > propagation in both directions which will break stuff. > > > > > > I had originally assumed the problem was mount propagation > > > back to the parent mount but now I'm not sure that this is > > > actually what is meant to happen. > > > > > > > /* > > > > * we don't want to bother passing tons of arguments to > > > > * propagate_one(); everything is serialized by > > > > namespace_sem, > > > >
On Fri, 2019-09-27 at 15:41 +0800, Ian Kent wrote: > > > > I initially thought this was the result of a "fix" in the mount > > > propagation code but it occurred to me that propagation is meant > > > to occur between mount trees not within them so this might be a > > > bug. > > > > > > I probably should have worked out exactly what upstream kernel > > > this started happening in and then done a bisect and tried to > > > work out if the change was doing what it was supposed to. > > > > > > Anyway, I'll need to do that now for us to discuss this sensibly. > > > > > > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > > > > > > > > > > diff --git a/fs/pnode.c b/fs/pnode.c > > > > > index 49f6d7ff2139..b960805d7954 100644 > > > > > --- a/fs/pnode.c > > > > > +++ b/fs/pnode.c > > > > > @@ -292,6 +292,9 @@ int propagate_mnt(struct mount *dest_mnt, > > > > > struct > > > > > mountpoint *dest_mp, > > > > > struct mount *m, *n; > > > > > int ret = 0; > > > > > > > > > > + if (source_mnt->mnt_mountpoint->d_flags & > > > > > DCACHE_NEED_AUTOMOUNT) > > > > > + return 0; > > > > > + > > > > > > > > Possible problem with this is it will probably prevent mount > > > > propagation in both directions which will break stuff. > > > > > > > > I had originally assumed the problem was mount propagation > > > > back to the parent mount but now I'm not sure that this is > > > > actually what is meant to happen. Goldwyn, TBH I'm already a bit over this particularly since it's a solved problem from my POV. I've gone back as far as Fedora 20 and 3.11.10-301.fc20 also behaves like this. Unless someone says this behaviour is not the way kernel mount propagation should behave I'm not going to spend more time on it. The ability to use either "slave" or "private" autofs pseudo mount options in master map mount entries that are susceptible to this mount propagation behaviour was included in autofs-5.1.5 and the patches used are present on kernel.org if you need to back port them to an earlier release. https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-set-bind-mount-as-propagation-slave.patch https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-add-master-map-pseudo-options-for-mount-propagation.patch It shouldn't be too difficult to back port them but they might have other patch dependencies. I will help with that if you need it. Ian
On 18:51 27/09, Ian Kent wrote: > On Fri, 2019-09-27 at 15:41 +0800, Ian Kent wrote: > > > > > > I initially thought this was the result of a "fix" in the mount > > > > propagation code but it occurred to me that propagation is meant > > > > to occur between mount trees not within them so this might be a > > > > bug. > > > > > > > > I probably should have worked out exactly what upstream kernel > > > > this started happening in and then done a bisect and tried to > > > > work out if the change was doing what it was supposed to. > > > > > > > > Anyway, I'll need to do that now for us to discuss this sensibly. > > > > > > > > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > > > > > > > > > > > > diff --git a/fs/pnode.c b/fs/pnode.c > > > > > > index 49f6d7ff2139..b960805d7954 100644 > > > > > > --- a/fs/pnode.c > > > > > > +++ b/fs/pnode.c > > > > > > @@ -292,6 +292,9 @@ int propagate_mnt(struct mount *dest_mnt, > > > > > > struct > > > > > > mountpoint *dest_mp, > > > > > > struct mount *m, *n; > > > > > > int ret = 0; > > > > > > > > > > > > + if (source_mnt->mnt_mountpoint->d_flags & > > > > > > DCACHE_NEED_AUTOMOUNT) > > > > > > + return 0; > > > > > > + > > > > > > > > > > Possible problem with this is it will probably prevent mount > > > > > propagation in both directions which will break stuff. No, I am specifically checking when the source has a automount flag set. It will block only one way. I checked it with an example. > > > > > > > > > > I had originally assumed the problem was mount propagation > > > > > back to the parent mount but now I'm not sure that this is > > > > > actually what is meant to happen. > > Goldwyn, > > TBH I'm already a bit over this particularly since it's a > solved problem from my POV. > > I've gone back as far as Fedora 20 and 3.11.10-301.fc20 also > behaves like this. The problem started with the root directory being mounted as shared. > > Unless someone says this behaviour is not the way kernel > mount propagation should behave I'm not going to spend > more time on it. > > The ability to use either "slave" or "private" autofs pseudo > mount options in master map mount entries that are susceptible > to this mount propagation behaviour was included in autofs-5.1.5 > and the patches used are present on kernel.org if you need to > back port them to an earlier release. What about "shared" pseudo mount option? The point is the default shared option with automount is broken, and should not be exposed at all. > > https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-set-bind-mount-as-propagation-slave.patch > > https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-add-master-map-pseudo-options-for-mount-propagation.patch > > It shouldn't be too difficult to back port them but they might > have other patch dependencies. I will help with that if you > need it. My problem is not with the patch and the "private" or "slave" flag, but with the absence of it. We have the patch you mention in our repos. I am assuming that users are stupid and they will miss putting the flags in the auto.master file and wonder why when they try to access the directories the process hangs. In all, any user configuration should not hang the kernel. My point is, if you don't have a automount map with the daemon, how can you propagate it and still be in control? I tried my experiments with 5.3.1 without "slave" or "private" flags and the process accessing the bind mount hangs.
On Fri, 2019-09-27 at 11:16 -0500, Goldwyn Rodrigues wrote: > On 18:51 27/09, Ian Kent wrote: > > On Fri, 2019-09-27 at 15:41 +0800, Ian Kent wrote: > > > > > I initially thought this was the result of a "fix" in the > > > > > mount > > > > > propagation code but it occurred to me that propagation is > > > > > meant > > > > > to occur between mount trees not within them so this might be > > > > > a > > > > > bug. > > > > > > > > > > I probably should have worked out exactly what upstream > > > > > kernel > > > > > this started happening in and then done a bisect and tried to > > > > > work out if the change was doing what it was supposed to. > > > > > > > > > > Anyway, I'll need to do that now for us to discuss this > > > > > sensibly. > > > > > > > > > > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > > > > > > > > > > > > > > diff --git a/fs/pnode.c b/fs/pnode.c > > > > > > > index 49f6d7ff2139..b960805d7954 100644 > > > > > > > --- a/fs/pnode.c > > > > > > > +++ b/fs/pnode.c > > > > > > > @@ -292,6 +292,9 @@ int propagate_mnt(struct mount > > > > > > > *dest_mnt, > > > > > > > struct > > > > > > > mountpoint *dest_mp, > > > > > > > struct mount *m, *n; > > > > > > > int ret = 0; > > > > > > > > > > > > > > + if (source_mnt->mnt_mountpoint->d_flags & > > > > > > > DCACHE_NEED_AUTOMOUNT) > > > > > > > + return 0; > > > > > > > + > > > > > > > > > > > > Possible problem with this is it will probably prevent > > > > > > mount > > > > > > propagation in both directions which will break stuff. > > No, I am specifically checking when the source has a automount flag > set. > It will block only one way. I checked it with an example. I don't understand how this check can selectively block propagation? If you have say: test / :/exports \ /tmp :/exports/tmp \ /lib :/exports/lib and /bind /etc/auto.exports in /etc/auto.master and you use say: docker run -it --rm -v /bind:/bind:slave fedora-autofs:v1 bash your saying the above will not propagate those offset trigger mounts to the parent above the /bind/test mount but will propagate them to the container? It looks like that check is all or nothing to me? Can you explain a bit more. > > > > > > > I had originally assumed the problem was mount propagation > > > > > > back to the parent mount but now I'm not sure that this is > > > > > > actually what is meant to happen. > > > > Goldwyn, > > > > TBH I'm already a bit over this particularly since it's a > > solved problem from my POV. > > > > I've gone back as far as Fedora 20 and 3.11.10-301.fc20 also > > behaves like this. > > The problem started with the root directory being mounted as > shared. The change where systemd set the root file system propagation shared was certainly an autofs pain point (for more than just this case too) but I was so sure that wasn't when this started happening. But ok, I could be mistaken, and you seem to be sure about. > > > Unless someone says this behaviour is not the way kernel > > mount propagation should behave I'm not going to spend > > more time on it. > > > > The ability to use either "slave" or "private" autofs pseudo > > mount options in master map mount entries that are susceptible > > to this mount propagation behaviour was included in autofs-5.1.5 > > and the patches used are present on kernel.org if you need to > > back port them to an earlier release. > > What about "shared" pseudo mount option? The point is the default > shared option with automount is broken, and should not be exposed > at all. What about shared mounts? I don't know of a case where propagation shared is actually needed. If you know of one please describe it. The most common case is "slave" and the "private" option was only included because it might be needed if people are using isolated environments but TBH I'm not at all sure it could actually be used for that case. IIUC the change to the propagation of the root file system was done to help with containers but turned out not to do what was needed and was never reverted. So the propagation shared change probably should have been propagation slave or not changed at all. > > > https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-set-bind-mount-as-propagation-slave.patch > > > > https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-add-master-map-pseudo-options-for-mount-propagation.patch > > > > It shouldn't be too difficult to back port them but they might > > have other patch dependencies. I will help with that if you > > need it. > > My problem is not with the patch and the "private" or "slave" flag, > but > with the absence of it. We have the patch you mention in our repos. Ha, play on words, "absence of it" and "we have it in our repos" Don't you mean the problem is that mount propagation isn't set correctly automatically by automount. > > I am assuming that users are stupid and they will miss putting the > flags > in the auto.master file and wonder why when they try to access the > directories > the process hangs. In all, any user configuration should not hang the > kernel. I thought about that when I was working on those patches but, at the time, I didn't think the propagation problem had started when the root file system was set propagation shared at boot. I still think changing the kernel propagation isn't the right way to resolve it. But I would be willing to add a configuration option to autofs that when set would use propagation slave for all bind mounts without the need to modify the master map. Given how long the problem has been around I'm also tempted to make it default to enabled. I'm not sure yet what that would mean for the existing mount options "shared" and "private" other than them possibly being needed if the option is disabled, even with this I'm still not sure a "shared" option is useful. Isn't this automation your main concern? > > My point is, if you don't have a automount map with the daemon, how > can you > propagate it and still be in control? Well, maybe, but IIUC the container example is probably one case but that doesn't quite match what your saying. Typically there would be no daemon in the container for the example I gave. Ian
Hi Ian, Sorry for the late reply, I had to setup and environment for your specific case and it took time. On 9:47 28/09, Ian Kent wrote: > On Fri, 2019-09-27 at 11:16 -0500, Goldwyn Rodrigues wrote: > > On 18:51 27/09, Ian Kent wrote: > > > On Fri, 2019-09-27 at 15:41 +0800, Ian Kent wrote: > > > > > > I initially thought this was the result of a "fix" in the > > > > > > mount > > > > > > propagation code but it occurred to me that propagation is > > > > > > meant > > > > > > to occur between mount trees not within them so this might be > > > > > > a > > > > > > bug. > > > > > > > > > > > > I probably should have worked out exactly what upstream > > > > > > kernel > > > > > > this started happening in and then done a bisect and tried to > > > > > > work out if the change was doing what it was supposed to. > > > > > > > > > > > > Anyway, I'll need to do that now for us to discuss this > > > > > > sensibly. > > > > > > > > > > > > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > > > > > > > > > > > > > > > > diff --git a/fs/pnode.c b/fs/pnode.c > > > > > > > > index 49f6d7ff2139..b960805d7954 100644 > > > > > > > > --- a/fs/pnode.c > > > > > > > > +++ b/fs/pnode.c > > > > > > > > @@ -292,6 +292,9 @@ int propagate_mnt(struct mount > > > > > > > > *dest_mnt, > > > > > > > > struct > > > > > > > > mountpoint *dest_mp, > > > > > > > > struct mount *m, *n; > > > > > > > > int ret = 0; > > > > > > > > > > > > > > > > + if (source_mnt->mnt_mountpoint->d_flags & > > > > > > > > DCACHE_NEED_AUTOMOUNT) > > > > > > > > + return 0; > > > > > > > > + > > > > > > > > > > > > > > Possible problem with this is it will probably prevent > > > > > > > mount > > > > > > > propagation in both directions which will break stuff. > > > > No, I am specifically checking when the source has a automount flag > > set. > > It will block only one way. I checked it with an example. > > I don't understand how this check can selectively block propagation? > > If you have say: > test / :/exports \ > /tmp :/exports/tmp \ > /lib :/exports/lib > > and > /bind /etc/auto.exports > > in /etc/auto.master > > and you use say: > > docker run -it --rm -v /bind:/bind:slave fedora-autofs:v1 bash > > your saying the above will not propagate those offset trigger mounts > to the parent above the /bind/test mount but will propagate them to > the container? Yes, It works for me. I could not find the fedora-autofs, but used fedora image. Check both cases. The first one (vanilla) is my modified kernel with the patch I mentioned. [root@fedora30 ~]# cat /etc/auto.master /bind /etc/auto.exports Note, there is no options. I added -bind in the config you provided. [root@fedora30 ~]# cat /etc/auto.exports test -bind / :/exports \ /tmp :/exports/tmp \ /lib :/exports/lib [root@fedora30 ~]# uname -a Linux fedora30 5.3.1vanilla+ #9 SMP Tue Oct 1 11:11:11 CDT 2019 x86_64 x86_64 x86_64 GNU/Linux [root@fedora30 ~]# docker run -it --rm -v /bind:/bind:slave fedora bash [root@cf881c09f90a /]# ls /bind [root@cf881c09f90a /]# ls /bind/test lib tmp [root@cf881c09f90a /]# ls /bind/test/lib lib-file However, on existing fedora 30 kernel.. [root@fedora30 ~]# uname -a Linux fedora30 5.2.17-200.fc30.x86_64 #1 SMP Mon Sep 23 13:42:32 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux [root@fedora30 ~]# docker run -it --rm -v /bind:/bind:slave fedora bash [root@0a1d4f3dc475 /]# ls /bind [root@0a1d4f3dc475 /]# ls /bind/test lib tmp [root@0a1d4f3dc475 /]# ls /bind/test/lib <hangs> > > It looks like that check is all or nothing to me? > Can you explain a bit more. > > > > > > > > > > I had originally assumed the problem was mount propagation > > > > > > > back to the parent mount but now I'm not sure that this is > > > > > > > actually what is meant to happen. > > > > > > Goldwyn, > > > > > > TBH I'm already a bit over this particularly since it's a > > > solved problem from my POV. > > > > > > I've gone back as far as Fedora 20 and 3.11.10-301.fc20 also > > > behaves like this. > > > > The problem started with the root directory being mounted as > > shared. > > The change where systemd set the root file system propagation > shared was certainly an autofs pain point (for more than just > this case too) but I was so sure that wasn't when this started > happening. > > But ok, I could be mistaken, and you seem to be sure about. Well, it might as well be with the propagation code. I am not sure what introduced this. > > > > > > Unless someone says this behaviour is not the way kernel > > > mount propagation should behave I'm not going to spend > > > more time on it. > > > > > > The ability to use either "slave" or "private" autofs pseudo > > > mount options in master map mount entries that are susceptible > > > to this mount propagation behaviour was included in autofs-5.1.5 > > > and the patches used are present on kernel.org if you need to > > > back port them to an earlier release. > > > > What about "shared" pseudo mount option? The point is the default > > shared option with automount is broken, and should not be exposed > > at all. > > What about shared mounts? > > I don't know of a case where propagation shared is actually needed. > If you know of one please describe it. No, I don't have a use case for shared mounts. I am merely trying to emphasize the default option (which behaves as shared) is broken. > > The most common case is "slave" and the "private" option was only > included because it might be needed if people are using isolated > environments but TBH I'm not at all sure it could actually be used > for that case. > > IIUC the change to the propagation of the root file system was > done to help with containers but turned out not to do what was > needed and was never reverted. So the propagation shared change > probably should have been propagation slave or not changed at > all. I agree. I am also worried of the swelling /proc/mounts because of this. > > > > > > https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-set-bind-mount-as-propagation-slave.patch > > > > > > https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-add-master-map-pseudo-options-for-mount-propagation.patch > > > > > > It shouldn't be too difficult to back port them but they might > > > have other patch dependencies. I will help with that if you > > > need it. > > > > My problem is not with the patch and the "private" or "slave" flag, > > but > > with the absence of it. We have the patch you mention in our repos. > > Ha, play on words, "absence of it" and "we have it in our repos" I meant, Absence of "private" or "slave" flags. > > Don't you mean the problem is that mount propagation isn't > set correctly automatically by automount. > > > > > I am assuming that users are stupid and they will miss putting the > > flags > > in the auto.master file and wonder why when they try to access the > > directories > > the process hangs. In all, any user configuration should not hang the > > kernel. > > I thought about that when I was working on those patches but, > at the time, I didn't think the propagation problem had started > when the root file system was set propagation shared at boot. > > I still think changing the kernel propagation isn't the right > way to resolve it. > > But I would be willing to add a configuration option to autofs > that when set would use propagation slave for all bind mounts > without the need to modify the master map. Given how long the > problem has been around I'm also tempted to make it default > to enabled. > > I'm not sure yet what that would mean for the existing mount > options "shared" and "private" other than them possibly being > needed if the option is disabled, even with this I'm still not > sure a "shared" option is useful. > > Isn't this automation your main concern? > My main concerns is a user space configuration should not hang the process. This is a problem for people upgrading their kernel/systemd and finding their processes hanging. I am fine with making the change in user space automount daemon keeping slave mounts as default, but then you would leave out a small security window where users can hang the accessing process by modifying/replacing automount.
On Tue, 2019-10-01 at 14:09 -0500, Goldwyn Rodrigues wrote: > Hi Ian, > > Sorry for the late reply, I had to setup and environment for your > specific case and it took time. np. > > On 9:47 28/09, Ian Kent wrote: > > On Fri, 2019-09-27 at 11:16 -0500, Goldwyn Rodrigues wrote: > > > On 18:51 27/09, Ian Kent wrote: > > > > On Fri, 2019-09-27 at 15:41 +0800, Ian Kent wrote: > > > > > > > I initially thought this was the result of a "fix" in the > > > > > > > mount > > > > > > > propagation code but it occurred to me that propagation > > > > > > > is > > > > > > > meant > > > > > > > to occur between mount trees not within them so this > > > > > > > might be > > > > > > > a > > > > > > > bug. > > > > > > > > > > > > > > I probably should have worked out exactly what upstream > > > > > > > kernel > > > > > > > this started happening in and then done a bisect and > > > > > > > tried to > > > > > > > work out if the change was doing what it was supposed to. > > > > > > > > > > > > > > Anyway, I'll need to do that now for us to discuss this > > > > > > > sensibly. > > > > > > > > > > > > > > > > Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com> > > > > > > > > > > > > > > > > > > diff --git a/fs/pnode.c b/fs/pnode.c > > > > > > > > > index 49f6d7ff2139..b960805d7954 100644 > > > > > > > > > --- a/fs/pnode.c > > > > > > > > > +++ b/fs/pnode.c > > > > > > > > > @@ -292,6 +292,9 @@ int propagate_mnt(struct mount > > > > > > > > > *dest_mnt, > > > > > > > > > struct > > > > > > > > > mountpoint *dest_mp, > > > > > > > > > struct mount *m, *n; > > > > > > > > > int ret = 0; > > > > > > > > > > > > > > > > > > + if (source_mnt->mnt_mountpoint->d_flags & > > > > > > > > > DCACHE_NEED_AUTOMOUNT) > > > > > > > > > + return 0; > > > > > > > > > + > > > > > > > > > > > > > > > > Possible problem with this is it will probably prevent > > > > > > > > mount > > > > > > > > propagation in both directions which will break stuff. > > > > > > No, I am specifically checking when the source has a automount > > > flag > > > set. > > > It will block only one way. I checked it with an example. > > > > I don't understand how this check can selectively block > > propagation? > > > > If you have say: > > test / :/exports \ > > /tmp :/exports/tmp \ > > /lib :/exports/lib > > > > and > > /bind /etc/auto.exports > > > > in /etc/auto.master > > > > and you use say: > > > > docker run -it --rm -v /bind:/bind:slave fedora-autofs:v1 bash > > > > your saying the above will not propagate those offset trigger > > mounts > > to the parent above the /bind/test mount but will propagate them to > > the container? > > Yes, It works for me. I could not find the fedora-autofs, but used > fedora image. Check both cases. The first one (vanilla) is my > modified > kernel with the patch I mentioned. Sorry, my bad, fedora-autofs is simply fedora container image setup to test autofs, the straight fedora image (or any other Linux os image for that matter) should be fine for this. > > [root@fedora30 ~]# cat /etc/auto.master > /bind /etc/auto.exports > > Note, there is no options. I added -bind in the config you provided. Ok, I thought that wouldn't be needed, but the /exports directory tree would need to exist in the container file system. > > [root@fedora30 ~]# cat /etc/auto.exports > test -bind / :/exports \ > /tmp :/exports/tmp \ > /lib :/exports/lib > > [root@fedora30 ~]# uname -a > Linux fedora30 5.3.1vanilla+ #9 SMP Tue Oct 1 11:11:11 CDT 2019 > x86_64 x86_64 x86_64 GNU/Linux > [root@fedora30 ~]# docker run -it --rm -v /bind:/bind:slave fedora > bash > [root@cf881c09f90a /]# ls /bind > [root@cf881c09f90a /]# ls /bind/test > lib tmp > [root@cf881c09f90a /]# ls /bind/test/lib > lib-file Ok, maybe it's the container root file system being marked as propagation private that makes this behave differently or it could be because the container using a different mount name space. Still, it's curious that you get mounts from the init mount name space propagating to the container with that kernel change. I don't understand what's going on there, the change looks like it prevents propagation across the board. Or maybe it isn't actually mount propagation that causes this to happen and the change papers over some other bug. I admit I always thought it was the propagation too. > > However, on existing fedora 30 kernel.. > > [root@fedora30 ~]# uname -a > Linux fedora30 5.2.17-200.fc30.x86_64 #1 SMP Mon Sep 23 13:42:32 UTC > 2019 x86_64 x86_64 x86_64 GNU/Linux > [root@fedora30 ~]# docker run -it --rm -v /bind:/bind:slave fedora > bash > [root@0a1d4f3dc475 /]# ls /bind > [root@0a1d4f3dc475 /]# ls /bind/test > lib tmp > [root@0a1d4f3dc475 /]# ls /bind/test/lib > <hangs> > Ok, I'll have a look at that case myself but it sounds like the unwanted propagation is occurring with the unpatched kernel. > > > It looks like that check is all or nothing to me? > > Can you explain a bit more. > > > > > > > > > > I had originally assumed the problem was mount > > > > > > > > propagation > > > > > > > > back to the parent mount but now I'm not sure that this > > > > > > > > is > > > > > > > > actually what is meant to happen. > > > > > > > > Goldwyn, > > > > > > > > TBH I'm already a bit over this particularly since it's a > > > > solved problem from my POV. > > > > > > > > I've gone back as far as Fedora 20 and 3.11.10-301.fc20 also > > > > behaves like this. > > > > > > The problem started with the root directory being mounted as > > > shared. > > > > The change where systemd set the root file system propagation > > shared was certainly an autofs pain point (for more than just > > this case too) but I was so sure that wasn't when this started > > happening. > > > > But ok, I could be mistaken, and you seem to be sure about. > > Well, it might as well be with the propagation code. I am not sure > what introduced this. Yeah, it is an annoying problem, I can certainly agree with that. I've had difficulty working out what's going on with the mount propagation code over time (there's so many special cases in various places) so I'd really like to understand how this change appears to achieve what's needed. It doesn't look like it should. > > > > > Unless someone says this behaviour is not the way kernel > > > > mount propagation should behave I'm not going to spend > > > > more time on it. > > > > > > > > The ability to use either "slave" or "private" autofs pseudo > > > > mount options in master map mount entries that are susceptible > > > > to this mount propagation behaviour was included in autofs- > > > > 5.1.5 > > > > and the patches used are present on kernel.org if you need to > > > > back port them to an earlier release. > > > > > > What about "shared" pseudo mount option? The point is the default > > > shared option with automount is broken, and should not be exposed > > > at all. > > > > What about shared mounts? > > > > I don't know of a case where propagation shared is actually needed. > > If you know of one please describe it. > > No, I don't have a use case for shared mounts. I am merely trying to > emphasize the default option (which behaves as shared) is broken. Yes, the inheritance of propagation shared from the root breaks the mounting of autofs trigger mounts within a bind mount that's mounted onto a directory in the root file system. That's the only case I've seen myself but there may be more. It is a special case we don't come across that often. I think that's the reason it went unnoticed for so long. > > > The most common case is "slave" and the "private" option was only > > included because it might be needed if people are using isolated > > environments but TBH I'm not at all sure it could actually be used > > for that case. > > > > IIUC the change to the propagation of the root file system was > > done to help with containers but turned out not to do what was > > needed and was never reverted. So the propagation shared change > > probably should have been propagation slave or not changed at > > all. > > I agree. I am also worried of the swelling /proc/mounts because > of this. I think that's the least of our problems with this. The deadlock that results from an autofs trigger mount trying to mount on the propagated trigger mount (that should not exist) is the real problem. > > > > > https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-set-bind-mount-as-propagation-slave.patch > > > > > > > > https://mirrors.edge.kernel.org/pub/linux/daemons/autofs/v5/patches-5.1.5/autofs-5.1.4-add-master-map-pseudo-options-for-mount-propagation.patch > > > > > > > > It shouldn't be too difficult to back port them but they might > > > > have other patch dependencies. I will help with that if you > > > > need it. > > > > > > My problem is not with the patch and the "private" or "slave" > > > flag, > > > but > > > with the absence of it. We have the patch you mention in our > > > repos. > > > > Ha, play on words, "absence of it" and "we have it in our repos" > > I meant, Absence of "private" or "slave" flags. Yea, my little joke may have been in poor taste. > > > Don't you mean the problem is that mount propagation isn't > > set correctly automatically by automount. > > > > > I am assuming that users are stupid and they will miss putting > > > the > > > flags > > > in the auto.master file and wonder why when they try to access > > > the > > > directories > > > the process hangs. In all, any user configuration should not hang > > > the > > > kernel. > > > > I thought about that when I was working on those patches but, > > at the time, I didn't think the propagation problem had started > > when the root file system was set propagation shared at boot. > > > > I still think changing the kernel propagation isn't the right > > way to resolve it. > > > > But I would be willing to add a configuration option to autofs > > that when set would use propagation slave for all bind mounts > > without the need to modify the master map. Given how long the > > problem has been around I'm also tempted to make it default > > to enabled. > > > > I'm not sure yet what that would mean for the existing mount > > options "shared" and "private" other than them possibly being > > needed if the option is disabled, even with this I'm still not > > sure a "shared" option is useful. > > > > Isn't this automation your main concern? > > > > My main concerns is a user space configuration should not hang > the process. > > This is a problem for people upgrading their kernel/systemd > and finding their processes hanging. > > I am fine with making the change in user space automount daemon > keeping > slave mounts as default, but then you would leave out a small > security > window where users can hang the accessing process by > modifying/replacing > automount. The same could be said of replacing the kernel since the problem has been around for so long. Anyway, it does sound like it's worth putting time into your suggestion of a kernel change. Unfortunately I think it's going to take a while to work out what's actually going on with the propagation and I'm in the middle of some other pressing work right now. Ian
Hi Ian, On 10:14 02/10, Ian Kent wrote: > On Tue, 2019-10-01 at 14:09 -0500, Goldwyn Rodrigues wrote: <snip> > Anyway, it does sound like it's worth putting time into > your suggestion of a kernel change. > > Unfortunately I think it's going to take a while to work > out what's actually going on with the propagation and I'm > in the middle of some other pressing work right now. Have you have made any progress on this issue? As I mentioned, I am fine with a userspace solution of defaulting to slave mounts all of the time instead of this kernel patch.
On Mon, 2019-10-28 at 11:28 -0500, Goldwyn Rodrigues wrote: > Hi Ian, > > On 10:14 02/10, Ian Kent wrote: > > On Tue, 2019-10-01 at 14:09 -0500, Goldwyn Rodrigues wrote: > <snip> > > > Anyway, it does sound like it's worth putting time into > > your suggestion of a kernel change. > > > > Unfortunately I think it's going to take a while to work > > out what's actually going on with the propagation and I'm > > in the middle of some other pressing work right now. > > Have you have made any progress on this issue? Sorry I haven't. I still want to try and understand what's going on there. > As I mentioned, I am fine with a userspace solution of defaulting > to slave mounts all of the time instead of this kernel patch. Oh, I thought you weren't keen on that recommendation. That shouldn't take long to do so I should be able to get that done and post a patch pretty soon. I'll get back to looking at the mount propagation code when I get a chance. Unfortunately I haven't been very successful when making changes to that area of code in the past ... Ian
On Tue, 2019-10-29 at 07:57 +0800, Ian Kent wrote: > On Mon, 2019-10-28 at 11:28 -0500, Goldwyn Rodrigues wrote: > > Hi Ian, > > > > On 10:14 02/10, Ian Kent wrote: > > > On Tue, 2019-10-01 at 14:09 -0500, Goldwyn Rodrigues wrote: > > <snip> > > > > > Anyway, it does sound like it's worth putting time into > > > your suggestion of a kernel change. > > > > > > Unfortunately I think it's going to take a while to work > > > out what's actually going on with the propagation and I'm > > > in the middle of some other pressing work right now. > > > > Have you have made any progress on this issue? > > Sorry I haven't. > I still want to try and understand what's going on there. > > > As I mentioned, I am fine with a userspace solution of defaulting > > to slave mounts all of the time instead of this kernel patch. > > Oh, I thought you weren't keen on that recommendation. > > That shouldn't take long to do so I should be able to get that done > and post a patch pretty soon. > > I'll get back to looking at the mount propagation code when I get a > chance. Unfortunately I haven't been very successful when making > changes to that area of code in the past ... After working on this patch I'm even more inclined to let the kernel do it's propagation thing and set the autofs mounts, either silently by default or explicitly by map entry option. Because it's the propagation setting of the parent mount that controls the propagation of its children there shouldn't be any chance of a race so this should be reliable. Anyway, here is a patch, compile tested only, and without the changelog hunk I normally add to save you possible conflicts. But unless there are any problems found this is what I will eventually commit to the repo. If there are any changes your not aware of I'll let you know. Clearly this depends on the other two related patches for this issue. -- autofs-5.1.6 - make bind mounts propagation slave by default From: Ian Kent <raven@themaw.net> Make setting mount propagation on bind mounts mandatory with a default of propagation slave. When using multi-mounts that have bind mounts the bind mount will have the same properties as its parent which is commonly propagation shared. And if the mount target is also propagation shared this can lead to a deadlock when attempting to access the offset mounts. When this happens an unwanted offset mount is propagated back to the target file system resulting in a deadlock since the automount target is itself an (unwanted) automount trigger. This problem has been present much longer than I originally thought, perhaps since mount propagation was introduced into the kernel, so explicitly setting bind mount propagation is the sensible thing to do. Signed-off-by: Ian Kent <raven@themaw.net> --- include/automount.h | 9 +++++---- lib/master_parse.y | 11 ++++++++--- lib/master_tok.l | 1 + man/auto.master.5.in | 19 ++++++++++--------- modules/mount_bind.c | 40 ++++++++++++++++++++++------------------ 5 files changed, 46 insertions(+), 34 deletions(-) diff --git a/include/automount.h b/include/automount.h index 4fd0ba96..fe9c7fee 100644 --- a/include/automount.h +++ b/include/automount.h @@ -551,14 +551,15 @@ struct kernel_mod_version { #define MOUNT_FLAG_AMD_CACHE_ALL 0x0080 /* Set mount propagation for bind mounts */ -#define MOUNT_FLAG_SLAVE 0x0100 -#define MOUNT_FLAG_PRIVATE 0x0200 +#define MOUNT_FLAG_SHARED 0x0100 +#define MOUNT_FLAG_SLAVE 0x0200 +#define MOUNT_FLAG_PRIVATE 0x0400 /* Use strict expire semantics if requested and kernel supports it */ -#define MOUNT_FLAG_STRICTEXPIRE 0x0400 +#define MOUNT_FLAG_STRICTEXPIRE 0x0800 /* Indicator for applications to ignore the mount entry */ -#define MOUNT_FLAG_IGNORE 0x0800 +#define MOUNT_FLAG_IGNORE 0x1000 struct autofs_point { pthread_t thid; diff --git a/lib/master_parse.y b/lib/master_parse.y index f817f739..e9589a5a 100644 --- a/lib/master_parse.y +++ b/lib/master_parse.y @@ -59,6 +59,7 @@ static long timeout; static long negative_timeout; static unsigned symlnk; static unsigned strictexpire; +static unsigned shared; static unsigned slave; static unsigned private; static unsigned nobind; @@ -106,7 +107,7 @@ static int master_fprintf(FILE *, char *, ...); %token MAP %token OPT_TIMEOUT OPT_NTIMEOUT OPT_NOBIND OPT_NOGHOST OPT_GHOST OPT_VERBOSE %token OPT_DEBUG OPT_RANDOM OPT_USE_WEIGHT OPT_SYMLINK OPT_MODE -%token OPT_STRICTEXPIRE OPT_SLAVE OPT_PRIVATE +%token OPT_STRICTEXPIRE OPT_SHARED OPT_SLAVE OPT_PRIVATE %token COLON COMMA NL DDASH %type <strtype> map %type <strtype> options @@ -208,6 +209,7 @@ line: | PATH OPT_TIMEOUT { master_notify($1); YYABORT; } | PATH OPT_SYMLINK { master_notify($1); YYABORT; } | PATH OPT_STRICTEXPIRE { master_notify($1); YYABORT; } + | PATH OPT_SHARED { master_notify($1); YYABORT; } | PATH OPT_SLAVE { master_notify($1); YYABORT; } | PATH OPT_PRIVATE { master_notify($1); YYABORT; } | PATH OPT_NOBIND { master_notify($1); YYABORT; } @@ -622,6 +624,7 @@ daemon_option: OPT_TIMEOUT NUMBER { timeout = $2; } | OPT_NTIMEOUT NUMBER { negative_timeout = $2; } | OPT_SYMLINK { symlnk = 1; } | OPT_STRICTEXPIRE { strictexpire = 1; } + | OPT_SHARED { shared = 1; } | OPT_SLAVE { slave = 1; } | OPT_PRIVATE { private = 1; } | OPT_NOBIND { nobind = 1; } @@ -907,8 +910,10 @@ int master_parse_entry(const char *buffer, unsigned int default_timeout, unsigne entry->ap->flags |= MOUNT_FLAG_SYMLINK; if (strictexpire) entry->ap->flags |= MOUNT_FLAG_STRICTEXPIRE; - if (slave) - entry->ap->flags |= MOUNT_FLAG_SLAVE; + /* Default is propagation slave */ + entry->ap->flags |= MOUNT_FLAG_SLAVE; + if (shared) + entry->ap->flags |= MOUNT_FLAG_SHARED; if (private) entry->ap->flags |= MOUNT_FLAG_PRIVATE; if (negative_timeout) diff --git a/lib/master_tok.l b/lib/master_tok.l index 7486710b..87a6b958 100644 --- a/lib/master_tok.l +++ b/lib/master_tok.l @@ -389,6 +389,7 @@ MODE (--mode{OPTWS}|--mode{OPTWS}={OPTWS}) -?symlink { return(OPT_SYMLINK); } -?nobind { return(OPT_NOBIND); } -?nobrowse { return(OPT_NOGHOST); } + -?shared { return(OPT_SHARED); } -?slave { return(OPT_SLAVE); } -?private { return(OPT_PRIVATE); } -?strictexpire { return(OPT_STRICTEXPIRE); } diff --git a/man/auto.master.5.in b/man/auto.master.5.in index 6e510a59..58132d69 100644 --- a/man/auto.master.5.in +++ b/man/auto.master.5.in @@ -208,17 +208,18 @@ applications scanning the mount tree. Note that this doesn't completely resolve the problem of expired automounts being immediately re-mounted due to application accesses triggered by the expire itself. .TP -.I slave \fPor\fI private +.I slave\fP, \fIprivate\fP or \fIshared\fP This option allows mount propagation of bind mounts to be set to -either \fIslave\fP or \fIprivate\fP. This option may be needed when using -multi-mounts that have bind mounts that bind to a file system that is -propagation shared. This is because the bind mount will have the same -properties as its target which causes problems for offset mounts. When -this happens an unwanted offset mount is propagated back to the target -file system resulting in a deadlock when attempting to access the offset. +\fIslave\fP, \fIprivate\fP or \fIshared\fP. This option defaults to +\fIslave\fP if no option is given. When using multi-mounts that have +bind mounts the bind mount will have the same properties as its parent +which is commonly propagation \fIshared\fP. And if the mount target is +also propagation \fIshared\fP this can lead to a deadlock when attempting +to access the offset mounts. When this happens an unwanted offset mount +is propagated back to the target file system resulting in the deadlock +since the automount target is itself an (unwanted) automount trigger. This option is an autofs pseudo mount option that can be used in the -master map only. By default, bind mounts will inherit the mount propagation -of the target file system. +master map only. .TP .I "\-r, \-\-random-multimount-selection" Enables the use of random selection when choosing a host from a diff --git a/modules/mount_bind.c b/modules/mount_bind.c index 9cba0d7a..5253501c 100644 --- a/modules/mount_bind.c +++ b/modules/mount_bind.c @@ -153,6 +153,7 @@ int mount_mount(struct autofs_point *ap, const char *root, const char *name, int if (!symlnk && bind_works) { int status, existed = 1; + int flags; debug(ap->logopt, MODPREFIX "calling mkdir_path %s", fullpath); @@ -190,24 +191,27 @@ int mount_mount(struct autofs_point *ap, const char *root, const char *name, int what, fstype, fullpath); } - if (ap->flags & (MOUNT_FLAG_SLAVE | MOUNT_FLAG_PRIVATE)) { - int flags = MS_SLAVE; - - if (ap->flags & MOUNT_FLAG_PRIVATE) - flags = MS_PRIVATE; - - /* The bind mount has succeeded but if the target - * mount is propagation shared propagation of child - * mounts (autofs offset mounts for example) back to - * the target of the bind mount must be avoided or - * autofs trigger mounts will deadlock. - */ - err = mount(NULL, fullpath, NULL, flags, NULL); - if (err) { - warn(ap->logopt, - "failed to set propagation for %s", - fullpath, root); - } + /* The bind mount has succeeded, now set the mount propagation. + * + * The default is propagation shared, change it if the master + * map entry has a different option specified. + */ + flags = MS_SLAVE; + if (ap->flags & MOUNT_FLAG_PRIVATE) + flags = MS_PRIVATE; + else if (ap->flags & MOUNT_FLAG_SHARED) + flags = MS_SHARED; + + /* Note: If the parent mount is propagation shared propagation + * of child mounts (autofs offset mounts for example) back to + * the target of the bind mount can happen in some cases and + * must be avoided or autofs trigger mounts will deadlock. + */ + err = mount(NULL, fullpath, NULL, flags, NULL); + if (err) { + warn(ap->logopt, + "failed to set propagation for %s", + fullpath, root); } return 0;
On Tue, 2019-10-29 at 14:39 +0800, Ian Kent wrote: > > After working on this patch I'm even more inclined to let the kernel > do it's propagation thing and set the autofs mounts, either silently > by default or explicitly by map entry option. > > Because it's the propagation setting of the parent mount that > controls > the propagation of its children there shouldn't be any chance of a > race so this should be reliable. > > Anyway, here is a patch, compile tested only, and without the > changelog > hunk I normally add to save you possible conflicts. But unless there > are any problems found this is what I will eventually commit to the > repo. > > If there are any changes your not aware of I'll let you know. > > Clearly this depends on the other two related patches for this issue. Oh, forget to mention, this is compile tested only so far. Please let me know how it goes. Ian
On 14:39 29/10, Ian Kent wrote: > On Tue, 2019-10-29 at 07:57 +0800, Ian Kent wrote: > > On Mon, 2019-10-28 at 11:28 -0500, Goldwyn Rodrigues wrote: > > > Hi Ian, > > > > > > On 10:14 02/10, Ian Kent wrote: > > > > On Tue, 2019-10-01 at 14:09 -0500, Goldwyn Rodrigues wrote: > > > <snip> > > > > > > > Anyway, it does sound like it's worth putting time into > > > > your suggestion of a kernel change. > > > > > > > > Unfortunately I think it's going to take a while to work > > > > out what's actually going on with the propagation and I'm > > > > in the middle of some other pressing work right now. > > > > > > Have you have made any progress on this issue? > > > > Sorry I haven't. > > I still want to try and understand what's going on there. > > > > > As I mentioned, I am fine with a userspace solution of defaulting > > > to slave mounts all of the time instead of this kernel patch. > > > > Oh, I thought you weren't keen on that recommendation. > > > > That shouldn't take long to do so I should be able to get that done > > and post a patch pretty soon. > > > > I'll get back to looking at the mount propagation code when I get a > > chance. Unfortunately I haven't been very successful when making > > changes to that area of code in the past ... > > After working on this patch I'm even more inclined to let the kernel > do it's propagation thing and set the autofs mounts, either silently > by default or explicitly by map entry option. > > Because it's the propagation setting of the parent mount that controls > the propagation of its children there shouldn't be any chance of a > race so this should be reliable. > > Anyway, here is a patch, compile tested only, and without the changelog > hunk I normally add to save you possible conflicts. But unless there > are any problems found this is what I will eventually commit to the > repo. > > If there are any changes your not aware of I'll let you know. > > Clearly this depends on the other two related patches for this issue. This works good for us. Thanks. However, I have some review comments for the patch. > -- > > autofs-5.1.6 - make bind mounts propagation slave by default > > From: Ian Kent <raven@themaw.net> > > Make setting mount propagation on bind mounts mandatory with a default > of propagation slave. > > When using multi-mounts that have bind mounts the bind mount will have > the same properties as its parent which is commonly propagation shared. > And if the mount target is also propagation shared this can lead to a > deadlock when attempting to access the offset mounts. When this happens > an unwanted offset mount is propagated back to the target file system > resulting in a deadlock since the automount target is itself an > (unwanted) automount trigger. > > This problem has been present much longer than I originally thought, > perhaps since mount propagation was introduced into the kernel, so > explicitly setting bind mount propagation is the sensible thing to do. > > Signed-off-by: Ian Kent <raven@themaw.net> > --- > include/automount.h | 9 +++++---- > lib/master_parse.y | 11 ++++++++--- > lib/master_tok.l | 1 + > man/auto.master.5.in | 19 ++++++++++--------- > modules/mount_bind.c | 40 ++++++++++++++++++++++------------------ > 5 files changed, 46 insertions(+), 34 deletions(-) > > diff --git a/include/automount.h b/include/automount.h > index 4fd0ba96..fe9c7fee 100644 > --- a/include/automount.h > +++ b/include/automount.h > @@ -551,14 +551,15 @@ struct kernel_mod_version { > #define MOUNT_FLAG_AMD_CACHE_ALL 0x0080 > > /* Set mount propagation for bind mounts */ > -#define MOUNT_FLAG_SLAVE 0x0100 > -#define MOUNT_FLAG_PRIVATE 0x0200 > +#define MOUNT_FLAG_SHARED 0x0100 > +#define MOUNT_FLAG_SLAVE 0x0200 > +#define MOUNT_FLAG_PRIVATE 0x0400 > > /* Use strict expire semantics if requested and kernel supports it */ > -#define MOUNT_FLAG_STRICTEXPIRE 0x0400 > +#define MOUNT_FLAG_STRICTEXPIRE 0x0800 > > /* Indicator for applications to ignore the mount entry */ > -#define MOUNT_FLAG_IGNORE 0x0800 > +#define MOUNT_FLAG_IGNORE 0x1000 > > struct autofs_point { > pthread_t thid; > diff --git a/lib/master_parse.y b/lib/master_parse.y > index f817f739..e9589a5a 100644 > --- a/lib/master_parse.y > +++ b/lib/master_parse.y > @@ -59,6 +59,7 @@ static long timeout; > static long negative_timeout; > static unsigned symlnk; > static unsigned strictexpire; > +static unsigned shared; > static unsigned slave; > static unsigned private; > static unsigned nobind; > @@ -106,7 +107,7 @@ static int master_fprintf(FILE *, char *, ...); > %token MAP > %token OPT_TIMEOUT OPT_NTIMEOUT OPT_NOBIND OPT_NOGHOST OPT_GHOST OPT_VERBOSE > %token OPT_DEBUG OPT_RANDOM OPT_USE_WEIGHT OPT_SYMLINK OPT_MODE > -%token OPT_STRICTEXPIRE OPT_SLAVE OPT_PRIVATE > +%token OPT_STRICTEXPIRE OPT_SHARED OPT_SLAVE OPT_PRIVATE > %token COLON COMMA NL DDASH > %type <strtype> map > %type <strtype> options > @@ -208,6 +209,7 @@ line: > | PATH OPT_TIMEOUT { master_notify($1); YYABORT; } > | PATH OPT_SYMLINK { master_notify($1); YYABORT; } > | PATH OPT_STRICTEXPIRE { master_notify($1); YYABORT; } > + | PATH OPT_SHARED { master_notify($1); YYABORT; } > | PATH OPT_SLAVE { master_notify($1); YYABORT; } > | PATH OPT_PRIVATE { master_notify($1); YYABORT; } > | PATH OPT_NOBIND { master_notify($1); YYABORT; } > @@ -622,6 +624,7 @@ daemon_option: OPT_TIMEOUT NUMBER { timeout = $2; } > | OPT_NTIMEOUT NUMBER { negative_timeout = $2; } > | OPT_SYMLINK { symlnk = 1; } > | OPT_STRICTEXPIRE { strictexpire = 1; } > + | OPT_SHARED { shared = 1; } > | OPT_SLAVE { slave = 1; } > | OPT_PRIVATE { private = 1; } > | OPT_NOBIND { nobind = 1; } > @@ -907,8 +910,10 @@ int master_parse_entry(const char *buffer, unsigned int default_timeout, unsigne > entry->ap->flags |= MOUNT_FLAG_SYMLINK; > if (strictexpire) > entry->ap->flags |= MOUNT_FLAG_STRICTEXPIRE; > - if (slave) > - entry->ap->flags |= MOUNT_FLAG_SLAVE; > + /* Default is propagation slave */ > + entry->ap->flags |= MOUNT_FLAG_SLAVE; > + if (shared) > + entry->ap->flags |= MOUNT_FLAG_SHARED; > if (private) > entry->ap->flags |= MOUNT_FLAG_PRIVATE; If the user mention shared or private flag, you will end up enabling both MOUNT_FLAG_SLAVE and MOUNT_FLAG_SHARED. It would be better to put it in a if..else if..else sequence. These are options are mutually exclusive. It may not affect your current implementation, it will show up as a bug if there are further changes in the mount() sequence.
On Tue, 2019-10-29 at 11:00 -0500, Goldwyn Rodrigues wrote: > On 14:39 29/10, Ian Kent wrote: > > On Tue, 2019-10-29 at 07:57 +0800, Ian Kent wrote: > > > On Mon, 2019-10-28 at 11:28 -0500, Goldwyn Rodrigues wrote: > > > > Hi Ian, > > > > > > > > On 10:14 02/10, Ian Kent wrote: > > > > > On Tue, 2019-10-01 at 14:09 -0500, Goldwyn Rodrigues wrote: > > > > <snip> > > > > > > > > > Anyway, it does sound like it's worth putting time into > > > > > your suggestion of a kernel change. > > > > > > > > > > Unfortunately I think it's going to take a while to work > > > > > out what's actually going on with the propagation and I'm > > > > > in the middle of some other pressing work right now. > > > > > > > > Have you have made any progress on this issue? > > > > > > Sorry I haven't. > > > I still want to try and understand what's going on there. > > > > > > > As I mentioned, I am fine with a userspace solution of > > > > defaulting > > > > to slave mounts all of the time instead of this kernel patch. > > > > > > Oh, I thought you weren't keen on that recommendation. > > > > > > That shouldn't take long to do so I should be able to get that > > > done > > > and post a patch pretty soon. > > > > > > I'll get back to looking at the mount propagation code when I get > > > a > > > chance. Unfortunately I haven't been very successful when making > > > changes to that area of code in the past ... > > > > After working on this patch I'm even more inclined to let the > > kernel > > do it's propagation thing and set the autofs mounts, either > > silently > > by default or explicitly by map entry option. > > > > Because it's the propagation setting of the parent mount that > > controls > > the propagation of its children there shouldn't be any chance of a > > race so this should be reliable. > > > > Anyway, here is a patch, compile tested only, and without the > > changelog > > hunk I normally add to save you possible conflicts. But unless > > there > > are any problems found this is what I will eventually commit to the > > repo. > > > > If there are any changes your not aware of I'll let you know. > > > > Clearly this depends on the other two related patches for this > > issue. > > This works good for us. Thanks. > However, I have some review comments for the patch. > > > -- > > > > autofs-5.1.6 - make bind mounts propagation slave by default > > > > From: Ian Kent <raven@themaw.net> > > > > Make setting mount propagation on bind mounts mandatory with a > > default > > of propagation slave. > > > > When using multi-mounts that have bind mounts the bind mount will > > have > > the same properties as its parent which is commonly propagation > > shared. > > And if the mount target is also propagation shared this can lead to > > a > > deadlock when attempting to access the offset mounts. When this > > happens > > an unwanted offset mount is propagated back to the target file > > system > > resulting in a deadlock since the automount target is itself an > > (unwanted) automount trigger. > > > > This problem has been present much longer than I originally > > thought, > > perhaps since mount propagation was introduced into the kernel, so > > explicitly setting bind mount propagation is the sensible thing to > > do. > > > > Signed-off-by: Ian Kent <raven@themaw.net> > > --- > > include/automount.h | 9 +++++---- > > lib/master_parse.y | 11 ++++++++--- > > lib/master_tok.l | 1 + > > man/auto.master.5.in | 19 ++++++++++--------- > > modules/mount_bind.c | 40 ++++++++++++++++++++++-------------- > > ---- > > 5 files changed, 46 insertions(+), 34 deletions(-) > > > > diff --git a/include/automount.h b/include/automount.h > > index 4fd0ba96..fe9c7fee 100644 > > --- a/include/automount.h > > +++ b/include/automount.h > > @@ -551,14 +551,15 @@ struct kernel_mod_version { > > #define MOUNT_FLAG_AMD_CACHE_ALL 0x0080 > > > > /* Set mount propagation for bind mounts */ > > -#define MOUNT_FLAG_SLAVE 0x0100 > > -#define MOUNT_FLAG_PRIVATE 0x0200 > > +#define MOUNT_FLAG_SHARED 0x0100 > > +#define MOUNT_FLAG_SLAVE 0x0200 > > +#define MOUNT_FLAG_PRIVATE 0x0400 > > > > /* Use strict expire semantics if requested and kernel supports it > > */ > > -#define MOUNT_FLAG_STRICTEXPIRE 0x0400 > > +#define MOUNT_FLAG_STRICTEXPIRE 0x0800 > > > > /* Indicator for applications to ignore the mount entry */ > > -#define MOUNT_FLAG_IGNORE 0x0800 > > +#define MOUNT_FLAG_IGNORE 0x1000 > > > > struct autofs_point { > > pthread_t thid; > > diff --git a/lib/master_parse.y b/lib/master_parse.y > > index f817f739..e9589a5a 100644 > > --- a/lib/master_parse.y > > +++ b/lib/master_parse.y > > @@ -59,6 +59,7 @@ static long timeout; > > static long negative_timeout; > > static unsigned symlnk; > > static unsigned strictexpire; > > +static unsigned shared; > > static unsigned slave; > > static unsigned private; > > static unsigned nobind; > > @@ -106,7 +107,7 @@ static int master_fprintf(FILE *, char *, ...); > > %token MAP > > %token OPT_TIMEOUT OPT_NTIMEOUT OPT_NOBIND OPT_NOGHOST OPT_GHOST > > OPT_VERBOSE > > %token OPT_DEBUG OPT_RANDOM OPT_USE_WEIGHT OPT_SYMLINK OPT_MODE > > -%token OPT_STRICTEXPIRE OPT_SLAVE OPT_PRIVATE > > +%token OPT_STRICTEXPIRE OPT_SHARED OPT_SLAVE OPT_PRIVATE > > %token COLON COMMA NL DDASH > > %type <strtype> map > > %type <strtype> options > > @@ -208,6 +209,7 @@ line: > > | PATH OPT_TIMEOUT { master_notify($1); YYABORT; } > > | PATH OPT_SYMLINK { master_notify($1); YYABORT; } > > | PATH OPT_STRICTEXPIRE { master_notify($1); YYABORT; } > > + | PATH OPT_SHARED { master_notify($1); YYABORT; } > > | PATH OPT_SLAVE { master_notify($1); YYABORT; } > > | PATH OPT_PRIVATE { master_notify($1); YYABORT; } > > | PATH OPT_NOBIND { master_notify($1); YYABORT; } > > @@ -622,6 +624,7 @@ daemon_option: OPT_TIMEOUT NUMBER { timeout = > > $2; } > > | OPT_NTIMEOUT NUMBER { negative_timeout = $2; } > > | OPT_SYMLINK { symlnk = 1; } > > | OPT_STRICTEXPIRE { strictexpire = 1; } > > + | OPT_SHARED { shared = 1; } > > | OPT_SLAVE { slave = 1; } > > | OPT_PRIVATE { private = 1; } > > | OPT_NOBIND { nobind = 1; } > > @@ -907,8 +910,10 @@ int master_parse_entry(const char *buffer, > > unsigned int default_timeout, unsigne > > entry->ap->flags |= MOUNT_FLAG_SYMLINK; > > if (strictexpire) > > entry->ap->flags |= MOUNT_FLAG_STRICTEXPIRE; > > - if (slave) > > - entry->ap->flags |= MOUNT_FLAG_SLAVE; > > + /* Default is propagation slave */ > > + entry->ap->flags |= MOUNT_FLAG_SLAVE; > > + if (shared) > > + entry->ap->flags |= MOUNT_FLAG_SHARED; > > if (private) > > entry->ap->flags |= MOUNT_FLAG_PRIVATE; > > If the user mention shared or private flag, you will end up > enabling both MOUNT_FLAG_SLAVE and MOUNT_FLAG_SHARED. > It would be better to put it in a if..else if..else sequence. > These are options are mutually exclusive. Thanks for noticing this obvious blunder. I'll fix that up and send an update. Ian
On Wed, 2019-10-30 at 14:01 +0800, Ian Kent wrote: > On Tue, 2019-10-29 at 11:00 -0500, Goldwyn Rodrigues wrote: > > On 14:39 29/10, Ian Kent wrote: > > > On Tue, 2019-10-29 at 07:57 +0800, Ian Kent wrote: > > > > On Mon, 2019-10-28 at 11:28 -0500, Goldwyn Rodrigues wrote: > > > > > Hi Ian, > > > > > > > > > > On 10:14 02/10, Ian Kent wrote: > > > > > > On Tue, 2019-10-01 at 14:09 -0500, Goldwyn Rodrigues wrote: > > > > > <snip> > > > > > > > > > > > Anyway, it does sound like it's worth putting time into > > > > > > your suggestion of a kernel change. > > > > > > > > > > > > Unfortunately I think it's going to take a while to work > > > > > > out what's actually going on with the propagation and I'm > > > > > > in the middle of some other pressing work right now. > > > > > > > > > > Have you have made any progress on this issue? > > > > > > > > Sorry I haven't. > > > > I still want to try and understand what's going on there. > > > > > > > > > As I mentioned, I am fine with a userspace solution of > > > > > defaulting > > > > > to slave mounts all of the time instead of this kernel patch. > > > > > > > > Oh, I thought you weren't keen on that recommendation. > > > > > > > > That shouldn't take long to do so I should be able to get that > > > > done > > > > and post a patch pretty soon. > > > > > > > > I'll get back to looking at the mount propagation code when I > > > > get > > > > a > > > > chance. Unfortunately I haven't been very successful when > > > > making > > > > changes to that area of code in the past ... > > > > > > After working on this patch I'm even more inclined to let the > > > kernel > > > do it's propagation thing and set the autofs mounts, either > > > silently > > > by default or explicitly by map entry option. > > > > > > Because it's the propagation setting of the parent mount that > > > controls > > > the propagation of its children there shouldn't be any chance of > > > a > > > race so this should be reliable. > > > > > > Anyway, here is a patch, compile tested only, and without the > > > changelog > > > hunk I normally add to save you possible conflicts. But unless > > > there > > > are any problems found this is what I will eventually commit to > > > the > > > repo. > > > > > > If there are any changes your not aware of I'll let you know. > > > > > > Clearly this depends on the other two related patches for this > > > issue. > > > > This works good for us. Thanks. > > However, I have some review comments for the patch. > > > > > -- > > > > > > autofs-5.1.6 - make bind mounts propagation slave by default > > > > > > From: Ian Kent <raven@themaw.net> > > > > > > Make setting mount propagation on bind mounts mandatory with a > > > default > > > of propagation slave. > > > > > > When using multi-mounts that have bind mounts the bind mount will > > > have > > > the same properties as its parent which is commonly propagation > > > shared. > > > And if the mount target is also propagation shared this can lead > > > to > > > a > > > deadlock when attempting to access the offset mounts. When this > > > happens > > > an unwanted offset mount is propagated back to the target file > > > system > > > resulting in a deadlock since the automount target is itself an > > > (unwanted) automount trigger. > > > > > > This problem has been present much longer than I originally > > > thought, > > > perhaps since mount propagation was introduced into the kernel, > > > so > > > explicitly setting bind mount propagation is the sensible thing > > > to > > > do. > > > > > > Signed-off-by: Ian Kent <raven@themaw.net> > > > --- > > > include/automount.h | 9 +++++---- > > > lib/master_parse.y | 11 ++++++++--- > > > lib/master_tok.l | 1 + > > > man/auto.master.5.in | 19 ++++++++++--------- > > > modules/mount_bind.c | 40 ++++++++++++++++++++++-------------- > > > ---- > > > 5 files changed, 46 insertions(+), 34 deletions(-) > > > > > > diff --git a/include/automount.h b/include/automount.h > > > index 4fd0ba96..fe9c7fee 100644 > > > --- a/include/automount.h > > > +++ b/include/automount.h > > > @@ -551,14 +551,15 @@ struct kernel_mod_version { > > > #define MOUNT_FLAG_AMD_CACHE_ALL 0x0080 > > > > > > /* Set mount propagation for bind mounts */ > > > -#define MOUNT_FLAG_SLAVE 0x0100 > > > -#define MOUNT_FLAG_PRIVATE 0x0200 > > > +#define MOUNT_FLAG_SHARED 0x0100 > > > +#define MOUNT_FLAG_SLAVE 0x0200 > > > +#define MOUNT_FLAG_PRIVATE 0x0400 > > > > > > /* Use strict expire semantics if requested and kernel supports > > > it > > > */ > > > -#define MOUNT_FLAG_STRICTEXPIRE 0x0400 > > > +#define MOUNT_FLAG_STRICTEXPIRE 0x0800 > > > > > > /* Indicator for applications to ignore the mount entry */ > > > -#define MOUNT_FLAG_IGNORE 0x0800 > > > +#define MOUNT_FLAG_IGNORE 0x1000 > > > > > > struct autofs_point { > > > pthread_t thid; > > > diff --git a/lib/master_parse.y b/lib/master_parse.y > > > index f817f739..e9589a5a 100644 > > > --- a/lib/master_parse.y > > > +++ b/lib/master_parse.y > > > @@ -59,6 +59,7 @@ static long timeout; > > > static long negative_timeout; > > > static unsigned symlnk; > > > static unsigned strictexpire; > > > +static unsigned shared; > > > static unsigned slave; > > > static unsigned private; > > > static unsigned nobind; > > > @@ -106,7 +107,7 @@ static int master_fprintf(FILE *, char *, > > > ...); > > > %token MAP > > > %token OPT_TIMEOUT OPT_NTIMEOUT OPT_NOBIND OPT_NOGHOST OPT_GHOST > > > OPT_VERBOSE > > > %token OPT_DEBUG OPT_RANDOM OPT_USE_WEIGHT OPT_SYMLINK OPT_MODE > > > -%token OPT_STRICTEXPIRE OPT_SLAVE OPT_PRIVATE > > > +%token OPT_STRICTEXPIRE OPT_SHARED OPT_SLAVE OPT_PRIVATE > > > %token COLON COMMA NL DDASH > > > %type <strtype> map > > > %type <strtype> options > > > @@ -208,6 +209,7 @@ line: > > > | PATH OPT_TIMEOUT { master_notify($1); YYABORT; } > > > | PATH OPT_SYMLINK { master_notify($1); YYABORT; } > > > | PATH OPT_STRICTEXPIRE { master_notify($1); YYABORT; } > > > + | PATH OPT_SHARED { master_notify($1); YYABORT; } > > > | PATH OPT_SLAVE { master_notify($1); YYABORT; } > > > | PATH OPT_PRIVATE { master_notify($1); YYABORT; } > > > | PATH OPT_NOBIND { master_notify($1); YYABORT; } > > > @@ -622,6 +624,7 @@ daemon_option: OPT_TIMEOUT NUMBER { timeout = > > > $2; } > > > | OPT_NTIMEOUT NUMBER { negative_timeout = $2; } > > > | OPT_SYMLINK { symlnk = 1; } > > > | OPT_STRICTEXPIRE { strictexpire = 1; } > > > + | OPT_SHARED { shared = 1; } > > > | OPT_SLAVE { slave = 1; } > > > | OPT_PRIVATE { private = 1; } > > > | OPT_NOBIND { nobind = 1; } > > > @@ -907,8 +910,10 @@ int master_parse_entry(const char *buffer, > > > unsigned int default_timeout, unsigne > > > entry->ap->flags |= MOUNT_FLAG_SYMLINK; > > > if (strictexpire) > > > entry->ap->flags |= MOUNT_FLAG_STRICTEXPIRE; > > > - if (slave) > > > - entry->ap->flags |= MOUNT_FLAG_SLAVE; > > > + /* Default is propagation slave */ > > > + entry->ap->flags |= MOUNT_FLAG_SLAVE; > > > + if (shared) > > > + entry->ap->flags |= MOUNT_FLAG_SHARED; > > > if (private) > > > entry->ap->flags |= MOUNT_FLAG_PRIVATE; > > > > If the user mention shared or private flag, you will end up > > enabling both MOUNT_FLAG_SLAVE and MOUNT_FLAG_SHARED. > > It would be better to put it in a if..else if..else sequence. > > These are options are mutually exclusive. > > Thanks for noticing this obvious blunder. > I'll fix that up and send an update. And the new variable, shared, initialization is missing too! > > Ian
On Tue, 2019-10-29 at 11:00 -0500, Goldwyn Rodrigues wrote: > On 14:39 29/10, Ian Kent wrote: > > On Tue, 2019-10-29 at 07:57 +0800, Ian Kent wrote: > > > On Mon, 2019-10-28 at 11:28 -0500, Goldwyn Rodrigues wrote: > > > > Hi Ian, > > > > > > > > On 10:14 02/10, Ian Kent wrote: > > > > > On Tue, 2019-10-01 at 14:09 -0500, Goldwyn Rodrigues wrote: > > > > <snip> > > > > > > > > > Anyway, it does sound like it's worth putting time into > > > > > your suggestion of a kernel change. > > > > > > > > > > Unfortunately I think it's going to take a while to work > > > > > out what's actually going on with the propagation and I'm > > > > > in the middle of some other pressing work right now. > > > > > > > > Have you have made any progress on this issue? > > > > > > Sorry I haven't. > > > I still want to try and understand what's going on there. > > > > > > > As I mentioned, I am fine with a userspace solution of > > > > defaulting > > > > to slave mounts all of the time instead of this kernel patch. > > > > > > Oh, I thought you weren't keen on that recommendation. > > > > > > That shouldn't take long to do so I should be able to get that > > > done > > > and post a patch pretty soon. > > > > > > I'll get back to looking at the mount propagation code when I get > > > a > > > chance. Unfortunately I haven't been very successful when making > > > changes to that area of code in the past ... > > > > After working on this patch I'm even more inclined to let the > > kernel > > do it's propagation thing and set the autofs mounts, either > > silently > > by default or explicitly by map entry option. > > > > Because it's the propagation setting of the parent mount that > > controls > > the propagation of its children there shouldn't be any chance of a > > race so this should be reliable. > > > > Anyway, here is a patch, compile tested only, and without the > > changelog > > hunk I normally add to save you possible conflicts. But unless > > there > > are any problems found this is what I will eventually commit to the > > repo. > > > > If there are any changes your not aware of I'll let you know. > > > > Clearly this depends on the other two related patches for this > > issue. > > This works good for us. Thanks. > However, I have some review comments for the patch. I think this one should do the trick. autofs-5.1.6 - make bind mounts propagation slave by default From: Ian Kent <raven@themaw.net> Make setting mount propagation on bind mounts mandatory with a default of propagation slave. When using multi-mounts that have bind mounts the bind mount will have the same properties as its parent which is commonly propagation shared. And if the mount target is also propagation shared this can lead to a deadlock when attempting to access the offset mounts. When this happens an unwanted offset mount is propagated back to the target file system resulting in a deadlock since the automount target is itself an (unwanted) automount trigger. This problem has been present much longer than I originally thought, perhaps since mount propagation was introduced into the kernel, so explicitly setting bind mount propagation is the sensible thing to do. Signed-off-by: Ian Kent <raven@themaw.net> --- include/automount.h | 9 +++++---- lib/master_parse.y | 24 +++++++++++++----------- lib/master_tok.l | 1 + man/auto.master.5.in | 19 ++++++++++--------- modules/mount_bind.c | 40 ++++++++++++++++++++++------------------ 5 files changed, 51 insertions(+), 42 deletions(-) diff --git a/include/automount.h b/include/automount.h index 4fd0ba96..fe9c7fee 100644 --- a/include/automount.h +++ b/include/automount.h @@ -551,14 +551,15 @@ struct kernel_mod_version { #define MOUNT_FLAG_AMD_CACHE_ALL 0x0080 /* Set mount propagation for bind mounts */ -#define MOUNT_FLAG_SLAVE 0x0100 -#define MOUNT_FLAG_PRIVATE 0x0200 +#define MOUNT_FLAG_SHARED 0x0100 +#define MOUNT_FLAG_SLAVE 0x0200 +#define MOUNT_FLAG_PRIVATE 0x0400 /* Use strict expire semantics if requested and kernel supports it */ -#define MOUNT_FLAG_STRICTEXPIRE 0x0400 +#define MOUNT_FLAG_STRICTEXPIRE 0x0800 /* Indicator for applications to ignore the mount entry */ -#define MOUNT_FLAG_IGNORE 0x0800 +#define MOUNT_FLAG_IGNORE 0x1000 struct autofs_point { pthread_t thid; diff --git a/lib/master_parse.y b/lib/master_parse.y index f817f739..3f36b0aa 100644 --- a/lib/master_parse.y +++ b/lib/master_parse.y @@ -59,8 +59,6 @@ static long timeout; static long negative_timeout; static unsigned symlnk; static unsigned strictexpire; -static unsigned slave; -static unsigned private; static unsigned nobind; static unsigned ghost; extern unsigned global_selection_options; @@ -72,6 +70,11 @@ static int tmp_argc; static char **local_argv; static int local_argc; +#define PROPAGATION_SHARED MOUNT_FLAG_SHARED +#define PROPAGATION_SLAVE MOUNT_FLAG_SLAVE +#define PROPAGATION_PRIVATE MOUNT_FLAG_PRIVATE +static unsigned int propagation; + static char errstr[MAX_ERR_LEN]; static unsigned int verbose; @@ -106,7 +109,7 @@ static int master_fprintf(FILE *, char *, ...); %token MAP %token OPT_TIMEOUT OPT_NTIMEOUT OPT_NOBIND OPT_NOGHOST OPT_GHOST OPT_VERBOSE %token OPT_DEBUG OPT_RANDOM OPT_USE_WEIGHT OPT_SYMLINK OPT_MODE -%token OPT_STRICTEXPIRE OPT_SLAVE OPT_PRIVATE +%token OPT_STRICTEXPIRE OPT_SHARED OPT_SLAVE OPT_PRIVATE %token COLON COMMA NL DDASH %type <strtype> map %type <strtype> options @@ -208,6 +211,7 @@ line: | PATH OPT_TIMEOUT { master_notify($1); YYABORT; } | PATH OPT_SYMLINK { master_notify($1); YYABORT; } | PATH OPT_STRICTEXPIRE { master_notify($1); YYABORT; } + | PATH OPT_SHARED { master_notify($1); YYABORT; } | PATH OPT_SLAVE { master_notify($1); YYABORT; } | PATH OPT_PRIVATE { master_notify($1); YYABORT; } | PATH OPT_NOBIND { master_notify($1); YYABORT; } @@ -622,8 +626,9 @@ daemon_option: OPT_TIMEOUT NUMBER { timeout = $2; } | OPT_NTIMEOUT NUMBER { negative_timeout = $2; } | OPT_SYMLINK { symlnk = 1; } | OPT_STRICTEXPIRE { strictexpire = 1; } - | OPT_SLAVE { slave = 1; } - | OPT_PRIVATE { private = 1; } + | OPT_SHARED { propagation = PROPAGATION_SHARED; } + | OPT_SLAVE { propagation = PROPAGATION_SLAVE; } + | OPT_PRIVATE { propagation = PROPAGATION_PRIVATE; } | OPT_NOBIND { nobind = 1; } | OPT_NOGHOST { ghost = 0; } | OPT_GHOST { ghost = 1; } @@ -697,8 +702,7 @@ static void local_init_vars(void) negative_timeout = 0; symlnk = 0; strictexpire = 0; - slave = 0; - private = 0; + propagation = PROPAGATION_SLAVE; nobind = 0; ghost = defaults_get_browse_mode(); random_selection = global_selection_options & MOUNT_FLAG_RANDOM_SELECT; @@ -899,6 +903,8 @@ int master_parse_entry(const char *buffer, unsigned int default_timeout, unsigne return 0; } } + entry->ap->flags = propagation; + if (random_selection) entry->ap->flags |= MOUNT_FLAG_RANDOM_SELECT; if (use_weight) @@ -907,10 +913,6 @@ int master_parse_entry(const char *buffer, unsigned int default_timeout, unsigne entry->ap->flags |= MOUNT_FLAG_SYMLINK; if (strictexpire) entry->ap->flags |= MOUNT_FLAG_STRICTEXPIRE; - if (slave) - entry->ap->flags |= MOUNT_FLAG_SLAVE; - if (private) - entry->ap->flags |= MOUNT_FLAG_PRIVATE; if (negative_timeout) entry->ap->negative_timeout = negative_timeout; if (mode && mode < LONG_MAX) diff --git a/lib/master_tok.l b/lib/master_tok.l index 7486710b..87a6b958 100644 --- a/lib/master_tok.l +++ b/lib/master_tok.l @@ -389,6 +389,7 @@ MODE (--mode{OPTWS}|--mode{OPTWS}={OPTWS}) -?symlink { return(OPT_SYMLINK); } -?nobind { return(OPT_NOBIND); } -?nobrowse { return(OPT_NOGHOST); } + -?shared { return(OPT_SHARED); } -?slave { return(OPT_SLAVE); } -?private { return(OPT_PRIVATE); } -?strictexpire { return(OPT_STRICTEXPIRE); } diff --git a/man/auto.master.5.in b/man/auto.master.5.in index 6e510a59..2a0b672a 100644 --- a/man/auto.master.5.in +++ b/man/auto.master.5.in @@ -208,17 +208,18 @@ applications scanning the mount tree. Note that this doesn't completely resolve the problem of expired automounts being immediately re-mounted due to application accesses triggered by the expire itself. .TP -.I slave \fPor\fI private +.I slave\fP, \fIprivate\fP or \fIshared\fP This option allows mount propagation of bind mounts to be set to -either \fIslave\fP or \fIprivate\fP. This option may be needed when using -multi-mounts that have bind mounts that bind to a file system that is -propagation shared. This is because the bind mount will have the same -properties as its target which causes problems for offset mounts. When -this happens an unwanted offset mount is propagated back to the target -file system resulting in a deadlock when attempting to access the offset. +\fIslave\fP, \fIprivate\fP or \fIshared\fP. This option defaults to +\fIslave\fP if no option is given. When using multi-mounts that have +bind mounts the bind mount will have the same properties as its parent +which is commonly propagation \fIshared\fP. And if the mount target is +also propagation \fIshared\fP this can lead to a deadlock when attempting +to access the offset mounts. When this happens an unwanted offset mount +is propagated back to the target file system resulting in a deadlock +since the automount target is itself an (unwanted) automount trigger. This option is an autofs pseudo mount option that can be used in the -master map only. By default, bind mounts will inherit the mount propagation -of the target file system. +master map only. .TP .I "\-r, \-\-random-multimount-selection" Enables the use of random selection when choosing a host from a diff --git a/modules/mount_bind.c b/modules/mount_bind.c index 9cba0d7a..5253501c 100644 --- a/modules/mount_bind.c +++ b/modules/mount_bind.c @@ -153,6 +153,7 @@ int mount_mount(struct autofs_point *ap, const char *root, const char *name, int if (!symlnk && bind_works) { int status, existed = 1; + int flags; debug(ap->logopt, MODPREFIX "calling mkdir_path %s", fullpath); @@ -190,24 +191,27 @@ int mount_mount(struct autofs_point *ap, const char *root, const char *name, int what, fstype, fullpath); } - if (ap->flags & (MOUNT_FLAG_SLAVE | MOUNT_FLAG_PRIVATE)) { - int flags = MS_SLAVE; - - if (ap->flags & MOUNT_FLAG_PRIVATE) - flags = MS_PRIVATE; - - /* The bind mount has succeeded but if the target - * mount is propagation shared propagation of child - * mounts (autofs offset mounts for example) back to - * the target of the bind mount must be avoided or - * autofs trigger mounts will deadlock. - */ - err = mount(NULL, fullpath, NULL, flags, NULL); - if (err) { - warn(ap->logopt, - "failed to set propagation for %s", - fullpath, root); - } + /* The bind mount has succeeded, now set the mount propagation. + * + * The default is propagation shared, change it if the master + * map entry has a different option specified. + */ + flags = MS_SLAVE; + if (ap->flags & MOUNT_FLAG_PRIVATE) + flags = MS_PRIVATE; + else if (ap->flags & MOUNT_FLAG_SHARED) + flags = MS_SHARED; + + /* Note: If the parent mount is propagation shared propagation + * of child mounts (autofs offset mounts for example) back to + * the target of the bind mount can happen in some cases and + * must be avoided or autofs trigger mounts will deadlock. + */ + err = mount(NULL, fullpath, NULL, flags, NULL); + if (err) { + warn(ap->logopt, + "failed to set propagation for %s", + fullpath, root); } return 0;
On 20:05 30/10, Ian Kent wrote: > On Tue, 2019-10-29 at 11:00 -0500, Goldwyn Rodrigues wrote: > > On 14:39 29/10, Ian Kent wrote: > > > On Tue, 2019-10-29 at 07:57 +0800, Ian Kent wrote: > > > > On Mon, 2019-10-28 at 11:28 -0500, Goldwyn Rodrigues wrote: > > > > > Hi Ian, > > > > > > > > > > On 10:14 02/10, Ian Kent wrote: > > > > > > On Tue, 2019-10-01 at 14:09 -0500, Goldwyn Rodrigues wrote: > > > > > <snip> > > > > > > > > > > > Anyway, it does sound like it's worth putting time into > > > > > > your suggestion of a kernel change. > > > > > > > > > > > > Unfortunately I think it's going to take a while to work > > > > > > out what's actually going on with the propagation and I'm > > > > > > in the middle of some other pressing work right now. > > > > > > > > > > Have you have made any progress on this issue? > > > > > > > > Sorry I haven't. > > > > I still want to try and understand what's going on there. > > > > > > > > > As I mentioned, I am fine with a userspace solution of > > > > > defaulting > > > > > to slave mounts all of the time instead of this kernel patch. > > > > > > > > Oh, I thought you weren't keen on that recommendation. > > > > > > > > That shouldn't take long to do so I should be able to get that > > > > done > > > > and post a patch pretty soon. > > > > > > > > I'll get back to looking at the mount propagation code when I get > > > > a > > > > chance. Unfortunately I haven't been very successful when making > > > > changes to that area of code in the past ... > > > > > > After working on this patch I'm even more inclined to let the > > > kernel > > > do it's propagation thing and set the autofs mounts, either > > > silently > > > by default or explicitly by map entry option. > > > > > > Because it's the propagation setting of the parent mount that > > > controls > > > the propagation of its children there shouldn't be any chance of a > > > race so this should be reliable. > > > > > > Anyway, here is a patch, compile tested only, and without the > > > changelog > > > hunk I normally add to save you possible conflicts. But unless > > > there > > > are any problems found this is what I will eventually commit to the > > > repo. > > > > > > If there are any changes your not aware of I'll let you know. > > > > > > Clearly this depends on the other two related patches for this > > > issue. > > > > This works good for us. Thanks. > > However, I have some review comments for the patch. > > I think this one should do the trick. > > autofs-5.1.6 - make bind mounts propagation slave by default > > From: Ian Kent <raven@themaw.net> > > Make setting mount propagation on bind mounts mandatory with a default > of propagation slave. > > When using multi-mounts that have bind mounts the bind mount will have > the same properties as its parent which is commonly propagation shared. > And if the mount target is also propagation shared this can lead to a > deadlock when attempting to access the offset mounts. When this happens > an unwanted offset mount is propagated back to the target file system > resulting in a deadlock since the automount target is itself an > (unwanted) automount trigger. > > This problem has been present much longer than I originally thought, > perhaps since mount propagation was introduced into the kernel, so > explicitly setting bind mount propagation is the sensible thing to do. > > Signed-off-by: Ian Kent <raven@themaw.net> Tested and works as expected. Reviewed-by: Goldwyn Rodrigues <rgoldwyn@suse.com>
diff --git a/fs/pnode.c b/fs/pnode.c index 49f6d7ff2139..b960805d7954 100644 --- a/fs/pnode.c +++ b/fs/pnode.c @@ -292,6 +292,9 @@ int propagate_mnt(struct mount *dest_mnt, struct mountpoint *dest_mp, struct mount *m, *n; int ret = 0; + if (source_mnt->mnt_mountpoint->d_flags & DCACHE_NEED_AUTOMOUNT) + return 0; + /* * we don't want to bother passing tons of arguments to * propagate_one(); everything is serialized by namespace_sem,
An access to automounted filesystem can deadlock if it is a bind mount on shared mounts. A user program should not deadlock the kernel while automount waits for propagation of the mount. This is explained at https://bugzilla.redhat.com/show_bug.cgi?id=1358887#c10 I am not sure completely blocking automount is the best solution, so please reply with what is the best course of action to do in such a situation. Propagation of dentry with DCACHE_NEED_AUTOMOUNT can lead to propagation of mount points without automount maps and not under automount control. So, do not propagate them. Signed-off-by: Goldwyn Rodrigues <rgoldwyn@suse.com>