Message ID | 20170305191802.GK29622@ZenIV.linux.org.uk (mailing list archive) |
---|---|
State | New, archived |
On Sun, Mar 5, 2017 at 8:18 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Sun, Mar 05, 2017 at 06:33:18PM +0100, Dmitry Vyukov wrote:
>
>> Added more debug output.
>>
>> name_to_handle_at(r4, &(0x7f0000003000-0x6)="2e2f62757300",
>> &(0x7f0000003000-0xd)={0xc, 0x0, "cd21"}, &(0x7f0000002000)=0x0,
>> 0x1000)
>>
>> actually passes name="" because of the overlapping addresses. Flags
>> contain AT_EMPTY_PATH.
>
> Bloody hell... So you end up with name == (char *)&handle->handle_type + 3?
> Looks like it would be a lot more useful to dump the actual contents of
> those suckers right before the syscall...

We can't do dumping yet: it's the opposite of generation and we don't
have enough info for it. Strace can do it, but note that it does not
necessarily tell you the truth. First, the kernel can overwrite some of
the inputs with copy_to_user before reading them. Second, racing
syscalls that use the same memory for inputs will lead to
non-deterministic inputs, so what you see from strace is not
necessarily what the kernel sees.

> Anyway, that explains WTF is going on. The bug is in path_init() and
> it triggers when you pass something with dentry allocated by d_alloc_pseudo()
> as dfd, combined with empty pathname. You need to have the file closed
> by another thread, and have that another thread get out of closing syscall
> (close(), dup2(), etc.) before the caller of path_init() gets to
> complete_walk(). We need to make sure that this sucker gets DCACHE_RCUPDATE
> while it's still guaranteed to be pinned down. Could you try to reproduce
> with the patch below applied?
>
> diff --git a/fs/namei.c b/fs/namei.c
> index 6f7d96368734..70840281a41c 100644
> --- a/fs/namei.c
> +++ b/fs/namei.c
> @@ -2226,11 +2226,16 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
>  		nd->path = f.file->f_path;
>  		if (flags & LOOKUP_RCU) {
>  			rcu_read_lock();
> -			nd->inode = nd->path.dentry->d_inode;
> -			nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
> +			if (unlikely(!(dentry->d_flags & DCACHE_RCUACCESS))) {
> +				spin_lock(&dentry->d_lock);
> +				dentry->d_flags |= DCACHE_RCUACCESS;
> +				spin_unlock(&dentry->d_lock);
> +			}
> +			nd->inode = dentry->d_inode;
> +			nd->seq = read_seqcount_begin(&dentry->d_seq);
>  		} else {
>  			path_get(&nd->path);
> -			nd->inode = nd->path.dentry->d_inode;
> +			nd->inode = dentry->d_inode;
>  		}
>  		fdput(f);
>  		return s;

This seems to fix the crash. The reproducer has survived an hour, while
usually it crashes within 5 minutes or so.
But we will get back to you with data race reports later. All
unprotected accesses should use READ_ONCE/WRITE_ONCE.

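The overlapping-address observation in this message can be checked in plain userspace C. The sketch below is only an illustration under stated assumptions: `struct fake_handle` is a local stand-in for the leading fields of `struct file_handle` (4-byte `handle_bytes`, 4-byte `handle_type`, then the opaque bytes), `build_overlapping_args()` is a hypothetical helper, and it assumes the name is written before the handle, which is the order that would produce the empty name reported above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for struct file_handle; the layout is an assumption. */
struct fake_handle {
	uint32_t handle_bytes;
	int32_t handle_type;
	unsigned char f_handle[2];
};

/* Distances back from the address 0x7f0000003000 in the quoted call. */
enum { NAME_BACK = 0x6, HANDLE_BACK = 0xd };

/* The name pointer lands on the last byte of handle_type. */
static_assert((size_t)(HANDLE_BACK - NAME_BACK) ==
	      offsetof(struct fake_handle, handle_type) + 3,
	      "name starts at (char *)&handle->handle_type + 3");

/*
 * 'end' plays the role of 0x7f0000003000 and needs at least HANDLE_BACK
 * writable bytes before it. Writes the name first, then the handle fields
 * (assumed order), and returns the pointer the kernel would get as name.
 */
const char *build_overlapping_args(unsigned char *end)
{
	static const char name[] = "./bus";		/* 2e 2f 62 75 73 00 */
	static const unsigned char id[] = { 0xcd, 0x21 };	/* "cd21" */
	const uint32_t handle_bytes = 0xc;
	const int32_t handle_type = 0x0;
	unsigned char *name_p = end - NAME_BACK;
	unsigned char *handle_p = end - HANDLE_BACK;

	memcpy(name_p, name, sizeof(name));
	memcpy(handle_p + offsetof(struct fake_handle, handle_bytes),
	       &handle_bytes, sizeof(handle_bytes));
	/* This write zeroes the byte that name_p points at. */
	memcpy(handle_p + offsetof(struct fake_handle, handle_type),
	       &handle_type, sizeof(handle_type));
	memcpy(handle_p + offsetof(struct fake_handle, f_handle),
	       id, sizeof(id));
	return (const char *)name_p;
}
```

Calling `build_overlapping_args(page + sizeof(page))` on a scratch buffer returns a pointer whose first byte is the zero top byte of `handle_type`, so the string is empty regardless of endianness, because every byte of a zero `handle_type` is zero.
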
On Sun, Mar 5, 2017 at 8:18 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> [...]
> We need to make sure that this sucker gets DCACHE_RCUPDATE
> while it's still guaranteed to be pinned down. Could you try to reproduce
> with the patch below applied?
>
> diff --git a/fs/namei.c b/fs/namei.c
> [...]

Al, please send this patch officially. I am running with it since then
and have not seen the crashes, nor any other issues that look related.

Thanks!

On Thu, Mar 23, 2017 at 3:17 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Sun, Mar 5, 2017 at 8:18 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>> [...]
>> Could you try to reproduce
>> with the patch below applied?
>>
>> diff --git a/fs/namei.c b/fs/namei.c
>> [...]
>
> Al, please send this patch officially. I am running with it since then
> and have not seen the crashes, nor any other issues that look related.
>
> Thanks!

Al, ping. Please send this patch.

On Fri, Apr 28, 2017 at 8:19 AM, Dmitry Vyukov <dvyukov@google.com> wrote:
> On Thu, Mar 23, 2017 at 3:17 PM, Dmitry Vyukov <dvyukov@google.com> wrote:
>> On Sun, Mar 5, 2017 at 8:18 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>>> [...]
>>> diff --git a/fs/namei.c b/fs/namei.c
>>> [...]
>>
>> Al, please send this patch officially. I am running with it since then
>> and have not seen the crashes, nor any other issues that look related.
>>
>> Thanks!
>
> Al, ping. Please send this patch.

Al, do you want me to mail the patch?
I won't be able to write a super detailed description, but I can do
some format patch.

On Mon, May 29, 2017 at 04:48:17PM +0200, Dmitry Vyukov wrote:

> Al, do you want me to mail the patch?
> I won't be able to write a super detailed description, but I can do
> some format patch.

It's been fixed by commit c0eb027e5aef7; if you are still able to
trigger it on the current mainline, please yell - that would have
to be something different.

On Tue, May 30, 2017 at 8:24 AM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> On Mon, May 29, 2017 at 04:48:17PM +0200, Dmitry Vyukov wrote:
>
>> Al, do you want me to mail the patch?
>> I won't be able to write a super detailed description, but I can do
>> some format patch.
>
> It's been fixed by commit c0eb027e5aef7; if you are still able to
> trigger it on the current mainline, please yell - that would have
> to be something different.

Thanks for the update!
No, I can't say that I still see this. It happened very infrequently,
so I wanted to make sure that we don't lose it regardless of whether I
see it or not.

diff --git a/fs/namei.c b/fs/namei.c
index 6f7d96368734..70840281a41c 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -2226,11 +2226,16 @@ static const char *path_init(struct nameidata *nd, unsigned flags)
 		nd->path = f.file->f_path;
 		if (flags & LOOKUP_RCU) {
 			rcu_read_lock();
-			nd->inode = nd->path.dentry->d_inode;
-			nd->seq = read_seqcount_begin(&nd->path.dentry->d_seq);
+			if (unlikely(!(dentry->d_flags & DCACHE_RCUACCESS))) {
+				spin_lock(&dentry->d_lock);
+				dentry->d_flags |= DCACHE_RCUACCESS;
+				spin_unlock(&dentry->d_lock);
+			}
+			nd->inode = dentry->d_inode;
+			nd->seq = read_seqcount_begin(&dentry->d_seq);
 		} else {
 			path_get(&nd->path);
-			nd->inode = nd->path.dentry->d_inode;
+			nd->inode = dentry->d_inode;
 		}
 		fdput(f);
 		return s;
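
The hunk tests a flag without the lock and, only if it is missing, sets it under `d_lock` while the dentry is still pinned. A rough userspace sketch of that pattern follows. Everything in it is hypothetical and for illustration only: `struct fake_dentry`, `FAKE_RCUACCESS` (not the kernel's real flag value), the `atomic_flag` spinlock standing in for `d_lock`, and the `READ_ONCE_U32` macro, which mimics the volatile read the kernel's `READ_ONCE()` performs. The patch itself reads `d_flags` plainly; the volatile read here is just one way the remark about unprotected accesses could be applied.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bit; not the kernel's DCACHE_RCUACCESS value. */
#define FAKE_RCUACCESS 0x80u

/* Userspace imitation of a READ_ONCE()-style volatile load. */
#define READ_ONCE_U32(x) (*(const volatile uint32_t *)&(x))

struct fake_dentry {
	atomic_flag lock;	/* stands in for d_lock */
	uint32_t flags;		/* stands in for d_flags */
};

void fake_dentry_init(struct fake_dentry *d, uint32_t flags)
{
	atomic_flag_clear(&d->lock);
	d->flags = flags;
}

void fake_lock(struct fake_dentry *d)
{
	while (atomic_flag_test_and_set_explicit(&d->lock,
						 memory_order_acquire))
		;	/* spin until the holder clears the flag */
}

void fake_unlock(struct fake_dentry *d)
{
	atomic_flag_clear_explicit(&d->lock, memory_order_release);
}

/*
 * Returns 1 if the bit had to be set under the lock, 0 if the unlocked
 * check found it already set. Portable multi-threaded userspace code
 * would make 'flags' _Atomic; the volatile read only mirrors the kernel
 * idiom.
 */
int mark_rcu_access(struct fake_dentry *d)
{
	assert(d != NULL);
	if (READ_ONCE_U32(d->flags) & FAKE_RCUACCESS)
		return 0;
	fake_lock(d);
	d->flags |= FAKE_RCUACCESS;	/* other bits are preserved */
	fake_unlock(d);
	return 1;
}
```

The first call on an object without the bit takes the lock and sets it; later calls return on the unlocked check, which is why the patch wraps the test in `unlikely()`.
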