Message ID | 20240423191310.19437-1-dmantipov@yandex.ru (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] dma-buf: fix race condition between poll and close | expand |
Am 23.04.24 um 21:13 schrieb Dmitry Antipov: > Syzbot has found the race condition where 'fput()' is in progress > when 'dma_buf_poll()' makes an attempt to hold the 'struct file' > with zero 'f_count'. So use explicit 'atomic_long_inc_not_zero()' > to detect such a case and cancel an undergoing poll activity with > EPOLLERR. Well this is really interesting, you are the second person which comes up with this nonsense. To repeat what I already said on the other thread: Calling dma_buf_poll() while fput() is in progress is illegal in the first place. So there is nothing to fix in dma_buf_poll(), but rather to figure out who is incorrectly calling fput(). Regards, Christian. > > Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646 > Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> > --- > drivers/dma-buf/dma-buf.c | 23 ++++++++++++++++++----- > 1 file changed, 18 insertions(+), 5 deletions(-) > > diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c > index 8fe5aa67b167..39eb75d23219 100644 > --- a/drivers/dma-buf/dma-buf.c > +++ b/drivers/dma-buf/dma-buf.c > @@ -266,8 +266,17 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) > spin_unlock_irq(&dmabuf->poll.lock); > > if (events & EPOLLOUT) { > - /* Paired with fput in dma_buf_poll_cb */ > - get_file(dmabuf->file); > + /* > + * Catch the case when fput() is in progress > + * (e.g. due to close() from another thread). > + * Otherwise the paired fput() will be issued > + * from dma_buf_poll_cb(). > + */ > + if (unlikely(!atomic_long_inc_not_zero(&file->f_count))) { > + events = EPOLLERR; > + dcb->active = 0; > + goto out; > + } > > if (!dma_buf_poll_add_cb(resv, true, dcb)) > /* No callback queued, wake up any other waiters */ > @@ -289,8 +298,12 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) > spin_unlock_irq(&dmabuf->poll.lock); > > if (events & EPOLLIN) { > - /* Paired with fput in dma_buf_poll_cb */ > - get_file(dmabuf->file); > + /* See above */ > + if (unlikely(!atomic_long_inc_not_zero(&file->f_count))) { > + events = EPOLLERR; > + dcb->active = 0; > + goto out; > + } > > if (!dma_buf_poll_add_cb(resv, false, dcb)) > /* No callback queued, wake up any other waiters */ > @@ -299,7 +312,7 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) > events &= ~EPOLLIN; > } > } > - > +out: > dma_resv_unlock(resv); > return events; > }
On 4/24/24 10:09, Christian König wrote: > To repeat what I already said on the other thread: Calling dma_buf_poll() while fput() is in progress is illegal in the first place. > > So there is nothing to fix in dma_buf_poll(), but rather to figure out who is incorrectly calling fput(). Hm. OTOH it's legal if userspace app calls close([fd]) in one thread when another thread sleeps in (e)poll({..., [fd], ...}) (IIUC this is close to what the syzbot reproducer actually does). What behavior should be considered as valid in this (yes, really weird) scenario? Dmitry
Am 24.04.24 um 12:19 schrieb Dmitry Antipov: > On 4/24/24 10:09, Christian König wrote: > >> To repeat what I already said on the other thread: Calling >> dma_buf_poll() while fput() is in progress is illegal in the first >> place. >> >> So there is nothing to fix in dma_buf_poll(), but rather to figure >> out who is incorrectly calling fput(). > > Hm. OTOH it's legal if userspace app calls close([fd]) in one thread > when another > thread sleeps in (e)poll({..., [fd], ...}) (IIUC this is close to what > the syzbot > reproducer actually does). What behavior should be considered as valid > in this > (yes, really weird) scenario? That scenario is actually not weird at all, but just perfectly normal. As far as I read up on it the EPOLL_FD implementation grabs a reference to the underlying file of added file descriptors. So you can actually close the added file descriptor directly after the operation completes, that is perfectly valid behavior. It's just that somehow the reference which is necessary to call dma_buf_poll() is dropped to early. I don't fully understand how that happens either, it could be that there is some bug in the EPOLL_FD code. Maybe it's a race when the EPOLL file descriptor is closed or something like that. Regards, Christian. > > Dmitry >
On 4/24/24 2:28 PM, Christian König wrote:
> I don't fully understand how that happens either, it could be that there is some bug in the EPOLL_FD code. Maybe it's a race when the EPOLL file descriptor is closed or something like that.
IIUC the race condition looks like the following:
Thread 0 Thread 1
-> do_epoll_ctl()
f_count++, now 2
...
... -> vfs_poll(), f_count == 2
... ...
<- do_epoll_ctl() ...
f_count--, now 1 ...
-> filp_close(), f_count == 1 ...
... -> dma_buf_poll(), f_count == 1
-> fput() ... [*** race window ***]
f_count--, now 0 -> maybe get_file(), now ???
-> __fput() (delayed)
E.g. dma_buf_poll() may be entered in thread 1 with f->count == 1
and call to get_file() shortly later (and may even skip this if
there is nothing to EPOLLIN or EPOLLOUT). During this time window,
thread 0 may call fput() (on behalf of close() in this example)
and (since it sees f->count == 1) file is scheduled to delayed_fput().
Dmitry
Am 03.05.24 um 09:07 schrieb Dmitry Antipov: > On 4/24/24 2:28 PM, Christian König wrote: > >> I don't fully understand how that happens either, it could be that >> there is some bug in the EPOLL_FD code. Maybe it's a race when the >> EPOLL file descriptor is closed or something like that. > > IIUC the race condition looks like the following: > > Thread 0 Thread 1 > -> do_epoll_ctl() > f_count++, now 2 > ... > ... -> vfs_poll(), f_count == 2 > ... ... > <- do_epoll_ctl() ... > f_count--, now 1 ... > -> filp_close(), f_count == 1 ... > ... -> dma_buf_poll(), f_count == 1 > -> fput() ... [*** race window ***] > f_count--, now 0 -> maybe get_file(), now ??? > -> __fput() (delayed) > > E.g. dma_buf_poll() may be entered in thread 1 with f->count == 1 > and call to get_file() shortly later (and may even skip this if > there is nothing to EPOLLIN or EPOLLOUT). During this time window, > thread 0 may call fput() (on behalf of close() in this example) > and (since it sees f->count == 1) file is scheduled to delayed_fput(). Wow, this is indeed looks like a bug in the epoll implementation. Basically Thread 1 needs to make sure that the file reference can't vanish. Otherwise it is illegal to call vfs_poll(). I only skimmed over the epoll implementation and never looked at the code before, but of hand it looks like the implementation uses a mutex in the eventpoll structure which makes sure that the epitem structures don't go away during the vfs_poll() call. But when I look closer at the do_epoll_ctl() function I can see that the file reference acquired isn't handed over to the epitem structure, but rather dropped when returning from the function. That seems to be a buggy behavior because it means that vfs_poll() can be called with a stale file pointer. That in turn can lead to all kind of use after free bugs. Attached is a compile only tested patch, please verify if it fixes your problem. Regards, Christian. > > Dmitry
On 5/3/24 11:18 AM, Christian König wrote: > Attached is a compile only tested patch, please verify if it fixes your problem. LGTM, and this is similar to get_file() in __pollwait() and fput() in free_poll_entry() used in implementation of poll(). Please resubmit to linux-fsdevel@ including the following: Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646 Tested-by: Dmitry Antipov <dmantipov@yandex.ru> Thanks, Dmitry
On Fri, 03. May 14:08, Dmitry Antipov wrote: > On 5/3/24 11:18 AM, Christian König wrote: > > > Attached is a compile only tested patch, please verify if it fixes your problem. > > LGTM, and this is similar to get_file() in __pollwait() and fput() in > free_poll_entry() used in implementation of poll(). Please resubmit to > linux-fsdevel@ including the following: > > Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com > Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646 > Tested-by: Dmitry Antipov <dmantipov@yandex.ru> I guess the problem is addressed by commit 4efaa5acf0a1 ("epoll: be better about file lifetimes") which was pushed upstream just before v6.9-rc7. Link: https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/
Am 06.05.24 um 08:52 schrieb Fedor Pchelkin: > On Fri, 03. May 14:08, Dmitry Antipov wrote: >> On 5/3/24 11:18 AM, Christian König wrote: >> >>> Attached is a compile only tested patch, please verify if it fixes your problem. >> LGTM, and this is similar to get_file() in __pollwait() and fput() in >> free_poll_entry() used in implementation of poll(). Please resubmit to >> linux-fsdevel@ including the following: >> >> Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com >> Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646 >> Tested-by: Dmitry Antipov <dmantipov@yandex.ru> > I guess the problem is addressed by commit 4efaa5acf0a1 ("epoll: be better > about file lifetimes") which was pushed upstream just before v6.9-rc7. > > Link: https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/ Yeah, Linus took care of that after convincing Al that this is really a bug. They key missing information was that we have a mutex which makes sure that fput() blocks for epoll to stop the polling. It also means that you should probably re-consider using epoll together with shared DMA-bufs. Background is that when both client and display server try to use epoll the kernel will return an error because there can only be one user of epoll. Regards, Christian.
On Tue, May 07, 2024 at 11:58:33AM +0200, Christian König wrote: > Am 06.05.24 um 08:52 schrieb Fedor Pchelkin: > > On Fri, 03. May 14:08, Dmitry Antipov wrote: > > > On 5/3/24 11:18 AM, Christian König wrote: > > > > > > > Attached is a compile only tested patch, please verify if it fixes your problem. > > > LGTM, and this is similar to get_file() in __pollwait() and fput() in > > > free_poll_entry() used in implementation of poll(). Please resubmit to > > > linux-fsdevel@ including the following: > > > > > > Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com > > > Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646 > > > Tested-by: Dmitry Antipov <dmantipov@yandex.ru> > > I guess the problem is addressed by commit 4efaa5acf0a1 ("epoll: be better > > about file lifetimes") which was pushed upstream just before v6.9-rc7. > > > > Link: https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/ > > Yeah, Linus took care of that after convincing Al that this is really a bug. > > They key missing information was that we have a mutex which makes sure that > fput() blocks for epoll to stop the polling. > > It also means that you should probably re-consider using epoll together with > shared DMA-bufs. Background is that when both client and display server try > to use epoll the kernel will return an error because there can only be one > user of epoll. I think for dma-buf implicit sync the best is to use the new fence export ioctl, which has the added benefit that you get a snapshot and so no funny livelock issues if someone keeps submitting rendering to a shared buffer. That aside, why can you not use the same file with multiple epoll files in different processes? Afaik from a quick look, all the tracking structures are per epoll file, so both client and compositor using it should work? I haven't tried, so I just might be extremely blind ... -Sima
Am 07.05.24 um 12:40 schrieb Daniel Vetter: > On Tue, May 07, 2024 at 11:58:33AM +0200, Christian König wrote: >> Am 06.05.24 um 08:52 schrieb Fedor Pchelkin: >>> On Fri, 03. May 14:08, Dmitry Antipov wrote: >>>> On 5/3/24 11:18 AM, Christian König wrote: >>>> >>>>> Attached is a compile only tested patch, please verify if it fixes your problem. >>>> LGTM, and this is similar to get_file() in __pollwait() and fput() in >>>> free_poll_entry() used in implementation of poll(). Please resubmit to >>>> linux-fsdevel@ including the following: >>>> >>>> Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com >>>> Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646 >>>> Tested-by: Dmitry Antipov <dmantipov@yandex.ru> >>> I guess the problem is addressed by commit 4efaa5acf0a1 ("epoll: be better >>> about file lifetimes") which was pushed upstream just before v6.9-rc7. >>> >>> Link: https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/ >> Yeah, Linus took care of that after convincing Al that this is really a bug. >> >> They key missing information was that we have a mutex which makes sure that >> fput() blocks for epoll to stop the polling. >> >> It also means that you should probably re-consider using epoll together with >> shared DMA-bufs. Background is that when both client and display server try >> to use epoll the kernel will return an error because there can only be one >> user of epoll. > I think for dma-buf implicit sync the best is to use the new fence export > ioctl, which has the added benefit that you get a snapshot and so no funny > livelock issues if someone keeps submitting rendering to a shared buffer. +1 > > That aside, why can you not use the same file with multiple epoll files in > different processes? Afaik from a quick look, all the tracking structures > are per epoll file, so both client and compositor using it should work? > > I haven't tried, so I just might be extremely blind ... I've misunderstood one of the comments in the discussion. You can't add the same file with the same file descriptor number to the same epoll file descriptor, but you can add it to different epoll file descriptors. So using epoll in both client and server should actually work. Sorry for the noise, Christian. > -Sima
diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c index 8fe5aa67b167..39eb75d23219 100644 --- a/drivers/dma-buf/dma-buf.c +++ b/drivers/dma-buf/dma-buf.c @@ -266,8 +266,17 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) spin_unlock_irq(&dmabuf->poll.lock); if (events & EPOLLOUT) { - /* Paired with fput in dma_buf_poll_cb */ - get_file(dmabuf->file); + /* + * Catch the case when fput() is in progress + * (e.g. due to close() from another thread). + * Otherwise the paired fput() will be issued + * from dma_buf_poll_cb(). + */ + if (unlikely(!atomic_long_inc_not_zero(&file->f_count))) { + events = EPOLLERR; + dcb->active = 0; + goto out; + } if (!dma_buf_poll_add_cb(resv, true, dcb)) /* No callback queued, wake up any other waiters */ @@ -289,8 +298,12 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) spin_unlock_irq(&dmabuf->poll.lock); if (events & EPOLLIN) { - /* Paired with fput in dma_buf_poll_cb */ - get_file(dmabuf->file); + /* See above */ + if (unlikely(!atomic_long_inc_not_zero(&file->f_count))) { + events = EPOLLERR; + dcb->active = 0; + goto out; + } if (!dma_buf_poll_add_cb(resv, false, dcb)) /* No callback queued, wake up any other waiters */ @@ -299,7 +312,7 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll) events &= ~EPOLLIN; } } - +out: dma_resv_unlock(resv); return events; }
Syzbot has found the race condition where 'fput()' is in progress when 'dma_buf_poll()' makes an attempt to hold the 'struct file' with zero 'f_count'. So use explicit 'atomic_long_inc_not_zero()' to detect such a case and cancel an undergoing poll activity with EPOLLERR. Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646 Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru> --- drivers/dma-buf/dma-buf.c | 23 ++++++++++++++++++----- 1 file changed, 18 insertions(+), 5 deletions(-)