[RFC] dma-buf: fix race condition between poll and close

Message ID 20240423191310.19437-1-dmantipov@yandex.ru (mailing list archive)
State New, archived

Commit Message

Dmitry Antipov April 23, 2024, 7:13 p.m. UTC
Syzbot has found the race condition where 'fput()' is in progress
when 'dma_buf_poll()' makes an attempt to hold the 'struct file'
with zero 'f_count'. So use explicit 'atomic_long_inc_not_zero()'
to detect such a case and cancel an undergoing poll activity with
EPOLLERR.

Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646
Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
---
 drivers/dma-buf/dma-buf.c | 23 ++++++++++++++++++-----
 1 file changed, 18 insertions(+), 5 deletions(-)

Comments

Christian König April 24, 2024, 7:09 a.m. UTC | #1
Am 23.04.24 um 21:13 schrieb Dmitry Antipov:
> Syzbot has found the race condition where 'fput()' is in progress
> when 'dma_buf_poll()' makes an attempt to hold the 'struct file'
> with zero 'f_count'. So use explicit 'atomic_long_inc_not_zero()'
> to detect such a case and cancel an undergoing poll activity with
> EPOLLERR.

Well this is really interesting, you are the second person who comes 
up with this nonsense.

To repeat what I already said on the other thread: Calling 
dma_buf_poll() while fput() is in progress is illegal in the first place.

So there is nothing to fix in dma_buf_poll(), but rather to figure out 
who is incorrectly calling fput().

Regards,
Christian.

>
> Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646
> Signed-off-by: Dmitry Antipov <dmantipov@yandex.ru>
> ---
>   drivers/dma-buf/dma-buf.c | 23 ++++++++++++++++++-----
>   1 file changed, 18 insertions(+), 5 deletions(-)
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 8fe5aa67b167..39eb75d23219 100644
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -266,8 +266,17 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>   		spin_unlock_irq(&dmabuf->poll.lock);
>   
>   		if (events & EPOLLOUT) {
> -			/* Paired with fput in dma_buf_poll_cb */
> -			get_file(dmabuf->file);
> +			/*
> +			 * Catch the case when fput() is in progress
> +			 * (e.g. due to close() from another thread).
> +			 * Otherwise the paired fput() will be issued
> +			 * from dma_buf_poll_cb().
> +			 */
> +			if (unlikely(!atomic_long_inc_not_zero(&file->f_count))) {
> +				events = EPOLLERR;
> +				dcb->active = 0;
> +				goto out;
> +			}
>   
>   			if (!dma_buf_poll_add_cb(resv, true, dcb))
>   				/* No callback queued, wake up any other waiters */
> @@ -289,8 +298,12 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>   		spin_unlock_irq(&dmabuf->poll.lock);
>   
>   		if (events & EPOLLIN) {
> -			/* Paired with fput in dma_buf_poll_cb */
> -			get_file(dmabuf->file);
> +			/* See above */
> +			if (unlikely(!atomic_long_inc_not_zero(&file->f_count))) {
> +				events = EPOLLERR;
> +				dcb->active = 0;
> +				goto out;
> +			}
>   
>   			if (!dma_buf_poll_add_cb(resv, false, dcb))
>   				/* No callback queued, wake up any other waiters */
> @@ -299,7 +312,7 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>   				events &= ~EPOLLIN;
>   		}
>   	}
> -
> +out:
>   	dma_resv_unlock(resv);
>   	return events;
>   }
Dmitry Antipov April 24, 2024, 10:19 a.m. UTC | #2
On 4/24/24 10:09, Christian König wrote:

> To repeat what I already said on the other thread: Calling dma_buf_poll() while fput() is in progress is illegal in the first place.
> 
> So there is nothing to fix in dma_buf_poll(), but rather to figure out who is incorrectly calling fput().

Hm. OTOH it's legal if userspace app calls close([fd]) in one thread when another
thread sleeps in (e)poll({..., [fd], ...}) (IIUC this is close to what the syzbot
reproducer actually does). What behavior should be considered as valid in this
(yes, really weird) scenario?

Dmitry
Christian König April 24, 2024, 11:28 a.m. UTC | #3
Am 24.04.24 um 12:19 schrieb Dmitry Antipov:
> On 4/24/24 10:09, Christian König wrote:
>
>> To repeat what I already said on the other thread: Calling 
>> dma_buf_poll() while fput() is in progress is illegal in the first 
>> place.
>>
>> So there is nothing to fix in dma_buf_poll(), but rather to figure 
>> out who is incorrectly calling fput().
>
> Hm. OTOH it's legal if userspace app calls close([fd]) in one thread 
> when another
> thread sleeps in (e)poll({..., [fd], ...}) (IIUC this is close to what 
> the syzbot
> reproducer actually does). What behavior should be considered as valid 
> in this
> (yes, really weird) scenario?

That scenario is actually not weird at all, but just perfectly normal.

As far as I read up on it the EPOLL_FD implementation grabs a reference 
to the underlying file of added file descriptors.

So you can actually close the added file descriptor directly after the 
operation completes, that is perfectly valid behavior.

It's just that somehow the reference which is necessary to call 
dma_buf_poll() is dropped too early.

I don't fully understand how that happens either, it could be that there 
is some bug in the EPOLL_FD code. Maybe it's a race when the EPOLL file 
descriptor is closed or something like that.

Regards,
Christian.

>
> Dmitry
>
Dmitry Antipov May 3, 2024, 7:07 a.m. UTC | #4
On 4/24/24 2:28 PM, Christian König wrote:

> I don't fully understand how that happens either, it could be that there is some bug in the EPOLL_FD code. Maybe it's a race when the EPOLL file descriptor is closed or something like that.

IIUC the race condition looks like the following:

Thread 0                        Thread 1
-> do_epoll_ctl()
    f_count++, now 2
    ...
    ...                          -> vfs_poll(), f_count == 2
    ...                          ...
<- do_epoll_ctl()               ...
    f_count--, now 1             ...
-> filp_close(), f_count == 1   ...
    ...                            -> dma_buf_poll(), f_count == 1
    -> fput()                      ... [*** race window ***]
       f_count--, now 0              -> maybe get_file(), now ???
       -> __fput() (delayed)

E.g. dma_buf_poll() may be entered in thread 1 with f_count == 1
and call get_file() shortly afterwards (or may even skip it if
there is nothing to poll for EPOLLIN or EPOLLOUT). During this time
window, thread 0 may call fput() (on behalf of close() in this example)
and, since it sees f_count == 1, the file is scheduled for delayed_fput().

Dmitry
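
The difference between an unconditional get and the inc-not-zero variant
proposed in the patch can be illustrated in userspace. The following is a
minimal sketch using C11 atomics, not kernel code: struct obj, obj_get(),
obj_get_not_zero() and obj_put() are hypothetical names that only stand in
for struct file, get_file(), atomic_long_inc_not_zero() and fput().

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct file; only the refcount matters here. */
struct obj {
	atomic_long refs;
};

/*
 * Unconditional increment, similar in spirit to get_file(): only safe
 * when the caller already holds a reference of its own.
 */
static void obj_get(struct obj *o)
{
	atomic_fetch_add(&o->refs, 1);
}

/* Take a reference only if the count has not already dropped to zero. */
static bool obj_get_not_zero(struct obj *o)
{
	long old = atomic_load(&o->refs);

	while (old != 0) {
		/* On failure, 'old' is refreshed with the current value. */
		if (atomic_compare_exchange_weak(&o->refs, &old, old + 1))
			return true;
	}
	return false;
}

/* Drop a reference; returns true if this was the last one. */
static bool obj_put(struct obj *o)
{
	long old = atomic_fetch_sub(&o->refs, 1);

	assert(old > 0);
	return old == 1;
}
```

Once the final obj_put() has returned true, obj_get_not_zero() fails instead
of resurrecting the count, whereas obj_get() would move it from 0 to 1 while
teardown is already scheduled. Note that detecting this in the callee does
not make the call legal: the caller is still expected to hold a reference.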
Christian König May 3, 2024, 8:18 a.m. UTC | #5
Am 03.05.24 um 09:07 schrieb Dmitry Antipov:
> On 4/24/24 2:28 PM, Christian König wrote:
>
>> I don't fully understand how that happens either, it could be that 
>> there is some bug in the EPOLL_FD code. Maybe it's a race when the 
>> EPOLL file descriptor is closed or something like that.
>
> IIUC the race condition looks like the following:
>
> Thread 0                        Thread 1
> -> do_epoll_ctl()
>    f_count++, now 2
>    ...
>    ...                          -> vfs_poll(), f_count == 2
>    ...                          ...
> <- do_epoll_ctl()               ...
>    f_count--, now 1             ...
> -> filp_close(), f_count == 1   ...
>    ...                            -> dma_buf_poll(), f_count == 1
>    -> fput()                      ... [*** race window ***]
>       f_count--, now 0              -> maybe get_file(), now ???
>       -> __fput() (delayed)
>
> E.g. dma_buf_poll() may be entered in thread 1 with f->count == 1
> and call to get_file() shortly later (and may even skip this if
> there is nothing to EPOLLIN or EPOLLOUT). During this time window,
> thread 0 may call fput() (on behalf of close() in this example)
> and (since it sees f->count == 1) file is scheduled to delayed_fput().

Wow, this indeed looks like a bug in the epoll implementation.

Basically Thread 1 needs to make sure that the file reference can't 
vanish. Otherwise it is illegal to call vfs_poll().

I only skimmed over the epoll implementation and never looked at the 
code before, but offhand it looks like the implementation uses a mutex 
in the eventpoll structure which makes sure that the epitem structures 
don't go away during the vfs_poll() call.

But when I look closer at the do_epoll_ctl() function I can see that the 
file reference acquired isn't handed over to the epitem structure, but 
rather dropped when returning from the function.

That seems to be buggy behavior because it means that vfs_poll() can 
be called with a stale file pointer. That in turn can lead to all kinds 
of use-after-free bugs.

Attached is a compile only tested patch, please verify if it fixes your 
problem.

Regards,
Christian.

>
> Dmitry
Dmitry Antipov May 3, 2024, 11:08 a.m. UTC | #6
On 5/3/24 11:18 AM, Christian König wrote:

> Attached is a compile only tested patch, please verify if it fixes your problem.

LGTM, and this is similar to get_file() in __pollwait() and fput() in
free_poll_entry() used in implementation of poll(). Please resubmit to
linux-fsdevel@ including the following:

Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com
Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646
Tested-by: Dmitry Antipov <dmantipov@yandex.ru>

Thanks,
Dmitry
Fedor Pchelkin May 6, 2024, 6:52 a.m. UTC | #7
On Fri, 03. May 14:08, Dmitry Antipov wrote:
> On 5/3/24 11:18 AM, Christian König wrote:
> 
> > Attached is a compile only tested patch, please verify if it fixes your problem.
> 
> LGTM, and this is similar to get_file() in __pollwait() and fput() in
> free_poll_entry() used in implementation of poll(). Please resubmit to
> linux-fsdevel@ including the following:
> 
> Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com
> Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646
> Tested-by: Dmitry Antipov <dmantipov@yandex.ru>

I guess the problem is addressed by commit 4efaa5acf0a1 ("epoll: be better
about file lifetimes") which was pushed upstream just before v6.9-rc7.

Link: https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/
Christian König May 7, 2024, 9:58 a.m. UTC | #8
Am 06.05.24 um 08:52 schrieb Fedor Pchelkin:
> On Fri, 03. May 14:08, Dmitry Antipov wrote:
>> On 5/3/24 11:18 AM, Christian König wrote:
>>
>>> Attached is a compile only tested patch, please verify if it fixes your problem.
>> LGTM, and this is similar to get_file() in __pollwait() and fput() in
>> free_poll_entry() used in implementation of poll(). Please resubmit to
>> linux-fsdevel@ including the following:
>>
>> Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com
>> Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646
>> Tested-by: Dmitry Antipov <dmantipov@yandex.ru>
> I guess the problem is addressed by commit 4efaa5acf0a1 ("epoll: be better
> about file lifetimes") which was pushed upstream just before v6.9-rc7.
>
> Link: https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/

Yeah, Linus took care of that after convincing Al that this is really a bug.

The key missing information was that we have a mutex which makes sure 
that fput() blocks for epoll to stop the polling.

It also means that you should probably re-consider using epoll together 
with shared DMA-bufs. Background is that when both client and display 
server try to use epoll the kernel will return an error because there 
can only be one user of epoll.

Regards,
Christian.
Daniel Vetter May 7, 2024, 10:40 a.m. UTC | #9
On Tue, May 07, 2024 at 11:58:33AM +0200, Christian König wrote:
> Am 06.05.24 um 08:52 schrieb Fedor Pchelkin:
> > On Fri, 03. May 14:08, Dmitry Antipov wrote:
> > > On 5/3/24 11:18 AM, Christian König wrote:
> > > 
> > > > Attached is a compile only tested patch, please verify if it fixes your problem.
> > > LGTM, and this is similar to get_file() in __pollwait() and fput() in
> > > free_poll_entry() used in implementation of poll(). Please resubmit to
> > > linux-fsdevel@ including the following:
> > > 
> > > Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com
> > > Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646
> > > Tested-by: Dmitry Antipov <dmantipov@yandex.ru>
> > I guess the problem is addressed by commit 4efaa5acf0a1 ("epoll: be better
> > about file lifetimes") which was pushed upstream just before v6.9-rc7.
> > 
> > Link: https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/
> 
> Yeah, Linus took care of that after convincing Al that this is really a bug.
> 
> > The key missing information was that we have a mutex which makes sure that
> fput() blocks for epoll to stop the polling.
> 
> It also means that you should probably re-consider using epoll together with
> shared DMA-bufs. Background is that when both client and display server try
> to use epoll the kernel will return an error because there can only be one
> user of epoll.

I think for dma-buf implicit sync the best is to use the new fence export
ioctl, which has the added benefit that you get a snapshot and so no funny
livelock issues if someone keeps submitting rendering to a shared buffer.

That aside, why can you not use the same file with multiple epoll files in
different processes? Afaik from a quick look, all the tracking structures
are per epoll file, so both client and compositor using it should work?

I haven't tried, so I just might be extremely blind ...
-Sima
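
For reference, the fence export ioctl is presumably
DMA_BUF_IOCTL_EXPORT_SYNC_FILE from <linux/dma-buf.h>. Below is a minimal
userspace sketch of waiting on such a snapshot instead of polling the
dma-buf itself. The helper name dmabuf_wait_snapshot() and its return
convention are assumptions made for illustration, and the #ifdef guard only
exists so the sketch still builds against older uapi headers.

```c
#include <errno.h>
#include <poll.h>
#include <sys/ioctl.h>
#include <unistd.h>
#include <linux/dma-buf.h>

/*
 * Hypothetical helper: export the fences a reader would have to wait for
 * as a sync_file and poll that snapshot.
 * Returns 1 if signalled, 0 on timeout, negative errno on error.
 */
static int dmabuf_wait_snapshot(int dmabuf_fd, int timeout_ms)
{
#ifdef DMA_BUF_IOCTL_EXPORT_SYNC_FILE
	struct dma_buf_export_sync_file arg = {
		.flags = DMA_BUF_SYNC_READ,
		.fd = -1,
	};
	struct pollfd pfd;
	int ret;

	if (ioctl(dmabuf_fd, DMA_BUF_IOCTL_EXPORT_SYNC_FILE, &arg) < 0)
		return -errno;

	/* The sync_file becomes readable once the exported fences signal. */
	pfd.fd = arg.fd;
	pfd.events = POLLIN;
	pfd.revents = 0;

	ret = poll(&pfd, 1, timeout_ms);
	if (ret < 0)
		ret = -errno;

	/* Work submitted after the export is not part of this snapshot. */
	close(arg.fd);
	return ret;
#else
	(void)dmabuf_fd;
	(void)timeout_ms;
	return -ENOSYS;
#endif
}
```

Because the wait happens on a separate sync_file, later rendering submitted
to the shared buffer cannot extend it, which is the snapshot property
mentioned above.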
Christian König May 7, 2024, 3:02 p.m. UTC | #10
Am 07.05.24 um 12:40 schrieb Daniel Vetter:
> On Tue, May 07, 2024 at 11:58:33AM +0200, Christian König wrote:
>> Am 06.05.24 um 08:52 schrieb Fedor Pchelkin:
>>> On Fri, 03. May 14:08, Dmitry Antipov wrote:
>>>> On 5/3/24 11:18 AM, Christian König wrote:
>>>>
>>>>> Attached is a compile only tested patch, please verify if it fixes your problem.
>>>> LGTM, and this is similar to get_file() in __pollwait() and fput() in
>>>> free_poll_entry() used in implementation of poll(). Please resubmit to
>>>> linux-fsdevel@ including the following:
>>>>
>>>> Reported-by: syzbot+5d4cb6b4409edfd18646@syzkaller.appspotmail.com
>>>> Closes: https://syzkaller.appspot.com/bug?extid=5d4cb6b4409edfd18646
>>>> Tested-by: Dmitry Antipov <dmantipov@yandex.ru>
>>> I guess the problem is addressed by commit 4efaa5acf0a1 ("epoll: be better
>>> about file lifetimes") which was pushed upstream just before v6.9-rc7.
>>>
>>> Link: https://lore.kernel.org/lkml/0000000000002d631f0615918f1e@google.com/
>> Yeah, Linus took care of that after convincing Al that this is really a bug.
>>
>> The key missing information was that we have a mutex which makes sure that
>> fput() blocks for epoll to stop the polling.
>>
>> It also means that you should probably re-consider using epoll together with
>> shared DMA-bufs. Background is that when both client and display server try
>> to use epoll the kernel will return an error because there can only be one
>> user of epoll.
> I think for dma-buf implicit sync the best is to use the new fence export
> ioctl, which has the added benefit that you get a snapshot and so no funny
> livelock issues if someone keeps submitting rendering to a shared buffer.

+1

>
> That aside, why can you not use the same file with multiple epoll files in
> different processes? Afaik from a quick look, all the tracking structures
> are per epoll file, so both client and compositor using it should work?
>
> I haven't tried, so I just might be extremely blind ...

I've misunderstood one of the comments in the discussion.

You can't add the same file with the same file descriptor number to the 
same epoll file descriptor, but you can add it to different epoll file 
descriptors.

So using epoll in both client and server should actually work.

Sorry for the noise,
Christian.

> -Sima
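
The behaviour described in this last reply can be checked from userspace.
The following is a minimal sketch that uses an eventfd as a stand-in for a
dma-buf file descriptor (an assumption made so that the example needs no
GPU driver); epoll_add_in() and demo_shared_fd() are hypothetical helper
names.

```c
#include <errno.h>
#include <sys/epoll.h>
#include <sys/eventfd.h>
#include <unistd.h>

/* Returns 0 on success or a negative errno value from epoll_ctl(). */
static int epoll_add_in(int epfd, int fd)
{
	struct epoll_event ev = { .events = EPOLLIN, .data.fd = fd };

	return epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0 ? 0 : -errno;
}

/*
 * Register one fd with two epoll instances, then try to register it a
 * second time with the first instance.
 * out[0]: add to A, out[1]: add to B, out[2]: second add to A.
 * Returns 0 if the descriptors could be created, -1 otherwise.
 */
static int demo_shared_fd(int out[3])
{
	int fd = eventfd(0, EFD_CLOEXEC);
	int ep_a = epoll_create1(EPOLL_CLOEXEC);
	int ep_b = epoll_create1(EPOLL_CLOEXEC);
	int ok = fd >= 0 && ep_a >= 0 && ep_b >= 0;

	if (ok) {
		out[0] = epoll_add_in(ep_a, fd);
		out[1] = epoll_add_in(ep_b, fd);
		out[2] = epoll_add_in(ep_a, fd);
	}

	if (ep_b >= 0)
		close(ep_b);
	if (ep_a >= 0)
		close(ep_a);
	if (fd >= 0)
		close(fd);
	return ok ? 0 : -1;
}
```

The first two registrations are expected to succeed, while the repeated
registration with the same epoll instance is expected to fail with EEXIST,
as documented for EPOLL_CTL_ADD.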

Patch

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..39eb75d23219 100644
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -266,8 +266,17 @@  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & EPOLLOUT) {
-			/* Paired with fput in dma_buf_poll_cb */
-			get_file(dmabuf->file);
+			/*
+			 * Catch the case when fput() is in progress
+			 * (e.g. due to close() from another thread).
+			 * Otherwise the paired fput() will be issued
+			 * from dma_buf_poll_cb().
+			 */
+			if (unlikely(!atomic_long_inc_not_zero(&file->f_count))) {
+				events = EPOLLERR;
+				dcb->active = 0;
+				goto out;
+			}
 
 			if (!dma_buf_poll_add_cb(resv, true, dcb))
 				/* No callback queued, wake up any other waiters */
@@ -289,8 +298,12 @@  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & EPOLLIN) {
-			/* Paired with fput in dma_buf_poll_cb */
-			get_file(dmabuf->file);
+			/* See above */
+			if (unlikely(!atomic_long_inc_not_zero(&file->f_count))) {
+				events = EPOLLERR;
+				dcb->active = 0;
+				goto out;
+			}
 
 			if (!dma_buf_poll_add_cb(resv, false, dcb))
 				/* No callback queued, wake up any other waiters */
@@ -299,7 +312,7 @@  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 				events &= ~EPOLLIN;
 		}
 	}
-
+out:
 	dma_resv_unlock(resv);
 	return events;
 }