diff mbox series

dmabuf: fix dmabuf file poll uaf issue

Message ID 20240327022903.776-1-justinjiang@vivo.com (mailing list archive)
State New, archived
Headers show
Series dmabuf: fix dmabuf file poll uaf issue | expand

Commit Message

zhiguojiang March 27, 2024, 2:29 a.m. UTC
The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
that the dmabuf file fd is added to the epoll event listener list, and
when it is released, it is not removed from the epoll list, which leads
to the UAF(Use-After-Free) issue. 

The UAF issue can be solved by checking dmabuf file->f_count value and
skipping the poll operation for the closed dmabuf file in the
dma_buf_poll(). We have tested this solved patch multiple times and
have not reproduced the uaf issue.

crash dump:
list_del corruption, ffffff8a6f143a90->next is LIST_POISON1
(dead000000000100)
------------[ cut here ]------------
kernel BUG at lib/list_debug.c:55!
Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
pc : __list_del_entry_valid+0x98/0xd4
lr : __list_del_entry_valid+0x98/0xd4
sp : ffffffc01d413d00
x29: ffffffc01d413d00 x28: 00000000000000c0 x27: 0000000000000020
x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000080007
x23: ffffff8b22e5dcc0 x22: ffffff88a6be12d0 x21: ffffff8b22e572b0
x20: ffffff80254ed0a0 x19: ffffff8a6f143a00 x18: ffffffda5efed3c0
x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0000000000000018
x11: ffffff8b6c188000 x10: 00000000ffffffff x9 : 6c8413a194897b00
x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
x5 : ffffff8b6c3b2a3e x4 : ffffff8b6c3b2a40 x3 : ffff103000001005
x2 : 0000000000000001 x1 : 00000000000000c0 x0 : 000000000000004e
Call trace:
 __list_del_entry_valid+0x98/0xd4
 dma_buf_file_release+0x48/0x90
 __fput+0xf4/0x280
 ____fput+0x10/0x20
 task_work_run+0xcc/0xf4
 do_notify_resume+0x2a0/0x33c
 el0_svc+0x5c/0xa4
 el0t_64_sync_handler+0x68/0xb4
 el0t_64_sync+0x1a0/0x1a4
Code: d0006fe0 912c5000 f2fbd5a2 94231a01 (d4210000) 
---[ end trace 0000000000000000 ]---
Kernel panic - not syncing: Oops - BUG: Fatal exception
SMP: stopping secondary CPUs

Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
---
 drivers/dma-buf/dma-buf.c | 28 ++++++++++++++++++++++++----
 1 file changed, 24 insertions(+), 4 deletions(-)
 mode change 100644 => 100755 drivers/dma-buf/dma-buf.c

Comments

T.J. Mercier March 29, 2024, 11:36 p.m. UTC | #1
On Tue, Mar 26, 2024 at 7:29 PM Zhiguo Jiang <justinjiang@vivo.com> wrote:
>
> The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
> that the dmabuf file fd is added to the epoll event listener list, and
> when it is released, it is not removed from the epoll list, which leads
> to the UAF(Use-After-Free) issue.
>
> The UAF issue can be solved by checking dmabuf file->f_count value and
> skipping the poll operation for the closed dmabuf file in the
> dma_buf_poll(). We have tested this solved patch multiple times and
> have not reproduced the uaf issue.
>

Hi Zhiguo,

What is the most recent kernel version you've seen the bug on?

You are closing the dmabuf fd from another thread while it is still
part of the epoll interest list?

Thanks,
T.J.
zhiguojiang April 1, 2024, 6:52 a.m. UTC | #2
Hi T.J.,

> What is the most recent kernel version you've seen the bug on?
The latest kernel version of the issue we discovered is kernel-6.1.25, and
kernel-5.15 also reported this issue.
> You are closing the dmabuf fd from another thread while it is still
> part of the epoll interest list?
Yes, we found that threads related to android.display modules performed
an operation to close this dmabuf fd while it was still part of the 
epoll interest
list,  but the specific threads were random. So I think this is also 
issue with
the logic of kernel dmabuf code.

Thanks,
Zhiguo


在 2024/3/30 7:36, T.J. Mercier 写道:
> [你通常不会收到来自 tjmercier@google.com 的电子邮件。请访问 https://aka.ms/LearnAboutSenderIdentification,以了解这一点为什么很重要]
>
> On Tue, Mar 26, 2024 at 7:29 PM Zhiguo Jiang <justinjiang@vivo.com> wrote:
>> The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
>> that the dmabuf file fd is added to the epoll event listener list, and
>> when it is released, it is not removed from the epoll list, which leads
>> to the UAF(Use-After-Free) issue.
>>
>> The UAF issue can be solved by checking dmabuf file->f_count value and
>> skipping the poll operation for the closed dmabuf file in the
>> dma_buf_poll(). We have tested this solved patch multiple times and
>> have not reproduced the uaf issue.
>>
> Hi Zhiguo,
>
> What is the most recent kernel version you've seen the bug on?
>
> You are closing the dmabuf fd from another thread while it is still
> part of the epoll interest list?
>
> Thanks,
> T.J.
Christian König April 1, 2024, 12:22 p.m. UTC | #3
Am 27.03.24 um 03:29 schrieb Zhiguo Jiang:
> The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
> that the dmabuf file fd is added to the epoll event listener list, and
> when it is released, it is not removed from the epoll list, which leads
> to the UAF(Use-After-Free) issue.

As far as I can see that's not because of the DMA-buf code, but because 
you are somehow using this interface incorrectly.

When dma_buf_poll() is called it is mandatory for the caller to hold a 
reference to the file descriptor on which the poll operation is executed.

So adding code like "if (!file_count(file))" in the beginning of 
dma_buf_poll() is certainly broken.

My best guess is that you have some unbalanced 
dma_buf_get()/dma_buf_put() somewhere instead.

Regards,
Christian.

>
> The UAF issue can be solved by checking dmabuf file->f_count value and
> skipping the poll operation for the closed dmabuf file in the
> dma_buf_poll(). We have tested this solved patch multiple times and
> have not reproduced the uaf issue.
>
> crash dump:
> list_del corruption, ffffff8a6f143a90->next is LIST_POISON1
> (dead000000000100)
> ------------[ cut here ]------------
> kernel BUG at lib/list_debug.c:55!
> Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
> pc : __list_del_entry_valid+0x98/0xd4
> lr : __list_del_entry_valid+0x98/0xd4
> sp : ffffffc01d413d00
> x29: ffffffc01d413d00 x28: 00000000000000c0 x27: 0000000000000020
> x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000080007
> x23: ffffff8b22e5dcc0 x22: ffffff88a6be12d0 x21: ffffff8b22e572b0
> x20: ffffff80254ed0a0 x19: ffffff8a6f143a00 x18: ffffffda5efed3c0
> x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
> x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0000000000000018
> x11: ffffff8b6c188000 x10: 00000000ffffffff x9 : 6c8413a194897b00
> x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
> x5 : ffffff8b6c3b2a3e x4 : ffffff8b6c3b2a40 x3 : ffff103000001005
> x2 : 0000000000000001 x1 : 00000000000000c0 x0 : 000000000000004e
> Call trace:
>   __list_del_entry_valid+0x98/0xd4
>   dma_buf_file_release+0x48/0x90
>   __fput+0xf4/0x280
>   ____fput+0x10/0x20
>   task_work_run+0xcc/0xf4
>   do_notify_resume+0x2a0/0x33c
>   el0_svc+0x5c/0xa4
>   el0t_64_sync_handler+0x68/0xb4
>   el0t_64_sync+0x1a0/0x1a4
> Code: d0006fe0 912c5000 f2fbd5a2 94231a01 (d4210000)
> ---[ end trace 0000000000000000 ]---
> Kernel panic - not syncing: Oops - BUG: Fatal exception
> SMP: stopping secondary CPUs
>
> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
> ---
>   drivers/dma-buf/dma-buf.c | 28 ++++++++++++++++++++++++----
>   1 file changed, 24 insertions(+), 4 deletions(-)
>   mode change 100644 => 100755 drivers/dma-buf/dma-buf.c
>
> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
> index 8fe5aa67b167..e469dd9288cc
> --- a/drivers/dma-buf/dma-buf.c
> +++ b/drivers/dma-buf/dma-buf.c
> @@ -240,6 +240,10 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>   	struct dma_resv *resv;
>   	__poll_t events;
>   
> +	/* Check if the file exists */
> +	if (!file_count(file))
> +		return EPOLLERR;
> +
>   	dmabuf = file->private_data;
>   	if (!dmabuf || !dmabuf->resv)
>   		return EPOLLERR;
> @@ -266,8 +270,15 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>   		spin_unlock_irq(&dmabuf->poll.lock);
>   
>   		if (events & EPOLLOUT) {
> -			/* Paired with fput in dma_buf_poll_cb */
> -			get_file(dmabuf->file);
> +			/*
> +			 * Paired with fput in dma_buf_poll_cb,
> +			 * Skip poll for the closed file.
> +			 */
> +			if (!get_file_rcu(&dmabuf->file)) {
> +				events &= ~EPOLLOUT;
> +				dcb->active = 0;
> +				goto clear_out_event;
> +			}
>   
>   			if (!dma_buf_poll_add_cb(resv, true, dcb))
>   				/* No callback queued, wake up any other waiters */
> @@ -277,6 +288,7 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>   		}
>   	}
>   
> +clear_out_event:
>   	if (events & EPOLLIN) {
>   		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_in;
>   
> @@ -289,8 +301,15 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>   		spin_unlock_irq(&dmabuf->poll.lock);
>   
>   		if (events & EPOLLIN) {
> -			/* Paired with fput in dma_buf_poll_cb */
> -			get_file(dmabuf->file);
> +			/*
> +			 * Paired with fput in dma_buf_poll_cb,
> +			 * Skip poll for the closed file.
> +			 */
> +			if (!get_file_rcu(&dmabuf->file)) {
> +				events &= ~EPOLLIN;
> +				dcb->active = 0;
> +				goto clear_in_event;
> +			}
>   
>   			if (!dma_buf_poll_add_cb(resv, false, dcb))
>   				/* No callback queued, wake up any other waiters */
> @@ -300,6 +319,7 @@ static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
>   		}
>   	}
>   
> +clear_in_event:
>   	dma_resv_unlock(resv);
>   	return events;
>   }
zhiguojiang April 2, 2024, 6:49 a.m. UTC | #4
> As far as I can see that's not because of the DMA-buf code, but 
> because you are somehow using this interface incorrectly.
>
> When dma_buf_poll() is called it is mandatory for the caller to hold a 
> reference to the file descriptor on which the poll operation is executed.
>
> So adding code like "if (!file_count(file))" in the beginning of 
> dma_buf_poll() is certainly broken.
>
> My best guess is that you have some unbalanced 
> dma_buf_get()/dma_buf_put() somewhere instead.
>
>
Hi Christian,

The kernel dma_buf_poll() code shound not cause system crashes due to 
the user mode usage logical issues ?

Thanks


在 2024/4/1 20:22, Christian König 写道:
> Am 27.03.24 um 03:29 schrieb Zhiguo Jiang:
>> The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
>> that the dmabuf file fd is added to the epoll event listener list, and
>> when it is released, it is not removed from the epoll list, which leads
>> to the UAF(Use-After-Free) issue.
>
> As far as I can see that's not because of the DMA-buf code, but 
> because you are somehow using this interface incorrectly.
>
> When dma_buf_poll() is called it is mandatory for the caller to hold a 
> reference to the file descriptor on which the poll operation is executed.
>
> So adding code like "if (!file_count(file))" in the beginning of 
> dma_buf_poll() is certainly broken.
>
> My best guess is that you have some unbalanced 
> dma_buf_get()/dma_buf_put() somewhere instead.
>
> Regards,
> Christian.
>
>>
>> The UAF issue can be solved by checking dmabuf file->f_count value and
>> skipping the poll operation for the closed dmabuf file in the
>> dma_buf_poll(). We have tested this solved patch multiple times and
>> have not reproduced the uaf issue.
>>
>> crash dump:
>> list_del corruption, ffffff8a6f143a90->next is LIST_POISON1
>> (dead000000000100)
>> ------------[ cut here ]------------
>> kernel BUG at lib/list_debug.c:55!
>> Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>> pc : __list_del_entry_valid+0x98/0xd4
>> lr : __list_del_entry_valid+0x98/0xd4
>> sp : ffffffc01d413d00
>> x29: ffffffc01d413d00 x28: 00000000000000c0 x27: 0000000000000020
>> x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000080007
>> x23: ffffff8b22e5dcc0 x22: ffffff88a6be12d0 x21: ffffff8b22e572b0
>> x20: ffffff80254ed0a0 x19: ffffff8a6f143a00 x18: ffffffda5efed3c0
>> x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
>> x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0000000000000018
>> x11: ffffff8b6c188000 x10: 00000000ffffffff x9 : 6c8413a194897b00
>> x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
>> x5 : ffffff8b6c3b2a3e x4 : ffffff8b6c3b2a40 x3 : ffff103000001005
>> x2 : 0000000000000001 x1 : 00000000000000c0 x0 : 000000000000004e
>> Call trace:
>>   __list_del_entry_valid+0x98/0xd4
>>   dma_buf_file_release+0x48/0x90
>>   __fput+0xf4/0x280
>>   ____fput+0x10/0x20
>>   task_work_run+0xcc/0xf4
>>   do_notify_resume+0x2a0/0x33c
>>   el0_svc+0x5c/0xa4
>>   el0t_64_sync_handler+0x68/0xb4
>>   el0t_64_sync+0x1a0/0x1a4
>> Code: d0006fe0 912c5000 f2fbd5a2 94231a01 (d4210000)
>> ---[ end trace 0000000000000000 ]---
>> Kernel panic - not syncing: Oops - BUG: Fatal exception
>> SMP: stopping secondary CPUs
>>
>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
>> ---
>>   drivers/dma-buf/dma-buf.c | 28 ++++++++++++++++++++++++----
>>   1 file changed, 24 insertions(+), 4 deletions(-)
>>   mode change 100644 => 100755 drivers/dma-buf/dma-buf.c
>>
>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>> index 8fe5aa67b167..e469dd9288cc
>> --- a/drivers/dma-buf/dma-buf.c
>> +++ b/drivers/dma-buf/dma-buf.c
>> @@ -240,6 +240,10 @@ static __poll_t dma_buf_poll(struct file *file, 
>> poll_table *poll)
>>       struct dma_resv *resv;
>>       __poll_t events;
>>   +    /* Check if the file exists */
>> +    if (!file_count(file))
>> +        return EPOLLERR;
>> +
>>       dmabuf = file->private_data;
>>       if (!dmabuf || !dmabuf->resv)
>>           return EPOLLERR;
>> @@ -266,8 +270,15 @@ static __poll_t dma_buf_poll(struct file *file, 
>> poll_table *poll)
>>           spin_unlock_irq(&dmabuf->poll.lock);
>>             if (events & EPOLLOUT) {
>> -            /* Paired with fput in dma_buf_poll_cb */
>> -            get_file(dmabuf->file);
>> +            /*
>> +             * Paired with fput in dma_buf_poll_cb,
>> +             * Skip poll for the closed file.
>> +             */
>> +            if (!get_file_rcu(&dmabuf->file)) {
>> +                events &= ~EPOLLOUT;
>> +                dcb->active = 0;
>> +                goto clear_out_event;
>> +            }
>>                 if (!dma_buf_poll_add_cb(resv, true, dcb))
>>                   /* No callback queued, wake up any other waiters */
>> @@ -277,6 +288,7 @@ static __poll_t dma_buf_poll(struct file *file, 
>> poll_table *poll)
>>           }
>>       }
>>   +clear_out_event:
>>       if (events & EPOLLIN) {
>>           struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_in;
>>   @@ -289,8 +301,15 @@ static __poll_t dma_buf_poll(struct file 
>> *file, poll_table *poll)
>>           spin_unlock_irq(&dmabuf->poll.lock);
>>             if (events & EPOLLIN) {
>> -            /* Paired with fput in dma_buf_poll_cb */
>> -            get_file(dmabuf->file);
>> +            /*
>> +             * Paired with fput in dma_buf_poll_cb,
>> +             * Skip poll for the closed file.
>> +             */
>> +            if (!get_file_rcu(&dmabuf->file)) {
>> +                events &= ~EPOLLIN;
>> +                dcb->active = 0;
>> +                goto clear_in_event;
>> +            }
>>                 if (!dma_buf_poll_add_cb(resv, false, dcb))
>>                   /* No callback queued, wake up any other waiters */
>> @@ -300,6 +319,7 @@ static __poll_t dma_buf_poll(struct file *file, 
>> poll_table *poll)
>>           }
>>       }
>>   +clear_in_event:
>>       dma_resv_unlock(resv);
>>       return events;
>>   }
>
Christian König April 2, 2024, 8:07 a.m. UTC | #5
Am 02.04.24 um 08:49 schrieb zhiguojiang:
>> As far as I can see that's not because of the DMA-buf code, but 
>> because you are somehow using this interface incorrectly.
>>
>> When dma_buf_poll() is called it is mandatory for the caller to hold 
>> a reference to the file descriptor on which the poll operation is 
>> executed.
>>
>> So adding code like "if (!file_count(file))" in the beginning of 
>> dma_buf_poll() is certainly broken.
>>
>> My best guess is that you have some unbalanced 
>> dma_buf_get()/dma_buf_put() somewhere instead.
>>
>>
> Hi Christian,
>
> The kernel dma_buf_poll() code shound not cause system crashes due to 
> the user mode usage logical issues ?

What user mode logical issues are you talking about? Closing a file 
while polling on it is perfectly valid.

dma_buf_poll() is called by the filesystem layer and it's the filesystem 
layer which should make sure that a file can't be closed while polling 
for an event.

If that doesn't work then you have stumbled over a massive bug in the fs 
layer. And I have some doubts that this is actually the case.

What is more likely is that some driver messes up the reference count 
and because of this you see an UAF.

Anyway as far as I can see the DMA-buf code is correct regarding this.

Regards,
Christian.

>
> Thanks
>
>
> 在 2024/4/1 20:22, Christian König 写道:
>> Am 27.03.24 um 03:29 schrieb Zhiguo Jiang:
>>> The issue is a UAF issue of dmabuf file fd. Throght debugging, we found
>>> that the dmabuf file fd is added to the epoll event listener list, and
>>> when it is released, it is not removed from the epoll list, which leads
>>> to the UAF(Use-After-Free) issue.
>>
>> As far as I can see that's not because of the DMA-buf code, but 
>> because you are somehow using this interface incorrectly.
>>
>> When dma_buf_poll() is called it is mandatory for the caller to hold 
>> a reference to the file descriptor on which the poll operation is 
>> executed.
>>
>> So adding code like "if (!file_count(file))" in the beginning of 
>> dma_buf_poll() is certainly broken.
>>
>> My best guess is that you have some unbalanced 
>> dma_buf_get()/dma_buf_put() somewhere instead.
>>
>> Regards,
>> Christian.
>>
>>>
>>> The UAF issue can be solved by checking dmabuf file->f_count value and
>>> skipping the poll operation for the closed dmabuf file in the
>>> dma_buf_poll(). We have tested this solved patch multiple times and
>>> have not reproduced the uaf issue.
>>>
>>> crash dump:
>>> list_del corruption, ffffff8a6f143a90->next is LIST_POISON1
>>> (dead000000000100)
>>> ------------[ cut here ]------------
>>> kernel BUG at lib/list_debug.c:55!
>>> Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
>>> pc : __list_del_entry_valid+0x98/0xd4
>>> lr : __list_del_entry_valid+0x98/0xd4
>>> sp : ffffffc01d413d00
>>> x29: ffffffc01d413d00 x28: 00000000000000c0 x27: 0000000000000020
>>> x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000080007
>>> x23: ffffff8b22e5dcc0 x22: ffffff88a6be12d0 x21: ffffff8b22e572b0
>>> x20: ffffff80254ed0a0 x19: ffffff8a6f143a00 x18: ffffffda5efed3c0
>>> x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
>>> x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0000000000000018
>>> x11: ffffff8b6c188000 x10: 00000000ffffffff x9 : 6c8413a194897b00
>>> x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
>>> x5 : ffffff8b6c3b2a3e x4 : ffffff8b6c3b2a40 x3 : ffff103000001005
>>> x2 : 0000000000000001 x1 : 00000000000000c0 x0 : 000000000000004e
>>> Call trace:
>>>   __list_del_entry_valid+0x98/0xd4
>>>   dma_buf_file_release+0x48/0x90
>>>   __fput+0xf4/0x280
>>>   ____fput+0x10/0x20
>>>   task_work_run+0xcc/0xf4
>>>   do_notify_resume+0x2a0/0x33c
>>>   el0_svc+0x5c/0xa4
>>>   el0t_64_sync_handler+0x68/0xb4
>>>   el0t_64_sync+0x1a0/0x1a4
>>> Code: d0006fe0 912c5000 f2fbd5a2 94231a01 (d4210000)
>>> ---[ end trace 0000000000000000 ]---
>>> Kernel panic - not syncing: Oops - BUG: Fatal exception
>>> SMP: stopping secondary CPUs
>>>
>>> Signed-off-by: Zhiguo Jiang <justinjiang@vivo.com>
>>> ---
>>>   drivers/dma-buf/dma-buf.c | 28 ++++++++++++++++++++++++----
>>>   1 file changed, 24 insertions(+), 4 deletions(-)
>>>   mode change 100644 => 100755 drivers/dma-buf/dma-buf.c
>>>
>>> diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
>>> index 8fe5aa67b167..e469dd9288cc
>>> --- a/drivers/dma-buf/dma-buf.c
>>> +++ b/drivers/dma-buf/dma-buf.c
>>> @@ -240,6 +240,10 @@ static __poll_t dma_buf_poll(struct file *file, 
>>> poll_table *poll)
>>>       struct dma_resv *resv;
>>>       __poll_t events;
>>>   +    /* Check if the file exists */
>>> +    if (!file_count(file))
>>> +        return EPOLLERR;
>>> +
>>>       dmabuf = file->private_data;
>>>       if (!dmabuf || !dmabuf->resv)
>>>           return EPOLLERR;
>>> @@ -266,8 +270,15 @@ static __poll_t dma_buf_poll(struct file *file, 
>>> poll_table *poll)
>>>           spin_unlock_irq(&dmabuf->poll.lock);
>>>             if (events & EPOLLOUT) {
>>> -            /* Paired with fput in dma_buf_poll_cb */
>>> -            get_file(dmabuf->file);
>>> +            /*
>>> +             * Paired with fput in dma_buf_poll_cb,
>>> +             * Skip poll for the closed file.
>>> +             */
>>> +            if (!get_file_rcu(&dmabuf->file)) {
>>> +                events &= ~EPOLLOUT;
>>> +                dcb->active = 0;
>>> +                goto clear_out_event;
>>> +            }
>>>                 if (!dma_buf_poll_add_cb(resv, true, dcb))
>>>                   /* No callback queued, wake up any other waiters */
>>> @@ -277,6 +288,7 @@ static __poll_t dma_buf_poll(struct file *file, 
>>> poll_table *poll)
>>>           }
>>>       }
>>>   +clear_out_event:
>>>       if (events & EPOLLIN) {
>>>           struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_in;
>>>   @@ -289,8 +301,15 @@ static __poll_t dma_buf_poll(struct file 
>>> *file, poll_table *poll)
>>>           spin_unlock_irq(&dmabuf->poll.lock);
>>>             if (events & EPOLLIN) {
>>> -            /* Paired with fput in dma_buf_poll_cb */
>>> -            get_file(dmabuf->file);
>>> +            /*
>>> +             * Paired with fput in dma_buf_poll_cb,
>>> +             * Skip poll for the closed file.
>>> +             */
>>> +            if (!get_file_rcu(&dmabuf->file)) {
>>> +                events &= ~EPOLLIN;
>>> +                dcb->active = 0;
>>> +                goto clear_in_event;
>>> +            }
>>>                 if (!dma_buf_poll_add_cb(resv, false, dcb))
>>>                   /* No callback queued, wake up any other waiters */
>>> @@ -300,6 +319,7 @@ static __poll_t dma_buf_poll(struct file *file, 
>>> poll_table *poll)
>>>           }
>>>       }
>>>   +clear_in_event:
>>>       dma_resv_unlock(resv);
>>>       return events;
>>>   }
>>
>
T.J. Mercier April 2, 2024, 6:22 p.m. UTC | #6
On Tue, Apr 2, 2024 at 1:08 AM Christian König <christian.koenig@amd.com> wrote:
>
> Am 02.04.24 um 08:49 schrieb zhiguojiang:
> >> As far as I can see that's not because of the DMA-buf code, but
> >> because you are somehow using this interface incorrectly.
> >>
> >> When dma_buf_poll() is called it is mandatory for the caller to hold
> >> a reference to the file descriptor on which the poll operation is
> >> executed.
> >>
> >> So adding code like "if (!file_count(file))" in the beginning of
> >> dma_buf_poll() is certainly broken.
> >>
> >> My best guess is that you have some unbalanced
> >> dma_buf_get()/dma_buf_put() somewhere instead.
> >>
> >>
> > Hi Christian,
> >
> > The kernel dma_buf_poll() code shound not cause system crashes due to
> > the user mode usage logical issues ?
>
> What user mode logical issues are you talking about? Closing a file
> while polling on it is perfectly valid.
>
> dma_buf_poll() is called by the filesystem layer and it's the filesystem
> layer which should make sure that a file can't be closed while polling
> for an event.
>
> If that doesn't work then you have stumbled over a massive bug in the fs
> layer. And I have some doubts that this is actually the case.
>
> What is more likely is that some driver messes up the reference count
> and because of this you see an UAF.
>
> Anyway as far as I can see the DMA-buf code is correct regarding this.
>
> Regards,
> Christian.

I tried to reproduce this problem but I couldn't. What I see is a ref
get taken when poll is first called. So subsequently closing the fd in
userspace while it's being polled doesn't take the refcount all the
way to 0. That happens when dma_buf_poll_cb fires, either due to an
event or when the fd is closed upon timeout.

I don't really see how this could be triggered from userspace so I am
also suspicious of dma_buf_get/put.
zhiguojiang April 12, 2024, 6:19 a.m. UTC | #7
在 2024/4/3 2:22, T.J. Mercier 写道:
> [你通常不会收到来自 tjmercier@google.com 的电子邮件。请访问 https://aka.ms/LearnAboutSenderIdentification,以了解这一点为什么很重要]
>
> On Tue, Apr 2, 2024 at 1:08 AM Christian König <christian.koenig@amd.com> wrote:
>> Am 02.04.24 um 08:49 schrieb zhiguojiang:
>>>> As far as I can see that's not because of the DMA-buf code, but
>>>> because you are somehow using this interface incorrectly.
>>>>
>>>> When dma_buf_poll() is called it is mandatory for the caller to hold
>>>> a reference to the file descriptor on which the poll operation is
>>>> executed.
>>>>
>>>> So adding code like "if (!file_count(file))" in the beginning of
>>>> dma_buf_poll() is certainly broken.
>>>>
>>>> My best guess is that you have some unbalanced
>>>> dma_buf_get()/dma_buf_put() somewhere instead.
>>>>
>>>>
>>> Hi Christian,
>>>
>>> The kernel dma_buf_poll() code shound not cause system crashes due to
>>> the user mode usage logical issues ?
>> What user mode logical issues are you talking about? Closing a file
>> while polling on it is perfectly valid.
>>
>> dma_buf_poll() is called by the filesystem layer and it's the filesystem
>> layer which should make sure that a file can't be closed while polling
>> for an event.
>>
>> If that doesn't work then you have stumbled over a massive bug in the fs
>> layer. And I have some doubts that this is actually the case.
>>
>> What is more likely is that some driver messes up the reference count
>> and because of this you see an UAF.
>>
>> Anyway as far as I can see the DMA-buf code is correct regarding this.
>>
>> Regards,
>> Christian.
> I tried to reproduce this problem but I couldn't. What I see is a ref
> get taken when poll is first called. So subsequently closing the fd in
> userspace while it's being polled doesn't take the refcount all the
> way to 0. That happens when dma_buf_poll_cb fires, either due to an
> event or when the fd is closed upon timeout.
>
> I don't really see how this could be triggered from userspace so I am
> also suspicious of dma_buf_get/put.
Hi,

Panic signature:

 > list_del corruption, ffffff8a6f143a90->next is LIST_POISON1
 > (dead000000000100)
 > ------------[ cut here ]------------
 > kernel BUG at lib/list_debug.c:55!
 > Internal error: Oops - BUG: 00000000f2000800 [#1] PREEMPT SMP
 > pc : __list_del_entry_valid+0x98/0xd4
 > lr : __list_del_entry_valid+0x98/0xd4
 > sp : ffffffc01d413d00
 > x29: ffffffc01d413d00 x28: 00000000000000c0 x27: 0000000000000020
 > x26: 0000000000000000 x25: 0000000000000000 x24: 0000000000080007
 > x23: ffffff8b22e5dcc0 x22: ffffff88a6be12d0 x21: ffffff8b22e572b0
 > x20: ffffff80254ed0a0 x19: ffffff8a6f143a00 x18: ffffffda5efed3c0
 > x17: 6165642820314e4f x16: 53494f505f545349 x15: 4c20736920747865
 > x14: 6e3e2d3039613334 x13: 2930303130303030 x12: 0000000000000018
 > x11: ffffff8b6c188000 x10: 00000000ffffffff x9 : 6c8413a194897b00
 > x8 : 6c8413a194897b00 x7 : 74707572726f6320 x6 : 6c65645f7473696c
 > x5 : ffffff8b6c3b2a3e x4 : ffffff8b6c3b2a40 x3 : ffff103000001005
 > x2 : 0000000000000001 x1 : 00000000000000c0 x0 : 000000000000004e
 > Call trace:
 >  __list_del_entry_valid+0x98/0xd4
 >  dma_buf_file_release+0x48/0x90
 >  __fput+0xf4/0x280
 >  ____fput+0x10/0x20
 >  task_work_run+0xcc/0xf4
 >  do_notify_resume+0x2a0/0x33c
 >  el0_svc+0x5c/0xa4
 >  el0t_64_sync_handler+0x68/0xb4
 >  el0t_64_sync+0x1a0/0x1a4

It is caused that the same dma buf file released twice in short time, so 
it should be race issue of dma buf release.

Below ftrace log shows the procedure how dma buf release twice:

Line 715473:        android.display-2220  [006] 22160.660738: bprint:    
__fget_files __fget: [file: 0xffffff8ab9e57b80, dmabuf: 
0xffffff8b1baa6a00, count: 0x3] call:(('do_epoll_ctl', 
356)ffffffd4ad46411c<-('__arm64_sys_epoll_ctl', 
112)ffffffd4ad463dd8<-('invoke_syscall', 
96)ffffffd4acebe00c<-('el0_svc_common', 
140)ffffffd4acebdf40<-('do_el0_svc', 40)ffffffd4acebde44<-('el0_svc', 
36)ffffffd4ae57ffcc<-('el0t_64_sync_handler', 136)ffffffd4ae57ff44)

Line 715475:        android.display-2220  [006] 22160.660739: bprint:    
get_file get_file[file: 0xffffff8ab9e57b80, dmabuf: 0xffffff8b1baa6a00, 
f_count: 0x4] call:(('dma_buf_poll', 760)ffffffd4adb685c8<-('ep_insert', 
1120)ffffffd4ad464bcc<-('do_epoll_ctl', 
1196)ffffffd4ad464464<-('__arm64_sys_epoll_ctl', 
112)ffffffd4ad463dd8<-('invoke_syscall', 
96)ffffffd4acebe00c<-('el0_svc_common', 
140)ffffffd4acebdf40<-('do_el0_svc', 40)ffffffd4acebde44)

Line 715477:        android.display-2220  [006] 22160.660740: bprint:    
fput_many fput for dma buf file[file: 0xffffff8ab9e57b80, dmabuf: 
0xffffff8b1baa6a00, count: 0x4] call:(('dma_buf_poll', 
1104)ffffffd4adb68720<-('ep_insert', 
1120)ffffffd4ad464bcc<-('do_epoll_ctl', 
1196)ffffffd4ad464464<-('__arm64_sys_epoll_ctl', 
112)ffffffd4ad463dd8<-('invoke_syscall', 
96)ffffffd4acebe00c<-('el0_svc_common', 
140)ffffffd4acebdf40<-('do_el0_svc', 40)ffffffd4acebde44)

Line 715479:        android.display-2220  [006] 22160.660741: bprint:    
fput_many fput for dma buf file[file: 0xffffff8ab9e57b80, dmabuf: 
0xffffff8b1baa6a00, count: 0x3] call:(('do_epoll_ctl', 
652)ffffffd4ad464244<-('__arm64_sys_epoll_ctl', 
112)ffffffd4ad463dd8<-('invoke_syscall', 
96)ffffffd4acebe00c<-('el0_svc_common', 
140)ffffffd4acebdf40<-('do_el0_svc', 40)ffffffd4acebde44<-('el0_svc', 
36)ffffffd4ae57ffcc<-('el0t_64_sync_handler', 136)ffffffd4ae57ff44)

-> Here task 2220 do epoll for dma_buf file twice, and the fget/fput 
match. After this the file refcount is 2.

Line 716521:        RenderThread-3470  [005] 22160.664236: bprint:    
fput_many fput for dma buf file[file: 0xffffff8ab9e57b80, dmabuf: 
0xffffff8b1baa6a00, count: 0x2] call:(('vm_area_free_no_check', 
140)ffffffd4acf5021c<-('__do_munmap', 
1572)ffffffd4ad306eb8<-('__vm_munmap', 
216)ffffffd4ad3095b4<-('__arm64_sys_munmap', 
68)ffffffd4ad3094c4<-('invoke_syscall', 
96)ffffffd4acebe00c<-('el0_svc_common', 
140)ffffffd4acebdf40<-('do_el0_svc', 40)ffffffd4acebde44)

Line 716525:        RenderThread-3470  [005] 22160.664243: bprint:    
fput_many fput for dma buf file[file: 0xffffff8ab9e57b80, dmabuf: 
0xffffff8b1baa6a00, count: 0x1] call:(('close_fd', 
376)ffffffd4ad40b404<-('__arm64_sys_close', 
24)ffffffd4ad3bfb1c<-('invoke_syscall', 
96)ffffffd4acebe00c<-('el0_svc_common', 
140)ffffffd4acebdf40<-('do_el0_svc', 40)ffffffd4acebde44<-('el0_svc', 
36)ffffffd4ae57ffcc<-('el0t_64_sync_handler', 136)ffffffd4ae57ff44)

Line 716527:        RenderThread-3470  [005] 22160.664244: bprint:    
fput_many fput for dma buf file ret: 0, [file: 0xffffff8ab9e57b80, 
dmabuf: 0xffffff8b1baa6a00] start to free

-> Here task3470 do unmap and close(fd) then decrease the file count to 
zero. Then start to free file buf.

Line 716566:        android.display-2220  [006] 22160.664424: bprint:    
get_file get_file[file: 0xffffff8ab9e57b80, dmabuf: 0xffffff8b1baa6a00, 
f_count: 0x1] call:(('dma_buf_poll', 
760)ffffffd4adb685c8<-('do_epoll_wait', 
1020)ffffffd4ad46229c<-('do_epoll_pwait', 
84)ffffffd4ad463b70<-('__arm64_sys_epoll_pwait', 
276)ffffffd4ad463d1c<-('invoke_syscall', 
96)ffffffd4acebe00c<-('el0_svc_common', 
140)ffffffd4acebdf40<-('do_el0_svc', 40)ffffffd4acebde44)

Line 716568:        android.display-2220  [006] 22160.664425: bprint:    
fput_many fput for dma buf file[file: 0xffffff8ab9e57b80, dmabuf: 
0xffffff8b1baa6a00, count: 0x1] call:(('dma_buf_poll', 
1104)ffffffd4adb68720<-('do_epoll_wait', 
1020)ffffffd4ad46229c<-('do_epoll_pwait', 
84)ffffffd4ad463b70<-('__arm64_sys_epoll_pwait', 
276)ffffffd4ad463d1c<-('invoke_syscall', 
96)ffffffd4acebe00c<-('el0_svc_common', 
140)ffffffd4acebdf40<-('do_el0_svc', 40)ffffffd4acebde44)

Line 716570:        android.display-2220  [006] 22160.664427: bprint:    
fput_many fput for dma buf file ret: 0, [file: 0xffffff8ab9e57b80, 
dmabuf: 0xffffff8b1baa6a00] start to free

-> Here task 2220 do epoll again where internally it will get/put then 
start to free twice and lead to final crash.

Here is the basic flow:

1. Thread A install the dma_buf_fd via dma_buf_export, the fd refcount is 1

2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)

3. After use the dma buf, Thread A close the fd, then here fd refcount is 0,
   and it will run __fput finally to release the file

4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it still 
resides in one epoll_list.
   Although __fput will call eventpoll_release to remove the file from 
binded epoll list,
   but it has small time window where Thread B jumps in.

5. During the small window, Thread B do the poll action for dma_buf_fd, 
where it will fget/fput for the file,
   this means the fd refcount will be 0 -> 1 -> 0, and it will call 
__fput again.
   This will lead to __fput twice for the same file.

6. So the potenial fix is use get_file_rcu which to check if file 
refcount already zero which means under free.
   If so, we just return and no need to do the dma_buf_poll.

Here is the race condition:

Thread A Thread B
dma_buf_export
fd_refcount is 1
epoll_ctl(EPOLL_ADD)
add dma_buf_fd to epoll list
close(dma_buf_fd)
fd_refcount is 0
__fput
dma_buf_poll
fget
fput
fd_refcount is zero again

Thanks
Christian König April 12, 2024, 6:39 a.m. UTC | #8
Am 12.04.24 um 08:19 schrieb zhiguojiang:
> [SNIP]
> -> Here task 2220 do epoll again where internally it will get/put then 
> start to free twice and lead to final crash.
>
> Here is the basic flow:
>
> 1. Thread A install the dma_buf_fd via dma_buf_export, the fd refcount 
> is 1
>
> 2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)
>
> 3. After use the dma buf, Thread A close the fd, then here fd refcount 
> is 0,
>   and it will run __fput finally to release the file

Stop, that isn't correct.

The fs layer which calls dma_buf_poll() should make sure that the file 
can't go away until the function returns.

Then inside dma_buf_poll() we add another reference to the file while 
installing the callback:

                         /* Paired with fput in dma_buf_poll_cb */
                         get_file(dmabuf->file);

This reference is only dropped after the callback is completed in 
dma_buf_poll_cb():

         /* Paired with get_file in dma_buf_poll */
         fput(dmabuf->file);

So your explanation for the issue just seems to be incorrect.

>
> 4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it 
> still resides in one epoll_list.
>   Although __fput will call eventpoll_release to remove the file from 
> binded epoll list,
>   but it has small time window where Thread B jumps in.

Well if that is really the case then that would then be a bug in 
epoll_ctl().

>
> 5. During the small window, Thread B do the poll action for 
> dma_buf_fd, where it will fget/fput for the file,
>   this means the fd refcount will be 0 -> 1 -> 0, and it will call 
> __fput again.
>   This will lead to __fput twice for the same file.
>
> 6. So the potenial fix is use get_file_rcu which to check if file 
> refcount already zero which means under free.
>   If so, we just return and no need to do the dma_buf_poll.

Well to say it bluntly as far as I can see this suggestion is completely 
nonsense.

When the reference to the file goes away while dma_buf_poll() is 
executed then that's a massive bug in the caller of that function.

Regards,
Christian.

>
> Here is the race condition:
>
> Thread A Thread B
> dma_buf_export
> fd_refcount is 1
> epoll_ctl(EPOLL_ADD)
> add dma_buf_fd to epoll list
> close(dma_buf_fd)
> fd_refcount is 0
> __fput
> dma_buf_poll
> fget
> fput
> fd_refcount is zero again
>
> Thanks
>
zhiguojiang April 15, 2024, 10:35 a.m. UTC | #9
在 2024/4/12 14:39, Christian König 写道:
> [Some people who received this message don't often get email from 
> christian.koenig@amd.com. Learn why this is important at 
> https://aka.ms/LearnAboutSenderIdentification ]
>
> Am 12.04.24 um 08:19 schrieb zhiguojiang:
>> [SNIP]
>> -> Here task 2220 do epoll again where internally it will get/put then
>> start to free twice and lead to final crash.
>>
>> Here is the basic flow:
>>
>> 1. Thread A install the dma_buf_fd via dma_buf_export, the fd refcount
>> is 1
>>
>> 2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)
>>
>> 3. After use the dma buf, Thread A close the fd, then here fd refcount
>> is 0,
>>   and it will run __fput finally to release the file
>
> Stop, that isn't correct.
>
> The fs layer which calls dma_buf_poll() should make sure that the file
> can't go away until the function returns.
>
> Then inside dma_buf_poll() we add another reference to the file while
> installing the callback:
>
>                         /* Paired with fput in dma_buf_poll_cb */
>                         get_file(dmabuf->file);
Hi,

The problem may just occurred here.

Is it possible file reference count already decreased to 0 and fput 
already being in progressing just before calling get_file(dmabuf->file) 
in dma_buf_poll()?

>
> This reference is only dropped after the callback is completed in
> dma_buf_poll_cb():
>
>         /* Paired with get_file in dma_buf_poll */
>         fput(dmabuf->file);
>
> So your explanation for the issue just seems to be incorrect.
>
>>
>> 4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it
>> still resides in one epoll_list.
>>   Although __fput will call eventpoll_release to remove the file from
>> binded epoll list,
>>   but it has small time window where Thread B jumps in.
>
> Well if that is really the case then that would then be a bug in
> epoll_ctl().
>
>>
>> 5. During the small window, Thread B do the poll action for
>> dma_buf_fd, where it will fget/fput for the file,
>>   this means the fd refcount will be 0 -> 1 -> 0, and it will call
>> __fput again.
>>   This will lead to __fput twice for the same file.
>>
>> 6. So the potenial fix is use get_file_rcu which to check if file
>> refcount already zero which means under free.
>>   If so, we just return and no need to do the dma_buf_poll.
>
> Well to say it bluntly as far as I can see this suggestion is completely
> nonsense.
>
> When the reference to the file goes away while dma_buf_poll() is
> executed then that's a massive bug in the caller of that function.
>
> Regards,
> Christian.
>
>>
>> Here is the race condition:
>>
>> Thread A Thread B
>> dma_buf_export
>> fd_refcount is 1
>> epoll_ctl(EPOLL_ADD)
>> add dma_buf_fd to epoll list
>> close(dma_buf_fd)
>> fd_refcount is 0
>> __fput
>> dma_buf_poll
>> fget
>> fput
>> fd_refcount is zero again
>>
>> Thanks
>>
>
Christian König April 15, 2024, 11:57 a.m. UTC | #10
Am 15.04.24 um 12:35 schrieb zhiguojiang:
> 在 2024/4/12 14:39, Christian König 写道:
>> [Some people who received this message don't often get email from 
>> christian.koenig@amd.com. Learn why this is important at 
>> https://aka.ms/LearnAboutSenderIdentification ]
>>
>> Am 12.04.24 um 08:19 schrieb zhiguojiang:
>>> [SNIP]
>>> -> Here task 2220 do epoll again where internally it will get/put then
>>> start to free twice and lead to final crash.
>>>
>>> Here is the basic flow:
>>>
>>> 1. Thread A install the dma_buf_fd via dma_buf_export, the fd refcount
>>> is 1
>>>
>>> 2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)
>>>
>>> 3. After use the dma buf, Thread A close the fd, then here fd refcount
>>> is 0,
>>>   and it will run __fput finally to release the file
>>
>> Stop, that isn't correct.
>>
>> The fs layer which calls dma_buf_poll() should make sure that the file
>> can't go away until the function returns.
>>
>> Then inside dma_buf_poll() we add another reference to the file while
>> installing the callback:
>>
>>                         /* Paired with fput in dma_buf_poll_cb */
>>                         get_file(dmabuf->file);
> Hi,
>
> The problem may just occurred here.
>
> Is it possible file reference count already decreased to 0 and fput 
> already being in progressing just before calling 
> get_file(dmabuf->file) in dma_buf_poll()?

No, exactly that isn't possible.

If a function gets a dma_buf pointer or even more general any reference 
counted pointer which has already decreased to 0 then that is a major 
bug in the caller of that function.

BTW: It's completely illegal to work around such issues by using 
file_count() or RCU functions. So when you suggest stuff like that it 
will immediately face rejection.

Regards,
Christian.

>
>>
>> This reference is only dropped after the callback is completed in
>> dma_buf_poll_cb():
>>
>>         /* Paired with get_file in dma_buf_poll */
>>         fput(dmabuf->file);
>>
>> So your explanation for the issue just seems to be incorrect.
>>
>>>
>>> 4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it
>>> still resides in one epoll_list.
>>>   Although __fput will call eventpoll_release to remove the file from
>>> binded epoll list,
>>>   but it has small time window where Thread B jumps in.
>>
>> Well if that is really the case then that would then be a bug in
>> epoll_ctl().
>>
>>>
>>> 5. During the small window, Thread B do the poll action for
>>> dma_buf_fd, where it will fget/fput for the file,
>>>   this means the fd refcount will be 0 -> 1 -> 0, and it will call
>>> __fput again.
>>>   This will lead to __fput twice for the same file.
>>>
>>> 6. So the potenial fix is use get_file_rcu which to check if file
>>> refcount already zero which means under free.
>>>   If so, we just return and no need to do the dma_buf_poll.
>>
>> Well to say it bluntly as far as I can see this suggestion is completely
>> nonsense.
>>
>> When the reference to the file goes away while dma_buf_poll() is
>> executed then that's a massive bug in the caller of that function.
>>
>> Regards,
>> Christian.
>>
>>>
>>> Here is the race condition:
>>>
>>> Thread A Thread B
>>> dma_buf_export
>>> fd_refcount is 1
>>> epoll_ctl(EPOLL_ADD)
>>> add dma_buf_fd to epoll list
>>> close(dma_buf_fd)
>>> fd_refcount is 0
>>> __fput
>>> dma_buf_poll
>>> fget
>>> fput
>>> fd_refcount is zero again
>>>
>>> Thanks
>>>
>>
>
zhiguojiang April 18, 2024, 1:33 a.m. UTC | #11
在 2024/4/15 19:57, Christian König 写道:
> [Some people who received this message don't often get email from 
> christian.koenig@amd.com. Learn why this is important at 
> https://aka.ms/LearnAboutSenderIdentification ]
>
> Am 15.04.24 um 12:35 schrieb zhiguojiang:
>> 在 2024/4/12 14:39, Christian König 写道:
>>> [Some people who received this message don't often get email from
>>> christian.koenig@amd.com. Learn why this is important at
>>> https://aka.ms/LearnAboutSenderIdentification ]
>>>
>>> Am 12.04.24 um 08:19 schrieb zhiguojiang:
>>>> [SNIP]
>>>> -> Here task 2220 do epoll again where internally it will get/put then
>>>> start to free twice and lead to final crash.
>>>>
>>>> Here is the basic flow:
>>>>
>>>> 1. Thread A install the dma_buf_fd via dma_buf_export, the fd refcount
>>>> is 1
>>>>
>>>> 2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)
>>>>
>>>> 3. After use the dma buf, Thread A close the fd, then here fd refcount
>>>> is 0,
>>>>   and it will run __fput finally to release the file
>>>
>>> Stop, that isn't correct.
>>>
>>> The fs layer which calls dma_buf_poll() should make sure that the file
>>> can't go away until the function returns.
>>>
>>> Then inside dma_buf_poll() we add another reference to the file while
>>> installing the callback:
>>>
>>>                         /* Paired with fput in dma_buf_poll_cb */
>>>                         get_file(dmabuf->file);
>> Hi,
>>
>> The problem may just occurred here.
>>
>> Is it possible file reference count already decreased to 0 and fput
>> already being in progressing just before calling
>> get_file(dmabuf->file) in dma_buf_poll()?
>
> No, exactly that isn't possible.
>
> If a function gets a dma_buf pointer or even more general any reference
> counted pointer which has already decreased to 0 then that is a major
> bug in the caller of that function.
>
> BTW: It's completely illegal to work around such issues by using
> file_count() or RCU functions. So when you suggest stuff like that it
> will immediately face rejection.
>
> Regards,
> Christian.
Hi,

Thanks for your kind comment.

'If a function gets a dma_buf pointer or even more general any reference

counted pointer which has already decreased to 0 then that is a major

bug in the caller of that function.'

According to the issue log, we can see the dma buf file close and epoll 
operation running in parallel.

Apparently at the moment caller calls epoll, although another task 
caller already called close on the same fd, but this fd was still in 
progress to close, so fd is still valid, thus no EBADF returned to caller.

Then the two task contexts operate on same dma buf fd(one is close, 
another is epoll) in kernel space.

Please kindly help to comment whether above scenario is possible.

If there is some bug in the caller logic, Can u help to point it out? :)

Regards,
Zhiguo
>
>>
>>>
>>> This reference is only dropped after the callback is completed in
>>> dma_buf_poll_cb():
>>>
>>>         /* Paired with get_file in dma_buf_poll */
>>>         fput(dmabuf->file);
>>>
>>> So your explanation for the issue just seems to be incorrect.
>>>
>>>>
>>>> 4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it
>>>> still resides in one epoll_list.
>>>>   Although __fput will call eventpoll_release to remove the file from
>>>> binded epoll list,
>>>>   but it has small time window where Thread B jumps in.
>>>
>>> Well if that is really the case then that would then be a bug in
>>> epoll_ctl().
>>>
>>>>
>>>> 5. During the small window, Thread B do the poll action for
>>>> dma_buf_fd, where it will fget/fput for the file,
>>>>   this means the fd refcount will be 0 -> 1 -> 0, and it will call
>>>> __fput again.
>>>>   This will lead to __fput twice for the same file.
>>>>
>>>> 6. So the potenial fix is use get_file_rcu which to check if file
>>>> refcount already zero which means under free.
>>>>   If so, we just return and no need to do the dma_buf_poll.
>>>
>>> Well to say it bluntly as far as I can see this suggestion is 
>>> completely
>>> nonsense.
>>>
>>> When the reference to the file goes away while dma_buf_poll() is
>>> executed then that's a massive bug in the caller of that function.
>>>
>>> Regards,
>>> Christian.
>>>
>>>>
>>>> Here is the race condition:
>>>>
>>>> Thread A Thread B
>>>> dma_buf_export
>>>> fd_refcount is 1
>>>> epoll_ctl(EPOLL_ADD)
>>>> add dma_buf_fd to epoll list
>>>> close(dma_buf_fd)
>>>> fd_refcount is 0
>>>> __fput
>>>> dma_buf_poll
>>>> fget
>>>> fput
>>>> fd_refcount is zero again
>>>>
>>>> Thanks
>>>>
>>>
>>
>
Christian König April 18, 2024, 6:46 a.m. UTC | #12
Am 18.04.24 um 03:33 schrieb zhiguojiang:
> 在 2024/4/15 19:57, Christian König 写道:
>> [Some people who received this message don't often get email from 
>> christian.koenig@amd.com. Learn why this is important at 
>> https://aka.ms/LearnAboutSenderIdentification ]
>>
>> Am 15.04.24 um 12:35 schrieb zhiguojiang:
>>> 在 2024/4/12 14:39, Christian König 写道:
>>>> [Some people who received this message don't often get email from
>>>> christian.koenig@amd.com. Learn why this is important at
>>>> https://aka.ms/LearnAboutSenderIdentification ]
>>>>
>>>> Am 12.04.24 um 08:19 schrieb zhiguojiang:
>>>>> [SNIP]
>>>>> -> Here task 2220 do epoll again where internally it will get/put 
>>>>> then
>>>>> start to free twice and lead to final crash.
>>>>>
>>>>> Here is the basic flow:
>>>>>
>>>>> 1. Thread A install the dma_buf_fd via dma_buf_export, the fd 
>>>>> refcount
>>>>> is 1
>>>>>
>>>>> 2. Thread A add the fd to epoll list via epoll_ctl(EPOLL_CTL_ADD)
>>>>>
>>>>> 3. After use the dma buf, Thread A close the fd, then here fd 
>>>>> refcount
>>>>> is 0,
>>>>>   and it will run __fput finally to release the file
>>>>
>>>> Stop, that isn't correct.
>>>>
>>>> The fs layer which calls dma_buf_poll() should make sure that the file
>>>> can't go away until the function returns.
>>>>
>>>> Then inside dma_buf_poll() we add another reference to the file while
>>>> installing the callback:
>>>>
>>>>                         /* Paired with fput in dma_buf_poll_cb */
>>>>                         get_file(dmabuf->file);
>>> Hi,
>>>
>>> The problem may just occurred here.
>>>
>>> Is it possible file reference count already decreased to 0 and fput
>>> already being in progressing just before calling
>>> get_file(dmabuf->file) in dma_buf_poll()?
>>
>> No, exactly that isn't possible.
>>
>> If a function gets a dma_buf pointer or even more general any reference
>> counted pointer which has already decreased to 0 then that is a major
>> bug in the caller of that function.
>>
>> BTW: It's completely illegal to work around such issues by using
>> file_count() or RCU functions. So when you suggest stuff like that it
>> will immediately face rejection.
>>
>> Regards,
>> Christian.
> Hi,
>
> Thanks for your kind comment.
>
> 'If a function gets a dma_buf pointer or even more general any reference
>
> counted pointer which has already decreased to 0 then that is a major
>
> bug in the caller of that function.'
>
> According to the issue log, we can see the dma buf file close and 
> epoll operation running in parallel.
>
> Apparently at the moment caller calls epoll, although another task 
> caller already called close on the same fd, but this fd was still in 
> progress to close, so fd is still valid, thus no EBADF returned to 
> caller.

No, exactly that can't happen either.

As far as I can see the EPOLL holds a reference to the files it 
contains. So it is perfectly valid to add the file descriptor to EPOLL 
and then close it.

In this case the file won't be closed, but be kept alive by it's 
reference in the EPOLL file descriptor.

>
> Then the two task contexts operate on same dma buf fd(one is close, 
> another is epoll) in kernel space.
>
> Please kindly help to comment whether above scenario is possible.
>
> If there is some bug in the caller logic, Can u help to point it out? :)

Well what could be is that the EPOLL implementation is broken somehow, 
but I have really doubts on that.

As I said before the most likely explanation is that you have a broken 
device driver which messes up the DMA-buf file reference count somehow.

Regards,
Christian.

>
> Regards,
> Zhiguo
>>
>>>
>>>>
>>>> This reference is only dropped after the callback is completed in
>>>> dma_buf_poll_cb():
>>>>
>>>>         /* Paired with get_file in dma_buf_poll */
>>>>         fput(dmabuf->file);
>>>>
>>>> So your explanation for the issue just seems to be incorrect.
>>>>
>>>>>
>>>>> 4. Here Thread A not do epoll_ctl(EPOLL_CTL_DEL) manunally, so it
>>>>> still resides in one epoll_list.
>>>>>   Although __fput will call eventpoll_release to remove the file from
>>>>> binded epoll list,
>>>>>   but it has small time window where Thread B jumps in.
>>>>
>>>> Well if that is really the case then that would then be a bug in
>>>> epoll_ctl().
>>>>
>>>>>
>>>>> 5. During the small window, Thread B do the poll action for
>>>>> dma_buf_fd, where it will fget/fput for the file,
>>>>>   this means the fd refcount will be 0 -> 1 -> 0, and it will call
>>>>> __fput again.
>>>>>   This will lead to __fput twice for the same file.
>>>>>
>>>>> 6. So the potenial fix is use get_file_rcu which to check if file
>>>>> refcount already zero which means under free.
>>>>>   If so, we just return and no need to do the dma_buf_poll.
>>>>
>>>> Well to say it bluntly as far as I can see this suggestion is 
>>>> completely
>>>> nonsense.
>>>>
>>>> When the reference to the file goes away while dma_buf_poll() is
>>>> executed then that's a massive bug in the caller of that function.
>>>>
>>>> Regards,
>>>> Christian.
>>>>
>>>>>
>>>>> Here is the race condition:
>>>>>
>>>>> Thread A Thread B
>>>>> dma_buf_export
>>>>> fd_refcount is 1
>>>>> epoll_ctl(EPOLL_ADD)
>>>>> add dma_buf_fd to epoll list
>>>>> close(dma_buf_fd)
>>>>> fd_refcount is 0
>>>>> __fput
>>>>> dma_buf_poll
>>>>> fget
>>>>> fput
>>>>> fd_refcount is zero again
>>>>>
>>>>> Thanks
>>>>>
>>>>
>>>
>>
>
diff mbox series

Patch

diff --git a/drivers/dma-buf/dma-buf.c b/drivers/dma-buf/dma-buf.c
index 8fe5aa67b167..e469dd9288cc
--- a/drivers/dma-buf/dma-buf.c
+++ b/drivers/dma-buf/dma-buf.c
@@ -240,6 +240,10 @@  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 	struct dma_resv *resv;
 	__poll_t events;
 
+	/* Check if the file exists */
+	if (!file_count(file))
+		return EPOLLERR;
+
 	dmabuf = file->private_data;
 	if (!dmabuf || !dmabuf->resv)
 		return EPOLLERR;
@@ -266,8 +270,15 @@  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & EPOLLOUT) {
-			/* Paired with fput in dma_buf_poll_cb */
-			get_file(dmabuf->file);
+			/*
+			 * Paired with fput in dma_buf_poll_cb,
+			 * Skip poll for the closed file.
+			 */
+			if (!get_file_rcu(&dmabuf->file)) {
+				events &= ~EPOLLOUT;
+				dcb->active = 0;
+				goto clear_out_event;
+			}
 
 			if (!dma_buf_poll_add_cb(resv, true, dcb))
 				/* No callback queued, wake up any other waiters */
@@ -277,6 +288,7 @@  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		}
 	}
 
+clear_out_event:
 	if (events & EPOLLIN) {
 		struct dma_buf_poll_cb_t *dcb = &dmabuf->cb_in;
 
@@ -289,8 +301,15 @@  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		spin_unlock_irq(&dmabuf->poll.lock);
 
 		if (events & EPOLLIN) {
-			/* Paired with fput in dma_buf_poll_cb */
-			get_file(dmabuf->file);
+			/*
+			 * Paired with fput in dma_buf_poll_cb,
+			 * Skip poll for the closed file.
+			 */
+			if (!get_file_rcu(&dmabuf->file)) {
+				events &= ~EPOLLIN;
+				dcb->active = 0;
+				goto clear_in_event;
+			}
 
 			if (!dma_buf_poll_add_cb(resv, false, dcb))
 				/* No callback queued, wake up any other waiters */
@@ -300,6 +319,7 @@  static __poll_t dma_buf_poll(struct file *file, poll_table *poll)
 		}
 	}
 
+clear_in_event:
 	dma_resv_unlock(resv);
 	return events;
 }