diff mbox series

[RFC] usb: dwc3: gadget: Fix NULL pointer dereference in dwc3_gadget_suspend

Message ID 20240110095532.4776-1-quic_uaggarwa@quicinc.com (mailing list archive)
State Superseded
Headers show
Series [RFC] usb: dwc3: gadget: Fix NULL pointer dereference in dwc3_gadget_suspend | expand

Commit Message

UTTKARSH AGGARWAL Jan. 10, 2024, 9:55 a.m. UTC
In current scenario if Plug-out and Plug-In performed continuously
there could be a chance while checking for dwc->gadget_driver in
dwc3_gadget_suspend, a NULL pointer dereference may occur.

Call Stack:

	CPU1:                           CPU2:
	gadget_unbind_driver            dwc3_suspend_common
	dw3_gadget_stop                 dwc3_gadget_suspend
                                        dwc3_disconnect_gadget

CPU1 basically clears the variable and CPU2 checks the variable.
Consider CPU1 is running and right before gadget_driver is cleared
and in parallel CPU2 executes dwc3_gadget_suspend where it finds
dwc->gadget_driver which is not NULL and resumes execution and then
CPU1 completes execution. CPU2 executes dwc3_disconnect_gadget where
it checks dwc->gadget_driver is already NULL because of which the
NULL pointer deference occur.

To prevent this, moving NULL pointer check inside of spin_lock.

Signed-off-by: Uttkarsh Aggarwal <quic_uaggarwa@quicinc.com>
---
 drivers/usb/dwc3/gadget.c | 6 ++----
 1 file changed, 2 insertions(+), 4 deletions(-)

Comments

Thinh Nguyen Jan. 17, 2024, 1:19 a.m. UTC | #1
Hi,

On Wed, Jan 10, 2024, Uttkarsh Aggarwal wrote:
> In current scenario if Plug-out and Plug-In performed continuously
> there could be a chance while checking for dwc->gadget_driver in
> dwc3_gadget_suspend, a NULL pointer dereference may occur.
> 
> Call Stack:
> 
> 	CPU1:                           CPU2:
> 	gadget_unbind_driver            dwc3_suspend_common
> 	dw3_gadget_stop                 dwc3_gadget_suspend
>                                         dwc3_disconnect_gadget
> 
> CPU1 basically clears the variable and CPU2 checks the variable.
> Consider CPU1 is running and right before gadget_driver is cleared
> and in parallel CPU2 executes dwc3_gadget_suspend where it finds
> dwc->gadget_driver which is not NULL and resumes execution and then
> CPU1 completes execution. CPU2 executes dwc3_disconnect_gadget where
> it checks dwc->gadget_driver is already NULL because of which the
> NULL pointer deference occur.
> 
> To prevent this, moving NULL pointer check inside of spin_lock.
> 
> Signed-off-by: Uttkarsh Aggarwal <quic_uaggarwa@quicinc.com>
> ---
>  drivers/usb/dwc3/gadget.c | 6 ++----
>  1 file changed, 2 insertions(+), 4 deletions(-)
> 
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 019368f8e9c4..564976b3e2b9 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -4709,15 +4709,13 @@ int dwc3_gadget_suspend(struct dwc3 *dwc)
>  	unsigned long flags;
>  	int ret;
>  
> -	if (!dwc->gadget_driver)
> -		return 0;
> -
>  	ret = dwc3_gadget_soft_disconnect(dwc);
>  	if (ret)
>  		goto err;
>  
>  	spin_lock_irqsave(&dwc->lock, flags);
> -	dwc3_disconnect_gadget(dwc);
> +	if (dwc->gadget_driver)
> +		dwc3_disconnect_gadget(dwc);
>  	spin_unlock_irqrestore(&dwc->lock, flags);
>  
>  	return 0;
> -- 
> 2.17.1
> 

Do you have the dmesg log of this NULL pointer dereference?

Thanks,
Thinh
UTTKARSH AGGARWAL Jan. 17, 2024, 6:36 a.m. UTC | #2
On 1/17/2024 6:49 AM, Thinh Nguyen wrote:
> Do you have the dmesg log of this NULL pointer dereference?
> Thanks,
> Thinh

Hi Thinh,

Here is the dmesg log:

[  149.524338][  T843] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028
…
[  149.525872][  T843] Workqueue: pm pm_runtime_work
[  149.525886][  T843] pstate: 824000c5 (Nzcv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--)
[  149.525893][  T843] pc : dwc3_gadget_suspend+0x4c/0xb8
[  149.525900][  T843] lr : dwc3_gadget_suspend+0x34/0xb8
…
[  149.526003][  T843] Call trace:
[  149.526008][  T843]  dwc3_gadget_suspend+0x4c/0xb8
[  149.526015][  T843]  dwc3_suspend_common+0x58/0x230
[  149.526021][  T843]  dwc3_runtime_suspend+0x34/0x50
[  149.526027][  T843]  pm_generic_runtime_suspend+0x40/0x58
[  149.526034][  T843]  __rpm_callback+0x94/0x3e0
[  149.526040][  T843]  rpm_suspend+0x2e4/0x720
[  149.526047][  T843]  __pm_runtime_suspend+0x6c/0x100
[  149.526054][  T843]  dwc3_runtime_idle+0x48/0x64
[  149.526060][  T843]  rpm_idle+0x20c/0x310
[  149.526067][  T843]  pm_runtime_work+0x80/0xac
[  149.526075][  T843]  process_one_work+0x1e4/0x43c
[  149.526084][  T843]  worker_thread+0x25c/0x430
[  149.526091][  T843]  kthread+0x104/0x1d4
[  149.526099][  T843]  ret_from_fork+0x10/0x20
  
=======================================================
Process: adbd, [affinity: 0xff] cpu: 6 pid: 4907 start: 0xffffff888079b840
=====================================================
    Task name: adbd [affinity: 0xff] pid: 4907 cpu: 6 prio: 120 start: ffffff888079b840
    state: 0x2[D] exit_state: 0x0 stack base: 0xffffffc02db20000
    Last_enqueued_ts:     149.523808841 Last_sleep_ts:     149.523859362
    Stack:
    [<ffffffc0091cd558>] __switch_to+0x174
    [<ffffffc0091cdd40>] __schedule+0x5ec
    [<ffffffc0091ce19c>] schedule+0x7c
    [<ffffffc0091d7438>] schedule_timeout+0x44
    [<ffffffc0091ce858>] wait_for_common+0xd8
    [<ffffffc0091ce774>] wait_for_completion+0x18
    [<ffffffc0082b23dc>] kthread_stop+0x78
    [<ffffffc0083134a0>] free_irq+0x184
    [<ffffffc008bc7438>] dwc3_gadget_stop+0x40
    [<ffffffc008c12228>] gadget_unbind_driver+0xfc
    [<ffffffc008ab76ac>] device_release_driver_internal[jt]+0x1d4
    [<ffffffc008ab78dc>] driver_detach+0x90
    [<ffffffc008ab519c>] bus_remove_driver+0x78
    [<ffffffc008ab9170>] driver_unregister[jt]+0x44
    [<ffffffc008c11838>] usb_gadget_unregister_driver+0x20
    [<ffffffc008c0c1e0>] unregister_gadget_item+0x30
    [<ffffffc008c256a8>] ffs_data_clear[jt]+0x88

Thanks,

Uttkarsh
Kuen-Han Tsai Jan. 17, 2024, 6:49 a.m. UTC | #3
Hi Uttkarsh and Thinh,

>> Call Stack:
>> CPU1:                             CPU2:
>> gadget_unbind_driver     dwc3_suspend_common
>> dw3_gadget_stop           dwc3_gadget_suspend
>>                                        dwc3_disconnect_gadget

typo: dw3_gadget_stop/dwc3_gadget_stop

> Do you have the dmesg log of this NULL pointer dereference?
> Thanks,
> Thinh

We also encountered similar stack traces.

[    5.130593][  T100] Unable to handle kernel NULL pointer
dereference at virtual address 0000000000000028
[    5.130912][  T100] Call trace:
[    5.130914][  T100]  dwc3_gadget_suspend+0x88/0xf0
[    5.130918][  T100]  dwc3_suspend_common+0x58/0x230
[    5.130921][  T100]  dwc3_runtime_suspend+0x34/0x50
[    5.130925][  T100]  pm_generic_runtime_suspend+0x40/0x58
[    5.130928][  T100]  __rpm_callback+0x94/0x3e0
[    5.130931][  T100]  rpm_suspend+0x2e4/0x720
[    5.130934][  T100]  __pm_runtime_suspend+0x6c/0x100
[    5.130937][  T100]  dwc3_runtime_idle+0x48/0x64
[    5.130941][  T100]  rpm_idle+0x20c/0x310
[    5.130944][  T100]  pm_runtime_work+0x80/0xac
[    5.130947][  T100]  process_one_work+0x1e4/0x43c
[    5.130952][  T100]  worker_thread+0x25c/0x430
[    5.130956][  T100]  kthread+0x104/0x1d4
[    5.130959][  T100]  ret_from_fork+0x10/0x20

(gdb) list *dwc3_gadget_suspend+0x88
0xffffffc0089b2318 is in dwc3_gadget_suspend (drivers/usb/dwc3/gadget.c:3729).
...
3729            if (dwc->async_callbacks && dwc->gadget_driver->disconnect) {
...

Thanks,
Kuen-Han
Kuen-Han Tsai Jan. 17, 2024, 7:17 a.m. UTC | #4
>         ret = dwc3_gadget_soft_disconnect(dwc);
>         if (ret)
>                 goto err;

For improved readability, we can remove the goto statement and move
the error handling logic here.

Thanks,
Kuen-Han
UTTKARSH AGGARWAL Jan. 17, 2024, 9:22 a.m. UTC | #5
On 1/17/2024 12:47 PM, Kuen-Han Tsai wrote:
>>          ret = dwc3_gadget_soft_disconnect(dwc);
>>          if (ret)
>>                  goto err;
> For improved readability, we can remove the goto statement and move
> the error handling logic here.

Hi Kuen-Han,

Thanks for the suggestion.
Does this looks good to you ?

    int ret = dwc3_gadget_soft_disconnect(dwc);    if (ret) {        if 
(dwc->softconnect)            dwc3_gadget_soft_connect(dwc);

        return ret;    }    spin_lock_irqsave(&dwc->lock, flags);    if 
(dwc->gadget_driver)        dwc3_disconnect_gadget(dwc);   
  spin_unlock_irqrestore(&dwc->lock, flags);

Thanks,

Uttkarsh
UTTKARSH AGGARWAL Jan. 17, 2024, 9:35 a.m. UTC | #6
On 1/17/2024 2:52 PM, UTTKARSH AGGARWAL wrote:
>
> On 1/17/2024 12:47 PM, Kuen-Han Tsai wrote:
>>>          ret = dwc3_gadget_soft_disconnect(dwc);
>>>          if (ret)
>>>                  goto err;
>> For improved readability, we can remove the goto statement and move
>> the error handling logic here.
>
> Hi Kuen-Han,
>
> Thanks for the suggestion.
> Does this looks good to you ?
>
>    int ret = dwc3_gadget_soft_disconnect(dwc);if (ret) {        if 
> (dwc->softconnect)            dwc3_gadget_soft_connect(dwc);
>
>        return ret;    }    spin_lock_irqsave(&dwc->lock, flags);    if 
> (dwc->gadget_driver)  dwc3_disconnect_gadget(dwc); 
>  spin_unlock_irqrestore(&dwc->lock, flags);

Sorry for the mistake.

int ret = dwc3_gadget_soft_disconnect(dwc);

if (ret) {

       if (dwc->softconnect)

                  dwc3_gadget_soft_connect(dwc);

       return ret;

}

spin_lock_irqsave(&dwc->lock, flags);

if (dwc->gadget_driver)

        dwc3_disconnect_gadget(dwc);

spin_unlock_irqrestore(&dwc->lock, flags);
Kuen-Han Tsai Jan. 17, 2024, 12:16 p.m. UTC | #7
> int ret = dwc3_gadget_soft_disconnect(dwc);

I'm not sure if the coding style in this line, where the declaration
and assignment of a variable are combined, is considered good
practice.
The other parts look good to me.

Thanks,
Kuen-Han
Thinh Nguyen Jan. 18, 2024, 12:56 a.m. UTC | #8
On Wed, Jan 17, 2024, UTTKARSH AGGARWAL wrote:
> 
> On 1/17/2024 2:52 PM, UTTKARSH AGGARWAL wrote:
> > 
> > On 1/17/2024 12:47 PM, Kuen-Han Tsai wrote:
> > > >          ret = dwc3_gadget_soft_disconnect(dwc);
> > > >          if (ret)
> > > >                  goto err;
> > > For improved readability, we can remove the goto statement and move
> > > the error handling logic here.
> > 
> > Hi Kuen-Han,
> > 
> > Thanks for the suggestion.
> > Does this looks good to you ?
> > 
> >    int ret = dwc3_gadget_soft_disconnect(dwc);if (ret) {        if
> > (dwc->softconnect)            dwc3_gadget_soft_connect(dwc);
> > 
> >        return ret;    }    spin_lock_irqsave(&dwc->lock, flags);    if
> > (dwc->gadget_driver)  dwc3_disconnect_gadget(dwc);
> >  spin_unlock_irqrestore(&dwc->lock, flags);
> 
> Sorry for the mistake.
> 
> int ret = dwc3_gadget_soft_disconnect(dwc);
> 
> if (ret) {
> 
>       if (dwc->softconnect)
> 
>                  dwc3_gadget_soft_connect(dwc);
> 
>       return ret;
> 
> }
> 
> spin_lock_irqsave(&dwc->lock, flags);
> 
> if (dwc->gadget_driver)
> 
>        dwc3_disconnect_gadget(dwc);
> 
> spin_unlock_irqrestore(&dwc->lock, flags);
> 

Please only make one logical fix per change. If any unrelated refactor
or style change is needed, keep it to a separate commit.

Thanks,
Thinh
UTTKARSH AGGARWAL Jan. 18, 2024, 8:46 a.m. UTC | #9
On 1/18/2024 6:26 AM, Thinh Nguyen wrote:
> On Wed, Jan 17, 2024, UTTKARSH AGGARWAL wrote:
>> On 1/17/2024 2:52 PM, UTTKARSH AGGARWAL wrote:
>>> On 1/17/2024 12:47 PM, Kuen-Han Tsai wrote:
>>>>>           ret = dwc3_gadget_soft_disconnect(dwc);
>>>>>           if (ret)
>>>>>                   goto err;
>>>> For improved readability, we can remove the goto statement and move
>>>> the error handling logic here.
>>> Hi Kuen-Han,
>>>
>>> Thanks for the suggestion.
>>> Does this looks good to you ?
>>>
>>>     int ret = dwc3_gadget_soft_disconnect(dwc);if (ret) {        if
>>> (dwc->softconnect)            dwc3_gadget_soft_connect(dwc);
>>>
>>>         return ret;    }    spin_lock_irqsave(&dwc->lock, flags);    if
>>> (dwc->gadget_driver)  dwc3_disconnect_gadget(dwc);
>>>   spin_unlock_irqrestore(&dwc->lock, flags);
>> Sorry for the mistake.
>>
>> int ret = dwc3_gadget_soft_disconnect(dwc);
>>
>> if (ret) {
>>
>>        if (dwc->softconnect)
>>
>>                   dwc3_gadget_soft_connect(dwc);
>>
>>        return ret;
>>
>> }
>>
>> spin_lock_irqsave(&dwc->lock, flags);
>>
>> if (dwc->gadget_driver)
>>
>>         dwc3_disconnect_gadget(dwc);
>>
>> spin_unlock_irqrestore(&dwc->lock, flags);
>>
> Please only make one logical fix per change. If any unrelated refactor
> or style change is needed, keep it to a separate commit.
>
> Thanks,
> Thinh

Sure Thinh,I’ll only push fix in v2, not refactoring.

Thanks,

Uttkarsh
diff mbox series

Patch

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 019368f8e9c4..564976b3e2b9 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -4709,15 +4709,13 @@  int dwc3_gadget_suspend(struct dwc3 *dwc)
 	unsigned long flags;
 	int ret;
 
-	if (!dwc->gadget_driver)
-		return 0;
-
 	ret = dwc3_gadget_soft_disconnect(dwc);
 	if (ret)
 		goto err;
 
 	spin_lock_irqsave(&dwc->lock, flags);
-	dwc3_disconnect_gadget(dwc);
+	if (dwc->gadget_driver)
+		dwc3_disconnect_gadget(dwc);
 	spin_unlock_irqrestore(&dwc->lock, flags);
 
 	return 0;