Message ID | 20240110095532.4776-1-quic_uaggarwa@quicinc.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [RFC] usb: dwc3: gadget: Fix NULL pointer dereference in dwc3_gadget_suspend | expand |
Hi, On Wed, Jan 10, 2024, Uttkarsh Aggarwal wrote: > In current scenario if Plug-out and Plug-In performed continuously > there could be a chance while checking for dwc->gadget_driver in > dwc3_gadget_suspend, a NULL pointer dereference may occur. > > Call Stack: > > CPU1: CPU2: > gadget_unbind_driver dwc3_suspend_common > dw3_gadget_stop dwc3_gadget_suspend > dwc3_disconnect_gadget > > CPU1 basically clears the variable and CPU2 checks the variable. > Consider CPU1 is running and right before gadget_driver is cleared > and in parallel CPU2 executes dwc3_gadget_suspend where it finds > dwc->gadget_driver which is not NULL and resumes execution and then > CPU1 completes execution. CPU2 executes dwc3_disconnect_gadget where > it checks dwc->gadget_driver is already NULL because of which the > NULL pointer deference occur. > > To prevent this, moving NULL pointer check inside of spin_lock. > > Signed-off-by: Uttkarsh Aggarwal <quic_uaggarwa@quicinc.com> > --- > drivers/usb/dwc3/gadget.c | 6 ++---- > 1 file changed, 2 insertions(+), 4 deletions(-) > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c > index 019368f8e9c4..564976b3e2b9 100644 > --- a/drivers/usb/dwc3/gadget.c > +++ b/drivers/usb/dwc3/gadget.c > @@ -4709,15 +4709,13 @@ int dwc3_gadget_suspend(struct dwc3 *dwc) > unsigned long flags; > int ret; > > - if (!dwc->gadget_driver) > - return 0; > - > ret = dwc3_gadget_soft_disconnect(dwc); > if (ret) > goto err; > > spin_lock_irqsave(&dwc->lock, flags); > - dwc3_disconnect_gadget(dwc); > + if (dwc->gadget_driver) > + dwc3_disconnect_gadget(dwc); > spin_unlock_irqrestore(&dwc->lock, flags); > > return 0; > -- > 2.17.1 > Do you have the dmesg log of this NULL pointer dereference? Thanks, Thinh
On 1/17/2024 6:49 AM, Thinh Nguyen wrote: > Do you have the dmesg log of this NULL pointer dereference? > Thanks, > Thinh Hi Thinh, Here is the dmesg log: [ 149.524338][ T843] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028 … [ 149.525872][ T843] Workqueue: pm pm_runtime_work [ 149.525886][ T843] pstate: 824000c5 (Nzcv daIF +PAN -UAO +TCO -DIT -SSBS BTYPE=--) [ 149.525893][ T843] pc : dwc3_gadget_suspend+0x4c/0xb8 [ 149.525900][ T843] lr : dwc3_gadget_suspend+0x34/0xb8 … [ 149.526003][ T843] Call trace: [ 149.526008][ T843] dwc3_gadget_suspend+0x4c/0xb8 [ 149.526015][ T843] dwc3_suspend_common+0x58/0x230 [ 149.526021][ T843] dwc3_runtime_suspend+0x34/0x50 [ 149.526027][ T843] pm_generic_runtime_suspend+0x40/0x58 [ 149.526034][ T843] __rpm_callback+0x94/0x3e0 [ 149.526040][ T843] rpm_suspend+0x2e4/0x720 [ 149.526047][ T843] __pm_runtime_suspend+0x6c/0x100 [ 149.526054][ T843] dwc3_runtime_idle+0x48/0x64 [ 149.526060][ T843] rpm_idle+0x20c/0x310 [ 149.526067][ T843] pm_runtime_work+0x80/0xac [ 149.526075][ T843] process_one_work+0x1e4/0x43c [ 149.526084][ T843] worker_thread+0x25c/0x430 [ 149.526091][ T843] kthread+0x104/0x1d4 [ 149.526099][ T843] ret_from_fork+0x10/0x20 ======================================================= Process: adbd, [affinity: 0xff] cpu: 6 pid: 4907 start: 0xffffff888079b840 ===================================================== Task name: adbd [affinity: 0xff] pid: 4907 cpu: 6 prio: 120 start: ffffff888079b840 state: 0x2[D] exit_state: 0x0 stack base: 0xffffffc02db20000 Last_enqueued_ts: 149.523808841 Last_sleep_ts: 149.523859362 Stack: [<ffffffc0091cd558>] __switch_to+0x174 [<ffffffc0091cdd40>] __schedule+0x5ec [<ffffffc0091ce19c>] schedule+0x7c [<ffffffc0091d7438>] schedule_timeout+0x44 [<ffffffc0091ce858>] wait_for_common+0xd8 [<ffffffc0091ce774>] wait_for_completion+0x18 [<ffffffc0082b23dc>] kthread_stop+0x78 [<ffffffc0083134a0>] free_irq+0x184 [<ffffffc008bc7438>] dwc3_gadget_stop+0x40 [<ffffffc008c12228>] gadget_unbind_driver+0xfc [<ffffffc008ab76ac>] device_release_driver_internal[jt]+0x1d4 [<ffffffc008ab78dc>] driver_detach+0x90 [<ffffffc008ab519c>] bus_remove_driver+0x78 [<ffffffc008ab9170>] driver_unregister[jt]+0x44 [<ffffffc008c11838>] usb_gadget_unregister_driver+0x20 [<ffffffc008c0c1e0>] unregister_gadget_item+0x30 [<ffffffc008c256a8>] ffs_data_clear[jt]+0x88 Thanks, Uttkarsh
Hi Uttkarsh and Thinh, >> Call Stack: >> CPU1: CPU2: >> gadget_unbind_driver dwc3_suspend_common >> dw3_gadget_stop dwc3_gadget_suspend >> dwc3_disconnect_gadget typo: dw3_gadget_stop/dwc3_gadget_stop > Do you have the dmesg log of this NULL pointer dereference? > Thanks, > Thinh We also encountered similar stack traces. [ 5.130593][ T100] Unable to handle kernel NULL pointer dereference at virtual address 0000000000000028 [ 5.130912][ T100] Call trace: [ 5.130914][ T100] dwc3_gadget_suspend+0x88/0xf0 [ 5.130918][ T100] dwc3_suspend_common+0x58/0x230 [ 5.130921][ T100] dwc3_runtime_suspend+0x34/0x50 [ 5.130925][ T100] pm_generic_runtime_suspend+0x40/0x58 [ 5.130928][ T100] __rpm_callback+0x94/0x3e0 [ 5.130931][ T100] rpm_suspend+0x2e4/0x720 [ 5.130934][ T100] __pm_runtime_suspend+0x6c/0x100 [ 5.130937][ T100] dwc3_runtime_idle+0x48/0x64 [ 5.130941][ T100] rpm_idle+0x20c/0x310 [ 5.130944][ T100] pm_runtime_work+0x80/0xac [ 5.130947][ T100] process_one_work+0x1e4/0x43c [ 5.130952][ T100] worker_thread+0x25c/0x430 [ 5.130956][ T100] kthread+0x104/0x1d4 [ 5.130959][ T100] ret_from_fork+0x10/0x20 (gdb) list *dwc3_gadget_suspend+0x88 0xffffffc0089b2318 is in dwc3_gadget_suspend (drivers/usb/dwc3/gadget.c:3729). ... 3729 if (dwc->async_callbacks && dwc->gadget_driver->disconnect) { ... Thanks, Kuen-Han
> ret = dwc3_gadget_soft_disconnect(dwc); > if (ret) > goto err; For improved readability, we can remove the goto statement and move the error handling logic here. Thanks, Kuen-Han
On 1/17/2024 12:47 PM, Kuen-Han Tsai wrote: >> ret = dwc3_gadget_soft_disconnect(dwc); >> if (ret) >> goto err; > For improved readability, we can remove the goto statement and move > the error handling logic here. Hi Kuen-Han, Thanks for the suggestion. Does this looks good to you ? int ret = dwc3_gadget_soft_disconnect(dwc); if (ret) { if (dwc->softconnect) dwc3_gadget_soft_connect(dwc); return ret; } spin_lock_irqsave(&dwc->lock, flags); if (dwc->gadget_driver) dwc3_disconnect_gadget(dwc); spin_unlock_irqrestore(&dwc->lock, flags); Thanks, Uttkarsh
On 1/17/2024 2:52 PM, UTTKARSH AGGARWAL wrote: > > On 1/17/2024 12:47 PM, Kuen-Han Tsai wrote: >>> ret = dwc3_gadget_soft_disconnect(dwc); >>> if (ret) >>> goto err; >> For improved readability, we can remove the goto statement and move >> the error handling logic here. > > Hi Kuen-Han, > > Thanks for the suggestion. > Does this looks good to you ? > > int ret = dwc3_gadget_soft_disconnect(dwc);if (ret) { if > (dwc->softconnect) dwc3_gadget_soft_connect(dwc); > > return ret; } spin_lock_irqsave(&dwc->lock, flags); if > (dwc->gadget_driver) dwc3_disconnect_gadget(dwc); > spin_unlock_irqrestore(&dwc->lock, flags); Sorry for the mistake. int ret = dwc3_gadget_soft_disconnect(dwc); if (ret) { if (dwc->softconnect) dwc3_gadget_soft_connect(dwc); return ret; } spin_lock_irqsave(&dwc->lock, flags); if (dwc->gadget_driver) dwc3_disconnect_gadget(dwc); spin_unlock_irqrestore(&dwc->lock, flags);
> int ret = dwc3_gadget_soft_disconnect(dwc);
I'm not sure if the coding style in this line, where the declaration
and assignment of a variable are combined, is considered good
practice.
The other parts look good to me.
Thanks,
Kuen-Han
On Wed, Jan 17, 2024, UTTKARSH AGGARWAL wrote: > > On 1/17/2024 2:52 PM, UTTKARSH AGGARWAL wrote: > > > > On 1/17/2024 12:47 PM, Kuen-Han Tsai wrote: > > > > ret = dwc3_gadget_soft_disconnect(dwc); > > > > if (ret) > > > > goto err; > > > For improved readability, we can remove the goto statement and move > > > the error handling logic here. > > > > Hi Kuen-Han, > > > > Thanks for the suggestion. > > Does this looks good to you ? > > > > int ret = dwc3_gadget_soft_disconnect(dwc);if (ret) { if > > (dwc->softconnect) dwc3_gadget_soft_connect(dwc); > > > > return ret; } spin_lock_irqsave(&dwc->lock, flags); if > > (dwc->gadget_driver) dwc3_disconnect_gadget(dwc); > > spin_unlock_irqrestore(&dwc->lock, flags); > > Sorry for the mistake. > > int ret = dwc3_gadget_soft_disconnect(dwc); > > if (ret) { > > if (dwc->softconnect) > > dwc3_gadget_soft_connect(dwc); > > return ret; > > } > > spin_lock_irqsave(&dwc->lock, flags); > > if (dwc->gadget_driver) > > dwc3_disconnect_gadget(dwc); > > spin_unlock_irqrestore(&dwc->lock, flags); > Please only make one logical fix per change. If any unrelated refactor or style change is needed, keep it to a separate commit. Thanks, Thinh
On 1/18/2024 6:26 AM, Thinh Nguyen wrote: > On Wed, Jan 17, 2024, UTTKARSH AGGARWAL wrote: >> On 1/17/2024 2:52 PM, UTTKARSH AGGARWAL wrote: >>> On 1/17/2024 12:47 PM, Kuen-Han Tsai wrote: >>>>> ret = dwc3_gadget_soft_disconnect(dwc); >>>>> if (ret) >>>>> goto err; >>>> For improved readability, we can remove the goto statement and move >>>> the error handling logic here. >>> Hi Kuen-Han, >>> >>> Thanks for the suggestion. >>> Does this looks good to you ? >>> >>> int ret = dwc3_gadget_soft_disconnect(dwc);if (ret) { if >>> (dwc->softconnect) dwc3_gadget_soft_connect(dwc); >>> >>> return ret; } spin_lock_irqsave(&dwc->lock, flags); if >>> (dwc->gadget_driver) dwc3_disconnect_gadget(dwc); >>> spin_unlock_irqrestore(&dwc->lock, flags); >> Sorry for the mistake. >> >> int ret = dwc3_gadget_soft_disconnect(dwc); >> >> if (ret) { >> >> if (dwc->softconnect) >> >> dwc3_gadget_soft_connect(dwc); >> >> return ret; >> >> } >> >> spin_lock_irqsave(&dwc->lock, flags); >> >> if (dwc->gadget_driver) >> >> dwc3_disconnect_gadget(dwc); >> >> spin_unlock_irqrestore(&dwc->lock, flags); >> > Please only make one logical fix per change. If any unrelated refactor > or style change is needed, keep it to a separate commit. > > Thanks, > Thinh Sure Thinh,I’ll only push fix in v2, not refactoring. Thanks, Uttkarsh
diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c index 019368f8e9c4..564976b3e2b9 100644 --- a/drivers/usb/dwc3/gadget.c +++ b/drivers/usb/dwc3/gadget.c @@ -4709,15 +4709,13 @@ int dwc3_gadget_suspend(struct dwc3 *dwc) unsigned long flags; int ret; - if (!dwc->gadget_driver) - return 0; - ret = dwc3_gadget_soft_disconnect(dwc); if (ret) goto err; spin_lock_irqsave(&dwc->lock, flags); - dwc3_disconnect_gadget(dwc); + if (dwc->gadget_driver) + dwc3_disconnect_gadget(dwc); spin_unlock_irqrestore(&dwc->lock, flags); return 0;
In current scenario if Plug-out and Plug-In performed continuously there could be a chance while checking for dwc->gadget_driver in dwc3_gadget_suspend, a NULL pointer dereference may occur. Call Stack: CPU1: CPU2: gadget_unbind_driver dwc3_suspend_common dw3_gadget_stop dwc3_gadget_suspend dwc3_disconnect_gadget CPU1 basically clears the variable and CPU2 checks the variable. Consider CPU1 is running and right before gadget_driver is cleared and in parallel CPU2 executes dwc3_gadget_suspend where it finds dwc->gadget_driver which is not NULL and resumes execution and then CPU1 completes execution. CPU2 executes dwc3_disconnect_gadget where it checks dwc->gadget_driver is already NULL because of which the NULL pointer deference occur. To prevent this, moving NULL pointer check inside of spin_lock. Signed-off-by: Uttkarsh Aggarwal <quic_uaggarwa@quicinc.com> --- drivers/usb/dwc3/gadget.c | 6 ++---- 1 file changed, 2 insertions(+), 4 deletions(-)