diff mbox series

[v2] usb: dwc3: gadget: Ignore dwc3 interrupt if GEVNTCOUNT reads corrupt value

Message ID 20231221151620.16030-1-quic_kriskura@quicinc.com (mailing list archive)
State New
Headers show
Series [v2] usb: dwc3: gadget: Ignore dwc3 interrupt if GEVNTCOUNT reads corrupt value | expand

Commit Message

Krishna Kurapati Dec. 21, 2023, 3:16 p.m. UTC
In the current implementation, the check_event_buf call reads the
GEVNTCOUNT register to determine the amount of event data generated
and copies it from ev->buf to ev->cache after masking interrupt.

During copy if the amount of data to be copied is more than
(length - lpos), we fill the ev->cache till the end of 4096 byte
buffer allocated and then start filling from the top (lpos = 0).

In one instance of SMMU crash it is observed that GEVNTCOUNT register
reads more than 4096 bytes:

dwc3_readl   base=0xffffffc0091dc000  offset=50188  value=63488

(offset = 50188 -> 0xC40C)  -> reads 63488 bytes

As per crash dump:
dwc->lpos = 2056

and evt->buf is at 0xFFFFFFC009185000 and the crash is at
0xFFFFFFC009186000. The diff which is exactly 0x1000 bytes.

We first memcpy 2040 bytes from (buf + lpos) to (buf + 0x1000).

And then we copy the rest of the data (64388 - 2040) from beginning
of dwc->ev_buf. While doing so we go beyond bounds as we are trying
to memcpy 62348 bytes into a 4K buffer resulting in crash.

Fix this by ignoring the interrupt when GEVNTCOUNT register reads a
value more than the event ring allocated.

Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
---
Changes in v2:
Instead of fixing amount of data being copied from ring, ignored
the interrupt when count is corrupt as per suggestion from Thinh.

Link to v1:
https://lore.kernel.org/all/20230521100330.22478-1-quic_kriskura@quicinc.com/

 drivers/usb/dwc3/gadget.c | 10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Comments

Thinh Nguyen Dec. 22, 2023, 10:19 p.m. UTC | #1
On Thu, Dec 21, 2023, Krishna Kurapati wrote:
> In the current implementation, the check_event_buf call reads the
> GEVNTCOUNT register to determine the amount of event data generated
> and copies it from ev->buf to ev->cache after masking interrupt.
> 
> During copy if the amount of data to be copied is more than
> (length - lpos), we fill the ev->cache till the end of 4096 byte
> buffer allocated and then start filling from the top (lpos = 0).
> 
> In one instance of SMMU crash it is observed that GEVNTCOUNT register
> reads more than 4096 bytes:
> 
> dwc3_readl   base=0xffffffc0091dc000  offset=50188  value=63488
> 
> (offset = 50188 -> 0xC40C)  -> reads 63488 bytes
> 
> As per crash dump:
> dwc->lpos = 2056
> 
> and evt->buf is at 0xFFFFFFC009185000 and the crash is at
> 0xFFFFFFC009186000. The diff which is exactly 0x1000 bytes.
> 
> We first memcpy 2040 bytes from (buf + lpos) to (buf + 0x1000).
> 
> And then we copy the rest of the data (64388 - 2040) from beginning
> of dwc->ev_buf. While doing so we go beyond bounds as we are trying
> to memcpy 62348 bytes into a 4K buffer resulting in crash.
> 
> Fix this by ignoring the interrupt when GEVNTCOUNT register reads a
> value more than the event ring allocated.
> 
> Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
> ---
> Changes in v2:
> Instead of fixing amount of data being copied from ring, ignored
> the interrupt when count is corrupt as per suggestion from Thinh.
> 
> Link to v1:
> https://urldefense.com/v3/__https://lore.kernel.org/all/20230521100330.22478-1-quic_kriskura@quicinc.com/__;!!A4F2R9G_pg!ewM3u9Pdk8yD9eU24sOuQqmhm8M2VpGXH8zALqVWGKffGbcJxrtyKKlUPuh8SS2arqO09DjnC9atC3bemEpx-g5UQMllbA$ 
> 
>  drivers/usb/dwc3/gadget.c | 10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> index 858fe4c299b7..e27933fdcce3 100644
> --- a/drivers/usb/dwc3/gadget.c
> +++ b/drivers/usb/dwc3/gadget.c
> @@ -4451,6 +4451,7 @@ static irqreturn_t dwc3_thread_interrupt(int irq, void *_evt)
>  static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>  {
>  	struct dwc3 *dwc = evt->dwc;
> +	int ret = IRQ_WAKE_THREAD;

irqreturn_t instead of int?

>  	u32 amount;
>  	u32 count;
>  
> @@ -4480,6 +4481,12 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>  	if (!count)
>  		return IRQ_NONE;
>  
> +	if (count > evt->length) {
> +		dev_err(dwc->dev, "GEVTCOUNT corrupt\n");
> +		ret = IRQ_NONE;
> +		goto done;
> +	}
> +
>  	evt->count = count;
>  	evt->flags |= DWC3_EVENT_PENDING;
>  
> @@ -4493,9 +4500,10 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>  	if (amount < count)
>  		memcpy(evt->cache, evt->buf, count - amount);
>  
> +done:
>  	dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);

Don't update the GEVNTCOUNT if the read is invalid. We're not processing
any event, so we should not write back the "count" that we did not
process.

>  
> -	return IRQ_WAKE_THREAD;
> +	return ret;
>  }
>  
>  static irqreturn_t dwc3_interrupt(int irq, void *_evt)
> -- 
> 2.42.0
> 

Thanks,
Thinh
Krishna Kurapati Dec. 23, 2023, 8:32 a.m. UTC | #2
On 12/23/2023 3:49 AM, Thinh Nguyen wrote:
> On Thu, Dec 21, 2023, Krishna Kurapati wrote:
>> In the current implementation, the check_event_buf call reads the
>> GEVNTCOUNT register to determine the amount of event data generated
>> and copies it from ev->buf to ev->cache after masking interrupt.
>>
>> During copy if the amount of data to be copied is more than
>> (length - lpos), we fill the ev->cache till the end of 4096 byte
>> buffer allocated and then start filling from the top (lpos = 0).
>>
>> In one instance of SMMU crash it is observed that GEVNTCOUNT register
>> reads more than 4096 bytes:
>>
>> dwc3_readl   base=0xffffffc0091dc000  offset=50188  value=63488
>>
>> (offset = 50188 -> 0xC40C)  -> reads 63488 bytes
>>
>> As per crash dump:
>> dwc->lpos = 2056
>>
>> and evt->buf is at 0xFFFFFFC009185000 and the crash is at
>> 0xFFFFFFC009186000. The diff which is exactly 0x1000 bytes.
>>
>> We first memcpy 2040 bytes from (buf + lpos) to (buf + 0x1000).
>>
>> And then we copy the rest of the data (64388 - 2040) from beginning
>> of dwc->ev_buf. While doing so we go beyond bounds as we are trying
>> to memcpy 62348 bytes into a 4K buffer resulting in crash.
>>
>> Fix this by ignoring the interrupt when GEVNTCOUNT register reads a
>> value more than the event ring allocated.
>>
>> Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
>> ---
>> Changes in v2:
>> Instead of fixing amount of data being copied from ring, ignored
>> the interrupt when count is corrupt as per suggestion from Thinh.
>>
>> Link to v1:
>> https://urldefense.com/v3/__https://lore.kernel.org/all/20230521100330.22478-1-quic_kriskura@quicinc.com/__;!!A4F2R9G_pg!ewM3u9Pdk8yD9eU24sOuQqmhm8M2VpGXH8zALqVWGKffGbcJxrtyKKlUPuh8SS2arqO09DjnC9atC3bemEpx-g5UQMllbA$
>>
>>   drivers/usb/dwc3/gadget.c | 10 +++++++++-
>>   1 file changed, 9 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
>> index 858fe4c299b7..e27933fdcce3 100644
>> --- a/drivers/usb/dwc3/gadget.c
>> +++ b/drivers/usb/dwc3/gadget.c
>> @@ -4451,6 +4451,7 @@ static irqreturn_t dwc3_thread_interrupt(int irq, void *_evt)
>>   static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>>   {
>>   	struct dwc3 *dwc = evt->dwc;
>> +	int ret = IRQ_WAKE_THREAD;
> 
> irqreturn_t instead of int?
> 
>>   	u32 amount;
>>   	u32 count;
>>   
>> @@ -4480,6 +4481,12 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>>   	if (!count)
>>   		return IRQ_NONE;
>>   
>> +	if (count > evt->length) {
>> +		dev_err(dwc->dev, "GEVTCOUNT corrupt\n");
>> +		ret = IRQ_NONE;
>> +		goto done;
>> +	}
>> +
>>   	evt->count = count;
>>   	evt->flags |= DWC3_EVENT_PENDING;
>>   
>> @@ -4493,9 +4500,10 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
>>   	if (amount < count)
>>   		memcpy(evt->cache, evt->buf, count - amount);
>>   
>> +done:
>>   	dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
> 
> Don't update the GEVNTCOUNT if the read is invalid. We're not processing
> any event, so we should not write back the "count" that we did not
> process.
> 
>>   
Thanks for the review Thinh.

If we don't update, won't the register always be non-zero ? It will keep 
triggering the dwc3_interrupt unnecessarily right ?

Regards,
Krishna,
Thinh Nguyen Jan. 11, 2024, 2:10 a.m. UTC | #3
Hi Krishna,

On Sat, Dec 23, 2023, Krishna Kurapati PSSNV wrote:
> 
> 
> On 12/23/2023 3:49 AM, Thinh Nguyen wrote:
> > On Thu, Dec 21, 2023, Krishna Kurapati wrote:
> > > In the current implementation, the check_event_buf call reads the
> > > GEVNTCOUNT register to determine the amount of event data generated
> > > and copies it from ev->buf to ev->cache after masking interrupt.
> > > 
> > > During copy if the amount of data to be copied is more than
> > > (length - lpos), we fill the ev->cache till the end of 4096 byte
> > > buffer allocated and then start filling from the top (lpos = 0).
> > > 
> > > In one instance of SMMU crash it is observed that GEVNTCOUNT register
> > > reads more than 4096 bytes:
> > > 
> > > dwc3_readl   base=0xffffffc0091dc000  offset=50188  value=63488
> > > 
> > > (offset = 50188 -> 0xC40C)  -> reads 63488 bytes
> > > 
> > > As per crash dump:
> > > dwc->lpos = 2056
> > > 
> > > and evt->buf is at 0xFFFFFFC009185000 and the crash is at
> > > 0xFFFFFFC009186000. The diff which is exactly 0x1000 bytes.
> > > 
> > > We first memcpy 2040 bytes from (buf + lpos) to (buf + 0x1000).
> > > 
> > > And then we copy the rest of the data (64388 - 2040) from beginning
> > > of dwc->ev_buf. While doing so we go beyond bounds as we are trying
> > > to memcpy 62348 bytes into a 4K buffer resulting in crash.
> > > 
> > > Fix this by ignoring the interrupt when GEVNTCOUNT register reads a
> > > value more than the event ring allocated.
> > > 
> > > Signed-off-by: Krishna Kurapati <quic_kriskura@quicinc.com>
> > > ---
> > > Changes in v2:
> > > Instead of fixing amount of data being copied from ring, ignored
> > > the interrupt when count is corrupt as per suggestion from Thinh.
> > > 
> > > Link to v1:
> > > https://urldefense.com/v3/__https://lore.kernel.org/all/20230521100330.22478-1-quic_kriskura@quicinc.com/__;!!A4F2R9G_pg!ewM3u9Pdk8yD9eU24sOuQqmhm8M2VpGXH8zALqVWGKffGbcJxrtyKKlUPuh8SS2arqO09DjnC9atC3bemEpx-g5UQMllbA$
> > > 
> > >   drivers/usb/dwc3/gadget.c | 10 +++++++++-
> > >   1 file changed, 9 insertions(+), 1 deletion(-)
> > > 
> > > diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
> > > index 858fe4c299b7..e27933fdcce3 100644
> > > --- a/drivers/usb/dwc3/gadget.c
> > > +++ b/drivers/usb/dwc3/gadget.c
> > > @@ -4451,6 +4451,7 @@ static irqreturn_t dwc3_thread_interrupt(int irq, void *_evt)
> > >   static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> > >   {
> > >   	struct dwc3 *dwc = evt->dwc;
> > > +	int ret = IRQ_WAKE_THREAD;
> > 
> > irqreturn_t instead of int?
> > 
> > >   	u32 amount;
> > >   	u32 count;
> > > @@ -4480,6 +4481,12 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> > >   	if (!count)
> > >   		return IRQ_NONE;
> > > +	if (count > evt->length) {
> > > +		dev_err(dwc->dev, "GEVTCOUNT corrupt\n");
> > > +		ret = IRQ_NONE;
> > > +		goto done;
> > > +	}
> > > +
> > >   	evt->count = count;
> > >   	evt->flags |= DWC3_EVENT_PENDING;
> > > @@ -4493,9 +4500,10 @@ static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
> > >   	if (amount < count)
> > >   		memcpy(evt->cache, evt->buf, count - amount);
> > > +done:
> > >   	dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
> > 
> > Don't update the GEVNTCOUNT if the read is invalid. We're not processing
> > any event, so we should not write back the "count" that we did not
> > process.
> > 
> Thanks for the review Thinh.
> 
> If we don't update, won't the register always be non-zero ? It will keep
> triggering the dwc3_interrupt unnecessarily right ?
> 

I think I asked this previously also. Perhaps there was a
misunderstanding. From what I understood, this invalid register read was
a temporary issue that would be fixed by itself after a few retries of
register read.

If that's not the case, then we cannot move forward. The driver will not
be in sync with the controller because of the invalid event count. We
will need to gracefully halt the controller and notify the user of the
error.

Thanks,
Thinh
diff mbox series

Patch

diff --git a/drivers/usb/dwc3/gadget.c b/drivers/usb/dwc3/gadget.c
index 858fe4c299b7..e27933fdcce3 100644
--- a/drivers/usb/dwc3/gadget.c
+++ b/drivers/usb/dwc3/gadget.c
@@ -4451,6 +4451,7 @@  static irqreturn_t dwc3_thread_interrupt(int irq, void *_evt)
 static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
 {
 	struct dwc3 *dwc = evt->dwc;
+	int ret = IRQ_WAKE_THREAD;
 	u32 amount;
 	u32 count;
 
@@ -4480,6 +4481,12 @@  static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
 	if (!count)
 		return IRQ_NONE;
 
+	if (count > evt->length) {
+		dev_err(dwc->dev, "GEVTCOUNT corrupt\n");
+		ret = IRQ_NONE;
+		goto done;
+	}
+
 	evt->count = count;
 	evt->flags |= DWC3_EVENT_PENDING;
 
@@ -4493,9 +4500,10 @@  static irqreturn_t dwc3_check_event_buf(struct dwc3_event_buffer *evt)
 	if (amount < count)
 		memcpy(evt->cache, evt->buf, count - amount);
 
+done:
 	dwc3_writel(dwc->regs, DWC3_GEVNTCOUNT(0), count);
 
-	return IRQ_WAKE_THREAD;
+	return ret;
 }
 
 static irqreturn_t dwc3_interrupt(int irq, void *_evt)