diff mbox series

usb: dwc3: Add shutdown to platform_driver

Message ID 20190817174140.6394-1-vicencb@gmail.com (mailing list archive)
State New, archived
Headers show
Series usb: dwc3: Add shutdown to platform_driver | expand

Commit Message

Vicente Bergas Aug. 17, 2019, 5:41 p.m. UTC
Otherwise the device keeps writing to memory after kexec and disturbs
the next kernel.

Signed-off-by: Vicente Bergas <vicencb@gmail.com>
---
 drivers/usb/dwc3/dwc3-of-simple.c | 6 ++++++
 1 file changed, 6 insertions(+)

Hi Felipe, Robin,
this version calls 'remove' from 'shutdown' instead of just asserting
a reset because it looks like a cleaner way to stop the device.

Calling remove from shutdown in core.c instead of dwc3-of-simple.c does not
fix the issue either.

It has been tested on the sapphire board, a RK3399 platform.

Regards,
  Vicenç.

Comments

Vicente Bergas Aug. 27, 2019, 11:45 a.m. UTC | #1
On Saturday, August 17, 2019 7:41:40 PM CEST, Vicente Bergas wrote:
> Otherwise the device keeps writing to memory after kexec and disturbs
> the next kernel.
>
> Signed-off-by: Vicente Bergas <vicencb@gmail.com>
> ---
>  drivers/usb/dwc3/dwc3-of-simple.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> Hi Felipe, Robin,
> this version calls 'remove' from 'shutdown' instead of just asserting
> a reset because it looks like a cleaner way to stop the device.
>
> Calling remove from shutdown in core.c instead of dwc3-of-simple.c does not
> fix the issue either.
>
> It has been tested on the sapphire board, a RK3399 platform.
>
> Regards,
>   Vicenç.
>
> diff --git a/drivers/usb/dwc3/dwc3-of-simple.c 
> b/drivers/usb/dwc3/dwc3-of-simple.c
> index bdac3e7d7b18..d5fd45c64901 100644
> --- a/drivers/usb/dwc3/dwc3-of-simple.c
> +++ b/drivers/usb/dwc3/dwc3-of-simple.c
> @@ -133,6 +133,11 @@ static int dwc3_of_simple_remove(struct 
> platform_device *pdev)
>  	return 0;
>  }
>  
> +static void dwc3_of_simple_shutdown(struct platform_device *pdev)
> +{
> +	dwc3_of_simple_remove(pdev);
> +}
> +
>  static int __maybe_unused 
> dwc3_of_simple_runtime_suspend(struct device *dev)
>  {
>  	struct dwc3_of_simple	*simple = dev_get_drvdata(dev);
> @@ -190,6 +195,7 @@ MODULE_DEVICE_TABLE(of, of_dwc3_simple_match);
>  static struct platform_driver dwc3_of_simple_driver = {
>  	.probe		= dwc3_of_simple_probe,
>  	.remove		= dwc3_of_simple_remove,
> +	.shutdown	= dwc3_of_simple_shutdown,
>  	.driver		= {
>  		.name	= "dwc3-of-simple",
>  		.of_match_table = of_dwc3_simple_match,

Hi,
please, can you provide some feedback on this?

Regards,
  Vicenç.
Felipe Balbi Aug. 27, 2019, 11:53 a.m. UTC | #2
Hi,

Vicente Bergas <vicencb@gmail.com> writes:
> On Saturday, August 17, 2019 7:41:40 PM CEST, Vicente Bergas wrote:
>> Otherwise the device keeps writing to memory after kexec and disturbs
>> the next kernel.
>>
>> Signed-off-by: Vicente Bergas <vicencb@gmail.com>
>> ---
>>  drivers/usb/dwc3/dwc3-of-simple.c | 6 ++++++
>>  1 file changed, 6 insertions(+)
>>
>> Hi Felipe, Robin,
>> this version calls 'remove' from 'shutdown' instead of just asserting
>> a reset because it looks like a cleaner way to stop the device.
>>
>> Calling remove from shutdown in core.c instead of dwc3-of-simple.c does not
>> fix the issue either.
>>
>> It has been tested on the sapphire board, a RK3399 platform.
>>
>> Regards,
>>   Vicenç.
>>
>> diff --git a/drivers/usb/dwc3/dwc3-of-simple.c 
>> b/drivers/usb/dwc3/dwc3-of-simple.c
>> index bdac3e7d7b18..d5fd45c64901 100644
>> --- a/drivers/usb/dwc3/dwc3-of-simple.c
>> +++ b/drivers/usb/dwc3/dwc3-of-simple.c
>> @@ -133,6 +133,11 @@ static int dwc3_of_simple_remove(struct 
>> platform_device *pdev)
>>  	return 0;
>>  }
>>  
>> +static void dwc3_of_simple_shutdown(struct platform_device *pdev)
>> +{
>> +	dwc3_of_simple_remove(pdev);
>> +}
>> +
>>  static int __maybe_unused 
>> dwc3_of_simple_runtime_suspend(struct device *dev)
>>  {
>>  	struct dwc3_of_simple	*simple = dev_get_drvdata(dev);
>> @@ -190,6 +195,7 @@ MODULE_DEVICE_TABLE(of, of_dwc3_simple_match);
>>  static struct platform_driver dwc3_of_simple_driver = {
>>  	.probe		= dwc3_of_simple_probe,
>>  	.remove		= dwc3_of_simple_remove,
>> +	.shutdown	= dwc3_of_simple_shutdown,

why don't you just have shutdown use the same exact function as remove?
Frankly, though, I still don't fully understand what's going wrong
here. Why is the device still alive during kexec?

cheers
Vicente Bergas Aug. 27, 2019, 12:16 p.m. UTC | #3
On Tuesday, August 27, 2019 1:53:04 PM CEST, Felipe Balbi wrote:
> Hi,
>
> Vicente Bergas <vicencb@gmail.com> writes:
>> On Saturday, August 17, 2019 7:41:40 PM CEST, Vicente Bergas wrote:
>>> Otherwise the device keeps writing to memory after kexec and disturbs
>>> the next kernel.
...
>
> why don't you just have shutdown use the same exact function as remove?
> Frankly, though, I still don't fully understand what's going wrong
> here. Why is the device still alive during kexec?
>
> cheers

Hi Felipe,
the remove and shutdown functions have different prototypes, so
shutdown is wrapping remove.
Would it be preferable to cast remove as shutdown?

The issue with kexec is that the device is being used during the livetime
of the first kernel. When the first kernel executes kexec it calls the
shutdown function of drivers (instead of remove). Because of this the dwc3
device keeps doing things like DMA.
While the second kernel is taking over, it gets its memory corrupted with
such DMA accesses from the device. When the second kernel reaches the point
of taking over the dwc3 device, re-initializes it, but it is already too
late. Still worse, if the second kernel did not have the dwc3 driver, it
would get endless memory corruptions.
All in all, devices that can do DMA need to stop doing it on shutdown.

Regards,
  Vicenç.
Vicente Bergas Sept. 9, 2019, 3:07 p.m. UTC | #4
On Tuesday, August 27, 2019 2:16:20 PM CEST, Vicente Bergas wrote:
> On Tuesday, August 27, 2019 1:53:04 PM CEST, Felipe Balbi wrote:
>> Hi,
>> 
>> Vicente Bergas <vicencb@gmail.com> writes:
>>> On Saturday, August 17, 2019 7:41:40 PM CEST, Vicente Bergas wrote:
>>>> Otherwise the device keeps writing to memory after kexec and disturbs
>>>> the next kernel.
> ...
>> 
>> why don't you just have shutdown use the same exact function as remove?
>> Frankly, though, I still don't fully understand what's going wrong
>> here. Why is the device still alive during kexec?
>> 
>> cheers
>
> Hi Felipe,
> the remove and shutdown functions have different prototypes, so
> shutdown is wrapping remove.
> Would it be preferable to cast remove as shutdown?
>
> The issue with kexec is that the device is being used during the livetime
> of the first kernel. When the first kernel executes kexec it calls the
> shutdown function of drivers (instead of remove). Because of this the dwc3
> device keeps doing things like DMA.
> While the second kernel is taking over, it gets its memory corrupted with
> such DMA accesses from the device. When the second kernel reaches the point
> of taking over the dwc3 device, re-initializes it, but it is already too
> late. Still worse, if the second kernel did not have the dwc3 driver, it
> would get endless memory corruptions.
> All in all, devices that can do DMA need to stop doing it on shutdown.
>
> Regards,
>  Vicenç.

Hi,
please, can you provide some feedback on this?

Regards,
 Vicenç.
Vicente Bergas Sept. 19, 2019, 11:36 a.m. UTC | #5
Ping?

On Saturday, August 17, 2019 7:41:40 PM CEST, Vicente Bergas wrote:
> Otherwise the device keeps writing to memory after kexec and disturbs
> the next kernel.
>
> Signed-off-by: Vicente Bergas <vicencb@gmail.com>
> ---
>  drivers/usb/dwc3/dwc3-of-simple.c | 6 ++++++
>  1 file changed, 6 insertions(+)
>
> Hi Felipe, Robin,
> this version calls 'remove' from 'shutdown' instead of just asserting
> a reset because it looks like a cleaner way to stop the device.
>
> Calling remove from shutdown in core.c instead of dwc3-of-simple.c does not
> fix the issue either.
>
> It has been tested on the sapphire board, a RK3399 platform.
>
> Regards,
>   Vicenç.
>
> diff --git a/drivers/usb/dwc3/dwc3-of-simple.c 
> b/drivers/usb/dwc3/dwc3-of-simple.c
> index bdac3e7d7b18..d5fd45c64901 100644
> --- a/drivers/usb/dwc3/dwc3-of-simple.c
> +++ b/drivers/usb/dwc3/dwc3-of-simple.c
> @@ -133,6 +133,11 @@ static int dwc3_of_simple_remove(struct 
> platform_device *pdev)
>  	return 0;
>  }
>  
> +static void dwc3_of_simple_shutdown(struct platform_device *pdev)
> +{
> +	dwc3_of_simple_remove(pdev);
> +}
> +
>  static int __maybe_unused 
> dwc3_of_simple_runtime_suspend(struct device *dev)
>  {
>  	struct dwc3_of_simple	*simple = dev_get_drvdata(dev);
> @@ -190,6 +195,7 @@ MODULE_DEVICE_TABLE(of, of_dwc3_simple_match);
>  static struct platform_driver dwc3_of_simple_driver = {
>  	.probe		= dwc3_of_simple_probe,
>  	.remove		= dwc3_of_simple_remove,
> +	.shutdown	= dwc3_of_simple_shutdown,
>  	.driver		= {
>  		.name	= "dwc3-of-simple",
>  		.of_match_table = of_dwc3_simple_match,
Felipe Balbi Oct. 23, 2019, 6:31 a.m. UTC | #6
Hi,

(sorry for the long delay)

Vicente Bergas <vicencb@gmail.com> writes:

> On Tuesday, August 27, 2019 2:16:20 PM CEST, Vicente Bergas wrote:
>> On Tuesday, August 27, 2019 1:53:04 PM CEST, Felipe Balbi wrote:
>>> Hi,
>>> 
>>> Vicente Bergas <vicencb@gmail.com> writes:
>>>> On Saturday, August 17, 2019 7:41:40 PM CEST, Vicente Bergas wrote:
>>>>> Otherwise the device keeps writing to memory after kexec and disturbs
>>>>> the next kernel.
>> ...
>>> 
>>> why don't you just have shutdown use the same exact function as remove?
>>> Frankly, though, I still don't fully understand what's going wrong
>>> here. Why is the device still alive during kexec?
>>> 
>>> cheers
>>
>> Hi Felipe,
>> the remove and shutdown functions have different prototypes, so
>> shutdown is wrapping remove.
>> Would it be preferable to cast remove as shutdown?
>>
>> The issue with kexec is that the device is being used during the livetime
>> of the first kernel. When the first kernel executes kexec it calls the
>> shutdown function of drivers (instead of remove). Because of this the dwc3
>> device keeps doing things like DMA.
>> While the second kernel is taking over, it gets its memory corrupted with
>> such DMA accesses from the device. When the second kernel reaches the point
>> of taking over the dwc3 device, re-initializes it, but it is already too
>> late. Still worse, if the second kernel did not have the dwc3 driver, it
>> would get endless memory corruptions.
>> All in all, devices that can do DMA need to stop doing it on shutdown.
>>
>> Regards,
>>  Vicenç.
>
> Hi,
> please, can you provide some feedback on this?

I meant something like this:

diff --git a/drivers/usb/dwc3/dwc3-of-simple.c b/drivers/usb/dwc3/dwc3-of-simple.c
index bdac3e7d7b18..e64754be47b4 100644
--- a/drivers/usb/dwc3/dwc3-of-simple.c
+++ b/drivers/usb/dwc3/dwc3-of-simple.c
@@ -110,12 +110,9 @@ static int dwc3_of_simple_probe(struct platform_device *pdev)
 	return ret;
 }
 
-static int dwc3_of_simple_remove(struct platform_device *pdev)
+static void __dwc3_of_simple_teardown(struct dwc3_of_simple *simple)
 {
-	struct dwc3_of_simple	*simple = platform_get_drvdata(pdev);
-	struct device		*dev = &pdev->dev;
-
-	of_platform_depopulate(dev);
+	of_platform_depopulate(simple->dev);
 
 	clk_bulk_disable_unprepare(simple->num_clocks, simple->clks);
 	clk_bulk_put_all(simple->num_clocks, simple->clks);
@@ -126,13 +123,27 @@ static int dwc3_of_simple_remove(struct platform_device *pdev)
 
 	reset_control_put(simple->resets);
 
-	pm_runtime_disable(dev);
-	pm_runtime_put_noidle(dev);
-	pm_runtime_set_suspended(dev);
+	pm_runtime_disable(simple->dev);
+	pm_runtime_put_noidle(simple->dev);
+	pm_runtime_set_suspended(simple->dev);
+}
+
+static int dwc3_of_simple_remove(struct platform_device *pdev)
+{
+	struct dwc3_of_simple	*simple = platform_get_drvdata(pdev);
+
+	__dwc3_of_simple_teardown(simple);
 
 	return 0;
 }
 
+static void dwc3_of_simple_shutdown(struct platform_device *pdev)
+{
+	struct dwc3_of_simple	*simple = platform_get_drvdata(pdev);
+
+	__dwc3_of_simple_teardown(simple);
+}
+
 static int __maybe_unused dwc3_of_simple_runtime_suspend(struct device *dev)
 {
 	struct dwc3_of_simple	*simple = dev_get_drvdata(dev);
@@ -190,6 +201,7 @@ MODULE_DEVICE_TABLE(of, of_dwc3_simple_match);
 static struct platform_driver dwc3_of_simple_driver = {
 	.probe		= dwc3_of_simple_probe,
 	.remove		= dwc3_of_simple_remove,
+	.shutdown	= dwc3_of_simple_shutdown,
 	.driver		= {
 		.name	= "dwc3-of-simple",
 		.of_match_table = of_dwc3_simple_match,

Can you make sure it works as you intended?
Vicente Bergas Oct. 24, 2019, 12:15 p.m. UTC | #7
On Wednesday, October 23, 2019 8:31:21 AM CEST, Felipe Balbi wrote:
> Hi,
>
> (sorry for the long delay)
>
> Vicente Bergas <vicencb@gmail.com> writes:
>
>> On Tuesday, August 27, 2019 2:16:20 PM CEST, Vicente Bergas wrote: ...
>
> I meant something like this:
>
> diff --git a/drivers/usb/dwc3/dwc3-of-simple.c 
> b/drivers/usb/dwc3/dwc3-of-simple.c
> index bdac3e7d7b18..e64754be47b4 100644
> --- a/drivers/usb/dwc3/dwc3-of-simple.c
> +++ b/drivers/usb/dwc3/dwc3-of-simple.c
> @@ -110,12 +110,9 @@ static int dwc3_of_simple_probe(struct 
> platform_device *pdev)
>  	return ret;
>  }
>  
> -static int dwc3_of_simple_remove(struct platform_device *pdev)
> +static void __dwc3_of_simple_teardown(struct dwc3_of_simple *simple)
>  {
> -	struct dwc3_of_simple	*simple = platform_get_drvdata(pdev);
> -	struct device		*dev = &pdev->dev;
> -
> -	of_platform_depopulate(dev);
> +	of_platform_depopulate(simple->dev);
>  
>  	clk_bulk_disable_unprepare(simple->num_clocks, simple->clks);
>  	clk_bulk_put_all(simple->num_clocks, simple->clks);
> @@ -126,13 +123,27 @@ static int dwc3_of_simple_remove(struct 
> platform_device *pdev)
>  
>  	reset_control_put(simple->resets);
>  
> -	pm_runtime_disable(dev);
> -	pm_runtime_put_noidle(dev);
> -	pm_runtime_set_suspended(dev);
> +	pm_runtime_disable(simple->dev);
> +	pm_runtime_put_noidle(simple->dev);
> +	pm_runtime_set_suspended(simple->dev);
> +}
> +
> +static int dwc3_of_simple_remove(struct platform_device *pdev)
> +{
> +	struct dwc3_of_simple	*simple = platform_get_drvdata(pdev);
> +
> +	__dwc3_of_simple_teardown(simple);
>  
>  	return 0;
>  }
>  
> +static void dwc3_of_simple_shutdown(struct platform_device *pdev)
> +{
> +	struct dwc3_of_simple	*simple = platform_get_drvdata(pdev);
> +
> +	__dwc3_of_simple_teardown(simple);
> +}
> +
>  static int __maybe_unused 
> dwc3_of_simple_runtime_suspend(struct device *dev)
>  {
>  	struct dwc3_of_simple	*simple = dev_get_drvdata(dev);
> @@ -190,6 +201,7 @@ MODULE_DEVICE_TABLE(of, of_dwc3_simple_match);
>  static struct platform_driver dwc3_of_simple_driver = {
>  	.probe		= dwc3_of_simple_probe,
>  	.remove		= dwc3_of_simple_remove,
> +	.shutdown	= dwc3_of_simple_shutdown,
>  	.driver		= {
>  		.name	= "dwc3-of-simple",
>  		.of_match_table = of_dwc3_simple_match,
>
> Can you make sure it works as you intended?

Hi Felipe,
just applied your approach to v5.3.7 and it is working properly.

Thanks,
  Vicente.
Felipe Balbi Oct. 25, 2019, 10:25 a.m. UTC | #8
hi,

Vicente Bergas <vicencb@gmail.com> writes:

> On Wednesday, October 23, 2019 8:31:21 AM CEST, Felipe Balbi wrote:
>> Hi,
>>
>> (sorry for the long delay)
>>
>> Vicente Bergas <vicencb@gmail.com> writes:
>>
>>> On Tuesday, August 27, 2019 2:16:20 PM CEST, Vicente Bergas wrote: ...
>>
>> I meant something like this:
>>
>> diff --git a/drivers/usb/dwc3/dwc3-of-simple.c 
>> b/drivers/usb/dwc3/dwc3-of-simple.c
>> index bdac3e7d7b18..e64754be47b4 100644
>> --- a/drivers/usb/dwc3/dwc3-of-simple.c
>> +++ b/drivers/usb/dwc3/dwc3-of-simple.c
>> @@ -110,12 +110,9 @@ static int dwc3_of_simple_probe(struct 
>> platform_device *pdev)
>>  	return ret;
>>  }
>>  
>> -static int dwc3_of_simple_remove(struct platform_device *pdev)
>> +static void __dwc3_of_simple_teardown(struct dwc3_of_simple *simple)
>>  {
>> -	struct dwc3_of_simple	*simple = platform_get_drvdata(pdev);
>> -	struct device		*dev = &pdev->dev;
>> -
>> -	of_platform_depopulate(dev);
>> +	of_platform_depopulate(simple->dev);
>>  
>>  	clk_bulk_disable_unprepare(simple->num_clocks, simple->clks);
>>  	clk_bulk_put_all(simple->num_clocks, simple->clks);
>> @@ -126,13 +123,27 @@ static int dwc3_of_simple_remove(struct 
>> platform_device *pdev)
>>  
>>  	reset_control_put(simple->resets);
>>  
>> -	pm_runtime_disable(dev);
>> -	pm_runtime_put_noidle(dev);
>> -	pm_runtime_set_suspended(dev);
>> +	pm_runtime_disable(simple->dev);
>> +	pm_runtime_put_noidle(simple->dev);
>> +	pm_runtime_set_suspended(simple->dev);
>> +}
>> +
>> +static int dwc3_of_simple_remove(struct platform_device *pdev)
>> +{
>> +	struct dwc3_of_simple	*simple = platform_get_drvdata(pdev);
>> +
>> +	__dwc3_of_simple_teardown(simple);
>>  
>>  	return 0;
>>  }
>>  
>> +static void dwc3_of_simple_shutdown(struct platform_device *pdev)
>> +{
>> +	struct dwc3_of_simple	*simple = platform_get_drvdata(pdev);
>> +
>> +	__dwc3_of_simple_teardown(simple);
>> +}
>> +
>>  static int __maybe_unused 
>> dwc3_of_simple_runtime_suspend(struct device *dev)
>>  {
>>  	struct dwc3_of_simple	*simple = dev_get_drvdata(dev);
>> @@ -190,6 +201,7 @@ MODULE_DEVICE_TABLE(of, of_dwc3_simple_match);
>>  static struct platform_driver dwc3_of_simple_driver = {
>>  	.probe		= dwc3_of_simple_probe,
>>  	.remove		= dwc3_of_simple_remove,
>> +	.shutdown	= dwc3_of_simple_shutdown,
>>  	.driver		= {
>>  		.name	= "dwc3-of-simple",
>>  		.of_match_table = of_dwc3_simple_match,
>>
>> Can you make sure it works as you intended?
>
> Hi Felipe,
> just applied your approach to v5.3.7 and it is working properly.

Do you want to send it as a formal patch or shall I do it?
Vicente Bergas Oct. 25, 2019, 10:42 a.m. UTC | #9
On Friday, October 25, 2019 12:25:17 PM CEST, Felipe Balbi wrote:
> hi,
>
> Vicente Bergas <vicencb@gmail.com> writes:
>
>> On Wednesday, October 23, 2019 8:31:21 AM CEST, Felipe Balbi wrote: ...
>
> Do you want to send it as a formal patch or shall I do it?

All yours. Thank you for reviewing and proposing this solution.

Regards,
  Vicente.
diff mbox series

Patch

diff --git a/drivers/usb/dwc3/dwc3-of-simple.c b/drivers/usb/dwc3/dwc3-of-simple.c
index bdac3e7d7b18..d5fd45c64901 100644
--- a/drivers/usb/dwc3/dwc3-of-simple.c
+++ b/drivers/usb/dwc3/dwc3-of-simple.c
@@ -133,6 +133,11 @@  static int dwc3_of_simple_remove(struct platform_device *pdev)
 	return 0;
 }
 
+static void dwc3_of_simple_shutdown(struct platform_device *pdev)
+{
+	dwc3_of_simple_remove(pdev);
+}
+
 static int __maybe_unused dwc3_of_simple_runtime_suspend(struct device *dev)
 {
 	struct dwc3_of_simple	*simple = dev_get_drvdata(dev);
@@ -190,6 +195,7 @@  MODULE_DEVICE_TABLE(of, of_dwc3_simple_match);
 static struct platform_driver dwc3_of_simple_driver = {
 	.probe		= dwc3_of_simple_probe,
 	.remove		= dwc3_of_simple_remove,
+	.shutdown	= dwc3_of_simple_shutdown,
 	.driver		= {
 		.name	= "dwc3-of-simple",
 		.of_match_table = of_dwc3_simple_match,