diff mbox series

[v4,3/8] optee: fix tee out of memory failure seen during kexec reboot

Message ID 20210610210913.536081-4-tyhicks@linux.microsoft.com (mailing list archive)
State New, archived
Headers show
Series tee: Improve support for kexec and kdump | expand

Commit Message

Tyler Hicks June 10, 2021, 9:09 p.m. UTC
From: Allen Pais <apais@linux.microsoft.com>

The following out of memory errors are seen on kexec reboot
from the optee core.

[    0.368428] tee_bnxt_fw optee-clnt0: tee_shm_alloc failed
[    0.368461] tee_bnxt_fw: probe of optee-clnt0 failed with error -22

tee_shm_release() is not invoked on dma shm buffer.

Implement .shutdown() method to handle the release of the buffers
correctly.

More info:
https://github.com/OP-TEE/optee_os/issues/3637

Signed-off-by: Allen Pais <apais@linux.microsoft.com>
Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
---
 drivers/tee/optee/core.c | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

Comments

Jens Wiklander June 11, 2021, 9:11 a.m. UTC | #1
On Thu, Jun 10, 2021 at 11:09 PM Tyler Hicks
<tyhicks@linux.microsoft.com> wrote:
>
> From: Allen Pais <apais@linux.microsoft.com>
>
> The following out of memory errors are seen on kexec reboot
> from the optee core.
>
> [    0.368428] tee_bnxt_fw optee-clnt0: tee_shm_alloc failed
> [    0.368461] tee_bnxt_fw: probe of optee-clnt0 failed with error -22
>
> tee_shm_release() is not invoked on dma shm buffer.
>
> Implement .shutdown() method to handle the release of the buffers
> correctly.
>
> More info:
> https://github.com/OP-TEE/optee_os/issues/3637
>
> Signed-off-by: Allen Pais <apais@linux.microsoft.com>
> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>

Do we really need this considering the patch "optee: Refuse to load
the driver under the kdump kernel"?

Jens
Tyler Hicks June 11, 2021, 12:53 p.m. UTC | #2
On 2021-06-11 11:11:33, Jens Wiklander wrote:
> On Thu, Jun 10, 2021 at 11:09 PM Tyler Hicks
> <tyhicks@linux.microsoft.com> wrote:
> >
> > From: Allen Pais <apais@linux.microsoft.com>
> >
> > The following out of memory errors are seen on kexec reboot
> > from the optee core.
> >
> > [    0.368428] tee_bnxt_fw optee-clnt0: tee_shm_alloc failed
> > [    0.368461] tee_bnxt_fw: probe of optee-clnt0 failed with error -22
> >
> > tee_shm_release() is not invoked on dma shm buffer.
> >
> > Implement .shutdown() method to handle the release of the buffers
> > correctly.
> >
> > More info:
> > https://github.com/OP-TEE/optee_os/issues/3637
> >
> > Signed-off-by: Allen Pais <apais@linux.microsoft.com>
> > Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> 
> Do we really need this considering the patch "optee: Refuse to load
> the driver under the kdump kernel"?

Yes. That patch fixes boot hangs when all of the OP-TEE threads were in
the suspended state at the time of a kernel panic. The kexec into the
kdump kernel after a panic is an "emergency" kexec that doesn't even
call .shutdown hooks. There's no way for the OP-TEE driver to clean up
after itself.

This patch disables the shm cache (and unregisters the shm buffers)
during a normal kexec from one perfectly working kernel into a new
kernel. This is required because the new kernel will not be able to
handle the virtual addresses that were cached under the old kernel. The
new kernel has an entirely different memory layout and the old addresses
point to unmapped memory or memory that's mapped but probably not a TEE
shm.

Tyler

> 
> Jens
>
Jens Wiklander June 14, 2021, 7:21 a.m. UTC | #3
On Fri, Jun 11, 2021 at 07:53:26AM -0500, Tyler Hicks wrote:
> On 2021-06-11 11:11:33, Jens Wiklander wrote:
> > On Thu, Jun 10, 2021 at 11:09 PM Tyler Hicks
> > <tyhicks@linux.microsoft.com> wrote:
> > >
> > > From: Allen Pais <apais@linux.microsoft.com>
> > >
> > > The following out of memory errors are seen on kexec reboot
> > > from the optee core.
> > >
> > > [    0.368428] tee_bnxt_fw optee-clnt0: tee_shm_alloc failed
> > > [    0.368461] tee_bnxt_fw: probe of optee-clnt0 failed with error -22
> > >
> > > tee_shm_release() is not invoked on dma shm buffer.
> > >
> > > Implement .shutdown() method to handle the release of the buffers
> > > correctly.
> > >
> > > More info:
> > > https://github.com/OP-TEE/optee_os/issues/3637
> > >
> > > Signed-off-by: Allen Pais <apais@linux.microsoft.com>
> > > Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> > 
> > Do we really need this considering the patch "optee: Refuse to load
> > the driver under the kdump kernel"?
> 
> Yes. That patch fixes boot hangs when all of the OP-TEE threads were in
> the suspended state at the time of a kernel panic. The kexec into the
> kdump kernel after a panic is an "emergency" kexec that doesn't even
> call .shutdown hooks. There's no way for the OP-TEE driver to clean up
> after itself.
> 
> This patch disables the shm cache (and unregisters the shm buffers)
> during a normal kexec from one perfectly working kernel into a new
> kernel. This is required because the new kernel will not be able to
> handle the virtual addresses that were cached under the old kernel. The
> new kernel has an entirely different memory layout and the old addresses
> point to unmapped memory or memory that's mapped but probably not a TEE
> shm.

Got it, thanks.

Jens
Jens Wiklander June 14, 2021, 7:22 a.m. UTC | #4
On Thu, Jun 10, 2021 at 04:09:08PM -0500, Tyler Hicks wrote:
> From: Allen Pais <apais@linux.microsoft.com>
> 
> The following out of memory errors are seen on kexec reboot
> from the optee core.
> 
> [    0.368428] tee_bnxt_fw optee-clnt0: tee_shm_alloc failed
> [    0.368461] tee_bnxt_fw: probe of optee-clnt0 failed with error -22
> 
> tee_shm_release() is not invoked on dma shm buffer.
> 
> Implement .shutdown() method to handle the release of the buffers
> correctly.
> 
> More info:
> https://github.com/OP-TEE/optee_os/issues/3637
> 
> Signed-off-by: Allen Pais <apais@linux.microsoft.com>
> Reviewed-by: Tyler Hicks <tyhicks@linux.microsoft.com>
> ---
>  drivers/tee/optee/core.c | 20 ++++++++++++++++++++
>  1 file changed, 20 insertions(+)

Reviewed-by: Jens Wiklander <jens.wiklander@linaro.org>
diff mbox series

Patch

diff --git a/drivers/tee/optee/core.c b/drivers/tee/optee/core.c
index 5288cd767d82..0987074d7ed0 100644
--- a/drivers/tee/optee/core.c
+++ b/drivers/tee/optee/core.c
@@ -573,6 +573,13 @@  static optee_invoke_fn *get_invoke_func(struct device *dev)
 	return ERR_PTR(-EINVAL);
 }
 
+/* optee_remove - Device Removal Routine
+ * @pdev: platform device information struct
+ *
+ * optee_remove is called by platform subsystem to alert the driver
+ * that it should release the device
+ */
+
 static int optee_remove(struct platform_device *pdev)
 {
 	struct optee *optee = platform_get_drvdata(pdev);
@@ -603,6 +610,18 @@  static int optee_remove(struct platform_device *pdev)
 	return 0;
 }
 
+/* optee_shutdown - Device Removal Routine
+ * @pdev: platform device information struct
+ *
+ * platform_shutdown is called by the platform subsystem to alert
+ * the driver that a shutdown, reboot, or kexec is happening and
+ * device must be disabled.
+ */
+static void optee_shutdown(struct platform_device *pdev)
+{
+	optee_disable_shm_cache(platform_get_drvdata(pdev));
+}
+
 static int optee_probe(struct platform_device *pdev)
 {
 	optee_invoke_fn *invoke_fn;
@@ -739,6 +758,7 @@  MODULE_DEVICE_TABLE(of, optee_dt_match);
 static struct platform_driver optee_driver = {
 	.probe  = optee_probe,
 	.remove = optee_remove,
+	.shutdown = optee_shutdown,
 	.driver = {
 		.name = "optee",
 		.of_match_table = optee_dt_match,