[v5,2/3] venus: Add a debugfs file for SSR trigger

Message ID 20200730095350.13925-3-stanimir.varbanov@linaro.org
State New, archived
Series Venus dynamic debug

Commit Message

Stanimir Varbanov July 30, 2020, 9:53 a.m. UTC
SSR (SubSystem Restart) is used to simulate an error on the FW side of
Venus. The following trigger types are supported: fatal error, divide
by zero, and watchdog IRQ.

Signed-off-by: Stanimir Varbanov <stanimir.varbanov@linaro.org>
---
 drivers/media/platform/qcom/venus/dbgfs.c | 30 +++++++++++++++++++++++
 1 file changed, 30 insertions(+)

Comments

Stephen Boyd Aug. 11, 2020, 9:49 p.m. UTC | #1
Quoting Stanimir Varbanov (2020-07-30 02:53:49)
> The SSR (SubSystem Restart) is used to simulate an error on FW
> side of Venus. We support following type of triggers - fatal error,
> div by zero and watchdog IRQ.

Can this use the fault injection framework instead of custom debugfs?
See Documentation/fault-injection/.
Dikshita Agarwal Sept. 15, 2021, 9:13 a.m. UTC | #2
Hi Stephen,

Reviving the discussion on this change as we need to pull this in.

As per your suggestion, I explored the fault injection framework to
implement this functionality, but I don't think it meets our
requirements.

We need a way to trigger a subsystem restart from the client side; it is
not something the driver initiates.

The fault injection framework, by contrast, lets the driver inject a
fault when a specific event occurs, e.g. a page allocation failure or a
memory access failure.

So, IMO, we will have to stick with custom debugfs.

Please feel free to correct me if my understanding of the framework is
wrong.

Thanks,
Dikshita

Stephen Boyd Sept. 15, 2021, 7:39 p.m. UTC | #3
Quoting dikshita@codeaurora.org (2021-09-15 02:13:09)
> Hi Stephen,
>
> Reviving the discussion on this change as we need to pull this in.
>
> As per your suggestion, I explored the fault injection framework to
> implement this functionality.
> But I don't think that meets our requirements.
>
> We need a way to trigger subsystem restart from the client-side, it's
> not derived from the driver.

Just to confirm, this is all for debugging purposes right?

>
> while fault injection framework enables the driver to trigger an
> injection
> when a specific event occurs for eg: page allocation failure or memory
> access failure.
>
> So, IMO, we will have to use custom debugfs only.

Can you use DECLARE_FAULT_ATTR()? Or do you need it to be active instead
of passive, i.e. it shouldn't wait for should_fail() to return true, but
should actively trigger something on the remoteproc?

>
> Please feel free to correct me in case my understanding of the framework
> is wrong.
>

I presume the fault injection framework could get a new feature that
lets the fault be injected immediately upon writing the debugfs file.
My goal is to consolidate this sort of logic into one place and then put
it behind some config option that distros can disable so the kernel
isn't bloated with debug features that end users will never care about.
Dikshita Agarwal Sept. 16, 2021, 6:29 a.m. UTC | #4
On 2021-09-16 01:09, Stephen Boyd wrote:
> Quoting dikshita@codeaurora.org (2021-09-15 02:13:09)
>> Hi Stephen,
>> 
>> Reviving the discussion on this change as we need to pull this in.
>> 
>> As per your suggestion, I explored the fault injection framework to
>> implement this functionality.
>> But I don't think that meets our requirements.
>> 
>> We need a way to trigger subsystem restart from the client-side, it's
>> not derived from the driver.
> 
> Just to confirm, this is all for debugging purposes right?
> 
Yes, correct, this is for debugging purposes. We need it to simulate an
error on the FW side.
In a normal scenario, when the FW runs into an error, it raises a sys
error, which kicks off a sequence of commands that restart the system.
With this feature we simulate that error on the FW, i.e. we force the FW
to run into an error.
>> 
>> while fault injection framework enables the driver to trigger an
>> injection
>> when a specific event occurs for eg: page allocation failure or memory
>> access failure.
>> 
>> So, IMO, we will have to use custom debugfs only.
> 
> Can you use DECLARE_FAULT_ATTR()? Or you need it to be active instead 
> of
> passive, i.e. it shouldn't wait for should_fail() to return true, but
> actively trigger something on the remoteproc?
> 

Yes, it doesn't need to wait for should_fail() to return true.
The client/user should be able to trigger this subsystem restart (SSR)
at any point while a session is running. It's totally client-driven.

>> 
>> Please feel free to correct me in case my understanding of the 
>> framework
>> is wrong.
>> 
> 
> I presume the fault injection framework could get a new feature that
> lets the fault be injected immediately upon writing the debugfs file.
> My goal is to consolidate this sort of logic into one place and then 
> put
> it behind some config option that distros can disable so the kernel
> isn't bloated with debug features that end users will never care about.
Stephen Boyd Sept. 17, 2021, 6:18 a.m. UTC | #5
Quoting dikshita@codeaurora.org (2021-09-15 23:29:36)
> On 2021-09-16 01:09, Stephen Boyd wrote:
> > Quoting dikshita@codeaurora.org (2021-09-15 02:13:09)
> >>
> >> So, IMO, we will have to use custom debugfs only.
> >
> > Can you use DECLARE_FAULT_ATTR()? Or you need it to be active instead
> > of
> > passive, i.e. it shouldn't wait for should_fail() to return true, but
> > actively trigger something on the remoteproc?
> >
>
> yes, it doesn't need to wait for should_fail() to return true.
> the client/user should be able to trigger this subsystem restart(SSR) at
> any point of time
> when a session is running. It's totally client-driven.
>
> >>
> >> Please feel free to correct me in case my understanding of the
> >> framework
> >> is wrong.
> >>
> >
> > I presume the fault injection framework could get a new feature that
> > lets the fault be injected immediately upon writing the debugfs file.
> > My goal is to consolidate this sort of logic into one place and then
> > put
> > it behind some config option that distros can disable so the kernel
> > isn't bloated with debug features that end users will never care about.

So can you modify the fault injection framework to support direct
injection instead of statistical failures?
Dikshita Agarwal Sept. 20, 2021, 5:48 a.m. UTC | #6
On 2021-09-17 11:48, Stephen Boyd wrote:
> Quoting dikshita@codeaurora.org (2021-09-15 23:29:36)
>> On 2021-09-16 01:09, Stephen Boyd wrote:
>> > Quoting dikshita@codeaurora.org (2021-09-15 02:13:09)
>> >>
>> >> So, IMO, we will have to use custom debugfs only.
>> >
>> > Can you use DECLARE_FAULT_ATTR()? Or you need it to be active instead
>> > of
>> > passive, i.e. it shouldn't wait for should_fail() to return true, but
>> > actively trigger something on the remoteproc?
>> >
>> 
>> yes, it doesn't need to wait for should_fail() to return true.
>> the client/user should be able to trigger this subsystem restart(SSR) 
>> at
>> any point of time
>> when a session is running. It's totally client-driven.
>> 
>> >>
>> >> Please feel free to correct me in case my understanding of the
>> >> framework
>> >> is wrong.
>> >>
>> >
>> > I presume the fault injection framework could get a new feature that
>> > lets the fault be injected immediately upon writing the debugfs file.
>> > My goal is to consolidate this sort of logic into one place and then
>> > put
>> > it behind some config option that distros can disable so the kernel
>> > isn't bloated with debug features that end users will never care about.
> 
> So you can modify fault injection framework to support direct injection
> instead of statistical failures?

I am not sure how to do that. Could you please give me more info?
Also, how is this more beneficial than using debugfs?

Patch

diff --git a/drivers/media/platform/qcom/venus/dbgfs.c b/drivers/media/platform/qcom/venus/dbgfs.c
index 782d54ac1b8f..f95b7b1febe5 100644
--- a/drivers/media/platform/qcom/venus/dbgfs.c
+++ b/drivers/media/platform/qcom/venus/dbgfs.c
@@ -9,10 +9,40 @@ 
 
 extern int venus_fw_debug;
 
+static int trigger_ssr_open(struct inode *inode, struct file *file)
+{
+	file->private_data = inode->i_private;
+	return 0;
+}
+
+static ssize_t trigger_ssr_write(struct file *filp, const char __user *buf,
+				 size_t count, loff_t *ppos)
+{
+	struct venus_core *core = filp->private_data;
+	u32 ssr_type;
+	int ret;
+
+	ret = kstrtou32_from_user(buf, count, 4, &ssr_type);
+	if (ret)
+		return ret;
+
+	ret = hfi_core_trigger_ssr(core, ssr_type);
+	if (ret < 0)
+		return ret;
+
+	return count;
+}
+
+static const struct file_operations ssr_fops = {
+	.open = trigger_ssr_open,
+	.write = trigger_ssr_write,
+};
+
 void venus_dbgfs_init(struct venus_core *core)
 {
 	core->root = debugfs_create_dir("venus", NULL);
 	debugfs_create_x32("fw_level", 0644, core->root, &venus_fw_debug);
+	debugfs_create_file("trigger_ssr", 0200, core->root, core, &ssr_fops);
 }
 
 void venus_dbgfs_deinit(struct venus_core *core)