
[RFC] media: venus: Fix NULL pointer dereference in core selection

Message ID 20200601150314.RFC.1.I1e40623bbe8fa43ff1415fc273cba66503b9b048@changeid (mailing list archive)
State New, archived
Series [RFC] media: venus: Fix NULL pointer dereference in core selection

Commit Message

Doug Anderson June 1, 2020, 10:03 p.m. UTC
The newly-introduced function min_loaded_core() iterates over all of
the venus instances and tries to figure out how much load each instance
is putting on each core.  Not all instances, however, might be fully
initialized.  Specifically the "codec_freq_data" is initialized as
part of vdec_queue_setup(), but an instance may already be in the list
of all instances before that time.

Let's band-aid this by checking to see if codec_freq_data is NULL
before dereferencing.

NOTE: without this fix I was running into a crash.  Specifically there
were two venus instances.  One was doing start_streaming.  The other
was midway through queue_setup but hadn't yet gotten to initting
"codec_freq_data".

Fixes: eff82f79c562 ("media: venus: introduce core selection")
Signed-off-by: Douglas Anderson <dianders@chromium.org>
---
I'm not massively happy about this commit but it's the best I could
come up with without being much more of an expert in the venus codec.
If someone has a better patch then please just consider this one to be
a bug report and feel free to submit a better fix!  :-)

In general I wonder a little bit about whether it's safe to be peeking
at all the instances without grabbing the "inst->lock" on each one.  I
guess it is since we do it both here and in load_scale_v4() but I
don't know why.

One thought I had was that we could fully avoid accessing the other
instances, at least in min_loaded_core(), by just keeping track of
"core1_load" and "core2_load" in "struct venus_core".  Whenever we add
a new instance we could add to the relevant variables and whenever we
release an instance we could remove.  Such a change seems cleaner but
would require someone to test to make sure we didn't miss any case
(AKA we always properly added/removed our load from the globals).
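
A minimal user-space sketch of the bookkeeping idea described above, for
illustration only.  All names here (model_core, model_add_load,
model_remove_load, model_min_loaded_core) are hypothetical and are not
part of the venus driver, core IDs are assumed to be 1-based, and the
locking a real driver would need around these updates is left out.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical model: two cores, addressed by 1-based core id. */
enum { MODEL_NUM_CORES = 2 };

struct model_core {
	/* Stands in for the proposed "core1_load" and "core2_load". */
	uint32_t core_load[MODEL_NUM_CORES];
};

/* Account for an instance's load when it is assigned to a core. */
void model_add_load(struct model_core *core, unsigned int coreid,
		    uint32_t load)
{
	assert(coreid >= 1 && coreid <= MODEL_NUM_CORES);
	core->core_load[coreid - 1] += load;
}

/* Drop an instance's load when it is released. */
void model_remove_load(struct model_core *core, unsigned int coreid,
		       uint32_t load)
{
	assert(coreid >= 1 && coreid <= MODEL_NUM_CORES);
	/* Trips if an add/remove pair was missed somewhere. */
	assert(core->core_load[coreid - 1] >= load);
	core->core_load[coreid - 1] -= load;
}

/* Pick the least-loaded core without touching any other instance. */
unsigned int model_min_loaded_core(const struct model_core *core,
				   uint32_t *min_load)
{
	unsigned int best = 1;
	unsigned int id;

	for (id = 2; id <= MODEL_NUM_CORES; id++) {
		if (core->core_load[id - 1] < core->core_load[best - 1])
			best = id;
	}
	*min_load = core->core_load[best - 1];
	return best;
}
```

The assert in model_remove_load() is one cheap way to catch the
"missed a case" problem during testing: any release without a matching
add would show up as an underflow.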

 drivers/media/platform/qcom/venus/pm_helpers.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Doug Anderson June 2, 2020, 3:39 a.m. UTC | #1
Hi,

On Mon, Jun 1, 2020 at 3:03 PM Douglas Anderson <dianders@chromium.org> wrote:
>
> The newly-introduced function min_loaded_core() iterates over all of
> the venus instances and tries to figure out how much load each instance
> is putting on each core.  Not all instances, however, might be fully
> initialized.  Specifically the "codec_freq_data" is initialized as
> part of vdec_queue_setup(), but an instance may already be in the list
> of all instances before that time.
>
> Let's band-aid this by checking to see if codec_freq_data is NULL
> before dereferencing.
>
> NOTE: without this fix I was running into a crash.  Specifically there
> were two venus instances.  One was doing start_streaming.  The other
> was midway through queue_setup but hadn't yet gotten to initting
> "codec_freq_data".
>
> Fixes: eff82f79c562 ("media: venus: introduce core selection")
> Signed-off-by: Douglas Anderson <dianders@chromium.org>
> ---
> I'm not massively happy about this commit but it's the best I could
> come up with without being much more of an expert in the venus codec.
> If someone has a better patch then please just consider this one to be
> a bug report and feel free to submit a better fix!  :-)
>
> In general I wonder a little bit about whether it's safe to be peeking
> at all the instances without grabbing the "inst->lock" on each one.  I
> guess it is since we do it both here and in load_scale_v4() but I
> don't know why.
>
> One thought I had was that we could fully avoid accessing the other
> instances, at least in min_loaded_core(), by just keeping track of
> "core1_load" and "core2_load" in "struct venus_core".  Whenever we add
> a new instance we could add to the relevant variables and whenever we
> release an instance we could remove.  Such a change seems cleaner but
> would require someone to test to make sure we didn't miss any case
> (AKA we always properly added/removed our load from the globals).
>
>  drivers/media/platform/qcom/venus/pm_helpers.c | 2 ++
>  1 file changed, 2 insertions(+)

This fixes the same crash as the patch:

https://lore.kernel.org/r/1588314480-22409-1-git-send-email-mansur@codeaurora.org

Stanimir Varbanov June 22, 2020, 11:51 a.m. UTC | #2
Hi Doug,

Thanks for the fix and sorry for the late reply.

On 6/2/20 6:39 AM, Doug Anderson wrote:
> Hi,
> 
> On Mon, Jun 1, 2020 at 3:03 PM Douglas Anderson <dianders@chromium.org> wrote:
>>
>> The newly-introduced function min_loaded_core() iterates over all of
>> the venus instances and tries to figure out how much load each instance
>> is putting on each core.  Not all instances, however, might be fully
>> initialized.  Specifically the "codec_freq_data" is initialized as
>> part of vdec_queue_setup(), but an instance may already be in the list
>> of all instances before that time.
>>
>> Let's band-aid this by checking to see if codec_freq_data is NULL
>> before dereferencing.
>>
>> NOTE: without this fix I was running into a crash.  Specifically there
>> were two venus instances.  One was doing start_streaming.  The other
>> was midway through queue_setup but hadn't yet gotten to initting
>> "codec_freq_data".
>>
>> Fixes: eff82f79c562 ("media: venus: introduce core selection")
>> Signed-off-by: Douglas Anderson <dianders@chromium.org>
>> ---
>> I'm not massively happy about this commit but it's the best I could
>> come up with without being much more of an expert in the venus codec.
>> If someone has a better patch then please just consider this one to be
>> a bug report and feel free to submit a better fix!  :-)
>>
>> In general I wonder a little bit about whether it's safe to be peeking
>> at all the instances without grabbing the "inst->lock" on each one.  I
>> guess it is since we do it both here and in load_scale_v4() but I
>> don't know why.
>>
>> One thought I had was that we could fully avoid accessing the other
>> instances, at least in min_loaded_core(), by just keeping track of
>> "core1_load" and "core2_load" in "struct venus_core".  Whenever we add
>> a new instance we could add to the relevant variables and whenever we
>> release an instance we could remove.  Such a change seems cleaner but
>> would require someone to test to make sure we didn't miss any case
>> (AKA we always properly added/removed our load from the globals).

Thanks for the suggestion (I also thought about something similar).  I
will try to cook something.

>>
>>  drivers/media/platform/qcom/venus/pm_helpers.c | 2 ++
>>  1 file changed, 2 insertions(+)
> 
> This fixes the same crash as the patch:
> 
> https://lore.kernel.org/r/1588314480-22409-1-git-send-email-mansur@codeaurora.org
> 

I'm going to take that approach (the patch linked above) because it
takes the state of the instance into account. An instance can be
opened/created without streaming starting in the near future, so it
would not be correct to include its load in the calculations.
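
As a rough illustration of what filtering on instance state might look
like, here is a small user-space sketch.  The types and names
(model_state, model_inst, model_streaming_load) are hypothetical and do
not come from the driver or from the linked patch; the only point is
that instances which have not started streaming contribute no load.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical instance states, reduced to the two that matter here. */
enum model_state {
	MODEL_INST_OPEN,	/* created, streaming not started */
	MODEL_INST_STREAMING,	/* streaming has started */
};

struct model_inst {
	enum model_state state;
	unsigned int coreid;
	uint32_t load;
};

/* Sum the load on one core, counting only streaming instances. */
uint32_t model_streaming_load(const struct model_inst *insts, size_t n,
			      unsigned int coreid)
{
	uint32_t total = 0;
	size_t i;

	assert(insts != NULL || n == 0);

	for (i = 0; i < n; i++) {
		/* An opened-but-idle instance puts no load on the core. */
		if (insts[i].state != MODEL_INST_STREAMING)
			continue;
		if (insts[i].coreid == coreid)
			total += insts[i].load;
	}
	return total;
}
```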

Patch

diff --git a/drivers/media/platform/qcom/venus/pm_helpers.c b/drivers/media/platform/qcom/venus/pm_helpers.c
index abf93158857b..a1d998f62cf2 100644
--- a/drivers/media/platform/qcom/venus/pm_helpers.c
+++ b/drivers/media/platform/qcom/venus/pm_helpers.c
@@ -496,6 +496,8 @@  min_loaded_core(struct venus_inst *inst, u32 *min_coreid, u32 *min_load)
 	list_for_each_entry(inst_pos, &core->instances, list) {
 		if (inst_pos == inst)
 			continue;
+		if (!inst_pos->clk_data.codec_freq_data)
+			continue;
 		vpp_freq = inst_pos->clk_data.codec_freq_data->vpp_freq;
 		coreid = inst_pos->clk_data.core_id;
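
The effect of the two added lines can be modeled outside the kernel.
The sketch below is a simplified stand-in, not the driver's code: the
struct layouts, the array in place of the kernel list, the mbs_per_sec
field and the function name model_load_on_core are all assumptions made
for illustration.  It shows an instance whose codec_freq_data is still
NULL being skipped instead of dereferenced.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct model_freq_data {
	uint32_t vpp_freq;
};

struct model_inst {
	/* NULL until the (modeled) queue setup has run. */
	const struct model_freq_data *codec_freq_data;
	unsigned int coreid;
	uint32_t mbs_per_sec;
};

/* Sum the load other instances put on one core, skipping unready ones. */
uint64_t model_load_on_core(const struct model_inst *insts, size_t n,
			    const struct model_inst *self,
			    unsigned int coreid)
{
	uint64_t total = 0;
	size_t i;

	assert(insts != NULL || n == 0);

	for (i = 0; i < n; i++) {
		const struct model_inst *pos = &insts[i];

		if (pos == self)
			continue;
		/* Same guard as the patch: not initialized yet, skip it. */
		if (!pos->codec_freq_data)
			continue;
		if (pos->coreid == coreid)
			total += (uint64_t)pos->mbs_per_sec *
				 pos->codec_freq_data->vpp_freq;
	}
	return total;
}
```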