Message ID | 20200601150314.RFC.1.I1e40623bbe8fa43ff1415fc273cba66503b9b048@changeid (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [RFC] media: venus: Fix NULL pointer dereference in core selection | expand |
Hi, On Mon, Jun 1, 2020 at 3:03 PM Douglas Anderson <dianders@chromium.org> wrote: > > The newly-introduced function min_loaded_core() iterates over all of > the venus instances an tries to figure out how much load each instance > is putting on each core. Not all instances, however, might be fully > initialized. Specifically the "codec_freq_data" is initialized as > part of vdec_queue_setup(), but an instance may already be in the list > of all instances before that time. > > Let's band-aid this by checking to see if codec_freq_data is NULL > before dereferencing. > > NOTE: without this fix I was running into a crash. Specifically there > were two venus instances. One was doing start_streaming. The other > was midway through queue_setup but hadn't yet gotten to initting > "codec_freq_data". > > Fixes: eff82f79c562 ("media: venus: introduce core selection") > Signed-off-by: Douglas Anderson <dianders@chromium.org> > --- > I'm not massively happy about this commit but it's the best I could > come up with without being much more of an expert in the venus codec. > If someone has a better patch then please just consider this one to be > a bug report and feel free to submit a better fix! :-) > > In general I wonder a little bit about whether it's safe to be peeking > at all the instances without grabbing the "inst->lock" on each one. I > guess it is since we do it both here and in load_scale_v4() but I > don't know why. > > One thought I had was that we could fully avoid accessing the other > instances, at least in min_loaded_core(), by just keeping track of > "core1_load" and "core2_load" in "struct venus_core". Whenever we add > a new instance we could add to the relevant variables and whenever we > release an instance we could remove. Such a change seems cleaner but > would require someone to test to make sure we didn't miss any case > (AKA we always properly added/removed our load from the globals). > > drivers/media/platform/qcom/venus/pm_helpers.c | 2 ++ > 1 file changed, 2 insertions(+) This fixes the same crash as the patch: https://lore.kernel.org/r/1588314480-22409-1-git-send-email-mansur@codeaurora.org
Hi Doug, Thanks for the fix and sorry for the late reply. On 6/2/20 6:39 AM, Doug Anderson wrote: > Hi, > > On Mon, Jun 1, 2020 at 3:03 PM Douglas Anderson <dianders@chromium.org> wrote: >> >> The newly-introduced function min_loaded_core() iterates over all of >> the venus instances an tries to figure out how much load each instance >> is putting on each core. Not all instances, however, might be fully >> initialized. Specifically the "codec_freq_data" is initialized as >> part of vdec_queue_setup(), but an instance may already be in the list >> of all instances before that time. >> >> Let's band-aid this by checking to see if codec_freq_data is NULL >> before dereferencing. >> >> NOTE: without this fix I was running into a crash. Specifically there >> were two venus instances. One was doing start_streaming. The other >> was midway through queue_setup but hadn't yet gotten to initting >> "codec_freq_data". >> >> Fixes: eff82f79c562 ("media: venus: introduce core selection") >> Signed-off-by: Douglas Anderson <dianders@chromium.org> >> --- >> I'm not massively happy about this commit but it's the best I could >> come up with without being much more of an expert in the venus codec. >> If someone has a better patch then please just consider this one to be >> a bug report and feel free to submit a better fix! :-) >> >> In general I wonder a little bit about whether it's safe to be peeking >> at all the instances without grabbing the "inst->lock" on each one. I >> guess it is since we do it both here and in load_scale_v4() but I >> don't know why. >> >> One thought I had was that we could fully avoid accessing the other >> instances, at least in min_loaded_core(), by just keeping track of >> "core1_load" and "core2_load" in "struct venus_core". Whenever we add >> a new instance we could add to the relevant variables and whenever we >> release an instance we could remove. Such a change seems cleaner but >> would require someone to test to make sure we didn't miss any case >> (AKA we always properly added/removed our load from the globals). Thanks for the suggestion (I also thought about something similar). I will try to cook something. >> >> drivers/media/platform/qcom/venus/pm_helpers.c | 2 ++ >> 1 file changed, 2 insertions(+) > > This fixes the same crash as the patch: > > https://lore.kernel.org/r/1588314480-22409-1-git-send-email-mansur@codeaurora.org > I'm going to take this approach because it takes into account the state of the instance. The instance could be opened/created but the streaming could not be started in near future, so it shouldn't be correct to take its load when doing the calculations.
diff --git a/drivers/media/platform/qcom/venus/pm_helpers.c b/drivers/media/platform/qcom/venus/pm_helpers.c index abf93158857b..a1d998f62cf2 100644 --- a/drivers/media/platform/qcom/venus/pm_helpers.c +++ b/drivers/media/platform/qcom/venus/pm_helpers.c @@ -496,6 +496,8 @@ min_loaded_core(struct venus_inst *inst, u32 *min_coreid, u32 *min_load) list_for_each_entry(inst_pos, &core->instances, list) { if (inst_pos == inst) continue; + if (!inst_pos->clk_data.codec_freq_data) + continue; vpp_freq = inst_pos->clk_data.codec_freq_data->vpp_freq; coreid = inst_pos->clk_data.core_id;
The newly-introduced function min_loaded_core() iterates over all of the venus instances an tries to figure out how much load each instance is putting on each core. Not all instances, however, might be fully initialized. Specifically the "codec_freq_data" is initialized as part of vdec_queue_setup(), but an instance may already be in the list of all instances before that time. Let's band-aid this by checking to see if codec_freq_data is NULL before dereferencing. NOTE: without this fix I was running into a crash. Specifically there were two venus instances. One was doing start_streaming. The other was midway through queue_setup but hadn't yet gotten to initting "codec_freq_data". Fixes: eff82f79c562 ("media: venus: introduce core selection") Signed-off-by: Douglas Anderson <dianders@chromium.org> --- I'm not massively happy about this commit but it's the best I could come up with without being much more of an expert in the venus codec. If someone has a better patch then please just consider this one to be a bug report and feel free to submit a better fix! :-) In general I wonder a little bit about whether it's safe to be peeking at all the instances without grabbing the "inst->lock" on each one. I guess it is since we do it both here and in load_scale_v4() but I don't know why. One thought I had was that we could fully avoid accessing the other instances, at least in min_loaded_core(), by just keeping track of "core1_load" and "core2_load" in "struct venus_core". Whenever we add a new instance we could add to the relevant variables and whenever we release an instance we could remove. Such a change seems cleaner but would require someone to test to make sure we didn't miss any case (AKA we always properly added/removed our load from the globals). drivers/media/platform/qcom/venus/pm_helpers.c | 2 ++ 1 file changed, 2 insertions(+)