Message ID | 20211004092100.1.Ic90a5ebd44c75db963112be167a03cc96f9fb249@changeid (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm/edid: Fix crash with zero/invalid EDID | expand |
Hi Douglas, On Mon, Oct 4, 2021 at 6:22 PM Douglas Anderson <dianders@chromium.org> wrote: > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of > the EDID") I broke out reading the base block of the EDID to its own > function. Unfortunately, when I did that I messed up the handling when > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or > when we went through 4 loops and didn't get a valid EDID. Specifically > I needed to pass the broken EDID to connector_bad_edid() but now I was > passing an error-pointer. > > Let's re-jigger things so we can pass the bad EDID in properly. > > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") > Reported-by: kernel test robot <oliver.sang@intel.com> > Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> > Signed-off-by: Douglas Anderson <dianders@chromium.org> The crash is was seeing is gone, so Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Gr{oetje,eeting}s, Geert
On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote: > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of > the EDID") I broke out reading the base block of the EDID to its own > function. Unfortunately, when I did that I messed up the handling when > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or > when we went through 4 loops and didn't get a valid EDID. Specifically > I needed to pass the broken EDID to connector_bad_edid() but now I was > passing an error-pointer. > > Let's re-jigger things so we can pass the bad EDID in properly. > > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") > Reported-by: kernel test robot <oliver.sang@intel.com> > Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> > Signed-off-by: Douglas Anderson <dianders@chromium.org> > --- > > drivers/gpu/drm/drm_edid.c | 27 +++++++++++---------------- > 1 file changed, 11 insertions(+), 16 deletions(-) > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c > index 9b19eee0e1b4..9c9463ec5465 100644 > --- a/drivers/gpu/drm/drm_edid.c > +++ b/drivers/gpu/drm/drm_edid.c > @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector) > } > EXPORT_SYMBOL(drm_add_override_edid_modes); > > -static struct edid *drm_do_get_edid_base_block( > +static struct edid *drm_do_get_edid_base_block(struct drm_connector *connector, > int (*get_edid_block)(void *data, u8 *buf, unsigned int block, > size_t len), > - void *data, bool *edid_corrupt, int *null_edid_counter) > + void *data) > { > - int i; > + int *null_edid_counter = connector ? &connector->null_edid_counter : NULL; > + bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL; > void *edid; > + int i; > > edid = kmalloc(EDID_LENGTH, GFP_KERNEL); > if (edid == NULL) > @@ -1941,9 +1943,8 @@ static struct edid *drm_do_get_edid_base_block( > return edid; > > carp: > - kfree(edid); > - return ERR_PTR(-EINVAL); > - > + if (connector) > + connector_bad_edid(connector, edid, 1); BTW I believe connector_bad_edid() itself is broken since commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). Before we've even allocated the memory for the extension blocks that code now assumes edid[0x7e] is to be 100% trusted and goes and calculates the checksum on a block based on that. So that's likely going to be pointing somewhere beyond the base block into memory we've not even allocated. So anyone who wanted could craft a bogus EDID and maybe get something interesting to happen. Would be good if someone could fix that while at it. Or just revert the offending commit if there is no simple solution immediately in sight. The fact that we're parsing entirely untrustworthy crap in the kernel always worries me. Either we need super careful review of all relevant code, and/or we need to think about moving the parser out of the kernel. I was considering playing around with the usermode helper stuff. IIRC there is a way to embed the userspace binary into the kernel and just fire it up when needed. But so far it's been the usual -ENOTIME for me...
Hi, On Mon, Oct 4, 2021 at 10:14 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote: > > Hi Douglas, > > On Mon, Oct 4, 2021 at 6:22 PM Douglas Anderson <dianders@chromium.org> wrote: > > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of > > the EDID") I broke out reading the base block of the EDID to its own > > function. Unfortunately, when I did that I messed up the handling when > > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or > > when we went through 4 loops and didn't get a valid EDID. Specifically > > I needed to pass the broken EDID to connector_bad_edid() but now I was > > passing an error-pointer. > > > > Let's re-jigger things so we can pass the bad EDID in properly. > > > > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") > > Reported-by: kernel test robot <oliver.sang@intel.com> > > Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > The crash is was seeing is gone, so > Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> Thanks for testing! I'll plan to apply tomorrow morning (California time) to balance between giving folks a chance to yell at me for my patch and the urgency of fixing the breakage. -Doug
[AMD Official Use Only] Hi Ville: Yhea, it is pretty old change from two years ago, and it is no long valid anymore. Please simply drop it. Regards, Jerry > -----Original Message----- > From: Ville Syrjälä <ville.syrjala@linux.intel.com> > Sent: October 4, 2021 3:45 PM > To: Douglas Anderson <dianders@chromium.org> > Cc: dri-devel@lists.freedesktop.org; geert@linux-m68k.org; > oliver.sang@intel.com; Daniel Vetter <daniel@ffwll.ch>; David Airlie > <airlied@linux.ie>; Jani Nikula <jani.nikula@intel.com>; Linus Walleij > <linus.walleij@linaro.org>; Maarten Lankhorst > <maarten.lankhorst@linux.intel.com>; Maxime Ripard > <mripard@kernel.org>; Sam Ravnborg <sam@ravnborg.org>; Thomas > Zimmermann <tzimmermann@suse.de>; linux-kernel@vger.kernel.org; Zuo, > Jerry <Jerry.Zuo@amd.com>; Wentland, Harry <Harry.Wentland@amd.com>; > Siqueira, Rodrigo <Rodrigo.Siqueira@amd.com> > Subject: Re: [PATCH] drm/edid: Fix crash with zero/invalid EDID > > On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote: > > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of > > the EDID") I broke out reading the base block of the EDID to its own > > function. Unfortunately, when I did that I messed up the handling when > > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or > > when we went through 4 loops and didn't get a valid EDID. Specifically > > I needed to pass the broken EDID to connector_bad_edid() but now I was > > passing an error-pointer. > > > > Let's re-jigger things so we can pass the bad EDID in properly. > > > > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the > > EDID") > > Reported-by: kernel test robot <oliver.sang@intel.com> > > Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > --- > > > > drivers/gpu/drm/drm_edid.c | 27 +++++++++++---------------- > > 1 file changed, 11 insertions(+), 16 deletions(-) > > > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c > > index 9b19eee0e1b4..9c9463ec5465 100644 > > --- a/drivers/gpu/drm/drm_edid.c > > +++ b/drivers/gpu/drm/drm_edid.c > > @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct > > drm_connector *connector) } > > EXPORT_SYMBOL(drm_add_override_edid_modes); > > > > -static struct edid *drm_do_get_edid_base_block( > > +static struct edid *drm_do_get_edid_base_block(struct drm_connector > > +*connector, > > int (*get_edid_block)(void *data, u8 *buf, unsigned int block, > > size_t len), > > - void *data, bool *edid_corrupt, int *null_edid_counter) > > + void *data) > > { > > - int i; > > + int *null_edid_counter = connector ? &connector- > >null_edid_counter : NULL; > > + bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL; > > void *edid; > > + int i; > > > > edid = kmalloc(EDID_LENGTH, GFP_KERNEL); > > if (edid == NULL) > > @@ -1941,9 +1943,8 @@ static struct edid > *drm_do_get_edid_base_block( > > return edid; > > > > carp: > > - kfree(edid); > > - return ERR_PTR(-EINVAL); > > - > > + if (connector) > > + connector_bad_edid(connector, edid, 1); > > BTW I believe connector_bad_edid() itself is broken since commit > e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption > test"). Before we've even allocated the memory for the extension blocks > that code now assumes edid[0x7e] is to be 100% trusted and goes and > calculates the checksum on a block based on that. So that's likely going to be > pointing somewhere beyond the base block into memory we've not even > allocated. So anyone who wanted could craft a bogus EDID and maybe get > something interesting to happen. > > Would be good if someone could fix that while at it. Or just revert the > offending commit if there is no simple solution immediately in sight. > > The fact that we're parsing entirely untrustworthy crap in the kernel always > worries me. Either we need super careful review of all relevant code, and/or > we need to think about moving the parser out of the kernel. > I was considering playing around with the usermode helper stuff. IIRC there > is a way to embed the userspace binary into the kernel and just fire it up > when needed. But so far it's been the usual -ENOTIME for me... > > -- > Ville Syrjälä > Intel
Hi, On Mon, Oct 4, 2021 at 5:40 PM Doug Anderson <dianders@chromium.org> wrote: > > Hi, > > On Mon, Oct 4, 2021 at 10:14 AM Geert Uytterhoeven <geert@linux-m68k.org> wrote: > > > > Hi Douglas, > > > > On Mon, Oct 4, 2021 at 6:22 PM Douglas Anderson <dianders@chromium.org> wrote: > > > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of > > > the EDID") I broke out reading the base block of the EDID to its own > > > function. Unfortunately, when I did that I messed up the handling when > > > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or > > > when we went through 4 loops and didn't get a valid EDID. Specifically > > > I needed to pass the broken EDID to connector_bad_edid() but now I was > > > passing an error-pointer. > > > > > > Let's re-jigger things so we can pass the bad EDID in properly. > > > > > > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") > > > Reported-by: kernel test robot <oliver.sang@intel.com> > > > Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> > > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > > > The crash is was seeing is gone, so > > Tested-by: Geert Uytterhoeven <geert+renesas@glider.be> > > Thanks for testing! I'll plan to apply tomorrow morning (California > time) to balance between giving folks a chance to yell at me for my > patch and the urgency of fixing the breakage. Ah, doh! I can't push until I can get a review tag from someone. As soon as I see one then I'll give it a push. -Doug
Hi, On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry <Jerry.Zuo@amd.com> wrote: > > > BTW I believe connector_bad_edid() itself is broken since commit > > e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption > > test"). Before we've even allocated the memory for the extension blocks > > that code now assumes edid[0x7e] is to be 100% trusted and goes and > > calculates the checksum on a block based on that. So that's likely going to be > > pointing somewhere beyond the base block into memory we've not even > > allocated. So anyone who wanted could craft a bogus EDID and maybe get > > something interesting to happen. > > > > Would be good if someone could fix that while at it. Or just revert the > > offending commit if there is no simple solution immediately in sight. > > > > The fact that we're parsing entirely untrustworthy crap in the kernel always > > worries me. Either we need super careful review of all relevant code, and/or > > we need to think about moving the parser out of the kernel. > > I was considering playing around with the usermode helper stuff. IIRC there > > is a way to embed the userspace binary into the kernel and just fire it up > > when needed. But so far it's been the usual -ENOTIME for me... > > > [AMD Official Use Only] > > Hi Ville: > > Yhea, it is pretty old change from two years ago, and it is no long valid anymore. Please simply drop it. > > Regards, > Jerry I've cut out other bits from this email and changed the subject line since I think this is an issue unrelated to the one my original patch was fixing. I don't actually know a ton about DP compliance testing, but I attempted to try to be helpful and revert commit e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid corruption test"). It wasn't too hard to deal with the conflicts in the revert itself, but then things didn't compile because there are two places that use `real_edid_checksum` and that goes away if I revert the patch. I've made an attempt to fix the problem by just adding a bounds check. Perhaps you can see if that looks good to you: https://lore.kernel.org/r/20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1295928d@changeid -Doug
[AMD Official Use Only] > -----Original Message----- > From: Doug Anderson <dianders@chromium.org> > Sent: October 5, 2021 11:14 AM > To: Zuo, Jerry <Jerry.Zuo@amd.com> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>; dri- > devel@lists.freedesktop.org; geert@linux-m68k.org; oliver.sang@intel.com; > Daniel Vetter <daniel@ffwll.ch>; David Airlie <airlied@linux.ie>; Jani Nikula > <jani.nikula@intel.com>; Linus Walleij <linus.walleij@linaro.org>; Maarten > Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard > <mripard@kernel.org>; Sam Ravnborg <sam@ravnborg.org>; Thomas > Zimmermann <tzimmermann@suse.de>; linux-kernel@vger.kernel.org; > Wentland, Harry <Harry.Wentland@amd.com>; Siqueira, Rodrigo > <Rodrigo.Siqueira@amd.com>; Kuogee Hsieh <khsieh@codeaurora.org> > Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: > Fix crash with zero/invalid EDID) > > Hi, > > On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry <Jerry.Zuo@amd.com> wrote: > > > > > BTW I believe connector_bad_edid() itself is broken since commit > > > e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid > > > corruption test"). Before we've even allocated the memory for the > > > extension blocks that code now assumes edid[0x7e] is to be 100% > > > trusted and goes and calculates the checksum on a block based on > > > that. So that's likely going to be pointing somewhere beyond the > > > base block into memory we've not even allocated. So anyone who > > > wanted could craft a bogus EDID and maybe get something interesting to > happen. > > > > > > Would be good if someone could fix that while at it. Or just revert > > > the offending commit if there is no simple solution immediately in sight. > > > > > > The fact that we're parsing entirely untrustworthy crap in the > > > kernel always worries me. Either we need super careful review of all > > > relevant code, and/or we need to think about moving the parser out of > the kernel. > > > I was considering playing around with the usermode helper stuff. > > > IIRC there is a way to embed the userspace binary into the kernel > > > and just fire it up when needed. But so far it's been the usual -ENOTIME > for me... > > > > > [AMD Official Use Only] > > > > Hi Ville: > > > > Yhea, it is pretty old change from two years ago, and it is no long valid > anymore. Please simply drop it. > > > > Regards, > > Jerry > > I've cut out other bits from this email and changed the subject line since I > think this is an issue unrelated to the one my original patch was fixing. > > I don't actually know a ton about DP compliance testing, but I attempted to > try to be helpful and revert commit e11f5bd8228f ("drm: > Add support for DP 1.4 Compliance edid corruption test"). It wasn't too hard > to deal with the conflicts in the revert itself, but then things didn't compile > because there are two places that use `real_edid_checksum` and that goes > away if I revert the patch. > > I've made an attempt to fix the problem by just adding a bounds check. > Perhaps you can see if that looks good to you: > > https://nam11.safelinks.protection.outlook.com/?url=https%3A%2F%2Flore. > kernel.org%2Fr%2F20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1 > 295928d%40changeid&data=04%7C01%7CJerry.Zuo%40amd.com%7C90 > b948659454400cedd308d98812c339%7C3dd8961fe4884e608e11a82d994e183d > %7C0%7C0%7C637690436453163864%7CUnknown%7CTWFpbGZsb3d8eyJWIj > oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1 > 000&sdata=OtSngWlYyDc1NbNSgAeALqN3nF%2Bnw08nJ068cpAKZJk%3 > D&reserved=0 > > -Doug The patch used for DP1.4 compliance edid corruption test. Let me double check if edid corruption test could be passed without the patch.
On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote: > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of > the EDID") I broke out reading the base block of the EDID to its own > function. Unfortunately, when I did that I messed up the handling when > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or > when we went through 4 loops and didn't get a valid EDID. Specifically > I needed to pass the broken EDID to connector_bad_edid() but now I was > passing an error-pointer. > > Let's re-jigger things so we can pass the bad EDID in properly. > > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") > Reported-by: kernel test robot <oliver.sang@intel.com> > Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> > Signed-off-by: Douglas Anderson <dianders@chromium.org> A bit of historical fallout zone this part of the code. So not the easiest thing to read in general. But looks like what you have here should work. Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> > --- > > drivers/gpu/drm/drm_edid.c | 27 +++++++++++---------------- > 1 file changed, 11 insertions(+), 16 deletions(-) > > diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c > index 9b19eee0e1b4..9c9463ec5465 100644 > --- a/drivers/gpu/drm/drm_edid.c > +++ b/drivers/gpu/drm/drm_edid.c > @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector) > } > EXPORT_SYMBOL(drm_add_override_edid_modes); > > -static struct edid *drm_do_get_edid_base_block( > +static struct edid *drm_do_get_edid_base_block(struct drm_connector *connector, > int (*get_edid_block)(void *data, u8 *buf, unsigned int block, > size_t len), > - void *data, bool *edid_corrupt, int *null_edid_counter) > + void *data) > { > - int i; > + int *null_edid_counter = connector ? &connector->null_edid_counter : NULL; > + bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL; > void *edid; > + int i; > > edid = kmalloc(EDID_LENGTH, GFP_KERNEL); > if (edid == NULL) > @@ -1941,9 +1943,8 @@ static struct edid *drm_do_get_edid_base_block( > return edid; > > carp: > - kfree(edid); > - return ERR_PTR(-EINVAL); > - > + if (connector) > + connector_bad_edid(connector, edid, 1); > out: > kfree(edid); > return NULL; > @@ -1982,14 +1983,9 @@ struct edid *drm_do_get_edid(struct drm_connector *connector, > if (override) > return override; > > - edid = (u8 *)drm_do_get_edid_base_block(get_edid_block, data, > - &connector->edid_corrupt, > - &connector->null_edid_counter); > - if (IS_ERR_OR_NULL(edid)) { > - if (IS_ERR(edid)) > - connector_bad_edid(connector, edid, 1); > + edid = (u8 *)drm_do_get_edid_base_block(connector, get_edid_block, data); > + if (!edid) > return NULL; > - } > > /* if there's no extensions or no connector, we're done */ > valid_extensions = edid[0x7e]; > @@ -2142,14 +2138,13 @@ u32 drm_edid_get_panel_id(struct i2c_adapter *adapter) > struct edid *edid; > u32 panel_id; > > - edid = drm_do_get_edid_base_block(drm_do_probe_ddc_edid, adapter, > - NULL, NULL); > + edid = drm_do_get_edid_base_block(NULL, drm_do_probe_ddc_edid, adapter); > > /* > * There are no manufacturer IDs of 0, so if there is a problem reading > * the EDID then we'll just return 0. > */ > - if (IS_ERR_OR_NULL(edid)) > + if (!edid) > return 0; > > panel_id = edid_extract_panel_id(edid); > -- > 2.33.0.800.g4c38ced690-goog
On 2021-10-05 11:25, Zuo, Jerry wrote: > [AMD Official Use Only] > >> -----Original Message----- >> From: Doug Anderson <dianders@chromium.org> >> Sent: October 5, 2021 11:14 AM >> To: Zuo, Jerry <Jerry.Zuo@amd.com> >> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>; dri- >> devel@lists.freedesktop.org; geert@linux-m68k.org; oliver.sang@intel.com; >> Daniel Vetter <daniel@ffwll.ch>; David Airlie <airlied@linux.ie>; Jani Nikula >> <jani.nikula@intel.com>; Linus Walleij <linus.walleij@linaro.org>; Maarten >> Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard >> <mripard@kernel.org>; Sam Ravnborg <sam@ravnborg.org>; Thomas >> Zimmermann <tzimmermann@suse.de>; linux-kernel@vger.kernel.org; >> Wentland, Harry <Harry.Wentland@amd.com>; Siqueira, Rodrigo >> <Rodrigo.Siqueira@amd.com>; Kuogee Hsieh <khsieh@codeaurora.org> >> Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: >> Fix crash with zero/invalid EDID) >> >> Hi, >> >> On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry <Jerry.Zuo@amd.com> wrote: >>> >>>> BTW I believe connector_bad_edid() itself is broken since commit >>>> e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid >>>> corruption test"). Before we've even allocated the memory for the >>>> extension blocks that code now assumes edid[0x7e] is to be 100% >>>> trusted and goes and calculates the checksum on a block based on >>>> that. So that's likely going to be pointing somewhere beyond the >>>> base block into memory we've not even allocated. So anyone who >>>> wanted could craft a bogus EDID and maybe get something interesting to >> happen. >>>> >>>> Would be good if someone could fix that while at it. Or just revert >>>> the offending commit if there is no simple solution immediately in sight. >>>> >>>> The fact that we're parsing entirely untrustworthy crap in the >>>> kernel always worries me. Either we need super careful review of all >>>> relevant code, and/or we need to think about moving the parser out of >> the kernel. >>>> I was considering playing around with the usermode helper stuff. >>>> IIRC there is a way to embed the userspace binary into the kernel >>>> and just fire it up when needed. But so far it's been the usual -ENOTIME >> for me... >>>> >>> [AMD Official Use Only] >>> >>> Hi Ville: >>> >>> Yhea, it is pretty old change from two years ago, and it is no long valid >> anymore. Please simply drop it. >>> >>> Regards, >>> Jerry >> >> I've cut out other bits from this email and changed the subject line since I >> think this is an issue unrelated to the one my original patch was fixing. >> >> I don't actually know a ton about DP compliance testing, but I attempted to >> try to be helpful and revert commit e11f5bd8228f ("drm: >> Add support for DP 1.4 Compliance edid corruption test"). It wasn't too hard >> to deal with the conflicts in the revert itself, but then things didn't compile >> because there are two places that use `real_edid_checksum` and that goes >> away if I revert the patch. >> >> I've made an attempt to fix the problem by just adding a bounds check. >> Perhaps you can see if that looks good to you: >> >> https://lore. kernel.org%2Fr%2F20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1 >> 295928d%40changeid&data=04%7C01%7CJerry.Zuo%40amd.com%7C90 >> b948659454400cedd308d98812c339%7C3dd8961fe4884e608e11a82d994e183d >> %7C0%7C0%7C637690436453163864%7CUnknown%7CTWFpbGZsb3d8eyJWIj >> oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1 >> 000&sdata=OtSngWlYyDc1NbNSgAeALqN3nF%2Bnw08nJ068cpAKZJk%3 >> D&reserved=0 >> >> -Doug > > The patch used for DP1.4 compliance edid corruption test. Let me double check if edid corruption test could be passed without the patch. > Can you try the CTS test with Doug's fix? https://patchwork.freedesktop.org/patch/457537/ Harry
Hi, On Tue, Oct 5, 2021 at 9:43 AM Ville Syrjälä <ville.syrjala@linux.intel.com> wrote: > > On Mon, Oct 04, 2021 at 09:21:27AM -0700, Douglas Anderson wrote: > > In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of > > the EDID") I broke out reading the base block of the EDID to its own > > function. Unfortunately, when I did that I messed up the handling when > > drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or > > when we went through 4 loops and didn't get a valid EDID. Specifically > > I needed to pass the broken EDID to connector_bad_edid() but now I was > > passing an error-pointer. > > > > Let's re-jigger things so we can pass the bad EDID in properly. > > > > Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") > > Reported-by: kernel test robot <oliver.sang@intel.com> > > Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> > > Signed-off-by: Douglas Anderson <dianders@chromium.org> > > A bit of historical fallout zone this part of the code. So > not the easiest thing to read in general. But looks like what > you have here should work. > > Reviewed-by: Ville Syrjälä <ville.syrjala@linux.intel.com> Thanks! Pushed to drm-misc/drm-misc-next: e7bd95a7ed4e drm/edid: Fix crash with zero/invalid EDID
[AMD Official Use Only] > -----Original Message----- > From: Wentland, Harry <Harry.Wentland@amd.com> > Sent: October 5, 2021 2:04 PM > To: Zuo, Jerry <Jerry.Zuo@amd.com>; Doug Anderson > <dianders@chromium.org> > Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>; dri- > devel@lists.freedesktop.org; geert@linux-m68k.org; oliver.sang@intel.com; > Daniel Vetter <daniel@ffwll.ch>; David Airlie <airlied@linux.ie>; Jani Nikula > <jani.nikula@intel.com>; Linus Walleij <linus.walleij@linaro.org>; Maarten > Lankhorst <maarten.lankhorst@linux.intel.com>; Maxime Ripard > <mripard@kernel.org>; Sam Ravnborg <sam@ravnborg.org>; Thomas > Zimmermann <tzimmermann@suse.de>; linux-kernel@vger.kernel.org; > Siqueira, Rodrigo <Rodrigo.Siqueira@amd.com>; Kuogee Hsieh > <khsieh@codeaurora.org> > Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: > Fix crash with zero/invalid EDID) > > > > On 2021-10-05 11:25, Zuo, Jerry wrote: > > [AMD Official Use Only] > > > >> -----Original Message----- > >> From: Doug Anderson <dianders@chromium.org> > >> Sent: October 5, 2021 11:14 AM > >> To: Zuo, Jerry <Jerry.Zuo@amd.com> > >> Cc: Ville Syrjälä <ville.syrjala@linux.intel.com>; dri- > >> devel@lists.freedesktop.org; geert@linux-m68k.org; > >> oliver.sang@intel.com; Daniel Vetter <daniel@ffwll.ch>; David Airlie > >> <airlied@linux.ie>; Jani Nikula <jani.nikula@intel.com>; Linus > >> Walleij <linus.walleij@linaro.org>; Maarten Lankhorst > >> <maarten.lankhorst@linux.intel.com>; Maxime Ripard > >> <mripard@kernel.org>; Sam Ravnborg <sam@ravnborg.org>; Thomas > >> Zimmermann <tzimmermann@suse.de>; linux-kernel@vger.kernel.org; > >> Wentland, Harry <Harry.Wentland@amd.com>; Siqueira, Rodrigo > >> <Rodrigo.Siqueira@amd.com>; Kuogee Hsieh <khsieh@codeaurora.org> > >> Subject: Re: connector_bad_edid() is broken (was: Re: [PATCH] drm/edid: > >> Fix crash with zero/invalid EDID) > >> > >> Hi, > >> > >> On Tue, Oct 5, 2021 at 6:33 AM Zuo, Jerry <Jerry.Zuo@amd.com> wrote: > >>> > >>>> BTW I believe connector_bad_edid() itself is broken since commit > >>>> e11f5bd8228f ("drm: Add support for DP 1.4 Compliance edid > >>>> corruption test"). Before we've even allocated the memory for the > >>>> extension blocks that code now assumes edid[0x7e] is to be 100% > >>>> trusted and goes and calculates the checksum on a block based on > >>>> that. So that's likely going to be pointing somewhere beyond the > >>>> base block into memory we've not even allocated. So anyone who > >>>> wanted could craft a bogus EDID and maybe get something interesting > >>>> to > >> happen. > >>>> > >>>> Would be good if someone could fix that while at it. Or just revert > >>>> the offending commit if there is no simple solution immediately in sight. > >>>> > >>>> The fact that we're parsing entirely untrustworthy crap in the > >>>> kernel always worries me. Either we need super careful review of > >>>> all relevant code, and/or we need to think about moving the parser > >>>> out of > >> the kernel. > >>>> I was considering playing around with the usermode helper stuff. > >>>> IIRC there is a way to embed the userspace binary into the kernel > >>>> and just fire it up when needed. But so far it's been the usual > >>>> -ENOTIME > >> for me... > >>>> > >>> [AMD Official Use Only] > >>> > >>> Hi Ville: > >>> > >>> Yhea, it is pretty old change from two years ago, and it is no > >>> long valid > >> anymore. Please simply drop it. > >>> > >>> Regards, > >>> Jerry > >> > >> I've cut out other bits from this email and changed the subject line > >> since I think this is an issue unrelated to the one my original patch was > fixing. > >> > >> I don't actually know a ton about DP compliance testing, but I > >> attempted to try to be helpful and revert commit e11f5bd8228f ("drm: > >> Add support for DP 1.4 Compliance edid corruption test"). It wasn't > >> too hard to deal with the conflicts in the revert itself, but then > >> things didn't compile because there are two places that use > >> `real_edid_checksum` and that goes away if I revert the patch. > >> > >> I've made an attempt to fix the problem by just adding a bounds check. > >> Perhaps you can see if that looks good to you: > >> > >> https://lore. > kernel.org%2Fr%2F20211005081022.1.Ib059f9c23c2611cb5a9d760e7d0a700c1 > >> > 295928d%40changeid&data=04%7C01%7CJerry.Zuo%40amd.com%7C90 > >> > b948659454400cedd308d98812c339%7C3dd8961fe4884e608e11a82d994e183d > >> %7C0%7C0%7C637690436453163864%7CUnknown%7CTWFpbGZsb3d8eyJ > WIj > >> > oiMC4wLjAwMDAiLCJQIjoiV2luMzIiLCJBTiI6Ik1haWwiLCJXVCI6Mn0%3D%7C1 > >> > 000&sdata=OtSngWlYyDc1NbNSgAeALqN3nF%2Bnw08nJ068cpAKZJk%3 > >> D&reserved=0 > >> > >> -Doug > > > > The patch used for DP1.4 compliance edid corruption test. Let me double > check if edid corruption test could be passed without the patch. > > > > Can you try the CTS test with Doug's fix? > > https://patchwork.freedesktop.org/patch/457537/ > > Harry Yes, I'll give a try on that.
diff --git a/drivers/gpu/drm/drm_edid.c b/drivers/gpu/drm/drm_edid.c index 9b19eee0e1b4..9c9463ec5465 100644 --- a/drivers/gpu/drm/drm_edid.c +++ b/drivers/gpu/drm/drm_edid.c @@ -1911,13 +1911,15 @@ int drm_add_override_edid_modes(struct drm_connector *connector) } EXPORT_SYMBOL(drm_add_override_edid_modes); -static struct edid *drm_do_get_edid_base_block( +static struct edid *drm_do_get_edid_base_block(struct drm_connector *connector, int (*get_edid_block)(void *data, u8 *buf, unsigned int block, size_t len), - void *data, bool *edid_corrupt, int *null_edid_counter) + void *data) { - int i; + int *null_edid_counter = connector ? &connector->null_edid_counter : NULL; + bool *edid_corrupt = connector ? &connector->edid_corrupt : NULL; void *edid; + int i; edid = kmalloc(EDID_LENGTH, GFP_KERNEL); if (edid == NULL) @@ -1941,9 +1943,8 @@ static struct edid *drm_do_get_edid_base_block( return edid; carp: - kfree(edid); - return ERR_PTR(-EINVAL); - + if (connector) + connector_bad_edid(connector, edid, 1); out: kfree(edid); return NULL; @@ -1982,14 +1983,9 @@ struct edid *drm_do_get_edid(struct drm_connector *connector, if (override) return override; - edid = (u8 *)drm_do_get_edid_base_block(get_edid_block, data, - &connector->edid_corrupt, - &connector->null_edid_counter); - if (IS_ERR_OR_NULL(edid)) { - if (IS_ERR(edid)) - connector_bad_edid(connector, edid, 1); + edid = (u8 *)drm_do_get_edid_base_block(connector, get_edid_block, data); + if (!edid) return NULL; - } /* if there's no extensions or no connector, we're done */ valid_extensions = edid[0x7e]; @@ -2142,14 +2138,13 @@ u32 drm_edid_get_panel_id(struct i2c_adapter *adapter) struct edid *edid; u32 panel_id; - edid = drm_do_get_edid_base_block(drm_do_probe_ddc_edid, adapter, - NULL, NULL); + edid = drm_do_get_edid_base_block(NULL, drm_do_probe_ddc_edid, adapter); /* * There are no manufacturer IDs of 0, so if there is a problem reading * the EDID then we'll just return 0. */ - if (IS_ERR_OR_NULL(edid)) + if (!edid) return 0; panel_id = edid_extract_panel_id(edid);
In the commit bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") I broke out reading the base block of the EDID to its own function. Unfortunately, when I did that I messed up the handling when drm_edid_is_zero() indicated that we had an EDID that was all 0x00 or when we went through 4 loops and didn't get a valid EDID. Specifically I needed to pass the broken EDID to connector_bad_edid() but now I was passing an error-pointer. Let's re-jigger things so we can pass the bad EDID in properly. Fixes: bac9c2948224 ("drm/edid: Break out reading block 0 of the EDID") Reported-by: kernel test robot <oliver.sang@intel.com> Reported-by: Geert Uytterhoeven <geert@linux-m68k.org> Signed-off-by: Douglas Anderson <dianders@chromium.org> --- drivers/gpu/drm/drm_edid.c | 27 +++++++++++---------------- 1 file changed, 11 insertions(+), 16 deletions(-)