diff mbox

[BUG] hdlcd gets confused about base address

Message ID 20161121173231.GM1005@e106497-lin.cambridge.arm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Liviu Dudau Nov. 21, 2016, 5:32 p.m. UTC
On Mon, Nov 21, 2016 at 02:03:49PM +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 21, 2016 at 01:50:31PM +0000, Liviu Dudau wrote:
> > On Mon, Nov 21, 2016 at 01:24:19PM +0000, Russell King - ARM Linux wrote:
> > > On Mon, Nov 21, 2016 at 12:56:53PM +0000, Liviu Dudau wrote:
> > > > That is mostly due to the check in hdlcd_crtc_disable() which I should
> > > > remove, I've added it because I was getting a ->disable() hook call
> > > > before any ->enable() was called at startup time. I need to revisit
> > > > this as I remember Daniel was commenting that this was not needed.
> > > 
> > > Removing that test results in:
> > > 
> > > [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:24:crtc-0] flip_done timed out
> > > 
> > > and the kernel hanging, seemingly in an IRQs-off region.
> > 
> > Right, I need to sort this one out. Are you doing these tests out of
> > some tagged branch that I can get in sync with?

Hi Russell,

> 
> No, not yet, and some of the changes I have are rather hacky.
> 
> I do always build my full tree of patches (which is currently running at
> around 320 patches at the moment) but I never share that entire patch
> set.  However, none of those touch i2c (apart from the ones I've recently
> posted) and the only patches touching hdlcd are those I've posted so far.
> 
> Most of the problems I'm finding are through trying basic stuff - I'm not
> doing anything special or unusual to find them, at the moment quite
> literally just starting Xorg up and shutting it down.  For example, the
> above was caused by logging in on serial, running:
> 
> 	Xorg -terminate -verbose
> 
> and then hitting ^C.  (I have lxdm disabled so systemd boots to VT login
> prompts on both the "framebuffer" and serial - I don't want Xorg coming
> up when the machine is booting for its nightly KVM boot tests.)
> 
> I'm afraid that when I try someone elses code, I have a tendency to find
> loads of seemingly trivial bugs when I try putting it through some basic
> tests.

I'm not being able to reproduce your bug conditions. I'm running the following
setup when testing:

- mainline v4.9-rc6
- edited the juno-base.dtsi file to disable the hdlcd@7f600000 and
  hdmi-transmitter@70 nodes to remove the second HDMI output from the test.
- patched tda998x_drv.c to set interlace_allowed = 0, see below why
- modified the hdlcd_crtc.c file with the following patch:

-8<-----------------------------------------------------------------------
->8-----------------------------------------------------------------------

That takes care of the pxlclk refcounting issue you were seeing. I've started
Xorg several times (and yes, I do see EDID checksum error every now and then,
specially when running xrandr). When closing down Xorg I get back the framebuffer
console with the login prompt and no image shifting. My monitor is a TV that
reports that preferred mode is 1080i, however HDLCD and TDA19988 don't talk
propertly with each other to be able to set the interlaced mode correctly, so
I've had to disable support for interlacing mode in tda998x_drv.c and now the
preferred mode that gets picked up is 1920x1200@60Hz.

Please advise on what other steps I can take to try to reproduce this.

P.S: What revision of Juno do you have? Any chance you can capture the start
of the boot process where the firmware component prints the version numbers?

Best regards,
Liviu


> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.

Comments

Russell King (Oracle) Nov. 21, 2016, 5:56 p.m. UTC | #1
On Mon, Nov 21, 2016 at 05:32:32PM +0000, Liviu Dudau wrote:
> On Mon, Nov 21, 2016 at 02:03:49PM +0000, Russell King - ARM Linux wrote:
> > On Mon, Nov 21, 2016 at 01:50:31PM +0000, Liviu Dudau wrote:
> > > On Mon, Nov 21, 2016 at 01:24:19PM +0000, Russell King - ARM Linux wrote:
> > > > On Mon, Nov 21, 2016 at 12:56:53PM +0000, Liviu Dudau wrote:
> > > > > That is mostly due to the check in hdlcd_crtc_disable() which I should
> > > > > remove, I've added it because I was getting a ->disable() hook call
> > > > > before any ->enable() was called at startup time. I need to revisit
> > > > > this as I remember Daniel was commenting that this was not needed.
> > > > 
> > > > Removing that test results in:
> > > > 
> > > > [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:24:crtc-0] flip_done timed out
> > > > 
> > > > and the kernel hanging, seemingly in an IRQs-off region.
> > > 
> > > Right, I need to sort this one out. Are you doing these tests out of
> > > some tagged branch that I can get in sync with?
> 
> Hi Russell,
> 
> > 
> > No, not yet, and some of the changes I have are rather hacky.
> > 
> > I do always build my full tree of patches (which is currently running at
> > around 320 patches at the moment) but I never share that entire patch
> > set.  However, none of those touch i2c (apart from the ones I've recently
> > posted) and the only patches touching hdlcd are those I've posted so far.
> > 
> > Most of the problems I'm finding are through trying basic stuff - I'm not
> > doing anything special or unusual to find them, at the moment quite
> > literally just starting Xorg up and shutting it down.  For example, the
> > above was caused by logging in on serial, running:
> > 
> > 	Xorg -terminate -verbose
> > 
> > and then hitting ^C.  (I have lxdm disabled so systemd boots to VT login
> > prompts on both the "framebuffer" and serial - I don't want Xorg coming
> > up when the machine is booting for its nightly KVM boot tests.)
> > 
> > I'm afraid that when I try someone elses code, I have a tendency to find
> > loads of seemingly trivial bugs when I try putting it through some basic
> > tests.
> 
> I'm not being able to reproduce your bug conditions. I'm running the following
> setup when testing:
> 
> - mainline v4.9-rc6
> - edited the juno-base.dtsi file to disable the hdlcd@7f600000 and
>   hdmi-transmitter@70 nodes to remove the second HDMI output from the test.
> - patched tda998x_drv.c to set interlace_allowed = 0, see below why
> - modified the hdlcd_crtc.c file with the following patch:
> 
> -8<-----------------------------------------------------------------------
> diff --git a/drivers/gpu/drm/arm/hdlcd_crtc.c b/drivers/gpu/drm/arm/hdlcd_crtc.c
> index 48019ae..656dc43 100644
> --- a/drivers/gpu/drm/arm/hdlcd_crtc.c
> +++ b/drivers/gpu/drm/arm/hdlcd_crtc.c
> @@ -156,9 +156,7 @@ static void hdlcd_crtc_disable(struct drm_crtc *crtc)
>  {
>  	struct hdlcd_drm_private *hdlcd = crtc_to_hdlcd_priv(crtc);
>  
> -	if (!crtc->state->active)
> -		return;
> -
> +	drm_crtc_vblank_off(crtc);

Don't you need a drm_crtc_vblank_on() call in the enable function?

>  	hdlcd_write(hdlcd, HDLCD_REG_COMMAND, 0);
>  	clk_disable_unprepare(hdlcd->clk);
>  }
> ->8-----------------------------------------------------------------------
> 
> That takes care of the pxlclk refcounting issue you were seeing. I've started
> Xorg several times (and yes, I do see EDID checksum error every now and then,
> specially when running xrandr). When closing down Xorg I get back the framebuffer
> console with the login prompt and no image shifting.

For me, the image shift was 100% reproducable.  With the above patch
and a call to drm_crtc_vblank_on() in the enable path, it seems to
behave correctly - I can alternately switch between 1920x1080 and
1280x1024 and it behaves correctly.  Indeed, my debug prints show that
the right thing is happening wrt disabling the controller:

[   76.869136] hdlcd_crtc_disable: active 0
[   76.869159] hdlcd_plane_atomic_update: pitch 7680 lines 1080
[   76.888983] hdlcd_plane_atomic_update: pitch 5120 lines 1024
[   76.888995] hdlcd_crtc_enable: active 1 cmd 00000000
[   85.262451] hdlcd_crtc_disable: active 0
[   85.262474] hdlcd_plane_atomic_update: pitch 5120 lines 1024
[   85.286667] hdlcd_plane_atomic_update: pitch 7680 lines 1080
[   85.286679] hdlcd_crtc_enable: active 1 cmd 00000000
[   92.658038] hdlcd_crtc_disable: active 0
[   92.658057] hdlcd_plane_atomic_update: pitch 7680 lines 1080
[   92.680659] hdlcd_plane_atomic_update: pitch 5120 lines 1024
[   92.680668] hdlcd_crtc_enable: active 1 cmd 00000000
[   97.805205] hdlcd_crtc_disable: active 0
[   97.805220] hdlcd_plane_atomic_update: pitch 5120 lines 1024
[   97.834415] hdlcd_plane_atomic_update: pitch 7680 lines 1080
[   97.834423] hdlcd_crtc_enable: active 1 cmd 00000000

> My monitor is a TV that
> reports that preferred mode is 1080i, however HDLCD and TDA19988 don't talk
> propertly with each other to be able to set the interlaced mode correctly, so
> I've had to disable support for interlacing mode in tda998x_drv.c and now the
> preferred mode that gets picked up is 1920x1200@60Hz.

That's more of a generic DRM issue - the CRTC layer doesn't get a
look-in when a connector parses the modes supplied from the display,
so there's no real way for the CRTC layer to apply any kind of
limitations to the available modes, except when a mode is attempted
to be set.

I don't want to see an "interlace" DT property introduced for the
TDA998x, because that's the wrong approach - it would be adding a
property for the needs of the implementation, and not a description
of the hardware.

> Please advise on what other steps I can take to try to reproduce this.

I guess if you've tried and failed to reproduce it, there is something
very specific to my setup which I can't describe.

> P.S: What revision of Juno do you have? Any chance you can capture the start
> of the boot process where the firmware component prints the version numbers?

All I know is that it's a HBI0282B, which thanks to Sudeep's efforts
(he spent _all_ of last Tuesday logged in to one of my systems trying
to upgrade the firmware) is now running the Linaro 16.10 firmware.
Sudeep says that my hardware is a very early revision which went out
the door without being properly calibrated.

Whether that has any bearing on the reproducability of this or not, I've
no idea.

If you want to see the boot messages, head to my autobuilder status page
and look at the juno-kvm entries - they contain all the boot messages
from initial power up of the juno.  I'll send you a link in private
mail so googlebot doesn't end up hammering on it after it expires.
Liviu Dudau Nov. 21, 2016, 6:23 p.m. UTC | #2
On Mon, Nov 21, 2016 at 05:56:02PM +0000, Russell King - ARM Linux wrote:
> On Mon, Nov 21, 2016 at 05:32:32PM +0000, Liviu Dudau wrote:
> > On Mon, Nov 21, 2016 at 02:03:49PM +0000, Russell King - ARM Linux wrote:
> > > On Mon, Nov 21, 2016 at 01:50:31PM +0000, Liviu Dudau wrote:
> > > > On Mon, Nov 21, 2016 at 01:24:19PM +0000, Russell King - ARM Linux wrote:
> > > > > On Mon, Nov 21, 2016 at 12:56:53PM +0000, Liviu Dudau wrote:
> > > > > > That is mostly due to the check in hdlcd_crtc_disable() which I should
> > > > > > remove, I've added it because I was getting a ->disable() hook call
> > > > > > before any ->enable() was called at startup time. I need to revisit
> > > > > > this as I remember Daniel was commenting that this was not needed.
> > > > > 
> > > > > Removing that test results in:
> > > > > 
> > > > > [drm:drm_atomic_helper_commit_cleanup_done] *ERROR* [CRTC:24:crtc-0] flip_done timed out
> > > > > 
> > > > > and the kernel hanging, seemingly in an IRQs-off region.
> > > > 
> > > > Right, I need to sort this one out. Are you doing these tests out of
> > > > some tagged branch that I can get in sync with?
> > 
> > Hi Russell,
> > 
> > > 
> > > No, not yet, and some of the changes I have are rather hacky.
> > > 
> > > I do always build my full tree of patches (which is currently running at
> > > around 320 patches at the moment) but I never share that entire patch
> > > set.  However, none of those touch i2c (apart from the ones I've recently
> > > posted) and the only patches touching hdlcd are those I've posted so far.
> > > 
> > > Most of the problems I'm finding are through trying basic stuff - I'm not
> > > doing anything special or unusual to find them, at the moment quite
> > > literally just starting Xorg up and shutting it down.  For example, the
> > > above was caused by logging in on serial, running:
> > > 
> > > 	Xorg -terminate -verbose
> > > 
> > > and then hitting ^C.  (I have lxdm disabled so systemd boots to VT login
> > > prompts on both the "framebuffer" and serial - I don't want Xorg coming
> > > up when the machine is booting for its nightly KVM boot tests.)
> > > 
> > > I'm afraid that when I try someone elses code, I have a tendency to find
> > > loads of seemingly trivial bugs when I try putting it through some basic
> > > tests.
> > 
> > I'm not being able to reproduce your bug conditions. I'm running the following
> > setup when testing:
> > 
> > - mainline v4.9-rc6
> > - edited the juno-base.dtsi file to disable the hdlcd@7f600000 and
> >   hdmi-transmitter@70 nodes to remove the second HDMI output from the test.
> > - patched tda998x_drv.c to set interlace_allowed = 0, see below why
> > - modified the hdlcd_crtc.c file with the following patch:
> > 
> > -8<-----------------------------------------------------------------------
> > diff --git a/drivers/gpu/drm/arm/hdlcd_crtc.c b/drivers/gpu/drm/arm/hdlcd_crtc.c
> > index 48019ae..656dc43 100644
> > --- a/drivers/gpu/drm/arm/hdlcd_crtc.c
> > +++ b/drivers/gpu/drm/arm/hdlcd_crtc.c
> > @@ -156,9 +156,7 @@ static void hdlcd_crtc_disable(struct drm_crtc *crtc)
> >  {
> >  	struct hdlcd_drm_private *hdlcd = crtc_to_hdlcd_priv(crtc);
> >  
> > -	if (!crtc->state->active)
> > -		return;
> > -
> > +	drm_crtc_vblank_off(crtc);
> 
> Don't you need a drm_crtc_vblank_on() call in the enable function?

I do, thanks for calling me on that!

> 
> >  	hdlcd_write(hdlcd, HDLCD_REG_COMMAND, 0);
> >  	clk_disable_unprepare(hdlcd->clk);
> >  }
> > ->8-----------------------------------------------------------------------
> > 
> > That takes care of the pxlclk refcounting issue you were seeing. I've started
> > Xorg several times (and yes, I do see EDID checksum error every now and then,
> > specially when running xrandr). When closing down Xorg I get back the framebuffer
> > console with the login prompt and no image shifting.
> 
> For me, the image shift was 100% reproducable.  With the above patch
> and a call to drm_crtc_vblank_on() in the enable path, it seems to
> behave correctly - I can alternately switch between 1920x1080 and
> 1280x1024 and it behaves correctly.  Indeed, my debug prints show that
> the right thing is happening wrt disabling the controller:

OK, so I'll take it that you did not also use your patch to fix the base plane
calculations, or was that included as well in your stack?

> 
> [   76.869136] hdlcd_crtc_disable: active 0
> [   76.869159] hdlcd_plane_atomic_update: pitch 7680 lines 1080
> [   76.888983] hdlcd_plane_atomic_update: pitch 5120 lines 1024
> [   76.888995] hdlcd_crtc_enable: active 1 cmd 00000000
> [   85.262451] hdlcd_crtc_disable: active 0
> [   85.262474] hdlcd_plane_atomic_update: pitch 5120 lines 1024
> [   85.286667] hdlcd_plane_atomic_update: pitch 7680 lines 1080
> [   85.286679] hdlcd_crtc_enable: active 1 cmd 00000000
> [   92.658038] hdlcd_crtc_disable: active 0
> [   92.658057] hdlcd_plane_atomic_update: pitch 7680 lines 1080
> [   92.680659] hdlcd_plane_atomic_update: pitch 5120 lines 1024
> [   92.680668] hdlcd_crtc_enable: active 1 cmd 00000000
> [   97.805205] hdlcd_crtc_disable: active 0
> [   97.805220] hdlcd_plane_atomic_update: pitch 5120 lines 1024
> [   97.834415] hdlcd_plane_atomic_update: pitch 7680 lines 1080
> [   97.834423] hdlcd_crtc_enable: active 1 cmd 00000000
> 
> > My monitor is a TV that
> > reports that preferred mode is 1080i, however HDLCD and TDA19988 don't talk
> > propertly with each other to be able to set the interlaced mode correctly, so
> > I've had to disable support for interlacing mode in tda998x_drv.c and now the
> > preferred mode that gets picked up is 1920x1200@60Hz.
> 
> That's more of a generic DRM issue - the CRTC layer doesn't get a
> look-in when a connector parses the modes supplied from the display,
> so there's no real way for the CRTC layer to apply any kind of
> limitations to the available modes, except when a mode is attempted
> to be set.
> 
> I don't want to see an "interlace" DT property introduced for the
> TDA998x, because that's the wrong approach - it would be adding a
> property for the needs of the implementation, and not a description
> of the hardware.

AFAICT the issue is the fact that while HDLCD could scan out the alternate
lines with a bit of a convoluted hack, there is no way to tell TDA19988
to generate the interlaced timings. And no, I'm not advocating introducing
a DT property as this is a runtime mode, depending on the resolution
selected by userspace.

> 
> > Please advise on what other steps I can take to try to reproduce this.
> 
> I guess if you've tried and failed to reproduce it, there is something
> very specific to my setup which I can't describe.
> 
> > P.S: What revision of Juno do you have? Any chance you can capture the start
> > of the boot process where the firmware component prints the version numbers?
> 
> All I know is that it's a HBI0282B, which thanks to Sudeep's efforts
> (he spent _all_ of last Tuesday logged in to one of my systems trying
> to upgrade the firmware) is now running the Linaro 16.10 firmware.
> Sudeep says that my hardware is a very early revision which went out
> the door without being properly calibrated.

:) That is what you get for having special access to early samples of hardware :)

Might be worth asking your ARM contacts if a swap with an R2 is possible.

But, yes, my tests were also run on a Juno R0 (HBI0282B is just the R0 code)
and mine I'm pretty sure is earlier build than yours (board #13 no less).

> 
> Whether that has any bearing on the reproducability of this or not, I've
> no idea.

The one factor that could affect it is the capability of the SCP firmware
to generate the exact pixel clock for your 1080p mode. If it doesn't, then
restoring the old mode might lead to an incorrect synchronisation with the
TDA chip. Current (less than 1.5 years old I guess) SCP firmware has that
sorted via an hdlcd.dat file that pre-calculates a lot of common pixel clock
frequencies).

Best regards,
Liviu

> 
> If you want to see the boot messages, head to my autobuilder status page
> and look at the juno-kvm entries - they contain all the boot messages
> from initial power up of the juno.  I'll send you a link in private
> mail so googlebot doesn't end up hammering on it after it expires.
> 
> -- 
> RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
> FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
> according to speedtest.net.
Russell King (Oracle) Nov. 21, 2016, 6:43 p.m. UTC | #3
On Mon, Nov 21, 2016 at 06:23:24PM +0000, Liviu Dudau wrote:
> On Mon, Nov 21, 2016 at 05:56:02PM +0000, Russell King - ARM Linux wrote:
> > For me, the image shift was 100% reproducable.  With the above patch
> > and a call to drm_crtc_vblank_on() in the enable path, it seems to
> > behave correctly - I can alternately switch between 1920x1080 and
> > 1280x1024 and it behaves correctly.  Indeed, my debug prints show that
> > the right thing is happening wrt disabling the controller:
> 
> OK, so I'll take it that you did not also use your patch to fix the base
> plane calculations, or was that included as well in your stack?

It was before that patch - so it was using crtc_x and crtc_y.  However,
I can guarantee that those were both zero (as I've previously
described.)

> > That's more of a generic DRM issue - the CRTC layer doesn't get a
> > look-in when a connector parses the modes supplied from the display,
> > so there's no real way for the CRTC layer to apply any kind of
> > limitations to the available modes, except when a mode is attempted
> > to be set.
> > 
> > I don't want to see an "interlace" DT property introduced for the
> > TDA998x, because that's the wrong approach - it would be adding a
> > property for the needs of the implementation, and not a description
> > of the hardware.
> 
> AFAICT the issue is the fact that while HDLCD could scan out the alternate
> lines with a bit of a convoluted hack, there is no way to tell TDA19988
> to generate the interlaced timings. And no, I'm not advocating introducing
> a DT property as this is a runtime mode, depending on the resolution
> selected by userspace.

The TDA998x doesn't "generate" the timings.  They come from the input
to it, the TDA998x merely tracks where it is within the frame, so it
knows where it can place things like the infoframes and other data.

So, the responsibility for generating the interlaced timings is with
the CRTC.

That means the CRTC needs to not only scan out alternate lines (which
is the easy bit - setting the pitch to twice the value) but it also
needs to be able to adjust the timing of the vertical sync by half
a line.  The HDLCD from what I can see does not support that, the
overall system does not support for interlaced modes.

> > Whether that has any bearing on the reproducability of this or not, I've
> > no idea.
> 
> The one factor that could affect it is the capability of the SCP firmware
> to generate the exact pixel clock for your 1080p mode. If it doesn't, then
> restoring the old mode might lead to an incorrect synchronisation with the
> TDA chip. Current (less than 1.5 years old I guess) SCP firmware has that
> sorted via an hdlcd.dat file that pre-calculates a lot of common pixel clock
> frequencies).

The TDA998x takes the sync signals itself to synchronise with the CRTC,
and the pixel clock had better be synchronous with the data being closed
out of the CRTC otherwise its going to be in violation of the RGB data
setup and hold timings - which will cause random colour errors.

That isn't what's going on here - the image is rock stable, it's just
shifted.

I tried inverting the sync signals from the CRTC to the TDA998x, and
that shifts the display (as I expect, because the TDA998x synchronises
on the transition of the sync signals not on their absolute values) and
at that point I get the black sync bars appearing - again as expected.

Same kind of effect if I swap the horizontal front and back porches.

Of course, adjusting such things necessitates the TDA998x to re-lock
to the CRTC each time something like that changes, and the image
shift remains.

As I described originally, the _only_ two things that solved the image
shift was (a) shifting the framebuffer start address earlier than it
should be, or (b) disabling the CRTC and re-enabling the CRTC.  Both
of those were tried using devmem2 in userspace with no patches to the
HDLCD code over v4.9-rc5.

The only patches that would be in effect are my TDA998x patch stack
(which you've already tested), the i2c-designware patches to sort that
crappy thing out, and a dirty patch to the TDA998x code to read the
EDID in 16 byte chunks [*], so that the i2c-designware crappage never
causes a problem.

* - I'm not submitting this patch, because while it may solve the
    EDID reading issue on Juno, it's putting intimate knowledge of
    i2c-designware into the TDA998x driver - it's a hack around the
    problem, it's not a real fix.  It's possible that there are other
    i2c-designware crappages out there which have even smaller FIFOs
    which would need us to read in even smaller chunks for reliability.
diff mbox

Patch

diff --git a/drivers/gpu/drm/arm/hdlcd_crtc.c b/drivers/gpu/drm/arm/hdlcd_crtc.c
index 48019ae..656dc43 100644
--- a/drivers/gpu/drm/arm/hdlcd_crtc.c
+++ b/drivers/gpu/drm/arm/hdlcd_crtc.c
@@ -156,9 +156,7 @@  static void hdlcd_crtc_disable(struct drm_crtc *crtc)
 {
 	struct hdlcd_drm_private *hdlcd = crtc_to_hdlcd_priv(crtc);
 
-	if (!crtc->state->active)
-		return;
-
+	drm_crtc_vblank_off(crtc);
 	hdlcd_write(hdlcd, HDLCD_REG_COMMAND, 0);
 	clk_disable_unprepare(hdlcd->clk);
 }