[0/9] arm64: dts: rockchip: Initial Toybrick TB-RK1808M0 support

Message ID 20210516230551.12469-1-afaerber@suse.de (mailing list archive)

Message

Andreas Färber May 16, 2021, 11:05 p.m. UTC
Hello Heiko et al.,

It seems linux-rockchip list only saw two RK1808 patches for ASoC in 2019.
Following up on a SUSE Hackweek 20 project of mine, here's some patches that
allow me to start booting into the TB-RK1808M0 mPCIe card's eMMC.

Tested using its USB adapter, which allows connecting a serial cable and a
USB storage device that I load kernel+dtb from. It has a reset button, and
Ctrl+C allows entering a U-Boot prompt (without EBBR/UEFI support though).

Patches are based on the shipping toybrick.dtb file.
http://t.rock-chips.com/en/wiki.php?mod=view&id=110 gives instructions for
compiling sources, but no source download or link is actually provided.

I encountered a hang: earlycon revealed it being related to KVM and vGIC.
Disabling KVM in Kconfig works around it, as does removing the vGIC irq in DT.
I've already tried low and high for the vGIC interrupt, so no clue what might
cause it. On an mPCIe card with 1 GiB of RAM I figured KVM is not going to be
a major use case, so if we find no other solution, we could just delete the
interrupts property in its .dts, as demonstrated here.

The TB-96AIoT 96Boards SoM would be another RK1808 platform someone might
test these patches on. For the TB-RK1808S0 USB stick there's at least no
documented way to access a serial console.

Have a lot of fun!

Cheers,
Andreas

Cc: devicetree@vger.kernel.org
Cc: Rob Herring <robh+dt@kernel.org>
Cc: Marc Zyngier <maz@kernel.org>

Andreas Färber (9):
  dt-bindings: arm: rockchip: Add Rockchip RK1808 and TB-RK1808M0
  dt-bindings: serial: snps-dw-apb-uart: Add Rockchip RK1808
  arm64: dts: rockchip: Prepare Rockchip RK1808
  arm64: dts: rockchip: Add Rockchip TB-RK1808M0
  arm64: dts: rockchip: rk1808k-toybrick-m0: Suppress vGIC interrupt
  dt-bindings: mmc: rockchip-dw-mshc: Add Rockchip RK1808
  arm64: dts: rockchip: rk1808: Prepare eMMC node
  arm64: dts: rockchip: rk1808k-toybrick-m0: Enable eMMC
  arm64: dts: rockchip: rk1808: Add CPU operating points

 .../devicetree/bindings/arm/rockchip.yaml     |   5 +
 .../bindings/mmc/rockchip-dw-mshc.yaml        |   1 +
 .../bindings/serial/snps-dw-apb-uart.yaml     |   1 +
 arch/arm64/boot/dts/rockchip/Makefile         |   1 +
 arch/arm64/boot/dts/rockchip/rk1808.dtsi      | 276 ++++++++++++++++++
 .../boot/dts/rockchip/rk1808k-toybrick-m0.dts |  97 ++++++
 6 files changed, 381 insertions(+)
 create mode 100644 arch/arm64/boot/dts/rockchip/rk1808.dtsi
 create mode 100644 arch/arm64/boot/dts/rockchip/rk1808k-toybrick-m0.dts

Comments

Marc Zyngier May 17, 2021, 9:02 a.m. UTC | #1
On Mon, 17 May 2021 00:05:42 +0100,
Andreas Färber <afaerber@suse.de> wrote:
> 
> Hello Heiko et al.,
> 
> It seems linux-rockchip list only saw two RK1808 patches for ASoC in 2019.
> Following up on a SUSE Hackweek 20 project of mine, here's some patches that
> allow me to start booting into the TB-RK1808M0 mPCIe card's eMMC.
> 
> Tested using its USB adapter, which allows connecting a serial cable and a
> USB storage device that I load kernel+dtb from. It has a reset button, and
> Ctrl+C allows entering a U-Boot prompt (without EBBR/UEFI support though).
> 
> Patches are based on the shipping toybrick.dtb file.
> http://t.rock-chips.com/en/wiki.php?mod=view&id=110 gives instructions for
> compiling sources, but no source download or link is actually provided.
> 
> I encountered a hang: earlycon revealed it being related to KVM and
> vGIC.  Disabling KVM in Kconfig works around it, as does removing
> the vGIC irq in DT.  I've already tried low and high for the vGIC
> interrupt, so no clue what might cause it. On an mPCIe card with 1
> GiB of RAM I figured KVM is not going to be a major use case, so if
> we find no other solution, we could just delete the interrupts
> property in its .dts, as demonstrated here.

I think you figured it out wrong, for a number of reasons:

- KVM hanging is usually a sign that you have described the platform
  the wrong way. Either you are stepping over reserved memory regions,
  or you have badly described the GIC itself.

- It could also be a bug in KVM, which will need to be fixed. If
  that's because the HW is broken, we need to be able to detect it.

- You cannot be prescriptive of what a user is going to run. People
  have been running KVM on systems with less memory than that.

So no, we don't paper over these issues. We work out what is going
wrong and we fix it.

Thanks,

	M.
Andreas Färber May 17, 2021, 12:22 p.m. UTC | #2
Hi Marc,

On 17.05.21 11:02, Marc Zyngier wrote:
> On Mon, 17 May 2021 00:05:42 +0100,
> Andreas Färber <afaerber@suse.de> wrote:
>> Patches are based on the shipping toybrick.dtb file.
>> http://t.rock-chips.com/en/wiki.php?mod=view&id=110 gives instructions for
>> compiling sources, but no source download or link is actually provided.
>> 
>> I encountered a hang: earlycon revealed it being related to KVM and
>> vGIC.  Disabling KVM in Kconfig works around it, as does removing
>> the vGIC irq in DT.  I've already tried low and high for the vGIC
>> interrupt, so no clue what might cause it. On an mPCIe card with 1
>> GiB of RAM I figured KVM is not going to be a major use case, so if
>> we find no other solution, we could just delete the interrupts
>> property in its .dts, as demonstrated here.
> 
> I think you figured it out wrong,

Did I? I identified that an issue resulting in no serial console was
dependent on CONFIG_KVM being enabled and specifically to the vGIC
interrupt being specified in my DT. That's all I said.

I never claimed KVM code was to blame, you should know me better by now!

> for a number of reasons:
> 
> - KVM hanging is usually a sign that you have described the platform
>   the wrong way. Either you are stepping over reserved memory regions,
>   or you have badly described the GIC itself.

This whole series is about a new DT hardware description, so yes, that
is the most likely source of the problem I'm observing. Without further
hints how to verify what may cause it, you're just stating the obvious.

The only /reserved-memory entries in the shipping DTB are drm-logo of
size 0 and ramoops - the latter I could try to test, but I'd assume that
to just be a software convention that for lack of oops should not affect
KVM here?

And why would reserved memory affect the vGIC but no other driver doing
allocations? Any way to narrow it down, does vGIC allocate specially?

Only other issue I'm seeing is Debian failing to mount partitions that I
checked I do have drivers built in for and ends up failing to provide an
emergency shell. In order to boot a clean openSUSE rootfs for comparison
I'd first need to figure out adding any USB host nodes and clocks.

> 
> - It could also be a bug in KVM, which will need to be fixed. If
>   that's because the HW is broken, we need to be able to detect it.
> 
> - You cannot be prescriptive of what a user is going to run. People
>   have been running KVM on systems with less memory than that.
> 
> So no, we don't paper over these issues.

As you can see in patch 3, it does include the vGIC interrupt, so that
anyone with access to the TB-96AIoT or any EVB can test KVM and report
success or failure. Thus I don't see me as papering over something here.

However, patch 5 is needed to test this patchset on at least M0 - to
have serial and eMMC rootfs working - until a better fix is found.

> We work out what is going
> wrong and we fix it.

Thanks. You were specifically copied to advise on
how to figure out what might cause it, so that we/I can fix it properly. :)

As I mentioned, I already tried changing the interrupt between high and
low (which was a likely bug source on Realtek RTD1319 (where I'm still
waiting on them to confirm a ~year later...)).
I don't have a data source other than the downstream .dtb to check the
interrupt number - mainline PX30/RK3308/RK3328/RK3368/RK3399 do all use
9 and high consistently though, so I figured it's likely correct.

What I was wondering is whether the vGIC, similar to arch timer, might
need some initialization in the bootloader? (Note: No U-Boot sources
either at the link.)
Unfortunately I'm seeing a recurring pattern (cf. Realtek) that vendors
in their BSPs don't enable KVM and thus don't validate their hardware
description against KVM; their shipping 4.4 based kernel here does not
seem to have KVM enabled.

Or is it possible for vendors to actually have a Cortex-A35 without the
Armv8 Virtualization Extensions in silicon? If so, how could one verify?

Thanks,
Andreas
Marc Zyngier May 17, 2021, 1:42 p.m. UTC | #3
Andreas,

On Mon, 17 May 2021 13:22:27 +0100,
Andreas Färber <afaerber@suse.de> wrote:
> 
> Hi Marc,
> 
> On 17.05.21 11:02, Marc Zyngier wrote:
> > On Mon, 17 May 2021 00:05:42 +0100,
> > Andreas Färber <afaerber@suse.de> wrote:
> >> Patches are based on the shipping toybrick.dtb file.
> >> http://t.rock-chips.com/en/wiki.php?mod=view&id=110 gives instructions for
> >> compiling sources, but no source download or link is actually provided.
> >> 
> >> I encountered a hang: earlycon revealed it being related to KVM and
> >> vGIC.  Disabling KVM in Kconfig works around it, as does removing
> >> the vGIC irq in DT.  I've already tried low and high for the vGIC
> >> interrupt, so no clue what might cause it. On an mPCIe card with 1
> >> GiB of RAM I figured KVM is not going to be a major use case, so if
> >> we find no other solution, we could just delete the interrupts
> >> property in its .dts, as demonstrated here.
> > 
> > I think you figured it out wrong,
> 
> Did I? I identified that an issue resulting in no serial console was
> dependent on CONFIG_KVM being enabled and specifically to the vGIC
> interrupt being specified in my DT. That's all I said.

I guess we have a different way to approach these issues. Rather than
disabling a feature, I would have reached out to narrow the problem
down *before* posting a series.

> I never claimed KVM code was to blame, you should know me better by
> now!

Maybe it *is* to blame, and I'd really like to know.

> > for a number of reasons:
> > 
> > - KVM hanging is usually a sign that you have described the platform
> >   the wrong way. Either you are stepping over reserved memory regions,
> >   or you have badly described the GIC itself.
> 
> This whole series is about a new DT hardware description, so yes, that
> is the most likely source of the problem I'm observing. Without further
> hints how to verify what may cause it, you're just stating the obvious.
>
> The only /reserved-memory entries in the shipping DTB are drm-logo of
> size 0 and ramoops - the latter I could try to test, but I'd assume that
> to just be a software convention that for lack of oops should not affect
> KVM here?
> 
> And why would reserved memory affect the vGIC but no other driver doing
> allocations? Any way to narrow it down, does vGIC allocate specially?

Not an existing reserved memory, but instead the lack of a reserved
memory description in the DT, on which KVM would happily step as part
of its own allocations. Having a working vGIC adds a substantial
amount of code paths and (surprise!) interrupt handling.

> Only other issue I'm seeing is Debian failing to mount partitions that I
> checked I do have drivers built in for and ends up failing to provide an
> emergency shell. In order to boot a clean openSUSE rootfs for comparison
> I'd first need to figure out adding any USB host nodes and clocks.
> 
> > 
> > - It could also be a bug in KVM, which will need to be fixed. If
> >   that's because the HW is broken, we need to be able to detect it.
> > 
> > - You cannot be prescriptive of what a user is going to run. People
> >   have been running KVM on systems with less memory than that.
> > 
> > So no, we don't paper over these issues.
> 
> As you can see in patch 3, it does include the vGIC interrupt, so that
> anyone with access to the TB-96AIoT or any EVB can test KVM and report
> success or failure. Thus I don't see me as papering over something here.
> 
> However, patch 5 is needed to test this patchset on at least M0 - to
> have serial and eMMC rootfs working - until a better fix is found.

And that's not papering over the problem? OK, nevermind. Not to
mention that the GIC node has some obvious mistakes which result from
copy-paste.

> > We work out what is going
> > wrong and we fix it.
> 
> Thanks. You were specifically copied to advise on
> how to figure out what might cause it, so that we/I can fix it properly. :)
> 
> As I mentioned, I already tried changing the interrupt between high and
> low (which was a likely bug source on Realtek RTD1319 (where I'm still
> waiting on them to confirm a ~year later...)).

Which has no influence since the GIC-500 PPIs are not configurable in
SW, and the presence of this attribute in the DT is just for
documentation.

> I don't have a data source other than the downstream .dtb to check the
> interrupt number - mainline PX30/RK3308/RK3328/RK3368/RK3399 do all use
> 9 and high consistently though, so I figured it's likely correct.
> 
> What I was wondering is whether the vGIC, similar to arch timer, might
> need some initialization in the bootloader? (Note: No U-Boot sources
> either at the link.)

As long as the PPIs are set as group-1NS, this is enough. You can find
out by dumping the redistributors' GICR_IGROUPR0 registers. Nothing
else is required for the GIC to behave.

> Unfortunately I'm seeing a recurring pattern (cf. Realtek) that vendors
> in their BSPs don't enable KVM and thus don't validate their hardware
> description against KVM; their shipping 4.4 based kernel here does not
> seem to have KVM enabled.
> 
> Or is it possible for vendors to actually have a Cortex-A35 without the
> Armv8 Virtualization Extensions in silicon? If so, how could one verify?

There is no "Armv8 Virtualization Extensions". There is only EL2, and
you are already booting at that exception level, or KVM wouldn't even
try to initialise.

It would probably help if you posted a full dmesg as well as added
some basic tracing in the vgic init code so that we can figure out
*what* is going wrong, so that we can all stop making idle guesses.

	M.
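
As a rough illustration of the GICR_IGROUPR0 check suggested above (this
sketch is not from the thread): once a redistributor's GICR_IGROUPR0 value
has been read by some other means, the bit for a given PPI can be decoded as
below. The helper names are hypothetical. It assumes the GICv3 numbering in
which PPI n is INTID 16 + n, so the PPI 9 mentioned in the discussion would
be INTID 25. A set bit is read here simply as "Group 1"; whether that means
Non-secure Group 1 also depends on the security configuration and on
GICR_IGRPMODR0, which this sketch does not model.

```python
# Sketch only: decodes a GICR_IGROUPR0 value obtained elsewhere
# (for example from a bootloader memory dump or added kernel tracing).

PPI_BASE_INTID = 16  # GICv3: SGIs are INTIDs 0-15, PPIs are INTIDs 16-31


def ppi_to_intid(ppi: int) -> int:
    """Map a PPI number (0-15, as written in the DT) to its GICv3 INTID."""
    if not 0 <= ppi <= 15:
        raise ValueError(f"PPI number out of range: {ppi}")
    return PPI_BASE_INTID + ppi


def ppi_is_group1(igroupr0: int, ppi: int) -> bool:
    """Return True if the GICR_IGROUPR0 bit for this PPI is set (Group 1)."""
    return bool((igroupr0 >> ppi_to_intid(ppi)) & 1)
```

For example, ppi_is_group1(value, 9) reports whether bit 25 is set in the
dumped value; a clear bit on any CPU's redistributor would be worth
including in a follow-up report together with the full dmesg.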