diff mbox series

[01/20] Hexagon HVX (target/hexagon) README

Message ID 1625528074-19440-2-git-send-email-tsimpson@quicinc.com (mailing list archive)
State New, archived
Series Hexagon HVX (target/hexagon) patch series

Commit Message

Taylor Simpson July 5, 2021, 11:34 p.m. UTC
Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
---
 target/hexagon/README | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 82 insertions(+), 1 deletion(-)

Comments

Rob Landley July 12, 2021, 8:16 a.m. UTC | #1
On 7/5/21 6:34 PM, Taylor Simpson wrote:
> Signed-off-by: Taylor Simpson <tsimpson@quicinc.com>
> ---
>  target/hexagon/README | 83 ++++++++++++++++++++++++++++++++++++++++++++++++++-
>  1 file changed, 82 insertions(+), 1 deletion(-)

I'm poking at the hexagon toolchain build script you checked into the test
directory, which boils down to (starting with):

git clone https://github.com/llvm/llvm-project
mkdir build-llvm
cd build-llvm
cmake -G Ninja -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=$(readlink -f ../llvm) -DLLVM_ENABLE_LLD=ON \
  -DLLVM_TARGETS_TO_BUILD="Hexagon" -DLLVM_ENABLE_PROJECTS="clang;lld" \
  $(readlink -f ../llvm-project/llvm)

Except the LLVM_ENABLE_LLD part breaks with a standard debian/devuan x86-64 host
toolchain because it ONLY works with host llvm, and apparently only a pretty
current one at that:

  https://github.com/tensorflow/mlir-hlo/issues/4

(Devuan Beowulf only packages lld-7, not lld-10.)

I'm currently building:

cmake -G Ninja -DCMAKE_BUILD_TYPE=Release \
  -DCMAKE_INSTALL_PREFIX=$(readlink -f ../llvm) -DLLVM_ENABLE_PROJECTS="lld" \
  $(readlink -f ../llvm-project/llvm)
ninja all install

On the theory that should give me an lld-git I can play $PATH games and re-run
the other build with, but my QUESTION is what does the LLVM_ENABLE_LLD=potato
actually accomplish here? Is it a sanitizing step or is there something about
building with gcc's lld that's known to break the hexagon toolchain? If I just
omit it (to avoid building lld _twice_) will I (probably) get a working hexagon
toolchain? (Assuming I do the musl and headers-install builds and so on?)

What's the _issue_ here that this config symbol addresses?

Thanks,

Rob
Brian Cain July 12, 2021, 1:42 p.m. UTC | #2
> -----Original Message-----
> From: Rob Landley <rob@landley.net>
> Sent: Monday, July 12, 2021 3:16 AM
...
> Except the LLVM_ENABLE_LLD part breaks with a standard debian/devuan x86-64 host
> toolchain because it ONLY works with host llvm, and apparently only a pretty
> current one at that:
> 
>   https://github.com/tensorflow/mlir-hlo/issues/4
> 
> (Devuan Beowulf only packages lld-7, not lld-10.)

Sorry about that, we did test only with fairly current/recent lld.

> I'm currently building:
> 
> cmake -G Ninja -DCMAKE_BUILD_TYPE=Release \
>   -DCMAKE_INSTALL_PREFIX=$(readlink -f ../llvm) -DLLVM_ENABLE_PROJECTS="lld" \
>   $(readlink -f ../llvm-project/llvm)
> ninja all install
> 
> On the theory that should give me an lld-git I can play $PATH games and re-run
> the other build with, but my QUESTION is what does the LLVM_ENABLE_LLD=potato
> actually accomplish here? Is it a sanitizing step or is there something about
> building with gcc's lld that's known to break the hexagon toolchain? If I just
> omit it (to avoid building lld _twice_) will I (probably) get a working hexagon
> toolchain? (Assuming I do the musl and headers-install builds and so on?)
> 
> What's the _issue_ here that this config symbol addresses?

lld is not necessary to build the hexagon toolchain.  It just happens that the build process takes an enormous amount of time (and memory) using the GNU BFD ld.  I would expect building with either ld or gold to work fine.

If you don't mind binaries, there are x86_64 linux binary toolchains with lld on releases.llvm.org and there's also a binary hexagon-linux cross toolchain that we shared for use by kernel developers.  The hexagon linux toolchain is built on Ubuntu 16.04.

But when building your toolchain, omitting LLVM_ENABLE_LLD should work just fine.

-Brian
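
A minimal sketch of the point above, assuming a POSIX shell: the two configurations differ only in whether -DLLVM_ENABLE_LLD=ON is passed, so a build script could add it only when the host provides ld.lld. The helper name llvm_cmake_args and the ld.lld probe are illustrative assumptions, not part of the script in the QEMU tree, and the probe does not check the lld version.

```shell
# Sketch only: print the cmake arguments for the Hexagon LLVM build, one per
# line, adding -DLLVM_ENABLE_LLD=ON only when the host has an ld.lld.
# llvm_cmake_args is a hypothetical helper, not part of the QEMU build script.
llvm_cmake_args() {
  prefix=$1   # install prefix
  srcdir=$2   # path to llvm-project/llvm
  set -- -G Ninja -DCMAKE_BUILD_TYPE=Release \
    "-DCMAKE_INSTALL_PREFIX=$prefix" \
    -DLLVM_TARGETS_TO_BUILD=Hexagon \
    "-DLLVM_ENABLE_PROJECTS=clang;lld"
  # lld only speeds up the link step; BFD ld or gold should also work.
  if command -v ld.lld >/dev/null 2>&1; then
    set -- "$@" -DLLVM_ENABLE_LLD=ON
  fi
  printf '%s\n' "$@" "$srcdir"
}

# Show what would be passed to cmake for a checkout in the current directory.
llvm_cmake_args "$PWD/llvm" "$PWD/llvm-project/llvm"
```

Printing one argument per line keeps the list easy to inspect before handing it to cmake.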
Rob Landley July 19, 2021, 1:10 a.m. UTC | #3
On 7/12/21 8:42 AM, Brian Cain wrote:
> If you don't mind binaries, there are x86_64 linux binary toolchains with lld
> on releases.llvm.org

I've never managed to run those binaries, because they're dynamically linked
against some specific distro I'm not using:

  $ bin/clang --help
  /lib/ld-linux-aarch64.so.1: No such file or directory

All the toolchains I build for distribution are statically linked on the host.
(Back in the day I even wrote a
https://github.com/landley/aboriginal/blob/master/sources/toys/ccwrap.c wrapper
to feed --nostdinc --nostdlib to gcc and build up all the includes again
manually to stop gcc from leaking random host context into the cross compilers,
but these days I use
https://github.com/landley/toybox/blob/master/scripts/mcm-buildall.sh with Rich
Felker's musl-cross-make to create statically linked cross and native musl
toolchains which I would happily post binaries of if they weren't GPLv3 and thus
non-distributable. Oh well.)

Vaguely trying to make an llvm-buildall.sh for toybox, which might be a fourth
section to https://landley.net/toybox/faq.html#cross but first I'm trying to
make the hexagon-only one work based on your model. :)

> and there's also a binary hexagon-linux cross toolchain that
> we shared for use by kernel developers.  The hexagon linux
> toolchain is built on Ubuntu 16.04.

Where's that one?

> But when building your toolchain, omitting LLVM_ENABLE_LLD should work just fine.

It did, thanks.

Now I'm trying to figure out what all the extra CFLAGS are for.

The clang_rt build has CMAKE_ASM_FLAGS="-G0 -mlong-calls -fno-pic
--target=hexagon-unknown-linux-musl" which
https://clang.llvm.org/docs/ClangCommandLineReference.html defines as:

-G<size>
  Put objects of at most <size> bytes into small data section (MIPS / Hexagon)

-mlong-calls
  Generate branches with extended addressability, usually via indirect jumps.

I don't understand why your libcc build needs no-pic? (Are we only building
a static libgcc.a instead of a dynamic one? I'm fine with that if so, but
this needs to be specified in the CMAKE_ASM_FLAGS why?)

Why is it saying --target=hexagon-random-nonsense-musl to a toolchain
we built with exactly one target type? How does it NOT default to hexagon?
(Is this related to the build writing a hexagon-potato-banana-musl.cfg file
in the bin directory, meaning the config file is in the $PATH? Does clang only
look for it in the same directory the clang executable lives in?)

And while we're at it, the CONTENTS of hexagon-gratuitous-gnu-format.cfg is:

cat <<EOF > hexagon-unknown-linux-musl.cfg
-G0 --sysroot=${HEX_SYSROOT}
EOF

Which is ALREADY saying -G0? (Why does it want to do that globally? Some sort of
bug workaround?) So why do we specify it again here?

Next up build_musl_headers does CROSS_CFLAGS="-G0 -O0 -mv65 -fno-builtin
-fno-rounding-math --target=hexagon-unknown-linux-musl" which:

-O0
  disable most of the optimizer

-mv65
  -mtune for a specific hexagon generation.
  (Why? Does qemu only support that but not newer?)

-fno-builtin
  musl's ./configure already probes for this and will add it if
  the compiler supports it.

-fno-rounding-math
  the docs MENTION this, but do not explain it.

And again with the -G0.

These flags probably aren't needed _here_ because this is just the headers
install (which is basically a cp -a isn't it?). This looks like it's copied
verbatim from the musl library build. But that library build happens in a bit,
so relevant-ish I guess...

(Also, why does building librt-but-not-really need the libc headers?
The libgcc build EXPLICITLY does not require that, because otherwise you
have this kind of BS circular dependency. Also, how do you EVER build a
bare metal ELF toolchain with that dependency in there?)

Next up build_kernel_headers has KBUILD_CFLAGS_KERNEL="-mlong-calls" which
again, A) why does the compiler not do by default, B) can't be needed here
because you don't even have to specify a cross compiler when doing
headers_install. (I just confirmed this by diffing installs with and without a
cross compiler specified: they were identical. I remembered this from
https://github.com/torvalds/linux/commit/e0e2fa4b515c but checked again to be
sure.) Presumably this is more "shared with full kernel build".

And then build_musl, covered above under the headers build: lotsa flags, not
sure why.

> -Brian
> 

Rob

P.S. It took me a while to figure out that clang_rt is NOT librt.a, I think it's
their libgcc? Especially confusing since librt.a has existed for decades and
was on solaris before it was on linux, and the OBVIOUS name is libcc
the same way "cc" is the generic compiler name instead of "gcc".
(In fact that was the posix compiler name until they decided to replace
it with "c99" and everybody ignored them the way tar->pax was ignored,
largely because make's $CC defaults to "cc" so it Just Works, and yes
the cross compiler should have that name but the prepackaged clang tarball
above does not. *shrug* I fix that up when making my prefix symlinks. The
android NDK guys at least have the excuse of shipping NINE different
x86_64-linux-android*-clang with API version numbers and thus not wanting to
pick a default to single out, so leave making the -cc link as an exercise for
the reader. I give instructions for doing so on the toybox cross compiling page
I linked above. :)
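
A minimal sketch of the "-cc" prefix-symlink fix-up described in the P.S., assuming a POSIX shell and a directory of prefixed clang binaries such as hexagon-unknown-linux-musl-clang. The helper name link_cross_cc is hypothetical, and this is not necessarily what the toybox instructions do.

```shell
# Sketch only: give every prefixed clang in a directory a matching "-cc" name,
# e.g. hexagon-unknown-linux-musl-clang -> hexagon-unknown-linux-musl-cc.
# link_cross_cc is a hypothetical helper name.
link_cross_cc() {
  bindir=$1
  for clang in "$bindir"/*-clang; do
    [ -e "$clang" ] || continue          # glob matched nothing
    cc="${clang%-clang}-cc"
    # Relative link, so the directory can be moved or tarred up intact.
    [ -e "$cc" ] || ln -s "$(basename "$clang")" "$cc"
  done
}
```

With the link in place, a build that appends "cc" to a cross prefix can find the compiler without extra configuration.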
Brian Cain July 19, 2021, 1:39 p.m. UTC | #4
> -----Original Message-----
> From: Rob Landley <rob@landley.net>
...
> On 7/12/21 8:42 AM, Brian Cain wrote:
...
> > and there's also a binary hexagon-linux cross toolchain that
> > we shared for use by kernel developers.  The hexagon linux
> > toolchain is built on Ubuntu 16.04.
> 
> Where's that one?

https://codelinaro.jfrog.io/artifactory/codelinaro-qemu/2021-05-12/clang+llvm-12.0.0-cross-hexagon-unknown-linux-musl.tar.xz
	- Built on Ubuntu 16.04, similar dynamic dependencies as releases.llvm.org binaries
	- Manifest:
		- llvm+clang 12.0.0 tag
		- Linux 5.6.18
		- github.com/qemu/qemu 15106f7dc3290ff3254611f265849a314a93eb0e
		- github.com/quic/musl aff74b395fbf59cd7e93b3691905aa1af6c0778c


> > But when building your toolchain, omitting LLVM_ENABLE_LLD should work
> just fine.
> 
> It did, thanks.
> 
> Now I'm trying to figure out what all the extra CFLAGS are for.

+Sid for some perspective on the rationale of these flags.  Some of these flags may be workarounds for toolchain issues.

Sid Manning July 19, 2021, 4:19 p.m. UTC | #5
> -----Original Message-----
> From: Brian Cain <bcain@quicinc.com>
> Sent: Monday, July 19, 2021 8:40 AM
> To: Rob Landley <rob@landley.net>; Taylor Simpson
> <tsimpson@quicinc.com>; qemu-devel@nongnu.org; Sid Manning
> <sidneym@quicinc.com>
> Cc: ale@rev.ng; peter.maydell@linaro.org; richard.henderson@linaro.org;
> philmd@redhat.com
> Subject: RE: [EXT] Re: [PATCH 01/20] Hexagon HVX (target/hexagon)
> README
> 
> 
> 
> > -----Original Message-----
> > From: Rob Landley <rob@landley.net>
> ...
> > On 7/12/21 8:42 AM, Brian Cain wrote:
> ...
> > > and there's also a binary hexagon-linux cross toolchain that we
> > > shared for use by kernel developers.  The hexagon linux toolchain is
> > > built on Ubuntu 16.04.
> >
> > Where's that one?
> 
> https://codelinaro.jfrog.io/artifactory/codelinaro-qemu/2021-05-
> 12/clang+llvm-12.0.0-cross-hexagon-unknown-linux-musl.tar.xz -
> 	- Built on Ubuntu 16.04, similar dynamic dependencies as
> releases.llvm.org binaries
> 	- Manifest:
> 		- llvm+clang 12.0.0 tag
> 		- Linux 5.6.18
> 		- github.com/qemu/qemu
> 15106f7dc3290ff3254611f265849a314a93eb0e
> 		- github.com/quic/musl
> aff74b395fbf59cd7e93b3691905aa1af6c0778c
> 
> 
> > > But when building your toolchain, omitting LLVM_ENABLE_LLD should
> > > work
> > just fine.
> >
> > It did, thanks.
> >
> > Now I'm trying to figure out what all the extra CFLAGS are for.
> 
> +Sid for some perspective on the rationale of these flags.  Some of these
> flags may be workarounds for toolchain issues.
> 
> > The clang_rt build has CMAKE_ASM_FLAGS="-G0 -mlong-calls -fno-pic
> > --target=hexagon-unknown-linux-musl" which
> > https://clang.llvm.org/docs/ClangCommandLineReference.html defines as:
> >
> > -G<size>
> >   Put objects of at most <size> bytes into small data section (MIPS /
> > Hexagon)
> >
> > -mlong-calls
> >   Generate branches with extended addressability, usually via indirect
> jumps.
> >
> > I don't understand why your libcc build needs no-pic? (Are we only
> > building a static libgcc.a instead of a dynamic one? I'm fine with
> > that if so, but this needs to be specified in the CMAKE_ASM_FLAGS why?)
> >
> > Why is it saying --target=hexagon-random-nonsense-musl to a toolchain
> > we built with exactly one target type? How does it NOT default to
> hexagon?
> > (Is this related to the build writing a hexagon-potato-banana-musl.cfg
> > file in the bin directory, meaning the config file is in the $PATH?
> > Does clang only look for it in the same directory the clang executable
> > lives in?)
> >
> > And while we're at it, the CONTENTS of hexagon-gratuitous-gnu-format.cfg
> is:
> >
> > cat <<EOF > hexagon-unknown-linux-musl.cfg
> > -G0 --sysroot=${HEX_SYSROOT}
> > EOF
> >
> > Which is ALREADY saying -G0? (Why does it want to do that globally?
> > Some sort of bug workaround?) So why do we specify it again here?
> >
> > Next up build_musl_headers does CROSS_CFLAGS="-G0 -O0 -mv65
> > -fno-builtin -fno-rounding-math --target=hexagon-unknown-linux-musl"
> which:

I agree G0 is overplayed here.  G0 is the implied default on Linux.  On occasion we use a different configuration of clang where small data (G8) is the default so G0 is specified.


> >
> > -O0
> >   disable most of the optimizer

This should be changed.  This was added so I could factor out clang's floating point optimizations.  These optimizations caused musl-libc testsuite to fail some floating point accuracy tests.  I know at least some of those issues have now been resolved.

> >
> > -mv65
> >   -mtune for a specific hexagon generation.
> >   (Why? Does qemu only support that but not newer?)
Passing -mvXX is now recommended practice.  A few years ago the default arch selected changed from the oldest supported arch to the newest arch.  QEMU supports later architectures.

> >
> > -fno-builtin
> >   musl's ./configure already probes for this and will add it if
> >   the compiler supports it.
I did not know this, we can get rid of -fno-builtin if the driver is meeting musl's expectations.


> >
> > -fno-rounding-math
> >   the docs MENTION this, but do not explain it.

This was a workaround and is no longer needed.  IIRC clang was asserting while building musl.

> >
> > And again with the -G0.
> >
> > These flags probably aren't needed _here_ because this is just the
> > headers install (which is basically a cp -a isn't it?). This looks
> > like it's copied verbatim from the musl library build. But that
> > library build happens in a bit, so relevant-ish I guess...
> >
> > (Also, why does building librt-but-not-really need the libc headers?
> > The libgcc build EXPLICITLY does not require that, because otherwise
> > you have this kind of BS circular dependency. Also, how do you EVER
> > build a bare metal ELF toolchain with that dependency in there?)

Getting cmake to agree to build compiler-rt might be better now.


> >
> > Next up build_kernel_headers has KBUILD_CFLAGS_KERNEL="-mlong-
> calls"
> > which
> > again, A) why does the compiler not do by default, B) can't be needed
> > here because you don't even have to specify a cross compiler when
> > doing headers_install. (I just confirmed this by diffing installs with
> > and without a cross compiler specified: they were identical. I
> > remembered this from
> > https://github.com/torvalds/linux/commit/e0e2fa4b515c but checked
> > again to be
> > sure.) Presumably this is more "shared with full kernel build".

-mlong-calls is not needed for the header install, but it is needed when building the kernel source.  If it is removed, the link step may fail with a relocation overflow, depending on the version of the kernel source you are building.


Rob Landley July 26, 2021, 7:57 a.m. UTC | #6
On 7/19/21 11:19 AM, Sid Manning wrote:
>> -----Original Message-----
>> From: Brian Cain <bcain@quicinc.com>
>> Sent: Monday, July 19, 2021 8:40 AM
>> To: Rob Landley <rob@landley.net>; Taylor Simpson
>> <tsimpson@quicinc.com>; qemu-devel@nongnu.org; Sid Manning
>> <sidneym@quicinc.com>
>> Cc: ale@rev.ng; peter.maydell@linaro.org; richard.henderson@linaro.org;
>> philmd@redhat.com
>> Subject: RE: [EXT] Re: [PATCH 01/20] Hexagon HVX (target/hexagon)
>> README
>> 
>> > -----Original Message-----
>> > From: Rob Landley <rob@landley.net>
>> ...
>> > On 7/12/21 8:42 AM, Brian Cain wrote:
>> ...
>> > > and there's also a binary hexagon-linux cross toolchain that we
>> > > shared for use by kernel developers.  The hexagon linux toolchain is
>> > > built on Ubuntu 16.04.
>> >
>> > Where's that one?
>> 
>> https://codelinaro.jfrog.io/artifactory/codelinaro-qemu/2021-05-
>> 12/clang+llvm-12.0.0-cross-hexagon-unknown-linux-musl.tar.xz -
>> 	- Built on Ubuntu 16.04, similar dynamic dependencies as
>> releases.llvm.org binaries

Indeed, in a "that also does not run on devuan, which is 99% stock debian" way. :(

Luckily, I built a working hexagon toolchain with the attached script, as in
"qemu-hexagon ran a statically linked toybox", and it even built a kernel.

I'm still trimming the build script down, that clang-rt section is WAY too big,
and I need to static link the binaries it produces so I can tar 'em up and use
them under a different distro, and I haven't even _started_ making a native
toolchain yet.[1]

Next question: is there a qemu-system-hexagon anywhere?

I mentioned I built a comet_defconfig kernel, ala:

LLVM_IAS=1 \
CROSS_COMPILE=~/toybox/hexagon/ccc/cross_bin/hexagon-unknown-linux-musl- make \
ARCH=hexagon CC=~/toybox/hexagon/ccc/cross_bin/hexagon-unknown-linux-musl-cc

Which is kinda silly because:

1) Other packages figure out that ${CROSS}cc works but Linux insists on
${CROSS}gcc, and you can't even do "CC=cc make" because then it won't add the
cross compiler prefix. (And if I say LLVM=1 on the kernel command line, which I
shouldn't have to do, it uses _unprefixed_ clang as the $CC name, despite cross
compiling.)

2) If you don't set LLVM_IAS it tries to call the UNPREFIXED assembler, again
while cross compiling.

Anyway, I've got a compiler now and I (awkwardly) built a kernel and I'm sitting
down to try to figure out how to get qemu to invoke it: does this arch want
vmlinux or arch/hexagon/boot/$RANDOMFORMAT, is serial on console=ttyS0 or
/significant/dev/prefix/ttyasparagus0 or...

See https://github.com/landley/toybox/blob/master/scripts/mkroot.sh#L186 for the
other architectures I've already added to toybox's mkroot, yes I have a ~250
line bash script that builds bootable Linux systems for a bunch of different
architectures and adding a new architecture looks like:

elif [ "$TARGET" == m68k ]; then
QEMU="m68k -M q800" KARCH=m68k KARGS=ttyS0 VMLINUX=vmlinux
KCONF=MMU,M68040,M68KFPU_EMU,MAC,SCSI_MAC_ESP,MACINTOSH_DRIVERS,ADB,ADB_MACII,NET_CORE,MACSONIC,SERIAL_PMACZILOG,SERIAL_PMACZILOG_TTYS,SERIAL_PMACZILOG_CONSOLE

(There's a little documentation at https://landley.net/toybox/faq.html#mkroot if
you're curious.)

Anyway... it doesn't look like qemu-system-hexagon (softmmu) is currently in
vanilla qemu? Is there a public fork that has this somewhere?

Thanks,

Rob

[1] Why does https://llvm.org/docs/GettingStarted.html#cross-compiling-llvm talk
about osx? Dear compiler writers: a compiler is conceptually the same as an html
to pdf converter. It takes input files, it produces output files. Yes some of
the input files are common library stuff like fonts reused by multiple
input/output pairs... again like an html to pdf converter. This is not
unprecedented black magic. Sure it's clever. So was Quake, which has now been
genericized into a broad industry from WoW to Skyrim.
Rob Landley July 26, 2021, 8:54 a.m. UTC | #7
On 7/26/21 2:57 AM, Rob Landley wrote:
> Anyway... it doesn't look like qemu-system-hexagon (softmmu) is currently in
> vanilla qemu? Is there a public fork that has this somewhere?

I did a little wild flailing to get ./configure to give me a qemu-system-hexagon
option (patch attached), I.E. just enough to get meson to shut up and quite
possibly still missing something important. (Is this python? It looks kind of
like python.)

Unfortunately after liberally cribbing from the cris architecture (which seems
like the simplest one) I'm left with several new C files to implement, all
currently zero length in the patch:

  hw/hexagon/boot.c
  hw/hexagon/hexagon_comet.c
  target/hexagon/machine.c
  target/hexagon/mmu.c

(In theory there's a "none" board on all the current qemu-system architectures,
but I've never figured out what to DO with those...)

All this raises three problems:

1) I dunno how the hexagon mmu works. (I can presumably read the kernel code and
reverse engineer what that's looking for, but it would be really nice not to
_have_ to?)

2) What's a comet board? (Memory layout? I/O devices? I guess all I need for
serial console on initramfs is a contiguous block of DRAM, timer interrupt to
drive the scheduler, and a serial port. I keep thinking there should be a way to
tell the "none" board to add that stuff from the command line... but dunno how.
"map DRAM here". "add this clock hardware at here". "add this kind of serial
port at here." "call elf_load on this file and start executing at its entry
point"...)

3) Reading the arch/hexagon kernel stuff ala "so what IS in a comet board"...
CONFIG_HEXAGON_COMET is only ever used to guard one #define in a header file:

  arch/hexagon/include/asm/timer-regs.h:#define RTOS_TIMER_REGS_ADDR

Which is then used to initialize structure members in arch/hexagon/kernel/time.c
without any sort of guard there, and no it isn't #defined to 0 by default
anywhere I can see? And of course obj-y += time.o in
arch/hexagon/kernel/Makefile has no config guard there either. So if it wasn't
set, the build would break. And that's currently all the symbol does?

Anyway, I still hope somebody else has already done most of this in a git tree
somewhere. :)

Rob
Taylor Simpson July 26, 2021, 1:59 p.m. UTC | #8
We're working on system mode support for Hexagon, and we plan to upstream it when it is ready.

Thanks,
Taylor



Rob Landley July 28, 2021, 8:11 a.m. UTC | #9
On 7/26/21 8:59 AM, Taylor Simpson wrote:
>> Anyway, I still hope somebody else has already done most of this in a git
>> tree somewhere. :)
>
> We're working on system mode support for Hexagon, and we plan to upstream it when it is ready.

Yay! Thanks.

While you're at it, why is llvm's cmake config unable to do:

  $ cccnext/cross_bin/hexagon-unknown-linux-musl-cc \
    -Xpreprocessor -P -E - <<< __SIZEOF_POINTER__
  4

I'm trying to genericize that llvm build script to do all the targets musl and
llvm agree on supporting, which means not passing in -DCMAKE_SIZEOF_VOID_P=4
because the compiler ALREADY KNOWS THIS... but cmake/config-ix.cmake line 196 is
REALLY going to barf if we didn't explicitly specify it on the command line? Are
the llvm developers not _aware_ of the "cc -E -dM - < /dev/null" trick? Even if
they aren't, why couldn't they just sizeof(void *) in a header file?

*shrug* I can do the above trick in the wrapper script and then provide
-DCMAKE_SIZEOF_VOID_P=$BLAH on the command line, it just seems DEEPLY pointless
to go to all the trouble of having a ./configure that has to be manually told
stuff the compiler already knows.

Confused,

Rob
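
A minimal sketch of the wrapper-script workaround described above, assuming a POSIX shell and a gcc- or clang-compatible driver that accepts -E, -P and "-x c -". The helper name sizeof_void_p_flag is hypothetical.

```shell
# Sketch only: ask a compiler for its pointer size and turn the answer into
# the cmake define. sizeof_void_p_flag is a hypothetical helper name.
sizeof_void_p_flag() {
  cc=$1
  # -E preprocesses stdin ("-" read as C via -x c), -P drops line markers;
  # the last line containing a digit is the expanded macro.
  size=$(printf '__SIZEOF_POINTER__\n' | "$cc" -E -P -x c - \
    | sed -n '/[0-9]/p' | tail -n 1)
  printf '%s\n' "-DCMAKE_SIZEOF_VOID_P=$size"
}

# Usage, with the cross compiler from the example above:
#   cmake ... "$(sizeof_void_p_flag cccnext/cross_bin/hexagon-unknown-linux-musl-cc)" ...
```

The value comes from the target compiler itself, so the same wrapper should work unchanged for each target in a multi-target build script.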
Rob Landley Nov. 25, 2021, 6:26 a.m. UTC | #10
On 7/26/21 8:59 AM, Taylor Simpson wrote:
> We're working on system mode support for Hexagon, and we plan to upstream it when it is ready.
> 
> Thanks,
> Taylor

Any progress on this? (Is there a way for outsiders to track the status?)

Thanks,

Rob
Patch

diff --git a/target/hexagon/README b/target/hexagon/README
index b0b2435..9a57802 100644
--- a/target/hexagon/README
+++ b/target/hexagon/README
@@ -1,9 +1,13 @@ 
 Hexagon is Qualcomm's very long instruction word (VLIW) digital signal
-processor(DSP).
+processor (DSP).  We also support Hexagon Vector eXtensions (HVX).  HVX
+is a wide vector coprocessor designed for high performance computer vision,
+image processing, machine learning, and other workloads.
 
 The following versions of the Hexagon core are supported
     Scalar core: v67
     https://developer.qualcomm.com/downloads/qualcomm-hexagon-v67-programmer-s-reference-manual
+    HVX extension: v66
+    https://developer.qualcomm.com/downloads/qualcomm-hexagon-v66-hvx-programmer-s-reference-manual
 
 We presented an overview of the project at the 2019 KVM Forum.
     https://kvmforum2019.sched.com/event/Tmwc/qemu-hexagon-automatic-translation-of-the-isa-manual-pseudcode-to-tiny-code-instructions-of-a-vliw-architecture-niccolo-izzo-revng-taylor-simpson-qualcomm-innovation-center
@@ -124,6 +128,73 @@  There are also cases where we brute force the TCG code generation.
 Instructions with multiple definitions are examples.  These require special
 handling because qemu helpers can only return a single value.
 
+For HVX vectors, the generator behaves slightly differently.  The wide vectors
+won't fit in a TCGv or TCGv_i64, so we use TCGv_ptr variables to pass the
+address to helper functions.  Here's an example for an HVX vector-add-word
+instruction.
+    static void generate_V6_vaddw(
+                    CPUHexagonState *env,
+                    DisasContext *ctx,
+                    Insn *insn,
+                    Packet *pkt)
+    {
+        const int VdN = insn->regno[0];
+        const intptr_t VdV_off =
+            offsetof(CPUHexagonState,
+                     future_VRegs[VdN]);
+        TCGv_ptr VdV = tcg_temp_local_new_ptr();
+        tcg_gen_addi_ptr(VdV, cpu_env, VdV_off);
+        const int VuN = insn->regno[1];
+        const intptr_t VuV_off =
+            vreg_src_off(ctx, VuN);
+        TCGv_ptr VuV = tcg_temp_local_new_ptr();
+        const int VvN = insn->regno[2];
+        const intptr_t VvV_off =
+            vreg_src_off(ctx, VvN);
+        TCGv_ptr VvV = tcg_temp_local_new_ptr();
+        tcg_gen_addi_ptr(VuV, cpu_env, VuV_off);
+        tcg_gen_addi_ptr(VvV, cpu_env, VvV_off);
+        TCGv slot = tcg_const_tl(insn->slot);
+        gen_helper_V6_vaddw(cpu_env, VdV, VuV, VvV, slot);
+        tcg_temp_free(slot);
+        gen_log_vreg_write(VdV_off, VdN, EXT_DFL, insn->slot, false, pkt->pkt_has_vhist);
+        ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
+        tcg_temp_free_ptr(VdV);
+        tcg_temp_free_ptr(VuV);
+        tcg_temp_free_ptr(VvV);
+    }
+
+Notice that we also generate a variable named <operand>_off for each operand of
+the instruction.  This makes it easy to override the instruction semantics with
+functions from tcg-op-gvec.h.  Here's the override for this instruction.
+    #define fGEN_TCG_V6_vaddw(SHORTCODE) \
+        tcg_gen_gvec_add(MO_32, VdV_off, VuV_off, VvV_off, \
+                         sizeof(MMVector), sizeof(MMVector))
+
+Finally, we notice that the override doesn't use the TCGv_ptr variables, so
+we don't generate them when an override is present.  Here is what we generate
+when the override is present.
+    static void generate_V6_vaddw(
+                    CPUHexagonState *env,
+                    DisasContext *ctx,
+                    Insn *insn,
+                    Packet *pkt)
+    {
+        const int VdN = insn->regno[0];
+        const intptr_t VdV_off =
+            offsetof(CPUHexagonState,
+                     future_VRegs[VdN]);
+        const int VuN = insn->regno[1];
+        const intptr_t VuV_off =
+            vreg_src_off(ctx, VuN);
+        const int VvN = insn->regno[2];
+        const intptr_t VvV_off =
+            vreg_src_off(ctx, VvN);
+        fGEN_TCG_V6_vaddw({ fHIDE(int i;) fVFOREACH(32, i) { VdV.w[i] = VuV.w[i] + VvV.w[i] ; } });
+        gen_log_vreg_write(VdV_off, VdN, EXT_DFL, insn->slot, false, pkt->pkt_has_vhist);
+        ctx_log_vreg_write(ctx, VdN, EXT_DFL, false);
+    }
+
 In addition to instruction semantics, we use a generator to create the decode
 tree.  This generation is also a two step process.  The first step is to run
 target/hexagon/gen_dectree_import.c to produce
@@ -140,6 +211,7 @@  runtime information for each thread and contains stuff like the GPR and
 predicate registers.
 
 macros.h
+mmvec/macros.h
 
 The Hexagon arch lib relies heavily on macros for the instruction semantics.
 This is a great advantage for qemu because we can override them for different
@@ -203,6 +275,15 @@  During runtime, the following fields in CPUHexagonState (see cpu.h) are used
     pred_written          boolean indicating if predicate was written
     mem_log_stores        record of the stores (indexed by slot)
 
+For Hexagon Vector eXtensions (HVX), the following fields are used
+    future_VRegs
+    tmp_VRegs
+    future_ZRegs
+    ZRegs_updated
+    VRegs_updated_tmp
+    VRegs_updated
+    VRegs_select
+
 *** Debugging ***
 
 You can turn on a lot of debugging by changing the HEX_DEBUG macro to 1 in