Message ID | 20220426125028.18844-1-lucas.araujo@eldorado.org.br (mailing list archive) |
---|---|
Series | VSX MMA Implementation |
On Tue, 26 Apr 2022 at 12:51, Lucas Mateus Castro(alqotel)
<lucas.araujo@eldorado.org.br> wrote:
>
> From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>
>
> This patch series is an RFC of the Matrix-Multiply Assist (MMA)
> instructions implementation from the PowerISA 3.1
>
> These and the VDIV/VMOD implementation are the last new PowerISA 3.1
> instructions left to be implemented.
>
> Thanks
> Lucas Mateus Castro (alqotel) (7):
>   target/ppc: Implement xxm[tf]acc and xxsetaccz
>   target/ppc: Implemented xvi*ger* instructions
>   target/ppc: Implemented pmxvi*ger* instructions
>   target/ppc: Implemented xvf*ger*
>   target/ppc: Implemented xvf16ger*
>   target/ppc: Implemented pmxvf*ger*
>   target/ppc: Implemented [pm]xvbf16ger2*

I have a small test case for the MMA instructions that Alistair wrote
a while back[1]. It passes when run with these patches applied
(previously it would sigill).

$ qemu-ppc64le -cpu power10 -L ~/ppc64le/ ./test -m
Smoke test MMA
MMA[0] = 1 (Correct)
MMA[1] = 2 (Correct)
MMA[2] = 3 (Correct)
MMA[3] = 4 (Correct)
MMA[4] = 2 (Correct)
MMA[5] = 4 (Correct)
MMA[6] = 6 (Correct)
MMA[7] = 8 (Correct)
MMA[8] = 3 (Correct)
MMA[9] = 6 (Correct)
MMA[10] = 9 (Correct)
MMA[11] = 12 (Correct)
MMA[12] = 4 (Correct)
MMA[13] = 8 (Correct)
MMA[14] = 12 (Correct)
MMA[15] = 16 (Correct)

[1] https://github.com/shenki/p10_tests

>
>  include/fpu/softfloat.h             |   9 ++
>  target/ppc/cpu.h                    |  15 +++
>  target/ppc/fpu_helper.c             | 130 ++++++++++++++++++
>  target/ppc/helper.h                 |   7 +
>  target/ppc/insn32.decode            |  49 +++++++
>  target/ppc/insn64.decode            |  80 +++++++++++
>  target/ppc/int_helper.c             |  85 ++++++++++++
>  target/ppc/internal.h               |  28 ++++
>  target/ppc/translate/vsx-impl.c.inc | 200 ++++++++++++++++++++++++++++
>  9 files changed, 603 insertions(+)
>
> --
> 2.31.1
>
>
Hello,

On 4/27/22 08:21, Joel Stanley wrote:
> On Tue, 26 Apr 2022 at 12:51, Lucas Mateus Castro(alqotel)
> <lucas.araujo@eldorado.org.br> wrote:
>>
>> From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>
>>
>> This patch series is an RFC of the Matrix-Multiply Assist (MMA)
>> instructions implementation from the PowerISA 3.1
>>
>> These and the VDIV/VMOD implementation are the last new PowerISA 3.1
>> instructions left to be implemented.
>>
>> Thanks
>> Lucas Mateus Castro (alqotel) (7):
>>   target/ppc: Implement xxm[tf]acc and xxsetaccz
>>   target/ppc: Implemented xvi*ger* instructions
>>   target/ppc: Implemented pmxvi*ger* instructions
>>   target/ppc: Implemented xvf*ger*
>>   target/ppc: Implemented xvf16ger*
>>   target/ppc: Implemented pmxvf*ger*
>>   target/ppc: Implemented [pm]xvbf16ger2*
>
> I have a small test case for the MMA instructions that Alistair wrote
> a while back[1]. It passes when run with these patches applied
> (previously it would sigill).

Could we have your Tested-by then ?

>
> $ qemu-ppc64le -cpu power10 -L ~/ppc64le/ ./test -m
> Smoke test MMA
> MMA[0] = 1 (Correct)
> MMA[1] = 2 (Correct)
> MMA[2] = 3 (Correct)
> MMA[3] = 4 (Correct)
> MMA[4] = 2 (Correct)
> MMA[5] = 4 (Correct)
> MMA[6] = 6 (Correct)
> MMA[7] = 8 (Correct)
> MMA[8] = 3 (Correct)
> MMA[9] = 6 (Correct)
> MMA[10] = 9 (Correct)
> MMA[11] = 12 (Correct)
> MMA[12] = 4 (Correct)
> MMA[13] = 8 (Correct)
> MMA[14] = 12 (Correct)
> MMA[15] = 16 (Correct)
>
> [1] https://github.com/shenki/p10_tests

Looks like a good candidate for tests/tcg/ppc64le/. Adding Matheus and Leandro.

Thanks,

C.

>
>
>>
>>  include/fpu/softfloat.h             |   9 ++
>>  target/ppc/cpu.h                    |  15 +++
>>  target/ppc/fpu_helper.c             | 130 ++++++++++++++++++
>>  target/ppc/helper.h                 |   7 +
>>  target/ppc/insn32.decode            |  49 +++++++
>>  target/ppc/insn64.decode            |  80 +++++++++++
>>  target/ppc/int_helper.c             |  85 ++++++++++++
>>  target/ppc/internal.h               |  28 ++++
>>  target/ppc/translate/vsx-impl.c.inc | 200 ++++++++++++++++++++++++++++
>>  9 files changed, 603 insertions(+)
>>
>> --
>> 2.31.1
>>
>>
Something I forgot to mention in the cover letter: the XVFGER instructions
accumulate the exception status and, at the end, set the FPSCR and take a
Program interrupt on a trap-enabled exception. However, as the exception
functions are currently set up in target/ppc/fpu_helper.c, a call to set an
FPSCR bit could raise an exception before all bits have been set.

Victor (CCing him) is working on a patch series to fix the FPSCR.FI bit that
will reorganize do_float_check_status, which would solve the problem described
above, so for now I sent the series without trying to solve it.

In v2 I'll remember to mention this in the cover letter.

On 26/04/2022 09:50, Lucas Mateus Castro(alqotel) wrote:
> From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>
>
> This patch series is an RFC of the Matrix-Multiply Assist (MMA)
> instructions implementation from the PowerISA 3.1
>
> These and the VDIV/VMOD implementation are the last new PowerISA 3.1
> instructions left to be implemented.
>
> Thanks
> Lucas Mateus Castro (alqotel) (7):
>   target/ppc: Implement xxm[tf]acc and xxsetaccz
>   target/ppc: Implemented xvi*ger* instructions
>   target/ppc: Implemented pmxvi*ger* instructions
>   target/ppc: Implemented xvf*ger*
>   target/ppc: Implemented xvf16ger*
>   target/ppc: Implemented pmxvf*ger*
>   target/ppc: Implemented [pm]xvbf16ger2*
>
>  include/fpu/softfloat.h             |   9 ++
>  target/ppc/cpu.h                    |  15 +++
>  target/ppc/fpu_helper.c             | 130 ++++++++++++++++++
>  target/ppc/helper.h                 |   7 +
>  target/ppc/insn32.decode            |  49 +++++++
>  target/ppc/insn64.decode            |  80 +++++++++++
>  target/ppc/int_helper.c             |  85 ++++++++++++
>  target/ppc/internal.h               |  28 ++++
>  target/ppc/translate/vsx-impl.c.inc | 200 ++++++++++++++++++++++++++++
>  9 files changed, 603 insertions(+)
>
On Wed, 27 Apr 2022 at 07:10, Cédric Le Goater <clg@kaod.org> wrote:
>
> Hello,
>
> On 4/27/22 08:21, Joel Stanley wrote:
> > On Tue, 26 Apr 2022 at 12:51, Lucas Mateus Castro(alqotel)
> > <lucas.araujo@eldorado.org.br> wrote:
> >>
> >> From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>
> >>
> >> This patch series is an RFC of the Matrix-Multiply Assist (MMA)
> >> instructions implementation from the PowerISA 3.1
> >>
> >> These and the VDIV/VMOD implementation are the last new PowerISA 3.1
> >> instructions left to be implemented.
> >>
> >> Thanks
> >> Lucas Mateus Castro (alqotel) (7):
> >>   target/ppc: Implement xxm[tf]acc and xxsetaccz
> >>   target/ppc: Implemented xvi*ger* instructions
> >>   target/ppc: Implemented pmxvi*ger* instructions
> >>   target/ppc: Implemented xvf*ger*
> >>   target/ppc: Implemented xvf16ger*
> >>   target/ppc: Implemented pmxvf*ger*
> >>   target/ppc: Implemented [pm]xvbf16ger2*
> >
> > I have a small test case for the MMA instructions that Alistair wrote
> > a while back[1]. It passes when run with these patches applied
> > (previously it would sigill).
>
> Could we have your Tested-by then ?

Sure! I was going to re-test v2, but it doesn't hurt to mention it for
this version.

Tested-by: Joel Stanley <joel@jms.id.au>

>
> >
> > $ qemu-ppc64le -cpu power10 -L ~/ppc64le/ ./test -m
> > Smoke test MMA
> > MMA[0] = 1 (Correct)
> > MMA[1] = 2 (Correct)
> > MMA[2] = 3 (Correct)
> > MMA[3] = 4 (Correct)
> > MMA[4] = 2 (Correct)
> > MMA[5] = 4 (Correct)
> > MMA[6] = 6 (Correct)
> > MMA[7] = 8 (Correct)
> > MMA[8] = 3 (Correct)
> > MMA[9] = 6 (Correct)
> > MMA[10] = 9 (Correct)
> > MMA[11] = 12 (Correct)
> > MMA[12] = 4 (Correct)
> > MMA[13] = 8 (Correct)
> > MMA[14] = 12 (Correct)
> > MMA[15] = 16 (Correct)
> >
> > [1] https://github.com/shenki/p10_tests
>
> Looks like a good candidate for tests/tcg/ppc64le/. Adding Matheus and Leandro.
>
> Thanks,
>
> C.
>
> >
> >
> >
> >>
> >>  include/fpu/softfloat.h             |   9 ++
> >>  target/ppc/cpu.h                    |  15 +++
> >>  target/ppc/fpu_helper.c             | 130 ++++++++++++++++++
> >>  target/ppc/helper.h                 |   7 +
> >>  target/ppc/insn32.decode            |  49 +++++++
> >>  target/ppc/insn64.decode            |  80 +++++++++++
> >>  target/ppc/int_helper.c             |  85 ++++++++++++
> >>  target/ppc/internal.h               |  28 ++++
> >>  target/ppc/translate/vsx-impl.c.inc | 200 ++++++++++++++++++++++++++++
> >>  9 files changed, 603 insertions(+)
> >>
> >> --
> >> 2.31.1
> >>
> >>
>
From: "Lucas Mateus Castro (alqotel)" <lucas.araujo@eldorado.org.br>

This patch series is an RFC of the Matrix-Multiply Assist (MMA)
instructions implementation from the PowerISA 3.1

These and the VDIV/VMOD implementation are the last new PowerISA 3.1
instructions left to be implemented.

Thanks
Lucas Mateus Castro (alqotel) (7):
  target/ppc: Implement xxm[tf]acc and xxsetaccz
  target/ppc: Implemented xvi*ger* instructions
  target/ppc: Implemented pmxvi*ger* instructions
  target/ppc: Implemented xvf*ger*
  target/ppc: Implemented xvf16ger*
  target/ppc: Implemented pmxvf*ger*
  target/ppc: Implemented [pm]xvbf16ger2*

 include/fpu/softfloat.h             |   9 ++
 target/ppc/cpu.h                    |  15 +++
 target/ppc/fpu_helper.c             | 130 ++++++++++++++++++
 target/ppc/helper.h                 |   7 +
 target/ppc/insn32.decode            |  49 +++++++
 target/ppc/insn64.decode            |  80 +++++++++++
 target/ppc/int_helper.c             |  85 ++++++++++++
 target/ppc/internal.h               |  28 ++++
 target/ppc/translate/vsx-impl.c.inc | 200 ++++++++++++++++++++++++++++
 9 files changed, 603 insertions(+)