target/arm/tcg: Fix overflow in matrix-multiply accumulate

Message ID

20240811054341.745674-1-joe@pf.is.s.u-tokyo.ac.jp (mailing list archive)

State

New, archived

Headers

From: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
To: peter.maydell@linaro.org
Cc: qemu-arm@nongnu.org, qemu-devel@nongnu.org,
 Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
Subject: [PATCH] target/arm/tcg: Fix overflow in matrix-multiply accumulate
Date: Sun, 11 Aug 2024 14:43:41 +0900
Message-Id: <20240811054341.745674-1-joe@pf.is.s.u-tokyo.ac.jp>
MIME-Version: 1.0
Content-Transfer-Encoding: 8bit
Received-SPF: none client-ip=2607:f8b0:4864:20::d33;
 envelope-from=joe@pf.is.s.u-tokyo.ac.jp; helo=mail-io1-xd33.google.com
X-Spam_score_int: -18
X-Spam_score: -1.9
X-Spam_bar: -
X-Spam_report: (-1.9 / 5.0 requ) BAYES_00=-1.9, DKIM_SIGNED=0.1,
 DKIM_VALID=-0.1, RCVD_IN_DNSWL_NONE=-0.0001, SPF_HELO_NONE=0.001,
 SPF_NONE=0.001 autolearn=unavailable autolearn_force=no
X-Spam_action: no action
X-Mailman-Approved-At: Sun, 11 Aug 2024 10:13:47 -0400
X-BeenThere: qemu-devel@nongnu.org
X-Mailman-Version: 2.1.29
Precedence: list
List-Id: <qemu-devel.nongnu.org>
List-Unsubscribe: <https://lists.nongnu.org/mailman/options/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=unsubscribe>
List-Archive: <https://lists.nongnu.org/archive/html/qemu-devel>
List-Post: <mailto:qemu-devel@nongnu.org>
List-Help: <mailto:qemu-devel-request@nongnu.org?subject=help>
List-Subscribe: <https://lists.nongnu.org/mailman/listinfo/qemu-devel>,
 <mailto:qemu-devel-request@nongnu.org?subject=subscribe>
Errors-To: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org
Sender: qemu-devel-bounces+qemu-devel=archiver.kernel.org@nongnu.org

Series

target/arm/tcg: Fix overflow in matrix-multiply accumulate

Commit Message

Joe Hattori Aug. 11, 2024, 5:43 a.m. UTC

Arm's intrinsic matrix multiply accumulate instructions take two vectors
of 8-bit integers and accumulate their products into a vector of 32-bit
integers. The current emulation overflows when large 8-bit integers are
used. This commit fixes the issue by casting the 8-bit integers to
32-bit integers before multiplication.

Fixes: 2323c5ffd4b5 ("target/arm: Implement integer matrix multiply accumulate")
Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
---
 target/arm/tcg/vec_helper.c | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)
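
For context, the per-segment operation that the commit message describes can be sketched as below. This is a minimal reference model and not QEMU's code: the name `smmla_ref` and the row-major array layout are assumptions made for illustration, following Arm's description of SMMLA as a 2x8 by 8x2 product of signed bytes accumulated into a 2x2 matrix of 32-bit values. Host byte-order handling (the `H1()` macro in the patch) is left out, and `int` is assumed to be at least 32 bits wide.

```c
#include <stdint.h>

/*
 * Hypothetical reference model of SMMLA on one 128-bit segment.
 * n and m each hold a 2x8 matrix of signed bytes, row-major.
 * acc holds the 2x2 matrix of 32-bit accumulators, row-major.
 * acc[i][j] += dot(row i of n, row j of m).
 */
void smmla_ref(uint32_t acc[4], const int8_t n[16], const int8_t m[16])
{
    for (int i = 0; i < 2; i++) {
        for (int j = 0; j < 2; j++) {
            /* Unsigned accumulator: wraparound is well defined. */
            uint32_t sum = acc[2 * i + j];

            for (int k = 0; k < 8; k++) {
                /* Both operands promote to int before the multiply. */
                sum += (uint32_t)(n[8 * i + k] * m[8 * j + k]);
            }
            acc[2 * i + j] = sum;
        }
    }
}
```

Each of the four accumulators receives the sum of eight byte products, which is the inner loop that the three helpers touched by this patch implement.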

Comments

Richard Henderson Aug. 11, 2024, 9:42 p.m. UTC | #1

On 8/11/24 15:43, Joe Hattori wrote:
> Arm's intrinsic matrix multiply accumulate instructions take two vectors
> of 8-bit integers and accumulate their products into a vector of 32-bit
> integers. The current emulation overflows when large 8-bit integers are
> used. This commit fixes the issue by casting the 8-bit integers to
> 32-bit integers before multiplication.

"Large 8-bit integers"?

0xff * 0xff = 0xfe01.

This in no way overflows "int" on any supported host, which is the type we get via normal 
C arithmetic promotion rules.

So what is this supposed to fix?


r~

> 
> Fixes: 2323c5ffd4b5 ("target/arm: Implement integer matrix multiply accumulate")
> Signed-off-by: Joe Hattori <joe@pf.is.s.u-tokyo.ac.jp>
> ---
>   target/arm/tcg/vec_helper.c | 6 +++---
>   1 file changed, 3 insertions(+), 3 deletions(-)
> 
> diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
> index 98604d170fd3..e9c33520232a 100644
> --- a/target/arm/tcg/vec_helper.c
> +++ b/target/arm/tcg/vec_helper.c
> @@ -2718,7 +2718,7 @@ static uint32_t do_smmla_b(uint32_t sum, void *vn, void *vm)
>       int8_t *n = vn, *m = vm;
>   
>       for (intptr_t k = 0; k < 8; ++k) {
> -        sum += n[H1(k)] * m[H1(k)];
> +        sum += (uint32_t)n[H1(k)] * (uint32_t)m[H1(k)];
>       }
>       return sum;
>   }
> @@ -2728,7 +2728,7 @@ static uint32_t do_ummla_b(uint32_t sum, void *vn, void *vm)
>       uint8_t *n = vn, *m = vm;
>   
>       for (intptr_t k = 0; k < 8; ++k) {
> -        sum += n[H1(k)] * m[H1(k)];
> +        sum += (uint32_t)n[H1(k)] * (uint32_t)m[H1(k)];
>       }
>       return sum;
>   }
> @@ -2739,7 +2739,7 @@ static uint32_t do_usmmla_b(uint32_t sum, void *vn, void *vm)
>       int8_t *m = vm;
>   
>       for (intptr_t k = 0; k < 8; ++k) {
> -        sum += n[H1(k)] * m[H1(k)];
> +        sum += (uint32_t)n[H1(k)] * (uint32_t)m[H1(k)];
>       }
>       return sum;
>   }
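
The promotion argument in comment #1 can be checked exhaustively with a short C function. This is a sketch under stated assumptions, not part of the patch: `count_mismatches` is a hypothetical helper, `int` is assumed to be at least 32 bits wide, and `uint32_t` is assumed to be `unsigned int`. It compares the original expression with the patched one for every pair of byte values, in the three signedness combinations used by `do_smmla_b`, `do_ummla_b` and `do_usmmla_b`.

```c
#include <stdint.h>

/*
 * Hypothetical helper: counts the operand pairs for which the original
 * and the patched multiply-accumulate expressions disagree.
 */
int count_mismatches(void)
{
    const uint32_t sum = 0xfffffff0u; /* arbitrary value near wraparound */
    int mismatches = 0;

    for (int a = 0; a < 256; a++) {
        for (int b = 0; b < 256; b++) {
            uint8_t un = (uint8_t)a, um = (uint8_t)b;
            int8_t sn = (int8_t)(a - 128), sm = (int8_t)(b - 128);

            /* signed * signed, as in do_smmla_b */
            mismatches += (sum + sn * sm) !=
                          (sum + (uint32_t)sn * (uint32_t)sm);
            /* unsigned * unsigned, as in do_ummla_b */
            mismatches += (sum + un * um) !=
                          (sum + (uint32_t)un * (uint32_t)um);
            /* unsigned * signed, as in do_usmmla_b */
            mismatches += (sum + un * sm) !=
                          (sum + (uint32_t)un * (uint32_t)sm);
        }
    }
    return mismatches;
}
```

Under those assumptions the function returns 0. The promoted `int` product never exceeds 255 * 255 = 65025 in magnitude, and converting it to `uint32_t` gives the same value modulo 2^32 as multiplying the operands after the cast.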

Patch

diff --git a/target/arm/tcg/vec_helper.c b/target/arm/tcg/vec_helper.c
index 98604d170fd3..e9c33520232a 100644
--- a/target/arm/tcg/vec_helper.c
+++ b/target/arm/tcg/vec_helper.c
@@ -2718,7 +2718,7 @@  static uint32_t do_smmla_b(uint32_t sum, void *vn, void *vm)
     int8_t *n = vn, *m = vm;
 
     for (intptr_t k = 0; k < 8; ++k) {
-        sum += n[H1(k)] * m[H1(k)];
+        sum += (uint32_t)n[H1(k)] * (uint32_t)m[H1(k)];
     }
     return sum;
 }
@@ -2728,7 +2728,7 @@  static uint32_t do_ummla_b(uint32_t sum, void *vn, void *vm)
     uint8_t *n = vn, *m = vm;
 
     for (intptr_t k = 0; k < 8; ++k) {
-        sum += n[H1(k)] * m[H1(k)];
+        sum += (uint32_t)n[H1(k)] * (uint32_t)m[H1(k)];
     }
     return sum;
 }
@@ -2739,7 +2739,7 @@  static uint32_t do_usmmla_b(uint32_t sum, void *vn, void *vm)
     int8_t *m = vm;
 
     for (intptr_t k = 0; k < 8; ++k) {
-        sum += n[H1(k)] * m[H1(k)];
+        sum += (uint32_t)n[H1(k)] * (uint32_t)m[H1(k)];
     }
     return sum;
 }
