| Message ID | 20240217161151.3987164-2-ardb+git@google.com (mailing list archive) |
| --- | --- |
| State | Superseded |
| Delegated to: | Herbert Xu |
| Series | crypto: arm64/neonbs - fix out-of-bounds access on short input |
On Sat, 17 Feb 2024 at 17:12, Ard Biesheuvel <ardb+git@google.com> wrote:
>
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The bit-sliced implementation of AES-CTR operates on blocks of 128
> bytes, and will fall back to the plain NEON version for tail blocks or
> inputs that are shorter than 128 bytes to begin with.
>
> It will call straight into the plain NEON asm helper, which performs all
> memory accesses in granules of 16 bytes (the size of a NEON register).
> For this reason, the associated plain NEON glue code will copy inputs
> shorter than 16 bytes into a temporary buffer, given that this is a rare
> occurrence and it is not worth the effort to work around this in the asm
> code.
>
> The fallback from the bit-sliced NEON version fails to take this into
> account, potentially resulting in out-of-bounds accesses. So clone the
> same workaround, and use a temp buffer for short in/outputs.
>
> Cc: <stable@vger.kernel.org>
> Reported-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com
> Tested-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Ping?
> ---
>  arch/arm64/crypto/aes-neonbs-glue.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
>
> diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
> index bac4cabef607..849dc41320db 100644
> --- a/arch/arm64/crypto/aes-neonbs-glue.c
> +++ b/arch/arm64/crypto/aes-neonbs-glue.c
> @@ -227,8 +227,19 @@ static int ctr_encrypt(struct skcipher_request *req)
>  			src += blocks * AES_BLOCK_SIZE;
>  		}
>  		if (nbytes && walk.nbytes == walk.total) {
> +			u8 buf[AES_BLOCK_SIZE];
> +			u8 *d = dst;
> +
> +			if (unlikely(nbytes < AES_BLOCK_SIZE))
> +				src = dst = memcpy(buf + sizeof(buf) - nbytes,
> +						   src, nbytes);
> +
>  			neon_aes_ctr_encrypt(dst, src, ctx->enc, ctx->key.rounds,
>  					     nbytes, walk.iv);
> +
> +			if (unlikely(nbytes < AES_BLOCK_SIZE))
> +				memcpy(d, buf + sizeof(buf) - nbytes, nbytes);
> +
>  			nbytes = 0;
>  		}
>  		kernel_neon_end();
> --
> 2.44.0.rc0.258.g7320e95886-goog
>
On Thu, Feb 22, 2024 at 12:37:45AM +0100, Ard Biesheuvel wrote:
> On Sat, 17 Feb 2024 at 17:12, Ard Biesheuvel <ardb+git@google.com> wrote:
> >
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > The bit-sliced implementation of AES-CTR operates on blocks of 128
> > bytes, and will fall back to the plain NEON version for tail blocks or
> > inputs that are shorter than 128 bytes to begin with.
> >
> > It will call straight into the plain NEON asm helper, which performs all
> > memory accesses in granules of 16 bytes (the size of a NEON register).
> > For this reason, the associated plain NEON glue code will copy inputs
> > shorter than 16 bytes into a temporary buffer, given that this is a rare
> > occurrence and it is not worth the effort to work around this in the asm
> > code.
> >
> > The fallback from the bit-sliced NEON version fails to take this into
> > account, potentially resulting in out-of-bounds accesses. So clone the
> > same workaround, and use a temp buffer for short in/outputs.
> >
> > Cc: <stable@vger.kernel.org>
> > Reported-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com
> > Tested-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>
> Ping?

It's in my queue. Thanks.
On Sat, Feb 17, 2024 at 05:11:52PM +0100, Ard Biesheuvel wrote:
> From: Ard Biesheuvel <ardb@kernel.org>
>
> The bit-sliced implementation of AES-CTR operates on blocks of 128
> bytes, and will fall back to the plain NEON version for tail blocks or
> inputs that are shorter than 128 bytes to begin with.
>
> It will call straight into the plain NEON asm helper, which performs all
> memory accesses in granules of 16 bytes (the size of a NEON register).
> For this reason, the associated plain NEON glue code will copy inputs
> shorter than 16 bytes into a temporary buffer, given that this is a rare
> occurrence and it is not worth the effort to work around this in the asm
> code.
>
> The fallback from the bit-sliced NEON version fails to take this into
> account, potentially resulting in out-of-bounds accesses. So clone the
> same workaround, and use a temp buffer for short in/outputs.
>
> Cc: <stable@vger.kernel.org>
> Reported-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com
> Tested-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com
> Signed-off-by: Ard Biesheuvel <ardb@kernel.org>

Looks like this could use:

Fixes: fc074e130051 ("crypto: arm64/aes-neonbs-ctr - fallback to plain NEON for final chunk")

> +			if (unlikely(nbytes < AES_BLOCK_SIZE))
> +				src = dst = memcpy(buf + sizeof(buf) - nbytes,
> +						   src, nbytes);
> +
>  			neon_aes_ctr_encrypt(dst, src, ctx->enc, ctx->key.rounds,
>  					     nbytes, walk.iv);
> +
> +			if (unlikely(nbytes < AES_BLOCK_SIZE))
> +				memcpy(d, buf + sizeof(buf) - nbytes, nbytes);

The second one could use 'dst' instead of 'buf + sizeof(buf) - nbytes', right?

Otherwise this looks good.

Reviewed-by: Eric Biggers <ebiggers@google.com>

- Eric
On Thu, 22 Feb 2024 at 07:34, Eric Biggers <ebiggers@kernel.org> wrote:
>
> On Sat, Feb 17, 2024 at 05:11:52PM +0100, Ard Biesheuvel wrote:
> > From: Ard Biesheuvel <ardb@kernel.org>
> >
> > The bit-sliced implementation of AES-CTR operates on blocks of 128
> > bytes, and will fall back to the plain NEON version for tail blocks or
> > inputs that are shorter than 128 bytes to begin with.
> >
> > It will call straight into the plain NEON asm helper, which performs all
> > memory accesses in granules of 16 bytes (the size of a NEON register).
> > For this reason, the associated plain NEON glue code will copy inputs
> > shorter than 16 bytes into a temporary buffer, given that this is a rare
> > occurrence and it is not worth the effort to work around this in the asm
> > code.
> >
> > The fallback from the bit-sliced NEON version fails to take this into
> > account, potentially resulting in out-of-bounds accesses. So clone the
> > same workaround, and use a temp buffer for short in/outputs.
> >
> > Cc: <stable@vger.kernel.org>
> > Reported-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com
> > Tested-by: syzbot+f1ceaa1a09ab891e1934@syzkaller.appspotmail.com
> > Signed-off-by: Ard Biesheuvel <ardb@kernel.org>
>
> Looks like this could use:
>
> Fixes: fc074e130051 ("crypto: arm64/aes-neonbs-ctr - fallback to plain NEON for final chunk")
>

Indeed.

> > +			if (unlikely(nbytes < AES_BLOCK_SIZE))
> > +				src = dst = memcpy(buf + sizeof(buf) - nbytes,
> > +						   src, nbytes);
> > +
> >  			neon_aes_ctr_encrypt(dst, src, ctx->enc, ctx->key.rounds,
> >  					     nbytes, walk.iv);
> > +
> > +			if (unlikely(nbytes < AES_BLOCK_SIZE))
> > +				memcpy(d, buf + sizeof(buf) - nbytes, nbytes);
>
> The second one could use 'dst' instead of 'buf + sizeof(buf) - nbytes', right?
>

Correct.

> Otherwise this looks good.
>
> Reviewed-by: Eric Biggers <ebiggers@google.com>
>

I'll respin with these changes. Thanks.
diff --git a/arch/arm64/crypto/aes-neonbs-glue.c b/arch/arm64/crypto/aes-neonbs-glue.c
index bac4cabef607..849dc41320db 100644
--- a/arch/arm64/crypto/aes-neonbs-glue.c
+++ b/arch/arm64/crypto/aes-neonbs-glue.c
@@ -227,8 +227,19 @@ static int ctr_encrypt(struct skcipher_request *req)
 			src += blocks * AES_BLOCK_SIZE;
 		}
 		if (nbytes && walk.nbytes == walk.total) {
+			u8 buf[AES_BLOCK_SIZE];
+			u8 *d = dst;
+
+			if (unlikely(nbytes < AES_BLOCK_SIZE))
+				src = dst = memcpy(buf + sizeof(buf) - nbytes,
+						   src, nbytes);
+
 			neon_aes_ctr_encrypt(dst, src, ctx->enc, ctx->key.rounds,
 					     nbytes, walk.iv);
+
+			if (unlikely(nbytes < AES_BLOCK_SIZE))
+				memcpy(d, buf + sizeof(buf) - nbytes, nbytes);
+
 			nbytes = 0;
 		}
 		kernel_neon_end();
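For readers outside the kernel tree, the bounce-buffer trick in the patch can be sketched in plain userspace C. Everything below is a hypothetical illustration, not kernel code: `granule_xor_tail` stands in for `neon_aes_ctr_encrypt` in that, like the NEON asm, it always touches a full 16-byte granule (here, the granule ending at `src + nbytes`, which for `nbytes < 16` would start *before* the caller's buffer), and its transform is a placeholder XOR rather than a real AES-CTR keystream. `safe_xor` mirrors the glue logic for a single (possibly partial) final block, including Eric's suggested simplification of the second `memcpy`.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SIZE 16 /* plays the role of AES_BLOCK_SIZE */

/* Stand-in for the NEON asm helper: always reads and writes the full
 * 16-byte granule that ends at src + nbytes.  For nbytes < 16 that
 * granule begins before src -- exactly the out-of-bounds access the
 * patch guards against, which is why callers must guarantee at least
 * one full block is addressable.  The XOR with 0xAA is a placeholder
 * for the real keystream XOR. */
static void granule_xor_tail(uint8_t *dst, const uint8_t *src, size_t nbytes)
{
	const uint8_t *s = src + nbytes - BLOCK_SIZE;
	uint8_t *d = dst + nbytes - BLOCK_SIZE;

	for (size_t i = 0; i < BLOCK_SIZE; i++)
		d[i] = s[i] ^ 0xAA;
}

/* Mirror of the patch: bounce inputs shorter than one block through a
 * stack buffer, copying them to the *end* of buf so the helper's
 * full-granule access stays inside buf[].  memcpy returns its
 * destination, so src and dst are redirected in one statement, just
 * as in the patch. */
static void safe_xor(uint8_t *dst, const uint8_t *src, size_t nbytes)
{
	uint8_t buf[BLOCK_SIZE];
	uint8_t *d = dst; /* remember the real destination */

	if (nbytes < BLOCK_SIZE)
		src = dst = memcpy(buf + sizeof(buf) - nbytes, src, nbytes);

	granule_xor_tail(dst, src, nbytes);

	if (nbytes < BLOCK_SIZE)
		memcpy(d, dst, nbytes); /* Eric's simplification: dst already
					   points at the transformed bytes */
}
```

A caller with a 3-byte message passes it straight in: `safe_xor(out, in, 3)` transforms the 3 bytes via the bounce buffer, while a full 16-byte input skips the copies entirely and is transformed in place in the caller's buffers.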