Message ID | 20220712075031.29061-1-guozihua@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Herbert Xu |
Headers | show |
Series | [v2] arm64/crypto: poly1305 fix a read out-of-bound | expand |
On 2022/7/12 15:50, GUO Zihua wrote: > A kasan error was reported during fuzzing: > > BUG: KASAN: slab-out-of-bounds in neon_poly1305_blocks.constprop.0+0x1b4/0x250 [poly1305_neon] > Read of size 4 at addr ffff0010e293f010 by task syz-executor.5/1646715 > CPU: 4 PID: 1646715 Comm: syz-executor.5 Kdump: loaded Not tainted 5.10.0.aarch64 #1 > Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.59 01/31/2019 > Call trace: > dump_backtrace+0x0/0x394 > show_stack+0x34/0x4c arch/arm64/kernel/stacktrace.c:196 > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x158/0x1e4 lib/dump_stack.c:118 > print_address_description.constprop.0+0x68/0x204 mm/kasan/report.c:387 > __kasan_report+0xe0/0x140 mm/kasan/report.c:547 > kasan_report+0x44/0xe0 mm/kasan/report.c:564 > check_memory_region_inline mm/kasan/generic.c:187 [inline] > __asan_load4+0x94/0xd0 mm/kasan/generic.c:252 > neon_poly1305_blocks.constprop.0+0x1b4/0x250 [poly1305_neon] > neon_poly1305_do_update+0x6c/0x15c [poly1305_neon] > neon_poly1305_update+0x9c/0x1c4 [poly1305_neon] > crypto_shash_update crypto/shash.c:131 [inline] > shash_finup_unaligned+0x84/0x15c crypto/shash.c:179 > crypto_shash_finup+0x8c/0x140 crypto/shash.c:193 > shash_digest_unaligned+0xb8/0xe4 crypto/shash.c:201 > crypto_shash_digest+0xa4/0xfc crypto/shash.c:217 > crypto_shash_tfm_digest+0xb4/0x150 crypto/shash.c:229 > essiv_skcipher_setkey+0x164/0x200 [essiv] > crypto_skcipher_setkey+0xb0/0x160 crypto/skcipher.c:612 > skcipher_setkey+0x3c/0x50 crypto/algif_skcipher.c:305 > alg_setkey+0x114/0x2a0 crypto/af_alg.c:220 > alg_setsockopt+0x19c/0x210 crypto/af_alg.c:253 > __sys_setsockopt+0x190/0x2e0 net/socket.c:2123 > __do_sys_setsockopt net/socket.c:2134 [inline] > __se_sys_setsockopt net/socket.c:2131 [inline] > __arm64_sys_setsockopt+0x78/0x94 net/socket.c:2131 > __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] > invoke_syscall+0x64/0x100 arch/arm64/kernel/syscall.c:48 > el0_svc_common.constprop.0+0x220/0x230 arch/arm64/kernel/syscall.c:155 > do_el0_svc+0xb4/0xd4 arch/arm64/kernel/syscall.c:217 > el0_svc+0x24/0x3c arch/arm64/kernel/entry-common.c:353 > el0_sync_handler+0x160/0x164 arch/arm64/kernel/entry-common.c:369 > el0_sync+0x160/0x180 arch/arm64/kernel/entry.S:683 > > This error can be reproduced by the following code compiled as ko on a > system with kasan enabled: > > char test_data[] = "\x00\x01\x02\x03\x04\x05\x06\x07" > "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" > "\x10\x11\x12\x13\x14\x15\x16\x17" > "\x18\x19\x1a\x1b\x1c\x1d\x1e"; > > int init(void) > { > struct crypto_shash *tfm = NULL; > struct shash_desc *desc = NULL; > char *data = NULL; > > tfm = crypto_alloc_shash("poly1305", 0, 0); > desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL); > desc->tfm = tfm; > > data = kmalloc(POLY1305_KEY_SIZE - 1, GFP_KERNEL); > memcpy(data, test_data, POLY1305_KEY_SIZE - 1); > crypto_shash_update(desc, data, POLY1305_KEY_SIZE - 1); > crypto_shash_final(desc, data); > kfree(data); > return 0; > } > > void deinit(void) > { > } > > module_init(init) > module_exit(deinit) > MODULE_LICENSE("GPL"); > > The root cause of the bug sits in neon_poly1305_blocks. The logic > neon_poly1305_blocks() performed is that if it was called with both s[] > and r[] uninitialized, it will first try to initialize them with the > data from the first "block" that it believed to be 32 bytes in length. > First 16 bytes are used as the key and the next 16 bytes for s[]. This > would lead to the aforementioned read out-of-bound. However, after > calling poly1305_init_arch(), only 16 bytes were deducted from the input > and s[] is initialized yet again with the following 16 bytes. The second > initialization of s[] is certainly redundent which indicates that the > first initialization should be for r[] only. > > This patch fixes the issue by calling poly1305_init_arm64() instead of > poly1305_init_arch(). This is also the implementation for the same > algorithm on arm platform. > > Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") > Cc: stable@vger.kernel.org > Signed-off-by: GUO Zihua <guozihua@huawei.com> > --- > arch/arm64/crypto/poly1305-glue.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c > index 9c3d86e397bf..1fae18ba11ed 100644 > --- a/arch/arm64/crypto/poly1305-glue.c > +++ b/arch/arm64/crypto/poly1305-glue.c > @@ -52,7 +52,7 @@ static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, > { > if (unlikely(!dctx->sset)) { > if (!dctx->rset) { > - poly1305_init_arch(dctx, src); > + poly1305_init_arm64(&dctx->h, src); > src += POLY1305_BLOCK_SIZE; > len -= POLY1305_BLOCK_SIZE; > dctx->rset = 1; Friendly ping~
On 2022/7/12 15:50, GUO Zihua wrote: > A kasan error was reported during fuzzing: > > BUG: KASAN: slab-out-of-bounds in neon_poly1305_blocks.constprop.0+0x1b4/0x250 [poly1305_neon] > Read of size 4 at addr ffff0010e293f010 by task syz-executor.5/1646715 > CPU: 4 PID: 1646715 Comm: syz-executor.5 Kdump: loaded Not tainted 5.10.0.aarch64 #1 > Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.59 01/31/2019 > Call trace: > dump_backtrace+0x0/0x394 > show_stack+0x34/0x4c arch/arm64/kernel/stacktrace.c:196 > __dump_stack lib/dump_stack.c:77 [inline] > dump_stack+0x158/0x1e4 lib/dump_stack.c:118 > print_address_description.constprop.0+0x68/0x204 mm/kasan/report.c:387 > __kasan_report+0xe0/0x140 mm/kasan/report.c:547 > kasan_report+0x44/0xe0 mm/kasan/report.c:564 > check_memory_region_inline mm/kasan/generic.c:187 [inline] > __asan_load4+0x94/0xd0 mm/kasan/generic.c:252 > neon_poly1305_blocks.constprop.0+0x1b4/0x250 [poly1305_neon] > neon_poly1305_do_update+0x6c/0x15c [poly1305_neon] > neon_poly1305_update+0x9c/0x1c4 [poly1305_neon] > crypto_shash_update crypto/shash.c:131 [inline] > shash_finup_unaligned+0x84/0x15c crypto/shash.c:179 > crypto_shash_finup+0x8c/0x140 crypto/shash.c:193 > shash_digest_unaligned+0xb8/0xe4 crypto/shash.c:201 > crypto_shash_digest+0xa4/0xfc crypto/shash.c:217 > crypto_shash_tfm_digest+0xb4/0x150 crypto/shash.c:229 > essiv_skcipher_setkey+0x164/0x200 [essiv] > crypto_skcipher_setkey+0xb0/0x160 crypto/skcipher.c:612 > skcipher_setkey+0x3c/0x50 crypto/algif_skcipher.c:305 > alg_setkey+0x114/0x2a0 crypto/af_alg.c:220 > alg_setsockopt+0x19c/0x210 crypto/af_alg.c:253 > __sys_setsockopt+0x190/0x2e0 net/socket.c:2123 > __do_sys_setsockopt net/socket.c:2134 [inline] > __se_sys_setsockopt net/socket.c:2131 [inline] > __arm64_sys_setsockopt+0x78/0x94 net/socket.c:2131 > __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] > invoke_syscall+0x64/0x100 arch/arm64/kernel/syscall.c:48 > el0_svc_common.constprop.0+0x220/0x230 arch/arm64/kernel/syscall.c:155 > do_el0_svc+0xb4/0xd4 arch/arm64/kernel/syscall.c:217 > el0_svc+0x24/0x3c arch/arm64/kernel/entry-common.c:353 > el0_sync_handler+0x160/0x164 arch/arm64/kernel/entry-common.c:369 > el0_sync+0x160/0x180 arch/arm64/kernel/entry.S:683 > > This error can be reproduced by the following code compiled as ko on a > system with kasan enabled: > > char test_data[] = "\x00\x01\x02\x03\x04\x05\x06\x07" > "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" > "\x10\x11\x12\x13\x14\x15\x16\x17" > "\x18\x19\x1a\x1b\x1c\x1d\x1e"; > > int init(void) > { > struct crypto_shash *tfm = NULL; > struct shash_desc *desc = NULL; > char *data = NULL; > > tfm = crypto_alloc_shash("poly1305", 0, 0); > desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL); > desc->tfm = tfm; > > data = kmalloc(POLY1305_KEY_SIZE - 1, GFP_KERNEL); > memcpy(data, test_data, POLY1305_KEY_SIZE - 1); > crypto_shash_update(desc, data, POLY1305_KEY_SIZE - 1); > crypto_shash_final(desc, data); > kfree(data); > return 0; > } > > void deinit(void) > { > } > > module_init(init) > module_exit(deinit) > MODULE_LICENSE("GPL"); > > The root cause of the bug sits in neon_poly1305_blocks. The logic > neon_poly1305_blocks() performed is that if it was called with both s[] > and r[] uninitialized, it will first try to initialize them with the > data from the first "block" that it believed to be 32 bytes in length. > First 16 bytes are used as the key and the next 16 bytes for s[]. This > would lead to the aforementioned read out-of-bound. However, after > calling poly1305_init_arch(), only 16 bytes were deducted from the input > and s[] is initialized yet again with the following 16 bytes. The second > initialization of s[] is certainly redundent which indicates that the > first initialization should be for r[] only. > > This patch fixes the issue by calling poly1305_init_arm64() instead of > poly1305_init_arch(). This is also the implementation for the same > algorithm on arm platform. > > Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") > Cc: stable@vger.kernel.org > Signed-off-by: GUO Zihua <guozihua@huawei.com> > --- > arch/arm64/crypto/poly1305-glue.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c > index 9c3d86e397bf..1fae18ba11ed 100644 > --- a/arch/arm64/crypto/poly1305-glue.c > +++ b/arch/arm64/crypto/poly1305-glue.c > @@ -52,7 +52,7 @@ static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, > { > if (unlikely(!dctx->sset)) { > if (!dctx->rset) { > - poly1305_init_arch(dctx, src); > + poly1305_init_arm64(&dctx->h, src); > src += POLY1305_BLOCK_SIZE; > len -= POLY1305_BLOCK_SIZE; > dctx->rset = 1; Including Ard in the loop as they are the original uploader of this algo.
On Tue, Jul 12, 2022 at 03:50:31PM +0800, GUO Zihua wrote: > A kasan error was reported during fuzzing: [...] > This patch fixes the issue by calling poly1305_init_arm64() instead of > poly1305_init_arch(). This is also the implementation for the same > algorithm on arm platform. > > Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") > Cc: stable@vger.kernel.org > Signed-off-by: GUO Zihua <guozihua@huawei.com> > --- > arch/arm64/crypto/poly1305-glue.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) I'm not a crypto guy by any stretch of the imagination, but Ard is out at the moment and this looks like an important fix so I had a crack at reviewing it. > diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c > index 9c3d86e397bf..1fae18ba11ed 100644 > --- a/arch/arm64/crypto/poly1305-glue.c > +++ b/arch/arm64/crypto/poly1305-glue.c > @@ -52,7 +52,7 @@ static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, > { > if (unlikely(!dctx->sset)) { > if (!dctx->rset) { > - poly1305_init_arch(dctx, src); > + poly1305_init_arm64(&dctx->h, src); > src += POLY1305_BLOCK_SIZE; > len -= POLY1305_BLOCK_SIZE; > dctx->rset = 1; With this change, we no longer initialise dctx->buflen to 0 as part of the initialisation. Looking at neon_poly1305_do_update(), I'm a bit worried that we could land in the 'if (likely(len >= POLY1305_BLOCK_SIZE))' block, end up with len == 0 and fail to set dctx->buflen. Is this a problem, or is my ignorance showing? Will
On 2022/7/20 17:41, Will Deacon wrote: > On Tue, Jul 12, 2022 at 03:50:31PM +0800, GUO Zihua wrote: >> A kasan error was reported during fuzzing: > > [...] > >> This patch fixes the issue by calling poly1305_init_arm64() instead of >> poly1305_init_arch(). This is also the implementation for the same >> algorithm on arm platform. >> >> Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") >> Cc: stable@vger.kernel.org >> Signed-off-by: GUO Zihua <guozihua@huawei.com> >> --- >> arch/arm64/crypto/poly1305-glue.c | 2 +- >> 1 file changed, 1 insertion(+), 1 deletion(-) > > I'm not a crypto guy by any stretch of the imagination, but Ard is out > at the moment and this looks like an important fix so I had a crack at > reviewing it. > >> diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c >> index 9c3d86e397bf..1fae18ba11ed 100644 >> --- a/arch/arm64/crypto/poly1305-glue.c >> +++ b/arch/arm64/crypto/poly1305-glue.c >> @@ -52,7 +52,7 @@ static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, >> { >> if (unlikely(!dctx->sset)) { >> if (!dctx->rset) { >> - poly1305_init_arch(dctx, src); >> + poly1305_init_arm64(&dctx->h, src); >> src += POLY1305_BLOCK_SIZE; >> len -= POLY1305_BLOCK_SIZE; >> dctx->rset = 1; > > With this change, we no longer initialise dctx->buflen to 0 as part of the > initialisation. Looking at neon_poly1305_do_update(), I'm a bit worried > that we could land in the 'if (likely(len >= POLY1305_BLOCK_SIZE))' block, > end up with len == 0 and fail to set dctx->buflen. Is this a problem, or is > my ignorance showing? > > Will > . Thanks Will. I noticed this as well, but I leaved it out so that the behavior is the same as the implementation for arm. The buflen here seems to be used for maintaining any excessive data after the last block, and is zeroed during init. I am not sure why it should be zeroed again during key initialization. Maybe the thought was that the very first block of the data is always used for initializing rset and that is also considered to be the "initialization" process for the algorithm, thus the zeroing of buflen. I could be completely wrong though.
On Tue, Jul 12, 2022 at 03:50:31PM +0800, GUO Zihua wrote: > int init(void) > { > struct crypto_shash *tfm = NULL; > struct shash_desc *desc = NULL; > char *data = NULL; > > tfm = crypto_alloc_shash("poly1305", 0, 0); > desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL); > desc->tfm = tfm; > > data = kmalloc(POLY1305_KEY_SIZE - 1, GFP_KERNEL); > memcpy(data, test_data, POLY1305_KEY_SIZE - 1); > crypto_shash_update(desc, data, POLY1305_KEY_SIZE - 1); > crypto_shash_final(desc, data); > kfree(data); > return 0; > } This isn't actually a valid test case since it never calls crypto_shash_init(). So the behavior of this test is undefined both before and after this patch. The simplest way to write a correct test would be to use crypto_shash_tfm_digest(). Anyway, the bug is still real and this patch is still the correct fix, so it's good enough to add my reviewed-by: Reviewed-by: Eric Biggers <ebiggers@google.com> - Eric
On Wed, Jul 20, 2022 at 05:57:30PM +0800, Guozihua (Scott) wrote: > On 2022/7/20 17:41, Will Deacon wrote: > > On Tue, Jul 12, 2022 at 03:50:31PM +0800, GUO Zihua wrote: > > > A kasan error was reported during fuzzing: > > > > [...] > > > > > This patch fixes the issue by calling poly1305_init_arm64() instead of > > > poly1305_init_arch(). This is also the implementation for the same > > > algorithm on arm platform. > > > > > > Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") > > > Cc: stable@vger.kernel.org > > > Signed-off-by: GUO Zihua <guozihua@huawei.com> > > > --- > > > arch/arm64/crypto/poly1305-glue.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > I'm not a crypto guy by any stretch of the imagination, but Ard is out > > at the moment and this looks like an important fix so I had a crack at > > reviewing it. > > > > > diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c > > > index 9c3d86e397bf..1fae18ba11ed 100644 > > > --- a/arch/arm64/crypto/poly1305-glue.c > > > +++ b/arch/arm64/crypto/poly1305-glue.c > > > @@ -52,7 +52,7 @@ static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, > > > { > > > if (unlikely(!dctx->sset)) { > > > if (!dctx->rset) { > > > - poly1305_init_arch(dctx, src); > > > + poly1305_init_arm64(&dctx->h, src); > > > src += POLY1305_BLOCK_SIZE; > > > len -= POLY1305_BLOCK_SIZE; > > > dctx->rset = 1; > > > > With this change, we no longer initialise dctx->buflen to 0 as part of the > > initialisation. Looking at neon_poly1305_do_update(), I'm a bit worried > > that we could land in the 'if (likely(len >= POLY1305_BLOCK_SIZE))' block, > > end up with len == 0 and fail to set dctx->buflen. Is this a problem, or is > > my ignorance showing? > > > > Will > > . > > Thanks Will. > > I noticed this as well, but I leaved it out so that the behavior is the same > as the implementation for arm. The buflen here seems to be used for > maintaining any excessive data after the last block, and is zeroed during > init. I am not sure why it should be zeroed again during key initialization. > Maybe the thought was that the very first block of the data is always used > for initializing rset and that is also considered to be the "initialization" > process for the algorithm, thus the zeroing of buflen. I could be completely > wrong though. > buflen is initialized by neon_poly1305_init(), so there's no issue here. - Eric
On Wed, Jul 20, 2022 at 07:37:17PM -0700, Eric Biggers wrote: > On Wed, Jul 20, 2022 at 05:57:30PM +0800, Guozihua (Scott) wrote: > > On 2022/7/20 17:41, Will Deacon wrote: > > > On Tue, Jul 12, 2022 at 03:50:31PM +0800, GUO Zihua wrote: > > > > A kasan error was reported during fuzzing: > > > > > > [...] > > > > > > > This patch fixes the issue by calling poly1305_init_arm64() instead of > > > > poly1305_init_arch(). This is also the implementation for the same > > > > algorithm on arm platform. > > > > > > > > Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") > > > > Cc: stable@vger.kernel.org > > > > Signed-off-by: GUO Zihua <guozihua@huawei.com> > > > > --- > > > > arch/arm64/crypto/poly1305-glue.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > I'm not a crypto guy by any stretch of the imagination, but Ard is out > > > at the moment and this looks like an important fix so I had a crack at > > > reviewing it. > > > > > > > diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c > > > > index 9c3d86e397bf..1fae18ba11ed 100644 > > > > --- a/arch/arm64/crypto/poly1305-glue.c > > > > +++ b/arch/arm64/crypto/poly1305-glue.c > > > > @@ -52,7 +52,7 @@ static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, > > > > { > > > > if (unlikely(!dctx->sset)) { > > > > if (!dctx->rset) { > > > > - poly1305_init_arch(dctx, src); > > > > + poly1305_init_arm64(&dctx->h, src); > > > > src += POLY1305_BLOCK_SIZE; > > > > len -= POLY1305_BLOCK_SIZE; > > > > dctx->rset = 1; > > > > > > With this change, we no longer initialise dctx->buflen to 0 as part of the > > > initialisation. Looking at neon_poly1305_do_update(), I'm a bit worried > > > that we could land in the 'if (likely(len >= POLY1305_BLOCK_SIZE))' block, > > > end up with len == 0 and fail to set dctx->buflen. Is this a problem, or is > > > my ignorance showing? > > > > > > Will > > > . > > > > Thanks Will. > > > > I noticed this as well, but I leaved it out so that the behavior is the same > > as the implementation for arm. The buflen here seems to be used for > > maintaining any excessive data after the last block, and is zeroed during > > init. I am not sure why it should be zeroed again during key initialization. > > Maybe the thought was that the very first block of the data is always used > > for initializing rset and that is also considered to be the "initialization" > > process for the algorithm, thus the zeroing of buflen. I could be completely > > wrong though. > > > > buflen is initialized by neon_poly1305_init(), so there's no issue here. Ah yes, thanks. I missed that. In which case, for the very little it's worth: Acked-by: Will Deacon <will@kernel.org> Herbert, please can you pick this up? Will
On 2022/7/21 17:28, Will Deacon wrote: > On Wed, Jul 20, 2022 at 07:37:17PM -0700, Eric Biggers wrote: >> On Wed, Jul 20, 2022 at 05:57:30PM +0800, Guozihua (Scott) wrote: >>> On 2022/7/20 17:41, Will Deacon wrote: >>>> On Tue, Jul 12, 2022 at 03:50:31PM +0800, GUO Zihua wrote: >>>>> A kasan error was reported during fuzzing: >>>> >>>> [...] >>>> >>>>> This patch fixes the issue by calling poly1305_init_arm64() instead of >>>>> poly1305_init_arch(). This is also the implementation for the same >>>>> algorithm on arm platform. >>>>> >>>>> Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") >>>>> Cc: stable@vger.kernel.org >>>>> Signed-off-by: GUO Zihua <guozihua@huawei.com> >>>>> --- >>>>> arch/arm64/crypto/poly1305-glue.c | 2 +- >>>>> 1 file changed, 1 insertion(+), 1 deletion(-) >>>> >>>> I'm not a crypto guy by any stretch of the imagination, but Ard is out >>>> at the moment and this looks like an important fix so I had a crack at >>>> reviewing it. >>>> >>>>> diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c >>>>> index 9c3d86e397bf..1fae18ba11ed 100644 >>>>> --- a/arch/arm64/crypto/poly1305-glue.c >>>>> +++ b/arch/arm64/crypto/poly1305-glue.c >>>>> @@ -52,7 +52,7 @@ static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, >>>>> { >>>>> if (unlikely(!dctx->sset)) { >>>>> if (!dctx->rset) { >>>>> - poly1305_init_arch(dctx, src); >>>>> + poly1305_init_arm64(&dctx->h, src); >>>>> src += POLY1305_BLOCK_SIZE; >>>>> len -= POLY1305_BLOCK_SIZE; >>>>> dctx->rset = 1; >>>> >>>> With this change, we no longer initialise dctx->buflen to 0 as part of the >>>> initialisation. Looking at neon_poly1305_do_update(), I'm a bit worried >>>> that we could land in the 'if (likely(len >= POLY1305_BLOCK_SIZE))' block, >>>> end up with len == 0 and fail to set dctx->buflen. Is this a problem, or is >>>> my ignorance showing? >>>> >>>> Will >>>> . >>> >>> Thanks Will. >>> >>> I noticed this as well, but I leaved it out so that the behavior is the same >>> as the implementation for arm. The buflen here seems to be used for >>> maintaining any excessive data after the last block, and is zeroed during >>> init. I am not sure why it should be zeroed again during key initialization. >>> Maybe the thought was that the very first block of the data is always used >>> for initializing rset and that is also considered to be the "initialization" >>> process for the algorithm, thus the zeroing of buflen. I could be completely >>> wrong though. >>> >> >> buflen is initialized by neon_poly1305_init(), so there's no issue here. > > Ah yes, thanks. I missed that. In which case, for the very little it's > worth: > > Acked-by: Will Deacon <will@kernel.org> > > Herbert, please can you pick this up? > > Will > . Thanks Will. Herbert, would you please wait for a v3 patch? I'll update the reproduce example based on Eric's comment.
diff --git a/arch/arm64/crypto/poly1305-glue.c b/arch/arm64/crypto/poly1305-glue.c index 9c3d86e397bf..1fae18ba11ed 100644 --- a/arch/arm64/crypto/poly1305-glue.c +++ b/arch/arm64/crypto/poly1305-glue.c @@ -52,7 +52,7 @@ static void neon_poly1305_blocks(struct poly1305_desc_ctx *dctx, const u8 *src, { if (unlikely(!dctx->sset)) { if (!dctx->rset) { - poly1305_init_arch(dctx, src); + poly1305_init_arm64(&dctx->h, src); src += POLY1305_BLOCK_SIZE; len -= POLY1305_BLOCK_SIZE; dctx->rset = 1;
A kasan error was reported during fuzzing: BUG: KASAN: slab-out-of-bounds in neon_poly1305_blocks.constprop.0+0x1b4/0x250 [poly1305_neon] Read of size 4 at addr ffff0010e293f010 by task syz-executor.5/1646715 CPU: 4 PID: 1646715 Comm: syz-executor.5 Kdump: loaded Not tainted 5.10.0.aarch64 #1 Hardware name: Huawei TaiShan 2280 /BC11SPCD, BIOS 1.59 01/31/2019 Call trace: dump_backtrace+0x0/0x394 show_stack+0x34/0x4c arch/arm64/kernel/stacktrace.c:196 __dump_stack lib/dump_stack.c:77 [inline] dump_stack+0x158/0x1e4 lib/dump_stack.c:118 print_address_description.constprop.0+0x68/0x204 mm/kasan/report.c:387 __kasan_report+0xe0/0x140 mm/kasan/report.c:547 kasan_report+0x44/0xe0 mm/kasan/report.c:564 check_memory_region_inline mm/kasan/generic.c:187 [inline] __asan_load4+0x94/0xd0 mm/kasan/generic.c:252 neon_poly1305_blocks.constprop.0+0x1b4/0x250 [poly1305_neon] neon_poly1305_do_update+0x6c/0x15c [poly1305_neon] neon_poly1305_update+0x9c/0x1c4 [poly1305_neon] crypto_shash_update crypto/shash.c:131 [inline] shash_finup_unaligned+0x84/0x15c crypto/shash.c:179 crypto_shash_finup+0x8c/0x140 crypto/shash.c:193 shash_digest_unaligned+0xb8/0xe4 crypto/shash.c:201 crypto_shash_digest+0xa4/0xfc crypto/shash.c:217 crypto_shash_tfm_digest+0xb4/0x150 crypto/shash.c:229 essiv_skcipher_setkey+0x164/0x200 [essiv] crypto_skcipher_setkey+0xb0/0x160 crypto/skcipher.c:612 skcipher_setkey+0x3c/0x50 crypto/algif_skcipher.c:305 alg_setkey+0x114/0x2a0 crypto/af_alg.c:220 alg_setsockopt+0x19c/0x210 crypto/af_alg.c:253 __sys_setsockopt+0x190/0x2e0 net/socket.c:2123 __do_sys_setsockopt net/socket.c:2134 [inline] __se_sys_setsockopt net/socket.c:2131 [inline] __arm64_sys_setsockopt+0x78/0x94 net/socket.c:2131 __invoke_syscall arch/arm64/kernel/syscall.c:36 [inline] invoke_syscall+0x64/0x100 arch/arm64/kernel/syscall.c:48 el0_svc_common.constprop.0+0x220/0x230 arch/arm64/kernel/syscall.c:155 do_el0_svc+0xb4/0xd4 arch/arm64/kernel/syscall.c:217 el0_svc+0x24/0x3c arch/arm64/kernel/entry-common.c:353 el0_sync_handler+0x160/0x164 arch/arm64/kernel/entry-common.c:369 el0_sync+0x160/0x180 arch/arm64/kernel/entry.S:683 This error can be reproduced by the following code compiled as ko on a system with kasan enabled: char test_data[] = "\x00\x01\x02\x03\x04\x05\x06\x07" "\x08\x09\x0a\x0b\x0c\x0d\x0e\x0f" "\x10\x11\x12\x13\x14\x15\x16\x17" "\x18\x19\x1a\x1b\x1c\x1d\x1e"; int init(void) { struct crypto_shash *tfm = NULL; struct shash_desc *desc = NULL; char *data = NULL; tfm = crypto_alloc_shash("poly1305", 0, 0); desc = kmalloc(sizeof(*desc) + crypto_shash_descsize(tfm), GFP_KERNEL); desc->tfm = tfm; data = kmalloc(POLY1305_KEY_SIZE - 1, GFP_KERNEL); memcpy(data, test_data, POLY1305_KEY_SIZE - 1); crypto_shash_update(desc, data, POLY1305_KEY_SIZE - 1); crypto_shash_final(desc, data); kfree(data); return 0; } void deinit(void) { } module_init(init) module_exit(deinit) MODULE_LICENSE("GPL"); The root cause of the bug sits in neon_poly1305_blocks. The logic neon_poly1305_blocks() performed is that if it was called with both s[] and r[] uninitialized, it will first try to initialize them with the data from the first "block" that it believed to be 32 bytes in length. First 16 bytes are used as the key and the next 16 bytes for s[]. This would lead to the aforementioned read out-of-bound. However, after calling poly1305_init_arch(), only 16 bytes were deducted from the input and s[] is initialized yet again with the following 16 bytes. The second initialization of s[] is certainly redundent which indicates that the first initialization should be for r[] only. This patch fixes the issue by calling poly1305_init_arm64() instead of poly1305_init_arch(). This is also the implementation for the same algorithm on arm platform. Fixes: f569ca164751 ("crypto: arm64/poly1305 - incorporate OpenSSL/CRYPTOGAMS NEON implementation") Cc: stable@vger.kernel.org Signed-off-by: GUO Zihua <guozihua@huawei.com> --- arch/arm64/crypto/poly1305-glue.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)