Message ID | 20181207223449.38808-1-cai@lca.pw (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | arm64: increase stack size for KASAN_EXTRA | expand |
On Fri, Dec 7, 2018 at 11:35 PM Qian Cai <cai@lca.pw> wrote: > > If the kernel is configured with KASAN_EXTRA, the stack size is > increasted significantly due to enable this option will set > -fstack-reuse to "none" in GCC [1]. As the results, it could trigger > stack overrun quite often with 32k stack size compiled using GCC 8. For > example, this reproducer > > size > 7536 shrink_inactive_list > 7440 shrink_page_list > 6560 fscache_stats_show > 3920 jbd2_journal_commit_transaction > 3216 try_to_unmap_one > 3072 migrate_page_move_mapping > 3584 migrate_misplaced_transhuge_page > 3920 ip_vs_lblcr_schedule > 4304 lpfc_nvme_info_show > 3888 lpfc_debugfs_nvmestat_data.constprop > > There are other 49 functions are over 2k in size while compiling kernel > with "-Wframe-larger-than=" on this machine. Hence, it is too much work > to change Makefiles for each object to compile without > -fsanitize-address-use-after-scope individually. > > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23 Could you clarify: are the numbers you see with or without the bugfix from that bugzilla? Arnd
On Tue, 2018-12-11 at 13:42 +0100, Arnd Bergmann wrote: > On Fri, Dec 7, 2018 at 11:35 PM Qian Cai <cai@lca.pw> wrote: > > > > If the kernel is configured with KASAN_EXTRA, the stack size is > > increasted significantly due to enable this option will set > > -fstack-reuse to "none" in GCC [1]. As the results, it could trigger > > stack overrun quite often with 32k stack size compiled using GCC 8. For > > example, this reproducer > > > > size > > 7536 shrink_inactive_list > > 7440 shrink_page_list > > 6560 fscache_stats_show > > 3920 jbd2_journal_commit_transaction > > 3216 try_to_unmap_one > > 3072 migrate_page_move_mapping > > 3584 migrate_misplaced_transhuge_page > > 3920 ip_vs_lblcr_schedule > > 4304 lpfc_nvme_info_show > > 3888 lpfc_debugfs_nvmestat_data.constprop > > > > There are other 49 functions are over 2k in size while compiling kernel > > with "-Wframe-larger-than=" on this machine. Hence, it is too much work > > to change Makefiles for each object to compile without > > -fsanitize-address-use-after-scope individually. > > > > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23 > > Could you clarify: are the numbers you see with or without the bugfix > from that bugzilla? > The numbers were from GCC8 which does NOT contain this patch [1]. GCC9 is awesome which reduced the numbers in half even for KASAN_EXTRA. Only thing is that GCC9 has not been officially released yet, so it is a bit inconvenient for users need to compile the compiler by themselves first. I am fine either way to drop this patch or keep it until GCC9 is GA. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c35
On Tue, Dec 11, 2018 at 6:18 PM Qian Cai <cai@lca.pw> wrote: > > On Tue, 2018-12-11 at 13:42 +0100, Arnd Bergmann wrote: > > On Fri, Dec 7, 2018 at 11:35 PM Qian Cai <cai@lca.pw> wrote: > > > > > > If the kernel is configured with KASAN_EXTRA, the stack size is > > > increasted significantly due to enable this option will set > > > -fstack-reuse to "none" in GCC [1]. As the results, it could trigger > > > stack overrun quite often with 32k stack size compiled using GCC 8. For > > > example, this reproducer > > > > > > size > > > 7536 shrink_inactive_list > > > 7440 shrink_page_list > > > 6560 fscache_stats_show > > > 3920 jbd2_journal_commit_transaction > > > 3216 try_to_unmap_one > > > 3072 migrate_page_move_mapping > > > 3584 migrate_misplaced_transhuge_page > > > 3920 ip_vs_lblcr_schedule > > > 4304 lpfc_nvme_info_show > > > 3888 lpfc_debugfs_nvmestat_data.constprop > > > > > > There are other 49 functions are over 2k in size while compiling kernel > > > with "-Wframe-larger-than=" on this machine. Hence, it is too much work > > > to change Makefiles for each object to compile without > > > -fsanitize-address-use-after-scope individually. > > > > > > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23 > > > > Could you clarify: are the numbers you see with or without the bugfix > > from that bugzilla? > > > > The numbers were from GCC8 which does NOT contain this patch [1]. > > GCC9 is awesome which reduced the numbers in half even for KASAN_EXTRA. Only > thing is that GCC9 has not been officially released yet, so it is a bit > inconvenient for users need to compile the compiler by themselves first. > > I am fine either way to drop this patch or keep it until GCC9 is GA. > > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c35 Maybe we can make the constant depend on the compiler version? It may also be possible to reduce the KASAN_THREAD_SHIFT constant for the normal case with gcc-9 and go back to the default frame size then. Arnd
On Tue, 2018-12-11 at 22:43 +0100, Arnd Bergmann wrote: > On Tue, Dec 11, 2018 at 6:18 PM Qian Cai <cai@lca.pw> wrote: > > > > On Tue, 2018-12-11 at 13:42 +0100, Arnd Bergmann wrote: > > > On Fri, Dec 7, 2018 at 11:35 PM Qian Cai <cai@lca.pw> wrote: > > > > > > > > If the kernel is configured with KASAN_EXTRA, the stack size is > > > > increasted significantly due to enable this option will set > > > > -fstack-reuse to "none" in GCC [1]. As the results, it could trigger > > > > stack overrun quite often with 32k stack size compiled using GCC 8. For > > > > example, this reproducer > > > > > > > > size > > > > 7536 shrink_inactive_list > > > > 7440 shrink_page_list > > > > 6560 fscache_stats_show > > > > 3920 jbd2_journal_commit_transaction > > > > 3216 try_to_unmap_one > > > > 3072 migrate_page_move_mapping > > > > 3584 migrate_misplaced_transhuge_page > > > > 3920 ip_vs_lblcr_schedule > > > > 4304 lpfc_nvme_info_show > > > > 3888 lpfc_debugfs_nvmestat_data.constprop > > > > > > > > There are other 49 functions are over 2k in size while compiling kernel > > > > with "-Wframe-larger-than=" on this machine. Hence, it is too much work > > > > to change Makefiles for each object to compile without > > > > -fsanitize-address-use-after-scope individually. > > > > > > > > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23 > > > > > > Could you clarify: are the numbers you see with or without the bugfix > > > from that bugzilla? > > > > > > > The numbers were from GCC8 which does NOT contain this patch [1]. > > > > GCC9 is awesome which reduced the numbers in half even for KASAN_EXTRA. Only > > thing is that GCC9 has not been officially released yet, so it is a bit > > inconvenient for users need to compile the compiler by themselves first. > > > > I am fine either way to drop this patch or keep it until GCC9 is GA. > > > > [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c35 > > Maybe we can make the constant depend on the compiler version? I am not too keen to do the version-check considering some LTS versions could just back-port those patches and the render the version-check incorrectly. > It may also be possible to reduce the KASAN_THREAD_SHIFT > constant for the normal case with gcc-9 and go back to the > default frame size then.
On Tue, Dec 11, 2018 at 10:52 PM Qian Cai <cai@lca.pw> wrote: > On Tue, 2018-12-11 at 22:43 +0100, Arnd Bergmann wrote: > > On Tue, Dec 11, 2018 at 6:18 PM Qian Cai <cai@lca.pw> wrote: > I am not too keen to do the version-check considering some LTS versions could > just back-port those patches and the render the version-check incorrectly. I'm not following what the problem is. Do you mean distro versions gcc with the compiler bugfix, or LTS kernel versions? For backported kernel fixes, this doesn't seem to be any different from other kernel changes that might be incorrect, but a straight backport of that kernel patch should still do the right thing in older kernels, and set the frame size higher for old compilers but not new ones. Arnd
On Tue, 2018-12-11 at 22:56 +0100, Arnd Bergmann wrote: > On Tue, Dec 11, 2018 at 10:52 PM Qian Cai <cai@lca.pw> wrote: > > On Tue, 2018-12-11 at 22:43 +0100, Arnd Bergmann wrote: > > > On Tue, Dec 11, 2018 at 6:18 PM Qian Cai <cai@lca.pw> wrote: > > I am not too keen to do the version-check considering some LTS versions > > could > > just back-port those patches and the render the version-check incorrectly. > > I'm not following what the problem is. Do you mean distro versions gcc > with the compiler bugfix, or LTS kernel versions? > I mean distro versions of GCC where the version is still 8 but keep back-porting tons of patches.
On Tue, Dec 11, 2018 at 10:59 PM Qian Cai <cai@lca.pw> wrote: > > On Tue, 2018-12-11 at 22:56 +0100, Arnd Bergmann wrote: > > On Tue, Dec 11, 2018 at 10:52 PM Qian Cai <cai@lca.pw> wrote: > > > On Tue, 2018-12-11 at 22:43 +0100, Arnd Bergmann wrote: > > > > On Tue, Dec 11, 2018 at 6:18 PM Qian Cai <cai@lca.pw> wrote: > > > I am not too keen to do the version-check considering some LTS versions > > > could > > > just back-port those patches and the render the version-check incorrectly. > > > > I'm not following what the problem is. Do you mean distro versions gcc > > with the compiler bugfix, or LTS kernel versions? > > > > I mean distro versions of GCC where the version is still 8 but keep back-porting > tons of patches. Ok, but in that case, checking the version would still be no worse than your current patch, the only difference is that for users of a fixed older gcc, the kernel would use more stack than it needs. Arnd
On Tue, 2018-12-11 at 23:12 +0100, Arnd Bergmann wrote: > On Tue, Dec 11, 2018 at 10:59 PM Qian Cai <cai@lca.pw> wrote: > > > > On Tue, 2018-12-11 at 22:56 +0100, Arnd Bergmann wrote: > > > On Tue, Dec 11, 2018 at 10:52 PM Qian Cai <cai@lca.pw> wrote: > > > > On Tue, 2018-12-11 at 22:43 +0100, Arnd Bergmann wrote: > > > > > On Tue, Dec 11, 2018 at 6:18 PM Qian Cai <cai@lca.pw> wrote: > > > > > > > > I am not too keen to do the version-check considering some LTS versions > > > > could > > > > just back-port those patches and the render the version-check > > > > incorrectly. > > > > > > I'm not following what the problem is. Do you mean distro versions gcc > > > with the compiler bugfix, or LTS kernel versions? > > > > > > > I mean distro versions of GCC where the version is still 8 but keep back- > > porting > > tons of patches. > > Ok, but in that case, checking the version would still be no worse > than your current patch, the only difference is that for users of a > fixed older gcc, the kernel would use more stack than it needs. > I am thinking about something it is probably best just waiting for those major distors to complete upgrading to GCC9 or back-porting those stack reduction patches first. Then, it is good time to tie up loose ends for those default stack sizes in all combinations.
On Tue, Dec 11, 2018 at 11:22 PM Qian Cai <cai@lca.pw> wrote: > > On Tue, 2018-12-11 at 23:12 +0100, Arnd Bergmann wrote: > > On Tue, Dec 11, 2018 at 10:59 PM Qian Cai <cai@lca.pw> wrote: > > > > > > On Tue, 2018-12-11 at 22:56 +0100, Arnd Bergmann wrote: > > > > On Tue, Dec 11, 2018 at 10:52 PM Qian Cai <cai@lca.pw> wrote: > > > > > On Tue, 2018-12-11 at 22:43 +0100, Arnd Bergmann wrote: > > > > > > On Tue, Dec 11, 2018 at 6:18 PM Qian Cai <cai@lca.pw> wrote: > > > > > > > > > > I am not too keen to do the version-check considering some LTS versions > > > > > could > > > > > just back-port those patches and the render the version-check > > > > > incorrectly. > > > > > > > > I'm not following what the problem is. Do you mean distro versions gcc > > > > with the compiler bugfix, or LTS kernel versions? > > > > > > > > > > I mean distro versions of GCC where the version is still 8 but keep back- > > > porting > > > tons of patches. > > > > Ok, but in that case, checking the version would still be no worse > > than your current patch, the only difference is that for users of a > > fixed older gcc, the kernel would use more stack than it needs. > > > > I am thinking about something it is probably best just waiting for those major > distors to complete upgrading to GCC9 or back-porting those stack reduction > patches first. Then, it is good time to tie up loose ends for those default > stack sizes in all combinations. I was basically trying to make sure we don't forget it when it gets to that. Another alternative would be to just disable KASAN_EXTRA now for gcc versions before 9, which essentially means for everyone, but then we get it back once a working version gets released. As I understand, this kasan option is actually fairly useless given its cost, so very few people would miss it. On a related note, I think we have to turn off asan-stack entirely on all released clang versions. asan-stack in general is much more useful than the use-after-scope check, but we clang produces some very large stack frames with it and we probably can't even work around it with KASAN_THREAD_SHIFT=2 but would need even more than that otherwise. Arnd
On 12/11/18 6:06 PM, Arnd Bergmann wrote: >> I am thinking about something it is probably best just waiting for those major >> distors to complete upgrading to GCC9 or back-porting those stack reduction >> patches first. Then, it is good time to tie up loose ends for those default >> stack sizes in all combinations. > > I was basically trying to make sure we don't forget it when it gets to that. I added a reminder in my calendar to check the GCC9 adoption in Q2 FY19. > > Another alternative would be to just disable KASAN_EXTRA now > for gcc versions before 9, which essentially means for everyone, > but then we get it back once a working version gets released. As > I understand, this kasan option is actually fairly useless given its > cost, so very few people would miss it. > > On a related note, I think we have to turn off asan-stack entirely > on all released clang versions. asan-stack in general is much more > useful than the use-after-scope check, but we clang produces some > very large stack frames with it and we probably can't even work > around it with KASAN_THREAD_SHIFT=2 but would need even > more than that otherwise. > > Arnd >
diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index b96442960aea..56562ff01076 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -76,12 +76,17 @@ /* * KASAN requires 1/8th of the kernel virtual address space for the shadow * region. KASAN can bloat the stack significantly, so double the (minimum) - * stack size when KASAN is in use. + * stack size when KASAN is in use, and then double it again if KASAN_EXTRA is + * on. */ #ifdef CONFIG_KASAN #define KASAN_SHADOW_SCALE_SHIFT 3 #define KASAN_SHADOW_SIZE (UL(1) << (VA_BITS - KASAN_SHADOW_SCALE_SHIFT)) +#ifdef CONFIG_KASAN_EXTRA +#define KASAN_THREAD_SHIFT 2 +#else #define KASAN_THREAD_SHIFT 1 +#endif /* CONFIG_KASAN_EXTRA */ #else #define KASAN_SHADOW_SIZE (0) #define KASAN_THREAD_SHIFT 0
If the kernel is configured with KASAN_EXTRA, the stack size is increasted significantly due to enable this option will set -fstack-reuse to "none" in GCC [1]. As the results, it could trigger stack overrun quite often with 32k stack size compiled using GCC 8. For example, this reproducer https://github.com/linux-test-project/ltp/blob/master/testcases/kernel/\ syscalls/madvise/madvise06.c could trigger a "corrupted stack end detected inside scheduler" very reliably with CONFIG_SCHED_STACK_END_CHECK enabled. Also, See other bug reports, https://lore.kernel.org/lkml/1542144497.12945.29.camel@gmx.us/ https://lore.kernel.org/lkml/721E7B42-2D55-4866-9C1A-3E8D64F33F9C@gmx.us/ There are just too many functions that could have a large stack with KASAN_EXTRA due to large local variables that have been called over and over again without being able to reuse the stacks. Some noticiable ones are, size 7536 shrink_inactive_list 7440 shrink_page_list 6560 fscache_stats_show 3920 jbd2_journal_commit_transaction 3216 try_to_unmap_one 3072 migrate_page_move_mapping 3584 migrate_misplaced_transhuge_page 3920 ip_vs_lblcr_schedule 4304 lpfc_nvme_info_show 3888 lpfc_debugfs_nvmestat_data.constprop There are other 49 functions are over 2k in size while compiling kernel with "-Wframe-larger-than=" on this machine. Hence, it is too much work to change Makefiles for each object to compile without -fsanitize-address-use-after-scope individually. [1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81715#c23 Signed-off-by: Qian Cai <cai@lca.pw> --- arch/arm64/include/asm/memory.h | 7 ++++++- 1 file changed, 6 insertions(+), 1 deletion(-)