Message ID | 20200810072115.429-1-walter-zh.wu@mediatek.com (mailing list archive) |
---|---|
Series | kasan: add workqueue and timer stack for generic KASAN |
> On Aug 10, 2020, at 3:21 AM, Walter Wu <walter-zh.wu@mediatek.com> wrote:
>
> Syzbot reports many UAF issues for workqueues and timers, see [1] and [2].
> In some of these, the access/allocation happened in process_one_work(),
> and the free stack in the KASAN report is useless: it doesn't help
> programmers solve a UAF on a workqueue. The same may hold for timers.
>
> This patchset improves KASAN reports by adding workqueue queueing stack
> and timer queueing stack information. It is useful for programmers
> solving use-after-free or double-free memory issues.
>
> Generic KASAN will record the last two workqueue and timer stacks and
> print them in the KASAN report. It is only suitable for generic KASAN.
>
> In order to print the last two workqueue and timer stacks, we add new
> members to struct kasan_alloc_meta:
> - two workqueue queueing stacks, total size is 8 bytes.
> - two timer queueing stacks, total size is 8 bytes.
>
> The original struct kasan_alloc_meta size is 16 bytes. After adding the
> new members, the struct kasan_alloc_meta total size is 32 bytes. It is
> a good number for alignment, which helps memory consumption.

Making a debugging tool complicated is surely the best way to kill it. I would argue that it only makes sense to complicate it if the addition is useful most of the time, which I have never felt or heard to be the case. This reminds me of your recent call_rcu() stacks, which most of the time just make parsing the report cumbersome. Thus, I urge that this exercise of over-engineering for special cases stop entirely.

> [1] https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22+process_one_work
> [2] https://groups.google.com/g/syzkaller-bugs/search?q=%22use-after-free%22%20expire_timers
> [3] https://bugzilla.kernel.org/show_bug.cgi?id=198437
>
> Walter Wu (5):
>   timer: kasan: record and print timer stack
>   workqueue: kasan: record and print workqueue stack
>   lib/test_kasan.c: add timer test case
>   lib/test_kasan.c: add workqueue test case
>   kasan: update documentation for generic kasan
>
>  Documentation/dev-tools/kasan.rst |  4 ++--
>  include/linux/kasan.h             |  4 ++++
>  kernel/time/timer.c               |  2 ++
>  kernel/workqueue.c                |  3 +++
>  lib/test_kasan.c                  | 54 ++++++++++++++++++++++++++++++++++++++++++++++++++++++
>  mm/kasan/generic.c                | 42 ++++++++++++++++++++++++++++++++++++++++++
>  mm/kasan/kasan.h                  |  6 +++++-
>  mm/kasan/report.c                 | 22 ++++++++++++++++++++++
>  8 files changed, 134 insertions(+), 3 deletions(-)

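The size arithmetic quoted from the cover letter (16 bytes growing to 32 bytes) can be checked with a small user-space sketch. This is only an illustration under stated assumptions, not the layout from the patch: it assumes a stack handle is a 32-bit value, that the existing 16 bytes are an allocation track plus two auxiliary stack handles, and the member names `wq_stack` and `timer_stack` are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Assumption: a stack handle is a 32-bit identifier. */
typedef uint32_t stack_handle_t;

/* Assumed shape of an allocation track: a pid plus one stack handle. */
struct track_sketch {
	uint32_t pid;
	stack_handle_t stack;
};

/* Assumed layout before the series: 16 bytes, as the cover letter states. */
struct alloc_meta_before {
	struct track_sketch alloc_track;  /* 8 bytes */
	stack_handle_t aux_stack[2];      /* 8 bytes */
};

/* Layout with the two additions described in the cover letter. */
struct alloc_meta_after {
	struct track_sketch alloc_track;  /* 8 bytes */
	stack_handle_t aux_stack[2];      /* 8 bytes */
	stack_handle_t wq_stack[2];       /* hypothetical name, 8 bytes */
	stack_handle_t timer_stack[2];    /* hypothetical name, 8 bytes */
};

/* Compile-time checks of the sizes quoted in the thread. */
_Static_assert(sizeof(struct alloc_meta_before) == 16, "expected 16 bytes");
_Static_assert(sizeof(struct alloc_meta_after) == 32, "expected 32 bytes");

/* Extra metadata bytes per tracked object under these assumptions. */
static inline size_t alloc_meta_growth(void)
{
	return sizeof(struct alloc_meta_after) - sizeof(struct alloc_meta_before);
}
```

Under these assumptions the metadata doubles per tracked object, which is the cost side of the trade-off debated in the replies below.
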
On Mon, 2020-08-10 at 07:19 -0400, Qian Cai wrote:
> > On Aug 10, 2020, at 3:21 AM, Walter Wu <walter-zh.wu@mediatek.com> wrote:
> >
> > [...]
>
> Making a debugging tool complicated is surely the best way to kill it. I would argue that it only makes sense to complicate it if the addition is useful most of the time, which I have never felt or heard to be the case. This reminds me of your recent call_rcu() stacks, which most of the time just make parsing the report cumbersome. Thus, I urge that this exercise of over-engineering for special cases stop entirely.

A good debug tool should have complete information in order to solve the issue. We should focus on whether KASAN reports always show this debug information, or whether to create an option that decides if it is shown. This feature is Dimitry's suggestion, see [1], so I think it needs to be implemented. Maybe we can wait for his response.

[1] https://lkml.org/lkml/2020/6/23/256

Thanks.

On Mon, 2020-08-10 at 19:50 +0800, Walter Wu wrote:
> On Mon, 2020-08-10 at 07:19 -0400, Qian Cai wrote:
> > [...]
>
> A good debug tool should have complete information in order to solve
> the issue. We should focus on whether KASAN reports always show this
> debug information, or whether to create an option that decides if it
> is shown. This feature is Dmitry's suggestion, see [1], so I think it
> needs to be implemented. Maybe we can wait for his response.
>
> [1] https://lkml.org/lkml/2020/6/23/256
>
> Thanks.

Fixed the name typo; my apologies to him. I am also adding a bugzilla entry that shows why we need to do this, please see [1].

[1] https://bugzilla.kernel.org/show_bug.cgi?id=198437

On Mon, Aug 10, 2020 at 07:50:57PM +0800, Walter Wu wrote:
> On Mon, 2020-08-10 at 07:19 -0400, Qian Cai wrote:
> > [...]
>
> A good debug tool should have complete information in order to solve
> the issue. We should focus on whether KASAN reports always show this
> debug information, or whether to create an option that decides if it
> is shown. This feature is Dimitry's suggestion, see [1], so I think it
> needs to be implemented. Maybe we can wait for his response.
>
> [1] https://lkml.org/lkml/2020/6/23/256

I don't know if it is Dmitry's pipe dream that every KASAN report would enable developers to fix the bug without reproducing it. There is always an ongoing struggle between making the kernel easier to debug and keeping things less cumbersome.

On the other hand, Dmitry's suggestion makes sense only if the price we are going to pay is fair. Given the current diffstat and my recent experience, as a heavy KASAN user myself, of the call_rcu() stacks "wasting" screen space, I can't really get excited about pushing the limit again.

On Mon, 2020-08-10 at 08:44 -0400, Qian Cai wrote:
> On Mon, Aug 10, 2020 at 07:50:57PM +0800, Walter Wu wrote:
> > [...]
>
> I don't know if it is Dmitry's pipe dream that every KASAN report would enable
> developers to fix the bug without reproducing it. There is always an ongoing
> struggle between making the kernel easier to debug and keeping things less
> cumbersome.
>
> On the other hand, Dmitry's suggestion makes sense only if the price we are
> going to pay is fair. Given the current diffstat and my recent experience, as
> a heavy KASAN user myself, of the call_rcu() stacks "wasting" screen space, I
> can't really get excited about pushing the limit again.

If you are concerned that the report is long, maybe we can create an option that lets the user decide whether to print them (including call_rcu). Would that satisfy everyone?

On Mon, Aug 10, 2020 at 10:31:22PM +0800, Walter Wu wrote:
> On Mon, 2020-08-10 at 08:44 -0400, Qian Cai wrote:
> > [...]
>
> If you are concerned that the report is long, maybe we can create an
> option that lets the user decide whether to print them (including
> call_rcu). Would that satisfy everyone?

Adding kernel config options is just another way to add complications with real cost. The only other way I can think of right now is to create some kind of plugin system for KASAN that can run eBPF scripts (for example) to deal with those special cases.