Message ID | 20200924040335.30934-1-walter-zh.wu@mediatek.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | kasan: add workqueue and timer stack for generic KASAN | expand |
On Thu, Sep 24 2020 at 12:03, Walter Wu wrote: > When analyze use-after-free or double-free issue, recording the timer > stacks is helpful to preserve usage history which potentially gives > a hint about the affected code. > > Record the most recent two timer init calls in KASAN which are printed > on failure in the KASAN report. > > For timers it has turned out to be useful to record the stack trace > of the timer init call. In which way? And what kind of bug does it catch which cannot be catched by existing debug mechanisms already? > Because if the UAF root cause is in timer init, then user can see > KASAN report to get where it is registered and find out the root > cause. What? If the UAF root cause is in timer init, then registering it after using it in that very same function is pretty pointless. > It don't need to enable DEBUG_OBJECTS_TIMERS, but they have a chance > to find out the root cause. There is a lot of handwaving how useful this is, but TBH I don't see the value at all. DEBUG_OBJECTS_TIMERS does a lot more than crashing on UAF. If KASAN provides additional value over DEBUG_OBJECTS_TIMERS then spell it out, but just saying that you don't need to enable DEBUG_OBJECTS_TIMERS is not making an argument for that change. Try again please. Thanks, tglx
On Thu, 2020-09-24 at 23:41 +0200, Thomas Gleixner wrote: > On Thu, Sep 24 2020 at 12:03, Walter Wu wrote: > > When analyze use-after-free or double-free issue, recording the timer > > stacks is helpful to preserve usage history which potentially gives > > a hint about the affected code. > > > > Record the most recent two timer init calls in KASAN which are printed > > on failure in the KASAN report. > > > > For timers it has turned out to be useful to record the stack trace > > of the timer init call. > > In which way? And what kind of bug does it catch which cannot be catched > by existing debug mechanisms already? > We only provide another debug mechanisms to debug use-after-free or double-free, it can be displayed together in KASAN report and have a chance to debug, and it doesn't need to enable existing debug mechanisms at the same time. then it has a chance to resolve issue. > > Because if the UAF root cause is in timer init, then user can see > > KASAN report to get where it is registered and find out the root > > cause. > > What? If the UAF root cause is in timer init, then registering it after > using it in that very same function is pretty pointless. > See [1], the call stack shows UAF happen at dummy_timer(), it is the callback function and set by timer_setup(), if KASAN report shows the timer call stack, it should be useful for programmer. [1] https://syzkaller.appspot.com/bug?id=34e69b7c8c0165658cbc987da0b61dadec644b6b > > It don't need to enable DEBUG_OBJECTS_TIMERS, but they have a chance > > to find out the root cause. > > There is a lot of handwaving how useful this is, but TBH I don't see the > value at all. > > DEBUG_OBJECTS_TIMERS does a lot more than crashing on UAF. If KASAN > provides additional value over DEBUG_OBJECTS_TIMERS then spell it out, > but just saying that you don't need to enable DEBUG_OBJECTS_TIMERS is > not making an argument for that change. > We don't want to replace DEBUG_OBJECTS_TIMERS with this patches, only hope to use low overhead(compare with DEBUG_OBJECTS_TIMERS) to debug use-after-free/double-free issue. If you have some concerns, we can add those message into commit log. Thanks. Walter
Walter, On Fri, Sep 25 2020 at 15:18, Walter Wu wrote: > On Thu, 2020-09-24 at 23:41 +0200, Thomas Gleixner wrote: >> > For timers it has turned out to be useful to record the stack trace >> > of the timer init call. >> >> In which way? And what kind of bug does it catch which cannot be catched >> by existing debug mechanisms already? >> > We only provide another debug mechanisms to debug use-after-free or > double-free, it can be displayed together in KASAN report and have a > chance to debug, and it doesn't need to enable existing debug mechanisms > at the same time. then it has a chance to resolve issue. Again. KASAN can only cover UAF, but there are a dozen other ways to wreck the system with wrong usage of timers which can't be caught by KASAN. >> > Because if the UAF root cause is in timer init, then user can see >> > KASAN report to get where it is registered and find out the root >> > cause. >> >> What? If the UAF root cause is in timer init, then registering it after >> using it in that very same function is pretty pointless. >> > See [1], the call stack shows UAF happen at dummy_timer(), it is the > callback function and set by timer_setup(), if KASAN report shows the > timer call stack, it should be useful for programmer. The report you linked to has absolutely nothing to do with a timer related UAF. The timer callback calls kfree_skb() on something which is already freed. So the root cause of this is NOT in timer init as you claimed above. The timer callback is just exposing a problem in the URB management of this driver. IOW the recording of the timer init stack is completely useless for decoding this problem. >> There is a lot of handwaving how useful this is, but TBH I don't see the >> value at all. >> >> DEBUG_OBJECTS_TIMERS does a lot more than crashing on UAF. If KASAN >> provides additional value over DEBUG_OBJECTS_TIMERS then spell it out, >> but just saying that you don't need to enable DEBUG_OBJECTS_TIMERS is >> not making an argument for that change. >> > We don't want to replace DEBUG_OBJECTS_TIMERS with this patches, only > hope to use low overhead(compare with DEBUG_OBJECTS_TIMERS) to debug KASAN has lower overhead than DEBUG_OBJECTS_TIMERS? Maybe in a different universe. That said, I'm not opposed to the change per se, but without a sensible justification this is just pointless. Sprinkling kasan_foo() all over the place and claiming it's useful without a valid example does not provide any value. Quite the contrary it gives the completely wrong sense what KASAN can do and what not. Thanks, tglx
On Fri, 2020-09-25 at 10:55 +0200, Thomas Gleixner wrote: > Walter, > > On Fri, Sep 25 2020 at 15:18, Walter Wu wrote: > > On Thu, 2020-09-24 at 23:41 +0200, Thomas Gleixner wrote: > >> > For timers it has turned out to be useful to record the stack trace > >> > of the timer init call. > >> > >> In which way? And what kind of bug does it catch which cannot be catched > >> by existing debug mechanisms already? > >> > > We only provide another debug mechanisms to debug use-after-free or > > double-free, it can be displayed together in KASAN report and have a > > chance to debug, and it doesn't need to enable existing debug mechanisms > > at the same time. then it has a chance to resolve issue. > > Again. KASAN can only cover UAF, but there are a dozen other ways to > wreck the system with wrong usage of timers which can't be caught by > KASAN. > > >> > Because if the UAF root cause is in timer init, then user can see > >> > KASAN report to get where it is registered and find out the root > >> > cause. > >> > >> What? If the UAF root cause is in timer init, then registering it after > >> using it in that very same function is pretty pointless. > >> > > See [1], the call stack shows UAF happen at dummy_timer(), it is the > > callback function and set by timer_setup(), if KASAN report shows the > > timer call stack, it should be useful for programmer. > > The report you linked to has absolutely nothing to do with a timer > related UAF. The timer callback calls kfree_skb() on something which is > already freed. So the root cause of this is NOT in timer init as you > claimed above. The timer callback is just exposing a problem in the URB > management of this driver. IOW the recording of the timer init stack is > completely useless for decoding this problem. > > >> There is a lot of handwaving how useful this is, but TBH I don't see the > >> value at all. > >> > >> DEBUG_OBJECTS_TIMERS does a lot more than crashing on UAF. If KASAN > >> provides additional value over DEBUG_OBJECTS_TIMERS then spell it out, > >> but just saying that you don't need to enable DEBUG_OBJECTS_TIMERS is > >> not making an argument for that change. > >> > > We don't want to replace DEBUG_OBJECTS_TIMERS with this patches, only > > hope to use low overhead(compare with DEBUG_OBJECTS_TIMERS) to debug > > KASAN has lower overhead than DEBUG_OBJECTS_TIMERS? Maybe in a different > universe. > I mean KASAN + our patch vs KASAN + DEBUG_OBJECTS_TIMERS. The front one have the information to the original caller and help to debug. It is smaller overhead than the one behind. > That said, I'm not opposed to the change per se, but without a sensible > justification this is just pointless. > > Sprinkling kasan_foo() all over the place and claiming it's useful > without a valid example does not provide any value. > > Quite the contrary it gives the completely wrong sense what KASAN can do > and what not. > I agree your saying, so that I need to find out a use case to explain to you. Thanks Walter > Thanks, > > tglx >
On Fri, Sep 25 2020 at 17:15, Walter Wu wrote: > On Fri, 2020-09-25 at 10:55 +0200, Thomas Gleixner wrote: >> > We don't want to replace DEBUG_OBJECTS_TIMERS with this patches, only >> > hope to use low overhead(compare with DEBUG_OBJECTS_TIMERS) to debug >> >> KASAN has lower overhead than DEBUG_OBJECTS_TIMERS? Maybe in a different >> universe. >> > I mean KASAN + our patch vs KASAN + DEBUG_OBJECTS_TIMERS. The front one > have the information to the original caller and help to debug. It is > smaller overhead than the one behind. For ONE specific problem related to timers and you have still not shown a single useful debug output where this information helps to debug anything. > I agree your saying, so that I need to find out a use case to explain to > you. Indeed. Thanks, tglx
Hi Thomas, On Sat, 2020-09-26 at 00:59 +0200, Thomas Gleixner wrote: > On Fri, Sep 25 2020 at 17:15, Walter Wu wrote: > > On Fri, 2020-09-25 at 10:55 +0200, Thomas Gleixner wrote: > >> > We don't want to replace DEBUG_OBJECTS_TIMERS with this patches, only > >> > hope to use low overhead(compare with DEBUG_OBJECTS_TIMERS) to debug > >> > >> KASAN has lower overhead than DEBUG_OBJECTS_TIMERS? Maybe in a different > >> universe. > >> > > I mean KASAN + our patch vs KASAN + DEBUG_OBJECTS_TIMERS. The front one > > have the information to the original caller and help to debug. It is > > smaller overhead than the one behind. > > For ONE specific problem related to timers and you have still not shown > a single useful debug output where this information helps to debug > anything. > > > I agree your saying, so that I need to find out a use case to explain to > > you. > > Indeed. > First, I think the commit log “Because if the UAF root cause is in timer init …” needs to be removed, this patch hopes to help programmer gets timer callback is where is registered. It is useful only if free stack is called from timer callback, because programmer can see why & where register this function. Second, see [1], it should satisfies first point. The free stack is from timer callback, if we know where register this function, then it should be useful to solve UAF. [1]https://lore.kernel.org/linux-usb/000000000000590f6b05a1c05d15@google.com/ Thanks Walter > Thanks, > > tglx >
Walter, On Sun, Sep 27 2020 at 01:11, Walter Wu wrote: > First, I think the commit log “Because if the UAF root cause is in timer > init …” needs to be removed, this patch hopes to help programmer gets > timer callback is where is registered. It is useful only if free stack > is called from timer callback, because programmer can see why & where > register this function. > > Second, see [1], it should satisfies first point. The free stack is from > timer callback, if we know where register this function, then it should > be useful to solve UAF. No. It's completely useless. The problem has absolutely nothing to do with the timer callback and the timer_init() invocation which set the timer's callback to 'dummy_timer'. The timer callback happens to free the object, but the worker thread has still a reference of some sort. So the problem is either missing refcounting which allows the timer callback to free the object or some missing serialization. Knowing the place which initialized the timer is absolutely not helping to figure out what's missing here. Aside of that it's trivial enough to do: git grep dummy_timer drivers/usb/gadget/udc/dummy_hcd.c if you really want to know what initialized it: dummy_timer+0x1258/0x32ae drivers/usb/gadget/udc/dummy_hcd.c:1966 call_timer_fn+0x195/0x6f0 kernel/time/timer.c:1404 expire_timers kernel/time/timer.c:1449 [inline] That said, I'm all for adding useful information to KASAN or whatever reports, but I'm not agreeing with the approach of 'Let's sprinkle kasan_foo() all over tha place and claim it is useful to decode an UAF'. Adding irrelevant information to a report is actually counter productive because it makes people look at the wrong place. Again: Provide an analysis of such a dump where the timer_init() function is a key element of solving the problem. Thanks, tglx
diff --git a/kernel/time/timer.c b/kernel/time/timer.c index a16764b0116e..1ed8f8aca7f5 100644 --- a/kernel/time/timer.c +++ b/kernel/time/timer.c @@ -796,6 +796,9 @@ static void do_init_timer(struct timer_list *timer, timer->function = func; timer->flags = flags | raw_smp_processor_id(); lockdep_init_map(&timer->lockdep_map, name, key, 0); + + /* record the timer stack in order to print it in KASAN report */ + kasan_record_aux_stack(timer); } /**