Message ID | 20200328083209.21793-1-tingwei@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
Series | arm64: hw_breakpoint: don't clear debug registers in halt mode |
On Sat, Mar 28, 2020 at 04:32:09PM +0800, Tingwei Zhang wrote:
> If an external debugger sets a breakpoint on a kernel function while the
> device is in bootloader mode and then loads the kernel, the breakpoint is
> wiped out in hw_breakpoint_reset(). To fix this, check MDSCR_EL1.HDE in
> hw_breakpoint_reset(). When MDSCR_EL1.HDE is 0b1, halting debug is enabled,
> so don't reset the debug registers in that case.

I don't think this is sufficient, because the kernel can still
subsequently mess with breakpoints, and the HW debugger might not be
attached at this point in time anyhow.

I reckon this should hang off the existing "nodebugmon" command line
option, and we shouldn't use HW breakpoints at all when that is passed.
Then you can pass that to prevent the kernel stomping on the external
debugger.

Will, thoughts?

Mark.

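Mark's suggestion can be illustrated with a small user-space model. This is only a sketch, not kernel code: `struct bp_model`, `bp_model_install()`, `bp_model_reset()` and `BP_MODEL_NR_BRPS` are hypothetical names, and the `debug_enabled` field merely stands in for whatever state "nodebugmon" clears. The point is that both the install path and the reset path leave the registers alone when the debug monitors are disabled.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

#define BP_MODEL_NR_BRPS 6	/* hypothetical number of breakpoint slots */

/* Hypothetical user-space stand-in for one CPU's breakpoint registers. */
struct bp_model {
	bool debug_enabled;			/* false models "nodebugmon" */
	uint64_t bvr[BP_MODEL_NR_BRPS];		/* models the value registers */
	uint64_t bcr[BP_MODEL_NR_BRPS];		/* models the control registers */
};

/* Refuse to program a slot while the debug monitors are disabled. */
int bp_model_install(struct bp_model *m, unsigned int slot,
		     uint64_t addr, uint64_t ctrl)
{
	if (!m->debug_enabled)
		return -ENODEV;	/* registers belong to the external debugger */
	if (slot >= BP_MODEL_NR_BRPS)
		return -EINVAL;

	m->bvr[slot] = addr;
	m->bcr[slot] = ctrl;
	return 0;
}

/* Clear every slot, but only when the model owns the registers. */
void bp_model_reset(struct bp_model *m)
{
	unsigned int i;

	if (!m->debug_enabled)
		return;

	for (i = 0; i < BP_MODEL_NR_BRPS; i++) {
		m->bvr[i] = 0;
		m->bcr[i] = 0;
	}
}
```

With `debug_enabled` false, a value placed in `bvr[0]` beforehand (as an external debugger might do) survives both calls.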
On Mon, Mar 30, 2020 at 01:39:46PM +0100, Mark Rutland wrote:
> I don't think this is sufficient, because the kernel can still
> subsequently mess with breakpoints, and the HW debugger might not be
> attached at this point in time anyhow.
>
> I reckon this should hang off the existing "nodebugmon" command line
> option, and we shouldn't use HW breakpoints at all when that is passed.
> Then you can pass that to prevent the kernel stomping on the external
> debugger.
>
> Will, thoughts?

I was going to suggest the same thing, although we will also need to take
care to reset the registers if "nodebugmon" is toggled at runtime via the
"debug_enabled" file in debugfs.

Will

On 2020-03-30 21:42, Will Deacon wrote:
> I was going to suggest the same thing, although we will also need to take
> care to reset the registers if "nodebugmon" is toggled at runtime via the
> "debug_enabled" file in debugfs.
>
> Will

Thanks for the suggestion, Mark and Will. It's a great idea to use
"nodebugmon". When "nodebugmon" is set, the kernel won't change HW
breakpoints.

As for resetting the registers after "debug_enabled" is toggled, I wonder
whether we are adding unnecessary complexity here. If we take that approach,
we will hook the "debug_enabled" interface and use smp_call_function_single()
to call hw_breakpoint_reset() on each CPU, wait for all CPUs to finish, and
then change "debug_enabled".

The external debugger would clear its breakpoints when it detaches from the
device and restore them when it attaches again. Assume "debug_enabled" is
changed to one after the external debugger detaches from the device: the
debugger would already have cleared the breakpoint registers. If the
debugger is still attached, there's nothing the kernel can do to stop it
restoring/programming the breakpoint registers.

What do you think of this?

Thanks,
Tingwei

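The sequence described above (reset every CPU first, then flip "debug_enabled") can be modelled in user space roughly as follows. All names here (`struct toggle_model`, `toggle_set_debug_enabled()`, `TOGGLE_NR_CPUS`, and so on) are hypothetical, and the plain loop stands in for the cross-CPU call that the kernel would need, for example via smp_call_function_single() as mentioned in the message.

```c
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

#define TOGGLE_NR_CPUS 4	/* hypothetical CPU count */
#define TOGGLE_NR_BRPS 6	/* hypothetical breakpoint slots per CPU */

/* Hypothetical per-CPU copy of the breakpoint registers. */
struct toggle_cpu_regs {
	uint64_t bvr[TOGGLE_NR_BRPS];
	uint64_t bcr[TOGGLE_NR_BRPS];
};

struct toggle_model {
	bool debug_enabled;
	struct toggle_cpu_regs cpu[TOGGLE_NR_CPUS];
};

/* Stand-in for what hw_breakpoint_reset() would do on one CPU. */
void toggle_reset_cpu(struct toggle_cpu_regs *regs)
{
	memset(regs, 0, sizeof(*regs));
}

/*
 * Models a write to the "debug_enabled" file. On a 0 -> 1 transition every
 * CPU is reset before the flag changes; all other writes only store the
 * flag. Returns the number of CPUs that were reset.
 */
unsigned int toggle_set_debug_enabled(struct toggle_model *m, bool enable)
{
	unsigned int nr_reset = 0;
	unsigned int cpu;

	if (enable && !m->debug_enabled) {
		for (cpu = 0; cpu < TOGGLE_NR_CPUS; cpu++) {
			toggle_reset_cpu(&m->cpu[cpu]);
			nr_reset++;
		}
	}

	m->debug_enabled = enable;
	return nr_reset;
}
```

Writing the same value twice is a no-op for the registers, so only the disabled-to-enabled edge pays the cost of touching every CPU.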
On Tue, Mar 31, 2020 at 10:39:42AM +0800, tingwei@codeaurora.org wrote:
> As for resetting the registers after "debug_enabled" is toggled, I wonder
> whether we are adding unnecessary complexity here. If we take that
> approach, we will hook the "debug_enabled" interface and use
> smp_call_function_single() to call hw_breakpoint_reset() on each CPU, wait
> for all CPUs to finish, and then change "debug_enabled".
>
> The external debugger would clear its breakpoints when it detaches from
> the device and restore them when it attaches again. Assume "debug_enabled"
> is changed to one after the external debugger detaches from the device:
> the debugger would already have cleared the breakpoint registers. If the
> debugger is still attached, there's nothing the kernel can do to stop it
> restoring/programming the breakpoint registers.
>
> What do you think of this?

It's all a bit of a mess. Looking at it some more, why can't the external
debugger simply trap access to the debug registers using EDSCR.TDA? That
way, we don't have to change anything in the kernel.

Will

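From the debugger's side, enabling the trap Will mentions amounts to setting one bit in its view of EDSCR. The helpers below are a hedged sketch: the names are hypothetical, and the bit position (21) is an assumption taken from a reading of the Arm architecture manual that should be checked against the register description before use.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed position of EDSCR.TDA; verify against the Arm ARM. */
#define EDSCR_TDA_BIT	(UINT32_C(1) << 21)

/* Return the EDSCR value with trapping of debug register accesses on. */
uint32_t edscr_set_tda(uint32_t edscr)
{
	return edscr | EDSCR_TDA_BIT;
}

/* Return the EDSCR value with trapping of debug register accesses off. */
uint32_t edscr_clear_tda(uint32_t edscr)
{
	return edscr & ~EDSCR_TDA_BIT;
}

/* Report whether the TDA bit is set in the given EDSCR value. */
bool edscr_tda_enabled(uint32_t edscr)
{
	return (edscr & EDSCR_TDA_BIT) != 0;
}
```

These only compute register values; how the value reaches the hardware (memory-mapped debug interface, debug probe, and so on) is outside the sketch.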
On 2020-03-31 15:41, Will Deacon wrote:
> It's all a bit of a mess. Looking at it some more, why can't the external
> debugger simply trap access to the debug registers using EDSCR.TDA? That
> way, we don't have to change anything in the kernel.
>
> Will

The external debugger now has the ability to trap accesses to the debug
registers. What do we expect the debugger to do after the core is stopped?
Skip that msr instruction and continue to run?

Tingwei

On Tue, Mar 31, 2020 at 07:33:38PM +0800, tingwei@codeaurora.org wrote:
> The external debugger now has the ability to trap accesses to the debug
> registers. What do we expect the debugger to do after the core is stopped?
> Skip that msr instruction and continue to run?

The nicest thing to do would probably be to record all the accesses made
by the OS so that it can emulate reads and replay writes when external
debugging is over. Given that you'd still be expecting to pass "nodebugmon",
the emulation should be pretty straightforward, I think.

Will

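The record, emulate and replay idea can be sketched as a small shadow register file on the debugger side. Everything here is hypothetical (`struct dbg_shadow`, `shadow_write()`, `shadow_read()`, `shadow_replay()`, `SHADOW_NR_REGS`), and the choice to return zero for registers the OS never wrote is an assumption of this sketch rather than anything stated in the thread.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SHADOW_NR_REGS 8	/* hypothetical number of shadowed registers */

/* Hypothetical record of the OS's trapped debug register writes. */
struct dbg_shadow {
	uint64_t value[SHADOW_NR_REGS];	/* last value the OS wrote */
	bool dirty[SHADOW_NR_REGS];	/* true if the OS wrote this register */
};

/* Trapped write: record it instead of touching the hardware. */
bool shadow_write(struct dbg_shadow *s, size_t idx, uint64_t val)
{
	if (idx >= SHADOW_NR_REGS)
		return false;
	s->value[idx] = val;
	s->dirty[idx] = true;
	return true;
}

/* Trapped read: emulate it from the record (zero if never written). */
bool shadow_read(const struct dbg_shadow *s, size_t idx, uint64_t *out)
{
	if (idx >= SHADOW_NR_REGS)
		return false;
	*out = s->dirty[idx] ? s->value[idx] : 0;
	return true;
}

/*
 * External debugging is over: copy every recorded write into hw[] and
 * clear the record. Returns the number of registers replayed.
 */
size_t shadow_replay(struct dbg_shadow *s, uint64_t *hw, size_t nr_hw)
{
	size_t i, replayed = 0;

	for (i = 0; i < SHADOW_NR_REGS && i < nr_hw; i++) {
		if (!s->dirty[i])
			continue;
		hw[i] = s->value[i];
		s->dirty[i] = false;
		replayed++;
	}
	return replayed;
}
```

Keeping only the last value per register is enough for this model because a replay needs the final state, not the history of writes.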
On 2020-03-31 19:45, Will Deacon wrote:
> The nicest thing to do would probably be to record all the accesses made
> by the OS so that it can emulate reads and replay writes when external
> debugging is over. Given that you'd still be expecting to pass
> "nodebugmon", the emulation should be pretty straightforward, I think.
>
> Will

Will,

To provide an update on this, I've worked with the external debugger vendor
on this. The external debugger can now trap writes to the debug registers
and ignore them. This is the first step.

Thanks,
Tingwei

On Tue, Apr 21, 2020 at 11:49:11AM +0800, tingwei@codeaurora.org wrote:
> To provide an update on this, I've worked with the external debugger
> vendor on this. The external debugger can now trap writes to the debug
> registers and ignore them. This is the first step.

Thanks for the update! Please let us know if you run into any unforeseen
problems.

Will

diff --git a/arch/arm64/include/asm/debug-monitors.h b/arch/arm64/include/asm/debug-monitors.h
index 7619f473155f..8dc2c28791a0 100644
--- a/arch/arm64/include/asm/debug-monitors.h
+++ b/arch/arm64/include/asm/debug-monitors.h
@@ -18,6 +18,7 @@
 
 /* MDSCR_EL1 enabling bits */
 #define DBG_MDSCR_KDE		(1 << 13)
+#define DBG_MDSCR_HDE		(1 << 14)
 #define DBG_MDSCR_MDE		(1 << 15)
 #define DBG_MDSCR_MASK		~(DBG_MDSCR_KDE | DBG_MDSCR_MDE)
 
diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c
index 0b727edf4104..0180306f74d7 100644
--- a/arch/arm64/kernel/hw_breakpoint.c
+++ b/arch/arm64/kernel/hw_breakpoint.c
@@ -927,6 +927,17 @@ void hw_breakpoint_thread_switch(struct task_struct *next)
 				    !next_debug_info->wps_disabled);
 }
 
+/*
+ * Check if halted debug mode is enabled.
+ */
+static u32 hde_enabled(void)
+{
+	u32 mdscr;
+
+	asm volatile("mrs %0, mdscr_el1" : "=r" (mdscr));
+	return (mdscr & DBG_MDSCR_HDE);
+}
+
 /*
  * CPU initialisation.
  */
@@ -934,6 +945,14 @@ static int hw_breakpoint_reset(unsigned int cpu)
 {
 	int i;
 	struct perf_event **slots;
+
+	/*
+	 * When halting debug mode is enabled, break point could be already
+	 * set be external debugger. Don't reset debug registers here to
+	 * reserve break point from external debugger.
+	 */
+	if (hde_enabled())
+		return 0;
 	/*
 	 * When a CPU goes through cold-boot, it does not have any installed
 	 * slot, so it is safe to share the same function for restoring and

If an external debugger sets a breakpoint on a kernel function while the
device is in bootloader mode and then loads the kernel, the breakpoint is
wiped out in hw_breakpoint_reset(). To fix this, check MDSCR_EL1.HDE in
hw_breakpoint_reset(). When MDSCR_EL1.HDE is 0b1, halting debug is enabled,
so don't reset the debug registers in that case.

Signed-off-by: Tingwei Zhang <tingwei@codeaurora.org>
---
 arch/arm64/include/asm/debug-monitors.h |  1 +
 arch/arm64/kernel/hw_breakpoint.c       | 19 +++++++++++++++++++
 2 files changed, 20 insertions(+)