Message ID | 1470952169-39061-1-git-send-email-thgarnie@google.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Aug 11, 2016 at 02:49:29PM -0700, Thomas Garnier wrote: > Restore the processor state before calling any other function to ensure > per-cpu variables can be used with KASLR memory randomization. > > Tracing functions use per-cpu variables (gs based) and one was called > just before restoring the processor state fully. It resulted in a double > fault when both the tracing & the exception handler functions tried to > use a per-cpu variable. > > Signed-off-by: Thomas Garnier <thgarnie@google.com> > --- > Based on next-20160808 Ok, I believe before I test this, I need to apply another patch from Rafael. I think it is the "Always create temporary identity mapping correctly" thing. Yes, no? Rafael, can you please apply everything on a test branch for us to run? Thanks.
On Thu, 11 Aug 2016, Thomas Garnier wrote: > Restore the processor state before calling any other function to ensure > per-cpu variables can be used with KASLR memory randomization. > > Tracing functions use per-cpu variables (gs based) and one was called > just before restoring the processor state fully. It resulted in a double > fault when both the tracing & the exception handler functions tried to > use a per-cpu variable. > > Signed-off-by: Thomas Garnier <thgarnie@google.com> Alright, this did the trick, thanks :) Feel free to add Reported-by: Jiri Kosina <jkosina@suse.cz> Tested-by: Jiri Kosina <jkosina@suse.cz> One thing is still beyond me though ... how the heck this doesn't happen without DEBUG_LOCK_ALLOC? The percpu area pointer should be corrupted nevertheless, shouldn't it? Thanks,
Hi! > Restore the processor state before calling any other function to ensure > per-cpu variables can be used with KASLR memory randomization. > > Tracing functions use per-cpu variables (gs based) and one was called > just before restoring the processor state fully. It resulted in a double > fault when both the tracing & the exception handler functions tried to > use a per-cpu variable. > > Signed-off-by: Thomas Garnier <thgarnie@google.com> > --- > Based on next-20160808 > > Thanks to Rafael, Jiri & Borislav in tracking down this bug. > --- > kernel/power/hibernate.c | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c > index a881c6a..33c79b6 100644 > --- a/kernel/power/hibernate.c > +++ b/kernel/power/hibernate.c > @@ -300,12 +300,12 @@ static int create_image(int platform_mode) > save_processor_state(); > trace_suspend_resume(TPS("machine_suspend"), PM_EVENT_HIBERNATE, true); > error = swsusp_arch_suspend(); > + /* Restore control flow magically appears here */ > + restore_processor_state(); > trace_suspend_resume(TPS("machine_suspend"), PM_EVENT_HIBERNATE, false); > if (error) > printk(KERN_ERR "PM: Error %d creating hibernation image\n", > error); > - /* Restore control flow magically appears here */ > - restore_processor_state(); > if (!in_suspend) > events_check_enabled = false; > Ugh. Plus it also fixes very confusing situation where /* Restore control flow magically appears here */ comment was 4 lines away from where it _actually_ magically appeared. Good catch. Acked-by: Pavel Machek <pavel@ucw.cz>
On Fri, 12 Aug 2016, Jiri Kosina wrote: > One thing is still beyond me though ... how the heck this doesn't happen > without DEBUG_LOCK_ALLOC? The percpu area pointer should be corrupted > nevertheless, shouldn't it? The reason is that turning DEBUG_LOCK_ALLOC changing trace_suspend_resume() from ffffffff810c7280 <trace_suspend_resume>: ffffffff810c7280: 55 push %rbp ffffffff810c7281: 48 89 e5 mov %rsp,%rbp ffffffff810c7284: 41 56 push %r14 ffffffff810c7286: 41 55 push %r13 ffffffff810c7288: 41 54 push %r12 ffffffff810c728a: 53 push %rbx ffffffff810c728b: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) ffffffff810c7290: 65 8b 05 59 2f f4 7e mov %gs:0x7ef42f59(%rip),%eax # a1f0 <cpu_number> ffffffff810c7297: 89 c0 mov %eax,%eax ffffffff810c7299: 48 0f a3 05 9f 2f c4 bt %rax,0xc42f9f(%rip) # ffffffff81d0a240 <__cpu_online_mask> to ffffffff810ad150 <trace_suspend_resume>: ffffffff810ad150: 0f 1f 44 00 00 nopl 0x0(%rax,%rax,1) ffffffff810ad155: c3 retq ffffffff810ad156: 65 8b 05 93 d0 f5 7e mov %gs:0x7ef5d093(%rip),%eax # a1f0 <cpu_number> ffffffff810ad15d: 89 c0 mov %eax,%eax ffffffff810ad15f: 48 0f a3 05 59 0b c4 bt %rax,0xc40b59(%rip) # ffffffff81cedcc0 <__cpu_online_mask> ffffffff810ad166: 00 IOW, with the config change, trace_suspend_resume() returns immediately without trying to get the current SMP id. And it's because of DEBUG_LOCK_ALLOC implies LOCKDEP, and that does this to __DECLARE_TRACE() * When lockdep is enabled, we make sure to always do the RCU portions of * the tracepoint code, regardless of whether tracing is on. However, * don't check if the condition is false, due to interaction with idle * instrumentation. This lets us find RCU issues triggered with tracepoints * even when this tracepoint is off. This code has no purpose other than * poking RCU a bit. */ #define __DECLARE_TRACE(name, proto, args, cond, data_proto, data_args) \ extern struct tracepoint __tracepoint_##name; \ static inline void trace_##name(proto) \ { \ if (static_key_false(&__tracepoint_##name.key)) \ __DO_TRACE(&__tracepoint_##name, \ TP_PROTO(data_proto), \ TP_ARGS(data_args), \ TP_CONDITION(cond),,); \ if (IS_ENABLED(CONFIG_LOCKDEP) && (cond)) { \ rcu_read_lock_sched_notrace(); \ rcu_dereference_sched(__tracepoint_##name.funcs);\ rcu_read_unlock_sched_notrace(); \ } \ } That's pretty nasty, as turning LOCKDEP on has sideffects on the code that'd normally not be expected to be run at all (tracepoint off). Oh well. Thanks again,
On Fri, Aug 12, 2016 at 7:49 AM, Borislav Petkov <bp@suse.de> wrote: > On Thu, Aug 11, 2016 at 02:49:29PM -0700, Thomas Garnier wrote: >> Restore the processor state before calling any other function to ensure >> per-cpu variables can be used with KASLR memory randomization. >> >> Tracing functions use per-cpu variables (gs based) and one was called >> just before restoring the processor state fully. It resulted in a double >> fault when both the tracing & the exception handler functions tried to >> use a per-cpu variable. >> >> Signed-off-by: Thomas Garnier <thgarnie@google.com> >> --- >> Based on next-20160808 > > Ok, I believe before I test this, I need to apply another patch from > Rafael. I think it is the "Always create temporary identity mapping > correctly" thing. > > Yes, no? Yes. > Rafael, can you please apply everything on a test branch for us to run? You can simply test my linux-next branch: git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ linux-next That's 4.8-rc1 plus 3 fixes on top of it. Thanks, Rafael
On Fri, Aug 12, 2016 at 2:23 AM, Jiri Kosina <jikos@kernel.org> wrote: > On Fri, 12 Aug 2016, Jiri Kosina wrote: > > That's pretty nasty, as turning LOCKDEP on has sideffects on the code > that'd normally not be expected to be run at all (tracepoint off). > > Oh well. Thanks for the analysis, I didn't got that far so I had no idea how everything was connected. > Thanks again, Thanks you and Borislav for finding it.
On Fri, Aug 12, 2016 at 4:14 AM, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Fri, Aug 12, 2016 at 7:49 AM, Borislav Petkov <bp@suse.de> wrote: >> On Thu, Aug 11, 2016 at 02:49:29PM -0700, Thomas Garnier wrote: >>> Restore the processor state before calling any other function to ensure >>> per-cpu variables can be used with KASLR memory randomization. >>> >>> Tracing functions use per-cpu variables (gs based) and one was called >>> just before restoring the processor state fully. It resulted in a double >>> fault when both the tracing & the exception handler functions tried to >>> use a per-cpu variable. >>> >>> Signed-off-by: Thomas Garnier <thgarnie@google.com> >>> --- >>> Based on next-20160808 >> >> Ok, I believe before I test this, I need to apply another patch from >> Rafael. I think it is the "Always create temporary identity mapping >> correctly" thing. >> >> Yes, no? > > Yes. > >> Rafael, can you please apply everything on a test branch for us to run? > > You can simply test my linux-next branch: > > git://git.kernel.org/pub/scm/linux/kernel/git/rafael/linux-pm.git \ > linux-next > > That's 4.8-rc1 plus 3 fixes on top of it. Borislav, let me know once you tested it and I will send a v2 with acked/tested.
On Fri, Aug 12, 2016 at 09:03:52AM -0700, Thomas Garnier wrote: > Borislav, let me know once you tested it and I will send a v2 with > acked/tested. Just did 5 s2d runs, back-to-back, with Rafael's linux-next branch. Looks good so far, no hickups. I'll watch it the coming days if there are any changes. Reported-and-tested-by: Borislav Petkov <bp@suse.de> Btw, I'm running with: CONFIG_ARCH_HAS_ELF_RANDOMIZE=y CONFIG_RANDOMIZE_BASE=y CONFIG_RANDOMIZE_MEMORY=y CONFIG_RANDOMIZE_MEMORY_PHYSICAL_PADDING=0x0 Thanks guys for the fixes.
diff --git a/kernel/power/hibernate.c b/kernel/power/hibernate.c index a881c6a..33c79b6 100644 --- a/kernel/power/hibernate.c +++ b/kernel/power/hibernate.c @@ -300,12 +300,12 @@ static int create_image(int platform_mode) save_processor_state(); trace_suspend_resume(TPS("machine_suspend"), PM_EVENT_HIBERNATE, true); error = swsusp_arch_suspend(); + /* Restore control flow magically appears here */ + restore_processor_state(); trace_suspend_resume(TPS("machine_suspend"), PM_EVENT_HIBERNATE, false); if (error) printk(KERN_ERR "PM: Error %d creating hibernation image\n", error); - /* Restore control flow magically appears here */ - restore_processor_state(); if (!in_suspend) events_check_enabled = false;
Restore the processor state before calling any other function to ensure per-cpu variables can be used with KASLR memory randomization. Tracing functions use per-cpu variables (gs based) and one was called just before restoring the processor state fully. It resulted in a double fault when both the tracing & the exception handler functions tried to use a per-cpu variable. Signed-off-by: Thomas Garnier <thgarnie@google.com> --- Based on next-20160808 Thanks to Rafael, Jiri & Borislav in tracking down this bug. --- kernel/power/hibernate.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)