Message ID | 1466660926-1544-8-git-send-email-david@gibson.dropbear.id.au (mailing list archive) |
---|---|
State | New, archived |
On Thu, 2016-06-23 at 15:48 +1000, David Gibson wrote:
> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>
> This reworks emulation of the various "rfi" variants. I removed
> some masking bits that I couldn't make sense of, the only bit that
> I am aware we should mask here is POW, the CPU's MSR mask should
> take care of the rest.

See I'd rather we didn't boot at all.

I just spent hours trying to figure out why my kernel wouldn't boot
in qemu on a mac99 model with 970, weird weird things happening
inside the device-tree parsing...

Until I figured we were losing the 64-bit mode in the MSR. Why ?

Because OpenBIOS isn't bolting the hash entries or SLBs for the entire
kernel ! So we are taking some exceptions right during the early
assembly, precisely between enable_64b_mode and __mmu_off.

Now this is really fishy to begin with, there is code in there that
will use SRR0/SRR1 and won't expect a fault of any sort... such
as __mmu_off itself.

The problem in our case was that OpenBIOS using rfi, it only restores
32-bits of the MSR, so we lose the 64-bit flag.

Typically that was happening on the call to __cpu_preinit_ppc970 which
happens to reside far enough away that it needs a new translation.

I wonder if prom_init should "touch" the entire kernel for safety,
but in any case, OpenBIOS need that fix urgently.

Cheers,
Ben.
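To make the failure mode described above concrete, here is a minimal standalone sketch (not QEMU code) of why returning from an exception with rfi instead of rfid drops 64-bit mode. It assumes MSR[SF] sits at bit 63 counted from the least significant bit; the helper names msr_after_rfi(), msr_after_rfid() and msr_is_64bit_sketch() are hypothetical and exist only for this illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: the 64-bit mode flag (SF) is bit 63 of the MSR. */
#define MSR_SF 63

/* Hypothetical model of rfi: only the low 32 bits of SRR1 reach the MSR. */
static uint64_t msr_after_rfi(uint64_t srr1)
{
    return srr1 & 0xffffffffULL;
}

/* Hypothetical model of rfid: the full 64-bit SRR1 is restored. */
static uint64_t msr_after_rfid(uint64_t srr1)
{
    return srr1;
}

/* Returns 1 if the given MSR value has the SF (64-bit mode) bit set. */
static int msr_is_64bit_sketch(uint64_t msr)
{
    return (int)((msr >> MSR_SF) & 1);
}

/* Small self-check: an exception taken in 64-bit mode saves SF in SRR1. */
static void rfi_truncation_demo(void)
{
    uint64_t srr1 = (1ULL << MSR_SF) | 0x1000;

    assert(msr_is_64bit_sketch(msr_after_rfid(srr1)) == 1); /* SF kept */
    assert(msr_is_64bit_sketch(msr_after_rfi(srr1)) == 0);  /* SF lost */
}
```

Calling rfi_truncation_demo() exercises both paths: the handler that returns with the rfid model comes back in 64-bit mode, while the one that returns with the rfi model silently resumes in 32-bit mode, which matches the symptom reported here.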
On 27/06/16 05:42, Benjamin Herrenschmidt wrote:
> On Thu, 2016-06-23 at 15:48 +1000, David Gibson wrote:
>> From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
>>
>> This reworks emulation of the various "rfi" variants. I removed
>> some masking bits that I couldn't make sense of, the only bit that
>> I am aware we should mask here is POW, the CPU's MSR mask should
>> take care of the rest.
>
> See I'd rather we didn't boot at all.
>
> I just spent hours trying to figure out why my kernel wouldn't boot
> in qemu on a mac99 model with 970, weird weird things happening
> inside the device-tree parsing...
>
> Until I figured we were losing the 64-bit mode in the MSR. Why ?
>
> Because OpenBIOS isn't bolting the hash entries or SLBs for the entire
> kernel ! So we are taking some exceptions right during the early
> assembly, precisely between enable_64b_mode and __mmu_off.
>
> Now this is really fishy to begin with, there is code in there that
> will use SRR0/SRR1 and won't expect a fault of any sort... such
> as __mmu_off itself.
>
> The problem in our case was that OpenBIOS using rfi, it only restores
> 32-bits of the MSR, so we lose the 64-bit flag.
>
> Typically that was happening on the call to __cpu_preinit_ppc970 which
> happens to reside far enough away that it needs a new translation.
>
> I wonder if prom_init should "touch" the entire kernel for safety,
> but in any case, OpenBIOS need that fix urgently.

I know, and I do apologise, as the OpenBIOS repository has been in
freeze for a month now while we try to transition over to git. I'll
send a follow-up email ASAP.

ATB,
Mark.
On Mon, Jun 27, 2016 at 02:42:08PM +1000, Benjamin Herrenschmidt wrote:
> On Thu, 2016-06-23 at 15:48 +1000, David Gibson wrote:
> > From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
> >
> > This reworks emulation of the various "rfi" variants. I removed
> > some masking bits that I couldn't make sense of, the only bit that
> > I am aware we should mask here is POW, the CPU's MSR mask should
> > take care of the rest.
>
> See I'd rather we didn't boot at all.
>
> I just spent hours trying to figure out why my kernel wouldn't boot
> in qemu on a mac99 model with 970, weird weird things happening
> inside the device-tree parsing...
>
> Until I figured we were losing the 64-bit mode in the MSR. Why ?
>
> Because OpenBIOS isn't bolting the hash entries or SLBs for the entire
> kernel ! So we are taking some exceptions right during the early
> assembly, precisely between enable_64b_mode and __mmu_off.
>
> Now this is really fishy to begin with, there is code in there that
> will use SRR0/SRR1 and won't expect a fault of any sort... such
> as __mmu_off itself.
>
> The problem in our case was that OpenBIOS using rfi, it only restores
> 32-bits of the MSR, so we lose the 64-bit flag.
>
> Typically that was happening on the call to __cpu_preinit_ppc970 which
> happens to reside far enough away that it needs a new translation.
>
> I wonder if prom_init should "touch" the entire kernel for safety,
> but in any case, OpenBIOS need that fix urgently.

Ah, ok, I hadn't realized that OpenBIOS still failed to boot, just
later in the process with this hunk left out.
On Mon, 2016-06-27 at 16:48 +1000, David Gibson wrote:
> > I wonder if prom_init should "touch" the entire kernel for safety,
> > but in any case, OpenBIOS need that fix urgently.
>
> Ah, ok, I hadn't realized that OpenBIOS still failed to boot, just
> later in the process with this hunk left out.

OpenBIOS itself works, but the kernel then fails in odd ways :-)

Cheers,
Ben.
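For readers following the thread, the logic that the reworked do_rfi() in this patch applies can be sketched outside QEMU roughly as follows. This is an illustration under stated assumptions, not the actual helper: it assumes MSR_POW is bit 18 and MSR_SF is bit 63, it models only the Book3S 64-bit case (it ignores the BookE mode bit), and it leaves out the final MSR store and its per-CPU mask entirely. The names struct rfi_result and sketch_do_rfi() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed bit positions, counted from the least significant bit. */
#define MSR_SF  63
#define MSR_POW 18

/* Hypothetical container for the state an rfi-family return would load. */
struct rfi_result {
    uint64_t nip;
    uint64_t msr;
};

/* Sketch of the three steps: mask POW, crop nip in 32-bit mode, align nip. */
static struct rfi_result sketch_do_rfi(uint64_t nip, uint64_t msr)
{
    struct rfi_result r;

    /* POW cannot be set by any form of rfi. */
    msr &= ~(1ULL << MSR_POW);

    /* Returning to 32-bit mode: keep only the low 32 bits of the address. */
    if (!((msr >> MSR_SF) & 1)) {
        nip = (uint32_t)nip;
    }

    /* Instruction addresses are word aligned (not true if VLE is supported). */
    r.nip = nip & ~(uint64_t)0x3;
    r.msr = msr;
    return r;
}

/* Small self-check covering the three steps. */
static void sketch_do_rfi_demo(void)
{
    assert(sketch_do_rfi(0x100000007ULL, 0).nip == 0x4ULL);
    assert(sketch_do_rfi(0x100000007ULL, 1ULL << MSR_SF).nip == 0x100000004ULL);
    assert(sketch_do_rfi(0, 1ULL << MSR_POW).msr == 0);
}
```

In this sketch the caller decides how much of the saved MSR to pass in, which mirrors how the patch moves the 32-bit masking for plain rfi into its caller instead of passing a mask argument.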
diff --git a/target-ppc/excp_helper.c b/target-ppc/excp_helper.c
index 30e960e..aa0b63f 100644
--- a/target-ppc/excp_helper.c
+++ b/target-ppc/excp_helper.c
@@ -922,25 +922,20 @@ void helper_store_msr(CPUPPCState *env, target_ulong val)
     }
 }
 
-static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr,
-                          target_ulong msrm, int keep_msrh)
+static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr)
 {
     CPUState *cs = CPU(ppc_env_get_cpu(env));
 
+    /* MSR:POW cannot be set by any form of rfi */
+    msr &= ~(1ULL << MSR_POW);
+
 #if defined(TARGET_PPC64)
-    if (msr_is_64bit(env, msr)) {
-        nip = (uint64_t)nip;
-        msr &= (uint64_t)msrm;
-    } else {
+    /* Switching to 32-bit ? Crop the nip */
+    if (!msr_is_64bit(env, msr)) {
         nip = (uint32_t)nip;
-        msr = (uint32_t)(msr & msrm);
-        if (keep_msrh) {
-            msr |= env->msr & ~((uint64_t)0xFFFFFFFF);
-        }
     }
 #else
     nip = (uint32_t)nip;
-    msr &= (uint32_t)msrm;
 #endif
     /* XXX: beware: this is false if VLE is supported */
     env->nip = nip & ~((target_ulong)0x00000003);
@@ -959,26 +954,24 @@ static inline void do_rfi(CPUPPCState *env, target_ulong nip, target_ulong msr,
 
 void helper_rfi(CPUPPCState *env)
 {
-    if (env->excp_model == POWERPC_EXCP_BOOKE) {
-        do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1],
-               ~((target_ulong)0), 0);
-    } else {
-        do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1],
-               ~((target_ulong)0x783F0000), 1);
-    }
+    do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1] & 0xfffffffful);
 }
 
+#define MSR_BOOK3S_MASK
 #if defined(TARGET_PPC64)
 void helper_rfid(CPUPPCState *env)
 {
-    do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1],
-           ~((target_ulong)0x783F0000), 0);
+    /* The architeture defines a number of rules for which bits
+     * can change but in practice, we handle this in hreg_store_msr()
+     * which will be called by do_rfi(), so there is no need to filter
+     * here
+     */
+    do_rfi(env, env->spr[SPR_SRR0], env->spr[SPR_SRR1]);
 }
 
 void helper_hrfid(CPUPPCState *env)
 {
-    do_rfi(env, env->spr[SPR_HSRR0], env->spr[SPR_HSRR1],
-           ~((target_ulong)0x783F0000), 0);
+    do_rfi(env, env->spr[SPR_HSRR0], env->spr[SPR_HSRR1]);
 }
 #endif
 
@@ -986,28 +979,24 @@ void helper_hrfid(CPUPPCState *env)
 /* Embedded PowerPC specific helpers */
 void helper_40x_rfci(CPUPPCState *env)
 {
-    do_rfi(env, env->spr[SPR_40x_SRR2], env->spr[SPR_40x_SRR3],
-           ~((target_ulong)0xFFFF0000), 0);
+    do_rfi(env, env->spr[SPR_40x_SRR2], env->spr[SPR_40x_SRR3]);
 }
 
 void helper_rfci(CPUPPCState *env)
 {
-    do_rfi(env, env->spr[SPR_BOOKE_CSRR0], env->spr[SPR_BOOKE_CSRR1],
-           ~((target_ulong)0), 0);
+    do_rfi(env, env->spr[SPR_BOOKE_CSRR0], env->spr[SPR_BOOKE_CSRR1]);
 }
 
 void helper_rfdi(CPUPPCState *env)
 {
     /* FIXME: choose CSRR1 or DSRR1 based on cpu type */
-    do_rfi(env, env->spr[SPR_BOOKE_DSRR0], env->spr[SPR_BOOKE_DSRR1],
-           ~((target_ulong)0), 0);
+    do_rfi(env, env->spr[SPR_BOOKE_DSRR0], env->spr[SPR_BOOKE_DSRR1]);
 }
 
 void helper_rfmci(CPUPPCState *env)
 {
     /* FIXME: choose CSRR1 or MCSRR1 based on cpu type */
-    do_rfi(env, env->spr[SPR_BOOKE_MCSRR0], env->spr[SPR_BOOKE_MCSRR1],
-           ~((target_ulong)0), 0);
+    do_rfi(env, env->spr[SPR_BOOKE_MCSRR0], env->spr[SPR_BOOKE_MCSRR1]);
 }
 #endif
 
@@ -1045,7 +1034,7 @@ void helper_td(CPUPPCState *env, target_ulong arg1, target_ulong arg2,
 
 void helper_rfsvc(CPUPPCState *env)
 {
-    do_rfi(env, env->lr, env->ctr, 0x0000FFFF, 0);
+    do_rfi(env, env->lr, env->ctr & 0x0000FFFF);
 }
 
 /* Embedded.Processor Control */
diff --git a/target-ppc/translate.c b/target-ppc/translate.c
index 395b885..6398bad 100644
--- a/target-ppc/translate.c
+++ b/target-ppc/translate.c
@@ -4119,6 +4119,10 @@ static void gen_rfi(DisasContext *ctx)
 #if defined(CONFIG_USER_ONLY)
     gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);
 #else
+    /* FIXME: This instruction doesn't exist anymore on 64-bit server
+     * processors compliant with arch 2.x, we should remove it there,
+     * but we need to fix OpenBIOS not to use it on 970 first
+     */
     /* Restore CPU state */
     if (unlikely(ctx->pr)) {
         gen_inval_exception(ctx, POWERPC_EXCP_PRIV_OPC);