Message ID | 1512495507-23259-1-git-send-email-julien.thierry@arm.com (mailing list archive) |
---|---|
State | New, archived |
Hi Julien,

On 05/12/17 17:38, Julien Thierry wrote:
> When writing the user ASID to ttbr0, 16bit get copied to ttbr0, potentially
> including part of the ASID generation in the ttbr0.ASID. If the kernel is
> using less than 16bits ASIDs and the other ttbr0 bits aren't RES0, two
> tasks using the same mm context might end up running with different
> ttbr0.ASID values.
> This would be triggered by one of the threads being scheduled before a
> roll-over and the second one scheduled after a roll-over.
>
> Pad the generation out of the 16bits of the mm id that are written to
> ttbr0. Thus, what the hardware sees is what the kernel considers ASID.

Is this to fix a system with a mix of 8/16 asid bits? Or someone being clever
and reducing asid_bits to stress rollover?

(I think we should do something like this so we can test rollover.)

> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
> index 6f40170..a7c72d4 100644
> --- a/arch/arm64/mm/context.c
> +++ b/arch/arm64/mm/context.c
> @@ -180,6 +190,13 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu)
>          /* We're out of ASIDs, so increment the global generation count */
>          generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
>                                                   &asid_generation);
> +
> +        /*
> +         * It is unlikely the generation will ever overflow, but if this
> +         * happens, let it be known strange things can occur.
> +         */
> +        WARN_ON(generation == ASID_FIRST_VERSION);

Won't generation wrap to zero, not '1<<16'?

asid_generation-is-never-zero is one of the nasty things in this code.

In check_and_switch_context() we switch to the 'fast path' where the current
asid is usable if: the generation matches and the active_asid isn't 0 (which
indicates a rollover has occurred).

from mm/context.c::check_and_switch_context():
>     if (!((asid ^ atomic64_read(&asid_generation)) >> asid_bits)
>         && atomic64_xchg_relaxed(&per_cpu(active_asids, cpu), asid))
>         goto switch_mm_fastpath;

If asid_generation is ever zero, (even briefly) multiple new tasks with
different page-tables will pass the generation test, and run with asid=0.

Short of xchg(ASID_MAX_VERSION, ASID_FIRST_VERSION), every time, just in case, I
don't think this is something we can handle.

But, this thing is layers on layers of subtle behaviour, so I may have missed
something...

Instead, do you think we can duplicate just the asid bits (asid & ASID_MASK)
into a new 16bit field in mm_context_t, which we then use instead of the ASID()
and mmid macros? (We only set a new asid in one place as returned by
new_context()).

This would let context.c's asid_bits be arbitrary, as the hardware only uses the
masked value it exposes in the new field.

This doesn't solve the mixed 8/16 bit case. I think we should force those
systems to use 8bit asids via cpufeature to preserve as much of the generation
counter as possible.

Thanks,

James
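To spell out that hazard with concrete numbers, here is a stand-alone model of
the quoted generation test. This is an illustration only, not kernel code:
asid_bits is assumed to be 16, and the active_asids xchg half of the check is
omitted for brevity.

#include <stdio.h>

#define ASID_BITS	16

int main(void)
{
	unsigned long long asid_generation = 0;	/* the 64-bit counter has just wrapped to 0 */
	unsigned long long asid = 0;		/* a freshly created mm: context.id is still 0 */

	/* The generation test from check_and_switch_context(): */
	if (!((asid ^ asid_generation) >> ASID_BITS))
		printf("generation test passes: task keeps hardware ASID 0x%llx\n",
		       asid & 0xffff);		/* prints 0, as would every other new mm */
	return 0;
}

Every mm starts with context.id == 0, and 0 only fails the test because
asid_generation is normally non-zero; once the counter wraps, that property is
gone.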
Hi James,

On 05/12/17 19:33, James Morse wrote:
> Hi Julien,
>
> On 05/12/17 17:38, Julien Thierry wrote:
>> When writing the user ASID to ttbr0, 16bit get copied to ttbr0, potentially
>> including part of the ASID generation in the ttbr0.ASID. If the kernel is
>> using less than 16bits ASIDs and the other ttbr0 bits aren't RES0, two
>> tasks using the same mm context might end up running with different
>> ttbr0.ASID values.
>> This would be triggered by one of the threads being scheduled before a
>> roll-over and the second one scheduled after a roll-over.
>>
>> Pad the generation out of the 16bits of the mm id that are written to
>> ttbr0. Thus, what the hardware sees is what the kernel considers ASID.
>
> Is this to fix a system with a mix of 8/16 asid bits? Or someone being clever
> and reducing asid_bits to stress rollover?
>

So it was motivated by the latter. But hopefully the fix, pushing the generation
always above 16 bits, is also suitable for hardware that mixes 8/16bit asids.
If it is not, do call it out.

> (I think we should do something like this so we can test rollover.)
>
>
>> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
>> index 6f40170..a7c72d4 100644
>> --- a/arch/arm64/mm/context.c
>> +++ b/arch/arm64/mm/context.c
>> @@ -180,6 +190,13 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu)
>>          /* We're out of ASIDs, so increment the global generation count */
>>          generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
>>                                                   &asid_generation);
>> +
>> +        /*
>> +         * It is unlikely the generation will ever overflow, but if this
>> +         * happens, let it be known strange things can occur.
>> +         */
>> +        WARN_ON(generation == ASID_FIRST_VERSION);
>
> Won't generation wrap to zero, not '1<<16'?

Yes it will! Silly me...

>
> asid_generation-is-never-zero is one of the nasty things in this code.
>
> In check_and_switch_context() we switch to the 'fast path' where the current
> asid is usable if: the generation matches and the active_asid isn't 0 (which
> indicates a rollover has occurred).
>
> from mm/context.c::check_and_switch_context():
>>     if (!((asid ^ atomic64_read(&asid_generation)) >> asid_bits)
>>         && atomic64_xchg_relaxed(&per_cpu(active_asids, cpu), asid))
>>         goto switch_mm_fastpath;
>
>
> If asid_generation is ever zero, (even briefly) multiple new tasks with
> different pages-tables will pass the generation test, and run with asid=0.
>
> Short of xchg(ASID_MAX_VERSION, ASID_FIRST_VERSION), every time, just in case, I
> don't think this is something we can handle.
>
> But, this thing is layers on layers of subtle behaviour, so I may have missed
> something...
>

I had not thought of that. However I believe we checked with 48 bits, and the
case of the generation overflowing would take something on the scale of a human
life span of a system running and continuously spawning processes before
reaching this. Which is why we decided on having a WARN_ON for now. So I don't
know if we want to change things for this corner case, which really means
"things are going bad" more than anything else.

> Instead, do you think we can duplicate just the asid bits (asid & ASID_MASK)
> into a new 16bit field in mm_context_t, which we then use instead of the ASID()
> and mmid macros? (We only set a new asid in one place as returned by new_context()).
>

I'm not sure I understand why this prevents running with asid = 0 when
generation is 0.

> This would let context.c's asid_bits be arbitrary, as the hardware only uses the
> masked value it exposes in the new field.
>
> This doesn't solve the mixed 8/16 bit case. I think we should force those
> systems to use 8bit asids via cpufeature to preserve as much of the generation
> counter as possible.
>

Hmmm right, I see that today we just panic the kernel when we have a smaller
asid size than the boot cpu...

So if we boot on an 8bit asid cpu it should work, but not the other way around...

I agree we probably should find a way to reduce the size of software asids
system-wide when we find a cpu has fewer bits available.

Thanks,
Hi Julien,

On 06/12/17 10:17, Julien Thierry wrote:
> On 05/12/17 19:33, James Morse wrote:
>> On 05/12/17 17:38, Julien Thierry wrote:
>>> When writing the user ASID to ttbr0, 16bit get copied to ttbr0, potentially
>>> including part of the ASID generation in the ttbr0.ASID. If the kernel is
>>> using less than 16bits ASIDs and the other ttbr0 bits aren't RES0, two
>>> tasks using the same mm context might end up running with different
>>> ttbr0.ASID values.
>>> This would be triggered by one of the threads being scheduled before a
>>> roll-over and the second one scheduled after a roll-over.
>>>
>>> Pad the generation out of the 16bits of the mm id that are written to
>>> ttbr0. Thus, what the hardware sees is what the kernel considers ASID.
>>
>> Is this to fix a system with a mix of 8/16 asid bits? Or someone being clever
>> and reducing asid_bits to stress rollover?

> So it was motivated by the later. But hopefully the fix, pushing the generation
> always over 16bits is also suitable for hardware that mixes 8/16bits asids. If
> it is not, do call it out.

(okay, wasn't clear from the commit message!)

>>> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
>>> index 6f40170..a7c72d4 100644
>>> --- a/arch/arm64/mm/context.c
>>> +++ b/arch/arm64/mm/context.c
>>> @@ -180,6 +190,13 @@ static u64 new_context(struct mm_struct *mm, unsigned
>>> int cpu)
>>>          /* We're out of ASIDs, so increment the global generation count */
>>>          generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
>>>                                                   &asid_generation);
>>> +
>>> +        /*
>>> +         * It is unlikely the generation will ever overflow, but if this
>>> +         * happens, let it be known strange things can occur.
>>> +         */
>>> +        WARN_ON(generation == ASID_FIRST_VERSION);
>>
>> Won't generation wrap to zero, not '1<<16'?
>
> Yes it will! Silly me...
>
>>
>> asid_generation-is-never-zero is one of the nasty things in this code.
>>
>> In check_and_switch_context() we switch to the 'fast path' where the current
>> asid is usable if: the generation matches and the active_asid isn't 0 (which
>> indicates a rollover has occurred).
>>
>> from mm/context.c::check_and_switch_context():
>>>     if (!((asid ^ atomic64_read(&asid_generation)) >> asid_bits)
>>>         && atomic64_xchg_relaxed(&per_cpu(active_asids, cpu), asid))
>>>         goto switch_mm_fastpath;
>>
>>
>> If asid_generation is ever zero, (even briefly) multiple new tasks with
>> different pages-tables will pass the generation test, and run with asid=0.
>>
>> Short of xchg(ASID_MAX_VERSION, ASID_FIRST_VERSION), every time, just in case, I
>> don't think this is something we can handle.
>>
>> But, this thing is layers on layers of subtle behaviour, so I may have missed
>> something...

> I had not thought of that. However I believe we checked with 48bits and the case
> of the generation overflowing would take a human life span (or something on that
> scale) of a system running and spawning continuous processes before reaching

This scales with the number of CPUs, and how quickly they can fork().
(Not using all the counter bits makes me nervous.)

> this. Which is why we decided on having a WARN_ON for now. So I don't know if we
> want to change things for this corner case which really means "things are going
> bad" more than anything else.

But it fires one generation after the chaos has started, so if we ever do see
this it gives us false confidence that generation-overflow wasn't the issue.

WARN_ON(generation == ASID_MAX_VERSION) ?

>> Instead, do you think we can duplicate just the asid bits (asid & ASID_MASK)
>> into a new 16bit field in mm_context_t, which we then use instead of the ASID()
>> and mmid macros? (We only set a new asid in one place as returned by
>> new_context()).
>>
>
> I'm not sure I understand why this prevents running with asid = 0 when
> generation is 0.

Sorry, I was talking about two things. It doesn't, I don't think we can do
anything about that.

This was a suggestion for an alternative way of stopping generation bits getting
into the TTBR, which still lets us use all the counter bits. (and lets us mess
with asid_bits to stress rollover without re-inventing this bug).

>> This would let context.c's asid_bits be arbitrary, as the hardware only uses the
>> masked value it exposes in the new field.
>>
>> This doesn't solve the mixed 8/16 bit case. I think we should force those
>> systems to use 8bit asids via cpufeature to preserve as much of the generation
>> counter as possible.

> Hmmm right, I see that today we just panic the kernel when we have a smaller
> asid size than the boot cpu...
> So if we boot on a 8bit asid cpu it should work but not the other way around...
> I agree we probably should find a way to reduce the size of software asids
> system wide when we find a cpu has less bits available.

cpufeature has some stuff for 'LOWER_SAFE' that might help here...

Thanks,

James
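For readers trying to picture the suggestion above, a rough sketch of what it
might look like; the field name hw_asid and the exact place the copy is made
are assumptions of this illustration, not something spelled out in the thread.

/* Sketch of a hypothetical mm_context_t with the hardware ASID duplicated: */
typedef struct {
	atomic64_t	id;		/* generation + software ASID, as today */
	u16		hw_asid;	/* only the low asid_bits, what TTBR0 gets */
	void		*vdso;
	unsigned long	flags;
} mm_context_t;

/*
 * new_context() is the only place a fresh id is produced, so the copy would be
 * made where its result is stored, e.g. in check_and_switch_context():
 *
 *	asid = new_context(mm, cpu);
 *	atomic64_set(&mm->context.id, asid);
 *	mm->context.hw_asid = asid & GENMASK(asid_bits - 1, 0);
 *
 * The ASID() macro and cpu_do_switch_mm() would then read hw_asid instead of
 * masking context.id.counter, so asid_bits could be anything up to 16 without
 * generation bits ever reaching TTBR0.
 */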
On 06/12/17 10:56, James Morse wrote:
> Hi Julien,
>
> On 06/12/17 10:17, Julien Thierry wrote:
>> On 05/12/17 19:33, James Morse wrote:
>>> On 05/12/17 17:38, Julien Thierry wrote:
>>>> When writing the user ASID to ttbr0, 16bit get copied to ttbr0, potentially
>>>> including part of the ASID generation in the ttbr0.ASID. If the kernel is
>>>> using less than 16bits ASIDs and the other ttbr0 bits aren't RES0, two
>>>> tasks using the same mm context might end up running with different
>>>> ttbr0.ASID values.
>>>> This would be triggered by one of the threads being scheduled before a
>>>> roll-over and the second one scheduled after a roll-over.
>>>>
>>>> Pad the generation out of the 16bits of the mm id that are written to
>>>> ttbr0. Thus, what the hardware sees is what the kernel considers ASID.
>>>
>>> Is this to fix a system with a mix of 8/16 asid bits? Or someone being clever
>>> and reducing asid_bits to stress rollover?
>
>> So it was motivated by the later. But hopefully the fix, pushing the generation
>> always over 16bits is also suitable for hardware that mixes 8/16bits asids. If
>> it is not, do call it out.
>
> (okay, wasn't clear from the commit message!)
>

Fair point, I'll mention the mixed-size hardware asids in the next version.

>
>>>> diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
>>>> index 6f40170..a7c72d4 100644
>>>> --- a/arch/arm64/mm/context.c
>>>> +++ b/arch/arm64/mm/context.c
>>>> @@ -180,6 +190,13 @@ static u64 new_context(struct mm_struct *mm, unsigned
>>>> int cpu)
>>>>          /* We're out of ASIDs, so increment the global generation count */
>>>>          generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
>>>>                                                   &asid_generation);
>>>> +
>>>> +        /*
>>>> +         * It is unlikely the generation will ever overflow, but if this
>>>> +         * happens, let it be known strange things can occur.
>>>> +         */
>>>> +        WARN_ON(generation == ASID_FIRST_VERSION);
>>>
>>> Won't generation wrap to zero, not '1<<16'?
>>
>> Yes it will! Silly me...
>>
>>>
>>> asid_generation-is-never-zero is one of the nasty things in this code.
>>>
>>> In check_and_switch_context() we switch to the 'fast path' where the current
>>> asid is usable if: the generation matches and the active_asid isn't 0 (which
>>> indicates a rollover has occurred).
>>>
>>> from mm/context.c::check_and_switch_context():
>>>>     if (!((asid ^ atomic64_read(&asid_generation)) >> asid_bits)
>>>>         && atomic64_xchg_relaxed(&per_cpu(active_asids, cpu), asid))
>>>>         goto switch_mm_fastpath;
>>>
>>>
>>> If asid_generation is ever zero, (even briefly) multiple new tasks with
>>> different pages-tables will pass the generation test, and run with asid=0.
>>>
>>> Short of xchg(ASID_MAX_VERSION, ASID_FIRST_VERSION), every time, just in case, I
>>> don't think this is something we can handle.
>>>
>>> But, this thing is layers on layers of subtle behaviour, so I may have missed
>>> something...
>
>> I had not thought of that. However I believe we checked with 48bits and the case
>> of the generation overflowing would take a human life span (or something on that
>> scale) of a system running and spawning continuous processes before reaching
>
> This scales with the number of CPUs, and how quickly they can fork().
> (Not using all the counter bits makes me nervous.)
>
>
>> this. Which is why we decided on having a WARN_ON for now. So I don't know if we
>> want to change things for this corner case which really means "things are going
>> bad" more than anything else.
>
> But it fires the generation after the chaos started, so if we ever do see this
> it gives us false-confidence that generation-overflow wasn't the issue.
>
> WARN_ON(generation == ASID_MAX_VERSION) ?
>

Yes that seems fair.

>
>>> Instead, do you think we can duplicate just the asid bits (asid & ASID_MASK)
>>> into a new 16bit field in mm_context_t, which we then use instead of the ASID()
>>> and mmid macros? (We only set a new asid in one place as returned by
>>> new_context()).
>>>
>>
>> I'm not sure I understand why this prevents running with asid = 0 when
>> generation is 0.
>
> Sorry, I was talking about two things. It doesn't, I don't think we can do
> anything about that.
>
> This was a suggestion for an alternative way of stopping generation bits getting
> into the TTBR, which still lets use use all the counter bits. (and lets us mess
> with asid_bits to stress rollover without re-inventing this bug).
>

I see. The only thing that worries me is people being tempted to use this
duplicated field (ASID only) wrongly.

I'm not very familiar with ASIDs and mem contexts, so I'd like to know how this
suggestion sounds to others. If people agree, I'm happy to add this for the
next version.

>
>>> This would let context.c's asid_bits be arbitrary, as the hardware only uses the
>>> masked value it exposes in the new field.
>>>
>>> This doesn't solve the mixed 8/16 bit case. I think we should force those
>>> systems to use 8bit asids via cpufeature to preserve as much of the generation
>>> counter as possible.
>
>> Hmmm right, I see that today we just panic the kernel when we have a smaller
>> asid size than the boot cpu...
>> So if we boot on a 8bit asid cpu it should work but not the other way around...
>> I agree we probably should find a way to reduce the size of software asids
>> system wide when we find a cpu has less bits available.
>
> cpufeature has some stuff for 'LOWER_SAFE' that might help here..
>

I'll have a look.

Cheers,
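As a footnote to the WARN_ON discussion: ASID_MAX_VERSION does not exist in the
patch as posted, so the definition below is purely an assumption of this sketch
(the highest multiple of ASID_FIRST_VERSION a 64-bit counter can hold before
wrapping back to zero).

/* Hypothetical: last valid generation value before the counter wraps. */
#define ASID_MAX_VERSION	GENMASK(63, ASID_MAX_BITS)

	/* in new_context(), after bumping the generation: */
	generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
						 &asid_generation);
	/* Fires on the last valid generation, one roll-over before the wrap. */
	WARN_ON(generation == ASID_MAX_VERSION);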
On 06/12/17 11:33, Julien Thierry wrote:
>
>
> On 06/12/17 10:56, James Morse wrote:
>> Hi Julien,
>>
>> On 06/12/17 10:17, Julien Thierry wrote:
>>> On 05/12/17 19:33, James Morse wrote:
>>>> On 05/12/17 17:38, Julien Thierry wrote:
>>
>>>> This would let context.c's asid_bits be arbitrary, as the hardware only uses the
>>>> masked value it exposes in the new field.
>>>>
>>>> This doesn't solve the mixed 8/16 bit case. I think we should force those
>>>> systems to use 8bit asids via cpufeature to preserve as much of the generation
>>>> counter as possible.
>>
>>> Hmmm right, I see that today we just panic the kernel when we have a smaller
>>> asid size than the boot cpu...
>>> So if we boot on a 8bit asid cpu it should work but not the other way around...
>>> I agree we probably should find a way to reduce the size of software asids
>>> system wide when we find a cpu has less bits available.
>>
>> cpufeature has some stuff for 'LOWER_SAFE' that might help here..
>>
>
> I'll have a look.

The problem with ASIDs is that you have to allocate the ASID map early in the
init, before any of the secondaries are brought up. So this is something that
has to be decided based on the boot CPU, and hence is treated as an "early"
feature (without really a CAP bit). So all new CPUs must have at least the ASID
size of the boot CPU, failing which we panic the system via
verify_cpu_asid_bits().

I am not sure if you can really delay the asid_map allocation. You could
probably start there.

Cheers
Suzuki

>
> Cheers,
>
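For reference, the check Suzuki mentions lives in arch/arm64/mm/context.c and
looks roughly like the sketch below (reproduced from memory of kernels of that
era, so treat the details as approximate): each secondary compares its own ASID
width against the boot CPU's asid_bits and panics if it has fewer bits.

void verify_cpu_asid_bits(void)
{
	u32 asid = get_cpu_asid_bits();

	if (asid < asid_bits) {
		/*
		 * We cannot decrease the ASID size at runtime, so panic if a
		 * secondary CPU supports fewer ASID bits than the boot CPU.
		 */
		pr_crit("CPU%d: smaller ASID size(%u) than boot CPU (%u)\n",
			smp_processor_id(), asid, asid_bits);
		cpu_panic_kernel();
	}
}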
diff --git a/arch/arm64/include/asm/mmu.h b/arch/arm64/include/asm/mmu.h
index 0d34bf0..61e5436 100644
--- a/arch/arm64/include/asm/mmu.h
+++ b/arch/arm64/include/asm/mmu.h
@@ -16,6 +16,10 @@
 #ifndef __ASM_MMU_H
 #define __ASM_MMU_H
 
+#define ASID_MAX_BITS	16
+
+#ifndef __ASSEMBLY__
+
 #define MMCF_AARCH32	0x1	/* mm context flag for AArch32 executables */
 
 typedef struct {
@@ -29,7 +33,8 @@
  * ASID change and therefore doesn't need to reload the counter using
  * atomic64_read.
  */
-#define ASID(mm)	((mm)->context.id.counter & 0xffff)
+#define ASID(mm)	\
+	((mm)->context.id.counter & GENMASK(ASID_MAX_BITS - 1, 0))
 
 extern void paging_init(void);
 extern void bootmem_init(void);
@@ -41,4 +46,5 @@ extern void create_pgd_mapping(struct mm_struct *mm, phys_addr_t phys,
 extern void *fixmap_remap_fdt(phys_addr_t dt_phys);
 extern void mark_linear_text_alias_ro(void);
 
+#endif	/* __ASSEMBLY__ */
 #endif
diff --git a/arch/arm64/mm/context.c b/arch/arm64/mm/context.c
index 6f40170..a7c72d4 100644
--- a/arch/arm64/mm/context.c
+++ b/arch/arm64/mm/context.c
@@ -37,9 +37,9 @@ static DEFINE_PER_CPU(u64, reserved_asids);
 static cpumask_t tlb_flush_pending;
 
+#define ASID_FIRST_VERSION	(1UL << ASID_MAX_BITS)
 #define ASID_MASK		(~GENMASK(asid_bits - 1, 0))
-#define ASID_FIRST_VERSION	(1UL << asid_bits)
-#define NUM_USER_ASIDS		ASID_FIRST_VERSION
+#define NUM_USER_ASIDS		(1UL << asid_bits)
 
 /* Get the ASIDBits supported by the current CPU */
 static u32 get_cpu_asid_bits(void)
@@ -60,6 +60,8 @@ static u32 get_cpu_asid_bits(void)
 		asid = 16;
 	}
 
+	WARN_ON(asid > ASID_MAX_BITS);
+
 	return asid;
 }
 
@@ -142,6 +144,14 @@ static bool check_update_reserved_asid(u64 asid, u64 newasid)
 	return hit;
 }
 
+/*
+ * Format of ASID is:
+ * - bits <asid_bits - 1>.. 0 -> actual ASID
+ * - bits 63..16 -> ASID generation
+ *
+ * Generation is padded to the maximum supported ASID size
+ * to avoid it being taken into account in the ttbr.
+ */
 static u64 new_context(struct mm_struct *mm, unsigned int cpu)
 {
 	static u32 cur_idx = 1;
@@ -180,6 +190,13 @@ static u64 new_context(struct mm_struct *mm, unsigned int cpu)
 	/* We're out of ASIDs, so increment the global generation count */
 	generation = atomic64_add_return_relaxed(ASID_FIRST_VERSION,
 						 &asid_generation);
+
+	/*
+	 * It is unlikely the generation will ever overflow, but if this
+	 * happens, let it be known strange things can occur.
+	 */
+	WARN_ON(generation == ASID_FIRST_VERSION);
+
 	flush_context(cpu);
 
 	/* We have more ASIDs than CPUs, so this will always succeed */
diff --git a/arch/arm64/mm/proc.S b/arch/arm64/mm/proc.S
index 95233df..33c7f13 100644
--- a/arch/arm64/mm/proc.S
+++ b/arch/arm64/mm/proc.S
@@ -23,6 +23,7 @@
 #include <asm/assembler.h>
 #include <asm/asm-offsets.h>
 #include <asm/hwcap.h>
+#include <asm/mmu.h>
 #include <asm/pgtable.h>
 #include <asm/pgtable-hwdef.h>
 #include <asm/cpufeature.h>
@@ -140,7 +141,7 @@ ENDPROC(cpu_do_resume)
 ENTRY(cpu_do_switch_mm)
 	pre_ttbr0_update_workaround x0, x2, x3
 	mmid	x1, x1				// get mm->context.id
-	bfi	x0, x1, #48, #16		// set the ASID
+	bfi	x0, x1, #48, #ASID_MAX_BITS	// set the ASID
 	msr	ttbr0_el1, x0			// set TTBR0
 	isb
 	post_ttbr0_update_workaround
When writing the user ASID to ttbr0, 16bit get copied to ttbr0, potentially
including part of the ASID generation in the ttbr0.ASID. If the kernel is
using less than 16bits ASIDs and the other ttbr0 bits aren't RES0, two
tasks using the same mm context might end up running with different
ttbr0.ASID values.

This would be triggered by one of the threads being scheduled before a
roll-over and the second one scheduled after a roll-over.

Pad the generation out of the 16bits of the mm id that are written to
ttbr0. Thus, what the hardware sees is what the kernel considers ASID.

Signed-off-by: Julien Thierry <julien.thierry@arm.com>
Reported-by: Will Deacon <will.deacon@arm.com>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Catalin Marinas <catalin.marinas@arm.com>
Cc: James Morse <james.morse@arm.com>
Cc: Vladimir Murzin <vladimir.murzin@arm.com>
---
 arch/arm64/include/asm/mmu.h |  8 +++++++-
 arch/arm64/mm/context.c      | 21 +++++++++++++++++++--
 arch/arm64/mm/proc.S         |  3 ++-
 3 files changed, 28 insertions(+), 4 deletions(-)

--
1.9.1
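To make the failure mode described in the commit message concrete, here is a
small stand-alone sketch with illustrative values (not code from the patch),
assuming asid_bits has been reduced to 8, so that the pre-patch
ASID_FIRST_VERSION is 1 << 8 and the generation starts at bit 8 of
mm->context.id.

#include <stdio.h>

int main(void)
{
	unsigned int asid_bits = 8;				/* assumed, e.g. to stress roll-over */
	unsigned long long gen_before = 3ULL << asid_bits;	/* generation before the roll-over */
	unsigned long long gen_after  = 4ULL << asid_bits;	/* generation after the roll-over */
	unsigned long long sw_asid    = 0x42;			/* software ASID kept by the shared mm */

	/* cpu_do_switch_mm() copies bits [15:0] of the id into ttbr0.ASID: */
	unsigned int ttbr_asid_a = (gen_before | sw_asid) & 0xffff;	/* thread scheduled before roll-over */
	unsigned int ttbr_asid_b = (gen_after  | sw_asid) & 0xffff;	/* thread scheduled after roll-over */

	/* 0x342 vs 0x442: one mm, two different hardware ASIDs. */
	printf("ttbr0.ASID: thread A = 0x%x, thread B = 0x%x\n",
	       ttbr_asid_a, ttbr_asid_b);
	return 0;
}

With the patch applied the generation always starts at bit 16, so the 16-bit
value written to ttbr0.ASID is 0x42 for both threads.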