Message ID: 20170920201714.19817-10-pasha.tatashin@oracle.com (mailing list archive)
State: New, archived
Hi Pavel,

On Wed, Sep 20, 2017 at 04:17:11PM -0400, Pavel Tatashin wrote:
> During early boot, kasan uses vmemmap_populate() to establish its shadow
> memory. But, that interface is intended for struct pages use.
>
> Because of the current project, vmemmap won't be zeroed during allocation,
> but kasan expects that memory to be zeroed. We are adding a new
> kasan_map_populate() function to resolve this difference.

Thanks for putting this together.

I've given this a spin on arm64, and can confirm that it works.

Given that this involves redundant walking of page tables, I still think
it'd be preferable to have some common *_populate() helper that took a
gfp argument, but I guess it's not the end of the world.

I'll leave it to Will and Catalin to say whether they're happy with the
page table walking and the new p{u,m}d_large() helpers added to arm64.

Thanks,
Mark.

> Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
> ---
>  arch/arm64/include/asm/pgtable.h |  3 ++
>  include/linux/kasan.h            |  2 ++
>  mm/kasan/kasan_init.c            | 67 ++++++++++++++++++++++++++++++++++++++++
>  3 files changed, 72 insertions(+)
> [...]
Hi Mark,

I considered using a new *populate() function for shadow without using
vmemmap_populate(), but that makes things unnecessarily complicated,
because vmemmap_populate() has built in:

1. large page support
2. device memory support
3. node locality support
4. several config-based variants on different platforms

All of that would simply have to be duplicated on each platform if we
wanted to support it in kasan. We could limit ourselves to only
supporting base pages by using something like
vmemmap_populate_basepages(), but that is a step backward: kasan
benefits from using large pages now, so why remove that?

So, the solution I provide is walking the page table right after memory
is mapped. Since we are using the actual page table, it is guaranteed
that we are not going to miss any mapped memory, and it is also in
common code, which makes things smaller and nicer.

Thank you,
Pasha

On 10/03/2017 10:48 AM, Mark Rutland wrote:
>
> I've given this a spin on arm64, and can confirm that it works.
>
> Given that this involves redundant walking of page tables, I still think
> it'd be preferable to have some common *_populate() helper that took a
> gfp argument, but I guess it's not the end of the world.
>
> I'll leave it to Will and Catalin to say whether they're happy with the
> page table walking and the new p{u,m}d_large() helpers added to arm64.
>
On Tue, Oct 03, 2017 at 03:48:46PM +0100, Mark Rutland wrote:
> On Wed, Sep 20, 2017 at 04:17:11PM -0400, Pavel Tatashin wrote:
> > During early boot, kasan uses vmemmap_populate() to establish its shadow
> > memory. But, that interface is intended for struct pages use.
> >
> > Because of the current project, vmemmap won't be zeroed during allocation,
> > but kasan expects that memory to be zeroed. We are adding a new
> > kasan_map_populate() function to resolve this difference.
>
> Thanks for putting this together.
>
> I've given this a spin on arm64, and can confirm that it works.
>
> Given that this involves redundant walking of page tables, I still think
> it'd be preferable to have some common *_populate() helper that took a
> gfp argument, but I guess it's not the end of the world.
>
> I'll leave it to Will and Catalin to say whether they're happy with the
> page table walking and the new p{u,m}d_large() helpers added to arm64.

To be honest, it just looks completely backwards to me; we're walking the
page tables we created earlier on so that we can figure out what needs to
be zeroed for KASAN. We already had that information before, hence my
preference to allow propagation of GFP flags to vmemmap_alloc_block when
it's needed. I know that's not popular for some reason, but is walking the
page tables really better?

Will
Hi Will,

I can go back to that approach, if Michal is OK with it. But, that would
mean that I would need to touch every single architecture that
implements vmemmap_populate(), and also pass flags at least through
these functions on every architecture (some have more than one, decided
by configs):

vmemmap_populate()
vmemmap_populate_basepages()
vmemmap_populate_hugepages()
vmemmap_pte_populate()
__vmemmap_alloc_block_buf()
alloc_block_buf()
vmemmap_alloc_block()

IMO, while I understand that it looks strange that we must walk the page
table after creating it, it is a better approach: more enclosed, as it
affects kasan only, and more universal, as it is in common code.

We are also somewhat late in the review process, which means we would
again need to get ACKs from the maintainers of other arches.

Pavel

On Mon, Oct 9, 2017 at 1:13 PM, Will Deacon <will.deacon@arm.com> wrote:
> On Tue, Oct 03, 2017 at 03:48:46PM +0100, Mark Rutland wrote:
>> On Wed, Sep 20, 2017 at 04:17:11PM -0400, Pavel Tatashin wrote:
>> > During early boot, kasan uses vmemmap_populate() to establish its shadow
>> > memory. But, that interface is intended for struct pages use.
>> >
>> > Because of the current project, vmemmap won't be zeroed during allocation,
>> > but kasan expects that memory to be zeroed. We are adding a new
>> > kasan_map_populate() function to resolve this difference.
>>
>> Thanks for putting this together.
>>
>> I've given this a spin on arm64, and can confirm that it works.
>>
>> Given that this involves redundant walking of page tables, I still think
>> it'd be preferable to have some common *_populate() helper that took a
>> gfp argument, but I guess it's not the end of the world.
>>
>> I'll leave it to Will and Catalin to say whether they're happy with the
>> page table walking and the new p{u,m}d_large() helpers added to arm64.
>
> To be honest, it just looks completely backwards to me; we're walking the
> page tables we created earlier on so that we can figure out what needs to
> be zeroed for KASAN. We already had that information before, hence my
> preference to allow propagation of GFP flags to vmemmap_alloc_block when
> it's needed. I know that's not popular for some reason, but is walking the
> page tables really better?
>
> Will
On Mon 09-10-17 13:51:47, Pavel Tatashin wrote:
> Hi Will,
>
> I can go back to that approach, if Michal is OK with it. But, that would
> mean that I would need to touch every single architecture that
> implements vmemmap_populate(), and also pass flags at least through
> these functions on every architecture (some have more than one, decided
> by configs):
>
> vmemmap_populate()
> vmemmap_populate_basepages()
> vmemmap_populate_hugepages()
> vmemmap_pte_populate()
> __vmemmap_alloc_block_buf()
> alloc_block_buf()
> vmemmap_alloc_block()
>
> IMO, while I understand that it looks strange that we must walk the page
> table after creating it, it is a better approach: more enclosed, as it
> affects kasan only, and more universal, as it is in common code.

While I understand that the gfp mask approach might look better at first
sight, this is by no means a general purpose API, so I would rather be
pragmatic and have a smaller code footprint than a more general
interface. Kasan is pretty much a special case, and doing a one-time,
two-pass initialization is imho acceptable. If this turns out to be
impractical in future then let's fix it up, but right now I would rather
go the simpler path.
Hi Pavel,

On Mon, Oct 09, 2017 at 01:51:47PM -0400, Pavel Tatashin wrote:
> I can go back to that approach, if Michal is OK with it. But, that would
> mean that I would need to touch every single architecture that
> implements vmemmap_populate(), and also pass flags at least through
> these functions on every architecture (some have more than one, decided
> by configs):
>
> vmemmap_populate()
> vmemmap_populate_basepages()
> vmemmap_populate_hugepages()
> vmemmap_pte_populate()
> __vmemmap_alloc_block_buf()
> alloc_block_buf()
> vmemmap_alloc_block()

As an interim step, why not introduce something like
vmemmap_alloc_block_flags and make the page-table walking opt-out for
architectures that don't want it? Then we can just pass __GFP_ZERO from
our vmemmap_populate where necessary and other architectures can do the
page-table walking dance if they prefer.

> IMO, while I understand that it looks strange that we must walk the page
> table after creating it, it is a better approach: more enclosed, as it
> affects kasan only, and more universal, as it is in common code.

I don't buy the more universal aspect, but I appreciate it's subjective.
Frankly, I'd just sooner not have core code walking early page tables if
it can be avoided, and it doesn't look hard to avoid it in this case.
The fact that you're having to add pmd_large and pud_large, which are
otherwise unused in mm/, is an indication that this isn't quite right imo.

Will
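As an illustration of the kind of helper being suggested here, a minimal sketch might look like the following; the name vmemmap_alloc_block_flags(), its callers, and the early-boot zeroing detail are assumptions made for this sketch, not code from the posted series:

/* mm/sparse-vmemmap.c (sketch only; name and call chain are assumptions) */
void * __meminit vmemmap_alloc_block_flags(unsigned long size, int node,
					   gfp_t gfp_mask)
{
	void *ptr;

	if (slab_is_available()) {
		struct page *page;

		page = alloc_pages_node(node, gfp_mask | __GFP_NOWARN,
					get_order(size));
		return page ? page_address(page) : NULL;
	}

	/*
	 * Early boot: with this series the memblock allocation is no longer
	 * zeroed, so honour __GFP_ZERO explicitly for callers (like kasan)
	 * that ask for it.
	 */
	ptr = __earlyonly_bootmem_alloc(node, size, size,
					__pa(MAX_DMA_ADDRESS));
	if (ptr && (gfp_mask & __GFP_ZERO))
		memset(ptr, 0, size);
	return ptr;
}

/* Existing callers would keep today's behaviour via a thin wrapper. */
void * __meminit vmemmap_alloc_block(unsigned long size, int node)
{
	return vmemmap_alloc_block_flags(size, node, GFP_KERNEL);
}

With such a helper, the arm64 vmemmap_populate() path used for the kasan shadow could request __GFP_ZERO instead of relying on a later page-table walk.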
Hi Will,

In addition to what Michal wrote:

> As an interim step, why not introduce something like
> vmemmap_alloc_block_flags and make the page-table walking opt-out for
> architectures that don't want it? Then we can just pass __GFP_ZERO from
> our vmemmap_populate where necessary and other architectures can do the
> page-table walking dance if they prefer.

I do not see the benefit: implementing this approach means that we
would need to implement two table walks instead of one, one for x86 and
another for ARM, as these two architectures support kasan. Also, this
would become a requirement for any future architecture that wants to
add kasan support.

>> IMO, while I understand that it looks strange that we must walk the page
>> table after creating it, it is a better approach: more enclosed, as it
>> affects kasan only, and more universal, as it is in common code.
>
> I don't buy the more universal aspect, but I appreciate it's subjective.
> Frankly, I'd just sooner not have core code walking early page tables if
> it can be avoided, and it doesn't look hard to avoid it in this case.
> The fact that you're having to add pmd_large and pud_large, which are
> otherwise unused in mm/, is an indication that this isn't quite right imo.

+#define pmd_large(pmd)	pmd_sect(pmd)
+#define pud_large(pud)	pud_sect(pud)

This is just a naming difference: arm64 calls them pmd_sect(), while
common mm code and other arches call them pmd_large()/pud_large(). Even
arm has these defines, in:

arm/include/asm/pgtable-3level.h
arm/include/asm/pgtable-2level.h

Pavel
On Mon, Oct 09, 2017 at 08:14:33PM +0200, Michal Hocko wrote:
> On Mon 09-10-17 13:51:47, Pavel Tatashin wrote:
> > I can go back to that approach, if Michal is OK with it. But, that would
> > mean that I would need to touch every single architecture that
> > implements vmemmap_populate(), and also pass flags at least through
> > these functions on every architecture (some have more than one, decided
> > by configs):
> >
> > vmemmap_populate()
> > vmemmap_populate_basepages()
> > vmemmap_populate_hugepages()
> > vmemmap_pte_populate()
> > __vmemmap_alloc_block_buf()
> > alloc_block_buf()
> > vmemmap_alloc_block()
> >
> > IMO, while I understand that it looks strange that we must walk the page
> > table after creating it, it is a better approach: more enclosed, as it
> > affects kasan only, and more universal, as it is in common code.
>
> While I understand that the gfp mask approach might look better at first
> sight, this is by no means a general purpose API, so I would rather be
> pragmatic and have a smaller code footprint than a more general
> interface. Kasan is pretty much a special case, and doing a one-time,
> two-pass initialization is imho acceptable. If this turns out to be
> impractical in future then let's fix it up, but right now I would rather
> go the simpler path.

I think the simpler path for arm64 is really to say when we want the
memory zeroing, as opposed to exposing pmd_large/pud_large macros. Those
are likely to grow more users too, but are difficult to use correctly, as
we have things like contiguous ptes that map to a granule smaller than a
pmd.

I proposed an alternative solution to Pavel already, but it could be made
less general purpose by marking the function __meminit and only having it
do anything if KASAN is compiled in.

Will
On Mon, Oct 09, 2017 at 02:42:32PM -0400, Pavel Tatashin wrote:
> Hi Will,
>
> In addition to what Michal wrote:
>
> > As an interim step, why not introduce something like
> > vmemmap_alloc_block_flags and make the page-table walking opt-out for
> > architectures that don't want it? Then we can just pass __GFP_ZERO from
> > our vmemmap_populate where necessary and other architectures can do the
> > page-table walking dance if they prefer.
>
> I do not see the benefit: implementing this approach means that we
> would need to implement two table walks instead of one, one for x86 and
> another for ARM, as these two architectures support kasan. Also, this
> would become a requirement for any future architecture that wants to
> add kasan support.

We have two table walks even with your patch series applied afaict: one in
our definition of vmemmap_populate (arch/arm64/mm/mmu.c) and this one
in the core code.

> >> IMO, while I understand that it looks strange that we must walk the page
> >> table after creating it, it is a better approach: more enclosed, as it
> >> affects kasan only, and more universal, as it is in common code.
> >
> > I don't buy the more universal aspect, but I appreciate it's subjective.
> > Frankly, I'd just sooner not have core code walking early page tables if
> > it can be avoided, and it doesn't look hard to avoid it in this case.
> > The fact that you're having to add pmd_large and pud_large, which are
> > otherwise unused in mm/, is an indication that this isn't quite right imo.
>
> +#define pmd_large(pmd)	pmd_sect(pmd)
> +#define pud_large(pud)	pud_sect(pud)
>
> This is just a naming difference: arm64 calls them pmd_sect(), while
> common mm code and other arches call them pmd_large()/pud_large(). Even
> arm has these defines, in:
>
> arm/include/asm/pgtable-3level.h
> arm/include/asm/pgtable-2level.h

My worry is that these are actually highly arch-specific, but will likely
grow more users in mm/ that assume things for all architectures that aren't
necessarily valid.

Will
Hi Will,

> We have two table walks even with your patch series applied afaict: one in
> our definition of vmemmap_populate (arch/arm64/mm/mmu.c) and this one
> in the core code.

I meant to say implementing two new page table walkers, not two walks at
runtime.

> My worry is that these are actually highly arch-specific, but will likely
> grow more users in mm/ that assume things for all architectures that aren't
> necessarily valid.

I see. How about moving the new kasan_map_populate() implementation into
arch-dependent code:

arch/x86/mm/kasan_init_64.c
arch/arm64/mm/kasan_init.c

This way we won't need to add pmd_large()/pud_large() macros for arm64.

Pavel
Hi Pavel,

On Mon, Oct 09, 2017 at 02:59:09PM -0400, Pavel Tatashin wrote:
> > We have two table walks even with your patch series applied afaict: one in
> > our definition of vmemmap_populate (arch/arm64/mm/mmu.c) and this one
> > in the core code.
>
> I meant to say implementing two new page table walkers, not two walks at
> runtime.

Ok, but I'm still missing why you think that is needed. What would be the
second page table walker that needs implementing?

> > My worry is that these are actually highly arch-specific, but will likely
> > grow more users in mm/ that assume things for all architectures that aren't
> > necessarily valid.
>
> I see. How about moving the new kasan_map_populate() implementation into
> arch-dependent code:
>
> arch/x86/mm/kasan_init_64.c
> arch/arm64/mm/kasan_init.c
>
> This way we won't need to add pmd_large()/pud_large() macros for arm64.

I guess we could implement that on arm64 using our current vmemmap_populate
logic and an explicit memset.

Will
> Ok, but I'm still missing why you think that is needed. What would be the
> second page table walker that needs implementing?
>
> I guess we could implement that on arm64 using our current vmemmap_populate
> logic and an explicit memset.

Hi Will,

What do you mean by explicit memset()? We can't simply memset() from
start to end without doing the page table walk, because at the time kasan
is calling vmemmap_populate() we have a tmp_pg_dir instead of
swapper_pg_dir.

We could do the explicit memset() after
cpu_replace_ttbr1(lm_alias(swapper_pg_dir)); but again, this was in one of
my previous implementations, and I was asked to replace that.

Pavel
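For readers without the arm64 code in front of them, a heavily abridged sketch of the ordering Pavel is describing; this is reconstructed from the arm64 kasan_init() of this era and is only an illustration, with declarations and most of the shadow setup omitted:

/* Abridged sketch of arm64 kasan_init() ordering; not complete code. */
void __init kasan_init(void)
{
	u64 kimg_shadow_start, kimg_shadow_end;

	/* (computation of kimg_shadow_start/end omitted) */

	/* Keep the early shadow usable while swapper's shadow is rebuilt. */
	memcpy(tmp_pg_dir, swapper_pg_dir, sizeof(tmp_pg_dir));
	dsb(ishst);
	cpu_replace_ttbr1(lm_alias(tmp_pg_dir));

	clear_pgds(KASAN_SHADOW_START, KASAN_SHADOW_END);

	/*
	 * The new shadow is created in swapper_pg_dir while the CPU is
	 * still running on tmp_pg_dir, so a memset() of [start, end)
	 * through the shadow virtual addresses would still hit the early
	 * shadow mappings rather than the pages populated here; zeroing
	 * has to go through the linear map (as kasan_map_populate() does)
	 * or wait until after the switch back below.
	 */
	vmemmap_populate(kimg_shadow_start, kimg_shadow_end,
			 pfn_to_nid(virt_to_pfn(lm_alias(_text))));

	/* (population of the rest of the shadow omitted) */

	cpu_replace_ttbr1(lm_alias(swapper_pg_dir));
}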
>> I guess we could implement that on arm64 using our current vmemmap_populate
>> logic and an explicit memset.

Hi Will,

I will send out a new patch series with x86/arm64 versions of
kasan_map_populate(), so you could take a look if this is something that
is acceptable.

Thank you,
Pavel
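Ahead of that follow-up series, a rough sketch of what an arch-local arm64 kasan_map_populate() could look like; it mirrors the generic version in the patch below but drops the p4d level and uses arm64's existing pmd_sect()/pud_sect() helpers, so no new pmd_large()/pud_large() macros are needed. This is an illustration, not the patch that was eventually posted:

/* arch/arm64/mm/kasan_init.c (illustrative sketch only) */
static int __meminit kasan_map_populate(unsigned long start, unsigned long end,
					int node)
{
	unsigned long addr, pfn, next;
	unsigned long long size;
	pgd_t *pgd;
	pud_t *pud;
	pmd_t *pmd;
	pte_t *pte;
	int ret;

	ret = vmemmap_populate(start, end, node);
	/*
	 * The range may be only partially populated; zero just the pieces
	 * that actually have entries, at whatever granularity they were
	 * mapped (section or base page).
	 */
	for (addr = start; addr < end; addr = next) {
		pgd = pgd_offset_k(addr);
		if (pgd_none(*pgd)) {
			next = pgd_addr_end(addr, end);
			continue;
		}

		pud = pud_offset(pgd, addr);
		if (pud_none(*pud)) {
			next = pud_addr_end(addr, end);
			continue;
		}
		if (pud_sect(*pud)) {
			/* PUD-sized section mapping */
			next = pud_addr_end(addr, end);
			size = PUD_SIZE;
			pfn = pud_pfn(*pud);
		} else {
			pmd = pmd_offset(pud, addr);
			if (pmd_none(*pmd)) {
				next = pmd_addr_end(addr, end);
				continue;
			}
			if (pmd_sect(*pmd)) {
				/* PMD-sized section mapping */
				next = pmd_addr_end(addr, end);
				size = PMD_SIZE;
				pfn = pmd_pfn(*pmd);
			} else {
				pte = pte_offset_kernel(pmd, addr);
				next = addr + PAGE_SIZE;
				if (pte_none(*pte))
					continue;
				/* base page */
				size = PAGE_SIZE;
				pfn = pte_pfn(*pte);
			}
		}
		/* Zero through the linear map, not the shadow addresses. */
		memset(phys_to_virt(PFN_PHYS(pfn)), 0, size);
	}
	return ret;
}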
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index bc4e92337d16..d89713f04354 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -381,6 +381,9 @@ extern pgprot_t phys_mem_access_prot(struct file *file, unsigned long pfn,
 				 PUD_TYPE_TABLE)
 #endif
 
+#define pmd_large(pmd)	pmd_sect(pmd)
+#define pud_large(pud)	pud_sect(pud)
+
 static inline void set_pmd(pmd_t *pmdp, pmd_t pmd)
 {
 	*pmdp = pmd;
diff --git a/include/linux/kasan.h b/include/linux/kasan.h
index a5c7046f26b4..7e13df1722c2 100644
--- a/include/linux/kasan.h
+++ b/include/linux/kasan.h
@@ -78,6 +78,8 @@ size_t kasan_metadata_size(struct kmem_cache *cache);
 
 bool kasan_save_enable_multi_shot(void);
 void kasan_restore_multi_shot(bool enabled);
+int __meminit kasan_map_populate(unsigned long start, unsigned long end,
+				 int node);
 
 #else /* CONFIG_KASAN */
 
diff --git a/mm/kasan/kasan_init.c b/mm/kasan/kasan_init.c
index 554e4c0f23a2..57a973f05f63 100644
--- a/mm/kasan/kasan_init.c
+++ b/mm/kasan/kasan_init.c
@@ -197,3 +197,70 @@ void __init kasan_populate_zero_shadow(const void *shadow_start,
 		zero_p4d_populate(pgd, addr, next);
 	} while (pgd++, addr = next, addr != end);
 }
+
+/* Creates mappings for kasan during early boot. The mapped memory is zeroed */
+int __meminit kasan_map_populate(unsigned long start, unsigned long end,
+				 int node)
+{
+	unsigned long addr, pfn, next;
+	unsigned long long size;
+	pgd_t *pgd;
+	p4d_t *p4d;
+	pud_t *pud;
+	pmd_t *pmd;
+	pte_t *pte;
+	int ret;
+
+	ret = vmemmap_populate(start, end, node);
+	/*
+	 * We might have partially populated memory, so check for no entries,
+	 * and zero only those that actually exist.
+	 */
+	for (addr = start; addr < end; addr = next) {
+		pgd = pgd_offset_k(addr);
+		if (pgd_none(*pgd)) {
+			next = pgd_addr_end(addr, end);
+			continue;
+		}
+
+		p4d = p4d_offset(pgd, addr);
+		if (p4d_none(*p4d)) {
+			next = p4d_addr_end(addr, end);
+			continue;
+		}
+
+		pud = pud_offset(p4d, addr);
+		if (pud_none(*pud)) {
+			next = pud_addr_end(addr, end);
+			continue;
+		}
+		if (pud_large(*pud)) {
+			/* This is PUD size page */
+			next = pud_addr_end(addr, end);
+			size = PUD_SIZE;
+			pfn = pud_pfn(*pud);
+		} else {
+			pmd = pmd_offset(pud, addr);
+			if (pmd_none(*pmd)) {
+				next = pmd_addr_end(addr, end);
+				continue;
+			}
+			if (pmd_large(*pmd)) {
+				/* This is PMD size page */
+				next = pmd_addr_end(addr, end);
+				size = PMD_SIZE;
+				pfn = pmd_pfn(*pmd);
+			} else {
+				pte = pte_offset_kernel(pmd, addr);
+				next = addr + PAGE_SIZE;
+				if (pte_none(*pte))
+					continue;
+				/* This is base size page */
+				size = PAGE_SIZE;
+				pfn = pte_pfn(*pte);
+			}
+		}
+		memset(phys_to_virt(PFN_PHYS(pfn)), 0, size);
+	}
+	return ret;
+}
During early boot, kasan uses vmemmap_populate() to establish its shadow
memory. But, that interface is intended for struct pages use.

Because of the current project, vmemmap won't be zeroed during allocation,
but kasan expects that memory to be zeroed. We are adding a new
kasan_map_populate() function to resolve this difference.

Signed-off-by: Pavel Tatashin <pasha.tatashin@oracle.com>
---
 arch/arm64/include/asm/pgtable.h |  3 ++
 include/linux/kasan.h            |  2 ++
 mm/kasan/kasan_init.c            | 67 ++++++++++++++++++++++++++++++++++++++++
 3 files changed, 72 insertions(+)
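The companion call-site change is not part of this patch. As a sketch of how the arm64 shadow setup would switch over (the file and the kimg_shadow_* arguments are assumed from the arm64 kasan_init() of this era, not taken from this series):

--- a/arch/arm64/mm/kasan_init.c
+++ b/arch/arm64/mm/kasan_init.c
@@ ... @@ void __init kasan_init(void)
-	vmemmap_populate(kimg_shadow_start, kimg_shadow_end,
-			 pfn_to_nid(virt_to_pfn(lm_alias(_text))));
+	/* illustrative: kasan_map_populate() maps the shadow and zeroes it */
+	kasan_map_populate(kimg_shadow_start, kimg_shadow_end,
+			   pfn_to_nid(virt_to_pfn(lm_alias(_text))));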