Message ID | 20140209221826.GA30556@www.outflux.net (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Sun, Feb 09, 2014 at 10:18:26PM +0000, Kees Cook wrote: > diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c > index 1f7b1e13d945..ff1559f9200c 100644 > --- a/arch/arm/mm/dump.c > +++ b/arch/arm/mm/dump.c > @@ -264,6 +264,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start) > note_page(st, addr, 3, pmd_val(*pmd)); > else > walk_pte(st, pmd, addr); > + > + if (SECTION_SIZE < PMD_SIZE && pmd_sect(*pmd)) > + note_page(st, addr + SECTION_SIZE, 3, pmd_val(pmd[1])); You can use pmd_large() here as well. But I think this function is broken (the "for" statement not shown here). The pmd_t is 32-bit with classic MMU and it uses pmd++ while the address grows by PMD_SIZE (two pmd_t entries).
On Mon, Feb 10, 2014 at 10:29:35AM +0000, Catalin Marinas wrote: > On Sun, Feb 09, 2014 at 10:18:26PM +0000, Kees Cook wrote: > > diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c > > index 1f7b1e13d945..ff1559f9200c 100644 > > --- a/arch/arm/mm/dump.c > > +++ b/arch/arm/mm/dump.c > > @@ -264,6 +264,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start) > > note_page(st, addr, 3, pmd_val(*pmd)); > > else > > walk_pte(st, pmd, addr); > > + > > + if (SECTION_SIZE < PMD_SIZE && pmd_sect(*pmd)) > > + note_page(st, addr + SECTION_SIZE, 3, pmd_val(pmd[1])); > > You can use pmd_large() here as well. > > But I think this function is broken (the "for" statement not shown > here). The pmd_t is 32-bit with classic MMU and it uses pmd++ while the > address grows by PMD_SIZE (two pmd_t entries). Actually it's ok since PTRS_PER_PMD is 1, so it only goes through this loop once. But in your patch shouldn't you check for pmd_large(*(pmd+1))? The first pmd is already caught by the 'if' statement.
On Mon, Feb 10, 2014 at 2:41 AM, Catalin Marinas <catalin.marinas@arm.com> wrote: > On Mon, Feb 10, 2014 at 10:29:35AM +0000, Catalin Marinas wrote: >> On Sun, Feb 09, 2014 at 10:18:26PM +0000, Kees Cook wrote: >> > diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c >> > index 1f7b1e13d945..ff1559f9200c 100644 >> > --- a/arch/arm/mm/dump.c >> > +++ b/arch/arm/mm/dump.c >> > @@ -264,6 +264,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start) >> > note_page(st, addr, 3, pmd_val(*pmd)); >> > else >> > walk_pte(st, pmd, addr); >> > + >> > + if (SECTION_SIZE < PMD_SIZE && pmd_sect(*pmd)) >> > + note_page(st, addr + SECTION_SIZE, 3, pmd_val(pmd[1])); >> >> You can use pmd_large() here as well. >> >> But I think this function is broken (the "for" statement not shown >> here). The pmd_t is 32-bit with classic MMU and it uses pmd++ while the >> address grows by PMD_SIZE (two pmd_t entries). > > Actually it's ok since PTRS_PER_PMD is 1, so it only goes through this > loop once. > > But in your patch shouldn't you check for pmd_large(*(pmd+1))? The first > pmd is already caught by the 'if' statement. It wasn't clear to me what the logic should be here. If PTRS_PER_PMD is 1, then why is there this second pmd after the first? Shouldn't PTRS_PER_PMD be 2 if that's the case? If that's not the case, then I figured the state of needing to report the 2nd pmd depended on the type of the first one, so that's what I wrote instead of trying to figure out why PTRS_PER_PMD wasn't 2. There's clearly something I'm not understanding in here. :) Thanks! -Kees
On Mon, Feb 10, 2014 at 05:26:28PM +0000, Kees Cook wrote: > On Mon, Feb 10, 2014 at 2:41 AM, Catalin Marinas > <catalin.marinas@arm.com> wrote: > > On Mon, Feb 10, 2014 at 10:29:35AM +0000, Catalin Marinas wrote: > >> On Sun, Feb 09, 2014 at 10:18:26PM +0000, Kees Cook wrote: > >> > diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c > >> > index 1f7b1e13d945..ff1559f9200c 100644 > >> > --- a/arch/arm/mm/dump.c > >> > +++ b/arch/arm/mm/dump.c > >> > @@ -264,6 +264,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start) > >> > note_page(st, addr, 3, pmd_val(*pmd)); > >> > else > >> > walk_pte(st, pmd, addr); > >> > + > >> > + if (SECTION_SIZE < PMD_SIZE && pmd_sect(*pmd)) > >> > + note_page(st, addr + SECTION_SIZE, 3, pmd_val(pmd[1])); > >> > >> You can use pmd_large() here as well. > >> > >> But I think this function is broken (the "for" statement not shown > >> here). The pmd_t is 32-bit with classic MMU and it uses pmd++ while the > >> address grows by PMD_SIZE (two pmd_t entries). > > > > Actually it's ok since PTRS_PER_PMD is 1, so it only goes through this > > loop once. > > > > But in your patch shouldn't you check for pmd_large(*(pmd+1))? The first > > pmd is already caught by the 'if' statement. > > It wasn't clear to me what the logic should be here. If PTRS_PER_PMD > is 1, then why is there this second pmd after the first? Shouldn't > PTRS_PER_PMD be 2 if that's the case? The reason is that a hardware pte has only 256 entries (classic MMU), this is 1KB. We put two pte tables together and it gives us 2KB. The other 2KB in the page are used for Linux pte bits. Because we have two hw pte tables in a page, we need two corresponding pmd entries. A side effect is that even though we don't actually have a pmd (normally we should have included pgtable-nopmd.h), we still pretend we have one and __pmd_populate takes care of writing two consecutive entries. If we set PTRS_PER_PMD to 2, we should modify pte_alloc_one() to allocate a single hw pte (1KB + 1KB for software bits). I don't think this would be more efficient (there may have been other kernel restrictions in the past to require a full pte table page). > If that's not the case, then I figured the state of needing to report > the 2nd pmd depended on the type of the first one, so that's what I > wrote instead of trying to figure out why PTRS_PER_PMD wasn't 2. I don't remember whether we can have the first pmd being a table and the second one being a section. I don't think we restrict this but Russell should know more. > There's clearly something I'm not understanding in here. :) I happen to understand it from time to time but it doesn't last ;).
On Tue, Feb 11, 2014 at 3:17 AM, Catalin Marinas <catalin.marinas@arm.com> wrote: > On Mon, Feb 10, 2014 at 05:26:28PM +0000, Kees Cook wrote: >> On Mon, Feb 10, 2014 at 2:41 AM, Catalin Marinas >> <catalin.marinas@arm.com> wrote: >> > On Mon, Feb 10, 2014 at 10:29:35AM +0000, Catalin Marinas wrote: >> >> On Sun, Feb 09, 2014 at 10:18:26PM +0000, Kees Cook wrote: >> >> > diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c >> >> > index 1f7b1e13d945..ff1559f9200c 100644 >> >> > --- a/arch/arm/mm/dump.c >> >> > +++ b/arch/arm/mm/dump.c >> >> > @@ -264,6 +264,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start) >> >> > note_page(st, addr, 3, pmd_val(*pmd)); >> >> > else >> >> > walk_pte(st, pmd, addr); >> >> > + >> >> > + if (SECTION_SIZE < PMD_SIZE && pmd_sect(*pmd)) >> >> > + note_page(st, addr + SECTION_SIZE, 3, pmd_val(pmd[1])); >> >> >> >> You can use pmd_large() here as well. >> >> >> >> But I think this function is broken (the "for" statement not shown >> >> here). The pmd_t is 32-bit with classic MMU and it uses pmd++ while the >> >> address grows by PMD_SIZE (two pmd_t entries). >> > >> > Actually it's ok since PTRS_PER_PMD is 1, so it only goes through this >> > loop once. >> > >> > But in your patch shouldn't you check for pmd_large(*(pmd+1))? The first >> > pmd is already caught by the 'if' statement. >> >> It wasn't clear to me what the logic should be here. If PTRS_PER_PMD >> is 1, then why is there this second pmd after the first? Shouldn't >> PTRS_PER_PMD be 2 if that's the case? > > The reason is that a hardware pte has only 256 entries (classic MMU), > this is 1KB. We put two pte tables together and it gives us 2KB. The > other 2KB in the page are used for Linux pte bits. Because we have two > hw pte tables in a page, we need two corresponding pmd entries. > > A side effect is that even though we don't actually have a pmd (normally > we should have included pgtable-nopmd.h), we still pretend we have one > and __pmd_populate takes care of writing two consecutive entries. If we > set PTRS_PER_PMD to 2, we should modify pte_alloc_one() to allocate a > single hw pte (1KB + 1KB for software bits). I don't think this would be > more efficient (there may have been other kernel restrictions in the > past to require a full pte table page). > >> If that's not the case, then I figured the state of needing to report >> the 2nd pmd depended on the type of the first one, so that's what I >> wrote instead of trying to figure out why PTRS_PER_PMD wasn't 2. > > I don't remember whether we can have the first pmd being a table and the > second one being a section. I don't think we restrict this but Russell > should know more. It sounds like my logic is still okay, then? Perhaps move it into the first "if" for readability? if (pmd_none(*pmd) || pmd_large(*pmd) || !pmd_present(*pmd)) { note_page(st, addr, 3, pmd_val(*pmd)); if (SECTION_SIZE < PMD_SIZE && pmd_sect(*pmd)) note_page(st, addr + SECTION_SIZE, 3, pmd_val(pmd[1])); } else walk_pte(st, pmd, addr); Or should be be explicitly separated (to allow for the very unlikely future case of pmd_large != pmd_sect)? In the LPAE case, SECTION_SIZE == PMD_SIZE, so IIUC, we have to continue testing for that: if (pmd_sect(*pmd)) { note_page(st, addr, 3, pmd_val(*pmd)); if (SECTION_SIZE < PMD_SIZE) note_page(st, addr + SECTION_SIZE, 3, pmd_val(pmd[1])); } else if (pmd_none(*pmd) || pmd_large(*pmd) || !pmd_present(*pmd)) note_page(st, addr, 3, pmd_val(*pmd)); else walk_pte(st, pmd, addr); >> There's clearly something I'm not understanding in here. :) > > I happen to understand it from time to time but it doesn't last ;). Heh, understood. Thanks for looking at this. :) -Kees
diff --git a/arch/arm/include/asm/pgtable-3level.h b/arch/arm/include/asm/pgtable-3level.h index 03243f7eeddf..fb3de59ee811 100644 --- a/arch/arm/include/asm/pgtable-3level.h +++ b/arch/arm/include/asm/pgtable-3level.h @@ -138,10 +138,6 @@ #define pud_none(pud) (!pud_val(pud)) #define pud_bad(pud) (!(pud_val(pud) & 2)) #define pud_present(pud) (pud_val(pud)) -#define pmd_table(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \ - PMD_TYPE_TABLE) -#define pmd_sect(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \ - PMD_TYPE_SECT) #define pmd_large(pmd) pmd_sect(pmd) #define pud_clear(pudp) \ diff --git a/arch/arm/include/asm/pgtable.h b/arch/arm/include/asm/pgtable.h index 7d59b524f2af..934aa5b60c7c 100644 --- a/arch/arm/include/asm/pgtable.h +++ b/arch/arm/include/asm/pgtable.h @@ -183,6 +183,10 @@ extern pgd_t swapper_pg_dir[PTRS_PER_PGD]; #define pmd_none(pmd) (!pmd_val(pmd)) #define pmd_present(pmd) (pmd_val(pmd)) +#define pmd_table(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \ + PMD_TYPE_TABLE) +#define pmd_sect(pmd) ((pmd_val(pmd) & PMD_TYPE_MASK) == \ + PMD_TYPE_SECT) static inline pte_t *pmd_page_vaddr(pmd_t pmd) { diff --git a/arch/arm/mm/dump.c b/arch/arm/mm/dump.c index 1f7b1e13d945..ff1559f9200c 100644 --- a/arch/arm/mm/dump.c +++ b/arch/arm/mm/dump.c @@ -264,6 +264,9 @@ static void walk_pmd(struct pg_state *st, pud_t *pud, unsigned long start) note_page(st, addr, 3, pmd_val(*pmd)); else walk_pte(st, pmd, addr); + + if (SECTION_SIZE < PMD_SIZE && pmd_sect(*pmd)) + note_page(st, addr + SECTION_SIZE, 3, pmd_val(pmd[1])); } }
On 2-level page table systems, the PMD has 2 section entries. Report these, otherwise ARM_PTDUMP will miss reporting permission changes on odd section boundaries. Signed-off-by: Kees Cook <keescook@chromium.org> --- arch/arm/include/asm/pgtable-3level.h | 4 ---- arch/arm/include/asm/pgtable.h | 4 ++++ arch/arm/mm/dump.c | 3 +++ 3 files changed, 7 insertions(+), 4 deletions(-)