Message ID | 20210923172107.1117604-1-guoren@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [V2,1/2] riscv: Add RISC-V svpbmt extension | expand |
On Mon, Sep 27, 2021 at 1:09 PM Atish Patra <atishp@atishpatra.org> wrote: > > > > On Thu, Sep 23, 2021 at 10:22 AM <guoren@kernel.org> wrote: >> >> From: Guo Ren <guoren@linux.alibaba.com> >> >> This patch follows the standard pure RISC-V Svpbmt extension in >> privilege spec to solve the non-coherent SOC dma synchronization >> issues. >> >> Here is the svpbmt PTE format: >> | 63 | 62-61 | 60-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 >> N MT RSW D A G U X W R V >> ^ >> >> Of the Reserved bits [63:54] in a leaf PTE, the high bit is already >> allocated (as the N bit), so bits [62:61] are used as the MT (aka >> MemType) field. This field specifies one of three memory types that >> are close equivalents (or equivalent in effect) to the three main x86 >> and ARMv8 memory types - as shown in the following table. >> >> RISC-V >> Encoding & >> MemType RISC-V Description >> ---------- ------------------------------------------------ >> 00 - PMA Normal Cacheable, No change to implied PMA memory type >> 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory >> 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory >> 11 - Rsvd Reserved for future standard use >> >> The standard protection_map[] needn't be modified because the "PMA" >> type keeps the highest bits zero. And the whole modification is >> limited in the arch/riscv/* and using a global variable >> (__riscv_svpbmt) as _PAGE_DMA_MASK/IO/NC for pgprot_noncached >> (&writecombine) in pgtable.h. We also add _PAGE_CHG_MASK to filter >> PFN than before. >> > Resending it as it was not delivered to the mailing list because HTML formatting was selected by mistake. Sorry for the noise. Here is the original email. > @Palmer Dabbelt @Guo Ren > > Can we first decide what to do with D1's upstreaming plan ? I had a slide[1] to discuss that during RISC-V BoF. > But we ran out of time. Let's continue the discussion here. > > We all agree that Allwinner D1 has incompatible changes with privilege specification because it uses two reserved bits even after Svpbmt is merged. > Let's not argue on the reasoning behind this change. The silicon is already out and the specification just got frozen. > Unfortunately, we don't have a time stone to change the past ;). > > We need to decide whether we should support the upstream kernel for D1. Few things to consider. > – Can it be considered as an errata ? > – Does it set a bad precedent and open can of worms in future ? > – Can we just ignore D1 given the mass volume ? > > One solution I can think of is that we allow this as an exception to the patch acceptance policy. > We need to explicitly specify this board as an exception because the policy was not in place during the design phase of the hardware. > At least, it protects us from accepting the incompatible changes in the future. Any other ideas ? > > [1] https://linuxplumbersconf.org/event/11/contributions/1128/attachments/846/1757/RISC-V%20Bof.pdf > >> Enable it in devicetree - (Reuse "mmu-type" of cpu_section) >> - riscv,sv39,svpbmt >> - riscv,sv48,svpbmt >> >> Signed-off-by: Guo Ren <guoren@linux.alibaba.com> >> Co-developed-by: Liu Shaohua <liush@allwinnertech.com> >> Signed-off-by: Liu Shaohua <liush@allwinnertech.com> >> Co-developed-by: Wei Fu <wefu@redhat.com> >> Signed-off-by: Wei Fu <wefu@redhat.com> >> Cc: Palmer Dabbelt <palmerdabbelt@google.com> >> Cc: Christoph Hellwig <hch@lst.de> >> Cc: Anup Patel <anup.patel@wdc.com> >> Cc: Arnd Bergmann <arnd@arndb.de> >> Cc: Atish Patra <atish.patra@wdc.com> >> Cc: Drew Fustini <drew@beagleboard.org> >> Cc: Wei Fu <wefu@redhat.com> >> Cc: Wei Wu <lazyparser@gmail.com> >> Cc: Chen-Yu Tsai <wens@csie.org> >> Cc: Maxime Ripard <maxime@cerno.tech> >> Cc: Daniel Lustig <dlustig@nvidia.com> >> Cc: Greg Favor <gfavor@ventanamicro.com> >> Cc: Andrea Mondelli <andrea.mondelli@huawei.com> >> Cc: Jonathan Behrens <behrensj@mit.edu> >> Cc: Xinhaoqu (Freddie) <xinhaoqu@huawei.com> >> Cc: Bill Huffman <huffman@cadence.com> >> Cc: Nick Kossifidis <mick@ics.forth.gr> >> Cc: Allen Baum <allen.baum@esperantotech.com> >> Cc: Josh Scheid <jscheid@ventanamicro.com> >> Cc: Richard Trauben <rtrauben@gmail.com> >> >> --- >> >> Changes since V2: >> - Seperate DT modification into another patch >> - Move riscv_svpbmt() into riscv_fill_hwcap() >> - Fixup print_mmu() >> - Make __riscv_svpbmt updated only when all CPU nodes have "svpmbt" >> in the "mmu-type" DT property >> - Define _PAGE_MT_MASK as (_PAGE_MT_PMA | _PAGE_MT_NC | _PAGE_MT_IO) >> - Change u64 to unsigned long in _PAGE_MT_XXX >> - Change __riscv_svpbmt.mt[] to __riscv_svpbmt.mt_xxx >> - Optimize some misleading names >> --- >> arch/riscv/include/asm/fixmap.h | 2 +- >> arch/riscv/include/asm/pgtable-64.h | 8 ++++-- >> arch/riscv/include/asm/pgtable-bits.h | 41 +++++++++++++++++++++++++-- >> arch/riscv/include/asm/pgtable.h | 39 +++++++++++++++++++------ >> arch/riscv/kernel/cpu.c | 4 ++- >> arch/riscv/kernel/cpufeature.c | 24 ++++++++++++++++ >> arch/riscv/mm/init.c | 5 ++++ >> 7 files changed, 107 insertions(+), 16 deletions(-) >> >> diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h >> index 54cbf07fb4e9..5acd99d08e74 100644 >> --- a/arch/riscv/include/asm/fixmap.h >> +++ b/arch/riscv/include/asm/fixmap.h >> @@ -43,7 +43,7 @@ enum fixed_addresses { >> __end_of_fixed_addresses >> }; >> >> -#define FIXMAP_PAGE_IO PAGE_KERNEL >> +#define FIXMAP_PAGE_IO PAGE_IOREMAP >> >> #define __early_set_fixmap __set_fixmap >> >> diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h >> index 228261aa9628..0b53ea67e91a 100644 >> --- a/arch/riscv/include/asm/pgtable-64.h >> +++ b/arch/riscv/include/asm/pgtable-64.h >> @@ -61,12 +61,14 @@ static inline void pud_clear(pud_t *pudp) >> >> static inline pmd_t *pud_pgtable(pud_t pud) >> { >> - return (pmd_t *)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT); >> + return (pmd_t *)pfn_to_virt((pud_val(pud) & _PAGE_CHG_MASK) >> + >> _PAGE_PFN_SHIFT); >> } >> >> static inline struct page *pud_page(pud_t pud) >> { >> - return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT); >> + return pfn_to_page((pud_val(pud) & _PAGE_CHG_MASK) >> + >> _PAGE_PFN_SHIFT); >> } >> >> static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot) >> @@ -76,7 +78,7 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot) >> >> static inline unsigned long _pmd_pfn(pmd_t pmd) >> { >> - return pmd_val(pmd) >> _PAGE_PFN_SHIFT; >> + return (pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT; >> } >> >> #define mk_pmd(page, prot) pfn_pmd(page_to_pfn(page), prot) >> diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h >> index 2ee413912926..3b38fe14f169 100644 >> --- a/arch/riscv/include/asm/pgtable-bits.h >> +++ b/arch/riscv/include/asm/pgtable-bits.h >> @@ -7,7 +7,7 @@ >> #define _ASM_RISCV_PGTABLE_BITS_H >> >> /* >> - * PTE format: >> + * rv32 PTE format: >> * | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 >> * PFN reserved for SW D A G U X W R V >> */ >> @@ -24,6 +24,42 @@ >> #define _PAGE_DIRTY (1 << 7) /* Set by hardware on any write */ >> #define _PAGE_SOFT (1 << 8) /* Reserved for software */ >> >> +#ifndef __ASSEMBLY__ >> +#ifdef CONFIG_64BIT >> +/* >> + * rv64 PTE format: >> + * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 >> + * N MT RSV PFN reserved for SW D A G U X W R V >> + * [62:61] Memory Type definitions: >> + * 00 - PMA Normal Cacheable, No change to implied PMA memory type >> + * 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory >> + * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory >> + * 11 - Rsvd Reserved for future standard use >> + */ >> +#define _SVPBMT_PMA ((unsigned long)0x0 << 61) >> +#define _SVPBMT_NC ((unsigned long)0x1 << 61) >> +#define _SVPBMT_IO ((unsigned long)0x2 << 61) >> +#define _SVPBMT_MASK (_SVPBMT_PMA | _SVPBMT_NC | _SVPBMT_IO) >> + >> +extern struct __riscv_svpbmt_struct { >> + unsigned long mask; >> + unsigned long mt_pma; >> + unsigned long mt_nc; >> + unsigned long mt_io; >> +} __riscv_svpbmt; >> + >> +#define _PAGE_MT_MASK __riscv_svpbmt.mask >> +#define _PAGE_MT_PMA __riscv_svpbmt.mt_pma >> +#define _PAGE_MT_NC __riscv_svpbmt.mt_nc >> +#define _PAGE_MT_IO __riscv_svpbmt.mt_io >> +#else >> +#define _PAGE_MT_MASK 0 >> +#define _PAGE_MT_PMA 0 >> +#define _PAGE_MT_NC 0 >> +#define _PAGE_MT_IO 0 >> +#endif /* CONFIG_64BIT */ >> +#endif /* __ASSEMBLY__ */ >> + >> #define _PAGE_SPECIAL _PAGE_SOFT >> #define _PAGE_TABLE _PAGE_PRESENT >> >> @@ -38,7 +74,8 @@ >> /* Set of bits to preserve across pte_modify() */ >> #define _PAGE_CHG_MASK (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \ >> _PAGE_WRITE | _PAGE_EXEC | \ >> - _PAGE_USER | _PAGE_GLOBAL)) >> + _PAGE_USER | _PAGE_GLOBAL | \ >> + _PAGE_MT_MASK)) >> /* >> * when all of R/W/X are zero, the PTE is a pointer to the next level >> * of the page table; otherwise, it is a leaf PTE. >> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h >> index 39b550310ec6..3fc70a63e395 100644 >> --- a/arch/riscv/include/asm/pgtable.h >> +++ b/arch/riscv/include/asm/pgtable.h >> @@ -136,7 +136,8 @@ >> | _PAGE_PRESENT \ >> | _PAGE_ACCESSED \ >> | _PAGE_DIRTY \ >> - | _PAGE_GLOBAL) >> + | _PAGE_GLOBAL \ >> + | _PAGE_MT_PMA) >> >> #define PAGE_KERNEL __pgprot(_PAGE_KERNEL) >> #define PAGE_KERNEL_READ __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE) >> @@ -146,11 +147,9 @@ >> >> #define PAGE_TABLE __pgprot(_PAGE_TABLE) >> >> -/* >> - * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't >> - * change the properties of memory regions. >> - */ >> -#define _PAGE_IOREMAP _PAGE_KERNEL >> +#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_MT_MASK) | _PAGE_MT_IO) >> + >> +#define PAGE_IOREMAP __pgprot(_PAGE_IOREMAP) >> >> extern pgd_t swapper_pg_dir[]; >> >> @@ -230,12 +229,12 @@ static inline unsigned long _pgd_pfn(pgd_t pgd) >> >> static inline struct page *pmd_page(pmd_t pmd) >> { >> - return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT); >> + return pfn_to_page((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT); >> } >> >> static inline unsigned long pmd_page_vaddr(pmd_t pmd) >> { >> - return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT); >> + return (unsigned long)pfn_to_virt((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT); >> } >> >> static inline pte_t pmd_pte(pmd_t pmd) >> @@ -251,7 +250,7 @@ static inline pte_t pud_pte(pud_t pud) >> /* Yields the page frame number (PFN) of a page table entry */ >> static inline unsigned long pte_pfn(pte_t pte) >> { >> - return (pte_val(pte) >> _PAGE_PFN_SHIFT); >> + return ((pte_val(pte) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT); >> } >> >> #define pte_page(x) pfn_to_page(pte_pfn(x)) >> @@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, >> return ptep_test_and_clear_young(vma, address, ptep); >> } >> >> +#define pgprot_noncached pgprot_noncached >> +static inline pgprot_t pgprot_noncached(pgprot_t _prot) >> +{ >> + unsigned long prot = pgprot_val(_prot); >> + >> + prot &= ~_PAGE_MT_MASK; >> + prot |= _PAGE_MT_IO; >> + >> + return __pgprot(prot); >> +} >> + >> +#define pgprot_writecombine pgprot_writecombine >> +static inline pgprot_t pgprot_writecombine(pgprot_t _prot) >> +{ >> + unsigned long prot = pgprot_val(_prot); >> + >> + prot &= ~_PAGE_MT_MASK; >> + prot |= _PAGE_MT_NC; >> + >> + return __pgprot(prot); >> +} >> + >> /* >> * THP functions >> */ >> diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c >> index 6d59e6906fdd..fbce525961c0 100644 >> --- a/arch/riscv/kernel/cpu.c >> +++ b/arch/riscv/kernel/cpu.c >> @@ -77,7 +77,9 @@ static void print_mmu(struct seq_file *f, const char *mmu_type) >> return; >> #elif defined(CONFIG_64BIT) >> if (strcmp(mmu_type, "riscv,sv39") != 0 && >> - strcmp(mmu_type, "riscv,sv48") != 0) >> + strcmp(mmu_type, "riscv,sv48") != 0 && >> + strcmp(mmu_type, "riscv,sv39,svpbmt") != 0 && >> + strcmp(mmu_type, "riscv,sv48,svpbmt") != 0) >> return; >> #endif >> >> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c >> index d959d207a40d..d1b046a8254b 100644 >> --- a/arch/riscv/kernel/cpufeature.c >> +++ b/arch/riscv/kernel/cpufeature.c >> @@ -8,6 +8,7 @@ >> >> #include <linux/bitmap.h> >> #include <linux/of.h> >> +#include <linux/pgtable.h> >> #include <asm/processor.h> >> #include <asm/hwcap.h> >> #include <asm/smp.h> >> @@ -59,6 +60,27 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit) >> } >> EXPORT_SYMBOL_GPL(__riscv_isa_extension_available); >> >> +static void __init riscv_svpbmt(void) >> +{ >> +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT) >> + struct device_node *node; >> + const char *str; >> + >> + for_each_of_cpu_node(node) { >> + if (of_property_read_string(node, "mmu-type", &str)) >> + continue; >> + >> + if (strncmp(str + 11, "svpbmt", 6)) >> + return; >> + } >> + >> + __riscv_svpbmt.mask = _SVPBMT_MASK; >> + __riscv_svpbmt.mt_pma = _SVPBMT_PMA; >> + __riscv_svpbmt.mt_nc = _SVPBMT_NC; >> + __riscv_svpbmt.mt_io = _SVPBMT_IO; >> +#endif >> +} >> + >> void __init riscv_fill_hwcap(void) >> { >> struct device_node *node; >> @@ -67,6 +89,8 @@ void __init riscv_fill_hwcap(void) >> size_t i, j, isa_len; >> static unsigned long isa2hwcap[256] = {0}; >> >> + riscv_svpbmt(); >> + >> isa2hwcap['i'] = isa2hwcap['I'] = COMPAT_HWCAP_ISA_I; >> isa2hwcap['m'] = isa2hwcap['M'] = COMPAT_HWCAP_ISA_M; >> isa2hwcap['a'] = isa2hwcap['A'] = COMPAT_HWCAP_ISA_A; >> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c >> index 7cb4f391d106..43b2e11fd3e0 100644 >> --- a/arch/riscv/mm/init.c >> +++ b/arch/riscv/mm/init.c >> @@ -905,3 +905,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, >> return vmemmap_populate_basepages(start, end, node, NULL); >> } >> #endif >> + >> +#ifdef CONFIG_64BIT >> +struct __riscv_svpbmt_struct __riscv_svpbmt __ro_after_init; >> +EXPORT_SYMBOL(__riscv_svpbmt); >> +#endif >> -- >> 2.25.1 >> >> >> _______________________________________________ >> linux-riscv mailing list >> linux-riscv@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-riscv > > > > -- > Regards, > Atish
On Mon, Sep 27, 2021 at 1:53 PM Greg Favor <gfavor@ventanamicro.com> wrote: > > With the big caveat that I haven't been in the middle of this discussion, it seems like Allwinner D1's changes represent a custom (and nonconforming) extension. As per the v1.12 privilege specification, bit 63 is reserved for Svnapot extension while bit 60–54 are reserved for future standard use. D1's implementation uses both 60 and 63 bit for their custom "PBMT" extension in addition to bit 61 & 62 [1]. > Isn't this just a matter of the patch needing to be treated as for a RISC-V custom extension per the recently clarified policy for handling upstreaming/etc. of custom extensions? (Philipp can > speak to this clarified policy.) Or what am I missing? Linux kernel upstream policy is yet to adopt that clarification as it was recently discussed at RVI meetings. Is there a written definition of non-conforming/custom/incompatible ? Moreover, as per the platform specification[2], A non-conforming extension that conflicts with a supported standard extensions must satisfy at least one of the following: --- It must be disabled by default. --- The supported standard extension must be declared as unsupported in all feature discovery structures used by software. This option is allowed only if the standard extension is not required. In this case, the custom non-conforming implementation can not be disabled or marked unsupported as it is critical for all the necessary I/O devices (usb, mmc, ethernet). Without this custom implementation support in upstream, we can not really use the mainline kernel for Allwinner D1. [1] https://linuxplumbersconf.org/event/11/contributions/1100/attachments/841/1607/What%E2%80%99s%20the%20problem%20with%20D1%20Linux%20upstream-.pdf [2] https://github.com/riscv/riscv-platform-specs/blob/main/riscv-platform-spec.adoc#2112-general > > Greg > > On Mon, Sep 27, 2021 at 1:14 PM Atish Patra <atishp@atishpatra.org> wrote: >> >> > @Palmer Dabbelt @Guo Ren >> > >> > Can we first decide what to do with D1's upstreaming plan ? I had a slide[1] to discuss that during RISC-V BoF. >> > But we ran out of time. Let's continue the discussion here. >> > >> > We all agree that Allwinner D1 has incompatible changes with privilege specification because it uses two reserved bits even after Svpbmt is merged. >> > Let's not argue on the reasoning behind this change. The silicon is already out and the specification just got frozen. >> > Unfortunately, we don't have a time stone to change the past ;). >> > >> > We need to decide whether we should support the upstream kernel for D1. Few things to consider. >> > – Can it be considered as an errata ? >> > – Does it set a bad precedent and open can of worms in future ? >> > – Can we just ignore D1 given the mass volume ? >> > >> > One solution I can think of is that we allow this as an exception to the patch acceptance policy. >> > We need to explicitly specify this board as an exception because the policy was not in place during the design phase of the hardware. >> > At least, it protects us from accepting the incompatible changes in the future. Any other ideas ? >> > >> > [1] https://linuxplumbersconf.org/event/11/contributions/1128/attachments/846/1757/RISC-V%20Bof.pdf
Hello Greg, Στις 2021-09-27 23:53, Greg Favor έγραψε: > With the big caveat that I haven't been in the middle of this > discussion, it seems like Allwinner D1's changes represent a custom > (and nonconforming) extension. Isn't this just a matter of the patch > needing to be treated as for a RISC-V custom extension per the recently > clarified policy for handling upstreaming/etc. of custom extensions? > (Philipp can speak to this clarified policy.) Or what am I missing? > The Priv. Spec. defines sv39/48 without allowing custom use of reserved pte bits by implementations, Allwinner D1 claims to be sv39 but it does use reserved PTE bits. When vendors want to do custom stuff on PTEs, the standard says they may use values 14-15 on satp.mode for that and define their own MMU basically. Messing up with the implementation of sv39 is not an extension, is violation of sv39. Even worse this implementation can't work if we ignore the customization since without setting the required bits on PTEs dma (and I believe SMP as well) doesn't work. Regards, Nick
Στις 2021-09-27 23:13, Atish Patra έγραψε: >> We need to decide whether we should support the upstream kernel for >> D1. Few things to consider. >> – Can it be considered as an errata ? It's one thing to follow the spec and have an error in the implementation, and another to not follow the spec. >> – Does it set a bad precedent and open can of worms in future ? IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and asking for MMU support, they 've also shipped many chips already. I can also imagine other vendors in the future coming up with implementations that violate the spec in which case handling the standard stuff will become messy and complex, and hurt performance/security. We'll end up filling the code with exceptions and tweaks all over the place. We need to be strict about what is "riscv" and what's "draft riscv" or "riscv inspired", and what we are willing to support upstream. I can understand supporting vendor extensions upstream but they need to fit within the standard spec, we can't have for example extensions that use encoding space/csrs/fields etc reserved for standard use, they may only use what's reserved for custom/vendor use. At least let's agree on that. >> – Can we just ignore D1 given the mass volume ? >> IMHO no, we need to find a way to support it upstream but I believe there is another question to answer: Do we also guarantee "one image to rule them all" approach, required by binary distros, for implementations that violate the spec ? Are we ok for example to support Allwinner D1 upstream but require a custom configuration/build instead of supporting it with the "generic" image ? In one case we need to handle the violation at runtime and introduce overhead for everyone (like looking up __riscv_svpbmt every time we set a PTE in this case), in the other it's an #ifdef. Regards, Nick
On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote: > > Στις 2021-09-27 23:13, Atish Patra έγραψε: > >> We need to decide whether we should support the upstream kernel for > >> D1. Few things to consider. > >> – Can it be considered as an errata ? > > It's one thing to follow the spec and have an error in the > implementation, and another to not follow the spec. > > >> – Does it set a bad precedent and open can of worms in future ? > > IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and > asking for MMU support, they 've also shipped many chips already. I can > also imagine other vendors in the future coming up with implementations > that violate the spec in which case handling the standard stuff will > become messy and complex, and hurt performance/security. We'll end up > filling the code with exceptions and tweaks all over the place. We need > to be strict about what is "riscv" and what's "draft riscv" or "riscv > inspired", and what we are willing to support upstream. I can understand > supporting vendor extensions upstream but they need to fit within the > standard spec, we can't have for example extensions that use encoding > space/csrs/fields etc reserved for standard use, they may only use > what's reserved for custom/vendor use. At least let's agree on that. Totally agree with Nick here. It's a slippery slope. Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V means future hardware which intentionally violates specs will also have to be merged and the RISC-V patch acceptance policy will have no significance. > > >> – Can we just ignore D1 given the mass volume ? > >> > > IMHO no, we need to find a way to support it upstream but I believe > there is another question to answer: > > Do we also guarantee "one image to rule them all" approach, required by > binary distros, for implementations that violate the spec ? Are we ok > for example to support Allwinner D1 upstream but require a custom > configuration/build instead of supporting it with the "generic" image ? > In one case we need to handle the violation at runtime and introduce > overhead for everyone (like looking up __riscv_svpbmt every time we set > a PTE in this case), in the other it's an #ifdef. At least, we should not have hardware violating specs as part of the unified kernel image instead have these intentional deviations/violations under separate kconfig which will not be enabled by default. This means vendors (of such hardware) and distros will have to explicitly enable support for such violations/deviations. Regards, Anup
On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote: > > On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote: > > > > Στις 2021-09-27 23:13, Atish Patra έγραψε: > > >> We need to decide whether we should support the upstream kernel for > > >> D1. Few things to consider. > > >> – Can it be considered as an errata ? > > > > It's one thing to follow the spec and have an error in the > > implementation, and another to not follow the spec. > > > > >> – Does it set a bad precedent and open can of worms in future ? > > > > IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and > > asking for MMU support, they 've also shipped many chips already. I can > > also imagine other vendors in the future coming up with implementations > > that violate the spec in which case handling the standard stuff will > > become messy and complex, and hurt performance/security. We'll end up > > filling the code with exceptions and tweaks all over the place. We need > > to be strict about what is "riscv" and what's "draft riscv" or "riscv > > inspired", and what we are willing to support upstream. I can understand > > supporting vendor extensions upstream but they need to fit within the > > standard spec, we can't have for example extensions that use encoding > > space/csrs/fields etc reserved for standard use, they may only use > > what's reserved for custom/vendor use. At least let's agree on that. > > Totally agree with Nick here. It's a slippery slope. > > Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V > means future hardware which intentionally violates specs will also have to > be merged and the RISC-V patch acceptance policy will have no significance. > > > > > >> – Can we just ignore D1 given the mass volume ? > > >> > > > > IMHO no, we need to find a way to support it upstream but I believe > > there is another question to answer: > > > > Do we also guarantee "one image to rule them all" approach, required by > > binary distros, for implementations that violate the spec ? Are we ok > > for example to support Allwinner D1 upstream but require a custom > > configuration/build instead of supporting it with the "generic" image ? > > In one case we need to handle the violation at runtime and introduce > > overhead for everyone (like looking up __riscv_svpbmt every time we set > > a PTE in this case), in the other it's an #ifdef. > > At least, we should not have hardware violating specs as part of the > unified kernel image instead have these intentional deviations/violations > under separate kconfig which will not be enabled by default. This means > vendors (of such hardware) and distros will have to explicitly enable > support for such violations/deviations. > If we merge the code and are not enabled by default, it would be a maintenance nightmare in future. These part of the kernel will not be regularly tested but we have to carry the changes for a long time. Similar changes will only grow over time causing a lot of custom configs that are not enabled by default. IMHO, if we want to support this board in upstream, we should just clearly state that it is one time special exception for this board only because of the following reasons 1. The board design predates the patch acceptance policy. 2. We don't have enough affordable Linux compatible platforms today. 3. Allowing running an upstream kernel on D1 helps the RISC-V software ecosystem to grow. No more exceptions will be allowed in future for such hardware that violates the spec. Period. > Regards, > Anup
On Tue, Sep 28, 2021 at 12:26 PM Atish Patra <atishp@atishpatra.org> wrote: > > On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote: > > > > On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote: > > > > > > Στις 2021-09-27 23:13, Atish Patra έγραψε: > > > >> We need to decide whether we should support the upstream kernel for > > > >> D1. Few things to consider. > > > >> – Can it be considered as an errata ? > > > > > > It's one thing to follow the spec and have an error in the > > > implementation, and another to not follow the spec. > > > > > > >> – Does it set a bad precedent and open can of worms in future ? > > > > > > IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and > > > asking for MMU support, they 've also shipped many chips already. I can > > > also imagine other vendors in the future coming up with implementations > > > that violate the spec in which case handling the standard stuff will > > > become messy and complex, and hurt performance/security. We'll end up > > > filling the code with exceptions and tweaks all over the place. We need > > > to be strict about what is "riscv" and what's "draft riscv" or "riscv > > > inspired", and what we are willing to support upstream. I can understand > > > supporting vendor extensions upstream but they need to fit within the > > > standard spec, we can't have for example extensions that use encoding > > > space/csrs/fields etc reserved for standard use, they may only use > > > what's reserved for custom/vendor use. At least let's agree on that. > > > > Totally agree with Nick here. It's a slippery slope. > > > > Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V > > means future hardware which intentionally violates specs will also have to > > be merged and the RISC-V patch acceptance policy will have no significance. > > > > > > > > >> – Can we just ignore D1 given the mass volume ? > > > >> > > > > > > IMHO no, we need to find a way to support it upstream but I believe > > > there is another question to answer: > > > > > > Do we also guarantee "one image to rule them all" approach, required by > > > binary distros, for implementations that violate the spec ? Are we ok > > > for example to support Allwinner D1 upstream but require a custom > > > configuration/build instead of supporting it with the "generic" image ? > > > In one case we need to handle the violation at runtime and introduce > > > overhead for everyone (like looking up __riscv_svpbmt every time we set > > > a PTE in this case), in the other it's an #ifdef. > > > > At least, we should not have hardware violating specs as part of the > > unified kernel image instead have these intentional deviations/violations > > under separate kconfig which will not be enabled by default. This means > > vendors (of such hardware) and distros will have to explicitly enable > > support for such violations/deviations. > > > > If we merge the code and are not enabled by default, it would be a > maintenance nightmare in future. > These part of the kernel will not be regularly tested but we have to > carry the changes for a long time. > Similar changes will only grow over time causing a lot of custom > configs that are not enabled by default. D1 could still use generic Image. The reason why I send the standard implementation of svpbmt is that when we introduce svpbmt, we actually introduce the page attribute frameworks for different platforms(svpbmt & non-svpbmt). Then, "custom svpbmt" can also modify "protect_mapp []" and svpbmt [] "in errata by limited codes from vendor. If we support standard svpbmt first, then let "generic Image" support D1 would be very little modification and all could be kept in errata. Another patch [1] cleans up the wrong usage of "protect_map []" so that the entire Linux user state page attributes come from it. The design principle of Linux is to allow the platform to init "protect_map []" flexibly. [1]: https://lore.kernel.org/all/20210927064340.2411397-1-guoren@kernel.org/ > > IMHO, if we want to support this board in upstream, we should just > clearly state that it is one time special exception > for this board only because of the following reasons > > 1. The board design predates the patch acceptance policy. D1 is designed at 2019. > 2. We don't have enough affordable Linux compatible platforms today. D1 only $65. > 3. Allowing running an upstream kernel on D1 helps the RISC-V software > ecosystem to grow. Yes > > No more exceptions will be allowed in future for such hardware that > violates the spec. Period. > > > Regards, > > Anup > > > > -- > Regards, > Atish
On Mon, Sep 27, 2021 at 9:49 PM Greg Favor <gfavor@ventanamicro.com> wrote: > > On Mon, Sep 27, 2021 at 9:26 PM Atish Patra <atishp@atishpatra.org> wrote: >> >> IMHO, if we want to support this board in upstream, we should just >> clearly state that it is one time special exception >> for this board only because of the following reasons > > > I'm not quite following what the exception is? If RVI's policy is followed for how software support for custom extensions (which D1 falls under) should be handled (as part of allowing such software to be upstreamed), then that doesn't burden standard distros nor the community to maintain ongoing support. I am a bit confused. As per our understanding, D1 doesn't fall under custom extensions because they have non-standard conflicting implementations of the PTE bits that violate the privilege specification (v1.12 with svpbmt & svnapot extension merged). Custom extensions can only be defined via satp mode (14-15). Am I missing something? > And this doesn't stop a custom Linux build from being provided to users of the D1 board (like what any other vendor would need to do to support the custom extensions on their hardware). A custom Linux build is already floating around for the users for the D1 board. Whether the upstream mainline kernel supports this board is the key question here. > > Or are you proposing that standard binary distros would have to support D1's custom extension? (And how would that even work if and when Svpbmt becomes required in some rev of the OS-A platform spec - at which point the D1 boards have a nonconforming extension that can't be disabled and thus conflicts with required (Svpbmt) functionality?) I was suggesting to support D1 in the unified kernel image built from defconfig only if we decide to support it. standard binary distros(such as Fedora, RHEL, Suse, Ubuntu) anyways use custom config. They can always disable support for D1 if they want it. > > Greg > >> >> >> 1. The board design predates the patch acceptance policy. >> 2. We don't have enough affordable Linux compatible platforms today. >> 3. Allowing running an upstream kernel on D1 helps the RISC-V software >> ecosystem to grow. >> >> No more exceptions will be allowed in future for such hardware that >> violates the spec. Period. >> >> > Regards, >> > Anup >> >> >> >> -- >> Regards, >> Atish -- Regards, Atish
On Tue, 28 Sept 2021 at 01:05, Atish Patra <atishp@atishpatra.org> wrote: > > On Mon, Sep 27, 2021 at 1:53 PM Greg Favor <gfavor@ventanamicro.com> wrote: > > > > With the big caveat that I haven't been in the middle of this discussion, it seems like Allwinner D1's changes represent a custom (and nonconforming) extension. > > As per the v1.12 privilege specification, bit 63 is reserved for > Svnapot extension while bit 60–54 are reserved for future standard > use. > D1's implementation uses both 60 and 63 bit for their custom "PBMT" > extension in addition to bit 61 & 62 [1]. > > > Isn't this just a matter of the patch needing to be treated as for a RISC-V custom extension per the recently clarified policy for handling upstreaming/etc. of custom extensions? (Philipp can > speak to this clarified policy.) Or what am I missing? > > Linux kernel upstream policy is yet to adopt that clarification as it > was recently discussed at RVI meetings. Is there a written definition > of non-conforming/custom/incompatible ? > Moreover, as per the platform specification[2], > > A non-conforming extension that conflicts with a supported standard > extensions must satisfy at least one of the following: > --- It must be disabled by default. > --- The supported standard extension must be declared as unsupported > in all feature discovery structures used by software. > This option is allowed only if the standard extension is not required. The terminology for those cases (and x-thead-v is the mental test-case) would be "non-compliant and non-conflicting". The "non-compliant" part comes from the fact that the specification has been violated: in the case of x-thead-v by using the reserved opcode space, in the case of the PTEs by using reserved bits. The "non-conflicting" is based on a system still being able to operate according to the specification, even if the non-compliant extension is simply ignored (e.g. RVV is not mandatory and the opcodes for RVV are not required to raise illegal insn exceptions, so these systems would simply appear as not having RVV)… Now, with their "custom" PBMT, we are pushing the boundaries of the 'non-conflicting' definition but are still within the same reasoning: if we only signal sv39 and no PBMT-support, then the abuse of the PTE bits does not conflict. In the end, the 'non-conflicting' status will hinge on whether we make svpbmt mandatory in the Platform (or the referenced Profile). In this particular case — given the importance of the D1 boards for bootstrapping the software ecosystem — I would make the case that we need to provide a provision in the Platforms (i.e. OS-A base) to retain the 'non-conflicting' status (e.g. by mandating "some pbmt" — with an application note stating that this will be restricted to svpbmt in the next major revision of the Platforms ... and already restricting it to svpbmt for the OS-A server extension). > In this case, the custom non-conforming implementation can not be > disabled or marked unsupported as it is critical for all the necessary > I/O devices (usb, mmc, ethernet). > Without this custom implementation support in upstream, we can not > really use the mainline kernel for Allwinner D1. I don't agree from a specification standpoint and from the 'non-conflicting' standpoint: whether or not device drivers can be used, does not change the 'non-conflicting' status. This is similar to the 'non-conflicting' status with x-thead-v: vector instructions (as in "the standard RVV instructions") also can't be used with a toolchain/libraries/OS that only implements RVV ... but nonetheless, vendor-specific vector instructions are available. I would argue the same for Alibaba's PBMT: I/O devices will not work, unless a vendor-specific non-conflicting PBMT is used. The brings us back to the requirements that I defined multiple times for this: the implementation needs to be properly modularized and quarantined as to not adversely impact compliant implementations. If this can be assured, I will always argue for inclusion based on the benefit to the ecosystem and reconciling the imminent fragmentation. Or in other words: In what world does it make sense to encourage a vendor of a board with significant market share to create a vendor-fork of the kernel, if that vendor is willing to work with the upstream? Note that I normally am the first to argue on principle, but have come to the conclusion that we are really between a rock and a hard place on this one — and my priority is to keep the ecosystem from fragmenting. >> >> > We need to decide whether we should support the upstream kernel for D1. Few things to consider. > >> > – Can it be considered as an errata ? > >> > – Does it set a bad precedent and open can of worms in future ? > >> > – Can we just ignore D1 given the mass volume ? > >> > > >> > One solution I can think of is that we allow this as an exception to the patch acceptance policy. > >> > We need to explicitly specify this board as an exception because the policy was not in place during the design phase of the hardware. > >> > At least, it protects us from accepting the incompatible changes in the future. Any other ideas ? Anything we do needs to consider future impact and precedent. But that should not preclude us from making an exception, when it benefits the ecosystem by preventing further fragmentation. I consider the stated requirements of proper modularization (quarantining the impact within the code base) and a precondition on the vendor maintaining the changes, as a strong-enough discouragement for any parties that might also consider to apply gross negligence to the meaning of 'reserved'. Philipp.
On 9/28/21 7:26 AM, Atish Patra wrote: > On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote: >> >> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote: >>> >>> Στις 2021-09-27 23:13, Atish Patra έγραψε: >>>>> We need to decide whether we should support the upstream kernel for >>>>> D1. Few things to consider. >>>>> – Can it be considered as an errata ? >>> >>> It's one thing to follow the spec and have an error in the >>> implementation, and another to not follow the spec. >>> >>>>> – Does it set a bad precedent and open can of worms in future ? >>> >>> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and >>> asking for MMU support, they 've also shipped many chips already. I can >>> also imagine other vendors in the future coming up with implementations >>> that violate the spec in which case handling the standard stuff will >>> become messy and complex, and hurt performance/security. We'll end up >>> filling the code with exceptions and tweaks all over the place. We need >>> to be strict about what is "riscv" and what's "draft riscv" or "riscv >>> inspired", and what we are willing to support upstream. I can understand >>> supporting vendor extensions upstream but they need to fit within the >>> standard spec, we can't have for example extensions that use encoding >>> space/csrs/fields etc reserved for standard use, they may only use >>> what's reserved for custom/vendor use. At least let's agree on that. >> >> Totally agree with Nick here. It's a slippery slope. >> >> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V >> means future hardware which intentionally violates specs will also have to >> be merged and the RISC-V patch acceptance policy will have no significance. >> >>> >>>>> – Can we just ignore D1 given the mass volume ? >>>>> >>> >>> IMHO no, we need to find a way to support it upstream but I believe >>> there is another question to answer: >>> >>> Do we also guarantee "one image to rule them all" approach, required by >>> binary distros, for implementations that violate the spec ? Are we ok >>> for example to support Allwinner D1 upstream but require a custom >>> configuration/build instead of supporting it with the "generic" image ? >>> In one case we need to handle the violation at runtime and introduce >>> overhead for everyone (like looking up __riscv_svpbmt every time we set >>> a PTE in this case), in the other it's an #ifdef. >> >> At least, we should not have hardware violating specs as part of the >> unified kernel image instead have these intentional deviations/violations >> under separate kconfig which will not be enabled by default. This means >> vendors (of such hardware) and distros will have to explicitly enable >> support for such violations/deviations. >> > > If we merge the code and are not enabled by default, it would be a > maintenance nightmare in future. > These part of the kernel will not be regularly tested but we have to > carry the changes for a long time. I don't see a difference between having these features as part of the generic image vs having them as custom configs/builds. The code will get executed only on boards that support the custom/non-compliant implementation anyway. To the contrary we'll have more code to test if we are doing things at runtime vs at compile time. > Similar changes will only grow over time causing a lot of custom > configs that are not enabled by default. > We'll have a lot of custom configs that will only get used on boards that use them, vs runtime code that will run for no reason on every board and choose the default/standard-compliant implementation most of the time. In the end the code will only get tested on specific hardware anyway. > IMHO, if we want to support this board in upstream, we should just > clearly state that it is one time special exception > for this board only because of the following reasons > > 1. The board design predates the patch acceptance policy. > 2. We don't have enough affordable Linux compatible platforms today. > 3. Allowing running an upstream kernel on D1 helps the RISC-V software > ecosystem to grow. > The same can be said for Kendryte as well, are we willing to also support their MMU implementation on the generic image if a patch comes in ? To be clear I'm not saying we shouldn't support D1 or Kendryte upstream, I'm just saying that we shouldn't sacrifice the compleity and performance of the code path for standard-compliant implementations, to support non-compliant implementations, and instead support non-compliant implementations with custom kernel builds using compile time options. It still counts as upstream support, they won't have to maintain their own forks. It'll also allow custom implementations to have more flexibility on what they can do since they will be able to use completely different/custom code paths, instead of trying to fit in the standard code path (which will become a mess over time). I think this approach is much more flexible and will allow more customizations to be supported upstream in the future.
Nick, On Tue, 28 Sept 2021 at 15:19, Nick Kossifidis <mick@ics.forth.gr> wrote: > > On 9/28/21 7:26 AM, Atish Patra wrote: > > On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote: > >> > >> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote: > >>> > >>> Στις 2021-09-27 23:13, Atish Patra έγραψε: > >>>>> We need to decide whether we should support the upstream kernel for > >>>>> D1. Few things to consider. > >>>>> – Can it be considered as an errata ? > >>> > >>> It's one thing to follow the spec and have an error in the > >>> implementation, and another to not follow the spec. > >>> > >>>>> – Does it set a bad precedent and open can of worms in future ? > >>> > >>> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and > >>> asking for MMU support, they 've also shipped many chips already. I can > >>> also imagine other vendors in the future coming up with implementations > >>> that violate the spec in which case handling the standard stuff will > >>> become messy and complex, and hurt performance/security. We'll end up > >>> filling the code with exceptions and tweaks all over the place. We need > >>> to be strict about what is "riscv" and what's "draft riscv" or "riscv > >>> inspired", and what we are willing to support upstream. I can understand > >>> supporting vendor extensions upstream but they need to fit within the > >>> standard spec, we can't have for example extensions that use encoding > >>> space/csrs/fields etc reserved for standard use, they may only use > >>> what's reserved for custom/vendor use. At least let's agree on that. > >> > >> Totally agree with Nick here. It's a slippery slope. > >> > >> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V > >> means future hardware which intentionally violates specs will also have to > >> be merged and the RISC-V patch acceptance policy will have no significance. > >> > >>> > >>>>> – Can we just ignore D1 given the mass volume ? > >>>>> > >>> > >>> IMHO no, we need to find a way to support it upstream but I believe > >>> there is another question to answer: > >>> > >>> Do we also guarantee "one image to rule them all" approach, required by > >>> binary distros, for implementations that violate the spec ? Are we ok > >>> for example to support Allwinner D1 upstream but require a custom > >>> configuration/build instead of supporting it with the "generic" image ? > >>> In one case we need to handle the violation at runtime and introduce > >>> overhead for everyone (like looking up __riscv_svpbmt every time we set > >>> a PTE in this case), in the other it's an #ifdef. > >> > >> At least, we should not have hardware violating specs as part of the > >> unified kernel image instead have these intentional deviations/violations > >> under separate kconfig which will not be enabled by default. This means > >> vendors (of such hardware) and distros will have to explicitly enable > >> support for such violations/deviations. > >> > > > > If we merge the code and are not enabled by default, it would be a > > maintenance nightmare in future. > > These part of the kernel will not be regularly tested but we have to > > carry the changes for a long time. > > I don't see a difference between having these features as part of the > generic image vs having them as custom configs/builds. The code will get > executed only on boards that support the custom/non-compliant > implementation anyway. To the contrary we'll have more code to test if > we are doing things at runtime vs at compile time. > > > Similar changes will only grow over time causing a lot of custom > > configs that are not enabled by default. > > > > We'll have a lot of custom configs that will only get used on boards > that use them, vs runtime code that will run for no reason on every > board and choose the default/standard-compliant implementation most of > the time. In the end the code will only get tested on specific hardware > anyway. > > > IMHO, if we want to support this board in upstream, we should just > > clearly state that it is one time special exception > > for this board only because of the following reasons > > > > 1. The board design predates the patch acceptance policy. > > 2. We don't have enough affordable Linux compatible platforms today. > > 3. Allowing running an upstream kernel on D1 helps the RISC-V software > > ecosystem to grow. > > > > The same can be said for Kendryte as well, are we willing to also > support their MMU implementation on the generic image if a patch comes > in? To be clear I'm not saying we shouldn't support D1 or Kendryte > upstream, I'm just saying that we shouldn't sacrifice the complexity and > performance of the code path for standard-compliant implementations, to > support non-compliant implementations, and instead support non-compliant > implementations with custom kernel builds using compile time options. It For priming the pump on the software effort, having a solution that is enabled on distro-builds is clearly preferable — that leads to the solution that Palmer had outlined at LPC, which is to have a KCONFIG option that enables the alternate code paths and can be turned off for embedded use-cases. > still counts as upstream support, they won't have to maintain their own > forks. It'll also allow custom implementations to have more flexibility > on what they can do since they will be able to use completely > different/custom code paths, instead of trying to fit in the standard > code path (which will become a mess over time). I think this approach is > much more flexible and will allow more customizations to be supported > upstream in the future. The important detail will be the ground rules: changes have to be sufficiently quarantined that (a) they can be turned off, (b) can be reverted easily (in case that vendors fail to perform their maintenance obligations), and (c) they don't affect the performance and complexity of the standard code paths. Cheers, Philipp.
On Tue, Sep 28, 2021 at 3:48 PM Philipp Tomsich <philipp.tomsich@vrull.eu> wrote: > > Nick, > > On Tue, 28 Sept 2021 at 15:19, Nick Kossifidis <mick@ics.forth.gr> wrote: > > > > On 9/28/21 7:26 AM, Atish Patra wrote: > > > On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote: > > >> > > >> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote: > > >>> > > >>> Στις 2021-09-27 23:13, Atish Patra έγραψε: > > >>>>> We need to decide whether we should support the upstream kernel for > > >>>>> D1. Few things to consider. > > >>>>> – Can it be considered as an errata ? > > >>> > > >>> It's one thing to follow the spec and have an error in the > > >>> implementation, and another to not follow the spec. > > >>> > > >>>>> – Does it set a bad precedent and open can of worms in future ? > > >>> > > >>> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and > > >>> asking for MMU support, they 've also shipped many chips already. I can > > >>> also imagine other vendors in the future coming up with implementations > > >>> that violate the spec in which case handling the standard stuff will > > >>> become messy and complex, and hurt performance/security. We'll end up > > >>> filling the code with exceptions and tweaks all over the place. We need > > >>> to be strict about what is "riscv" and what's "draft riscv" or "riscv > > >>> inspired", and what we are willing to support upstream. I can understand > > >>> supporting vendor extensions upstream but they need to fit within the > > >>> standard spec, we can't have for example extensions that use encoding > > >>> space/csrs/fields etc reserved for standard use, they may only use > > >>> what's reserved for custom/vendor use. At least let's agree on that. > > >> > > >> Totally agree with Nick here. It's a slippery slope. > > >> > > >> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V > > >> means future hardware which intentionally violates specs will also have to > > >> be merged and the RISC-V patch acceptance policy will have no significance. > > >> > > >>> > > >>>>> – Can we just ignore D1 given the mass volume ? > > >>>>> > > >>> > > >>> IMHO no, we need to find a way to support it upstream but I believe > > >>> there is another question to answer: > > >>> > > >>> Do we also guarantee "one image to rule them all" approach, required by > > >>> binary distros, for implementations that violate the spec ? Are we ok > > >>> for example to support Allwinner D1 upstream but require a custom > > >>> configuration/build instead of supporting it with the "generic" image ? > > >>> In one case we need to handle the violation at runtime and introduce > > >>> overhead for everyone (like looking up __riscv_svpbmt every time we set > > >>> a PTE in this case), in the other it's an #ifdef. > > >> > > >> At least, we should not have hardware violating specs as part of the > > >> unified kernel image instead have these intentional deviations/violations > > >> under separate kconfig which will not be enabled by default. This means > > >> vendors (of such hardware) and distros will have to explicitly enable > > >> support for such violations/deviations. > > >> > > > > > > If we merge the code and are not enabled by default, it would be a > > > maintenance nightmare in future. > > > These part of the kernel will not be regularly tested but we have to > > > carry the changes for a long time. > > > > I don't see a difference between having these features as part of the > > generic image vs having them as custom configs/builds. The code will get > > executed only on boards that support the custom/non-compliant > > implementation anyway. To the contrary we'll have more code to test if > > we are doing things at runtime vs at compile time. > > > > > Similar changes will only grow over time causing a lot of custom > > > configs that are not enabled by default. > > > > > > > We'll have a lot of custom configs that will only get used on boards > > that use them, vs runtime code that will run for no reason on every > > board and choose the default/standard-compliant implementation most of > > the time. In the end the code will only get tested on specific hardware > > anyway. > > > > > IMHO, if we want to support this board in upstream, we should just > > > clearly state that it is one time special exception > > > for this board only because of the following reasons > > > > > > 1. The board design predates the patch acceptance policy. > > > 2. We don't have enough affordable Linux compatible platforms today. > > > 3. Allowing running an upstream kernel on D1 helps the RISC-V software > > > ecosystem to grow. > > > > > > > The same can be said for Kendryte as well, are we willing to also > > support their MMU implementation on the generic image if a patch comes > > in? To be clear I'm not saying we shouldn't support D1 or Kendryte > > upstream, I'm just saying that we shouldn't sacrifice the complexity and > > performance of the code path for standard-compliant implementations, to > > support non-compliant implementations, and instead support non-compliant > > implementations with custom kernel builds using compile time options. It > > For priming the pump on the software effort, having a solution that is enabled > on distro-builds is clearly preferable — that leads to the solution that Palmer > had outlined at LPC, which is to have a KCONFIG option that enables the > alternate code paths and can be turned off for embedded use-cases. > > > still counts as upstream support, they won't have to maintain their own > > forks. It'll also allow custom implementations to have more flexibility > > on what they can do since they will be able to use completely > > different/custom code paths, instead of trying to fit in the standard > > code path (which will become a mess over time). I think this approach is > > much more flexible and will allow more customizations to be supported > > upstream in the future. > > The important detail will be the ground rules: changes have to be sufficiently > quarantined that (a) they can be turned off, (b) can be reverted easily (in case > that vendors fail to perform their maintenance obligations), Can we really remove support once it is in and widely used? > and (c) they don't > affect the performance and complexity of the standard code paths. > > Cheers, > Philipp. > > _______________________________________________ > linux-riscv mailing list > linux-riscv@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-riscv
On Tue, 28 Sep 2021 07:58:51 PDT (-0700), alexandre.ghiti@canonical.com wrote: > On Tue, Sep 28, 2021 at 3:48 PM Philipp Tomsich > <philipp.tomsich@vrull.eu> wrote: >> >> Nick, >> >> On Tue, 28 Sept 2021 at 15:19, Nick Kossifidis <mick@ics.forth.gr> wrote: >> > >> > On 9/28/21 7:26 AM, Atish Patra wrote: >> > > On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote: >> > >> >> > >> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote: >> > >>> >> > >>> Στις 2021-09-27 23:13, Atish Patra έγραψε: >> > >>>>> We need to decide whether we should support the upstream kernel for >> > >>>>> D1. Few things to consider. >> > >>>>> – Can it be considered as an errata ? >> > >>> >> > >>> It's one thing to follow the spec and have an error in the >> > >>> implementation, and another to not follow the spec. >> > >>> >> > >>>>> – Does it set a bad precedent and open can of worms in future ? >> > >>> >> > >>> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and >> > >>> asking for MMU support, they 've also shipped many chips already. I can >> > >>> also imagine other vendors in the future coming up with implementations >> > >>> that violate the spec in which case handling the standard stuff will >> > >>> become messy and complex, and hurt performance/security. We'll end up >> > >>> filling the code with exceptions and tweaks all over the place. We need >> > >>> to be strict about what is "riscv" and what's "draft riscv" or "riscv >> > >>> inspired", and what we are willing to support upstream. I can understand >> > >>> supporting vendor extensions upstream but they need to fit within the >> > >>> standard spec, we can't have for example extensions that use encoding >> > >>> space/csrs/fields etc reserved for standard use, they may only use >> > >>> what's reserved for custom/vendor use. At least let's agree on that. >> > >> >> > >> Totally agree with Nick here. It's a slippery slope. >> > >> >> > >> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V >> > >> means future hardware which intentionally violates specs will also have to >> > >> be merged and the RISC-V patch acceptance policy will have no significance. >> > >> >> > >>> >> > >>>>> – Can we just ignore D1 given the mass volume ? >> > >>>>> >> > >>> >> > >>> IMHO no, we need to find a way to support it upstream but I believe >> > >>> there is another question to answer: >> > >>> >> > >>> Do we also guarantee "one image to rule them all" approach, required by >> > >>> binary distros, for implementations that violate the spec ? Are we ok >> > >>> for example to support Allwinner D1 upstream but require a custom >> > >>> configuration/build instead of supporting it with the "generic" image ? >> > >>> In one case we need to handle the violation at runtime and introduce >> > >>> overhead for everyone (like looking up __riscv_svpbmt every time we set >> > >>> a PTE in this case), in the other it's an #ifdef. >> > >> >> > >> At least, we should not have hardware violating specs as part of the >> > >> unified kernel image instead have these intentional deviations/violations >> > >> under separate kconfig which will not be enabled by default. This means >> > >> vendors (of such hardware) and distros will have to explicitly enable >> > >> support for such violations/deviations. >> > >> >> > > >> > > If we merge the code and are not enabled by default, it would be a >> > > maintenance nightmare in future. >> > > These part of the kernel will not be regularly tested but we have to >> > > carry the changes for a long time. >> > >> > I don't see a difference between having these features as part of the >> > generic image vs having them as custom configs/builds. The code will get >> > executed only on boards that support the custom/non-compliant >> > implementation anyway. To the contrary we'll have more code to test if >> > we are doing things at runtime vs at compile time. >> > >> > > Similar changes will only grow over time causing a lot of custom >> > > configs that are not enabled by default. >> > > >> > >> > We'll have a lot of custom configs that will only get used on boards >> > that use them, vs runtime code that will run for no reason on every >> > board and choose the default/standard-compliant implementation most of >> > the time. In the end the code will only get tested on specific hardware >> > anyway. >> > >> > > IMHO, if we want to support this board in upstream, we should just >> > > clearly state that it is one time special exception >> > > for this board only because of the following reasons >> > > >> > > 1. The board design predates the patch acceptance policy. >> > > 2. We don't have enough affordable Linux compatible platforms today. >> > > 3. Allowing running an upstream kernel on D1 helps the RISC-V software >> > > ecosystem to grow. >> > > >> > >> > The same can be said for Kendryte as well, are we willing to also >> > support their MMU implementation on the generic image if a patch comes >> > in? To be clear I'm not saying we shouldn't support D1 or Kendryte >> > upstream, I'm just saying that we shouldn't sacrifice the complexity and >> > performance of the code path for standard-compliant implementations, to >> > support non-compliant implementations, and instead support non-compliant >> > implementations with custom kernel builds using compile time options. It >> >> For priming the pump on the software effort, having a solution that is enabled >> on distro-builds is clearly preferable — that leads to the solution that Palmer >> had outlined at LPC, which is to have a KCONFIG option that enables the >> alternate code paths and can be turned off for embedded use-cases. >> >> > still counts as upstream support, they won't have to maintain their own >> > forks. It'll also allow custom implementations to have more flexibility >> > on what they can do since they will be able to use completely >> > different/custom code paths, instead of trying to fit in the standard >> > code path (which will become a mess over time). I think this approach is >> > much more flexible and will allow more customizations to be supported >> > upstream in the future. >> >> The important detail will be the ground rules: changes have to be sufficiently >> quarantined that (a) they can be turned off, (b) can be reverted easily (in case >> that vendors fail to perform their maintenance obligations), > > Can we really remove support once it is in and widely used? We'll follow the standard deprecation policies for anything I have any say over, which in the kernel I've always heard described as forever-ish. Since this is pretty coupled to a specific chip one could imagine deprecating it when we can convince ourselves those chips have all had their smoke let out, but that's a decade timescale sort of thing. >> and (c) they don't >> affect the performance and complexity of the standard code paths. >> >> Cheers, >> Philipp. >> >> _______________________________________________ >> linux-riscv mailing list >> linux-riscv@lists.infradead.org >> http://lists.infradead.org/mailman/listinfo/linux-riscv
diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h index 54cbf07fb4e9..5acd99d08e74 100644 --- a/arch/riscv/include/asm/fixmap.h +++ b/arch/riscv/include/asm/fixmap.h @@ -43,7 +43,7 @@ enum fixed_addresses { __end_of_fixed_addresses }; -#define FIXMAP_PAGE_IO PAGE_KERNEL +#define FIXMAP_PAGE_IO PAGE_IOREMAP #define __early_set_fixmap __set_fixmap diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h index 228261aa9628..0b53ea67e91a 100644 --- a/arch/riscv/include/asm/pgtable-64.h +++ b/arch/riscv/include/asm/pgtable-64.h @@ -61,12 +61,14 @@ static inline void pud_clear(pud_t *pudp) static inline pmd_t *pud_pgtable(pud_t pud) { - return (pmd_t *)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT); + return (pmd_t *)pfn_to_virt((pud_val(pud) & _PAGE_CHG_MASK) + >> _PAGE_PFN_SHIFT); } static inline struct page *pud_page(pud_t pud) { - return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT); + return pfn_to_page((pud_val(pud) & _PAGE_CHG_MASK) + >> _PAGE_PFN_SHIFT); } static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot) @@ -76,7 +78,7 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot) static inline unsigned long _pmd_pfn(pmd_t pmd) { - return pmd_val(pmd) >> _PAGE_PFN_SHIFT; + return (pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT; } #define mk_pmd(page, prot) pfn_pmd(page_to_pfn(page), prot) diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h index 2ee413912926..3b38fe14f169 100644 --- a/arch/riscv/include/asm/pgtable-bits.h +++ b/arch/riscv/include/asm/pgtable-bits.h @@ -7,7 +7,7 @@ #define _ASM_RISCV_PGTABLE_BITS_H /* - * PTE format: + * rv32 PTE format: * | XLEN-1 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 * PFN reserved for SW D A G U X W R V */ @@ -24,6 +24,42 @@ #define _PAGE_DIRTY (1 << 7) /* Set by hardware on any write */ #define _PAGE_SOFT (1 << 8) /* Reserved for software */ +#ifndef __ASSEMBLY__ +#ifdef CONFIG_64BIT +/* + * rv64 PTE format: + * | 63 | 62 61 | 60 54 | 53 10 | 9 8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0 + * N MT RSV PFN reserved for SW D A G U X W R V + * [62:61] Memory Type definitions: + * 00 - PMA Normal Cacheable, No change to implied PMA memory type + * 01 - NC Non-cacheable, idempotent, weakly-ordered Main Memory + * 10 - IO Non-cacheable, non-idempotent, strongly-ordered I/O memory + * 11 - Rsvd Reserved for future standard use + */ +#define _SVPBMT_PMA ((unsigned long)0x0 << 61) +#define _SVPBMT_NC ((unsigned long)0x1 << 61) +#define _SVPBMT_IO ((unsigned long)0x2 << 61) +#define _SVPBMT_MASK (_SVPBMT_PMA | _SVPBMT_NC | _SVPBMT_IO) + +extern struct __riscv_svpbmt_struct { + unsigned long mask; + unsigned long mt_pma; + unsigned long mt_nc; + unsigned long mt_io; +} __riscv_svpbmt; + +#define _PAGE_MT_MASK __riscv_svpbmt.mask +#define _PAGE_MT_PMA __riscv_svpbmt.mt_pma +#define _PAGE_MT_NC __riscv_svpbmt.mt_nc +#define _PAGE_MT_IO __riscv_svpbmt.mt_io +#else +#define _PAGE_MT_MASK 0 +#define _PAGE_MT_PMA 0 +#define _PAGE_MT_NC 0 +#define _PAGE_MT_IO 0 +#endif /* CONFIG_64BIT */ +#endif /* __ASSEMBLY__ */ + #define _PAGE_SPECIAL _PAGE_SOFT #define _PAGE_TABLE _PAGE_PRESENT @@ -38,7 +74,8 @@ /* Set of bits to preserve across pte_modify() */ #define _PAGE_CHG_MASK (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \ _PAGE_WRITE | _PAGE_EXEC | \ - _PAGE_USER | _PAGE_GLOBAL)) + _PAGE_USER | _PAGE_GLOBAL | \ + _PAGE_MT_MASK)) /* * when all of R/W/X are zero, the PTE is a pointer to the next level * of the page table; otherwise, it is a leaf PTE. diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h index 39b550310ec6..3fc70a63e395 100644 --- a/arch/riscv/include/asm/pgtable.h +++ b/arch/riscv/include/asm/pgtable.h @@ -136,7 +136,8 @@ | _PAGE_PRESENT \ | _PAGE_ACCESSED \ | _PAGE_DIRTY \ - | _PAGE_GLOBAL) + | _PAGE_GLOBAL \ + | _PAGE_MT_PMA) #define PAGE_KERNEL __pgprot(_PAGE_KERNEL) #define PAGE_KERNEL_READ __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE) @@ -146,11 +147,9 @@ #define PAGE_TABLE __pgprot(_PAGE_TABLE) -/* - * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't - * change the properties of memory regions. - */ -#define _PAGE_IOREMAP _PAGE_KERNEL +#define _PAGE_IOREMAP ((_PAGE_KERNEL & ~_PAGE_MT_MASK) | _PAGE_MT_IO) + +#define PAGE_IOREMAP __pgprot(_PAGE_IOREMAP) extern pgd_t swapper_pg_dir[]; @@ -230,12 +229,12 @@ static inline unsigned long _pgd_pfn(pgd_t pgd) static inline struct page *pmd_page(pmd_t pmd) { - return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT); + return pfn_to_page((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT); } static inline unsigned long pmd_page_vaddr(pmd_t pmd) { - return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT); + return (unsigned long)pfn_to_virt((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT); } static inline pte_t pmd_pte(pmd_t pmd) @@ -251,7 +250,7 @@ static inline pte_t pud_pte(pud_t pud) /* Yields the page frame number (PFN) of a page table entry */ static inline unsigned long pte_pfn(pte_t pte) { - return (pte_val(pte) >> _PAGE_PFN_SHIFT); + return ((pte_val(pte) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT); } #define pte_page(x) pfn_to_page(pte_pfn(x)) @@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma, return ptep_test_and_clear_young(vma, address, ptep); } +#define pgprot_noncached pgprot_noncached +static inline pgprot_t pgprot_noncached(pgprot_t _prot) +{ + unsigned long prot = pgprot_val(_prot); + + prot &= ~_PAGE_MT_MASK; + prot |= _PAGE_MT_IO; + + return __pgprot(prot); +} + +#define pgprot_writecombine pgprot_writecombine +static inline pgprot_t pgprot_writecombine(pgprot_t _prot) +{ + unsigned long prot = pgprot_val(_prot); + + prot &= ~_PAGE_MT_MASK; + prot |= _PAGE_MT_NC; + + return __pgprot(prot); +} + /* * THP functions */ diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c index 6d59e6906fdd..fbce525961c0 100644 --- a/arch/riscv/kernel/cpu.c +++ b/arch/riscv/kernel/cpu.c @@ -77,7 +77,9 @@ static void print_mmu(struct seq_file *f, const char *mmu_type) return; #elif defined(CONFIG_64BIT) if (strcmp(mmu_type, "riscv,sv39") != 0 && - strcmp(mmu_type, "riscv,sv48") != 0) + strcmp(mmu_type, "riscv,sv48") != 0 && + strcmp(mmu_type, "riscv,sv39,svpbmt") != 0 && + strcmp(mmu_type, "riscv,sv48,svpbmt") != 0) return; #endif diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c index d959d207a40d..d1b046a8254b 100644 --- a/arch/riscv/kernel/cpufeature.c +++ b/arch/riscv/kernel/cpufeature.c @@ -8,6 +8,7 @@ #include <linux/bitmap.h> #include <linux/of.h> +#include <linux/pgtable.h> #include <asm/processor.h> #include <asm/hwcap.h> #include <asm/smp.h> @@ -59,6 +60,27 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit) } EXPORT_SYMBOL_GPL(__riscv_isa_extension_available); +static void __init riscv_svpbmt(void) +{ +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT) + struct device_node *node; + const char *str; + + for_each_of_cpu_node(node) { + if (of_property_read_string(node, "mmu-type", &str)) + continue; + + if (strncmp(str + 11, "svpbmt", 6)) + return; + } + + __riscv_svpbmt.mask = _SVPBMT_MASK; + __riscv_svpbmt.mt_pma = _SVPBMT_PMA; + __riscv_svpbmt.mt_nc = _SVPBMT_NC; + __riscv_svpbmt.mt_io = _SVPBMT_IO; +#endif +} + void __init riscv_fill_hwcap(void) { struct device_node *node; @@ -67,6 +89,8 @@ void __init riscv_fill_hwcap(void) size_t i, j, isa_len; static unsigned long isa2hwcap[256] = {0}; + riscv_svpbmt(); + isa2hwcap['i'] = isa2hwcap['I'] = COMPAT_HWCAP_ISA_I; isa2hwcap['m'] = isa2hwcap['M'] = COMPAT_HWCAP_ISA_M; isa2hwcap['a'] = isa2hwcap['A'] = COMPAT_HWCAP_ISA_A; diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c index 7cb4f391d106..43b2e11fd3e0 100644 --- a/arch/riscv/mm/init.c +++ b/arch/riscv/mm/init.c @@ -905,3 +905,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node, return vmemmap_populate_basepages(start, end, node, NULL); } #endif + +#ifdef CONFIG_64BIT +struct __riscv_svpbmt_struct __riscv_svpbmt __ro_after_init; +EXPORT_SYMBOL(__riscv_svpbmt); +#endif