[V2,1/2] riscv: Add RISC-V svpbmt extension

Message ID	20210923172107.1117604-1-guoren@kernel.org (mailing list archive)
State	New, archived
Headers	show Return-Path: <SRS0=XVif=ON=lists.infradead.org=linux-riscv-bounces+linux-riscv=archiver.kernel.org@kernel.org> DMARC-Filter: OpenDMARC Filter v1.4.1 mail.kernel.org 6BA34604D1 From: guoren@kernel.org To: anup.patel@wdc.com, atish.patra@wdc.com, palmerdabbelt@google.com, guoren@kernel.org, christoph.muellner@vrull.eu, philipp.tomsich@vrull.eu, hch@lst.de, liush@allwinnertech.com, wefu@redhat.com, lazyparser@gmail.com, drew@beagleboard.org Cc: linux-riscv@lists.infradead.org, linux-kernel@vger.kernel.org, taiten.peng@canonical.com, aniket.ponkshe@canonical.com, heinrich.schuchardt@canonical.com, gordan.markus@canonical.com, Guo Ren <guoren@linux.alibaba.com>, Arnd Bergmann <arnd@arndb.de>, Chen-Yu Tsai <wens@csie.org>, Maxime Ripard <maxime@cerno.tech>, Daniel Lustig <dlustig@nvidia.com>, Greg Favor <gfavor@ventanamicro.com>, Andrea Mondelli <andrea.mondelli@huawei.com>, Jonathan Behrens <behrensj@mit.edu>, Xinhaoqu <xinhaoqu@huawei.com>, Bill Huffman <huffman@cadence.com>, Nick Kossifidis <mick@ics.forth.gr>, Allen Baum <allen.baum@esperantotech.com>, Josh Scheid <jscheid@ventanamicro.com>, Richard Trauben <rtrauben@gmail.com> Subject: [PATCH V2 1/2] riscv: Add RISC-V svpbmt extension Date: Fri, 24 Sep 2021 01:21:06 +0800 Message-Id: <20210923172107.1117604-1-guoren@kernel.org> MIME-Version: 1.0 Precedence: list Content-Type: text/plain; charset="us-ascii" Content-Transfer-Encoding: 7bit Sender: "linux-riscv" <linux-riscv-bounces@lists.infradead.org> Errors-To: linux-riscv-bounces+linux-riscv=archiver.kernel.org@lists.infradead.org
Series	[V2,1/2] riscv: Add RISC-V svpbmt extension \| expand [V2,1/2] riscv: Add RISC-V svpbmt extension [V2,2/2] dt-bindings: riscv: Add svpbmt in cpu mmu-type property

Guo Ren Sept. 23, 2021, 5:21 p.m. UTC

From: Guo Ren <guoren@linux.alibaba.com>

This patch follows the standard pure RISC-V Svpbmt extension in
privilege spec to solve the non-coherent SOC dma synchronization
issues.

Here is the svpbmt PTE format:
| 63 | 62-61 | 60-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
  N     MT     RSW    D   A   G   U   X   W   R   V
        ^

Of the Reserved bits [63:54] in a leaf PTE, the high bit is already
allocated (as the N bit), so bits [62:61] are used as the MT (aka
MemType) field. This field specifies one of three memory types that
are close equivalents (or equivalent in effect) to the three main x86
and ARMv8 memory types - as shown in the following table.

RISC-V
Encoding &
MemType     RISC-V Description
----------  ------------------------------------------------
00 - PMA    Normal Cacheable, No change to implied PMA memory type
01 - NC     Non-cacheable, idempotent, weakly-ordered Main Memory
10 - IO     Non-cacheable, non-idempotent, strongly-ordered I/O memory
11 - Rsvd   Reserved for future standard use

The standard protection_map[] needn't be modified because the "PMA"
type keeps the highest bits zero. And the whole modification is
limited in the arch/riscv/* and using a global variable
(__riscv_svpbmt) as _PAGE_DMA_MASK/IO/NC for pgprot_noncached
(&writecombine) in pgtable.h. We also add _PAGE_CHG_MASK to filter
PFN than before.

Enable it in devicetree - (Reuse "mmu-type" of cpu_section)
 - riscv,sv39,svpbmt
 - riscv,sv48,svpbmt

Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
Co-developed-by: Liu Shaohua <liush@allwinnertech.com>
Signed-off-by: Liu Shaohua <liush@allwinnertech.com>
Co-developed-by: Wei Fu <wefu@redhat.com>
Signed-off-by: Wei Fu <wefu@redhat.com>
Cc: Palmer Dabbelt <palmerdabbelt@google.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Anup Patel <anup.patel@wdc.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Atish Patra <atish.patra@wdc.com>
Cc: Drew Fustini <drew@beagleboard.org>
Cc: Wei Fu <wefu@redhat.com>
Cc: Wei Wu <lazyparser@gmail.com>
Cc: Chen-Yu Tsai <wens@csie.org>
Cc: Maxime Ripard <maxime@cerno.tech>
Cc: Daniel Lustig <dlustig@nvidia.com>
Cc: Greg Favor <gfavor@ventanamicro.com>
Cc: Andrea Mondelli <andrea.mondelli@huawei.com>
Cc: Jonathan Behrens <behrensj@mit.edu>
Cc: Xinhaoqu (Freddie) <xinhaoqu@huawei.com>
Cc: Bill Huffman <huffman@cadence.com>
Cc: Nick Kossifidis <mick@ics.forth.gr>
Cc: Allen Baum <allen.baum@esperantotech.com>
Cc: Josh Scheid <jscheid@ventanamicro.com>
Cc: Richard Trauben <rtrauben@gmail.com>

---

Changes since V2:
 - Seperate DT modification into another patch
 - Move riscv_svpbmt() into riscv_fill_hwcap()
 - Fixup print_mmu()
 - Make __riscv_svpbmt updated only when all CPU nodes have "svpmbt"
   in the "mmu-type" DT property
 - Define _PAGE_MT_MASK as (_PAGE_MT_PMA | _PAGE_MT_NC | _PAGE_MT_IO)
 - Change u64 to unsigned long in _PAGE_MT_XXX
 - Change __riscv_svpbmt.mt[] to __riscv_svpbmt.mt_xxx
 - Optimize some misleading names
---
 arch/riscv/include/asm/fixmap.h       |  2 +-
 arch/riscv/include/asm/pgtable-64.h   |  8 ++++--
 arch/riscv/include/asm/pgtable-bits.h | 41 +++++++++++++++++++++++++--
 arch/riscv/include/asm/pgtable.h      | 39 +++++++++++++++++++------
 arch/riscv/kernel/cpu.c               |  4 ++-
 arch/riscv/kernel/cpufeature.c        | 24 ++++++++++++++++
 arch/riscv/mm/init.c                  |  5 ++++
 7 files changed, 107 insertions(+), 16 deletions(-)

Atish Patra Sept. 27, 2021, 8:13 p.m. UTC | #1

On Mon, Sep 27, 2021 at 1:09 PM Atish Patra <atishp@atishpatra.org> wrote:
>
>
>
> On Thu, Sep 23, 2021 at 10:22 AM <guoren@kernel.org> wrote:
>>
>> From: Guo Ren <guoren@linux.alibaba.com>
>>
>> This patch follows the standard pure RISC-V Svpbmt extension in
>> privilege spec to solve the non-coherent SOC dma synchronization
>> issues.
>>
>> Here is the svpbmt PTE format:
>> | 63 | 62-61 | 60-8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
>>   N     MT     RSW    D   A   G   U   X   W   R   V
>>         ^
>>
>> Of the Reserved bits [63:54] in a leaf PTE, the high bit is already
>> allocated (as the N bit), so bits [62:61] are used as the MT (aka
>> MemType) field. This field specifies one of three memory types that
>> are close equivalents (or equivalent in effect) to the three main x86
>> and ARMv8 memory types - as shown in the following table.
>>
>> RISC-V
>> Encoding &
>> MemType     RISC-V Description
>> ----------  ------------------------------------------------
>> 00 - PMA    Normal Cacheable, No change to implied PMA memory type
>> 01 - NC     Non-cacheable, idempotent, weakly-ordered Main Memory
>> 10 - IO     Non-cacheable, non-idempotent, strongly-ordered I/O memory
>> 11 - Rsvd   Reserved for future standard use
>>
>> The standard protection_map[] needn't be modified because the "PMA"
>> type keeps the highest bits zero. And the whole modification is
>> limited in the arch/riscv/* and using a global variable
>> (__riscv_svpbmt) as _PAGE_DMA_MASK/IO/NC for pgprot_noncached
>> (&writecombine) in pgtable.h. We also add _PAGE_CHG_MASK to filter
>> PFN than before.
>>
>

Resending it as it was not delivered to the mailing list because HTML
formatting was selected by mistake.
Sorry for the noise. Here is the original email.

> @Palmer Dabbelt @Guo Ren
>
> Can we first decide what to do with D1's upstreaming plan ? I had a slide[1] to discuss that during RISC-V BoF.
> But we ran out of time. Let's continue the discussion here.
>
> We all agree that Allwinner D1 has incompatible changes with privilege specification because it uses two reserved bits even after Svpbmt is merged.
> Let's not argue on the reasoning behind this change. The silicon is already out and the specification just got frozen.
> Unfortunately, we don't have a time stone to change the past ;).
>
> We need to decide whether we should support the upstream kernel for D1. Few things to consider.
> – Can it be considered as an errata ?
> – Does it set a bad precedent and open can of worms in future ?
> – Can we just ignore D1 given the mass volume ?
>
> One solution I can think of is that we allow this as an exception to the patch acceptance policy.
> We need to explicitly specify this board as an exception because the policy was not in place during the design phase of the hardware.
> At least, it protects us from accepting the incompatible changes in the future. Any other ideas ?
>
> [1] https://linuxplumbersconf.org/event/11/contributions/1128/attachments/846/1757/RISC-V%20Bof.pdf
>
>> Enable it in devicetree - (Reuse "mmu-type" of cpu_section)
>>  - riscv,sv39,svpbmt
>>  - riscv,sv48,svpbmt
>>
>> Signed-off-by: Guo Ren <guoren@linux.alibaba.com>
>> Co-developed-by: Liu Shaohua <liush@allwinnertech.com>
>> Signed-off-by: Liu Shaohua <liush@allwinnertech.com>
>> Co-developed-by: Wei Fu <wefu@redhat.com>
>> Signed-off-by: Wei Fu <wefu@redhat.com>
>> Cc: Palmer Dabbelt <palmerdabbelt@google.com>
>> Cc: Christoph Hellwig <hch@lst.de>
>> Cc: Anup Patel <anup.patel@wdc.com>
>> Cc: Arnd Bergmann <arnd@arndb.de>
>> Cc: Atish Patra <atish.patra@wdc.com>
>> Cc: Drew Fustini <drew@beagleboard.org>
>> Cc: Wei Fu <wefu@redhat.com>
>> Cc: Wei Wu <lazyparser@gmail.com>
>> Cc: Chen-Yu Tsai <wens@csie.org>
>> Cc: Maxime Ripard <maxime@cerno.tech>
>> Cc: Daniel Lustig <dlustig@nvidia.com>
>> Cc: Greg Favor <gfavor@ventanamicro.com>
>> Cc: Andrea Mondelli <andrea.mondelli@huawei.com>
>> Cc: Jonathan Behrens <behrensj@mit.edu>
>> Cc: Xinhaoqu (Freddie) <xinhaoqu@huawei.com>
>> Cc: Bill Huffman <huffman@cadence.com>
>> Cc: Nick Kossifidis <mick@ics.forth.gr>
>> Cc: Allen Baum <allen.baum@esperantotech.com>
>> Cc: Josh Scheid <jscheid@ventanamicro.com>
>> Cc: Richard Trauben <rtrauben@gmail.com>
>>
>> ---
>>
>> Changes since V2:
>>  - Seperate DT modification into another patch
>>  - Move riscv_svpbmt() into riscv_fill_hwcap()
>>  - Fixup print_mmu()
>>  - Make __riscv_svpbmt updated only when all CPU nodes have "svpmbt"
>>    in the "mmu-type" DT property
>>  - Define _PAGE_MT_MASK as (_PAGE_MT_PMA | _PAGE_MT_NC | _PAGE_MT_IO)
>>  - Change u64 to unsigned long in _PAGE_MT_XXX
>>  - Change __riscv_svpbmt.mt[] to __riscv_svpbmt.mt_xxx
>>  - Optimize some misleading names
>> ---
>>  arch/riscv/include/asm/fixmap.h       |  2 +-
>>  arch/riscv/include/asm/pgtable-64.h   |  8 ++++--
>>  arch/riscv/include/asm/pgtable-bits.h | 41 +++++++++++++++++++++++++--
>>  arch/riscv/include/asm/pgtable.h      | 39 +++++++++++++++++++------
>>  arch/riscv/kernel/cpu.c               |  4 ++-
>>  arch/riscv/kernel/cpufeature.c        | 24 ++++++++++++++++
>>  arch/riscv/mm/init.c                  |  5 ++++
>>  7 files changed, 107 insertions(+), 16 deletions(-)
>>
>> diff --git a/arch/riscv/include/asm/fixmap.h b/arch/riscv/include/asm/fixmap.h
>> index 54cbf07fb4e9..5acd99d08e74 100644
>> --- a/arch/riscv/include/asm/fixmap.h
>> +++ b/arch/riscv/include/asm/fixmap.h
>> @@ -43,7 +43,7 @@ enum fixed_addresses {
>>         __end_of_fixed_addresses
>>  };
>>
>> -#define FIXMAP_PAGE_IO         PAGE_KERNEL
>> +#define FIXMAP_PAGE_IO         PAGE_IOREMAP
>>
>>  #define __early_set_fixmap     __set_fixmap
>>
>> diff --git a/arch/riscv/include/asm/pgtable-64.h b/arch/riscv/include/asm/pgtable-64.h
>> index 228261aa9628..0b53ea67e91a 100644
>> --- a/arch/riscv/include/asm/pgtable-64.h
>> +++ b/arch/riscv/include/asm/pgtable-64.h
>> @@ -61,12 +61,14 @@ static inline void pud_clear(pud_t *pudp)
>>
>>  static inline pmd_t *pud_pgtable(pud_t pud)
>>  {
>> -       return (pmd_t *)pfn_to_virt(pud_val(pud) >> _PAGE_PFN_SHIFT);
>> +       return (pmd_t *)pfn_to_virt((pud_val(pud) & _PAGE_CHG_MASK)
>> +                                               >> _PAGE_PFN_SHIFT);
>>  }
>>
>>  static inline struct page *pud_page(pud_t pud)
>>  {
>> -       return pfn_to_page(pud_val(pud) >> _PAGE_PFN_SHIFT);
>> +       return pfn_to_page((pud_val(pud) & _PAGE_CHG_MASK)
>> +                                               >> _PAGE_PFN_SHIFT);
>>  }
>>
>>  static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
>> @@ -76,7 +78,7 @@ static inline pmd_t pfn_pmd(unsigned long pfn, pgprot_t prot)
>>
>>  static inline unsigned long _pmd_pfn(pmd_t pmd)
>>  {
>> -       return pmd_val(pmd) >> _PAGE_PFN_SHIFT;
>> +       return (pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT;
>>  }
>>
>>  #define mk_pmd(page, prot)    pfn_pmd(page_to_pfn(page), prot)
>> diff --git a/arch/riscv/include/asm/pgtable-bits.h b/arch/riscv/include/asm/pgtable-bits.h
>> index 2ee413912926..3b38fe14f169 100644
>> --- a/arch/riscv/include/asm/pgtable-bits.h
>> +++ b/arch/riscv/include/asm/pgtable-bits.h
>> @@ -7,7 +7,7 @@
>>  #define _ASM_RISCV_PGTABLE_BITS_H
>>
>>  /*
>> - * PTE format:
>> + * rv32 PTE format:
>>   * | XLEN-1  10 | 9             8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
>>   *       PFN      reserved for SW   D   A   G   U   X   W   R   V
>>   */
>> @@ -24,6 +24,42 @@
>>  #define _PAGE_DIRTY     (1 << 7)    /* Set by hardware on any write */
>>  #define _PAGE_SOFT      (1 << 8)    /* Reserved for software */
>>
>> +#ifndef __ASSEMBLY__
>> +#ifdef CONFIG_64BIT
>> +/*
>> + * rv64 PTE format:
>> + * | 63 | 62 61 | 60 54 | 53  10 | 9             8 | 7 | 6 | 5 | 4 | 3 | 2 | 1 | 0
>> + *   N      MT     RSV    PFN      reserved for SW   D   A   G   U   X   W   R   V
>> + * [62:61] Memory Type definitions:
>> + *  00 - PMA    Normal Cacheable, No change to implied PMA memory type
>> + *  01 - NC     Non-cacheable, idempotent, weakly-ordered Main Memory
>> + *  10 - IO     Non-cacheable, non-idempotent, strongly-ordered I/O memory
>> + *  11 - Rsvd   Reserved for future standard use
>> + */
>> +#define _SVPBMT_PMA            ((unsigned long)0x0 << 61)
>> +#define _SVPBMT_NC             ((unsigned long)0x1 << 61)
>> +#define _SVPBMT_IO             ((unsigned long)0x2 << 61)
>> +#define _SVPBMT_MASK           (_SVPBMT_PMA | _SVPBMT_NC | _SVPBMT_IO)
>> +
>> +extern struct __riscv_svpbmt_struct {
>> +       unsigned long mask;
>> +       unsigned long mt_pma;
>> +       unsigned long mt_nc;
>> +       unsigned long mt_io;
>> +} __riscv_svpbmt;
>> +
>> +#define _PAGE_MT_MASK          __riscv_svpbmt.mask
>> +#define _PAGE_MT_PMA           __riscv_svpbmt.mt_pma
>> +#define _PAGE_MT_NC            __riscv_svpbmt.mt_nc
>> +#define _PAGE_MT_IO            __riscv_svpbmt.mt_io
>> +#else
>> +#define _PAGE_MT_MASK          0
>> +#define _PAGE_MT_PMA           0
>> +#define _PAGE_MT_NC            0
>> +#define _PAGE_MT_IO            0
>> +#endif /* CONFIG_64BIT */
>> +#endif /* __ASSEMBLY__ */
>> +
>>  #define _PAGE_SPECIAL   _PAGE_SOFT
>>  #define _PAGE_TABLE     _PAGE_PRESENT
>>
>> @@ -38,7 +74,8 @@
>>  /* Set of bits to preserve across pte_modify() */
>>  #define _PAGE_CHG_MASK  (~(unsigned long)(_PAGE_PRESENT | _PAGE_READ | \
>>                                           _PAGE_WRITE | _PAGE_EXEC |    \
>> -                                         _PAGE_USER | _PAGE_GLOBAL))
>> +                                         _PAGE_USER | _PAGE_GLOBAL |   \
>> +                                         _PAGE_MT_MASK))
>>  /*
>>   * when all of R/W/X are zero, the PTE is a pointer to the next level
>>   * of the page table; otherwise, it is a leaf PTE.
>> diff --git a/arch/riscv/include/asm/pgtable.h b/arch/riscv/include/asm/pgtable.h
>> index 39b550310ec6..3fc70a63e395 100644
>> --- a/arch/riscv/include/asm/pgtable.h
>> +++ b/arch/riscv/include/asm/pgtable.h
>> @@ -136,7 +136,8 @@
>>                                 | _PAGE_PRESENT \
>>                                 | _PAGE_ACCESSED \
>>                                 | _PAGE_DIRTY \
>> -                               | _PAGE_GLOBAL)
>> +                               | _PAGE_GLOBAL \
>> +                               | _PAGE_MT_PMA)
>>
>>  #define PAGE_KERNEL            __pgprot(_PAGE_KERNEL)
>>  #define PAGE_KERNEL_READ       __pgprot(_PAGE_KERNEL & ~_PAGE_WRITE)
>> @@ -146,11 +147,9 @@
>>
>>  #define PAGE_TABLE             __pgprot(_PAGE_TABLE)
>>
>> -/*
>> - * The RISC-V ISA doesn't yet specify how to query or modify PMAs, so we can't
>> - * change the properties of memory regions.
>> - */
>> -#define _PAGE_IOREMAP _PAGE_KERNEL
>> +#define _PAGE_IOREMAP  ((_PAGE_KERNEL & ~_PAGE_MT_MASK) | _PAGE_MT_IO)
>> +
>> +#define PAGE_IOREMAP           __pgprot(_PAGE_IOREMAP)
>>
>>  extern pgd_t swapper_pg_dir[];
>>
>> @@ -230,12 +229,12 @@ static inline unsigned long _pgd_pfn(pgd_t pgd)
>>
>>  static inline struct page *pmd_page(pmd_t pmd)
>>  {
>> -       return pfn_to_page(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
>> +       return pfn_to_page((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
>>  }
>>
>>  static inline unsigned long pmd_page_vaddr(pmd_t pmd)
>>  {
>> -       return (unsigned long)pfn_to_virt(pmd_val(pmd) >> _PAGE_PFN_SHIFT);
>> +       return (unsigned long)pfn_to_virt((pmd_val(pmd) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
>>  }
>>
>>  static inline pte_t pmd_pte(pmd_t pmd)
>> @@ -251,7 +250,7 @@ static inline pte_t pud_pte(pud_t pud)
>>  /* Yields the page frame number (PFN) of a page table entry */
>>  static inline unsigned long pte_pfn(pte_t pte)
>>  {
>> -       return (pte_val(pte) >> _PAGE_PFN_SHIFT);
>> +       return ((pte_val(pte) & _PAGE_CHG_MASK) >> _PAGE_PFN_SHIFT);
>>  }
>>
>>  #define pte_page(x)     pfn_to_page(pte_pfn(x))
>> @@ -490,6 +489,28 @@ static inline int ptep_clear_flush_young(struct vm_area_struct *vma,
>>         return ptep_test_and_clear_young(vma, address, ptep);
>>  }
>>
>> +#define pgprot_noncached pgprot_noncached
>> +static inline pgprot_t pgprot_noncached(pgprot_t _prot)
>> +{
>> +       unsigned long prot = pgprot_val(_prot);
>> +
>> +       prot &= ~_PAGE_MT_MASK;
>> +       prot |= _PAGE_MT_IO;
>> +
>> +       return __pgprot(prot);
>> +}
>> +
>> +#define pgprot_writecombine pgprot_writecombine
>> +static inline pgprot_t pgprot_writecombine(pgprot_t _prot)
>> +{
>> +       unsigned long prot = pgprot_val(_prot);
>> +
>> +       prot &= ~_PAGE_MT_MASK;
>> +       prot |= _PAGE_MT_NC;
>> +
>> +       return __pgprot(prot);
>> +}
>> +
>>  /*
>>   * THP functions
>>   */
>> diff --git a/arch/riscv/kernel/cpu.c b/arch/riscv/kernel/cpu.c
>> index 6d59e6906fdd..fbce525961c0 100644
>> --- a/arch/riscv/kernel/cpu.c
>> +++ b/arch/riscv/kernel/cpu.c
>> @@ -77,7 +77,9 @@ static void print_mmu(struct seq_file *f, const char *mmu_type)
>>                 return;
>>  #elif defined(CONFIG_64BIT)
>>         if (strcmp(mmu_type, "riscv,sv39") != 0 &&
>> -           strcmp(mmu_type, "riscv,sv48") != 0)
>> +           strcmp(mmu_type, "riscv,sv48") != 0 &&
>> +           strcmp(mmu_type, "riscv,sv39,svpbmt") != 0 &&
>> +           strcmp(mmu_type, "riscv,sv48,svpbmt") != 0)
>>                 return;
>>  #endif
>>
>> diff --git a/arch/riscv/kernel/cpufeature.c b/arch/riscv/kernel/cpufeature.c
>> index d959d207a40d..d1b046a8254b 100644
>> --- a/arch/riscv/kernel/cpufeature.c
>> +++ b/arch/riscv/kernel/cpufeature.c
>> @@ -8,6 +8,7 @@
>>
>>  #include <linux/bitmap.h>
>>  #include <linux/of.h>
>> +#include <linux/pgtable.h>
>>  #include <asm/processor.h>
>>  #include <asm/hwcap.h>
>>  #include <asm/smp.h>
>> @@ -59,6 +60,27 @@ bool __riscv_isa_extension_available(const unsigned long *isa_bitmap, int bit)
>>  }
>>  EXPORT_SYMBOL_GPL(__riscv_isa_extension_available);
>>
>> +static void __init riscv_svpbmt(void)
>> +{
>> +#if defined(CONFIG_MMU) && defined(CONFIG_64BIT)
>> +       struct device_node *node;
>> +       const char *str;
>> +
>> +       for_each_of_cpu_node(node) {
>> +               if (of_property_read_string(node, "mmu-type", &str))
>> +                       continue;
>> +
>> +               if (strncmp(str + 11, "svpbmt", 6))
>> +                       return;
>> +       }
>> +
>> +       __riscv_svpbmt.mask     = _SVPBMT_MASK;
>> +       __riscv_svpbmt.mt_pma   = _SVPBMT_PMA;
>> +       __riscv_svpbmt.mt_nc    = _SVPBMT_NC;
>> +       __riscv_svpbmt.mt_io    = _SVPBMT_IO;
>> +#endif
>> +}
>> +
>>  void __init riscv_fill_hwcap(void)
>>  {
>>         struct device_node *node;
>> @@ -67,6 +89,8 @@ void __init riscv_fill_hwcap(void)
>>         size_t i, j, isa_len;
>>         static unsigned long isa2hwcap[256] = {0};
>>
>> +       riscv_svpbmt();
>> +
>>         isa2hwcap['i'] = isa2hwcap['I'] = COMPAT_HWCAP_ISA_I;
>>         isa2hwcap['m'] = isa2hwcap['M'] = COMPAT_HWCAP_ISA_M;
>>         isa2hwcap['a'] = isa2hwcap['A'] = COMPAT_HWCAP_ISA_A;
>> diff --git a/arch/riscv/mm/init.c b/arch/riscv/mm/init.c
>> index 7cb4f391d106..43b2e11fd3e0 100644
>> --- a/arch/riscv/mm/init.c
>> +++ b/arch/riscv/mm/init.c
>> @@ -905,3 +905,8 @@ int __meminit vmemmap_populate(unsigned long start, unsigned long end, int node,
>>         return vmemmap_populate_basepages(start, end, node, NULL);
>>  }
>>  #endif
>> +
>> +#ifdef CONFIG_64BIT
>> +struct __riscv_svpbmt_struct __riscv_svpbmt __ro_after_init;
>> +EXPORT_SYMBOL(__riscv_svpbmt);
>> +#endif
>> --
>> 2.25.1
>>
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv
>
>
>
> --
> Regards,
> Atish

Atish Patra Sept. 27, 2021, 11:05 p.m. UTC | #2

On Mon, Sep 27, 2021 at 1:53 PM Greg Favor <gfavor@ventanamicro.com> wrote:
>
> With the big caveat that I haven't been in the middle of this discussion, it seems like Allwinner D1's changes represent a custom (and nonconforming) extension.

As per the v1.12 privilege specification, bit 63 is reserved for
Svnapot extension while bit 60–54 are reserved for future standard
use.
D1's implementation uses both 60 and 63 bit for their custom "PBMT"
extension in addition to bit 61 & 62 [1].

> Isn't this just a matter of the patch needing to be treated as for a RISC-V custom extension per the recently clarified policy for handling upstreaming/etc. of custom extensions?  (Philipp can > speak to this clarified policy.)  Or what am I missing?

Linux kernel upstream policy is yet to adopt that clarification as it
was recently discussed at RVI meetings. Is there a written definition
of non-conforming/custom/incompatible ?
Moreover, as per the platform specification[2],

A non-conforming extension that conflicts with a supported standard
extensions must satisfy at least one of the following:
  --- It must be disabled by default.
  --- The supported standard extension must be declared as unsupported
in all feature discovery structures used by software.
      This option is allowed only if the standard extension is not required.

In this case, the custom non-conforming implementation can not be
disabled or marked unsupported as it is critical for all the necessary
I/O devices (usb, mmc, ethernet).
Without this custom implementation support in upstream, we can not
really use the mainline kernel for Allwinner D1.

[1] https://linuxplumbersconf.org/event/11/contributions/1100/attachments/841/1607/What%E2%80%99s%20the%20problem%20with%20D1%20Linux%20upstream-.pdf
[2] https://github.com/riscv/riscv-platform-specs/blob/main/riscv-platform-spec.adoc#2112-general

>
> Greg
>
> On Mon, Sep 27, 2021 at 1:14 PM Atish Patra <atishp@atishpatra.org> wrote:
>>
>> > @Palmer Dabbelt @Guo Ren
>> >
>> > Can we first decide what to do with D1's upstreaming plan ? I had a slide[1] to discuss that during RISC-V BoF.
>> > But we ran out of time. Let's continue the discussion here.
>> >
>> > We all agree that Allwinner D1 has incompatible changes with privilege specification because it uses two reserved bits even after Svpbmt is merged.
>> > Let's not argue on the reasoning behind this change. The silicon is already out and the specification just got frozen.
>> > Unfortunately, we don't have a time stone to change the past ;).
>> >
>> > We need to decide whether we should support the upstream kernel for D1. Few things to consider.
>> > – Can it be considered as an errata ?
>> > – Does it set a bad precedent and open can of worms in future ?
>> > – Can we just ignore D1 given the mass volume ?
>> >
>> > One solution I can think of is that we allow this as an exception to the patch acceptance policy.
>> > We need to explicitly specify this board as an exception because the policy was not in place during the design phase of the hardware.
>> > At least, it protects us from accepting the incompatible changes in the future. Any other ideas ?
>> >
>> > [1] https://linuxplumbersconf.org/event/11/contributions/1128/attachments/846/1757/RISC-V%20Bof.pdf

Nick Kossifidis Sept. 28, 2021, 12:23 a.m. UTC | #3

Hello Greg,

Στις 2021-09-27 23:53, Greg Favor έγραψε:
> With the big caveat that I haven't been in the middle of this 
> discussion, it seems like Allwinner D1's changes represent a custom 
> (and nonconforming) extension.  Isn't this just a matter of the patch 
> needing to be treated as for a RISC-V custom extension per the recently 
> clarified policy for handling upstreaming/etc. of custom extensions?  
> (Philipp can speak to this clarified policy.)  Or what am I missing?
> 

The Priv. Spec. defines sv39/48 without allowing custom use of reserved 
pte bits by implementations, Allwinner D1 claims to be sv39 but it does 
use reserved PTE bits. When vendors want to do custom stuff on PTEs, the 
standard says they may use values 14-15 on satp.mode for that and define 
their own MMU basically. Messing up with the implementation of sv39 is 
not an extension, is violation of sv39. Even worse this implementation 
can't work if we ignore the customization since without setting the 
required bits on PTEs dma (and I believe SMP as well) doesn't work.

Regards,
Nick

Nick Kossifidis Sept. 28, 2021, 1:02 a.m. UTC | #4

Στις 2021-09-27 23:13, Atish Patra έγραψε:
>> We need to decide whether we should support the upstream kernel for 
>> D1. Few things to consider.
>> – Can it be considered as an errata ?

It's one thing to follow the spec and have an error in the 
implementation, and another to not follow the spec.

>> – Does it set a bad precedent and open can of worms in future ?

IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and 
asking for MMU support, they 've also shipped many chips already. I can 
also imagine other vendors in the future coming up with implementations 
that violate the spec in which case handling the standard stuff will 
become messy and complex, and hurt performance/security. We'll end up 
filling the code with exceptions and tweaks all over the place. We need 
to be strict about what is "riscv" and what's "draft riscv" or "riscv 
inspired", and what we are willing to support upstream. I can understand 
supporting vendor extensions upstream but they need to fit within the 
standard spec, we can't have for example extensions that use encoding 
space/csrs/fields etc reserved for standard use, they may only use 
what's reserved for custom/vendor use. At least let's agree on that.

>> – Can we just ignore D1 given the mass volume ?
>> 

IMHO no, we need to find a way to support it upstream but I believe 
there is another question to answer:

Do we also guarantee "one image to rule them all" approach, required by 
binary distros, for implementations that violate the spec ? Are we ok 
for example to support Allwinner D1 upstream but require a custom 
configuration/build instead of supporting it with the "generic" image ? 
In one case we need to handle the violation at runtime and introduce 
overhead for everyone (like looking up __riscv_svpbmt every time we set 
a PTE in this case), in the other it's an #ifdef.

Regards,
Nick

Anup Patel Sept. 28, 2021, 3:50 a.m. UTC | #5

On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
>
> Στις 2021-09-27 23:13, Atish Patra έγραψε:
> >> We need to decide whether we should support the upstream kernel for
> >> D1. Few things to consider.
> >> – Can it be considered as an errata ?
>
> It's one thing to follow the spec and have an error in the
> implementation, and another to not follow the spec.
>
> >> – Does it set a bad precedent and open can of worms in future ?
>
> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and
> asking for MMU support, they 've also shipped many chips already. I can
> also imagine other vendors in the future coming up with implementations
> that violate the spec in which case handling the standard stuff will
> become messy and complex, and hurt performance/security. We'll end up
> filling the code with exceptions and tweaks all over the place. We need
> to be strict about what is "riscv" and what's "draft riscv" or "riscv
> inspired", and what we are willing to support upstream. I can understand
> supporting vendor extensions upstream but they need to fit within the
> standard spec, we can't have for example extensions that use encoding
> space/csrs/fields etc reserved for standard use, they may only use
> what's reserved for custom/vendor use. At least let's agree on that.

Totally agree with Nick here. It's a slippery slope.

Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V
means future hardware which intentionally violates specs will also have to
be merged and the RISC-V patch acceptance policy will have no significance.

>
> >> – Can we just ignore D1 given the mass volume ?
> >>
>
> IMHO no, we need to find a way to support it upstream but I believe
> there is another question to answer:
>
> Do we also guarantee "one image to rule them all" approach, required by
> binary distros, for implementations that violate the spec ? Are we ok
> for example to support Allwinner D1 upstream but require a custom
> configuration/build instead of supporting it with the "generic" image ?
> In one case we need to handle the violation at runtime and introduce
> overhead for everyone (like looking up __riscv_svpbmt every time we set
> a PTE in this case), in the other it's an #ifdef.

At least, we should not have hardware violating specs as part of the
unified kernel image instead have these intentional deviations/violations
under separate kconfig which will not be enabled by default. This means
vendors (of such hardware) and distros will have to explicitly enable
support for such violations/deviations.

Regards,
Anup

Atish Patra Sept. 28, 2021, 4:26 a.m. UTC | #6

On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote:
>
> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
> >
> > Στις 2021-09-27 23:13, Atish Patra έγραψε:
> > >> We need to decide whether we should support the upstream kernel for
> > >> D1. Few things to consider.
> > >> – Can it be considered as an errata ?
> >
> > It's one thing to follow the spec and have an error in the
> > implementation, and another to not follow the spec.
> >
> > >> – Does it set a bad precedent and open can of worms in future ?
> >
> > IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and
> > asking for MMU support, they 've also shipped many chips already. I can
> > also imagine other vendors in the future coming up with implementations
> > that violate the spec in which case handling the standard stuff will
> > become messy and complex, and hurt performance/security. We'll end up
> > filling the code with exceptions and tweaks all over the place. We need
> > to be strict about what is "riscv" and what's "draft riscv" or "riscv
> > inspired", and what we are willing to support upstream. I can understand
> > supporting vendor extensions upstream but they need to fit within the
> > standard spec, we can't have for example extensions that use encoding
> > space/csrs/fields etc reserved for standard use, they may only use
> > what's reserved for custom/vendor use. At least let's agree on that.
>
> Totally agree with Nick here. It's a slippery slope.
>
> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V
> means future hardware which intentionally violates specs will also have to
> be merged and the RISC-V patch acceptance policy will have no significance.
>
> >
> > >> – Can we just ignore D1 given the mass volume ?
> > >>
> >
> > IMHO no, we need to find a way to support it upstream but I believe
> > there is another question to answer:
> >
> > Do we also guarantee "one image to rule them all" approach, required by
> > binary distros, for implementations that violate the spec ? Are we ok
> > for example to support Allwinner D1 upstream but require a custom
> > configuration/build instead of supporting it with the "generic" image ?
> > In one case we need to handle the violation at runtime and introduce
> > overhead for everyone (like looking up __riscv_svpbmt every time we set
> > a PTE in this case), in the other it's an #ifdef.
>
> At least, we should not have hardware violating specs as part of the
> unified kernel image instead have these intentional deviations/violations
> under separate kconfig which will not be enabled by default. This means
> vendors (of such hardware) and distros will have to explicitly enable
> support for such violations/deviations.
>

If we merge the code and are not enabled by default, it would be a
maintenance nightmare in future.
These part of the kernel will not be regularly tested but we have to
carry the changes for a long time.
Similar changes will only grow over time causing a lot of custom
configs that are not enabled by default.

IMHO, if we want to support this board in upstream, we should just
clearly state that it is one time special exception
for this board only because of the following reasons

1. The board design predates the patch acceptance policy.
2. We don't have enough affordable Linux compatible platforms today.
3. Allowing running an upstream kernel on D1 helps the RISC-V software
ecosystem to grow.

No more exceptions will be allowed in future for such hardware that
violates the spec. Period.

> Regards,
> Anup

Guo Ren Sept. 28, 2021, 6:03 a.m. UTC | #7

On Tue, Sep 28, 2021 at 12:26 PM Atish Patra <atishp@atishpatra.org> wrote:
>
> On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote:
> >
> > On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
> > >
> > > Στις 2021-09-27 23:13, Atish Patra έγραψε:
> > > >> We need to decide whether we should support the upstream kernel for
> > > >> D1. Few things to consider.
> > > >> – Can it be considered as an errata ?
> > >
> > > It's one thing to follow the spec and have an error in the
> > > implementation, and another to not follow the spec.
> > >
> > > >> – Does it set a bad precedent and open can of worms in future ?
> > >
> > > IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and
> > > asking for MMU support, they 've also shipped many chips already. I can
> > > also imagine other vendors in the future coming up with implementations
> > > that violate the spec in which case handling the standard stuff will
> > > become messy and complex, and hurt performance/security. We'll end up
> > > filling the code with exceptions and tweaks all over the place. We need
> > > to be strict about what is "riscv" and what's "draft riscv" or "riscv
> > > inspired", and what we are willing to support upstream. I can understand
> > > supporting vendor extensions upstream but they need to fit within the
> > > standard spec, we can't have for example extensions that use encoding
> > > space/csrs/fields etc reserved for standard use, they may only use
> > > what's reserved for custom/vendor use. At least let's agree on that.
> >
> > Totally agree with Nick here. It's a slippery slope.
> >
> > Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V
> > means future hardware which intentionally violates specs will also have to
> > be merged and the RISC-V patch acceptance policy will have no significance.
> >
> > >
> > > >> – Can we just ignore D1 given the mass volume ?
> > > >>
> > >
> > > IMHO no, we need to find a way to support it upstream but I believe
> > > there is another question to answer:
> > >
> > > Do we also guarantee "one image to rule them all" approach, required by
> > > binary distros, for implementations that violate the spec ? Are we ok
> > > for example to support Allwinner D1 upstream but require a custom
> > > configuration/build instead of supporting it with the "generic" image ?
> > > In one case we need to handle the violation at runtime and introduce
> > > overhead for everyone (like looking up __riscv_svpbmt every time we set
> > > a PTE in this case), in the other it's an #ifdef.
> >
> > At least, we should not have hardware violating specs as part of the
> > unified kernel image instead have these intentional deviations/violations
> > under separate kconfig which will not be enabled by default. This means
> > vendors (of such hardware) and distros will have to explicitly enable
> > support for such violations/deviations.
> >
>
> If we merge the code and are not enabled by default, it would be a
> maintenance nightmare in future.
> These part of the kernel will not be regularly tested but we have to
> carry the changes for a long time.
> Similar changes will only grow over time causing a lot of custom
> configs that are not enabled by default.
D1 could still use generic Image. The reason why I send the standard
implementation of svpbmt is that when we introduce svpbmt, we actually
introduce the page attribute frameworks for different platforms(svpbmt
& non-svpbmt). Then, "custom svpbmt" can also modify "protect_mapp []"
and svpbmt [] "in errata by limited codes from vendor.
If we support standard svpbmt first, then let "generic Image" support
D1 would be very little modification and all could be kept in errata.

Another patch [1] cleans up the wrong usage of "protect_map []" so
that the entire Linux user state page attributes come from it. The
design principle of Linux is to allow the platform to init
"protect_map []" flexibly.
[1]: https://lore.kernel.org/all/20210927064340.2411397-1-guoren@kernel.org/

>
> IMHO, if we want to support this board in upstream, we should just
> clearly state that it is one time special exception
> for this board only because of the following reasons
>
> 1. The board design predates the patch acceptance policy.
D1 is designed at 2019.

> 2. We don't have enough affordable Linux compatible platforms today.
D1 only  $65.

> 3. Allowing running an upstream kernel on D1 helps the RISC-V software
> ecosystem to grow.
Yes

>
> No more exceptions will be allowed in future for such hardware that
> violates the spec. Period.
>
> > Regards,
> > Anup
>
>
>
> --
> Regards,
> Atish

Atish Patra Sept. 28, 2021, 6:14 a.m. UTC | #8

On Mon, Sep 27, 2021 at 9:49 PM Greg Favor <gfavor@ventanamicro.com> wrote:
>
> On Mon, Sep 27, 2021 at 9:26 PM Atish Patra <atishp@atishpatra.org> wrote:
>>
>> IMHO, if we want to support this board in upstream, we should just
>> clearly state that it is one time special exception
>> for this board only because of the following reasons
>
>
> I'm not quite following what the exception is?  If RVI's policy is followed for how software support for custom extensions (which D1 falls under) should be handled (as part of allowing such software to be upstreamed), then that doesn't burden standard distros nor the community to maintain ongoing support.

I am a bit confused. As per our understanding, D1 doesn't fall under
custom extensions because they have non-standard conflicting
implementations of the PTE bits that violate the privilege
specification (v1.12 with svpbmt & svnapot extension merged).
Custom extensions can only be defined via satp mode (14-15). Am I
missing something?

>  And this doesn't stop a custom Linux build from being provided to users of the D1 board (like what any other vendor would need to do to support the custom extensions on their hardware).

A custom Linux build is already floating around for the users for the
D1 board. Whether the upstream mainline kernel supports this board is
the key question here.

>
> Or are you proposing that standard binary distros would have to support D1's custom extension?  (And how would that even work if and when Svpbmt becomes required in some rev of the OS-A platform spec - at which point the D1 boards have a nonconforming extension that can't be disabled and thus conflicts with required (Svpbmt) functionality?)

I was suggesting to support D1 in the unified kernel image built from
defconfig only if we decide to support it.
standard binary distros(such as Fedora, RHEL, Suse, Ubuntu) anyways
use custom config. They can always disable support for D1 if they want
it.

>
> Greg
>
>>
>>
>> 1. The board design predates the patch acceptance policy.
>> 2. We don't have enough affordable Linux compatible platforms today.
>> 3. Allowing running an upstream kernel on D1 helps the RISC-V software
>> ecosystem to grow.
>>
>> No more exceptions will be allowed in future for such hardware that
>> violates the spec. Period.
>>
>> > Regards,
>> > Anup
>>
>>
>>
>> --
>> Regards,
>> Atish

--
Regards,
Atish

Philipp Tomsich Sept. 28, 2021, 9:45 a.m. UTC | #9

On Tue, 28 Sept 2021 at 01:05, Atish Patra <atishp@atishpatra.org> wrote:
>
> On Mon, Sep 27, 2021 at 1:53 PM Greg Favor <gfavor@ventanamicro.com> wrote:
> >
> > With the big caveat that I haven't been in the middle of this discussion, it seems like Allwinner D1's changes represent a custom (and nonconforming) extension.
>
> As per the v1.12 privilege specification, bit 63 is reserved for
> Svnapot extension while bit 60–54 are reserved for future standard
> use.
> D1's implementation uses both 60 and 63 bit for their custom "PBMT"
> extension in addition to bit 61 & 62 [1].
>
> > Isn't this just a matter of the patch needing to be treated as for a RISC-V custom extension per the recently clarified policy for handling upstreaming/etc. of custom extensions?  (Philipp can > speak to this clarified policy.)  Or what am I missing?
>
> Linux kernel upstream policy is yet to adopt that clarification as it
> was recently discussed at RVI meetings. Is there a written definition
> of non-conforming/custom/incompatible ?
> Moreover, as per the platform specification[2],
>
> A non-conforming extension that conflicts with a supported standard
> extensions must satisfy at least one of the following:
>   --- It must be disabled by default.
>   --- The supported standard extension must be declared as unsupported
> in all feature discovery structures used by software.
>       This option is allowed only if the standard extension is not required.

The terminology for those cases (and x-thead-v is the mental
test-case) would be "non-compliant and non-conflicting".
The "non-compliant" part comes from the fact that the specification
has been violated: in the case of x-thead-v by using the reserved
opcode space, in the case of the PTEs by using reserved bits.
The "non-conflicting" is based on a system still being able to operate
according to the specification, even if the non-compliant extension is
simply ignored (e.g. RVV is not mandatory and the opcodes for RVV are
not required to raise illegal insn exceptions, so these systems would
simply appear as not having RVV)…

Now, with their "custom" PBMT, we are pushing the boundaries of the
'non-conflicting' definition but are still within the same reasoning:
if we only signal sv39 and no PBMT-support, then the abuse of the PTE
bits does not conflict.  In the end, the 'non-conflicting' status will
hinge on whether we make svpbmt mandatory in the Platform (or the
referenced Profile).

In this particular case — given the importance of the D1 boards for
bootstrapping the software ecosystem — I would make the case that we
need to provide a provision in the Platforms (i.e. OS-A base) to
retain the 'non-conflicting' status (e.g. by mandating "some pbmt" —
with an application note stating that this will be restricted to
svpbmt in the next major revision of the Platforms ... and already
restricting it to svpbmt for the OS-A server extension).

> In this case, the custom non-conforming implementation can not be
> disabled or marked unsupported as it is critical for all the necessary
> I/O devices (usb, mmc, ethernet).
> Without this custom implementation support in upstream, we can not
> really use the mainline kernel for Allwinner D1.

I don't agree from a specification standpoint and from the
'non-conflicting' standpoint: whether or not device drivers can be
used, does not change the 'non-conflicting' status.
This is similar to the 'non-conflicting' status with x-thead-v: vector
instructions (as in "the standard RVV instructions") also can't be
used with a toolchain/libraries/OS that only implements RVV ... but
nonetheless, vendor-specific vector instructions are available.

I would argue the same for Alibaba's PBMT: I/O devices will not work,
unless a vendor-specific non-conflicting PBMT is used.

The brings us back to the requirements that I defined multiple times
for this: the implementation needs to be properly modularized and
quarantined as to not adversely impact compliant implementations.
If this can be assured, I will always argue for inclusion based on the
benefit to the ecosystem and reconciling the imminent fragmentation.
Or in other words: In what world does it make sense to encourage a
vendor of a board with significant market share to create a
vendor-fork of the kernel, if that vendor is willing to work with the
upstream?  Note that I normally am the first to argue on principle,
but have come to the conclusion that we are really between a rock and
a hard place on this one — and my priority is to keep the ecosystem
from fragmenting.

>> >> > We need to decide whether we should support the upstream kernel for D1. Few things to consider.
> >> > – Can it be considered as an errata ?
> >> > – Does it set a bad precedent and open can of worms in future ?
> >> > – Can we just ignore D1 given the mass volume ?
> >> >
> >> > One solution I can think of is that we allow this as an exception to the patch acceptance policy.
> >> > We need to explicitly specify this board as an exception because the policy was not in place during the design phase of the hardware.
> >> > At least, it protects us from accepting the incompatible changes in the future. Any other ideas ?

Anything we do needs to consider future impact and precedent.  But
that should not preclude us from making an exception, when it benefits
the ecosystem by preventing further fragmentation.
I consider the stated requirements of proper modularization
(quarantining the impact within the code base) and a precondition on
the vendor maintaining the changes, as a strong-enough discouragement
for any parties that might also consider to apply gross negligence to
the meaning of 'reserved'.

Philipp.

Nick Kossifidis Sept. 28, 2021, 1:19 p.m. UTC | #10

On 9/28/21 7:26 AM, Atish Patra wrote:
> On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote:
>>
>> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
>>>
>>> Στις 2021-09-27 23:13, Atish Patra έγραψε:
>>>>> We need to decide whether we should support the upstream kernel for
>>>>> D1. Few things to consider.
>>>>> – Can it be considered as an errata ?
>>>
>>> It's one thing to follow the spec and have an error in the
>>> implementation, and another to not follow the spec.
>>>
>>>>> – Does it set a bad precedent and open can of worms in future ?
>>>
>>> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and
>>> asking for MMU support, they 've also shipped many chips already. I can
>>> also imagine other vendors in the future coming up with implementations
>>> that violate the spec in which case handling the standard stuff will
>>> become messy and complex, and hurt performance/security. We'll end up
>>> filling the code with exceptions and tweaks all over the place. We need
>>> to be strict about what is "riscv" and what's "draft riscv" or "riscv
>>> inspired", and what we are willing to support upstream. I can understand
>>> supporting vendor extensions upstream but they need to fit within the
>>> standard spec, we can't have for example extensions that use encoding
>>> space/csrs/fields etc reserved for standard use, they may only use
>>> what's reserved for custom/vendor use. At least let's agree on that.
>>
>> Totally agree with Nick here. It's a slippery slope.
>>
>> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V
>> means future hardware which intentionally violates specs will also have to
>> be merged and the RISC-V patch acceptance policy will have no significance.
>>
>>>
>>>>> – Can we just ignore D1 given the mass volume ?
>>>>>
>>>
>>> IMHO no, we need to find a way to support it upstream but I believe
>>> there is another question to answer:
>>>
>>> Do we also guarantee "one image to rule them all" approach, required by
>>> binary distros, for implementations that violate the spec ? Are we ok
>>> for example to support Allwinner D1 upstream but require a custom
>>> configuration/build instead of supporting it with the "generic" image ?
>>> In one case we need to handle the violation at runtime and introduce
>>> overhead for everyone (like looking up __riscv_svpbmt every time we set
>>> a PTE in this case), in the other it's an #ifdef.
>>
>> At least, we should not have hardware violating specs as part of the
>> unified kernel image instead have these intentional deviations/violations
>> under separate kconfig which will not be enabled by default. This means
>> vendors (of such hardware) and distros will have to explicitly enable
>> support for such violations/deviations.
>>
> 
> If we merge the code and are not enabled by default, it would be a
> maintenance nightmare in future.
> These part of the kernel will not be regularly tested but we have to
> carry the changes for a long time.

I don't see a difference between having these features as part of the
generic image vs having them as custom configs/builds. The code will get
executed only on boards that support the custom/non-compliant
implementation anyway. To the contrary we'll have more code to test if
we are doing things at runtime vs at compile time.

> Similar changes will only grow over time causing a lot of custom
> configs that are not enabled by default.
> 

We'll have a lot of custom configs that will only get used on boards
that use them, vs runtime code that will run for no reason on every
board and choose the default/standard-compliant implementation most of
the time. In the end the code will only get tested on specific hardware
anyway.

> IMHO, if we want to support this board in upstream, we should just
> clearly state that it is one time special exception
> for this board only because of the following reasons
> 
> 1. The board design predates the patch acceptance policy.
> 2. We don't have enough affordable Linux compatible platforms today.
> 3. Allowing running an upstream kernel on D1 helps the RISC-V software
> ecosystem to grow.
> 

The same can be said for Kendryte as well, are we willing to also
support their MMU implementation on the generic image if a patch comes
in ? To be clear I'm not saying we shouldn't support D1 or Kendryte
upstream, I'm just saying that we shouldn't sacrifice the compleity and
performance of the code path for standard-compliant implementations, to
support non-compliant implementations, and instead support non-compliant
implementations with custom kernel builds using compile time options. It
still counts as upstream support, they won't have to maintain their own
forks. It'll also allow custom implementations to have more flexibility
on what they can do since they will be able to use completely
different/custom code paths, instead of trying to fit in the standard
code path (which will become a mess over time). I think this approach is
much more flexible and will allow more customizations to be supported
upstream in the future.

Philipp Tomsich Sept. 28, 2021, 1:46 p.m. UTC | #11

Nick,

On Tue, 28 Sept 2021 at 15:19, Nick Kossifidis <mick@ics.forth.gr> wrote:
>
> On 9/28/21 7:26 AM, Atish Patra wrote:
> > On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote:
> >>
> >> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
> >>>
> >>> Στις 2021-09-27 23:13, Atish Patra έγραψε:
> >>>>> We need to decide whether we should support the upstream kernel for
> >>>>> D1. Few things to consider.
> >>>>> – Can it be considered as an errata ?
> >>>
> >>> It's one thing to follow the spec and have an error in the
> >>> implementation, and another to not follow the spec.
> >>>
> >>>>> – Does it set a bad precedent and open can of worms in future ?
> >>>
> >>> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and
> >>> asking for MMU support, they 've also shipped many chips already. I can
> >>> also imagine other vendors in the future coming up with implementations
> >>> that violate the spec in which case handling the standard stuff will
> >>> become messy and complex, and hurt performance/security. We'll end up
> >>> filling the code with exceptions and tweaks all over the place. We need
> >>> to be strict about what is "riscv" and what's "draft riscv" or "riscv
> >>> inspired", and what we are willing to support upstream. I can understand
> >>> supporting vendor extensions upstream but they need to fit within the
> >>> standard spec, we can't have for example extensions that use encoding
> >>> space/csrs/fields etc reserved for standard use, they may only use
> >>> what's reserved for custom/vendor use. At least let's agree on that.
> >>
> >> Totally agree with Nick here. It's a slippery slope.
> >>
> >> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V
> >> means future hardware which intentionally violates specs will also have to
> >> be merged and the RISC-V patch acceptance policy will have no significance.
> >>
> >>>
> >>>>> – Can we just ignore D1 given the mass volume ?
> >>>>>
> >>>
> >>> IMHO no, we need to find a way to support it upstream but I believe
> >>> there is another question to answer:
> >>>
> >>> Do we also guarantee "one image to rule them all" approach, required by
> >>> binary distros, for implementations that violate the spec ? Are we ok
> >>> for example to support Allwinner D1 upstream but require a custom
> >>> configuration/build instead of supporting it with the "generic" image ?
> >>> In one case we need to handle the violation at runtime and introduce
> >>> overhead for everyone (like looking up __riscv_svpbmt every time we set
> >>> a PTE in this case), in the other it's an #ifdef.
> >>
> >> At least, we should not have hardware violating specs as part of the
> >> unified kernel image instead have these intentional deviations/violations
> >> under separate kconfig which will not be enabled by default. This means
> >> vendors (of such hardware) and distros will have to explicitly enable
> >> support for such violations/deviations.
> >>
> >
> > If we merge the code and are not enabled by default, it would be a
> > maintenance nightmare in future.
> > These part of the kernel will not be regularly tested but we have to
> > carry the changes for a long time.
>
> I don't see a difference between having these features as part of the
> generic image vs having them as custom configs/builds. The code will get
> executed only on boards that support the custom/non-compliant
> implementation anyway. To the contrary we'll have more code to test if
> we are doing things at runtime vs at compile time.
>
> > Similar changes will only grow over time causing a lot of custom
> > configs that are not enabled by default.
> >
>
> We'll have a lot of custom configs that will only get used on boards
> that use them, vs runtime code that will run for no reason on every
> board and choose the default/standard-compliant implementation most of
> the time. In the end the code will only get tested on specific hardware
> anyway.
>
> > IMHO, if we want to support this board in upstream, we should just
> > clearly state that it is one time special exception
> > for this board only because of the following reasons
> >
> > 1. The board design predates the patch acceptance policy.
> > 2. We don't have enough affordable Linux compatible platforms today.
> > 3. Allowing running an upstream kernel on D1 helps the RISC-V software
> > ecosystem to grow.
> >
>
> The same can be said for Kendryte as well, are we willing to also
> support their MMU implementation on the generic image if a patch comes
> in? To be clear I'm not saying we shouldn't support D1 or Kendryte
> upstream, I'm just saying that we shouldn't sacrifice the complexity and
> performance of the code path for standard-compliant implementations, to
> support non-compliant implementations, and instead support non-compliant
> implementations with custom kernel builds using compile time options. It

For priming the pump on the software effort, having a solution that is enabled
on distro-builds is clearly preferable — that leads to the solution that Palmer
had outlined at LPC, which is to have a KCONFIG option that enables the
alternate code paths and can be turned off for embedded use-cases.

> still counts as upstream support, they won't have to maintain their own
> forks. It'll also allow custom implementations to have more flexibility
> on what they can do since they will be able to use completely
> different/custom code paths, instead of trying to fit in the standard
> code path (which will become a mess over time). I think this approach is
> much more flexible and will allow more customizations to be supported
> upstream in the future.

The important detail will be the ground rules: changes have to be sufficiently
quarantined that (a) they can be turned off, (b) can be reverted easily (in case
that vendors fail to perform their maintenance obligations), and (c) they don't
affect the performance and complexity of the standard code paths.

Cheers,
Philipp.

Alexandre Ghiti Sept. 28, 2021, 2:58 p.m. UTC | #12

On Tue, Sep 28, 2021 at 3:48 PM Philipp Tomsich
<philipp.tomsich@vrull.eu> wrote:
>
> Nick,
>
> On Tue, 28 Sept 2021 at 15:19, Nick Kossifidis <mick@ics.forth.gr> wrote:
> >
> > On 9/28/21 7:26 AM, Atish Patra wrote:
> > > On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote:
> > >>
> > >> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
> > >>>
> > >>> Στις 2021-09-27 23:13, Atish Patra έγραψε:
> > >>>>> We need to decide whether we should support the upstream kernel for
> > >>>>> D1. Few things to consider.
> > >>>>> – Can it be considered as an errata ?
> > >>>
> > >>> It's one thing to follow the spec and have an error in the
> > >>> implementation, and another to not follow the spec.
> > >>>
> > >>>>> – Does it set a bad precedent and open can of worms in future ?
> > >>>
> > >>> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and
> > >>> asking for MMU support, they 've also shipped many chips already. I can
> > >>> also imagine other vendors in the future coming up with implementations
> > >>> that violate the spec in which case handling the standard stuff will
> > >>> become messy and complex, and hurt performance/security. We'll end up
> > >>> filling the code with exceptions and tweaks all over the place. We need
> > >>> to be strict about what is "riscv" and what's "draft riscv" or "riscv
> > >>> inspired", and what we are willing to support upstream. I can understand
> > >>> supporting vendor extensions upstream but they need to fit within the
> > >>> standard spec, we can't have for example extensions that use encoding
> > >>> space/csrs/fields etc reserved for standard use, they may only use
> > >>> what's reserved for custom/vendor use. At least let's agree on that.
> > >>
> > >> Totally agree with Nick here. It's a slippery slope.
> > >>
> > >> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V
> > >> means future hardware which intentionally violates specs will also have to
> > >> be merged and the RISC-V patch acceptance policy will have no significance.
> > >>
> > >>>
> > >>>>> – Can we just ignore D1 given the mass volume ?
> > >>>>>
> > >>>
> > >>> IMHO no, we need to find a way to support it upstream but I believe
> > >>> there is another question to answer:
> > >>>
> > >>> Do we also guarantee "one image to rule them all" approach, required by
> > >>> binary distros, for implementations that violate the spec ? Are we ok
> > >>> for example to support Allwinner D1 upstream but require a custom
> > >>> configuration/build instead of supporting it with the "generic" image ?
> > >>> In one case we need to handle the violation at runtime and introduce
> > >>> overhead for everyone (like looking up __riscv_svpbmt every time we set
> > >>> a PTE in this case), in the other it's an #ifdef.
> > >>
> > >> At least, we should not have hardware violating specs as part of the
> > >> unified kernel image instead have these intentional deviations/violations
> > >> under separate kconfig which will not be enabled by default. This means
> > >> vendors (of such hardware) and distros will have to explicitly enable
> > >> support for such violations/deviations.
> > >>
> > >
> > > If we merge the code and are not enabled by default, it would be a
> > > maintenance nightmare in future.
> > > These part of the kernel will not be regularly tested but we have to
> > > carry the changes for a long time.
> >
> > I don't see a difference between having these features as part of the
> > generic image vs having them as custom configs/builds. The code will get
> > executed only on boards that support the custom/non-compliant
> > implementation anyway. To the contrary we'll have more code to test if
> > we are doing things at runtime vs at compile time.
> >
> > > Similar changes will only grow over time causing a lot of custom
> > > configs that are not enabled by default.
> > >
> >
> > We'll have a lot of custom configs that will only get used on boards
> > that use them, vs runtime code that will run for no reason on every
> > board and choose the default/standard-compliant implementation most of
> > the time. In the end the code will only get tested on specific hardware
> > anyway.
> >
> > > IMHO, if we want to support this board in upstream, we should just
> > > clearly state that it is one time special exception
> > > for this board only because of the following reasons
> > >
> > > 1. The board design predates the patch acceptance policy.
> > > 2. We don't have enough affordable Linux compatible platforms today.
> > > 3. Allowing running an upstream kernel on D1 helps the RISC-V software
> > > ecosystem to grow.
> > >
> >
> > The same can be said for Kendryte as well, are we willing to also
> > support their MMU implementation on the generic image if a patch comes
> > in? To be clear I'm not saying we shouldn't support D1 or Kendryte
> > upstream, I'm just saying that we shouldn't sacrifice the complexity and
> > performance of the code path for standard-compliant implementations, to
> > support non-compliant implementations, and instead support non-compliant
> > implementations with custom kernel builds using compile time options. It
>
> For priming the pump on the software effort, having a solution that is enabled
> on distro-builds is clearly preferable — that leads to the solution that Palmer
> had outlined at LPC, which is to have a KCONFIG option that enables the
> alternate code paths and can be turned off for embedded use-cases.
>
> > still counts as upstream support, they won't have to maintain their own
> > forks. It'll also allow custom implementations to have more flexibility
> > on what they can do since they will be able to use completely
> > different/custom code paths, instead of trying to fit in the standard
> > code path (which will become a mess over time). I think this approach is
> > much more flexible and will allow more customizations to be supported
> > upstream in the future.
>
> The important detail will be the ground rules: changes have to be sufficiently
> quarantined that (a) they can be turned off, (b) can be reverted easily (in case
> that vendors fail to perform their maintenance obligations),

Can we really remove support once it is in and widely used?

> and (c) they don't
> affect the performance and complexity of the standard code paths.
>
> Cheers,
> Philipp.
>
> _______________________________________________
> linux-riscv mailing list
> linux-riscv@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-riscv

Palmer Dabbelt Oct. 5, 2021, 12:44 a.m. UTC | #13

On Tue, 28 Sep 2021 07:58:51 PDT (-0700), alexandre.ghiti@canonical.com wrote:
> On Tue, Sep 28, 2021 at 3:48 PM Philipp Tomsich
> <philipp.tomsich@vrull.eu> wrote:
>>
>> Nick,
>>
>> On Tue, 28 Sept 2021 at 15:19, Nick Kossifidis <mick@ics.forth.gr> wrote:
>> >
>> > On 9/28/21 7:26 AM, Atish Patra wrote:
>> > > On Mon, Sep 27, 2021 at 8:50 PM Anup Patel <anup@brainfault.org> wrote:
>> > >>
>> > >> On Tue, Sep 28, 2021 at 6:32 AM Nick Kossifidis <mick@ics.forth.gr> wrote:
>> > >>>
>> > >>> Στις 2021-09-27 23:13, Atish Patra έγραψε:
>> > >>>>> We need to decide whether we should support the upstream kernel for
>> > >>>>> D1. Few things to consider.
>> > >>>>> – Can it be considered as an errata ?
>> > >>>
>> > >>> It's one thing to follow the spec and have an error in the
>> > >>> implementation, and another to not follow the spec.
>> > >>>
>> > >>>>> – Does it set a bad precedent and open can of worms in future ?
>> > >>>
>> > >>> IMHO yes, I'm thinking of Kendryte 210 devs for example coming up and
>> > >>> asking for MMU support, they 've also shipped many chips already. I can
>> > >>> also imagine other vendors in the future coming up with implementations
>> > >>> that violate the spec in which case handling the standard stuff will
>> > >>> become messy and complex, and hurt performance/security. We'll end up
>> > >>> filling the code with exceptions and tweaks all over the place. We need
>> > >>> to be strict about what is "riscv" and what's "draft riscv" or "riscv
>> > >>> inspired", and what we are willing to support upstream. I can understand
>> > >>> supporting vendor extensions upstream but they need to fit within the
>> > >>> standard spec, we can't have for example extensions that use encoding
>> > >>> space/csrs/fields etc reserved for standard use, they may only use
>> > >>> what's reserved for custom/vendor use. At least let's agree on that.
>> > >>
>> > >> Totally agree with Nick here. It's a slippery slope.
>> > >>
>> > >> Including D1 PTE bits (or Kendryte K210 MMU) part of the Linux RISC-V
>> > >> means future hardware which intentionally violates specs will also have to
>> > >> be merged and the RISC-V patch acceptance policy will have no significance.
>> > >>
>> > >>>
>> > >>>>> – Can we just ignore D1 given the mass volume ?
>> > >>>>>
>> > >>>
>> > >>> IMHO no, we need to find a way to support it upstream but I believe
>> > >>> there is another question to answer:
>> > >>>
>> > >>> Do we also guarantee "one image to rule them all" approach, required by
>> > >>> binary distros, for implementations that violate the spec ? Are we ok
>> > >>> for example to support Allwinner D1 upstream but require a custom
>> > >>> configuration/build instead of supporting it with the "generic" image ?
>> > >>> In one case we need to handle the violation at runtime and introduce
>> > >>> overhead for everyone (like looking up __riscv_svpbmt every time we set
>> > >>> a PTE in this case), in the other it's an #ifdef.
>> > >>
>> > >> At least, we should not have hardware violating specs as part of the
>> > >> unified kernel image instead have these intentional deviations/violations
>> > >> under separate kconfig which will not be enabled by default. This means
>> > >> vendors (of such hardware) and distros will have to explicitly enable
>> > >> support for such violations/deviations.
>> > >>
>> > >
>> > > If we merge the code and are not enabled by default, it would be a
>> > > maintenance nightmare in future.
>> > > These part of the kernel will not be regularly tested but we have to
>> > > carry the changes for a long time.
>> >
>> > I don't see a difference between having these features as part of the
>> > generic image vs having them as custom configs/builds. The code will get
>> > executed only on boards that support the custom/non-compliant
>> > implementation anyway. To the contrary we'll have more code to test if
>> > we are doing things at runtime vs at compile time.
>> >
>> > > Similar changes will only grow over time causing a lot of custom
>> > > configs that are not enabled by default.
>> > >
>> >
>> > We'll have a lot of custom configs that will only get used on boards
>> > that use them, vs runtime code that will run for no reason on every
>> > board and choose the default/standard-compliant implementation most of
>> > the time. In the end the code will only get tested on specific hardware
>> > anyway.
>> >
>> > > IMHO, if we want to support this board in upstream, we should just
>> > > clearly state that it is one time special exception
>> > > for this board only because of the following reasons
>> > >
>> > > 1. The board design predates the patch acceptance policy.
>> > > 2. We don't have enough affordable Linux compatible platforms today.
>> > > 3. Allowing running an upstream kernel on D1 helps the RISC-V software
>> > > ecosystem to grow.
>> > >
>> >
>> > The same can be said for Kendryte as well, are we willing to also
>> > support their MMU implementation on the generic image if a patch comes
>> > in? To be clear I'm not saying we shouldn't support D1 or Kendryte
>> > upstream, I'm just saying that we shouldn't sacrifice the complexity and
>> > performance of the code path for standard-compliant implementations, to
>> > support non-compliant implementations, and instead support non-compliant
>> > implementations with custom kernel builds using compile time options. It
>>
>> For priming the pump on the software effort, having a solution that is enabled
>> on distro-builds is clearly preferable — that leads to the solution that Palmer
>> had outlined at LPC, which is to have a KCONFIG option that enables the
>> alternate code paths and can be turned off for embedded use-cases.
>>
>> > still counts as upstream support, they won't have to maintain their own
>> > forks. It'll also allow custom implementations to have more flexibility
>> > on what they can do since they will be able to use completely
>> > different/custom code paths, instead of trying to fit in the standard
>> > code path (which will become a mess over time). I think this approach is
>> > much more flexible and will allow more customizations to be supported
>> > upstream in the future.
>>
>> The important detail will be the ground rules: changes have to be sufficiently
>> quarantined that (a) they can be turned off, (b) can be reverted easily (in case
>> that vendors fail to perform their maintenance obligations),
>
> Can we really remove support once it is in and widely used?

We'll follow the standard deprecation policies for anything I have any 
say over, which in the kernel I've always heard described as 
forever-ish.  Since this is pretty coupled to a specific chip one could 
imagine deprecating it when we can convince ourselves those chips have 
all had their smoke let out, but that's a decade timescale sort of 
thing.

>> and (c) they don't
>> affect the performance and complexity of the standard code paths.
>>
>> Cheers,
>> Philipp.
>>
>> _______________________________________________
>> linux-riscv mailing list
>> linux-riscv@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-riscv

[V2,1/2] riscv: Add RISC-V svpbmt extension

Commit Message

Comments

Patch