[v7] target/riscv: Add support to access ctrsource, ctrtarget, ctrdata regs.

Message ID 20250212-b4-ctr_upstream_v6-v7-1-4e8159ea33bf@rivosinc.com (mailing list archive)
State New

Commit Message

Rajnesh Kanwal Feb. 12, 2025, 10:18 a.m. UTC
CTR entries are accessed through the ctrsource, ctrtarget and ctrdata
registers using the smcsrind/sscsrind extension. This commit extends
the csrind extension to support the CTR registers.

ctrsource is accessible through the xireg CSR, ctrtarget through
xireg2 and ctrdata through xireg3.

CTR supports a maximum depth of 256 entries, which are accessed using
the xiselect range 0x200 to 0x2ff.
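
As a rough illustration of the selector window described above, the
sketch below maps an xiselect value to a logical CTR entry number and
decodes the depth field. The macro and function names are made up for
this example; only the 0x200-0x2ff window and the "16 << field" depth
encoding are taken from the patch.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed names; the values follow the 0x200-0x2ff range quoted above. */
#define EX_CTR_SELECT_FIRST 0x200u
#define EX_CTR_SELECT_LAST  0x2ffu

/* True when an xiselect value falls inside the CTR entry window. */
static bool ex_ctr_select_in_range(uint32_t isel)
{
    return isel >= EX_CTR_SELECT_FIRST && isel <= EX_CTR_SELECT_LAST;
}

/* Logical entry number (0 = most recent record) for an in-range selector. */
static uint32_t ex_ctr_select_to_entry(uint32_t isel)
{
    return isel - EX_CTR_SELECT_FIRST;
}

/* Depth encoding used by the patch: 16 << field, so field 4 gives 256. */
static uint32_t ex_ctr_depth_from_field(uint32_t depth_field)
{
    return 16u << depth_field;
}
```

With the maximum depth of 256, the last selector 0x2ff corresponds to
entry 255. With a smaller configured depth, the patch treats entries at
or beyond the depth as read-only zero.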

This commit also adds properties to enable the CTR extension. CTR can
now be enabled using smctr=true and ssctr=true.

Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
Acked-by: Alistair Francis <alistair.francis@wdc.com>
---
This series enables Control Transfer Records (CTR) extension support on the
RISC-V platform. This extension is similar to Arch LBR in x86 and BRBE in ARM.
The extension has been ratified and this series is based on v1.0 [0].

CTR extension depends on both the implementation of S-mode and the Sscsrind
extension v1.0.0 [1]. CTR accesses the ctrsource, ctrtarget and ctrdata CSRs
using the Sscsrind extension.

The series is based on Smcdeleg/Ssccfg counter delegation extension [2]
patches [3]. CTR itself doesn't depend on counter delegation support. This
rebase is basically to include the Smcsrind patches.

Here is a link to a quick start guide [4] to set up and run a basic perf demo
on Linux using the CTR extension.

QEMU patches can be found here:
https://github.com/rajnesh-kanwal/qemu/tree/b4/ctr_upstream_v7

OpenSBI patch can be found here:
https://github.com/rajnesh-kanwal/opensbi/tree/ctr_upstream_v2

Linux kernel patches can be found here:
https://github.com/rajnesh-kanwal/linux/tree/b4/ctr_upstream_v2

[0]: https://github.com/riscv/riscv-control-transfer-records/releases/tag/v1.0
[1]: https://github.com/riscvarchive/riscv-indirect-csr-access/releases/tag/v1.0.0
[2]: https://github.com/riscvarchive/riscv-smcdeleg-ssccfg/releases/tag/v1.0.0
[3]: https://lore.kernel.org/qemu-riscv/20241203-counter_delegation-v4-0-c12a89baed86@rivosinc.com/
[4]: https://github.com/rajnesh-kanwal/linux/wiki/Running-CTR-basic-demo-on-QEMU-RISC%E2%80%90V-Virt-machine
---
Changes in v7:
v7: Rebased on latest riscv-to-apply.next. Given 6 out of 7 patches
    are already in riscv-to-apply.next, this version only contains the
    last patch which failed to apply.

v6: Rebased on latest riscv-to-apply.for-upstream.
  - https://lore.kernel.org/qemu-devel/20250205-b4-ctr_upstream_v6-v6-0-439d8e06c8ef@rivosinc.com

v5: Improvements based on Richard Henderson's feedback.
  - Fixed code gen logic to use gen_update_pc() instead of
    tcg_constant_tl().
  - Some function renaming.
  - Rebased onto v4 of counter delegation series.
  - https://lore.kernel.org/qemu-riscv/20241205-b4-ctr_upstream_v3-v5-0-60b993aa567d@rivosinc.com/

v4: Improvements based on Richard Henderson's feedback.
  - Refactored CTR related code generation to move more code into
    translation side and avoid unnecessary code execution in generated
    code.
  - Added missing code in machine.c to migrate the new state.
  - https://lore.kernel.org/r/20241204-b4-ctr_upstream_v3-v4-0-d3ce6bef9432@rivosinc.com

v3: Improvements based on Jason Chien and Frank Chang's feedback.
  - Created a single set of macros for CTR CSRs in cpu_bits.h
  - Some fixes in riscv_ctr_add_entry.
  - Return zero for vs/sireg4-6 for CTR 0x200 to 0x2ff range.
  - Improved extension dependency check.
  - Fixed invalid ctrctl csr selection bug in riscv_ctr_freeze.
  - Added implied rules for Smctr and Ssctr.
  - Added missing SMSTATEEN0_CTR bit in mstateen0 and hstateen0 write ops.
  - Some more cosmetic changes.
  - https://lore.kernel.org/qemu-riscv/20241104-b4-ctr_upstream_v3-v3-0-32fd3c48205f@rivosinc.com/

v2: Lots of improvements based on Jason Chien's feedback including:
  - Added CTR recording for cm.jalt, cm.jt, cm.popret, cm.popretz.
  - Fixed and added more CTR extension enable checks.
  - Fixed CTR CSR predicate functions.
  - Fixed external trap xTE bit checks.
  - One fix in freeze function for VS-mode.
  - Lots of minor code improvements.
  - Added checks in sctrclr instruction helper.
  - https://lore.kernel.org/qemu-riscv/20240619152708.135991-1-rkanwal@rivosinc.com/

v1:
  - https://lore.kernel.org/qemu-riscv/20240529160950.132754-1-rkanwal@rivosinc.com/
---
 target/riscv/cpu.c         |  26 +++++++-
 target/riscv/csr.c         | 150 ++++++++++++++++++++++++++++++++++++++++++++-
 target/riscv/tcg/tcg-cpu.c |  11 ++++
 3 files changed, 185 insertions(+), 2 deletions(-)


---
base-commit: 485adaaf6657dd5070dbefed593b2923a397a63f
change-id: 20250205-b4-ctr_upstream_v6-71418cd245ee

Best regards,

Comments

Alistair Francis Feb. 17, 2025, 5:24 a.m. UTC | #1
On Wed, Feb 12, 2025 at 8:20 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
>
> CTR entries are accessed through the ctrsource, ctrtarget and ctrdata
> registers using the smcsrind/sscsrind extension. This commit extends
> the csrind extension to support the CTR registers.
>
> ctrsource is accessible through the xireg CSR, ctrtarget through
> xireg2 and ctrdata through xireg3.
>
> CTR supports a maximum depth of 256 entries, which are accessed using
> the xiselect range 0x200 to 0x2ff.
>
> This commit also adds properties to enable the CTR extension. CTR can
> now be enabled using smctr=true and ssctr=true.
>
> Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> Acked-by: Alistair Francis <alistair.francis@wdc.com>

Thanks!

Applied to riscv-to-apply.next

Alistair

> ---
> This series enables Control Transfer Records (CTR) extension support on the
> RISC-V platform. This extension is similar to Arch LBR in x86 and BRBE in ARM.
> The extension has been ratified and this series is based on v1.0 [0].
>
> CTR extension depends on both the implementation of S-mode and the Sscsrind
> extension v1.0.0 [1]. CTR accesses the ctrsource, ctrtarget and ctrdata CSRs
> using the Sscsrind extension.
>
> The series is based on Smcdeleg/Ssccfg counter delegation extension [2]
> patches [3]. CTR itself doesn't depend on counter delegation support. This
> rebase is basically to include the Smcsrind patches.
>
> Here is a link to a quick start guide [4] to set up and run a basic perf demo
> on Linux using the CTR extension.
>
> QEMU patches can be found here:
> https://github.com/rajnesh-kanwal/qemu/tree/b4/ctr_upstream_v7
>
> OpenSBI patch can be found here:
> https://github.com/rajnesh-kanwal/opensbi/tree/ctr_upstream_v2
>
> Linux kernel patches can be found here:
> https://github.com/rajnesh-kanwal/linux/tree/b4/ctr_upstream_v2
>
> [0]: https://github.com/riscv/riscv-control-transfer-records/releases/tag/v1.0
> [1]: https://github.com/riscvarchive/riscv-indirect-csr-access/releases/tag/v1.0.0
> [2]: https://github.com/riscvarchive/riscv-smcdeleg-ssccfg/releases/tag/v1.0.0
> [3]: https://lore.kernel.org/qemu-riscv/20241203-counter_delegation-v4-0-c12a89baed86@rivosinc.com/
> [4]: https://github.com/rajnesh-kanwal/linux/wiki/Running-CTR-basic-demo-on-QEMU-RISC%E2%80%90V-Virt-machine
> ---
> Changes in v7:
> v7: Rebased on latest riscv-to-apply.next. Given 6 out of 7 patches
>     are already in riscv-to-apply.next, this version only contains the
>     last patch which failed to apply.
>
> v6: Rebased on latest riscv-to-apply.for-upstream.
>   - https://lore.kernel.org/qemu-devel/20250205-b4-ctr_upstream_v6-v6-0-439d8e06c8ef@rivosinc.com
>
> v5: Improvements based on Richard Henderson's feedback.
>   - Fixed code gen logic to use gen_update_pc() instead of
>     tcg_constant_tl().
>   - Some function renaming.
>   - Rebased onto v4 of counter delegation series.
>   - https://lore.kernel.org/qemu-riscv/20241205-b4-ctr_upstream_v3-v5-0-60b993aa567d@rivosinc.com/
>
> v4: Improvements based on Richard Henderson's feedback.
>   - Refactored CTR related code generation to move more code into
>     translation side and avoid unnecessary code execution in generated
>     code.
>   - Added missing code in machine.c to migrate the new state.
>   - https://lore.kernel.org/r/20241204-b4-ctr_upstream_v3-v4-0-d3ce6bef9432@rivosinc.com
>
> v3: Improvements based on Jason Chien and Frank Chang's feedback.
>   - Created a single set of macros for CTR CSRs in cpu_bits.h
>   - Some fixes in riscv_ctr_add_entry.
>   - Return zero for vs/sireg4-6 for CTR 0x200 to 0x2ff range.
>   - Improved extension dependency check.
>   - Fixed invalid ctrctl csr selection bug in riscv_ctr_freeze.
>   - Added implied rules for Smctr and Ssctr.
>   - Added missing SMSTATEEN0_CTR bit in mstateen0 and hstateen0 write ops.
>   - Some more cosmetic changes.
>   - https://lore.kernel.org/qemu-riscv/20241104-b4-ctr_upstream_v3-v3-0-32fd3c48205f@rivosinc.com/
>
> v2: Lots of improvements based on Jason Chien's feedback including:
>   - Added CTR recording for cm.jalt, cm.jt, cm.popret, cm.popretz.
>   - Fixed and added more CTR extension enable checks.
>   - Fixed CTR CSR predicate functions.
>   - Fixed external trap xTE bit checks.
>   - One fix in freeze function for VS-mode.
>   - Lots of minor code improvements.
>   - Added checks in sctrclr instruction helper.
>   - https://lore.kernel.org/qemu-riscv/20240619152708.135991-1-rkanwal@rivosinc.com/
>
> v1:
>   - https://lore.kernel.org/qemu-riscv/20240529160950.132754-1-rkanwal@rivosinc.com/
> ---
>  target/riscv/cpu.c         |  26 +++++++-
>  target/riscv/csr.c         | 150 ++++++++++++++++++++++++++++++++++++++++++++-
>  target/riscv/tcg/tcg-cpu.c |  11 ++++
>  3 files changed, 185 insertions(+), 2 deletions(-)
>
> diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> index 8264c81e889424dfd491cec0ef95eeffc8fcc5b6..522d6584e4c3be7070e5a59f70f5948be8196a77 100644
> --- a/target/riscv/cpu.c
> +++ b/target/riscv/cpu.c
> @@ -216,6 +216,8 @@ const RISCVIsaExtData isa_edata_arr[] = {
>      ISA_EXT_DATA_ENTRY(ssu64xl, PRIV_VERSION_1_12_0, has_priv_1_12),
>      ISA_EXT_DATA_ENTRY(supm, PRIV_VERSION_1_13_0, ext_supm),
>      ISA_EXT_DATA_ENTRY(svade, PRIV_VERSION_1_11_0, ext_svade),
> +    ISA_EXT_DATA_ENTRY(smctr, PRIV_VERSION_1_12_0, ext_smctr),
> +    ISA_EXT_DATA_ENTRY(ssctr, PRIV_VERSION_1_12_0, ext_ssctr),
>      ISA_EXT_DATA_ENTRY(svadu, PRIV_VERSION_1_12_0, ext_svadu),
>      ISA_EXT_DATA_ENTRY(svinval, PRIV_VERSION_1_12_0, ext_svinval),
>      ISA_EXT_DATA_ENTRY(svnapot, PRIV_VERSION_1_12_0, ext_svnapot),
> @@ -1599,6 +1601,8 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
>      MULTI_EXT_CFG_BOOL("smcdeleg", ext_smcdeleg, false),
>      MULTI_EXT_CFG_BOOL("sscsrind", ext_sscsrind, false),
>      MULTI_EXT_CFG_BOOL("ssccfg", ext_ssccfg, false),
> +    MULTI_EXT_CFG_BOOL("smctr", ext_smctr, false),
> +    MULTI_EXT_CFG_BOOL("ssctr", ext_ssctr, false),
>      MULTI_EXT_CFG_BOOL("zifencei", ext_zifencei, true),
>      MULTI_EXT_CFG_BOOL("zicfilp", ext_zicfilp, false),
>      MULTI_EXT_CFG_BOOL("zicfiss", ext_zicfiss, false),
> @@ -2863,6 +2867,26 @@ static RISCVCPUImpliedExtsRule SSPM_IMPLIED = {
>      },
>  };
>
> +static RISCVCPUImpliedExtsRule SMCTR_IMPLIED = {
> +    .ext = CPU_CFG_OFFSET(ext_smctr),
> +    .implied_misa_exts = RVS,
> +    .implied_multi_exts = {
> +        CPU_CFG_OFFSET(ext_sscsrind),
> +
> +        RISCV_IMPLIED_EXTS_RULE_END
> +    },
> +};
> +
> +static RISCVCPUImpliedExtsRule SSCTR_IMPLIED = {
> +    .ext = CPU_CFG_OFFSET(ext_ssctr),
> +    .implied_misa_exts = RVS,
> +    .implied_multi_exts = {
> +        CPU_CFG_OFFSET(ext_sscsrind),
> +
> +        RISCV_IMPLIED_EXTS_RULE_END
> +    },
> +};
> +
>  RISCVCPUImpliedExtsRule *riscv_misa_ext_implied_rules[] = {
>      &RVA_IMPLIED, &RVD_IMPLIED, &RVF_IMPLIED,
>      &RVM_IMPLIED, &RVV_IMPLIED, NULL
> @@ -2881,7 +2905,7 @@ RISCVCPUImpliedExtsRule *riscv_multi_ext_implied_rules[] = {
>      &ZVFH_IMPLIED, &ZVFHMIN_IMPLIED, &ZVKN_IMPLIED,
>      &ZVKNC_IMPLIED, &ZVKNG_IMPLIED, &ZVKNHB_IMPLIED,
>      &ZVKS_IMPLIED,  &ZVKSC_IMPLIED, &ZVKSG_IMPLIED, &SSCFG_IMPLIED,
> -    &SUPM_IMPLIED, &SSPM_IMPLIED,
> +    &SUPM_IMPLIED, &SSPM_IMPLIED, &SMCTR_IMPLIED, &SSCTR_IMPLIED,
>      NULL
>  };
>
> diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> index a62c50f057f487753a79393306641d3e50085ee5..d0068ce98c156abd67b7d08f94f29edb957143bd 100644
> --- a/target/riscv/csr.c
> +++ b/target/riscv/csr.c
> @@ -2431,6 +2431,13 @@ static bool xiselect_cd_range(target_ulong isel)
>      return (ISELECT_CD_FIRST <= isel && isel <= ISELECT_CD_LAST);
>  }
>
> +static bool xiselect_ctr_range(int csrno, target_ulong isel)
> +{
> +    /* MIREG-MIREG6 for the range 0x200-0x2ff are not used by CTR. */
> +    return CTR_ENTRIES_FIRST <= isel && isel <= CTR_ENTRIES_LAST &&
> +           csrno < CSR_MIREG;
> +}
> +
>  static int rmw_iprio(target_ulong xlen,
>                       target_ulong iselect, uint8_t *iprio,
>                       target_ulong *val, target_ulong new_val,
> @@ -2476,6 +2483,124 @@ static int rmw_iprio(target_ulong xlen,
>      return 0;
>  }
>
> +static int rmw_ctrsource(CPURISCVState *env, int isel, target_ulong *val,
> +                          target_ulong new_val, target_ulong wr_mask)
> +{
> +    /*
> +     * CTR arrays are treated as circular buffers and TOS always points to next
> +     * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
> +     * 0 is always the latest one, traversal is a bit different here. See the
> +     * below example.
> +     *
> +     * Depth = 16.
> +     *
> +     * idx    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
> +     * TOS                                 H
> +     * entry   6   5   4   3   2   1   0   F   E   D   C   B   A   9   8   7
> +     */
> +    const uint64_t entry = isel - CTR_ENTRIES_FIRST;
> +    const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
> +    uint64_t idx;
> +
> +    /* Entry greater than depth-1 is read-only zero */
> +    if (entry >= depth) {
> +        if (val) {
> +            *val = 0;
> +        }
> +        return 0;
> +    }
> +
> +    idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
> +    idx = (idx - entry - 1) & (depth - 1);
> +
> +    if (val) {
> +        *val = env->ctr_src[idx];
> +    }
> +
> +    env->ctr_src[idx] = (env->ctr_src[idx] & ~wr_mask) | (new_val & wr_mask);
> +
> +    return 0;
> +}
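
The index arithmetic in the comment above can be checked with a small
standalone sketch. It assumes the write pointer (WRPTR) is 7, which
matches the "H" marker in the diagram, and a depth of 16. The helper
name is hypothetical; only the expression itself is from the patch.

```c
#include <stdint.h>

/*
 * Physical slot for a logical entry, mirroring the patch's
 *     idx = (wrptr - entry - 1) & (depth - 1)
 * depth must be a power of two so that the mask acts as a modulo,
 * and the unsigned subtraction is allowed to wrap around.
 */
static uint64_t ex_ctr_slot(uint64_t wrptr, uint64_t entry, uint64_t depth)
{
    return (wrptr - entry - 1) & (depth - 1);
}
```

With wrptr = 7 and depth = 16, entry 0 lands in slot 6, entry 6 in
slot 0, entry 7 wraps to slot 0xF and entry 0xF ends up in slot 7,
which agrees with the idx/entry rows of the diagram.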
> +
> +static int rmw_ctrtarget(CPURISCVState *env, int isel, target_ulong *val,
> +                          target_ulong new_val, target_ulong wr_mask)
> +{
> +    /*
> +     * CTR arrays are treated as circular buffers and TOS always points to next
> +     * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
> +     * 0 is always the latest one, traversal is a bit different here. See the
> +     * below example.
> +     *
> +     * Depth = 16.
> +     *
> +     * idx    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
> +     * head                                H
> +     * entry   6   5   4   3   2   1   0   F   E   D   C   B   A   9   8   7
> +     */
> +    const uint64_t entry = isel - CTR_ENTRIES_FIRST;
> +    const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
> +    uint64_t idx;
> +
> +    /* Entry greater than depth-1 is read-only zero */
> +    if (entry >= depth) {
> +        if (val) {
> +            *val = 0;
> +        }
> +        return 0;
> +    }
> +
> +    idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
> +    idx = (idx - entry - 1) & (depth - 1);
> +
> +    if (val) {
> +        *val = env->ctr_dst[idx];
> +    }
> +
> +    env->ctr_dst[idx] = (env->ctr_dst[idx] & ~wr_mask) | (new_val & wr_mask);
> +
> +    return 0;
> +}
> +
> +static int rmw_ctrdata(CPURISCVState *env, int isel, target_ulong *val,
> +                        target_ulong new_val, target_ulong wr_mask)
> +{
> +    /*
> +     * CTR arrays are treated as circular buffers and TOS always points to next
> +     * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
> +     * 0 is always the latest one, traversal is a bit different here. See the
> +     * below example.
> +     *
> +     * Depth = 16.
> +     *
> +     * idx    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
> +     * head                                H
> +     * entry   6   5   4   3   2   1   0   F   E   D   C   B   A   9   8   7
> +     */
> +    const uint64_t entry = isel - CTR_ENTRIES_FIRST;
> +    const uint64_t mask = wr_mask & CTRDATA_MASK;
> +    const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
> +    uint64_t idx;
> +
> +    /* Entry greater than depth-1 is read-only zero */
> +    if (entry >= depth) {
> +        if (val) {
> +            *val = 0;
> +        }
> +        return 0;
> +    }
> +
> +    idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
> +    idx = (idx - entry - 1) & (depth - 1);
> +
> +    if (val) {
> +        *val = env->ctr_data[idx];
> +    }
> +
> +    env->ctr_data[idx] = (env->ctr_data[idx] & ~mask) | (new_val & mask);
> +
> +    return 0;
> +}
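
For readers unfamiliar with the read-modify-write idiom used by the
three helpers, here is a minimal sketch of the masked update. In
rmw_ctrdata() the caller's write mask is additionally restricted by
CTRDATA_MASK; the field mask passed in below is only a stand-in, since
the real CTRDATA_MASK value is not shown in this patch.

```c
#include <stdint.h>

/*
 * Update only the bits selected by both the caller's write mask and the
 * register's writable-field mask; all other bits keep their old value.
 * field_mask is a placeholder for something like CTRDATA_MASK.
 */
static uint64_t ex_masked_write(uint64_t old, uint64_t new_val,
                                uint64_t wr_mask, uint64_t field_mask)
{
    const uint64_t mask = wr_mask & field_mask;

    return (old & ~mask) | (new_val & mask);
}
```

A wr_mask of zero therefore leaves the stored value unchanged, which is
how a pure CSR read passes through the same code path.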
> +
>  static RISCVException rmw_xireg_aia(CPURISCVState *env, int csrno,
>                           target_ulong isel, target_ulong *val,
>                           target_ulong new_val, target_ulong wr_mask)
> @@ -2628,6 +2753,27 @@ done:
>      return ret;
>  }
>
> +static int rmw_xireg_ctr(CPURISCVState *env, int csrno,
> +                        target_ulong isel, target_ulong *val,
> +                        target_ulong new_val, target_ulong wr_mask)
> +{
> +    if (!riscv_cpu_cfg(env)->ext_smctr && !riscv_cpu_cfg(env)->ext_ssctr) {
> +        return -EINVAL;
> +    }
> +
> +    if (csrno == CSR_SIREG || csrno == CSR_VSIREG) {
> +        return rmw_ctrsource(env, isel, val, new_val, wr_mask);
> +    } else if (csrno == CSR_SIREG2 || csrno == CSR_VSIREG2) {
> +        return rmw_ctrtarget(env, isel, val, new_val, wr_mask);
> +    } else if (csrno == CSR_SIREG3 || csrno == CSR_VSIREG3) {
> +        return rmw_ctrdata(env, isel, val, new_val, wr_mask);
> +    } else if (val) {
> +        *val = 0;
> +    }
> +
> +    return 0;
> +}
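
The dispatch in rmw_xireg_ctr() boils down to the mapping sketched
below. The enum names are invented for this example and do not carry
the real CSR numbers; the point is only which alias reaches which CTR
register, and that the remaining aliases read as zero.

```c
/* Hypothetical stand-ins for the CSR numbers; values are arbitrary. */
enum ex_csr {
    EX_SIREG, EX_SIREG2, EX_SIREG3, EX_SIREG4,
    EX_VSIREG, EX_VSIREG2, EX_VSIREG3, EX_VSIREG4
};

enum ex_ctr_reg { EX_CTR_NONE, EX_CTR_SOURCE, EX_CTR_TARGET, EX_CTR_DATA };

/* Which CTR register an indirect alias selects in the 0x200-0x2ff range. */
static enum ex_ctr_reg ex_ctr_reg_for_csr(enum ex_csr csrno)
{
    switch (csrno) {
    case EX_SIREG:
    case EX_VSIREG:
        return EX_CTR_SOURCE;
    case EX_SIREG2:
    case EX_VSIREG2:
        return EX_CTR_TARGET;
    case EX_SIREG3:
    case EX_VSIREG3:
        return EX_CTR_DATA;
    default:
        /* Other aliases (e.g. sireg4-6) read as zero in the patch. */
        return EX_CTR_NONE;
    }
}
```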
> +
>  /*
>   * rmw_xireg_csrind: Perform indirect access to xireg and xireg2-xireg6
>   *
> @@ -2639,11 +2785,13 @@ static int rmw_xireg_csrind(CPURISCVState *env, int csrno,
>                                target_ulong isel, target_ulong *val,
>                                target_ulong new_val, target_ulong wr_mask)
>  {
> -    int ret = -EINVAL;
>      bool virt = csrno == CSR_VSIREG ? true : false;
> +    int ret = -EINVAL;
>
>      if (xiselect_cd_range(isel)) {
>          ret = rmw_xireg_cd(env, csrno, isel, val, new_val, wr_mask);
> +    } else if (xiselect_ctr_range(csrno, isel)) {
> +        ret = rmw_xireg_ctr(env, csrno, isel, val, new_val, wr_mask);
>      } else {
>          /*
>           * As per the specification, access to unimplented region is undefined
> diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> index 027b0324136961c61efb3fcca7a8dc13920d5e4d..29f6a3a72901abd9d56744834c6b0c28ae8cf685 100644
> --- a/target/riscv/tcg/tcg-cpu.c
> +++ b/target/riscv/tcg/tcg-cpu.c
> @@ -681,6 +681,17 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
>          return;
>      }
>
> +    if ((cpu->cfg.ext_smctr || cpu->cfg.ext_ssctr) &&
> +        (!riscv_has_ext(env, RVS) || !cpu->cfg.ext_sscsrind)) {
> +        if (cpu_cfg_ext_is_user_set(CPU_CFG_OFFSET(ext_smctr)) ||
> +            cpu_cfg_ext_is_user_set(CPU_CFG_OFFSET(ext_ssctr))) {
> +            error_setg(errp, "Smctr and Ssctr require S-mode and Sscsrind");
> +            return;
> +        }
> +        cpu->cfg.ext_smctr = false;
> +        cpu->cfg.ext_ssctr = false;
> +    }
> +
>      /*
>       * Disable isa extensions based on priv spec after we
>       * validated and set everything we need.
>
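
The dependency check added to riscv_cpu_validate_set_extensions() can
be summarised by the following standalone sketch. The struct and enum
are hypothetical simplifications of the QEMU configuration state; the
decision logic is meant to follow the hunk above, nothing more.

```c
#include <stdbool.h>

enum ex_ctr_check { EX_CTR_OK, EX_CTR_DISABLED, EX_CTR_ERROR };

/* Simplified, made-up view of the relevant configuration bits. */
struct ex_cfg {
    bool smctr;
    bool ssctr;
    bool has_s_mode;
    bool sscsrind;
    bool user_set_ctr;   /* smctr or ssctr was set explicitly by the user */
};

static enum ex_ctr_check ex_validate_ctr(struct ex_cfg *cfg)
{
    if (!(cfg->smctr || cfg->ssctr)) {
        return EX_CTR_OK;            /* CTR not requested, nothing to check */
    }
    if (cfg->has_s_mode && cfg->sscsrind) {
        return EX_CTR_OK;            /* dependencies satisfied */
    }
    if (cfg->user_set_ctr) {
        return EX_CTR_ERROR;         /* explicit request cannot be honoured */
    }
    /* Enabled implicitly (e.g. by a CPU model): silently turn CTR off. */
    cfg->smctr = false;
    cfg->ssctr = false;
    return EX_CTR_DISABLED;
}
```

The distinction between the error and the silent-disable paths is what
cpu_cfg_ext_is_user_set() provides in the actual hunk.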
> ---
> base-commit: 485adaaf6657dd5070dbefed593b2923a397a63f
> change-id: 20250205-b4-ctr_upstream_v6-71418cd245ee
>
> Best regards,
> --
> Rajnesh Kanwal
>
>
Rajnesh Kanwal Feb. 17, 2025, 12:06 p.m. UTC | #2
On Mon, Feb 17, 2025 at 5:25 AM Alistair Francis <alistair23@gmail.com> wrote:
>
> On Wed, Feb 12, 2025 at 8:20 PM Rajnesh Kanwal <rkanwal@rivosinc.com> wrote:
> >
> > CTR entries are accessed through the ctrsource, ctrtarget and ctrdata
> > registers using the smcsrind/sscsrind extension. This commit extends
> > the csrind extension to support the CTR registers.
> >
> > ctrsource is accessible through the xireg CSR, ctrtarget through
> > xireg2 and ctrdata through xireg3.
> >
> > CTR supports a maximum depth of 256 entries, which are accessed using
> > the xiselect range 0x200 to 0x2ff.
> >
> > This commit also adds properties to enable the CTR extension. CTR can
> > now be enabled using smctr=true and ssctr=true.
> >
> > Signed-off-by: Rajnesh Kanwal <rkanwal@rivosinc.com>
> > Acked-by: Alistair Francis <alistair.francis@wdc.com>
>
> Thanks!
>
> Applied to riscv-to-apply.next
>
> Alistair
>

Thanks.

> > ---
> > This series enables Control Transfer Records (CTR) extension support on the
> > RISC-V platform. This extension is similar to Arch LBR in x86 and BRBE in ARM.
> > The extension has been ratified and this series is based on v1.0 [0].
> >
> > CTR extension depends on both the implementation of S-mode and the Sscsrind
> > extension v1.0.0 [1]. CTR accesses the ctrsource, ctrtarget and ctrdata CSRs
> > using the Sscsrind extension.
> >
> > The series is based on Smcdeleg/Ssccfg counter delegation extension [2]
> > patches [3]. CTR itself doesn't depend on counter delegation support. This
> > rebase is basically to include the Smcsrind patches.
> >
> > Here is a link to a quick start guide [4] to set up and run a basic perf demo
> > on Linux using the CTR extension.
> >
> > QEMU patches can be found here:
> > https://github.com/rajnesh-kanwal/qemu/tree/b4/ctr_upstream_v7
> >
> > OpenSBI patch can be found here:
> > https://github.com/rajnesh-kanwal/opensbi/tree/ctr_upstream_v2
> >
> > Linux kernel patches can be found here:
> > https://github.com/rajnesh-kanwal/linux/tree/b4/ctr_upstream_v2
> >
> > [0]: https://github.com/riscv/riscv-control-transfer-records/releases/tag/v1.0
> > [1]: https://github.com/riscvarchive/riscv-indirect-csr-access/releases/tag/v1.0.0
> > [2]: https://github.com/riscvarchive/riscv-smcdeleg-ssccfg/releases/tag/v1.0.0
> > [3]: https://lore.kernel.org/qemu-riscv/20241203-counter_delegation-v4-0-c12a89baed86@rivosinc.com/
> > [4]: https://github.com/rajnesh-kanwal/linux/wiki/Running-CTR-basic-demo-on-QEMU-RISC%E2%80%90V-Virt-machine
> > ---
> > Changes in v7:
> > v7: Rebased on latest riscv-to-apply.next. Given 6 out of 7 patches
> >     are already in riscv-to-apply.next, this version only contains the
> >     last patch which failed to apply.
> >
> > v6: Rebased on latest riscv-to-apply.for-upstream.
> >   - https://lore.kernel.org/qemu-devel/20250205-b4-ctr_upstream_v6-v6-0-439d8e06c8ef@rivosinc.com
> >
> > v5: Improvements based on Richard Henderson's feedback.
> >   - Fixed code gen logic to use gen_update_pc() instead of
> >     tcg_constant_tl().
> >   - Some function renaming.
> >   - Rebased onto v4 of counter delegation series.
> >   - https://lore.kernel.org/qemu-riscv/20241205-b4-ctr_upstream_v3-v5-0-60b993aa567d@rivosinc.com/
> >
> > v4: Improvements based on Richard Henderson's feedback.
> >   - Refactored CTR related code generation to move more code into
> >     translation side and avoid unnecessary code execution in generated
> >     code.
> >   - Added missing code in machine.c to migrate the new state.
> >   - https://lore.kernel.org/r/20241204-b4-ctr_upstream_v3-v4-0-d3ce6bef9432@rivosinc.com
> >
> > v3: Improvements based on Jason Chien and Frank Chang's feedback.
> >   - Created a single set of macros for CTR CSRs in cpu_bits.h
> >   - Some fixes in riscv_ctr_add_entry.
> >   - Return zero for vs/sireg4-6 for CTR 0x200 to 0x2ff range.
> >   - Improved extension dependency check.
> >   - Fixed invalid ctrctl csr selection bug in riscv_ctr_freeze.
> >   - Added implied rules for Smctr and Ssctr.
> >   - Added missing SMSTATEEN0_CTR bit in mstateen0 and hstateen0 write ops.
> >   - Some more cosmetic changes.
> >   - https://lore.kernel.org/qemu-riscv/20241104-b4-ctr_upstream_v3-v3-0-32fd3c48205f@rivosinc.com/
> >
> > v2: Lots of improvements based on Jason Chien's feedback including:
> >   - Added CTR recording for cm.jalt, cm.jt, cm.popret, cm.popretz.
> >   - Fixed and added more CTR extension enable checks.
> >   - Fixed CTR CSR predicate functions.
> >   - Fixed external trap xTE bit checks.
> >   - One fix in freeze function for VS-mode.
> >   - Lots of minor code improvements.
> >   - Added checks in sctrclr instruction helper.
> >   - https://lore.kernel.org/qemu-riscv/20240619152708.135991-1-rkanwal@rivosinc.com/
> >
> > v1:
> >   - https://lore.kernel.org/qemu-riscv/20240529160950.132754-1-rkanwal@rivosinc.com/
> > ---
> >  target/riscv/cpu.c         |  26 +++++++-
> >  target/riscv/csr.c         | 150 ++++++++++++++++++++++++++++++++++++++++++++-
> >  target/riscv/tcg/tcg-cpu.c |  11 ++++
> >  3 files changed, 185 insertions(+), 2 deletions(-)
> >
> > diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
> > index 8264c81e889424dfd491cec0ef95eeffc8fcc5b6..522d6584e4c3be7070e5a59f70f5948be8196a77 100644
> > --- a/target/riscv/cpu.c
> > +++ b/target/riscv/cpu.c
> > @@ -216,6 +216,8 @@ const RISCVIsaExtData isa_edata_arr[] = {
> >      ISA_EXT_DATA_ENTRY(ssu64xl, PRIV_VERSION_1_12_0, has_priv_1_12),
> >      ISA_EXT_DATA_ENTRY(supm, PRIV_VERSION_1_13_0, ext_supm),
> >      ISA_EXT_DATA_ENTRY(svade, PRIV_VERSION_1_11_0, ext_svade),
> > +    ISA_EXT_DATA_ENTRY(smctr, PRIV_VERSION_1_12_0, ext_smctr),
> > +    ISA_EXT_DATA_ENTRY(ssctr, PRIV_VERSION_1_12_0, ext_ssctr),
> >      ISA_EXT_DATA_ENTRY(svadu, PRIV_VERSION_1_12_0, ext_svadu),
> >      ISA_EXT_DATA_ENTRY(svinval, PRIV_VERSION_1_12_0, ext_svinval),
> >      ISA_EXT_DATA_ENTRY(svnapot, PRIV_VERSION_1_12_0, ext_svnapot),
> > @@ -1599,6 +1601,8 @@ const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
> >      MULTI_EXT_CFG_BOOL("smcdeleg", ext_smcdeleg, false),
> >      MULTI_EXT_CFG_BOOL("sscsrind", ext_sscsrind, false),
> >      MULTI_EXT_CFG_BOOL("ssccfg", ext_ssccfg, false),
> > +    MULTI_EXT_CFG_BOOL("smctr", ext_smctr, false),
> > +    MULTI_EXT_CFG_BOOL("ssctr", ext_ssctr, false),
> >      MULTI_EXT_CFG_BOOL("zifencei", ext_zifencei, true),
> >      MULTI_EXT_CFG_BOOL("zicfilp", ext_zicfilp, false),
> >      MULTI_EXT_CFG_BOOL("zicfiss", ext_zicfiss, false),
> > @@ -2863,6 +2867,26 @@ static RISCVCPUImpliedExtsRule SSPM_IMPLIED = {
> >      },
> >  };
> >
> > +static RISCVCPUImpliedExtsRule SMCTR_IMPLIED = {
> > +    .ext = CPU_CFG_OFFSET(ext_smctr),
> > +    .implied_misa_exts = RVS,
> > +    .implied_multi_exts = {
> > +        CPU_CFG_OFFSET(ext_sscsrind),
> > +
> > +        RISCV_IMPLIED_EXTS_RULE_END
> > +    },
> > +};
> > +
> > +static RISCVCPUImpliedExtsRule SSCTR_IMPLIED = {
> > +    .ext = CPU_CFG_OFFSET(ext_ssctr),
> > +    .implied_misa_exts = RVS,
> > +    .implied_multi_exts = {
> > +        CPU_CFG_OFFSET(ext_sscsrind),
> > +
> > +        RISCV_IMPLIED_EXTS_RULE_END
> > +    },
> > +};
> > +
> >  RISCVCPUImpliedExtsRule *riscv_misa_ext_implied_rules[] = {
> >      &RVA_IMPLIED, &RVD_IMPLIED, &RVF_IMPLIED,
> >      &RVM_IMPLIED, &RVV_IMPLIED, NULL
> > @@ -2881,7 +2905,7 @@ RISCVCPUImpliedExtsRule *riscv_multi_ext_implied_rules[] = {
> >      &ZVFH_IMPLIED, &ZVFHMIN_IMPLIED, &ZVKN_IMPLIED,
> >      &ZVKNC_IMPLIED, &ZVKNG_IMPLIED, &ZVKNHB_IMPLIED,
> >      &ZVKS_IMPLIED,  &ZVKSC_IMPLIED, &ZVKSG_IMPLIED, &SSCFG_IMPLIED,
> > -    &SUPM_IMPLIED, &SSPM_IMPLIED,
> > +    &SUPM_IMPLIED, &SSPM_IMPLIED, &SMCTR_IMPLIED, &SSCTR_IMPLIED,
> >      NULL
> >  };
> >
> > diff --git a/target/riscv/csr.c b/target/riscv/csr.c
> > index a62c50f057f487753a79393306641d3e50085ee5..d0068ce98c156abd67b7d08f94f29edb957143bd 100644
> > --- a/target/riscv/csr.c
> > +++ b/target/riscv/csr.c
> > @@ -2431,6 +2431,13 @@ static bool xiselect_cd_range(target_ulong isel)
> >      return (ISELECT_CD_FIRST <= isel && isel <= ISELECT_CD_LAST);
> >  }
> >
> > +static bool xiselect_ctr_range(int csrno, target_ulong isel)
> > +{
> > +    /* MIREG-MIREG6 for the range 0x200-0x2ff are not used by CTR. */
> > +    return CTR_ENTRIES_FIRST <= isel && isel <= CTR_ENTRIES_LAST &&
> > +           csrno < CSR_MIREG;
> > +}
> > +
> >  static int rmw_iprio(target_ulong xlen,
> >                       target_ulong iselect, uint8_t *iprio,
> >                       target_ulong *val, target_ulong new_val,
> > @@ -2476,6 +2483,124 @@ static int rmw_iprio(target_ulong xlen,
> >      return 0;
> >  }
> >
> > +static int rmw_ctrsource(CPURISCVState *env, int isel, target_ulong *val,
> > +                          target_ulong new_val, target_ulong wr_mask)
> > +{
> > +    /*
> > +     * CTR arrays are treated as circular buffers and TOS always points to next
> > +     * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
> > +     * 0 is always the latest one, traversal is a bit different here. See the
> > +     * below example.
> > +     *
> > +     * Depth = 16.
> > +     *
> > +     * idx    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
> > +     * TOS                                 H
> > +     * entry   6   5   4   3   2   1   0   F   E   D   C   B   A   9   8   7
> > +     */
> > +    const uint64_t entry = isel - CTR_ENTRIES_FIRST;
> > +    const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
> > +    uint64_t idx;
> > +
> > +    /* Entry greater than depth-1 is read-only zero */
> > +    if (entry >= depth) {
> > +        if (val) {
> > +            *val = 0;
> > +        }
> > +        return 0;
> > +    }
> > +
> > +    idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
> > +    idx = (idx - entry - 1) & (depth - 1);
> > +
> > +    if (val) {
> > +        *val = env->ctr_src[idx];
> > +    }
> > +
> > +    env->ctr_src[idx] = (env->ctr_src[idx] & ~wr_mask) | (new_val & wr_mask);
> > +
> > +    return 0;
> > +}
> > +
> > +static int rmw_ctrtarget(CPURISCVState *env, int isel, target_ulong *val,
> > +                          target_ulong new_val, target_ulong wr_mask)
> > +{
> > +    /*
> > +     * CTR arrays are treated as circular buffers and TOS always points to next
> > +     * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
> > +     * 0 is always the latest one, traversal is a bit different here. See the
> > +     * below example.
> > +     *
> > +     * Depth = 16.
> > +     *
> > +     * idx    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
> > +     * head                                H
> > +     * entry   6   5   4   3   2   1   0   F   E   D   C   B   A   9   8   7
> > +     */
> > +    const uint64_t entry = isel - CTR_ENTRIES_FIRST;
> > +    const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
> > +    uint64_t idx;
> > +
> > +    /* Entry greater than depth-1 is read-only zero */
> > +    if (entry >= depth) {
> > +        if (val) {
> > +            *val = 0;
> > +        }
> > +        return 0;
> > +    }
> > +
> > +    idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
> > +    idx = (idx - entry - 1) & (depth - 1);
> > +
> > +    if (val) {
> > +        *val = env->ctr_dst[idx];
> > +    }
> > +
> > +    env->ctr_dst[idx] = (env->ctr_dst[idx] & ~wr_mask) | (new_val & wr_mask);
> > +
> > +    return 0;
> > +}
> > +
> > +static int rmw_ctrdata(CPURISCVState *env, int isel, target_ulong *val,
> > +                        target_ulong new_val, target_ulong wr_mask)
> > +{
> > +    /*
> > +     * CTR arrays are treated as circular buffers and TOS always points to next
> > +     * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
> > +     * 0 is always the latest one, traversal is a bit different here. See the
> > +     * below example.
> > +     *
> > +     * Depth = 16.
> > +     *
> > +     * idx    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
> > +     * TOS                                 H
> > +     * entry   6   5   4   3   2   1   0   F   E   D   C   B   A   9   8   7
> > +     */
> > +    const uint64_t entry = isel - CTR_ENTRIES_FIRST;
> > +    const uint64_t mask = wr_mask & CTRDATA_MASK;
> > +    const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
> > +    uint64_t idx;
> > +
> > +    /* Entry greater than depth-1 is read-only zero */
> > +    if (entry >= depth) {
> > +        if (val) {
> > +            *val = 0;
> > +        }
> > +        return 0;
> > +    }
> > +
> > +    idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
> > +    idx = (idx - entry - 1) & (depth - 1);
> > +
> > +    if (val) {
> > +        *val = env->ctr_data[idx];
> > +    }
> > +
> > +    env->ctr_data[idx] = (env->ctr_data[idx] & ~mask) | (new_val & mask);
> > +
> > +    return 0;
> > +}
> > +
> >  static RISCVException rmw_xireg_aia(CPURISCVState *env, int csrno,
> >                           target_ulong isel, target_ulong *val,
> >                           target_ulong new_val, target_ulong wr_mask)
> > @@ -2628,6 +2753,27 @@ done:
> >      return ret;
> >  }
> >
> > +static int rmw_xireg_ctr(CPURISCVState *env, int csrno,
> > +                        target_ulong isel, target_ulong *val,
> > +                        target_ulong new_val, target_ulong wr_mask)
> > +{
> > +    if (!riscv_cpu_cfg(env)->ext_smctr && !riscv_cpu_cfg(env)->ext_ssctr) {
> > +        return -EINVAL;
> > +    }
> > +
> > +    if (csrno == CSR_SIREG || csrno == CSR_VSIREG) {
> > +        return rmw_ctrsource(env, isel, val, new_val, wr_mask);
> > +    } else if (csrno == CSR_SIREG2 || csrno == CSR_VSIREG2) {
> > +        return rmw_ctrtarget(env, isel, val, new_val, wr_mask);
> > +    } else if (csrno == CSR_SIREG3 || csrno == CSR_VSIREG3) {
> > +        return rmw_ctrdata(env, isel, val, new_val, wr_mask);
> > +    } else if (val) {
> > +        *val = 0;
> > +    }
> > +
> > +    return 0;
> > +}
> > +
> >  /*
> >   * rmw_xireg_csrind: Perform indirect access to xireg and xireg2-xireg6
> >   *
> > @@ -2639,11 +2785,13 @@ static int rmw_xireg_csrind(CPURISCVState *env, int csrno,
> >                                target_ulong isel, target_ulong *val,
> >                                target_ulong new_val, target_ulong wr_mask)
> >  {
> > -    int ret = -EINVAL;
> >      bool virt = csrno == CSR_VSIREG ? true : false;
> > +    int ret = -EINVAL;
> >
> >      if (xiselect_cd_range(isel)) {
> >          ret = rmw_xireg_cd(env, csrno, isel, val, new_val, wr_mask);
> > +    } else if (xiselect_ctr_range(csrno, isel)) {
> > +        ret = rmw_xireg_ctr(env, csrno, isel, val, new_val, wr_mask);
> >      } else {
> >          /*
> >           * As per the specification, access to unimplented region is undefined
> > diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
> > index 027b0324136961c61efb3fcca7a8dc13920d5e4d..29f6a3a72901abd9d56744834c6b0c28ae8cf685 100644
> > --- a/target/riscv/tcg/tcg-cpu.c
> > +++ b/target/riscv/tcg/tcg-cpu.c
> > @@ -681,6 +681,17 @@ void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
> >          return;
> >      }
> >
> > +    if ((cpu->cfg.ext_smctr || cpu->cfg.ext_ssctr) &&
> > +        (!riscv_has_ext(env, RVS) || !cpu->cfg.ext_sscsrind)) {
> > +        if (cpu_cfg_ext_is_user_set(CPU_CFG_OFFSET(ext_smctr)) ||
> > +            cpu_cfg_ext_is_user_set(CPU_CFG_OFFSET(ext_ssctr))) {
> > +            error_setg(errp, "Smctr and Ssctr require S-mode and Sscsrind");
> > +            return;
> > +        }
> > +        cpu->cfg.ext_smctr = false;
> > +        cpu->cfg.ext_ssctr = false;
> > +    }
> > +
> >      /*
> >       * Disable isa extensions based on priv spec after we
> >       * validated and set everything we need.
> >
> > ---
> > base-commit: 485adaaf6657dd5070dbefed593b2923a397a63f
> > change-id: 20250205-b4-ctr_upstream_v6-71418cd245ee
> >
> > Best regards,
> > --
> > Rajnesh Kanwal
> >
> >
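
For anyone following the index arithmetic in the three rmw_ctr* helpers quoted above, here is a minimal standalone sketch of the same mapping. The names ctr_entry_to_idx and ctr_rmw_slot are hypothetical and are not part of the patch; the sketch assumes the depth is a power of two (which holds for 16 << n) and that wrptr is the slot the next record will be written to.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper, not part of the patch: map a logical CTR entry
 * number (0 = most recent record) to a physical slot in the circular
 * buffer. wrptr is the slot the next record will be written to.
 */
static inline uint64_t ctr_entry_to_idx(uint64_t wrptr, uint64_t entry,
                                        uint64_t depth)
{
    /* The mask trick below only works for power-of-two depths. */
    assert(depth != 0 && (depth & (depth - 1)) == 0);

    /* Unsigned wrap-around plus the mask yields (wrptr - entry - 1) mod depth. */
    return (wrptr - entry - 1) & (depth - 1);
}

/*
 * Hypothetical helper, not part of the patch: masked read-modify-write of
 * one slot. Only bits set in wr_mask are taken from new_val; the previous
 * slot value is returned, as a CSR read would see it.
 */
static inline uint64_t ctr_rmw_slot(uint64_t *slot, uint64_t new_val,
                                    uint64_t wr_mask)
{
    uint64_t old = *slot;

    *slot = (old & ~wr_mask) | (new_val & wr_mask);
    return old;
}
```

With depth 16 and wrptr 7 this reproduces the table in the comment: entry 0 lands in slot 6, entry 6 in slot 0, entry 7 wraps to slot 0xF, and entry 0xF lands in slot 7.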

Patch

diff --git a/target/riscv/cpu.c b/target/riscv/cpu.c
index 8264c81e889424dfd491cec0ef95eeffc8fcc5b6..522d6584e4c3be7070e5a59f70f5948be8196a77 100644
--- a/target/riscv/cpu.c
+++ b/target/riscv/cpu.c
@@ -216,6 +216,8 @@  const RISCVIsaExtData isa_edata_arr[] = {
     ISA_EXT_DATA_ENTRY(ssu64xl, PRIV_VERSION_1_12_0, has_priv_1_12),
     ISA_EXT_DATA_ENTRY(supm, PRIV_VERSION_1_13_0, ext_supm),
     ISA_EXT_DATA_ENTRY(svade, PRIV_VERSION_1_11_0, ext_svade),
+    ISA_EXT_DATA_ENTRY(smctr, PRIV_VERSION_1_12_0, ext_smctr),
+    ISA_EXT_DATA_ENTRY(ssctr, PRIV_VERSION_1_12_0, ext_ssctr),
     ISA_EXT_DATA_ENTRY(svadu, PRIV_VERSION_1_12_0, ext_svadu),
     ISA_EXT_DATA_ENTRY(svinval, PRIV_VERSION_1_12_0, ext_svinval),
     ISA_EXT_DATA_ENTRY(svnapot, PRIV_VERSION_1_12_0, ext_svnapot),
@@ -1599,6 +1601,8 @@  const RISCVCPUMultiExtConfig riscv_cpu_extensions[] = {
     MULTI_EXT_CFG_BOOL("smcdeleg", ext_smcdeleg, false),
     MULTI_EXT_CFG_BOOL("sscsrind", ext_sscsrind, false),
     MULTI_EXT_CFG_BOOL("ssccfg", ext_ssccfg, false),
+    MULTI_EXT_CFG_BOOL("smctr", ext_smctr, false),
+    MULTI_EXT_CFG_BOOL("ssctr", ext_ssctr, false),
     MULTI_EXT_CFG_BOOL("zifencei", ext_zifencei, true),
     MULTI_EXT_CFG_BOOL("zicfilp", ext_zicfilp, false),
     MULTI_EXT_CFG_BOOL("zicfiss", ext_zicfiss, false),
@@ -2863,6 +2867,26 @@  static RISCVCPUImpliedExtsRule SSPM_IMPLIED = {
     },
 };
 
+static RISCVCPUImpliedExtsRule SMCTR_IMPLIED = {
+    .ext = CPU_CFG_OFFSET(ext_smctr),
+    .implied_misa_exts = RVS,
+    .implied_multi_exts = {
+        CPU_CFG_OFFSET(ext_sscsrind),
+
+        RISCV_IMPLIED_EXTS_RULE_END
+    },
+};
+
+static RISCVCPUImpliedExtsRule SSCTR_IMPLIED = {
+    .ext = CPU_CFG_OFFSET(ext_ssctr),
+    .implied_misa_exts = RVS,
+    .implied_multi_exts = {
+        CPU_CFG_OFFSET(ext_sscsrind),
+
+        RISCV_IMPLIED_EXTS_RULE_END
+    },
+};
+
 RISCVCPUImpliedExtsRule *riscv_misa_ext_implied_rules[] = {
     &RVA_IMPLIED, &RVD_IMPLIED, &RVF_IMPLIED,
     &RVM_IMPLIED, &RVV_IMPLIED, NULL
@@ -2881,7 +2905,7 @@  RISCVCPUImpliedExtsRule *riscv_multi_ext_implied_rules[] = {
     &ZVFH_IMPLIED, &ZVFHMIN_IMPLIED, &ZVKN_IMPLIED,
     &ZVKNC_IMPLIED, &ZVKNG_IMPLIED, &ZVKNHB_IMPLIED,
     &ZVKS_IMPLIED,  &ZVKSC_IMPLIED, &ZVKSG_IMPLIED, &SSCFG_IMPLIED,
-    &SUPM_IMPLIED, &SSPM_IMPLIED,
+    &SUPM_IMPLIED, &SSPM_IMPLIED, &SMCTR_IMPLIED, &SSCTR_IMPLIED,
     NULL
 };
 
diff --git a/target/riscv/csr.c b/target/riscv/csr.c
index a62c50f057f487753a79393306641d3e50085ee5..d0068ce98c156abd67b7d08f94f29edb957143bd 100644
--- a/target/riscv/csr.c
+++ b/target/riscv/csr.c
@@ -2431,6 +2431,13 @@  static bool xiselect_cd_range(target_ulong isel)
     return (ISELECT_CD_FIRST <= isel && isel <= ISELECT_CD_LAST);
 }
 
+static bool xiselect_ctr_range(int csrno, target_ulong isel)
+{
+    /* MIREG-MIREG6 for the range 0x200-0x2ff are not used by CTR. */
+    return CTR_ENTRIES_FIRST <= isel && isel <= CTR_ENTRIES_LAST &&
+           csrno < CSR_MIREG;
+}
+
 static int rmw_iprio(target_ulong xlen,
                      target_ulong iselect, uint8_t *iprio,
                      target_ulong *val, target_ulong new_val,
@@ -2476,6 +2483,124 @@  static int rmw_iprio(target_ulong xlen,
     return 0;
 }
 
+static int rmw_ctrsource(CPURISCVState *env, int isel, target_ulong *val,
+                          target_ulong new_val, target_ulong wr_mask)
+{
+    /*
+     * CTR arrays are treated as circular buffers and TOS always points to next
+     * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
+     * 0 is always the latest one, traversal is a bit different here. See the
+     * below example.
+     *
+     * Depth = 16.
+     *
+     * idx    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
+     * TOS                                 H
+     * entry   6   5   4   3   2   1   0   F   E   D   C   B   A   9   8   7
+     */
+    const uint64_t entry = isel - CTR_ENTRIES_FIRST;
+    const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
+    uint64_t idx;
+
+    /* Entry greater than depth-1 is read-only zero */
+    if (entry >= depth) {
+        if (val) {
+            *val = 0;
+        }
+        return 0;
+    }
+
+    idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
+    idx = (idx - entry - 1) & (depth - 1);
+
+    if (val) {
+        *val = env->ctr_src[idx];
+    }
+
+    env->ctr_src[idx] = (env->ctr_src[idx] & ~wr_mask) | (new_val & wr_mask);
+
+    return 0;
+}
+
+static int rmw_ctrtarget(CPURISCVState *env, int isel, target_ulong *val,
+                          target_ulong new_val, target_ulong wr_mask)
+{
+    /*
+     * CTR arrays are treated as circular buffers and TOS always points to next
+     * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
+     * 0 is always the latest one, traversal is a bit different here. See the
+     * below example.
+     *
+     * Depth = 16.
+     *
+     * idx    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
+     * TOS                                 H
+     * entry   6   5   4   3   2   1   0   F   E   D   C   B   A   9   8   7
+     */
+    const uint64_t entry = isel - CTR_ENTRIES_FIRST;
+    const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
+    uint64_t idx;
+
+    /* Entry greater than depth-1 is read-only zero */
+    if (entry >= depth) {
+        if (val) {
+            *val = 0;
+        }
+        return 0;
+    }
+
+    idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
+    idx = (idx - entry - 1) & (depth - 1);
+
+    if (val) {
+        *val = env->ctr_dst[idx];
+    }
+
+    env->ctr_dst[idx] = (env->ctr_dst[idx] & ~wr_mask) | (new_val & wr_mask);
+
+    return 0;
+}
+
+static int rmw_ctrdata(CPURISCVState *env, int isel, target_ulong *val,
+                        target_ulong new_val, target_ulong wr_mask)
+{
+    /*
+     * CTR arrays are treated as circular buffers and TOS always points to next
+     * empty slot, keeping TOS - 1 always pointing to latest entry. Given entry
+     * 0 is always the latest one, traversal is a bit different here. See the
+     * below example.
+     *
+     * Depth = 16.
+     *
+     * idx    [0] [1] [2] [3] [4] [5] [6] [7] [8] [9] [A] [B] [C] [D] [E] [F]
+     * TOS                                 H
+     * entry   6   5   4   3   2   1   0   F   E   D   C   B   A   9   8   7
+     */
+    const uint64_t entry = isel - CTR_ENTRIES_FIRST;
+    const uint64_t mask = wr_mask & CTRDATA_MASK;
+    const uint64_t depth = 16 << get_field(env->sctrdepth, SCTRDEPTH_MASK);
+    uint64_t idx;
+
+    /* Entry greater than depth-1 is read-only zero */
+    if (entry >= depth) {
+        if (val) {
+            *val = 0;
+        }
+        return 0;
+    }
+
+    idx = get_field(env->sctrstatus, SCTRSTATUS_WRPTR_MASK);
+    idx = (idx - entry - 1) & (depth - 1);
+
+    if (val) {
+        *val = env->ctr_data[idx];
+    }
+
+    env->ctr_data[idx] = (env->ctr_data[idx] & ~mask) | (new_val & mask);
+
+    return 0;
+}
+
 static RISCVException rmw_xireg_aia(CPURISCVState *env, int csrno,
                          target_ulong isel, target_ulong *val,
                          target_ulong new_val, target_ulong wr_mask)
@@ -2628,6 +2753,27 @@  done:
     return ret;
 }
 
+static int rmw_xireg_ctr(CPURISCVState *env, int csrno,
+                        target_ulong isel, target_ulong *val,
+                        target_ulong new_val, target_ulong wr_mask)
+{
+    if (!riscv_cpu_cfg(env)->ext_smctr && !riscv_cpu_cfg(env)->ext_ssctr) {
+        return -EINVAL;
+    }
+
+    if (csrno == CSR_SIREG || csrno == CSR_VSIREG) {
+        return rmw_ctrsource(env, isel, val, new_val, wr_mask);
+    } else if (csrno == CSR_SIREG2 || csrno == CSR_VSIREG2) {
+        return rmw_ctrtarget(env, isel, val, new_val, wr_mask);
+    } else if (csrno == CSR_SIREG3 || csrno == CSR_VSIREG3) {
+        return rmw_ctrdata(env, isel, val, new_val, wr_mask);
+    } else if (val) {
+        *val = 0;
+    }
+
+    return 0;
+}
+
 /*
  * rmw_xireg_csrind: Perform indirect access to xireg and xireg2-xireg6
  *
@@ -2639,11 +2785,13 @@  static int rmw_xireg_csrind(CPURISCVState *env, int csrno,
                               target_ulong isel, target_ulong *val,
                               target_ulong new_val, target_ulong wr_mask)
 {
-    int ret = -EINVAL;
     bool virt = csrno == CSR_VSIREG ? true : false;
+    int ret = -EINVAL;
 
     if (xiselect_cd_range(isel)) {
         ret = rmw_xireg_cd(env, csrno, isel, val, new_val, wr_mask);
+    } else if (xiselect_ctr_range(csrno, isel)) {
+        ret = rmw_xireg_ctr(env, csrno, isel, val, new_val, wr_mask);
     } else {
         /*
          * As per the specification, access to unimplented region is undefined
diff --git a/target/riscv/tcg/tcg-cpu.c b/target/riscv/tcg/tcg-cpu.c
index 027b0324136961c61efb3fcca7a8dc13920d5e4d..29f6a3a72901abd9d56744834c6b0c28ae8cf685 100644
--- a/target/riscv/tcg/tcg-cpu.c
+++ b/target/riscv/tcg/tcg-cpu.c
@@ -681,6 +681,17 @@  void riscv_cpu_validate_set_extensions(RISCVCPU *cpu, Error **errp)
         return;
     }
 
+    if ((cpu->cfg.ext_smctr || cpu->cfg.ext_ssctr) &&
+        (!riscv_has_ext(env, RVS) || !cpu->cfg.ext_sscsrind)) {
+        if (cpu_cfg_ext_is_user_set(CPU_CFG_OFFSET(ext_smctr)) ||
+            cpu_cfg_ext_is_user_set(CPU_CFG_OFFSET(ext_ssctr))) {
+            error_setg(errp, "Smctr and Ssctr require S-mode and Sscsrind");
+            return;
+        }
+        cpu->cfg.ext_smctr = false;
+        cpu->cfg.ext_ssctr = false;
+    }
+
     /*
      * Disable isa extensions based on priv spec after we
      * validated and set everything we need.