[v4,14/30] xen/riscv: introduce atomic.h

Message ID	6554f2479e19ed3eae6de842ac1568c31d236461.1707146506.git.oleksii.kurochko@gmail.com (mailing list archive)
State	Superseded
Headers	show Return-Path: <xen-devel-bounces@lists.xenproject.org> Errors-To: xen-devel-bounces@lists.xenproject.org Precedence: list Sender: "Xen-devel" <xen-devel-bounces@lists.xenproject.org> From: Oleksii Kurochko <oleksii.kurochko@gmail.com> To: xen-devel@lists.xenproject.org Cc: Bobby Eshleman <bobbyeshleman@gmail.com>, Alistair Francis <alistair.francis@wdc.com>, Connor Davis <connojdavis@gmail.com>, Andrew Cooper <andrew.cooper3@citrix.com>, George Dunlap <george.dunlap@citrix.com>, Jan Beulich <jbeulich@suse.com>, Julien Grall <julien@xen.org>, Stefano Stabellini <sstabellini@kernel.org>, Wei Liu <wl@xen.org>, Oleksii Kurochko <oleksii.kurochko@gmail.com> Subject: [PATCH v4 14/30] xen/riscv: introduce atomic.h Date: Mon, 5 Feb 2024 16:32:21 +0100 Message-ID: <6554f2479e19ed3eae6de842ac1568c31d236461.1707146506.git.oleksii.kurochko@gmail.com> In-Reply-To: <cover.1707146506.git.oleksii.kurochko@gmail.com> References: <cover.1707146506.git.oleksii.kurochko@gmail.com> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit
Series	Enable build of full Xen for RISC-V

Oleksii Kurochko Feb. 5, 2024, 3:32 p.m. UTC

From: Bobby Eshleman <bobbyeshleman@gmail.com>

Additionally, this patch introduces macros in fence.h,
which are utilized in atomic.h.

atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n)
were updated to use __*xchg_generic().

Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
---
Changes in V4:
 - do changes related to the updates of [PATCH v3 13/34] xen/riscv: introduce cmpxchg.h
 - drop casts in read_atomic_size(), write_atomic(), add_sized()
 - tabs -> spaces
 - drop #ifdef CONFIG_SMP ... #endif in fence.h as it is simpler to handle NR_CPUS=1
   the same as NR_CPUS>1, accepting less than ideal performance.
---
Changes in V3:
  - update the commit message
  - add SPDX for fence.h
  - code style fixes
  - Remove /* TODO: ... */ for add_sized macros. It looks correct to me.
  - re-order the patch
  - merge to this patch fence.h
---
Changes in V2:
 - Change the author of the commit. I got this header from Bobby's old repo.
---
 xen/arch/riscv/include/asm/atomic.h | 395 ++++++++++++++++++++++++++++
 xen/arch/riscv/include/asm/fence.h  |   8 +
 2 files changed, 403 insertions(+)
 create mode 100644 xen/arch/riscv/include/asm/atomic.h
 create mode 100644 xen/arch/riscv/include/asm/fence.h

Jan Beulich Feb. 13, 2024, 11:36 a.m. UTC | #1

On 05.02.2024 16:32, Oleksii Kurochko wrote:
> From: Bobby Eshleman <bobbyeshleman@gmail.com>
> 
> Additionally, this patch introduces macros in fence.h,
> which are utilized in atomic.h.

These are used in an earlier patch already, so either you want to
re-order the series, or you want to move that introduction ahead.

> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/atomic.h
> @@ -0,0 +1,395 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Taken and modified from Linux.
> + *
> + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated to use
> + * __*xchg_generic()
> + * 
> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> + * Copyright (C) 2012 Regents of the University of California
> + * Copyright (C) 2017 SiFive
> + * Copyright (C) 2021 Vates SAS
> + */
> +
> +#ifndef _ASM_RISCV_ATOMIC_H
> +#define _ASM_RISCV_ATOMIC_H
> +
> +#include <xen/atomic.h>
> +#include <asm/cmpxchg.h>
> +#include <asm/fence.h>
> +#include <asm/io.h>
> +#include <asm/system.h>
> +
> +void __bad_atomic_size(void);
> +
> +static always_inline void read_atomic_size(const volatile void *p,
> +                                           void *res,
> +                                           unsigned int size)
> +{
> +    switch ( size )
> +    {
> +    case 1: *(uint8_t *)res = readb(p); break;
> +    case 2: *(uint16_t *)res = readw(p); break;
> +    case 4: *(uint32_t *)res = readl(p); break;
> +    case 8: *(uint32_t *)res  = readq(p); break;

Why is it the MMIO primitives you use here, i.e. not read<X>_cpu()?
It's RAM you're accessing after all.

Also - no CONFIG_64BIT conditional here (like you have in the other
patch)?
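
A minimal sketch of what both remarks could amount to, assuming plain volatile loads are an acceptable stand-in for the read<X>_cpu() helpers (their definitions are not shown in this thread). The name read_sized() and the way CONFIG_64BIT is derived from UINTPTR_MAX are illustrative only: in Xen the symbol would come from Kconfig, and an unsupported size would call __bad_atomic_size() rather than return an error.

```c
#include <stdint.h>

/* Stand-in for Xen's Kconfig symbol; assumed to be set on 64-bit builds. */
#if UINTPTR_MAX > 0xffffffffU
# define CONFIG_64BIT 1
#endif

/* Hypothetical helper: returns 0 on success, -1 for an unsupported size. */
static inline int read_sized(const volatile void *p, void *res,
                             unsigned int size)
{
    switch ( size )
    {
    case 1: *(uint8_t *)res = *(const volatile uint8_t *)p; break;
    case 2: *(uint16_t *)res = *(const volatile uint16_t *)p; break;
    case 4: *(uint32_t *)res = *(const volatile uint32_t *)p; break;
#ifdef CONFIG_64BIT
    /* Only offered where a single 64-bit load exists. */
    case 8: *(uint64_t *)res = *(const volatile uint64_t *)p; break;
#endif
    default: return -1;
    }

    return 0;
}
```

The volatile casts only keep the compiler from splitting or eliding the access; they add no ordering, which is the difference from the MMIO accessors questioned above.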

> +    default: __bad_atomic_size(); break;
> +    }
> +}
> +
> +#define read_atomic(p) ({                               \
> +    union { typeof(*p) val; char c[0]; } x_;            \
> +    read_atomic_size(p, x_.c, sizeof(*p));              \

I'll be curious for how much longer gcc will tolerate this accessing
of a zero-length array, without issuing at least a warning. I'd
recommend using sizeof(*(p)) as the array dimension right away. (From
this note also the missing parentheses in what you have.)
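
For illustration, the suggested shape might look like the sketch below. It is not taken from the patch: copy_sized() is a hypothetical, non-atomic stand-in for read_atomic_size(), and read_sized_val() is a made-up name. The sketch relies on GNU statement expressions and __typeof__, as the original macro does.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for read_atomic_size(): a plain byte copy. */
static inline void copy_sized(const volatile void *p, void *res,
                              unsigned int size)
{
    memcpy(res, (const void *)p, size);
}

/*
 * The byte array is dimensioned to the pointed-to object, so writes
 * through x_.c stay within a properly sized member. The macro argument
 * is parenthesized on every use.
 */
#define read_sized_val(p) ({                                    \
    union { __typeof__(*(p)) val; char c[sizeof(*(p))]; } x_;   \
    copy_sized(p, x_.c, sizeof(*(p)));                          \
    x_.val;                                                     \
})
```

With the parentheses in place, an argument such as `base + 1` expands to `*(base + 1)` rather than `*base + 1`.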

> +    x_.val;                                             \
> +})
> +
> +#define write_atomic(p, x)                              \
> +({                                                      \
> +    typeof(*p) x__ = (x);                               \
> +    switch ( sizeof(*p) )                               \
> +    {                                                   \
> +    case 1: writeb((uint8_t)x__,  p); break;            \
> +    case 2: writew((uint16_t)x__, p); break;            \
> +    case 4: writel((uint32_t)x__, p); break;            \
> +    case 8: writeq((uint64_t)x__, p); break;            \

Are the casts actually necessary here?

> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +    x__;                                                \
> +})
> +
> +#define add_sized(p, x)                                 \
> +({                                                      \
> +    typeof(*(p)) x__ = (x);                             \
> +    switch ( sizeof(*(p)) )                             \
> +    {                                                   \
> +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> +    case 2: writew(read_atomic(p) + x__, p); break;     \
> +    case 4: writel(read_atomic(p) + x__, p); break;     \
> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +})
> +
> +/*
> + *  __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
> + *               non-scalar types unchanged.
> + *
> + * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char'
> + * is not type-compatible with 'signed char', and we define a separate case.
> + */
> +#define __scalar_type_to_expr_cases(type)               \
> +    unsigned type:  (unsigned type)0,                   \
> +    signed type:    (signed type)0
> +
> +#define __unqual_scalar_typeof(x) typeof(               \
> +    _Generic((x),                                       \
> +        char:  (char)0,                                 \
> +        __scalar_type_to_expr_cases(char),              \
> +        __scalar_type_to_expr_cases(short),             \
> +        __scalar_type_to_expr_cases(int),               \
> +        __scalar_type_to_expr_cases(long),              \
> +        __scalar_type_to_expr_cases(long long),         \
> +        default: (x)))

This isn't RISC-V specific, is it? In which case it wants moving to,
perhaps, xen/macros.h (and then also have the leading underscores
dropped).

> +#define READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> +#define WRITE_ONCE(x, val)                                      \
> +    do {                                                        \
> +        *(volatile typeof(x) *)&(x) = (val);                    \
> +    } while (0)

In Xen we use ACCESS_ONCE(); any reason you need to introduce
{READ,WRITE}_ONCE() in addition? Without them, __unqual_scalar_typeof()
may then also not be needed (or, if there's a need to enhance it, may
then be needed for ACCESS_ONCE()). Which in turn raises the question
why only READ_ONCE() uses it here.
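
A sketch of how the two accessors might read with ACCESS_ONCE() alone. The macro definition below is an assumption about Xen's existing one and should be checked against the tree; atomic_t is redeclared here only to keep the sketch self-contained.

```c
/* Assumed to be equivalent to Xen's existing ACCESS_ONCE(). */
#define ACCESS_ONCE(x) (*(volatile __typeof__(x) *)&(x))

typedef struct { int counter; } atomic_t;

static inline int atomic_read(const atomic_t *v)
{
    /* One volatile load; the const qualifier is carried by typeof. */
    return ACCESS_ONCE(v->counter);
}

static inline void atomic_set(atomic_t *v, int i)
{
    /* One volatile store. */
    ACCESS_ONCE(v->counter) = i;
}
```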

> +#define __atomic_acquire_fence() \
> +    __asm__ __volatile__( RISCV_ACQUIRE_BARRIER "" ::: "memory" )

Missing blank here and ...

> +#define __atomic_release_fence() \
> +    __asm__ __volatile__( RISCV_RELEASE_BARRIER "" ::: "memory" );

... here, and stray semicolon additionally just here.
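
The two style points can be sketched as follows. The barrier strings are copied from the quoted fence.h; the empty fallback for non-RISC-V hosts and the publish() caller are hypothetical and exist only so the sketch builds elsewhere (where the macros degrade to compiler barriers).

```c
#ifdef __riscv
# define RISCV_ACQUIRE_BARRIER   "\tfence r , rw\n"
# define RISCV_RELEASE_BARRIER   "\tfence rw,  w\n"
#else
/* Assumption for off-target builds: compiler barrier only. */
# define RISCV_ACQUIRE_BARRIER   ""
# define RISCV_RELEASE_BARRIER   ""
#endif

/* Blank before the opening parenthesis, no trailing semicolon. */
#define __atomic_acquire_fence() \
    __asm__ __volatile__ ( RISCV_ACQUIRE_BARRIER "" ::: "memory" )

#define __atomic_release_fence() \
    __asm__ __volatile__ ( RISCV_RELEASE_BARRIER "" ::: "memory" )

/*
 * Hypothetical caller: with a semicolon inside the macro body, the
 * "else" below would be detached from its "if" and fail to compile.
 */
static inline int publish(int *flag, int release)
{
    if ( release )
        __atomic_release_fence();
    else
        __atomic_acquire_fence();

    *flag = 1;

    return *flag;
}
```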

> +static inline int atomic_read(const atomic_t *v)
> +{
> +    return READ_ONCE(v->counter);
> +}
> +
> +static inline int _atomic_read(atomic_t v)
> +{
> +    return v.counter;
> +}
> +
> +static inline void atomic_set(atomic_t *v, int i)
> +{
> +    WRITE_ONCE(v->counter, i);
> +}
> +
> +static inline void _atomic_set(atomic_t *v, int i)
> +{
> +    v->counter = i;
> +}
> +
> +static inline int atomic_sub_and_test(int i, atomic_t *v)
> +{
> +    return atomic_sub_return(i, v) == 0;
> +}
> +
> +static inline void atomic_inc(atomic_t *v)
> +{
> +    atomic_add(1, v);
> +}
> +
> +static inline int atomic_inc_return(atomic_t *v)
> +{
> +    return atomic_add_return(1, v);
> +}
> +
> +static inline void atomic_dec(atomic_t *v)
> +{
> +    atomic_sub(1, v);
> +}
> +
> +static inline int atomic_dec_return(atomic_t *v)
> +{
> +    return atomic_sub_return(1, v);
> +}
> +
> +static inline int atomic_dec_and_test(atomic_t *v)
> +{
> +    return atomic_sub_return(1, v) == 0;
> +}
> +
> +static inline int atomic_add_negative(int i, atomic_t *v)
> +{
> +    return atomic_add_return(i, v) < 0;
> +}
> +
> +static inline int atomic_inc_and_test(atomic_t *v)
> +{
> +    return atomic_add_return(1, v) == 0;
> +}

None of these look RISC-V-specific. Perhaps worth having something in
asm-generic/ that can be utilized here?
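
To make the point concrete, a sketch of how such a shared header might be layered. The two primitives are implemented with compiler builtins purely as stand-ins for the per-architecture versions; the derived helpers are the ones quoted above and depend on nothing else.

```c
typedef struct { int counter; } atomic_t;

/* Stand-ins for the arch-provided primitives (assumption: builtins). */
static inline int atomic_add_return(int i, atomic_t *v)
{
    return __atomic_add_fetch(&v->counter, i, __ATOMIC_SEQ_CST);
}

static inline int atomic_sub_return(int i, atomic_t *v)
{
    return __atomic_sub_fetch(&v->counter, i, __ATOMIC_SEQ_CST);
}

/* Candidates for a generic header: no architecture knowledge needed. */
static inline int atomic_inc_return(atomic_t *v)
{
    return atomic_add_return(1, v);
}

static inline int atomic_dec_and_test(atomic_t *v)
{
    return atomic_sub_return(1, v) == 0;
}

static inline int atomic_add_negative(int i, atomic_t *v)
{
    return atomic_add_return(i, v) < 0;
}
```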

> +/*
> + * First, the atomic ops that have no ordering constraints and therefor don't
> + * have the AQ or RL bits set.  These don't return anything, so there's only
> + * one version to worry about.
> + */
> +#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
> +static inline                                               \
> +void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
> +{                                                           \
> +    __asm__ __volatile__ (                                  \
> +        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
> +        : "+A" (v->counter)                                 \
> +        : "r" (I)                                           \
> +        : "memory" );                                       \
> +}                                                           \
> +
> +#define ATOMIC_OPS(op, asm_op, I)                           \
> +        ATOMIC_OP (op, asm_op, I, w, int,   )

So the last three parameters are to be ready to also support
atomic64, without actually doing so right now?
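
For reference, a sketch of how the extra parameters would be used if a 64-bit flavour were instantiated. This is not part of the patch: atomic64_t is hypothetical here, the asm_op/asm_type parameters are dropped, and a compiler builtin stands in for the amoadd.{w,d} instruction so the sketch is not tied to RISC-V.

```c
#include <stdint.h>

typedef struct { int counter; } atomic_t;
typedef struct { int64_t counter; } atomic64_t; /* hypothetical */

#define ATOMIC_OP(op, I, c_type, prefix)                        \
static inline                                                   \
void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v)     \
{                                                               \
    __atomic_fetch_add(&v->counter, I, __ATOMIC_RELAXED);       \
}

/* An empty prefix yields atomic_<op>(); "64" yields atomic64_<op>(). */
#define ATOMIC_OPS(op, I)                   \
        ATOMIC_OP(op, I, int,       )       \
        ATOMIC_OP(op, I, int64_t, 64)

ATOMIC_OPS(add,  i)
ATOMIC_OPS(sub, -i)

#undef ATOMIC_OP
#undef ATOMIC_OPS
```

As in the patch, subtraction is expressed as an addition of the negated operand.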

> +ATOMIC_OPS(add, add,  i)
> +ATOMIC_OPS(sub, add, -i)
> +ATOMIC_OPS(and, and,  i)
> +ATOMIC_OPS( or,  or,  i)
> +ATOMIC_OPS(xor, xor,  i)
> +
> +#undef ATOMIC_OP
> +#undef ATOMIC_OPS
> +
> +/*
> + * Atomic ops that have ordered, relaxed, acquire, and release variants.
> + * There's two flavors of these: the arithmatic ops have both fetch and return
> + * versions, while the logical ops only have fetch versions.
> + */

I'm somewhat confused by the comment: It first talks of 4 variants, but
then says there are only 2 (arithmetic) or 1 (logical) ones.

> +#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type, prefix)    \
> +static inline                                                       \
> +c_type atomic##prefix##_fetch_##op##_relaxed(c_type i,              \
> +                         atomic##prefix##_t *v)                     \
> +{                                                                   \
> +    register c_type ret;                                            \
> +    __asm__ __volatile__ (                                          \
> +        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
> +        : "+A" (v->counter), "=r" (ret)                             \
> +        : "r" (I)                                                   \
> +        : "memory" );                                               \
> +    return ret;                                                     \
> +}                                                                   \
> +static inline                                                       \
> +c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v) \
> +{                                                                   \
> +    register c_type ret;                                            \
> +    __asm__ __volatile__ (                                          \
> +        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2, %0"          \
> +        : "+A" (v->counter), "=r" (ret)                             \
> +        : "r" (I)                                                   \
> +        : "memory" );                                               \
> +    return ret;                                                     \
> +}
> +
> +#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type, prefix) \
> +static inline                                                           \
> +c_type atomic##prefix##_##op##_return_relaxed(c_type i,                 \
> +                          atomic##prefix##_t *v)                        \
> +{                                                                       \
> +        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op I;      \
> +}                                                                       \
> +static inline                                                           \
> +c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t *v)  \
> +{                                                                       \
> +        return atomic##prefix##_fetch_##op(i, v) c_op I;                \
> +}
> +
> +#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
> +        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
> +        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
> +
> +ATOMIC_OPS(add, add, +,  i)
> +ATOMIC_OPS(sub, add, +, -i)
> +
> +#define atomic_add_return_relaxed   atomic_add_return_relaxed
> +#define atomic_sub_return_relaxed   atomic_sub_return_relaxed
> +#define atomic_add_return   atomic_add_return
> +#define atomic_sub_return   atomic_sub_return
> +
> +#define atomic_fetch_add_relaxed    atomic_fetch_add_relaxed
> +#define atomic_fetch_sub_relaxed    atomic_fetch_sub_relaxed
> +#define atomic_fetch_add    atomic_fetch_add
> +#define atomic_fetch_sub    atomic_fetch_sub

What are all of these #define-s (and yet more further down) about?
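
As background, and only as far as the Linux origin of this header goes: there, a self-referential #define marks an operation as provided by the architecture, so that generic fallback headers can test for it with #ifndef. A sketch of that pattern follows, with a compiler builtin standing in for the amoadd.w; whether Xen has any consumer for such markers is exactly the open question.

```c
typedef struct { int counter; } atomic_t;

/* "Arch" part: implements one operation and advertises it. */
static inline int atomic_fetch_add_relaxed(int i, atomic_t *v)
{
    return __atomic_fetch_add(&v->counter, i, __ATOMIC_RELAXED);
}
#define atomic_fetch_add_relaxed atomic_fetch_add_relaxed

/* "Generic" part: supplies only what the arch did not define. */
#ifndef atomic_fetch_sub_relaxed
static inline int atomic_fetch_sub_relaxed(int i, atomic_t *v)
{
    return atomic_fetch_add_relaxed(-i, v);
}
#define atomic_fetch_sub_relaxed atomic_fetch_sub_relaxed
#endif
```

Without a generic layer performing such #ifndef checks, the defines have no effect.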

> +static inline int atomic_sub_if_positive(atomic_t *v, int offset)
> +{
> +       int prev, rc;
> +
> +    __asm__ __volatile__ (
> +        "0: lr.w     %[p],  %[c]\n"
> +        "   sub      %[rc], %[p], %[o]\n"
> +        "   bltz     %[rc], 1f\n"
> +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> +        "   bnez     %[rc], 0b\n"
> +        "   fence    rw, rw\n"
> +        "1:\n"
> +        : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> +        : [o]"r" (offset)

Nit: Blanks please between ] and ".

> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/fence.h
> @@ -0,0 +1,8 @@
> +/* SPDX-License-Identifier: GPL-2.0-or-later */
> +#ifndef _ASM_RISCV_FENCE_H
> +#define _ASM_RISCV_FENCE_H
> +
> +#define RISCV_ACQUIRE_BARRIER   "\tfence r , rw\n"
> +#define RISCV_RELEASE_BARRIER   "\tfence rw,  w\n"

Seeing that another "fence rw, rw" appears in this patch, I'm now pretty
sure you want to add e.g. RISCV_FULL_BARRIER here as well.
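
A sketch of the suggested addition, assuming the macro simply names the "fence rw, rw" string already used in atomic_sub_if_positive(). The non-RISC-V fallback and the load_after_full_barrier() helper are hypothetical and only keep the sketch buildable on other hosts.

```c
#ifdef __riscv
# define RISCV_FULL_BARRIER      "\tfence rw, rw\n"
#else
/* Assumption for off-target builds: compiler barrier only. */
# define RISCV_FULL_BARRIER      ""
#endif

/* Hypothetical user: order all earlier accesses before the load. */
static inline int load_after_full_barrier(const int *p)
{
    __asm__ __volatile__ ( RISCV_FULL_BARRIER "" ::: "memory" );

    return *(const volatile int *)p;
}
```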

Jan

Oleksii Kurochko Feb. 14, 2024, 12:11 p.m. UTC | #2

On Tue, 2024-02-13 at 12:36 +0100, Jan Beulich wrote:
> On 05.02.2024 16:32, Oleksii Kurochko wrote:
> > From: Bobby Eshleman <bobbyeshleman@gmail.com>
> > 
> > Additionally, this patch introduces macros in fence.h,
> > which are utilized in atomic.h.
> 
> These are used in an earlier patch already, so either you want to
> re-order the series, or you want to move that introduction ahead.
> 
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/atomic.h
> > @@ -0,0 +1,395 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Taken and modified from Linux.
> > + *
> > + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated to use
> > + * __*xchg_generic()
> > + * 
> > + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> > + * Copyright (C) 2012 Regents of the University of California
> > + * Copyright (C) 2017 SiFive
> > + * Copyright (C) 2021 Vates SAS
> > + */
> > +
> > +#ifndef _ASM_RISCV_ATOMIC_H
> > +#define _ASM_RISCV_ATOMIC_H
> > +
> > +#include <xen/atomic.h>
> > +#include <asm/cmpxchg.h>
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +void __bad_atomic_size(void);
> > +
> > +static always_inline void read_atomic_size(const volatile void *p,
> > +                                           void *res,
> > +                                           unsigned int size)
> > +{
> > +    switch ( size )
> > +    {
> > +    case 1: *(uint8_t *)res = readb(p); break;
> > +    case 2: *(uint16_t *)res = readw(p); break;
> > +    case 4: *(uint32_t *)res = readl(p); break;
> > +    case 8: *(uint32_t *)res  = readq(p); break;
> 
> Why is it the MMIO primitives you use here, i.e. not read<X>_cpu()?
> It's RAM you're accessing after all.
This is a legacy from the Linux kernel. For some reason they wanted to
have ordered read/write access.

> 
> Also - no CONFIG_64BIT conditional here (like you have in the other
> patch)?
Agree, it should be added.

> 
> > +    default: __bad_atomic_size(); break;
> > +    }
> > +}
> > +
> > +#define read_atomic(p) ({                               \
> > +    union { typeof(*p) val; char c[0]; } x_;            \
> > +    read_atomic_size(p, x_.c, sizeof(*p));              \
> 
> I'll be curious for how much longer gcc will tolerate this accessing
> of a zero-length array, without issuing at least a warning. I'd
> recommend using sizeof(*(p)) as the array dimension right away. (From
> this note also the missing parentheses in what you have.)
Thanks. I'll update that.

> 
> > +    x_.val;                                             \
> > +})
> > +
> > +#define write_atomic(p, x)                              \
> > +({                                                      \
> > +    typeof(*p) x__ = (x);                               \
> > +    switch ( sizeof(*p) )                               \
> > +    {                                                   \
> > +    case 1: writeb((uint8_t)x__,  p); break;            \
> > +    case 2: writew((uint16_t)x__, p); break;            \
> > +    case 4: writel((uint32_t)x__, p); break;            \
> > +    case 8: writeq((uint64_t)x__, p); break;            \
> 
> Are the casts actually necessary here?
Not really, we can drop them.

> 
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +    x__;                                                \
> > +})
> > +
> > +#define add_sized(p, x)                                 \
> > +({                                                      \
> > +    typeof(*(p)) x__ = (x);                             \
> > +    switch ( sizeof(*(p)) )                             \
> > +    {                                                   \
> > +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> > +    case 2: writew(read_atomic(p) + x__, p); break;     \
> > +    case 4: writel(read_atomic(p) + x__, p); break;     \
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +})
> > +
> > +/*
> > + *  __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
> > + *               non-scalar types unchanged.
> > + *
> > + * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char'
> > + * is not type-compatible with 'signed char', and we define a separate case.
> > + */
> > +#define __scalar_type_to_expr_cases(type)               \
> > +    unsigned type:  (unsigned type)0,                   \
> > +    signed type:    (signed type)0
> > +
> > +#define __unqual_scalar_typeof(x) typeof(               \
> > +    _Generic((x),                                       \
> > +        char:  (char)0,                                 \
> > +        __scalar_type_to_expr_cases(char),              \
> > +        __scalar_type_to_expr_cases(short),             \
> > +        __scalar_type_to_expr_cases(int),               \
> > +        __scalar_type_to_expr_cases(long),              \
> > +        __scalar_type_to_expr_cases(long long),         \
> > +        default: (x)))
> 
> This isn't RISC-V specific, is it? In which case it wants moving to,
> perhaps, xen/macros.h (and then also have the leading underscores
> dropped).
No, not at all. It is currently only used in the RISC-V part, but if it
would be better to move it to xen/macros.h, I will be happy to send a
separate patch.

> 
> > +#define READ_ONCE(x)  (*(const volatile __unqual_scalar_typeof(x) *)&(x))
> > +#define WRITE_ONCE(x, val)                                      \
> > +    do {                                                        \
> > +        *(volatile typeof(x) *)&(x) = (val);                    \
> > +    } while (0)
> 
> In Xen we use ACCESS_ONCE(); any reason you need to introduce
> {READ,WRITE}_ONCE() in addition? Without them, __unqual_scalar_typeof()
> may then also not be needed (or, if there's a need to enhance it, may
> then be needed for ACCESS_ONCE()). Which in turn raises the question
> why only READ_ONCE() uses it here.
Hmm, READ_ONCE() and WRITE_ONCE() can be dropped then, I'll switch
everything in my code to ACCESS_ONCE().

> 
> > +#define __atomic_acquire_fence() \
> > +    __asm__ __volatile__( RISCV_ACQUIRE_BARRIER "" ::: "memory" )
> 
> Missing blank here and ...
> 
> > +#define __atomic_release_fence() \
> > +    __asm__ __volatile__( RISCV_RELEASE_BARRIER "" ::: "memory" );
> 
> ... here, and stray semicolon additionally just here.
Thanks. I'll apply your comments to this part of code.

> 
> > +static inline int atomic_read(const atomic_t *v)
> > +{
> > +    return READ_ONCE(v->counter);
> > +}
> > +
> > +static inline int _atomic_read(atomic_t v)
> > +{
> > +    return v.counter;
> > +}
> > +
> > +static inline void atomic_set(atomic_t *v, int i)
> > +{
> > +    WRITE_ONCE(v->counter, i);
> > +}
> > +
> > +static inline void _atomic_set(atomic_t *v, int i)
> > +{
> > +    v->counter = i;
> > +}
> > +
> > +static inline int atomic_sub_and_test(int i, atomic_t *v)
> > +{
> > +    return atomic_sub_return(i, v) == 0;
> > +}
> > +
> > +static inline void atomic_inc(atomic_t *v)
> > +{
> > +    atomic_add(1, v);
> > +}
> > +
> > +static inline int atomic_inc_return(atomic_t *v)
> > +{
> > +    return atomic_add_return(1, v);
> > +}
> > +
> > +static inline void atomic_dec(atomic_t *v)
> > +{
> > +    atomic_sub(1, v);
> > +}
> > +
> > +static inline int atomic_dec_return(atomic_t *v)
> > +{
> > +    return atomic_sub_return(1, v);
> > +}
> > +
> > +static inline int atomic_dec_and_test(atomic_t *v)
> > +{
> > +    return atomic_sub_return(1, v) == 0;
> > +}
> > +
> > +static inline int atomic_add_negative(int i, atomic_t *v)
> > +{
> > +    return atomic_add_return(i, v) < 0;
> > +}
> > +
> > +static inline int atomic_inc_and_test(atomic_t *v)
> > +{
> > +    return atomic_add_return(1, v) == 0;
> > +}
> 
> None of these look RISC-V-specific. Perhaps worth having something in
> asm-generic/ that can be utilized here?
Looks like we can; at least PPC has similar definitions.

> 
> > +/*
> > + * First, the atomic ops that have no ordering constraints and therefor don't
> > + * have the AQ or RL bits set.  These don't return anything, so there's only
> > + * one version to worry about.
> > + */
> > +#define ATOMIC_OP(op, asm_op, I, asm_type, c_type, prefix)  \
> > +static inline                                               \
> > +void atomic##prefix##_##op(c_type i, atomic##prefix##_t *v) \
> > +{                                                           \
> > +    __asm__ __volatile__ (                                  \
> > +        "   amo" #asm_op "." #asm_type " zero, %1, %0"      \
> > +        : "+A" (v->counter)                                 \
> > +        : "r" (I)                                           \
> > +        : "memory" );                                       \
> > +}                                                           \
> > +
> > +#define ATOMIC_OPS(op, asm_op, I)                           \
> > +        ATOMIC_OP (op, asm_op, I, w, int,   )
> 
> So the last three parameters are to be ready to also support
> atomic64, without actually doing so right now?
Yes, it is ready to support atomic64.

> 
> > +ATOMIC_OPS(add, add,  i)
> > +ATOMIC_OPS(sub, add, -i)
> > +ATOMIC_OPS(and, and,  i)
> > +ATOMIC_OPS( or,  or,  i)
> > +ATOMIC_OPS(xor, xor,  i)
> > +
> > +#undef ATOMIC_OP
> > +#undef ATOMIC_OPS
> > +
> > +/*
> > + * Atomic ops that have ordered, relaxed, acquire, and release variants.
> > + * There's two flavors of these: the arithmatic ops have both fetch and return
> > + * versions, while the logical ops only have fetch versions.
> > + */
> 
> I'm somewhat confused by the comment: It first talks of 4 variants, but
> then says there are only 2 (arithmetic) or 1 (logical) ones.
Probably they mean that there are usually 4 variants, but only 2
(arithmetic) and 1 (logical) were implemented.

> 
> > +#define ATOMIC_FETCH_OP(op, asm_op, I, asm_type, c_type, prefix)    \
> > +static inline                                                       \
> > +c_type atomic##prefix##_fetch_##op##_relaxed(c_type i,              \
> > +                         atomic##prefix##_t *v)                     \
> > +{                                                                   \
> > +    register c_type ret;                                            \
> > +    __asm__ __volatile__ (                                          \
> > +        "   amo" #asm_op "." #asm_type " %1, %2, %0"                \
> > +        : "+A" (v->counter), "=r" (ret)                             \
> > +        : "r" (I)                                                   \
> > +        : "memory" );                                               \
> > +    return ret;                                                     \
> > +}                                                                   \
> > +static inline                                                       \
> > +c_type atomic##prefix##_fetch_##op(c_type i, atomic##prefix##_t *v) \
> > +{                                                                   \
> > +    register c_type ret;                                            \
> > +    __asm__ __volatile__ (                                          \
> > +        "   amo" #asm_op "." #asm_type ".aqrl  %1, %2, %0"          \
> > +        : "+A" (v->counter), "=r" (ret)                             \
> > +        : "r" (I)                                                   \
> > +        : "memory" );                                               \
> > +    return ret;                                                     \
> > +}
> > +
> > +#define ATOMIC_OP_RETURN(op, asm_op, c_op, I, asm_type, c_type, prefix) \
> > +static inline                                                           \
> > +c_type atomic##prefix##_##op##_return_relaxed(c_type i,                 \
> > +                          atomic##prefix##_t *v)                        \
> > +{                                                                       \
> > +        return atomic##prefix##_fetch_##op##_relaxed(i, v) c_op I;      \
> > +}                                                                       \
> > +static inline                                                           \
> > +c_type atomic##prefix##_##op##_return(c_type i, atomic##prefix##_t *v)  \
> > +{                                                                       \
> > +        return atomic##prefix##_fetch_##op(i, v) c_op I;                \
> > +}
> > +
> > +#define ATOMIC_OPS(op, asm_op, c_op, I)                                 \
> > +        ATOMIC_FETCH_OP( op, asm_op,       I, w, int,   )               \
> > +        ATOMIC_OP_RETURN(op, asm_op, c_op, I, w, int,   )
> > +
> > +ATOMIC_OPS(add, add, +,  i)
> > +ATOMIC_OPS(sub, add, +, -i)
> > +
> > +#define atomic_add_return_relaxed   atomic_add_return_relaxed
> > +#define atomic_sub_return_relaxed   atomic_sub_return_relaxed
> > +#define atomic_add_return   atomic_add_return
> > +#define atomic_sub_return   atomic_sub_return
> > +
> > +#define atomic_fetch_add_relaxed    atomic_fetch_add_relaxed
> > +#define atomic_fetch_sub_relaxed    atomic_fetch_sub_relaxed
> > +#define atomic_fetch_add    atomic_fetch_add
> > +#define atomic_fetch_sub    atomic_fetch_sub
> 
> What are all of these #define-s (and yet more further down) about?
> 
> > +static inline int atomic_sub_if_positive(atomic_t *v, int offset)
> > +{
> > +       int prev, rc;
> > +
> > +    __asm__ __volatile__ (
> > +        "0: lr.w     %[p],  %[c]\n"
> > +        "   sub      %[rc], %[p], %[o]\n"
> > +        "   bltz     %[rc], 1f\n"
> > +        "   sc.w.rl  %[rc], %[rc], %[c]\n"
> > +        "   bnez     %[rc], 0b\n"
> > +        "   fence    rw, rw\n"
> > +        "1:\n"
> > +        : [p]"=&r" (prev), [rc]"=&r" (rc), [c]"+A" (v->counter)
> > +        : [o]"r" (offset)
> 
> Nit: Blanks please between ] and ".
Thanks. I'll update that.

> 
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/fence.h
> > @@ -0,0 +1,8 @@
> > +/* SPDX-License-Identifier: GPL-2.0-or-later */
> > +#ifndef _ASM_RISCV_FENCE_H
> > +#define _ASM_RISCV_FENCE_H
> > +
> > +#define RISCV_ACQUIRE_BARRIER   "\tfence r , rw\n"
> > +#define RISCV_RELEASE_BARRIER   "\tfence rw,  w\n"
> 
> Seeing that another "fence rw, rw" appears in this patch, I'm now pretty
> sure you want to add e.g. RISCV_FULL_BARRIER here as well.
It makes sense. I'll do that. Thanks.
> 
> Jan

Jan Beulich Feb. 14, 2024, 1:09 p.m. UTC | #3

On 14.02.2024 13:11, Oleksii wrote:
> On Tue, 2024-02-13 at 12:36 +0100, Jan Beulich wrote:
>> On 05.02.2024 16:32, Oleksii Kurochko wrote:
>>> --- /dev/null
>>> +++ b/xen/arch/riscv/include/asm/atomic.h
>>> @@ -0,0 +1,395 @@
>>> +/* SPDX-License-Identifier: GPL-2.0-only */
>>> +/*
>>> + * Taken and modified from Linux.
>>> + *
>>> + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated to use
>>> + * __*xchg_generic()
>>> + * 
>>> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
>>> + * Copyright (C) 2012 Regents of the University of California
>>> + * Copyright (C) 2017 SiFive
>>> + * Copyright (C) 2021 Vates SAS
>>> + */
>>> +
>>> +#ifndef _ASM_RISCV_ATOMIC_H
>>> +#define _ASM_RISCV_ATOMIC_H
>>> +
>>> +#include <xen/atomic.h>
>>> +#include <asm/cmpxchg.h>
>>> +#include <asm/fence.h>
>>> +#include <asm/io.h>
>>> +#include <asm/system.h>
>>> +
>>> +void __bad_atomic_size(void);
>>> +
>>> +static always_inline void read_atomic_size(const volatile void *p,
>>> +                                           void *res,
>>> +                                           unsigned int size)
>>> +{
>>> +    switch ( size )
>>> +    {
>>> +    case 1: *(uint8_t *)res = readb(p); break;
>>> +    case 2: *(uint16_t *)res = readw(p); break;
>>> +    case 4: *(uint32_t *)res = readl(p); break;
>>> +    case 8: *(uint32_t *)res  = readq(p); break;
>>
>> Why is it the MMIO primitives you use here, i.e. not read<X>_cpu()?
>> It's RAM you're accessing after all.
> Legacy from Linux kernel. For some reason they wanted to have ordered
> read/write access.

Wants expressing in a comment then, or at the very least in the patch
description.

Jan
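
For comparison, a rough sketch of a read path built on plain, unordered loads. It assumes that read<X>_cpu() amounts to a single volatile load with no fence, which this thread does not confirm, so ordinary volatile accesses stand in for those helpers. The names carry a _sketch suffix to mark them as illustrative, and the default case traps at run time rather than relying on the link-time __bad_atomic_size() trick.

```c
#include <stdint.h>
#include <stdlib.h>

/* Sketch: size-dispatched load using plain volatile accesses (no fences). */
static inline void read_atomic_size_sketch(const volatile void *p,
                                           void *res,
                                           unsigned int size)
{
    switch ( size )
    {
    case 1: *(uint8_t *)res  = *(const volatile uint8_t *)p;  break;
    case 2: *(uint16_t *)res = *(const volatile uint16_t *)p; break;
    case 4: *(uint32_t *)res = *(const volatile uint32_t *)p; break;
    /* The 8-byte case stores through a uint64_t pointer. */
    case 8: *(uint64_t *)res = *(const volatile uint64_t *)p; break;
    default: abort(); /* stand-in for __bad_atomic_size() */
    }
}

/* Sketch: typed wrapper; the union gives the buffer the right alignment. */
#define read_atomic_sketch(p) ({                              \
    union { __typeof__(*(p)) val; char c[sizeof(*(p))]; } x_; \
    read_atomic_size_sketch(p, x_.c, sizeof(*(p)));           \
    x_.val;                                                   \
})
```

If the ordered MMIO accessors are kept instead, the comment asked for above would sit next to the switch and say why ordering is wanted for RAM accesses.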

Julien Grall Feb. 18, 2024, 7:22 p.m. UTC | #4

Hi,

On 05/02/2024 15:32, Oleksii Kurochko wrote:
> From: Bobby Eshleman <bobbyeshleman@gmail.com>
> 
> Additionally, this patch introduces macros in fence.h,
> which are utilized in atomic.h.
> 
> atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n)
> were updated to use __*xchg_generic().
> 
> Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>

The author is Bobby, but I don't see a Signed-off-by. Did you forget it?

> ---
> Changes in V4:
>   - do changes related to the updates of [PATCH v3 13/34] xen/riscv: introduce cmpxchg.h
>   - drop casts in read_atomic_size(), write_atomic(), add_sized()
>   - tabs -> spaces
>   - drop #ifdef CONFIG_SMP ... #endif in fence.h as it is simpler to handle NR_CPUS=1
>     the same as NR_CPUS>1 with accepting less than ideal performance.
> ---
> Changes in V3:
>    - update the commit message
>    - add SPDX for fence.h
>    - code style fixes
>    - Remove /* TODO: ... */ for add_sized macros. It looks correct to me.
>    - re-order the patch
>    - merge to this patch fence.h
> ---
> Changes in V2:
>   - Change an author of commit. I got this header from Bobby's old repo.
> ---
>   xen/arch/riscv/include/asm/atomic.h | 395 ++++++++++++++++++++++++++++
>   xen/arch/riscv/include/asm/fence.h  |   8 +
>   2 files changed, 403 insertions(+)
>   create mode 100644 xen/arch/riscv/include/asm/atomic.h
>   create mode 100644 xen/arch/riscv/include/asm/fence.h
> 
> diff --git a/xen/arch/riscv/include/asm/atomic.h b/xen/arch/riscv/include/asm/atomic.h
> new file mode 100644
> index 0000000000..267d3c0803
> --- /dev/null
> +++ b/xen/arch/riscv/include/asm/atomic.h
> @@ -0,0 +1,395 @@
> +/* SPDX-License-Identifier: GPL-2.0-only */
> +/*
> + * Taken and modified from Linux.

Which version of Linux? Can you also spell out what the big changes are?
This would be helpful if we need to re-sync.

> + *
> + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated to use
> + * __*xchg_generic()
> + *
> + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> + * Copyright (C) 2012 Regents of the University of California
> + * Copyright (C) 2017 SiFive
> + * Copyright (C) 2021 Vates SAS
> + */
> +
> +#ifndef _ASM_RISCV_ATOMIC_H
> +#define _ASM_RISCV_ATOMIC_H
> +
> +#include <xen/atomic.h>
> +#include <asm/cmpxchg.h>
> +#include <asm/fence.h>
> +#include <asm/io.h>
> +#include <asm/system.h>
> +
> +void __bad_atomic_size(void);
> +
> +static always_inline void read_atomic_size(const volatile void *p,
> +                                           void *res,
> +                                           unsigned int size)
> +{
> +    switch ( size )
> +    {
> +    case 1: *(uint8_t *)res = readb(p); break;
> +    case 2: *(uint16_t *)res = readw(p); break;
> +    case 4: *(uint32_t *)res = readl(p); break;
> +    case 8: *(uint32_t *)res  = readq(p); break;
> +    default: __bad_atomic_size(); break;
> +    }
> +}
> +
> +#define read_atomic(p) ({                               \
> +    union { typeof(*p) val; char c[0]; } x_;            \
> +    read_atomic_size(p, x_.c, sizeof(*p));              \
> +    x_.val;                                             \
> +})
> +
> +#define write_atomic(p, x)                              \
> +({                                                      \
> +    typeof(*p) x__ = (x);                               \
> +    switch ( sizeof(*p) )                               \
> +    {                                                   \
> +    case 1: writeb((uint8_t)x__,  p); break;            \
> +    case 2: writew((uint16_t)x__, p); break;            \
> +    case 4: writel((uint32_t)x__, p); break;            \
> +    case 8: writeq((uint64_t)x__, p); break;            \
> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +    x__;                                                \
> +})
> +
> +#define add_sized(p, x)                                 \
> +({                                                      \
> +    typeof(*(p)) x__ = (x);                             \
> +    switch ( sizeof(*(p)) )                             \
> +    {                                                   \
> +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> +    case 2: writew(read_atomic(p) + x__, p); break;     \
> +    case 4: writel(read_atomic(p) + x__, p); break;     \
> +    default: __bad_atomic_size(); break;                \
> +    }                                                   \
> +})
> +
> +/*
> + *  __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
> + *               non-scalar types unchanged.
> + *
> + * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char'

Xen is technically built using c99/gnu99, so it feels a bit odd to
introduce a C11 feature. I see that _Generic is already used in PPC...
However, if we decide to add more uses of it, then I think this should at
minimum be documented in docs/misra/C-language-toolchain.rst (all the more
so if the macro is to be moved to common code, as Jan suggested).

Cheers,
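
For readers unfamiliar with the construct under discussion, a minimal sketch of a _Generic-based helper follows. It is modelled on the Linux macro of the same name; the body in the patch is cut off in the quote above, so this is an approximation and not the exact code under review. __typeof__ is the GNU spelling, so the sketch needs gnu99 plus _Generic support or later.

```c
/* Sketch: map each scalar type to a zero value of the unqualified type. */
#define __scalar_type_to_expr_cases(type) \
    unsigned type: (unsigned type)0,      \
    signed type:   (signed type)0

/*
 * _Generic selects on the type of (x) after lvalue conversion, which
 * drops const/volatile; non-scalar types fall through to 'default'
 * and keep their original type.
 */
#define __unqual_scalar_typeof(x) __typeof__(        \
    _Generic((x),                                    \
             char: (char)0,                          \
             __scalar_type_to_expr_cases(char),      \
             __scalar_type_to_expr_cases(short),     \
             __scalar_type_to_expr_cases(int),       \
             __scalar_type_to_expr_cases(long),      \
             __scalar_type_to_expr_cases(long long), \
             default: (x)))
```

The point of the helper is that a temporary declared with it can be written to even when the source object is const or volatile, which plain typeof() would not allow.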

Oleksii Kurochko Feb. 19, 2024, 2:35 p.m. UTC | #5

Hi Julien,

On Sun, 2024-02-18 at 19:22 +0000, Julien Grall wrote:
> Hi,
> 
> On 05/02/2024 15:32, Oleksii Kurochko wrote:
> > From: Bobby Eshleman <bobbyeshleman@gmail.com>
> > 
> > Additionally, this patch introduces macros in fence.h,
> > which are utilized in atomic.h.
> > 
> > atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n)
> > were updated to use __*xchg_generic().
> > 
> > Signed-off-by: Oleksii Kurochko <oleksii.kurochko@gmail.com>
> 
> > The author is Bobby, but I don't see a Signed-off-by. Did you forget
> > it?
I missed adding that, as I thought it would be enough to change the
commit author.

> 
> > ---
> > Changes in V4:
> >   - do changes related to the updates of [PATCH v3 13/34] xen/riscv: introduce cmpxchg.h
> >   - drop casts in read_atomic_size(), write_atomic(), add_sized()
> >   - tabs -> spaces
> >   - drop #ifdef CONFIG_SMP ... #endif in fence.h as it is simpler to handle NR_CPUS=1
> >     the same as NR_CPUS>1 with accepting less than ideal performance.
> > ---
> > Changes in V3:
> >    - update the commit message
> >    - add SPDX for fence.h
> >    - code style fixes
> >    - Remove /* TODO: ... */ for add_sized macros. It looks correct to me.
> >    - re-order the patch
> >    - merge to this patch fence.h
> > ---
> > Changes in V2:
> >   - Change an author of commit. I got this header from Bobby's old repo.
> > ---
> >   xen/arch/riscv/include/asm/atomic.h | 395 ++++++++++++++++++++++++++++
> >   xen/arch/riscv/include/asm/fence.h  |   8 +
> >   2 files changed, 403 insertions(+)
> >   create mode 100644 xen/arch/riscv/include/asm/atomic.h
> >   create mode 100644 xen/arch/riscv/include/asm/fence.h
> > 
> > diff --git a/xen/arch/riscv/include/asm/atomic.h b/xen/arch/riscv/include/asm/atomic.h
> > new file mode 100644
> > index 0000000000..267d3c0803
> > --- /dev/null
> > +++ b/xen/arch/riscv/include/asm/atomic.h
> > @@ -0,0 +1,395 @@
> > +/* SPDX-License-Identifier: GPL-2.0-only */
> > +/*
> > + * Taken and modified from Linux.
> 
> Which version of Linux? Can you also spell out what the big changes
> are?
> This would be helpful if we need to re-sync.
Sure, I'll add the changes here.

> 
> > + *
> > + * atomic##prefix##_*xchg_*(atomic##prefix##_t *v, c_t n) were updated to use
> > + * __*xchg_generic()
> > + *
> > + * Copyright (C) 2007 Red Hat, Inc. All Rights Reserved.
> > + * Copyright (C) 2012 Regents of the University of California
> > + * Copyright (C) 2017 SiFive
> > + * Copyright (C) 2021 Vates SAS
> > + */
> > +
> > +#ifndef _ASM_RISCV_ATOMIC_H
> > +#define _ASM_RISCV_ATOMIC_H
> > +
> > +#include <xen/atomic.h>
> > +#include <asm/cmpxchg.h>
> > +#include <asm/fence.h>
> > +#include <asm/io.h>
> > +#include <asm/system.h>
> > +
> > +void __bad_atomic_size(void);
> > +
> > +static always_inline void read_atomic_size(const volatile void *p,
> > +                                           void *res,
> > +                                           unsigned int size)
> > +{
> > +    switch ( size )
> > +    {
> > +    case 1: *(uint8_t *)res = readb(p); break;
> > +    case 2: *(uint16_t *)res = readw(p); break;
> > +    case 4: *(uint32_t *)res = readl(p); break;
> > +    case 8: *(uint32_t *)res  = readq(p); break;
> > +    default: __bad_atomic_size(); break;
> > +    }
> > +}
> > +
> > +#define read_atomic(p) ({                               \
> > +    union { typeof(*p) val; char c[0]; } x_;            \
> > +    read_atomic_size(p, x_.c, sizeof(*p));              \
> > +    x_.val;                                             \
> > +})
> > +
> > +#define write_atomic(p, x)                              \
> > +({                                                      \
> > +    typeof(*p) x__ = (x);                               \
> > +    switch ( sizeof(*p) )                               \
> > +    {                                                   \
> > +    case 1: writeb((uint8_t)x__,  p); break;            \
> > +    case 2: writew((uint16_t)x__, p); break;            \
> > +    case 4: writel((uint32_t)x__, p); break;            \
> > +    case 8: writeq((uint64_t)x__, p); break;            \
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +    x__;                                                \
> > +})
> > +
> > +#define add_sized(p, x)                                 \
> > +({                                                      \
> > +    typeof(*(p)) x__ = (x);                             \
> > +    switch ( sizeof(*(p)) )                             \
> > +    {                                                   \
> > +    case 1: writeb(read_atomic(p) + x__, p); break;     \
> > +    case 2: writew(read_atomic(p) + x__, p); break;     \
> > +    case 4: writel(read_atomic(p) + x__, p); break;     \
> > +    default: __bad_atomic_size(); break;                \
> > +    }                                                   \
> > +})
> > +
> > +/*
> > + *  __unqual_scalar_typeof(x) - Declare an unqualified scalar type, leaving
> > + *               non-scalar types unchanged.
> > + *
> > + * Prefer C11 _Generic for better compile-times and simpler code. Note: 'char'
> 
> Xen is technically built using c99/gnu99, so it feels a bit odd to
> introduce a C11 feature. I see that _Generic is already used in
> PPC...
> However, if we decide to add more uses of it, then I think this should
> at minimum be documented in docs/misra/C-language-toolchain.rst (all
> the more so if the macro is to be moved to common code, as Jan
> suggested).
> 
> Cheers,
>
