Message ID | 20210518183655.1711377-15-philmd@redhat.com (mailing list archive)
---|---
State | New, archived
Series | exec: Add load/store API for aligned pointers
On Tue, 18 May 2021 at 19:38, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>
> When the pointer alignment is known to be safe, we can
> directly swap the data in place, without having to rely
> on the compiler builtin code.
>
> Load/store methods expecting an aligned pointer use the 'a'
> infix. For example, to read a 16-bit unsigned value stored
> in little endianness at an unaligned pointer:
>
>   val = lduw_le_p(&unaligned_ptr);
>
> then to store it in big endianness at an aligned pointer:
>
>   stw_be_ap(&aligned_ptr, val);

It sounded from the bug report as if the desired effect was "this access
is atomic". Nothing in the documentation here makes that guarantee of
the implementation -- it merely imposes an extra requirement on the
caller that the pointer alignment is "safe" (which term it does not
define...), and a valid implementation would be to implement the
"aligned" versions identically to the "unaligned" versions...

Q: should the functions at the bottom of this stack of APIs be using
something from the atomic.h header? If not, why not? Do we need any of
the other atomic primitives?

thanks
-- PMM
On 5/18/21 10:15 PM, Peter Maydell wrote:
> On Tue, 18 May 2021 at 19:38, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
>>
>> When the pointer alignment is known to be safe, we can
>> directly swap the data in place, without having to rely
>> on the compiler builtin code.
>>
>> Load/store methods expecting an aligned pointer use the 'a'
>> infix. For example, to read a 16-bit unsigned value stored
>> in little endianness at an unaligned pointer:
>>
>>   val = lduw_le_p(&unaligned_ptr);
>>
>> then to store it in big endianness at an aligned pointer:
>>
>>   stw_be_ap(&aligned_ptr, val);
>
> It sounded from the bug report as if the desired effect
> was "this access is atomic". Nothing in the documentation here
> makes that guarantee of the implementation -- it merely imposes
> an extra requirement on the caller that the pointer alignment
> is "safe" (which term it does not define...) and a valid
> implementation would be to implement the "aligned" versions
> identically to the "unaligned" versions...
>
> Q: should the functions at the bottom of this stack of APIs
> be using something from the atomic.h header? If not, why not?
> Do we need any of the other atomic primitives?

I'll defer this question to Stefan/Paolo...
On Wed, May 19, 2021 at 07:56:51AM +0200, Philippe Mathieu-Daudé wrote:
> On 5/18/21 10:15 PM, Peter Maydell wrote:
> > On Tue, 18 May 2021 at 19:38, Philippe Mathieu-Daudé <philmd@redhat.com> wrote:
> >>
> >> When the pointer alignment is known to be safe, we can
> >> directly swap the data in place, without having to rely
> >> on the compiler builtin code.
> >>
> >> Load/store methods expecting an aligned pointer use the 'a'
> >> infix. For example, to read a 16-bit unsigned value stored
> >> in little endianness at an unaligned pointer:
> >>
> >>   val = lduw_le_p(&unaligned_ptr);
> >>
> >> then to store it in big endianness at an aligned pointer:
> >>
> >>   stw_be_ap(&aligned_ptr, val);
> >
> > It sounded from the bug report as if the desired effect
> > was "this access is atomic". Nothing in the documentation here
> > makes that guarantee of the implementation -- it merely imposes
> > an extra requirement on the caller that the pointer alignment
> > is "safe" (which term it does not define...) and a valid
> > implementation would be to implement the "aligned" versions
> > identically to the "unaligned" versions...
> >
> > Q: should the functions at the bottom of this stack of APIs
> > be using something from the atomic.h header? If not, why not?
> > Do we need any of the other atomic primitives?
>
> I'll defer this question to Stefan/Paolo...

Strictly speaking, qatomic_read() and qatomic_set() are necessary for
__ATOMIC_RELAXED semantics. The comment in atomic.h mentions this
generally has no effect on generated code. That's probably because
aligned 16/32/64-bit accesses don't tear on modern CPUs, so no special
instructions are needed. This is why not using atomic.h usually works.

Even qatomic_read()/qatomic_set() is too strong, since the doc comment
mentions it prevents the compiler moving other loads/stores past the
atomic operation. The hw/virtio/ code already has explicit memory
barriers and doesn't need the additional compiler barrier.

For safety I suggest using qatomic_read()/qatomic_set().

Stefan
diff --git a/docs/devel/loads-stores.rst b/docs/devel/loads-stores.rst
index 568274baec0..88493ba1293 100644
--- a/docs/devel/loads-stores.rst
+++ b/docs/devel/loads-stores.rst
@@ -13,20 +13,21 @@
 documentation of each API -- for that you should look at the
 documentation comments in the relevant header files.
 
-``ld*_p and st*_p``
-~~~~~~~~~~~~~~~~~~~
+``ld*_[a]p and st*_[a]p``
+~~~~~~~~~~~~~~~~~~~~~~~~~
 
 These functions operate on a host pointer, and should be used
 when you already have a pointer into host memory (corresponding
 to guest ram or a local buffer). They deal with doing accesses
 with the desired endianness and with correctly handling
-potentially unaligned pointer values.
+potentially unaligned pointer values. If the pointer alignment
+is known to be safe, then the aligned functions can be used.
 
 Function names follow the pattern:
 
-load: ``ld{sign}{size}_{endian}_p(ptr)``
+load: ``ld{sign}{size}_{endian}_{aligned}p(ptr)``
 
-store: ``st{size}_{endian}_p(ptr, val)``
+store: ``st{size}_{endian}_{aligned}p(ptr, val)``
 
 ``sign``
  - (empty) : for 32 or 64 bit sizes
@@ -49,24 +50,28 @@
 The ``_{endian}`` infix is omitted for target-endian accesses.
 
 The target endian accessors are only available to source
 files which are built per-target.
 
+By using the ``_{aligned}`` infix, unsafe optimizations might be
+used; however, an unaligned pointer might trigger an exception
+and abort the process.
+
 There are also functions which take the size as an argument:
 
-load: ``ldn{endian}_p(ptr, sz)``
+load: ``ldn{endian}_{aligned}p(ptr, sz)``
 
 which performs an unsigned load of ``sz`` bytes from ``ptr``
 as an ``{endian}`` order value and returns it in a uint64_t.
 
-store: ``stn{endian}_p(ptr, sz, val)``
+store: ``stn{endian}_{aligned}p(ptr, sz, val)``
 
 which stores ``val`` to ``ptr`` as an ``{endian}`` order value
 of size ``sz`` bytes.
 Regexes for git grep
- - ``\<ld[us]\?[bwlq]\(_[hbl]e\)\?_p\>``
- - ``\<st[bwlq]\(_[hbl]e\)\?_p\>``
- - ``\<ldn_\([hbl]e\)?_p\>``
- - ``\<stn_\([hbl]e\)?_p\>``
+ - ``\<ld[us]\?[bwlq]\(_[hbl]e\)\?_a?p\>``
+ - ``\<st[bwlq]\(_[hbl]e\)\?_a?p\>``
+ - ``\<ldn_\([hbl]e\)?_a?p\>``
+ - ``\<stn_\([hbl]e\)?_a?p\>``
 
 ``cpu_{ld,st}*_mmuidx_ra``
 ~~~~~~~~~~~~~~~~~~~~~~~~~~
diff --git a/include/qemu/bswap.h b/include/qemu/bswap.h
index 4cd120ca014..3f272c3cb46 100644
--- a/include/qemu/bswap.h
+++ b/include/qemu/bswap.h
@@ -350,25 +350,47 @@
 static inline void st ## size ## _he_p(void *ptr, vtype v)\
 {\
     __builtin_memcpy(ptr, &v, sizeof(v));\
 }
 
+#define LD_CONVERT_ALIGNED(bits, rtype, vtype, size)\
+static inline rtype ld ## size ## _he_ap(const void *ptr)\
+{\
+    return *(vtype *)ptr;\
+}
+
+#define ST_CONVERT_ALIGNED(bits, vtype, size)\
+static inline void st ## size ## _he_ap(void *ptr, vtype v)\
+{\
+    *(vtype *)ptr = v;\
+}
+
 #define LD_CONVERT_END(endian, bits, rtype, vtype, size)\
 static inline rtype ld ## size ## _ ## endian ## _p(const void *ptr)\
 {\
     return (vtype)glue(endian, _bswap)(ld ## size ## _he_p(ptr), bits);\
+}\
+static inline rtype ld ## size ## _ ## endian ## _ap(const void *ptr)\
+{\
+    return (vtype)glue(endian, _bswap)(ld ## size ## _he_ap(ptr), bits);\
 }
 
 #define ST_CONVERT_END(endian, bits, vtype, size)\
 static inline void st ## size ## _ ## endian ## _p(void *ptr, vtype v)\
 {\
     st ## size ## _he_p(ptr, glue(endian, _bswap)(v, bits));\
+}\
+static inline void st ## size ## _ ## endian ## _ap(void *ptr, vtype v)\
+{\
+    st ## size ## _he_ap(ptr, glue(endian, _bswap)(v, bits));\
 }
 
 #define LD_CONVERT(bits, rtype, vtype, size)\
     LD_CONVERT_UNALIGNED(bits, rtype, vtype, size)\
+    LD_CONVERT_ALIGNED(bits, rtype, vtype, size)\
     LD_CONVERT_END(le, bits, rtype, vtype, size)\
     LD_CONVERT_END(be, bits, rtype, vtype, size)
 
 #define ST_CONVERT(bits, vtype, size)\
     ST_CONVERT_UNALIGNED(bits, vtype, size)\
+    ST_CONVERT_ALIGNED(bits, vtype, size)\
     ST_CONVERT_END(le, bits, vtype, size)\
     ST_CONVERT_END(be, bits, vtype, size)
When the pointer alignment is known to be safe, we can
directly swap the data in place, without having to rely
on the compiler builtin code.

Load/store methods expecting an aligned pointer use the 'a'
infix. For example, to read a 16-bit unsigned value stored
in little endianness at an unaligned pointer:

  val = lduw_le_p(&unaligned_ptr);

then to store it in big endianness at an aligned pointer:

  stw_be_ap(&aligned_ptr, val);

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
---
 docs/devel/loads-stores.rst | 27 ++++++++++++++++-----------
 include/qemu/bswap.h        | 22 ++++++++++++++++++++++
 2 files changed, 38 insertions(+), 11 deletions(-)