diff mbox series

[v4,1/9] x86/asm: add iosubmit_cmds512() based on MOVDIR64B CPU instruction

Message ID 157842965854.27241.2519525691634439881.stgit@djiang5-desk3.ch.intel.com (mailing list archive)
State Changes Requested
Headers show
Series idxd driver for Intel Data Streaming Accelerator | expand

Commit Message

Dave Jiang Jan. 7, 2020, 8:40 p.m. UTC
With the introduction of MOVDIR64B instruction, there is now an instruction
that can write 64 bytes of data atomically.

Quoting from Intel SDM:
"There is no atomicity guarantee provided for the 64-byte load operation
from source address, and processor implementations may use multiple
load operations to read the 64-bytes. The 64-byte direct-store issued
by MOVDIR64B guarantees 64-byte write-completion atomicity. This means
that the data arrives at the destination in a single undivided 64-byte
write transaction."

We have identified at least 3 different use cases for this instruction in
the format of func(dst, src, count):
1) Clear poison / Initialize MKTME memory
   @dst is normal memory.
   @src in normal memory. Does not increment. (Copy same line to all
   targets)
   @count (to clear/init multiple lines)
2) Submit command(s) to new devices
   @dst is a special MMIO region for a device. Does not increment.
   @src is normal memory. Increments.
   @count usually is 1, but can be multiple.
3) Copy to iomem in big chunks
   @dst is iomem and increments
   @src in normal memory and increments
   @count is number of chunks to copy

Add support for case #2 to support device that will accept commands via
this instruction. We provide a @count in order to submit a batch of
preprogrammed descriptors in virtually contiguous memory. This
allows the caller to submit multiple descriptors to a devices with a single
submission. The special device requires the entire 64bytes descriptor to
be written atomically and will accept MOVDIR64B instruction.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 arch/x86/include/asm/io.h |   36 ++++++++++++++++++++++++++++++++++++
 1 file changed, 36 insertions(+)

Comments

Borislav Petkov Jan. 13, 2020, 10:14 a.m. UTC | #1
On Tue, Jan 07, 2020 at 01:40:58PM -0700, Dave Jiang wrote:
> With the introduction of MOVDIR64B instruction, there is now an instruction
> that can write 64 bytes of data atomically.
> 
> Quoting from Intel SDM:
> "There is no atomicity guarantee provided for the 64-byte load operation
> from source address, and processor implementations may use multiple
> load operations to read the 64-bytes. The 64-byte direct-store issued
> by MOVDIR64B guarantees 64-byte write-completion atomicity. This means
> that the data arrives at the destination in a single undivided 64-byte
> write transaction."
> 
> We have identified at least 3 different use cases for this instruction in
> the format of func(dst, src, count):
> 1) Clear poison / Initialize MKTME memory
>    @dst is normal memory.
>    @src in normal memory. Does not increment. (Copy same line to all
>    targets)
>    @count (to clear/init multiple lines)
> 2) Submit command(s) to new devices
>    @dst is a special MMIO region for a device. Does not increment.
>    @src is normal memory. Increments.
>    @count usually is 1, but can be multiple.
> 3) Copy to iomem in big chunks
>    @dst is iomem and increments
>    @src in normal memory and increments
>    @count is number of chunks to copy
> 
> Add support for case #2 to support device that will accept commands via
> this instruction. We provide a @count in order to submit a batch of
> preprogrammed descriptors in virtually contiguous memory. This
> allows the caller to submit multiple descriptors to a devices with a single

						  "to a device"

> submission. The special device requires the entire 64bytes descriptor to
> be written atomically and will accept MOVDIR64B instruction.
> 
> Signed-off-by: Dave Jiang <dave.jiang@intel.com>

but the above can be fixed by whoever applies this.

Acked-by: Borislav Petkov <bp@suse.de>
diff mbox series

Patch

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 9997521fc5cd..e1aa17a468a8 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -399,4 +399,40 @@  extern bool arch_memremap_can_ram_remap(resource_size_t offset,
 extern bool phys_mem_access_encrypted(unsigned long phys_addr,
 				      unsigned long size);
 
+/**
+ * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units
+ * @__dst: destination, in MMIO space (must be 512-bit aligned)
+ * @src: source
+ * @count: number of 512 bits quantities to submit
+ *
+ * Submit data from kernel space to MMIO space, in units of 512 bits at a
+ * time.  Order of access is not guaranteed, nor is a memory barrier
+ * performed afterwards.
+ *
+ * Warning: Do not use this helper unless your driver has checked that the CPU
+ * instruction is supported on the platform.
+ */
+static inline void iosubmit_cmds512(void __iomem *__dst, const void *src,
+				    size_t count)
+{
+	/*
+	 * Note that this isn't an "on-stack copy", just definition of "dst"
+	 * as a pointer to 64-bytes of stuff that is going to be overwritten.
+	 * In the MOVDIR64B case that may be needed as you can use the
+	 * MOVDIR64B instruction to copy arbitrary memory around. This trick
+	 * lets the compiler know how much gets clobbered.
+	 */
+	volatile struct { char _[64]; } *dst = __dst;
+	const u8 *from = src;
+	const u8 *end = from + count * 64;
+
+	while (from < end) {
+		/* MOVDIR64B [rdx], rax */
+		asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
+			     : "=m" (dst)
+			     : "d" (from), "a" (dst));
+		from += 64;
+	}
+}
+
 #endif /* _ASM_X86_IO_H */