diff mbox series

[RFC,01/14] x86/asm: add iosubmit_cmds512() based on movdir64b CPU instruction

Message ID 157428502934.36836.8119026517510193201.stgit@djiang5-desk3.ch.intel.com (mailing list archive)
State RFC
Headers show
Series idxd driver for Intel Data Streaming Accelerator | expand

Commit Message

Dave Jiang Nov. 20, 2019, 9:23 p.m. UTC
With the introduction of movdir64b instruction, there is now an instruction
that can write 64 bytes of data atomicaly.

Quoting from Intel SDM:
"There is no atomicity guarantee provided for the 64-byte load operation
from source address, and processor implementations may use multiple
load operations to read the 64-bytes. The 64-byte direct-store issued
by MOVDIR64B guarantees 64-byte write-completion atomicity. This means
that the data arrives at the destination in a single undivided 64-byte
write transaction."

We have identified at least 3 different use cases for this instruction in
the format of func(dst, src, count):
1) Clear poison / Initialize MKTME memory
   Destination is normal memory.
   Source in normal memory. Does not increment. (Copy same line to all
   targets)
   Count (to clear/init multiple lines)
2) Submit command(s) to new devices
   Destination is a special MMIO region for a device. Does not increment.
   Source is normal memory. Increments.
   Count usually is 1, but can be multiple.
3) Copy to iomem in big chunks
   Destination is iomem and increments
   Source in normal memory and increments
   Count is number of chunks to copy

This commit adds support for case #2 to support device that will accept
commands via this instruction.

Signed-off-by: Dave Jiang <dave.jiang@intel.com>
---
 arch/x86/include/asm/io.h |   44 ++++++++++++++++++++++++++++++++++++++++++++
 include/linux/io.h        |   11 +++++++++++
 2 files changed, 55 insertions(+)

Comments

Dave Hansen Nov. 20, 2019, 9:50 p.m. UTC | #1
On 11/20/19 1:23 PM, Dave Jiang wrote:
> +static inline void __iowrite512(void __iomem *__dst, const void *src)
> +{
> +	volatile struct { char _[64]; } *dst = __dst;

This _looks_ like gibberish.  I know it's not, but it is subtle enough
that it really needs specific comments.

> +static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
> +				    size_t count)
> +{
> +	const u8 *from = src;
> +	const u8 *end = from + count * 64;
> +
> +	if (!cpu_has_write512())
> +		return;
> +
> +	while (from < end) {
> +		__iowrite512(dst, from);
> +		from += 64;
> +	}
> +}

Won't this silently just drop things if the CPU doesn't have movdir64b
support?

It seems like this shouldn't be called at all if
!cpu_has_write512(), but wouldn't something like this be mroe appropriate?

	if (!cpu_has_write512()) {
		WARN_ON_ONCE(1);
		return;
	}

Is the caller just supposed to infer that "dst" was never overwritten?
Borislav Petkov Nov. 20, 2019, 9:53 p.m. UTC | #2
On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote:
> +/**
> + * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units

Where is the alignment check on that data before doing the copying?

> + * @dst: destination, in MMIO space (must be 512-bit aligned)
> + * @src: source
> + * @count: number of 512 bits quantities to submit

Where's that check on the data?

> + *
> + * Submit data from kernel space to MMIO space, in units of 512 bits at a
> + * time.  Order of access is not guaranteed, nor is a memory barrier
> + * performed afterwards.
> + */
> +static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
> +				    size_t count)

An iosubmit function which returns void and doesn't tell its callers
whether it succeeded or not? That looks non-optimal to say the least.

Why isn't there a fallback function which to call when the CPU doesn't
support movdir64b?

Because then you can use alternative_call() and have the thing work
regardless of hardware support for MOVDIR*.

> +{
> +	const u8 *from = src;
> +	const u8 *end = from + count * 64;
> +
> +	if (!cpu_has_write512())

If anything, that thing needs to go and you should use

  static_cpu_has(X86_FEATURE_MOVDIR64B)

as it looks to me like you would care about speed on this fast path?
Yes, no?
Luck, Tony Nov. 20, 2019, 11:19 p.m. UTC | #3
On Wed, Nov 20, 2019 at 10:53:39PM +0100, Borislav Petkov wrote:
> On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote:
> > +static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
> > +				    size_t count)
> 
> An iosubmit function which returns void and doesn't tell its callers
> whether it succeeded or not? That looks non-optimal to say the least.

That's the underlying functionality of the MOVDIR64B instruction. A
posted write so no way to know if it succeeded. When using dedicated
queues the caller must keep count of how many operations are in flight
and not send more than the depth of the queue.

> Why isn't there a fallback function which to call when the CPU doesn't
> support movdir64b?

This particular driver has no option for fallback. Descriptors can
only be submitted with MOVDIR64B (to dedicated queues ... in later
patch series support for shared queues will be added, but those require
ENQCMD or ENQCMDS to submit).

The driver bails out at the beginning of the probe routine if the
necessary instructions are not supported:

+       /*
+        * If the CPU does not support write512, there's no point in
+        * enumerating the device. We can not utilize it.
+        */
+       if (!cpu_has_write512())
+               return -ENXIO;

Though we should always get past that as this PCI device ID shouldn't
every appear on a system that doesn't have the support. Device is on
the die, not a plug-in card.

> Because then you can use alternative_call() and have the thing work
> regardless of hardware support for MOVDIR*.
> 
> > +{
> > +	const u8 *from = src;
> > +	const u8 *end = from + count * 64;
> > +
> > +	if (!cpu_has_write512())
> 
> If anything, that thing needs to go and you should use
> 
>   static_cpu_has(X86_FEATURE_MOVDIR64B)
> 
> as it looks to me like you would care about speed on this fast path?
> Yes, no?

That might be a better.

-Tony
Borislav Petkov Nov. 20, 2019, 11:26 p.m. UTC | #4
On Wed, Nov 20, 2019 at 03:19:23PM -0800, Luck, Tony wrote:
> That's the underlying functionality of the MOVDIR64B instruction. A
> posted write so no way to know if it succeeded.

So how do you know whether any of the writes went through?

> When using dedicated queues the caller must keep count of how many
> operations are in flight and not send more than the depth of the
> queue.

This way?
Dave Jiang Nov. 20, 2019, 11:46 p.m. UTC | #5
On 11/20/19 2:50 PM, Hansen, Dave wrote:
> On 11/20/19 1:23 PM, Dave Jiang wrote:
>> +static inline void __iowrite512(void __iomem *__dst, const void *src)
>> +{
>> +	volatile struct { char _[64]; } *dst = __dst;
> 
> This _looks_ like gibberish.  I know it's not, but it is subtle enough
> that it really needs specific comments.

I'll add comments explaining.

> 
>> +static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
>> +				    size_t count)
>> +{
>> +	const u8 *from = src;
>> +	const u8 *end = from + count * 64;
>> +
>> +	if (!cpu_has_write512())
>> +		return;
>> +
>> +	while (from < end) {
>> +		__iowrite512(dst, from);
>> +		from += 64;
>> +	}
>> +}
> 
> Won't this silently just drop things if the CPU doesn't have movdir64b
> support?
> 
> It seems like this shouldn't be called at all if
> !cpu_has_write512(), but wouldn't something like this be mroe appropriate?
> 
> 	if (!cpu_has_write512()) {
> 		WARN_ON_ONCE(1);
> 		return;
> 	}
> 
> Is the caller just supposed to infer that "dst" was never overwritten?
> 

Thanks. I'll add the WARN().
Dave Jiang Nov. 21, 2019, 12:10 a.m. UTC | #6
On 11/20/19 2:53 PM, Borislav Petkov wrote:
> On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote:
>> +/**
>> + * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units
> 
> Where is the alignment check on that data before doing the copying?

I'll add the check on the destination address. The call is modeled after 
__iowrite64_copy() / __iowrite32_copy() in lib/iomap_copy.c. Looks like 
those functions do not check for the alignment requirements either.

> 
>> + * @dst: destination, in MMIO space (must be 512-bit aligned)
>> + * @src: source
>> + * @count: number of 512 bits quantities to submit
> 
> Where's that check on the data?

I don't follow?

> 
>> + *
>> + * Submit data from kernel space to MMIO space, in units of 512 bits at a
>> + * time.  Order of access is not guaranteed, nor is a memory barrier
>> + * performed afterwards.
>> + */
>> +static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
>> +				    size_t count)
> 
> An iosubmit function which returns void and doesn't tell its callers
> whether it succeeded or not? That looks non-optimal to say the least.
> 
> Why isn't there a fallback function which to call when the CPU doesn't
> support movdir64b?
> 
> Because then you can use alternative_call() and have the thing work
> regardless of hardware support for MOVDIR*.

Looks like Tony answered this part.

> 
>> +{
>> +	const u8 *from = src;
>> +	const u8 *end = from + count * 64;
>> +
>> +	if (!cpu_has_write512())
> 
> If anything, that thing needs to go and you should use
> 
>    static_cpu_has(X86_FEATURE_MOVDIR64B)
> 
> as it looks to me like you would care about speed on this fast path?
> Yes, no?
> 

Yes thank you!
Luck, Tony Nov. 21, 2019, 12:15 a.m. UTC | #7
>> When using dedicated queues the caller must keep count of how many
>> operations are in flight and not send more than the depth of the
>> queue.
>
> This way?

That's the only practical way. The device does keep a count of dropped
attempts ... so in theory you could go read that ... but that would give up
much of the value proposition of low cost to submit work if you had to do
an MMIO read after every submission.

-Tony
Dan Williams Nov. 21, 2019, 12:21 a.m. UTC | #8
On Wed, Nov 20, 2019 at 3:19 PM Luck, Tony <tony.luck@intel.com> wrote:
>
> On Wed, Nov 20, 2019 at 10:53:39PM +0100, Borislav Petkov wrote:
> > On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote:
> > > +static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
> > > +                               size_t count)
> >
> > An iosubmit function which returns void and doesn't tell its callers
> > whether it succeeded or not? That looks non-optimal to say the least.
>
> That's the underlying functionality of the MOVDIR64B instruction. A
> posted write so no way to know if it succeeded. When using dedicated
> queues the caller must keep count of how many operations are in flight
> and not send more than the depth of the queue.
>
> > Why isn't there a fallback function which to call when the CPU doesn't
> > support movdir64b?
>
> This particular driver has no option for fallback. Descriptors can
> only be submitted with MOVDIR64B (to dedicated queues ... in later
> patch series support for shared queues will be added, but those require
> ENQCMD or ENQCMDS to submit).

I think
>
> The driver bails out at the beginning of the probe routine if the
> necessary instructions are not supported:
>
> +       /*
> +        * If the CPU does not support write512, there's no point in
> +        * enumerating the device. We can not utilize it.
> +        */
> +       if (!cpu_has_write512())
> +               return -ENXIO;
>
> Though we should always get past that as this PCI device ID shouldn't
> every appear on a system that doesn't have the support. Device is on
> the die, not a plug-in card.
>
> > Because then you can use alternative_call() and have the thing work
> > regardless of hardware support for MOVDIR*.
> >
> > > +{
> > > +   const u8 *from = src;
> > > +   const u8 *end = from + count * 64;
> > > +
> > > +   if (!cpu_has_write512())
> >
> > If anything, that thing needs to go and you should use
> >
> >   static_cpu_has(X86_FEATURE_MOVDIR64B)
> >
> > as it looks to me like you would care about speed on this fast path?
> > Yes, no?
>
> That might be a better.

It's meant to be identical.

The expectation was that cpu_has_write512() could be used generically
in drivers like the pmem driver that have a use for movdir64b outside
of DSA command submission use case. On x86 it would just be #define'd
to static_cpu_has(X86_FEATURE_MOVDIR64B), on other archs something
else in the future. For pmem if cpu_has_write512() is false it falls
back to talking to platform firmware for error management. Case1 from
the changelog.
Thomas Gleixner Nov. 21, 2019, 12:22 a.m. UTC | #9
On Wed, 20 Nov 2019, Luck, Tony wrote:
> On Wed, Nov 20, 2019 at 10:53:39PM +0100, Borislav Petkov wrote:
> > On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote:
> > > +static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
> > > +				    size_t count)
> > 
> > An iosubmit function which returns void and doesn't tell its callers
> > whether it succeeded or not? That looks non-optimal to say the least.
> 
> That's the underlying functionality of the MOVDIR64B instruction. A
> posted write so no way to know if it succeeded. When using dedicated
> queues the caller must keep count of how many operations are in flight
> and not send more than the depth of the queue.
> 
> > Why isn't there a fallback function which to call when the CPU doesn't
> > support movdir64b?
> 
> This particular driver has no option for fallback. Descriptors can
> only be submitted with MOVDIR64B (to dedicated queues ... in later
> patch series support for shared queues will be added, but those require
> ENQCMD or ENQCMDS to submit).
> 
> The driver bails out at the beginning of the probe routine if the
> necessary instructions are not supported:
> 
> +       /*
> +        * If the CPU does not support write512, there's no point in
> +        * enumerating the device. We can not utilize it.
> +        */
> +       if (!cpu_has_write512())
> +               return -ENXIO;
> 
> Though we should always get past that as this PCI device ID shouldn't
> every appear on a system that doesn't have the support. Device is on
> the die, not a plug-in card.

Then the condition in the iosubmit function is just prone to silently paper
over any bug in a driver:

> +       if (!cpu_has_write512())
> +               return;

This should at least issue a WARN_ON_ONCE()

Thanks,

	tglx
Dave Jiang Nov. 21, 2019, 12:27 a.m. UTC | #10
On 11/20/19 5:22 PM, Thomas Gleixner wrote:
> On Wed, 20 Nov 2019, Luck, Tony wrote:
>> On Wed, Nov 20, 2019 at 10:53:39PM +0100, Borislav Petkov wrote:
>>> On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote:
>>>> +static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
>>>> +				    size_t count)
>>>
>>> An iosubmit function which returns void and doesn't tell its callers
>>> whether it succeeded or not? That looks non-optimal to say the least.
>>
>> That's the underlying functionality of the MOVDIR64B instruction. A
>> posted write so no way to know if it succeeded. When using dedicated
>> queues the caller must keep count of how many operations are in flight
>> and not send more than the depth of the queue.
>>
>>> Why isn't there a fallback function which to call when the CPU doesn't
>>> support movdir64b?
>>
>> This particular driver has no option for fallback. Descriptors can
>> only be submitted with MOVDIR64B (to dedicated queues ... in later
>> patch series support for shared queues will be added, but those require
>> ENQCMD or ENQCMDS to submit).
>>
>> The driver bails out at the beginning of the probe routine if the
>> necessary instructions are not supported:
>>
>> +       /*
>> +        * If the CPU does not support write512, there's no point in
>> +        * enumerating the device. We can not utilize it.
>> +        */
>> +       if (!cpu_has_write512())
>> +               return -ENXIO;
>>
>> Though we should always get past that as this PCI device ID shouldn't
>> every appear on a system that doesn't have the support. Device is on
>> the die, not a plug-in card.
> 
> Then the condition in the iosubmit function is just prone to silently paper
> over any bug in a driver:
> 
>> +       if (!cpu_has_write512())
>> +               return;
> 
> This should at least issue a WARN_ON_ONCE()

Thanks! I'll be adding the WARN_ON_ONCE() for the cap check. Also with 
the alignment check Borislav mentioned, I'll add a WARN_ON() for failures.


> 
> Thanks,
> 
> 	tglx
>
Dan Williams Nov. 21, 2019, 12:27 a.m. UTC | #11
On Wed, Nov 20, 2019 at 3:27 PM Borislav Petkov <bp@alien8.de> wrote:
>
> On Wed, Nov 20, 2019 at 03:19:23PM -0800, Luck, Tony wrote:
> > That's the underlying functionality of the MOVDIR64B instruction. A
> > posted write so no way to know if it succeeded.
>
> So how do you know whether any of the writes went through?

It's identical to the writel() mmio-write to start a SATA command
transfer. The higher level device driver protocol validates that the
command went through, ultimately with a timeout. There's no return
value for iosubmit_cmds512() for the same reason there's no return
value for the other iowrite primitives.
Thomas Gleixner Nov. 21, 2019, 12:53 a.m. UTC | #12
On Wed, 20 Nov 2019, Dan Williams wrote:
> On Wed, Nov 20, 2019 at 3:27 PM Borislav Petkov <bp@alien8.de> wrote:
> >
> > On Wed, Nov 20, 2019 at 03:19:23PM -0800, Luck, Tony wrote:
> > > That's the underlying functionality of the MOVDIR64B instruction. A
> > > posted write so no way to know if it succeeded.
> >
> > So how do you know whether any of the writes went through?
> 
> It's identical to the writel() mmio-write to start a SATA command
> transfer. The higher level device driver protocol validates that the
> command went through, ultimately with a timeout. There's no return
> value for iosubmit_cmds512() for the same reason there's no return
> value for the other iowrite primitives.

With the difference that other iowrite primitive have no dependencies on
cpu feature bits and cannot fail on the software level.

Thanks,

	tglx
Dan Williams Nov. 21, 2019, 1:32 a.m. UTC | #13
On Wed, Nov 20, 2019 at 4:53 PM Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Wed, 20 Nov 2019, Dan Williams wrote:
> > On Wed, Nov 20, 2019 at 3:27 PM Borislav Petkov <bp@alien8.de> wrote:
> > >
> > > On Wed, Nov 20, 2019 at 03:19:23PM -0800, Luck, Tony wrote:
> > > > That's the underlying functionality of the MOVDIR64B instruction. A
> > > > posted write so no way to know if it succeeded.
> > >
> > > So how do you know whether any of the writes went through?
> >
> > It's identical to the writel() mmio-write to start a SATA command
> > transfer. The higher level device driver protocol validates that the
> > command went through, ultimately with a timeout. There's no return
> > value for iosubmit_cmds512() for the same reason there's no return
> > value for the other iowrite primitives.
>
> With the difference that other iowrite primitive have no dependencies on
> cpu feature bits and cannot fail on the software level.

True, but that would be a driver coding mistake flagged by the
WARN_ON_ONCE, and the failure is static. The driver must check for
static_cpu_has(X86_FEATURE_MOVDIR64B) once at init, but it need not
check again on every command submission.
Borislav Petkov Nov. 21, 2019, 10:37 a.m. UTC | #14
On Wed, Nov 20, 2019 at 05:32:51PM -0800, Dan Williams wrote:
> True, but that would be a driver coding mistake flagged by the
> WARN_ON_ONCE, and the failure is static. The driver must check for
> static_cpu_has(X86_FEATURE_MOVDIR64B) once at init,

So if you do that at driver init time, you don't need the static variant
- simply use boot_cpu_has(). But if this function is going to be used on
other platforms, then you need the check in the function and it must be
static_cpu_has() for speed.

The static_cpu_has() thing is a soft-of once check anyway because it
gets patched by alternatives and after that it is 0 overhead.
Borislav Petkov Nov. 21, 2019, 10:59 a.m. UTC | #15
On Wed, Nov 20, 2019 at 05:10:41PM -0700, Dave Jiang wrote:
> I'll add the check on the destination address. The call is modeled after
> __iowrite64_copy() / __iowrite32_copy() in lib/iomap_copy.c. Looks like
> those functions do not check for the alignment requirements either.

So just because they don't check, you don't need to check either?

Can you guarantee that all callers will always do the right thing?

I mean, if you don't care too much, why even write "(must be 512-bit
aligned)"? Who cares then if the data is aligned or not...

> > > + * @dst: destination, in MMIO space (must be 512-bit aligned)
> > > + * @src: source
> > > + * @count: number of 512 bits quantities to submit
> > 
> > Where's that check on the data?
> 
> I don't follow?

What do you do if the caller doesn't submit data in 512 bits quantities?
Dave Jiang Nov. 21, 2019, 4:52 p.m. UTC | #16
On 11/21/19 3:59 AM, Borislav Petkov wrote:
> On Wed, Nov 20, 2019 at 05:10:41PM -0700, Dave Jiang wrote:
>> I'll add the check on the destination address. The call is modeled after
>> __iowrite64_copy() / __iowrite32_copy() in lib/iomap_copy.c. Looks like
>> those functions do not check for the alignment requirements either.
> 
> So just because they don't check, you don't need to check either?

No what I mean was those primitives are missing the checks and we should 
probably address that at some point.

> 
> Can you guarantee that all callers will always do the right thing?
> 
> I mean, if you don't care too much, why even write "(must be 512-bit
> aligned)"? Who cares then if the data is aligned or not...
> 


>>>> + * @dst: destination, in MMIO space (must be 512-bit aligned)
>>>> + * @src: source
>>>> + * @count: number of 512 bits quantities to submit
>>>
>>> Where's that check on the data?
>>
>> I don't follow?
> 
> What do you do if the caller doesn't submit data in 512 bits quantities?
> 

How would I detect that? Add a size (in bytes) parameter for the total 
source data?
Borislav Petkov Nov. 22, 2019, 8:59 a.m. UTC | #17
On Thu, Nov 21, 2019 at 09:52:19AM -0700, Dave Jiang wrote:
> No what I mean was those primitives are missing the checks and we should
> probably address that at some point.

Oh, patches are always welcome! :)

> How would I detect that? Add a size (in bytes) parameter for the total
> source data?

Sure.

So, here's the deal: the more I look at this thing, the more I think
this iosubmit_cmds512() function should not be in a generic header but
in an intel-/driver-specific one. Why?

Well, movdir64b is Intel-only for now, you don't have a fallback
option for the platforms which do not support that insn and it is more
preferential for you to do the feature check once at driver init and
then call the function because you *know* you have movdir64b support
and not have any feature check in the function itself, not even a fast
static_cpu_has() one.

And this way you can do away with alignment and size checks because you
control what your driver does.

If it turns out that this function needs to be shared with other
platforms, then we can consider lifting it into a generic header and
making it more generic.

Ok?
Dan Williams Nov. 22, 2019, 5:20 p.m. UTC | #18
On Fri, Nov 22, 2019 at 1:00 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Thu, Nov 21, 2019 at 09:52:19AM -0700, Dave Jiang wrote:
> > No what I mean was those primitives are missing the checks and we should
> > probably address that at some point.
>
> Oh, patches are always welcome! :)
>
> > How would I detect that? Add a size (in bytes) parameter for the total
> > source data?
>
> Sure.
>
> So, here's the deal: the more I look at this thing, the more I think
> this iosubmit_cmds512() function should not be in a generic header but
> in an intel-/driver-specific one. Why?
>
> Well, movdir64b is Intel-only for now, you don't have a fallback
> option for the platforms which do not support that insn and it is more
> preferential for you to do the feature check once at driver init and
> then call the function because you *know* you have movdir64b support
> and not have any feature check in the function itself, not even a fast
> static_cpu_has() one.
>
> And this way you can do away with alignment and size checks because you
> control what your driver does.
>
> If it turns out that this function needs to be shared with other
> platforms, then we can consider lifting it into a generic header and
> making it more generic.
>
> Ok?

I do agree that iosubmit_cmds512() can live in a driver specific
header, and it was my fault for advising Dave to make it generic. The
long story of how that came to pass below, but the short story is yes,
lets just make this one driver specific.

The long story is that there is already line of sight for a need for
other generic movdir64b() helpers as mentioned in the changelog, and
iosubmit_cmds512() got wrapped up in that momentum.

For those cases the thought would be to have memset512() for case1 and
__iowrite512_copy() for case3. Where memset512() writes a
non-incrementing source to an incrementing destination, and
__iowrite512_copy() copies an incrementing source to an incrementing
destination. Those 2 helpers *would* have fallbacks, but with the
option to use something like cpu_has_write512() to check in advance
whether those routines will fallback, or not.

That can be a discussion for a future patchset when those users arrive.
Borislav Petkov Nov. 22, 2019, 6:44 p.m. UTC | #19
On Fri, Nov 22, 2019 at 09:20:39AM -0800, Dan Williams wrote:
> For those cases the thought would be to have memset512() for case1 and
> __iowrite512_copy() for case3. Where memset512() writes a
> non-incrementing source to an incrementing destination, and
> __iowrite512_copy() copies an incrementing source to an incrementing
> destination. Those 2 helpers *would* have fallbacks, but with the
> option to use something like cpu_has_write512() to check in advance
> whether those routines will fallback, or not.
> 
> That can be a discussion for a future patchset when those users arrive.

Oh, sure, of course.

My only angle is very simple: if the MOVDIR* et al is only supported on
upcoming Intel platforms and looking at the use cases:

1. clear poison/MKTME
3. copy iomem in big chunks

I'm going to venture a guess that those two cases are going to be
happening only on Intel platforms which already support MODVIR*. So
wouldn't really need to do any generic helpers because those use cases
are very specific already. Which would make your feature detection a
one-time, driver-init time thing anyway...

Unless I misunderstand those cases and there really is a use case
where the thing would fallback and the fallback would really be for an
"unenlightened" platform without that MOVDIR* hw support...?

Hmmm.
Dan Williams Nov. 22, 2019, 6:50 p.m. UTC | #20
On Fri, Nov 22, 2019 at 10:44 AM Borislav Petkov <bp@alien8.de> wrote:
>
> On Fri, Nov 22, 2019 at 09:20:39AM -0800, Dan Williams wrote:
> > For those cases the thought would be to have memset512() for case1 and
> > __iowrite512_copy() for case3. Where memset512() writes a
> > non-incrementing source to an incrementing destination, and
> > __iowrite512_copy() copies an incrementing source to an incrementing
> > destination. Those 2 helpers *would* have fallbacks, but with the
> > option to use something like cpu_has_write512() to check in advance
> > whether those routines will fallback, or not.
> >
> > That can be a discussion for a future patchset when those users arrive.
>
> Oh, sure, of course.
>
> My only angle is very simple: if the MOVDIR* et al is only supported on
> upcoming Intel platforms and looking at the use cases:
>
> 1. clear poison/MKTME
> 3. copy iomem in big chunks
>
> I'm going to venture a guess that those two cases are going to be
> happening only on Intel platforms which already support MODVIR*. So
> wouldn't really need to do any generic helpers because those use cases
> are very specific already. Which would make your feature detection a
> one-time, driver-init time thing anyway...
>
> Unless I misunderstand those cases and there really is a use case
> where the thing would fallback and the fallback would really be for an
> "unenlightened" platform without that MOVDIR* hw support...?

At least for something like __iowrite512_copy() it would indeed be
something an unenlightened driver could call. Those drivers would
simply be looking for opportunistic efficiency and could tolerate a
fallback. Just like current __iowrite64_copy() users don't care if
that routine falls back internally.
diff mbox series

Patch

diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h
index 6bed97ff6db2..3126f6e1d5b8 100644
--- a/arch/x86/include/asm/io.h
+++ b/arch/x86/include/asm/io.h
@@ -403,5 +403,49 @@  extern bool arch_memremap_can_ram_remap(resource_size_t offset,
 
 extern bool phys_mem_access_encrypted(unsigned long phys_addr,
 				      unsigned long size);
+#include <linux/cpufeature.h>
+static inline bool cpu_has_write512(void)
+{
+	return cpu_feature_enabled(X86_FEATURE_MOVDIR64B);
+}
+
+#define cpu_has_write512 cpu_has_write512
+
+static inline void __iowrite512(void __iomem *__dst, const void *src)
+{
+	volatile struct { char _[64]; } *dst = __dst;
+
+	/* movdir64b [rdx], rax */
+	asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02"
+			: "=m" (dst)
+			: "d" (src), "a" (dst));
+}
+
+/**
+ * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units
+ * @dst: destination, in MMIO space (must be 512-bit aligned)
+ * @src: source
+ * @count: number of 512 bits quantities to submit
+ *
+ * Submit data from kernel space to MMIO space, in units of 512 bits at a
+ * time.  Order of access is not guaranteed, nor is a memory barrier
+ * performed afterwards.
+ */
+static inline void iosubmit_cmds512(void __iomem *dst, const void *src,
+				    size_t count)
+{
+	const u8 *from = src;
+	const u8 *end = from + count * 64;
+
+	if (!cpu_has_write512())
+		return;
+
+	while (from < end) {
+		__iowrite512(dst, from);
+		from += 64;
+	}
+}
+
+#define iosubmit_cmds512 iosubmit_cmds512
 
 #endif /* _ASM_X86_IO_H */
diff --git a/include/linux/io.h b/include/linux/io.h
index accac822336a..ca14af3e84de 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -20,6 +20,17 @@  __visible void __iowrite32_copy(void __iomem *to, const void *from, size_t count
 void __ioread32_copy(void *to, const void __iomem *from, size_t count);
 void __iowrite64_copy(void __iomem *to, const void *from, size_t count);
 
+#ifndef cpu_has_write512
+#define cpu_has_write512() (0)
+#endif
+
+#ifndef iosubmit_cmds512
+static inline void iosubmit_cmds512(void __iomem *to, const void *from,
+				    size_t count)
+{
+}
+#endif
+
 #ifdef CONFIG_MMU
 int ioremap_page_range(unsigned long addr, unsigned long end,
 		       phys_addr_t phys_addr, pgprot_t prot);