Message ID | 157428502934.36836.8119026517510193201.stgit@djiang5-desk3.ch.intel.com (mailing list archive) |
---|---|
State | RFC |
Headers | show |
Series | idxd driver for Intel Data Streaming Accelerator | expand |
On 11/20/19 1:23 PM, Dave Jiang wrote: > +static inline void __iowrite512(void __iomem *__dst, const void *src) > +{ > + volatile struct { char _[64]; } *dst = __dst; This _looks_ like gibberish. I know it's not, but it is subtle enough that it really needs specific comments. > +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, > + size_t count) > +{ > + const u8 *from = src; > + const u8 *end = from + count * 64; > + > + if (!cpu_has_write512()) > + return; > + > + while (from < end) { > + __iowrite512(dst, from); > + from += 64; > + } > +} Won't this silently just drop things if the CPU doesn't have movdir64b support? It seems like this shouldn't be called at all if !cpu_has_write512(), but wouldn't something like this be mroe appropriate? if (!cpu_has_write512()) { WARN_ON_ONCE(1); return; } Is the caller just supposed to infer that "dst" was never overwritten?
On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote: > +/** > + * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units Where is the alignment check on that data before doing the copying? > + * @dst: destination, in MMIO space (must be 512-bit aligned) > + * @src: source > + * @count: number of 512 bits quantities to submit Where's that check on the data? > + * > + * Submit data from kernel space to MMIO space, in units of 512 bits at a > + * time. Order of access is not guaranteed, nor is a memory barrier > + * performed afterwards. > + */ > +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, > + size_t count) An iosubmit function which returns void and doesn't tell its callers whether it succeeded or not? That looks non-optimal to say the least. Why isn't there a fallback function which to call when the CPU doesn't support movdir64b? Because then you can use alternative_call() and have the thing work regardless of hardware support for MOVDIR*. > +{ > + const u8 *from = src; > + const u8 *end = from + count * 64; > + > + if (!cpu_has_write512()) If anything, that thing needs to go and you should use static_cpu_has(X86_FEATURE_MOVDIR64B) as it looks to me like you would care about speed on this fast path? Yes, no?
On Wed, Nov 20, 2019 at 10:53:39PM +0100, Borislav Petkov wrote: > On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote: > > +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, > > + size_t count) > > An iosubmit function which returns void and doesn't tell its callers > whether it succeeded or not? That looks non-optimal to say the least. That's the underlying functionality of the MOVDIR64B instruction. A posted write so no way to know if it succeeded. When using dedicated queues the caller must keep count of how many operations are in flight and not send more than the depth of the queue. > Why isn't there a fallback function which to call when the CPU doesn't > support movdir64b? This particular driver has no option for fallback. Descriptors can only be submitted with MOVDIR64B (to dedicated queues ... in later patch series support for shared queues will be added, but those require ENQCMD or ENQCMDS to submit). The driver bails out at the beginning of the probe routine if the necessary instructions are not supported: + /* + * If the CPU does not support write512, there's no point in + * enumerating the device. We can not utilize it. + */ + if (!cpu_has_write512()) + return -ENXIO; Though we should always get past that as this PCI device ID shouldn't every appear on a system that doesn't have the support. Device is on the die, not a plug-in card. > Because then you can use alternative_call() and have the thing work > regardless of hardware support for MOVDIR*. > > > +{ > > + const u8 *from = src; > > + const u8 *end = from + count * 64; > > + > > + if (!cpu_has_write512()) > > If anything, that thing needs to go and you should use > > static_cpu_has(X86_FEATURE_MOVDIR64B) > > as it looks to me like you would care about speed on this fast path? > Yes, no? That might be a better. -Tony
On Wed, Nov 20, 2019 at 03:19:23PM -0800, Luck, Tony wrote: > That's the underlying functionality of the MOVDIR64B instruction. A > posted write so no way to know if it succeeded. So how do you know whether any of the writes went through? > When using dedicated queues the caller must keep count of how many > operations are in flight and not send more than the depth of the > queue. This way?
On 11/20/19 2:50 PM, Hansen, Dave wrote: > On 11/20/19 1:23 PM, Dave Jiang wrote: >> +static inline void __iowrite512(void __iomem *__dst, const void *src) >> +{ >> + volatile struct { char _[64]; } *dst = __dst; > > This _looks_ like gibberish. I know it's not, but it is subtle enough > that it really needs specific comments. I'll add comments explaining. > >> +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, >> + size_t count) >> +{ >> + const u8 *from = src; >> + const u8 *end = from + count * 64; >> + >> + if (!cpu_has_write512()) >> + return; >> + >> + while (from < end) { >> + __iowrite512(dst, from); >> + from += 64; >> + } >> +} > > Won't this silently just drop things if the CPU doesn't have movdir64b > support? > > It seems like this shouldn't be called at all if > !cpu_has_write512(), but wouldn't something like this be mroe appropriate? > > if (!cpu_has_write512()) { > WARN_ON_ONCE(1); > return; > } > > Is the caller just supposed to infer that "dst" was never overwritten? > Thanks. I'll add the WARN().
On 11/20/19 2:53 PM, Borislav Petkov wrote: > On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote: >> +/** >> + * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units > > Where is the alignment check on that data before doing the copying? I'll add the check on the destination address. The call is modeled after __iowrite64_copy() / __iowrite32_copy() in lib/iomap_copy.c. Looks like those functions do not check for the alignment requirements either. > >> + * @dst: destination, in MMIO space (must be 512-bit aligned) >> + * @src: source >> + * @count: number of 512 bits quantities to submit > > Where's that check on the data? I don't follow? > >> + * >> + * Submit data from kernel space to MMIO space, in units of 512 bits at a >> + * time. Order of access is not guaranteed, nor is a memory barrier >> + * performed afterwards. >> + */ >> +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, >> + size_t count) > > An iosubmit function which returns void and doesn't tell its callers > whether it succeeded or not? That looks non-optimal to say the least. > > Why isn't there a fallback function which to call when the CPU doesn't > support movdir64b? > > Because then you can use alternative_call() and have the thing work > regardless of hardware support for MOVDIR*. Looks like Tony answered this part. > >> +{ >> + const u8 *from = src; >> + const u8 *end = from + count * 64; >> + >> + if (!cpu_has_write512()) > > If anything, that thing needs to go and you should use > > static_cpu_has(X86_FEATURE_MOVDIR64B) > > as it looks to me like you would care about speed on this fast path? > Yes, no? > Yes thank you!
>> When using dedicated queues the caller must keep count of how many >> operations are in flight and not send more than the depth of the >> queue. > > This way? That's the only practical way. The device does keep a count of dropped attempts ... so in theory you could go read that ... but that would give up much of the value proposition of low cost to submit work if you had to do an MMIO read after every submission. -Tony
On Wed, Nov 20, 2019 at 3:19 PM Luck, Tony <tony.luck@intel.com> wrote: > > On Wed, Nov 20, 2019 at 10:53:39PM +0100, Borislav Petkov wrote: > > On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote: > > > +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, > > > + size_t count) > > > > An iosubmit function which returns void and doesn't tell its callers > > whether it succeeded or not? That looks non-optimal to say the least. > > That's the underlying functionality of the MOVDIR64B instruction. A > posted write so no way to know if it succeeded. When using dedicated > queues the caller must keep count of how many operations are in flight > and not send more than the depth of the queue. > > > Why isn't there a fallback function which to call when the CPU doesn't > > support movdir64b? > > This particular driver has no option for fallback. Descriptors can > only be submitted with MOVDIR64B (to dedicated queues ... in later > patch series support for shared queues will be added, but those require > ENQCMD or ENQCMDS to submit). I think > > The driver bails out at the beginning of the probe routine if the > necessary instructions are not supported: > > + /* > + * If the CPU does not support write512, there's no point in > + * enumerating the device. We can not utilize it. > + */ > + if (!cpu_has_write512()) > + return -ENXIO; > > Though we should always get past that as this PCI device ID shouldn't > every appear on a system that doesn't have the support. Device is on > the die, not a plug-in card. > > > Because then you can use alternative_call() and have the thing work > > regardless of hardware support for MOVDIR*. > > > > > +{ > > > + const u8 *from = src; > > > + const u8 *end = from + count * 64; > > > + > > > + if (!cpu_has_write512()) > > > > If anything, that thing needs to go and you should use > > > > static_cpu_has(X86_FEATURE_MOVDIR64B) > > > > as it looks to me like you would care about speed on this fast path? > > Yes, no? > > That might be a better. It's meant to be identical. The expectation was that cpu_has_write512() could be used generically in drivers like the pmem driver that have a use for movdir64b outside of DSA command submission use case. On x86 it would just be #define'd to static_cpu_has(X86_FEATURE_MOVDIR64B), on other archs something else in the future. For pmem if cpu_has_write512() is false it falls back to talking to platform firmware for error management. Case1 from the changelog.
On Wed, 20 Nov 2019, Luck, Tony wrote: > On Wed, Nov 20, 2019 at 10:53:39PM +0100, Borislav Petkov wrote: > > On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote: > > > +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, > > > + size_t count) > > > > An iosubmit function which returns void and doesn't tell its callers > > whether it succeeded or not? That looks non-optimal to say the least. > > That's the underlying functionality of the MOVDIR64B instruction. A > posted write so no way to know if it succeeded. When using dedicated > queues the caller must keep count of how many operations are in flight > and not send more than the depth of the queue. > > > Why isn't there a fallback function which to call when the CPU doesn't > > support movdir64b? > > This particular driver has no option for fallback. Descriptors can > only be submitted with MOVDIR64B (to dedicated queues ... in later > patch series support for shared queues will be added, but those require > ENQCMD or ENQCMDS to submit). > > The driver bails out at the beginning of the probe routine if the > necessary instructions are not supported: > > + /* > + * If the CPU does not support write512, there's no point in > + * enumerating the device. We can not utilize it. > + */ > + if (!cpu_has_write512()) > + return -ENXIO; > > Though we should always get past that as this PCI device ID shouldn't > every appear on a system that doesn't have the support. Device is on > the die, not a plug-in card. Then the condition in the iosubmit function is just prone to silently paper over any bug in a driver: > + if (!cpu_has_write512()) > + return; This should at least issue a WARN_ON_ONCE() Thanks, tglx
On 11/20/19 5:22 PM, Thomas Gleixner wrote: > On Wed, 20 Nov 2019, Luck, Tony wrote: >> On Wed, Nov 20, 2019 at 10:53:39PM +0100, Borislav Petkov wrote: >>> On Wed, Nov 20, 2019 at 02:23:49PM -0700, Dave Jiang wrote: >>>> +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, >>>> + size_t count) >>> >>> An iosubmit function which returns void and doesn't tell its callers >>> whether it succeeded or not? That looks non-optimal to say the least. >> >> That's the underlying functionality of the MOVDIR64B instruction. A >> posted write so no way to know if it succeeded. When using dedicated >> queues the caller must keep count of how many operations are in flight >> and not send more than the depth of the queue. >> >>> Why isn't there a fallback function which to call when the CPU doesn't >>> support movdir64b? >> >> This particular driver has no option for fallback. Descriptors can >> only be submitted with MOVDIR64B (to dedicated queues ... in later >> patch series support for shared queues will be added, but those require >> ENQCMD or ENQCMDS to submit). >> >> The driver bails out at the beginning of the probe routine if the >> necessary instructions are not supported: >> >> + /* >> + * If the CPU does not support write512, there's no point in >> + * enumerating the device. We can not utilize it. >> + */ >> + if (!cpu_has_write512()) >> + return -ENXIO; >> >> Though we should always get past that as this PCI device ID shouldn't >> every appear on a system that doesn't have the support. Device is on >> the die, not a plug-in card. > > Then the condition in the iosubmit function is just prone to silently paper > over any bug in a driver: > >> + if (!cpu_has_write512()) >> + return; > > This should at least issue a WARN_ON_ONCE() Thanks! I'll be adding the WARN_ON_ONCE() for the cap check. Also with the alignment check Borislav mentioned, I'll add a WARN_ON() for failures. > > Thanks, > > tglx >
On Wed, Nov 20, 2019 at 3:27 PM Borislav Petkov <bp@alien8.de> wrote: > > On Wed, Nov 20, 2019 at 03:19:23PM -0800, Luck, Tony wrote: > > That's the underlying functionality of the MOVDIR64B instruction. A > > posted write so no way to know if it succeeded. > > So how do you know whether any of the writes went through? It's identical to the writel() mmio-write to start a SATA command transfer. The higher level device driver protocol validates that the command went through, ultimately with a timeout. There's no return value for iosubmit_cmds512() for the same reason there's no return value for the other iowrite primitives.
On Wed, 20 Nov 2019, Dan Williams wrote: > On Wed, Nov 20, 2019 at 3:27 PM Borislav Petkov <bp@alien8.de> wrote: > > > > On Wed, Nov 20, 2019 at 03:19:23PM -0800, Luck, Tony wrote: > > > That's the underlying functionality of the MOVDIR64B instruction. A > > > posted write so no way to know if it succeeded. > > > > So how do you know whether any of the writes went through? > > It's identical to the writel() mmio-write to start a SATA command > transfer. The higher level device driver protocol validates that the > command went through, ultimately with a timeout. There's no return > value for iosubmit_cmds512() for the same reason there's no return > value for the other iowrite primitives. With the difference that other iowrite primitive have no dependencies on cpu feature bits and cannot fail on the software level. Thanks, tglx
On Wed, Nov 20, 2019 at 4:53 PM Thomas Gleixner <tglx@linutronix.de> wrote: > > On Wed, 20 Nov 2019, Dan Williams wrote: > > On Wed, Nov 20, 2019 at 3:27 PM Borislav Petkov <bp@alien8.de> wrote: > > > > > > On Wed, Nov 20, 2019 at 03:19:23PM -0800, Luck, Tony wrote: > > > > That's the underlying functionality of the MOVDIR64B instruction. A > > > > posted write so no way to know if it succeeded. > > > > > > So how do you know whether any of the writes went through? > > > > It's identical to the writel() mmio-write to start a SATA command > > transfer. The higher level device driver protocol validates that the > > command went through, ultimately with a timeout. There's no return > > value for iosubmit_cmds512() for the same reason there's no return > > value for the other iowrite primitives. > > With the difference that other iowrite primitive have no dependencies on > cpu feature bits and cannot fail on the software level. True, but that would be a driver coding mistake flagged by the WARN_ON_ONCE, and the failure is static. The driver must check for static_cpu_has(X86_FEATURE_MOVDIR64B) once at init, but it need not check again on every command submission.
On Wed, Nov 20, 2019 at 05:32:51PM -0800, Dan Williams wrote: > True, but that would be a driver coding mistake flagged by the > WARN_ON_ONCE, and the failure is static. The driver must check for > static_cpu_has(X86_FEATURE_MOVDIR64B) once at init, So if you do that at driver init time, you don't need the static variant - simply use boot_cpu_has(). But if this function is going to be used on other platforms, then you need the check in the function and it must be static_cpu_has() for speed. The static_cpu_has() thing is a soft-of once check anyway because it gets patched by alternatives and after that it is 0 overhead.
On Wed, Nov 20, 2019 at 05:10:41PM -0700, Dave Jiang wrote: > I'll add the check on the destination address. The call is modeled after > __iowrite64_copy() / __iowrite32_copy() in lib/iomap_copy.c. Looks like > those functions do not check for the alignment requirements either. So just because they don't check, you don't need to check either? Can you guarantee that all callers will always do the right thing? I mean, if you don't care too much, why even write "(must be 512-bit aligned)"? Who cares then if the data is aligned or not... > > > + * @dst: destination, in MMIO space (must be 512-bit aligned) > > > + * @src: source > > > + * @count: number of 512 bits quantities to submit > > > > Where's that check on the data? > > I don't follow? What do you do if the caller doesn't submit data in 512 bits quantities?
On 11/21/19 3:59 AM, Borislav Petkov wrote: > On Wed, Nov 20, 2019 at 05:10:41PM -0700, Dave Jiang wrote: >> I'll add the check on the destination address. The call is modeled after >> __iowrite64_copy() / __iowrite32_copy() in lib/iomap_copy.c. Looks like >> those functions do not check for the alignment requirements either. > > So just because they don't check, you don't need to check either? No what I mean was those primitives are missing the checks and we should probably address that at some point. > > Can you guarantee that all callers will always do the right thing? > > I mean, if you don't care too much, why even write "(must be 512-bit > aligned)"? Who cares then if the data is aligned or not... > >>>> + * @dst: destination, in MMIO space (must be 512-bit aligned) >>>> + * @src: source >>>> + * @count: number of 512 bits quantities to submit >>> >>> Where's that check on the data? >> >> I don't follow? > > What do you do if the caller doesn't submit data in 512 bits quantities? > How would I detect that? Add a size (in bytes) parameter for the total source data?
On Thu, Nov 21, 2019 at 09:52:19AM -0700, Dave Jiang wrote: > No what I mean was those primitives are missing the checks and we should > probably address that at some point. Oh, patches are always welcome! :) > How would I detect that? Add a size (in bytes) parameter for the total > source data? Sure. So, here's the deal: the more I look at this thing, the more I think this iosubmit_cmds512() function should not be in a generic header but in an intel-/driver-specific one. Why? Well, movdir64b is Intel-only for now, you don't have a fallback option for the platforms which do not support that insn and it is more preferential for you to do the feature check once at driver init and then call the function because you *know* you have movdir64b support and not have any feature check in the function itself, not even a fast static_cpu_has() one. And this way you can do away with alignment and size checks because you control what your driver does. If it turns out that this function needs to be shared with other platforms, then we can consider lifting it into a generic header and making it more generic. Ok?
On Fri, Nov 22, 2019 at 1:00 AM Borislav Petkov <bp@alien8.de> wrote: > > On Thu, Nov 21, 2019 at 09:52:19AM -0700, Dave Jiang wrote: > > No what I mean was those primitives are missing the checks and we should > > probably address that at some point. > > Oh, patches are always welcome! :) > > > How would I detect that? Add a size (in bytes) parameter for the total > > source data? > > Sure. > > So, here's the deal: the more I look at this thing, the more I think > this iosubmit_cmds512() function should not be in a generic header but > in an intel-/driver-specific one. Why? > > Well, movdir64b is Intel-only for now, you don't have a fallback > option for the platforms which do not support that insn and it is more > preferential for you to do the feature check once at driver init and > then call the function because you *know* you have movdir64b support > and not have any feature check in the function itself, not even a fast > static_cpu_has() one. > > And this way you can do away with alignment and size checks because you > control what your driver does. > > If it turns out that this function needs to be shared with other > platforms, then we can consider lifting it into a generic header and > making it more generic. > > Ok? I do agree that iosubmit_cmds512() can live in a driver specific header, and it was my fault for advising Dave to make it generic. The long story of how that came to pass below, but the short story is yes, lets just make this one driver specific. The long story is that there is already line of sight for a need for other generic movdir64b() helpers as mentioned in the changelog, and iosubmit_cmds512() got wrapped up in that momentum. For those cases the thought would be to have memset512() for case1 and __iowrite512_copy() for case3. Where memset512() writes a non-incrementing source to an incrementing destination, and __iowrite512_copy() copies an incrementing source to an incrementing destination. Those 2 helpers *would* have fallbacks, but with the option to use something like cpu_has_write512() to check in advance whether those routines will fallback, or not. That can be a discussion for a future patchset when those users arrive.
On Fri, Nov 22, 2019 at 09:20:39AM -0800, Dan Williams wrote: > For those cases the thought would be to have memset512() for case1 and > __iowrite512_copy() for case3. Where memset512() writes a > non-incrementing source to an incrementing destination, and > __iowrite512_copy() copies an incrementing source to an incrementing > destination. Those 2 helpers *would* have fallbacks, but with the > option to use something like cpu_has_write512() to check in advance > whether those routines will fallback, or not. > > That can be a discussion for a future patchset when those users arrive. Oh, sure, of course. My only angle is very simple: if the MOVDIR* et al is only supported on upcoming Intel platforms and looking at the use cases: 1. clear poison/MKTME 3. copy iomem in big chunks I'm going to venture a guess that those two cases are going to be happening only on Intel platforms which already support MODVIR*. So wouldn't really need to do any generic helpers because those use cases are very specific already. Which would make your feature detection a one-time, driver-init time thing anyway... Unless I misunderstand those cases and there really is a use case where the thing would fallback and the fallback would really be for an "unenlightened" platform without that MOVDIR* hw support...? Hmmm.
On Fri, Nov 22, 2019 at 10:44 AM Borislav Petkov <bp@alien8.de> wrote: > > On Fri, Nov 22, 2019 at 09:20:39AM -0800, Dan Williams wrote: > > For those cases the thought would be to have memset512() for case1 and > > __iowrite512_copy() for case3. Where memset512() writes a > > non-incrementing source to an incrementing destination, and > > __iowrite512_copy() copies an incrementing source to an incrementing > > destination. Those 2 helpers *would* have fallbacks, but with the > > option to use something like cpu_has_write512() to check in advance > > whether those routines will fallback, or not. > > > > That can be a discussion for a future patchset when those users arrive. > > Oh, sure, of course. > > My only angle is very simple: if the MOVDIR* et al is only supported on > upcoming Intel platforms and looking at the use cases: > > 1. clear poison/MKTME > 3. copy iomem in big chunks > > I'm going to venture a guess that those two cases are going to be > happening only on Intel platforms which already support MODVIR*. So > wouldn't really need to do any generic helpers because those use cases > are very specific already. Which would make your feature detection a > one-time, driver-init time thing anyway... > > Unless I misunderstand those cases and there really is a use case > where the thing would fallback and the fallback would really be for an > "unenlightened" platform without that MOVDIR* hw support...? At least for something like __iowrite512_copy() it would indeed be something an unenlightened driver could call. Those drivers would simply be looking for opportunistic efficiency and could tolerate a fallback. Just like current __iowrite64_copy() users don't care if that routine falls back internally.
diff --git a/arch/x86/include/asm/io.h b/arch/x86/include/asm/io.h index 6bed97ff6db2..3126f6e1d5b8 100644 --- a/arch/x86/include/asm/io.h +++ b/arch/x86/include/asm/io.h @@ -403,5 +403,49 @@ extern bool arch_memremap_can_ram_remap(resource_size_t offset, extern bool phys_mem_access_encrypted(unsigned long phys_addr, unsigned long size); +#include <linux/cpufeature.h> +static inline bool cpu_has_write512(void) +{ + return cpu_feature_enabled(X86_FEATURE_MOVDIR64B); +} + +#define cpu_has_write512 cpu_has_write512 + +static inline void __iowrite512(void __iomem *__dst, const void *src) +{ + volatile struct { char _[64]; } *dst = __dst; + + /* movdir64b [rdx], rax */ + asm volatile(".byte 0x66, 0x0f, 0x38, 0xf8, 0x02" + : "=m" (dst) + : "d" (src), "a" (dst)); +} + +/** + * iosubmit_cmds512 - copy data to single MMIO location, in 512-bit units + * @dst: destination, in MMIO space (must be 512-bit aligned) + * @src: source + * @count: number of 512 bits quantities to submit + * + * Submit data from kernel space to MMIO space, in units of 512 bits at a + * time. Order of access is not guaranteed, nor is a memory barrier + * performed afterwards. + */ +static inline void iosubmit_cmds512(void __iomem *dst, const void *src, + size_t count) +{ + const u8 *from = src; + const u8 *end = from + count * 64; + + if (!cpu_has_write512()) + return; + + while (from < end) { + __iowrite512(dst, from); + from += 64; + } +} + +#define iosubmit_cmds512 iosubmit_cmds512 #endif /* _ASM_X86_IO_H */ diff --git a/include/linux/io.h b/include/linux/io.h index accac822336a..ca14af3e84de 100644 --- a/include/linux/io.h +++ b/include/linux/io.h @@ -20,6 +20,17 @@ __visible void __iowrite32_copy(void __iomem *to, const void *from, size_t count void __ioread32_copy(void *to, const void __iomem *from, size_t count); void __iowrite64_copy(void __iomem *to, const void *from, size_t count); +#ifndef cpu_has_write512 +#define cpu_has_write512() (0) +#endif + +#ifndef iosubmit_cmds512 +static inline void iosubmit_cmds512(void __iomem *to, const void *from, + size_t count) +{ +} +#endif + #ifdef CONFIG_MMU int ioremap_page_range(unsigned long addr, unsigned long end, phys_addr_t phys_addr, pgprot_t prot);
With the introduction of movdir64b instruction, there is now an instruction that can write 64 bytes of data atomicaly. Quoting from Intel SDM: "There is no atomicity guarantee provided for the 64-byte load operation from source address, and processor implementations may use multiple load operations to read the 64-bytes. The 64-byte direct-store issued by MOVDIR64B guarantees 64-byte write-completion atomicity. This means that the data arrives at the destination in a single undivided 64-byte write transaction." We have identified at least 3 different use cases for this instruction in the format of func(dst, src, count): 1) Clear poison / Initialize MKTME memory Destination is normal memory. Source in normal memory. Does not increment. (Copy same line to all targets) Count (to clear/init multiple lines) 2) Submit command(s) to new devices Destination is a special MMIO region for a device. Does not increment. Source is normal memory. Increments. Count usually is 1, but can be multiple. 3) Copy to iomem in big chunks Destination is iomem and increments Source in normal memory and increments Count is number of chunks to copy This commit adds support for case #2 to support device that will accept commands via this instruction. Signed-off-by: Dave Jiang <dave.jiang@intel.com> --- arch/x86/include/asm/io.h | 44 ++++++++++++++++++++++++++++++++++++++++++++ include/linux/io.h | 11 +++++++++++ 2 files changed, 55 insertions(+)