[net-next,2/4] io: add function to flush the write combine buffer to device immediately

Message ID: 1627614864-50824-3-git-send-email-huangguangbin2@huawei.com
State: Changes Requested
Delegated to: Netdev Maintainers
Series: net: hns3: add support for TX push

Checks

Context                         Check    Description
netdev/cover_letter             success  Link
netdev/fixes_present            success  Link
netdev/patch_count              success  Link
netdev/tree_selection           success  Clearly marked for net-next
netdev/subject_prefix           success  Link
netdev/cc_maintainers           warning  6 maintainers not CCed: marcan@marcan.st akpm@linux-foundation.org wangxiongfeng2@huawei.com wangzhou1@hisilicon.com palmerdabbelt@google.com npiggin@gmail.com
netdev/source_inline            success  Was 0 now: 0
netdev/verify_signedoff         success  Link
netdev/module_param             success  Was 0 now: 0
netdev/build_32bit              success  Errors and warnings before: 11222 this patch: 11222
netdev/kdoc                     success  Errors and warnings before: 0 this patch: 0
netdev/verify_fixes             success  Link
netdev/checkpatch               success  total: 0 errors, 0 warnings, 0 checks, 18 lines checked
netdev/build_allmodconfig_warn  success  Errors and warnings before: 11010 this patch: 11010
netdev/header_inline            success  Link

Commit Message

Guangbin Huang July 30, 2021, 3:14 a.m. UTC
From: Xiongfeng Wang <wangxiongfeng2@huawei.com>

Device registers can be mapped as write-combine. In that case, writes do
not reach the device immediately: they are held in the write-combine
buffer and written to the device when the buffer is full. In some
situations, however, the buffered writes need to reach the device
immediately for better performance. So add a generic function called
'flush_wc_write()', implemented for ARM64 with the DGH instruction.

Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
---
 arch/arm64/include/asm/io.h | 2 ++
 include/linux/io.h          | 6 ++++++
 2 files changed, 8 insertions(+)
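
For illustration only, the call pattern described above might look like
the sketch below. It is a user-space model, not kernel code:
'demo_tx_push()', 'DEMO_PUSH_WORDS' and the plain array standing in for a
write-combine mapping are hypothetical, and the macro is defined here as
the generic no-op fallback from this patch.

```c
#include <stdint.h>

/* Assumption: same no-op fallback as the include/linux/io.h hunk. */
#ifndef flush_wc_write
#define flush_wc_write()	do { } while (0)
#endif

/* Hypothetical size of one pushed descriptor, in 64-bit words. */
#define DEMO_PUSH_WORDS	8

/*
 * Hypothetical helper: copy one descriptor into a push window that is
 * assumed to be mapped write-combine, then issue the hint so the
 * buffered writes are not held back waiting for more data.
 */
static void demo_tx_push(volatile uint64_t *push_win, const uint64_t *desc)
{
	int i;

	/* Kernel code would use an MMIO accessor here, not plain stores. */
	for (i = 0; i < DEMO_PUSH_WORDS; i++)
		push_win[i] = desc[i];

	flush_wc_write();
}
```

On an architecture without an override, the call compiles to nothing and
the helper is just the copy loop.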

Comments

Will Deacon July 30, 2021, 9 a.m. UTC | #1
Hi,

On Fri, Jul 30, 2021 at 11:14:22AM +0800, Guangbin Huang wrote:
> From: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> 
> Device registers can be mapped as write-combine. In that case, writes do
> not reach the device immediately: they are held in the write-combine
> buffer and written to the device when the buffer is full. In some
> situations, however, the buffered writes need to reach the device
> immediately for better performance. So add a generic function called
> 'flush_wc_write()', implemented for ARM64 with the DGH instruction.
> 
> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
> ---
>  arch/arm64/include/asm/io.h | 2 ++
>  include/linux/io.h          | 6 ++++++
>  2 files changed, 8 insertions(+)

-ENODOCUMENTATION

> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
> index 7fd836bea7eb..5315d023b2dd 100644
> --- a/arch/arm64/include/asm/io.h
> +++ b/arch/arm64/include/asm/io.h
> @@ -112,6 +112,8 @@ static inline u64 __raw_readq(const volatile void __iomem *addr)
>  #define __iowmb()		dma_wmb()
>  #define __iomb()		dma_mb()
>  
> +#define flush_wc_write()	dgh()

I think it would be worthwhile to look at what architectures other than
arm64 offer here. For example, is there anything similar to this on riscv,
x86 or power? Doing a quick survey of what's out there might help us define
a macro that can be used across multiple architectures.

Thanks,

Will

>  /*
>   * Relaxed I/O memory access primitives. These follow the Device memory
>   * ordering rules but do not guarantee any ordering relative to Normal memory
> diff --git a/include/linux/io.h b/include/linux/io.h
> index 9595151d800d..469d53444218 100644
> --- a/include/linux/io.h
> +++ b/include/linux/io.h
> @@ -166,4 +166,10 @@ static inline void arch_io_free_memtype_wc(resource_size_t base,
>  }
>  #endif
>  
> +/* IO barriers */
> +
> +#ifndef flush_wc_write
> +#define flush_wc_write()		do { } while (0)
> +#endif
> +
>  #endif /* _LINUX_IO_H */
> -- 
> 2.8.1
>
Catalin Marinas July 30, 2021, 9:42 a.m. UTC | #2
On Fri, Jul 30, 2021 at 11:14:22AM +0800, Guangbin Huang wrote:
> From: Xiongfeng Wang <wangxiongfeng2@huawei.com>
> 
> Device registers can be mapped as write-combine. In that case, writes do
> not reach the device immediately: they are held in the write-combine
> buffer and written to the device when the buffer is full. In some
> situations, however, the buffered writes need to reach the device
> immediately for better performance. So add a generic function called
> 'flush_wc_write()', implemented for ARM64 with the DGH instruction.

Isn't this slightly misleading? IIUC DGH does not guarantee flushing; it
just prevents writes from being merged (maybe this was already discussed
on the previous RFC).
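
To make the distinction concrete, here is a toy user-space model of a
write-combine buffer. It is not a description of any real CPU: every name
is hypothetical, and the hint is modelled in the weakest way consistent
with a "prevents merging only" reading, where nothing is sent at the hint
itself.

```c
#include <stdbool.h>
#include <stddef.h>

#define DEMO_WC_CAPACITY	4	/* hypothetical buffer size, in writes */

struct demo_wc_buf {
	size_t pending;		/* writes buffered, not yet sent to the device */
	size_t transactions;	/* bus transactions issued so far */
	bool no_merge;		/* set by the hint: next write must not join pending ones */
};

/* Send whatever is pending as a single merged transaction. */
static void demo_drain(struct demo_wc_buf *b)
{
	if (b->pending) {
		b->transactions++;
		b->pending = 0;
	}
}

/* Buffer one write; a full buffer is drained as one transaction. */
static void demo_write(struct demo_wc_buf *b)
{
	if (b->no_merge) {
		demo_drain(b);	/* earlier writes go out on their own */
		b->no_merge = false;
	}
	if (++b->pending == DEMO_WC_CAPACITY)
		demo_drain(b);
}

/* Models a "do not merge across this point" hint: sends nothing itself. */
static void demo_gather_hint(struct demo_wc_buf *b)
{
	b->no_merge = true;
}

/* Models a true flush: pending writes are sent before returning. */
static void demo_flush(struct demo_wc_buf *b)
{
	demo_drain(b);
}
```

In this model, two writes followed by 'demo_gather_hint()' leave both
writes pending. Only a later write or 'demo_flush()' pushes them out, so a
name containing "flush" promises more than the hint provides.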
Guangbin Huang Oct. 11, 2021, 1:37 p.m. UTC | #3
On 2021/7/30 17:00, Will Deacon wrote:
> Hi,
> 
> On Fri, Jul 30, 2021 at 11:14:22AM +0800, Guangbin Huang wrote:
>> From: Xiongfeng Wang <wangxiongfeng2@huawei.com>
>>
>> Device registers can be mapped as write-combine. In that case, writes do
>> not reach the device immediately: they are held in the write-combine
>> buffer and written to the device when the buffer is full. In some
>> situations, however, the buffered writes need to reach the device
>> immediately for better performance. So add a generic function called
>> 'flush_wc_write()', implemented for ARM64 with the DGH instruction.
>>
>> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
>> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
>> ---
>>   arch/arm64/include/asm/io.h | 2 ++
>>   include/linux/io.h          | 6 ++++++
>>   2 files changed, 8 insertions(+)
> 
> -ENODOCUMENTATION
> 
Hi Will, could you advise which documentation file would be the right place to document this?

>> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
>> index 7fd836bea7eb..5315d023b2dd 100644
>> --- a/arch/arm64/include/asm/io.h
>> +++ b/arch/arm64/include/asm/io.h
>> @@ -112,6 +112,8 @@ static inline u64 __raw_readq(const volatile void __iomem *addr)
>>   #define __iowmb()		dma_wmb()
>>   #define __iomb()		dma_mb()
>>   
>> +#define flush_wc_write()	dgh()
> 
> I think it would be worthwhile to look at what architectures other than
> arm64 offer here. For example, is there anything similar to this on riscv,
> x86 or power? Doing a quick survey of what's out there might help us define
> a macro that can be used across multiple architectures.
> 
> Thanks,
> 
> Will
> 
>>   /*
>>    * Relaxed I/O memory access primitives. These follow the Device memory
>>    * ordering rules but do not guarantee any ordering relative to Normal memory
>> diff --git a/include/linux/io.h b/include/linux/io.h
>> index 9595151d800d..469d53444218 100644
>> --- a/include/linux/io.h
>> +++ b/include/linux/io.h
>> @@ -166,4 +166,10 @@ static inline void arch_io_free_memtype_wc(resource_size_t base,
>>   }
>>   #endif
>>   
>> +/* IO barriers */
>> +
>> +#ifndef flush_wc_write
>> +#define flush_wc_write()		do { } while (0)
>> +#endif
>> +
>>   #endif /* _LINUX_IO_H */
>> -- 
>> 2.8.1
>>
> .
>
Xiongfeng Wang Oct. 15, 2021, 1:48 a.m. UTC | #4
Hi, Will

On 2021/7/30 17:00, Will Deacon wrote:
> Hi,
> 
> On Fri, Jul 30, 2021 at 11:14:22AM +0800, Guangbin Huang wrote:
>> From: Xiongfeng Wang <wangxiongfeng2@huawei.com>
>>
>> Device registers can be mapped as write-combine. In that case, writes do
>> not reach the device immediately: they are held in the write-combine
>> buffer and written to the device when the buffer is full. In some
>> situations, however, the buffered writes need to reach the device
>> immediately for better performance. So add a generic function called
>> 'flush_wc_write()', implemented for ARM64 with the DGH instruction.
>>
>> Signed-off-by: Xiongfeng Wang <wangxiongfeng2@huawei.com>
>> Signed-off-by: Guangbin Huang <huangguangbin2@huawei.com>
>> ---
>>  arch/arm64/include/asm/io.h | 2 ++
>>  include/linux/io.h          | 6 ++++++
>>  2 files changed, 8 insertions(+)
> 
> -ENODOCUMENTATION
> 
>> diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
>> index 7fd836bea7eb..5315d023b2dd 100644
>> --- a/arch/arm64/include/asm/io.h
>> +++ b/arch/arm64/include/asm/io.h
>> @@ -112,6 +112,8 @@ static inline u64 __raw_readq(const volatile void __iomem *addr)
>>  #define __iowmb()		dma_wmb()
>>  #define __iomb()		dma_mb()
>>  
>> +#define flush_wc_write()	dgh()
> 
> I think it would be worthwhile to look at what architectures other than
> arm64 offer here. For example, is there anything similar to this on riscv,
> x86 or power? Doing a quick survey of what's out there might help us define
> a macro that can be used across multiple architectures.

I searched the 'barrier.h' of the other architectures and didn't find any
similar merge-preventing instructions. Could you give me some advice on
naming this common interface?

Thanks,
Xiongfeng

> 
> Thanks,
> 
> Will
> 
>>  /*
>>   * Relaxed I/O memory access primitives. These follow the Device memory
>>   * ordering rules but do not guarantee any ordering relative to Normal memory
>> diff --git a/include/linux/io.h b/include/linux/io.h
>> index 9595151d800d..469d53444218 100644
>> --- a/include/linux/io.h
>> +++ b/include/linux/io.h
>> @@ -166,4 +166,10 @@ static inline void arch_io_free_memtype_wc(resource_size_t base,
>>  }
>>  #endif
>>  
>> +/* IO barriers */
>> +
>> +#ifndef flush_wc_write
>> +#define flush_wc_write()		do { } while (0)
>> +#endif
>> +
>>  #endif /* _LINUX_IO_H */
>> -- 
>> 2.8.1
>>
> .
>

Patch

diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 7fd836bea7eb..5315d023b2dd 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -112,6 +112,8 @@  static inline u64 __raw_readq(const volatile void __iomem *addr)
 #define __iowmb()		dma_wmb()
 #define __iomb()		dma_mb()
 
+#define flush_wc_write()	dgh()
+
 /*
  * Relaxed I/O memory access primitives. These follow the Device memory
  * ordering rules but do not guarantee any ordering relative to Normal memory
diff --git a/include/linux/io.h b/include/linux/io.h
index 9595151d800d..469d53444218 100644
--- a/include/linux/io.h
+++ b/include/linux/io.h
@@ -166,4 +166,10 @@  static inline void arch_io_free_memtype_wc(resource_size_t base,
 }
 #endif
 
+/* IO barriers */
+
+#ifndef flush_wc_write
+#define flush_wc_write()		do { } while (0)
+#endif
+
 #endif /* _LINUX_IO_H */
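
The two hunks rely on the usual "architecture defines it, generic header
falls back to a no-op" pattern. The sketch below models that in a single
user-space file. 'DEMO_ARCH_HAS_HINT', 'demo_hint_count' and
'demo_ring_doorbell()' are hypothetical, and the counter merely stands in
for the arm64 'dgh()' call so that the effect is observable.

```c
/* Set to 0 to model an architecture that provides no override. */
#define DEMO_ARCH_HAS_HINT	1

/* Counts how often the stand-in architecture hook ran. */
static int demo_hint_count;

/* Stand-in for the arch header (arch/arm64/include/asm/io.h in the patch). */
#if DEMO_ARCH_HAS_HINT
#define flush_wc_write()	((void)demo_hint_count++)
#endif

/* Stand-in for the generic header: define the no-op only if no arch did. */
#ifndef flush_wc_write
#define flush_wc_write()	do { } while (0)
#endif

/* Hypothetical caller: the same source works with or without an override. */
static int demo_ring_doorbell(void)
{
	flush_wc_write();
	return demo_hint_count;
}
```

With 'DEMO_ARCH_HAS_HINT' set to 0 the call expands to an empty statement
and the counter never changes, which is the behaviour every architecture
other than arm64 would get from this patch.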