diff mbox series

[RFC,V2,1/2] md/raid5: optimize RAID5 performance.

Message ID SJ0PR10MB574146BF65CC516F253B2DADD83E2@SJ0PR10MB5741.namprd10.prod.outlook.com (mailing list archive)
State RFC, archived

Commit Message

Shushu Yi April 2, 2024, 5:05 p.m. UTC
From: Shushu Yi <firnyee@gmail.com>

<changelog>

Optimized by using fine-grained locks, customized data structures, and
scattered address space. Achieves significant improvements in both
throughput and latency.

This patch attempts to maximize thread-level parallelism and reduce
CPU suspension time caused by lock contention. On a system with four
PCIe 4.0 SSDs, it increases overall storage throughput by 89.4% and
decreases the 99.99th percentile I/O latency by 85.4%.

Seeking feedback on the approach and any additional information regarding
required performance testing before submitting a formal patch.

Note: this work has been published as a paper:
https://www.hotstorage.org/2022/camera-ready/hotstorage22-5/pdf/hotstorage22-5.pdf

Co-developed-by: Yiming Xu <teddyxym@outlook.com>
Signed-off-by: Yiming Xu <teddyxym@outlook.com>
Signed-off-by: Shushu Yi <firnyee@gmail.com>
Tested-by: Paul Luse <paul.e.luse@intel.com>
---
V1 -> V2: Cleaned up coding style and divided into 2 patches (HemiRAID
and ScalaRAID corresponding to the paper mentioned above). This part is
HemiRAID, which increased the number of stripe locks to 128.

 drivers/md/raid5.h | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

Comments

Paul Menzel April 2, 2024, 5:19 p.m. UTC | #1
Dear Shushu,

Thank you for your patch. Some comments and nits.

Please do *not* send it to <majordomo@vger.kernel.org>.

Please also do not add a dot/period at the end of the commit message 
summary. (A more specific one would be nice too.)


On 02.04.24 at 19:05, Yiming Xu wrote:
> From: Shushu Yi <firnyee@gmail.com>
> 
> <changelog>

Please remove.

> Optimized by using fine-grained locks, customized data structures, and

Imperative mood: Optimize

> scattered address space. Achieves significant improvements in both
> throughput and latency.
> 
> This patch attempts to maximize thread-level parallelism and reduce
> CPU suspension time caused by lock contention. On a system with four
> PCIe 4.0 SSDs, it increases overall storage throughput by 89.4% and
> decreases the 99.99th percentile I/O latency by 85.4%.
> 
> Seeking feedback on the approach and any additional information regarding
> required performance testing before submitting a formal patch.
> 
> Note: this work has been published as a paper:
> https://www.hotstorage.org/2022/camera-ready/hotstorage22-5/pdf/hotstorage22-5.pdf

A more elaborate description is needed.

> Co-developed-by: Yiming Xu <teddyxym@outlook.com>
> Signed-off-by: Yiming Xu <teddyxym@outlook.com>
> Signed-off-by: Shushu Yi <firnyee@gmail.com>
> Tested-by: Paul Luse <paul.e.luse@intel.com>
> ---
> V1 -> V2: Cleaned up coding style and divided into 2 patches (HemiRAID
> and ScalaRAID corresponding to the paper mentioned above). This part is
> HemiRAID, which increased the number of stripe locks to 128.
> 
>   drivers/md/raid5.h | 2 +-
>   1 file changed, 1 insertion(+), 1 deletion(-)
> 
> diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
> index 9b5a7dc3f2a0..d26da031d203 100644
> --- a/drivers/md/raid5.h
> +++ b/drivers/md/raid5.h
> @@ -501,7 +501,7 @@ struct disk_info {
>    * and creating that much locking depth can cause
>    * problems.
>    */
> -#define NR_STRIPE_HASH_LOCKS 8
> +#define NR_STRIPE_HASH_LOCKS 128
>   #define STRIPE_HASH_LOCKS_MASK (NR_STRIPE_HASH_LOCKS - 1)
>   
>   struct r5worker {

Is it intentional that you only increased the value of the macro? The
comment above also suggests that bigger numbers might cause problems.


Kind regards,

Paul

Patch

diff --git a/drivers/md/raid5.h b/drivers/md/raid5.h
index 9b5a7dc3f2a0..d26da031d203 100644
--- a/drivers/md/raid5.h
+++ b/drivers/md/raid5.h
@@ -501,7 +501,7 @@  struct disk_info {
  * and creating that much locking depth can cause
  * problems.
  */
-#define NR_STRIPE_HASH_LOCKS 8
+#define NR_STRIPE_HASH_LOCKS 128
 #define STRIPE_HASH_LOCKS_MASK (NR_STRIPE_HASH_LOCKS - 1)
 
 struct r5worker {
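
To illustrate why the lock count matters, here is a minimal user-space sketch of how a mask such as STRIPE_HASH_LOCKS_MASK can select a hash lock for a sector. It is not the kernel code: the helper names lock_index() and same_lock() are hypothetical, and the stripe shift of 3 (eight 512-byte sectors per stripe unit) is an assumption made only for this example. The mask trick works only when the lock count is a power of two, which holds for both 8 and 128.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: 8 sectors (4 KiB) per stripe unit. */
#define SKETCH_STRIPE_SHIFT 3

/*
 * Hypothetical helper: map a sector to a lock slot.
 * nr_locks must be a power of two so that (nr_locks - 1) is a valid mask.
 */
static unsigned int lock_index(uint64_t sector, unsigned int nr_locks)
{
	assert(nr_locks != 0 && (nr_locks & (nr_locks - 1)) == 0);
	return (unsigned int)((sector >> SKETCH_STRIPE_SHIFT) & (nr_locks - 1));
}

/* Hypothetical helper: would two sectors contend on the same lock? */
static bool same_lock(uint64_t a, uint64_t b, unsigned int nr_locks)
{
	return lock_index(a, nr_locks) == lock_index(b, nr_locks);
}
```

Under these assumptions, sectors 0 and 64 (stripe units 0 and 8) map to the same slot with 8 locks, because 8 & 7 is 0, but to slots 0 and 8 with 128 locks. Fewer collisions is the intended benefit; the sketch says nothing about the cost of code paths that take every lock at once, which is the concern the in-tree comment raises.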