
[for-next] RDMA/hns: Add interface to support lock free

Message ID 1583999290-20514-1-git-send-email-liweihang@huawei.com (mailing list archive)
State Rejected
Delegated to: Jason Gunthorpe
Series [for-next] RDMA/hns: Add interface to support lock free

Commit Message

Weihang Li March 12, 2020, 7:48 a.m. UTC
From: Jiaran Zhang <zhangjiaran@huawei.com>

In some scenarios the ULP can ensure that there is no concurrency when
processing I/O, so lock-free paths can be used to improve performance.

Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
Signed-off-by: Yixian Liu <liuyixian@huawei.com>
Signed-off-by: Weihang Li <liweihang@huawei.com>
---
 drivers/infiniband/hw/hns/hns_roce_device.h |   1 +
 drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 112 ++++++++++++++++++++--------
 drivers/infiniband/hw/hns/hns_roce_qp.c     |   1 +
 3 files changed, 84 insertions(+), 30 deletions(-)

Comments

Leon Romanovsky March 12, 2020, 9:26 a.m. UTC | #1
On Thu, Mar 12, 2020 at 03:48:10PM +0800, Weihang Li wrote:
> From: Jiaran Zhang <zhangjiaran@huawei.com>
>
> In some scenarios, ULP can ensure that there is no concurrency when
> processing io, thus lock free can be used to improve performance.
>
> Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
> Signed-off-by: Yixian Liu <liuyixian@huawei.com>
> Signed-off-by: Weihang Li <liweihang@huawei.com>
> ---
>  drivers/infiniband/hw/hns/hns_roce_device.h |   1 +
>  drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 112 ++++++++++++++++++++--------
>  drivers/infiniband/hw/hns/hns_roce_qp.c     |   1 +
>  3 files changed, 84 insertions(+), 30 deletions(-)
>

NAK, everything in this patch is wrong, starting from exposure,
implementation and description.

Thanks
Jason Gunthorpe March 12, 2020, 5:02 p.m. UTC | #2
On Thu, Mar 12, 2020 at 11:26:40AM +0200, Leon Romanovsky wrote:
> On Thu, Mar 12, 2020 at 03:48:10PM +0800, Weihang Li wrote:
> > From: Jiaran Zhang <zhangjiaran@huawei.com>
> >
> > In some scenarios, ULP can ensure that there is no concurrency when
> > processing io, thus lock free can be used to improve performance.
> >
> > Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
> > Signed-off-by: Yixian Liu <liuyixian@huawei.com>
> > Signed-off-by: Weihang Li <liweihang@huawei.com>
> >  drivers/infiniband/hw/hns/hns_roce_device.h |   1 +
> >  drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 112 ++++++++++++++++++++--------
> >  drivers/infiniband/hw/hns/hns_roce_qp.c     |   1 +
> >  3 files changed, 84 insertions(+), 30 deletions(-)
> >
> 
> NAK, everything in this patch is wrong, starting from exposure,
> implementation and description.

Yes, complete no on a module parameter to disable locking.

Jason
Leon Romanovsky March 12, 2020, 5:26 p.m. UTC | #3
On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
> What would you say to a per-process env variable to disable locking in a userspace provider?

We have the thread-domain concept in libibverbs to achieve lockless flows.
https://github.com/linux-rdma/rdma-core/blob/master/libibverbs/man/ibv_alloc_td.3
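As a rough sketch of that flow (based on the man page above; error
handling is trimmed and the helper name is only for illustration), a
process that guarantees it will not touch the objects concurrently can
do something like:

#include <infiniband/verbs.h>

/* Create a thread domain and wrap the PD in a parent domain bound to it.
 * QPs/CQs created against the returned parent domain tell the provider it
 * may skip its data-path locking for them. */
static struct ibv_pd *alloc_lockless_pd(struct ibv_context *ctx,
					struct ibv_pd *pd)
{
	struct ibv_td_init_attr td_attr = {};
	struct ibv_parent_domain_init_attr pd_attr = {};
	struct ibv_td *td;

	td = ibv_alloc_td(ctx, &td_attr);
	if (!td)
		return NULL;

	pd_attr.pd = pd;	/* PD to wrap */
	pd_attr.td = td;	/* caller promises single-threaded use */

	return ibv_alloc_parent_domain(ctx, &pd_attr);
}

The parent domain is then passed as the PD when creating QPs (and as the
parent-domain attribute when creating extended CQs), so the provider knows
no internal locking is needed for those objects.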

BTW, please don't use top-posting.

Thanks

>
> -Andrew
>
> > On Mar 12, 2020, at 1:02 PM, Jason Gunthorpe <jgg@ziepe.ca> wrote:
> >
> > On Thu, Mar 12, 2020 at 11:26:40AM +0200, Leon Romanovsky wrote:
> >> On Thu, Mar 12, 2020 at 03:48:10PM +0800, Weihang Li wrote:
> >>> From: Jiaran Zhang <zhangjiaran@huawei.com>
> >>>
> >>> In some scenarios, ULP can ensure that there is no concurrency when
> >>> processing io, thus lock free can be used to improve performance.
> >>>
> >>> Signed-off-by: Jiaran Zhang <zhangjiaran@huawei.com>
> >>> Signed-off-by: Yixian Liu <liuyixian@huawei.com>
> >>> Signed-off-by: Weihang Li <liweihang@huawei.com>
> >>> drivers/infiniband/hw/hns/hns_roce_device.h |   1 +
> >>> drivers/infiniband/hw/hns/hns_roce_hw_v2.c  | 112 ++++++++++++++++++++--------
> >>> drivers/infiniband/hw/hns/hns_roce_qp.c     |   1 +
> >>> 3 files changed, 84 insertions(+), 30 deletions(-)
> >>>
> >>
> >> NAK, everything in this patch is wrong, starting from exposure,
> >> implementation and description.
> >
> > Yes, complete no on a module parameter to disable locking.
> >
> > Jason
>
Jason Gunthorpe March 12, 2020, 5:27 p.m. UTC | #4
On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
>    What would you say to a per-process env variable to disable locking in
>    a userspace provider?

That is also a no. verbs now has a 'thread domain' whose purpose is to
allow data plane locks to be skipped.

Generally new env vars in verbs are going to face opposition from
me.

Jason
Weihang Li March 13, 2020, 6:02 a.m. UTC | #5
On 2020/3/13 1:27, Jason Gunthorpe wrote:
> On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
>>    What would you say to a per-process env variable to disable locking in
>>    a userspace provider?
> 
> That is also a no. verbs now has 'thread domain' who's purpose is to
> allow data plane locks to be skipped.
> 
> Generally new env vars in verbs are going to face opposition from
> me.
> 
> Jason
> 


Hi Leon and Jason,

Thanks for your comments. Do you have any suggestions on how to
achieve lockless flows in the kernel? Are there any interfaces in
the kernel similar to the thread domain in userspace?

Weihang
Jason Gunthorpe March 13, 2020, 12:18 p.m. UTC | #6
On Fri, Mar 13, 2020 at 06:02:20AM +0000, liweihang wrote:
> On 2020/3/13 1:27, Jason Gunthorpe wrote:
> > On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
> >>    What would you say to a per-process env variable to disable locking in
> >>    a userspace provider?
> > 
> > That is also a no. verbs now has 'thread domain' who's purpose is to
> > allow data plane locks to be skipped.
> > 
> > Generally new env vars in verbs are going to face opposition from
> > me.
> > 
> > Jason
> 
> Thanks for your comments. Do you have some suggestions on how to
> achieve lockless flows in kernel? Are there any similar interfaces
> in kernel like the thread domain in userspace?

It has never come up before

Jason
Weihang Li March 14, 2020, 3:44 a.m. UTC | #7
On 2020/3/13 20:18, Jason Gunthorpe wrote:
> On Fri, Mar 13, 2020 at 06:02:20AM +0000, liweihang wrote:
>> On 2020/3/13 1:27, Jason Gunthorpe wrote:
>>> On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
>>>>    What would you say to a per-process env variable to disable locking in
>>>>    a userspace provider?
>>>
>>> That is also a no. verbs now has 'thread domain' who's purpose is to
>>> allow data plane locks to be skipped.
>>>
>>> Generally new env vars in verbs are going to face opposition from
>>> me.
>>>
>>> Jason
>>
>> Thanks for your comments. Do you have some suggestions on how to
>> achieve lockless flows in kernel? Are there any similar interfaces
>> in kernel like the thread domain in userspace?
> 
> It has never come up before
> 
> Jason
> 

Thank you, Jason. Could you please explain why it's not encouraged to
use module parameters in the kernel?

And why shouldn't we add new environment variables in userspace?
Is it for the same reason?

Weihang
Leon Romanovsky March 14, 2020, 9:49 a.m. UTC | #8
On Sat, Mar 14, 2020 at 03:44:49AM +0000, liweihang wrote:
> On 2020/3/13 20:18, Jason Gunthorpe wrote:
> > On Fri, Mar 13, 2020 at 06:02:20AM +0000, liweihang wrote:
> >> On 2020/3/13 1:27, Jason Gunthorpe wrote:
> >>> On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
> >>>>    What would you say to a per-process env variable to disable locking in
> >>>>    a userspace provider?
> >>>
> >>> That is also a no. verbs now has 'thread domain' who's purpose is to
> >>> allow data plane locks to be skipped.
> >>>
> >>> Generally new env vars in verbs are going to face opposition from
> >>> me.
> >>>
> >>> Jason
> >>
> >> Thanks for your comments. Do you have some suggestions on how to
> >> achieve lockless flows in kernel? Are there any similar interfaces
> >> in kernel like the thread domain in userspace?
> >
> > It has never come up before
> >
> > Jason
> >
>
> Thank you, Jason. Could you please explain why it's not encouraged to
> use module parameters in kernel?
>
> What about the reason why we shouldn't add new environment variables
> in userspace? Do they have the same reason?

No, they are discouraged for different technical reasons. There are many
reasons for not allowing them, but what immediately comes to mind is that
environment variables are not thread safe, are silently inherited across
fork() and have very interesting behavior in scripts.

This makes environment variables the worst configuration "tool"
for libraries.
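As a toy illustration of the fork() problem (the variable name below is
made up, not something any provider actually reads):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

int main(void)
{
	/* imagine a provider consulted this at device-open time;
	 * setenv() is also not required to be thread safe by POSIX */
	setenv("HNS_DISABLE_LOCKS", "1", 1);

	if (fork() == 0) {
		/* the child, and anything it execs, silently inherits the
		 * lock-free setting, whether or not it is safe there */
		printf("child sees HNS_DISABLE_LOCKS=%s\n",
		       getenv("HNS_DISABLE_LOCKS"));
		_exit(0);
	}
	return 0;
}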

Module parameters are bad due to the inability to deprecate them, the
differences between drivers that require rewriting scripts/configs after a
driver/HW change, the global nature of the change, and the really painful
experience for users whose workload requires changing those defaults.

Thanks

>
> Weihang
Leon Romanovsky March 14, 2020, 6:07 p.m. UTC | #9
On Sat, Mar 14, 2020 at 03:44:49AM +0000, liweihang wrote:
> On 2020/3/13 20:18, Jason Gunthorpe wrote:
> > On Fri, Mar 13, 2020 at 06:02:20AM +0000, liweihang wrote:
> >> On 2020/3/13 1:27, Jason Gunthorpe wrote:
> >>> On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
> >>>>    What would you say to a per-process env variable to disable locking in
> >>>>    a userspace provider?
> >>>
> >>> That is also a no. verbs now has 'thread domain' who's purpose is to
> >>> allow data plane locks to be skipped.
> >>>
> >>> Generally new env vars in verbs are going to face opposition from
> >>> me.
> >>>
> >>> Jason
> >>
> >> Thanks for your comments. Do you have some suggestions on how to
> >> achieve lockless flows in kernel? Are there any similar interfaces
> >> in kernel like the thread domain in userspace?
> >
> > It has never come up before
> >
> > Jason
> >
>
> Thank you, Jason. Could you please explain why it's not encouraged to
> use module parameters in kernel?
>
> What about the reason why we shouldn't add new environment variables
> in userspace? Do they have the same reason?

I don't know why my previous answer didn't appear on the ML; I hope that
this one arrives.

The technical reasons to avoid environment variables and kernel module
parameters are not the same, but they are very similar.

Environment variables are not thread safe (in POSIX), are inherited across
fork() and behave differently in scripts. All of this together makes them
a very bad user-visible configuration interface.

Kernel module parameters are not welcome due to their global nature, the
differences between various drivers which make it hard for users to change
HW/scripts, the near impossibility of deprecating them, etc.

Thanks

>
> Weihang
Weihang Li March 16, 2020, 10:46 a.m. UTC | #10
On 2020/3/15 2:07, Leon Romanovsky wrote:
> On Sat, Mar 14, 2020 at 03:44:49AM +0000, liweihang wrote:
>> On 2020/3/13 20:18, Jason Gunthorpe wrote:
>>> On Fri, Mar 13, 2020 at 06:02:20AM +0000, liweihang wrote:
>>>> On 2020/3/13 1:27, Jason Gunthorpe wrote:
>>>>> On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
>>>>>>    What would you say to a per-process env variable to disable locking in
>>>>>>    a userspace provider?
>>>>>
>>>>> That is also a no. verbs now has 'thread domain' who's purpose is to
>>>>> allow data plane locks to be skipped.
>>>>>
>>>>> Generally new env vars in verbs are going to face opposition from
>>>>> me.
>>>>>
>>>>> Jason
>>>>
>>>> Thanks for your comments. Do you have some suggestions on how to
>>>> achieve lockless flows in kernel? Are there any similar interfaces
>>>> in kernel like the thread domain in userspace?
>>>
>>> It has never come up before
>>>
>>> Jason
>>>
>>
>> Thank you, Jason. Could you please explain why it's not encouraged to
>> use module parameters in kernel?
>>
>> What about the reason why we shouldn't add new environment variables
>> in userspace? Do they have the same reason?
> 
> I don't know why my previous answer didn't appear in the ML, hope that
> this will arrive.
> 
> The technical reasons to avoid environmental variables and kernel module
> parameters are not the same, but very similar.
> 
> Environmental variables are not thread safe (in POSIX), inherited with
> fork() and behaves differently in scripts. All this together makes them
> as very bad user visible configuration interface.
> 
> Kernel module parameters are not welcomed due to their global nature,
> difference between various drivers which makes hard for users to change
> HW/scripts, almost impossible to deprecate e.t.c.
> 
> Thanks
> 

I have received both of your responses :)
Thanks for your detailed and clear explanation.

Weihang
Jason Gunthorpe March 16, 2020, 12:12 p.m. UTC | #11
On Sat, Mar 14, 2020 at 03:44:49AM +0000, liweihang wrote:
> On 2020/3/13 20:18, Jason Gunthorpe wrote:
> > On Fri, Mar 13, 2020 at 06:02:20AM +0000, liweihang wrote:
> >> On 2020/3/13 1:27, Jason Gunthorpe wrote:
> >>> On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
> >>>>    What would you say to a per-process env variable to disable locking in
> >>>>    a userspace provider?
> >>>
> >>> That is also a no. verbs now has 'thread domain' who's purpose is to
> >>> allow data plane locks to be skipped.
> >>>
> >>> Generally new env vars in verbs are going to face opposition from
> >>> me.
> >>>
> >>> Jason
> >>
> >> Thanks for your comments. Do you have some suggestions on how to
> >> achieve lockless flows in kernel? Are there any similar interfaces
> >> in kernel like the thread domain in userspace?
> > 
> > It has never come up before
> > 
> > Jason
> > 
> 
> Thank you, Jason. Could you please explain why it's not encouraged to
> use module parameters in kernel?

Behavior that affects the operation of a ULP should never be
configured globally. The ULP must self-select this behavior
programmatically, only when it is safe.
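To make the contrast concrete, here is a purely hypothetical sketch of
what "self-selecting" could look like for a kernel ULP (the create flag
below does not exist upstream; it only illustrates a per-object,
programmatic opt-in instead of a global module parameter):

/* hypothetical flag, shown for illustration only */
#define IB_QP_CREATE_SINGLE_THREADED	(1U << 27)

static struct ib_qp *ulp_create_lockless_qp(struct ib_pd *pd,
					    struct ib_cq *cq)
{
	struct ib_qp_init_attr attr = {
		.qp_type	= IB_QPT_RC,
		.send_cq	= cq,
		.recv_cq	= cq,
		.cap = {
			.max_send_wr	= 64,
			.max_recv_wr	= 64,
			.max_send_sge	= 1,
			.max_recv_sge	= 1,
		},
		/* the ULP asserts it never posts to this QP concurrently */
		.create_flags	= IB_QP_CREATE_SINGLE_THREADED,
	};

	return ib_create_qp(pd, &attr);
}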

Jason
Leon Romanovsky March 16, 2020, 1:04 p.m. UTC | #12
On Mon, Mar 16, 2020 at 09:12:31AM -0300, Jason Gunthorpe wrote:
> On Sat, Mar 14, 2020 at 03:44:49AM +0000, liweihang wrote:
> > On 2020/3/13 20:18, Jason Gunthorpe wrote:
> > > On Fri, Mar 13, 2020 at 06:02:20AM +0000, liweihang wrote:
> > >> On 2020/3/13 1:27, Jason Gunthorpe wrote:
> > >>> On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
> > >>>>    What would you say to a per-process env variable to disable locking in
> > >>>>    a userspace provider?
> > >>>
> > >>> That is also a no. verbs now has 'thread domain' who's purpose is to
> > >>> allow data plane locks to be skipped.
> > >>>
> > >>> Generally new env vars in verbs are going to face opposition from
> > >>> me.
> > >>>
> > >>> Jason
> > >>
> > >> Thanks for your comments. Do you have some suggestions on how to
> > >> achieve lockless flows in kernel? Are there any similar interfaces
> > >> in kernel like the thread domain in userspace?
> > >
> > > It has never come up before
> > >
> > > Jason
> > >
> >
> > Thank you, Jason. Could you please explain why it's not encouraged to
> > use module parameters in kernel?
>
> Behavior that effects the operation of a ULP should never be
> configured globally. The ULP must self-select this behavior
> pragmatically, only when it is safe.

Indeed, very good point.

I just want to add that for a ULP it is very rare that module
parameters are the right choice either, because usually those parameters
change ULP behavior to suit a specific workload.

Thanks

>
> Jason
Weihang Li March 16, 2020, 1:29 p.m. UTC | #13
On 2020/3/16 21:04, Leon Romanovsky wrote:
> On Mon, Mar 16, 2020 at 09:12:31AM -0300, Jason Gunthorpe wrote:
>> On Sat, Mar 14, 2020 at 03:44:49AM +0000, liweihang wrote:
>>> On 2020/3/13 20:18, Jason Gunthorpe wrote:
>>>> On Fri, Mar 13, 2020 at 06:02:20AM +0000, liweihang wrote:
>>>>> On 2020/3/13 1:27, Jason Gunthorpe wrote:
>>>>>> On Thu, Mar 12, 2020 at 01:04:05PM -0400, Andrew Boyer wrote:
>>>>>>>    What would you say to a per-process env variable to disable locking in
>>>>>>>    a userspace provider?
>>>>>>
>>>>>> That is also a no. verbs now has 'thread domain' who's purpose is to
>>>>>> allow data plane locks to be skipped.
>>>>>>
>>>>>> Generally new env vars in verbs are going to face opposition from
>>>>>> me.
>>>>>>
>>>>>> Jason
>>>>>
>>>>> Thanks for your comments. Do you have some suggestions on how to
>>>>> achieve lockless flows in kernel? Are there any similar interfaces
>>>>> in kernel like the thread domain in userspace?
>>>>
>>>> It has never come up before
>>>>
>>>> Jason
>>>>
>>>
>>> Thank you, Jason. Could you please explain why it's not encouraged to
>>> use module parameters in kernel?
>>
>> Behavior that effects the operation of a ULP should never be
>> configured globally. The ULP must self-select this behavior
>> pragmatically, only when it is safe.
> 
> Indeed, very good point.
> 
> I just want to add that for ULP it is very rare that module
> parameters are the right choice either, because usually those parameters
> change ULP behavior be suitable for specific workload.
> 
> Thanks
> 
>>
>> Jason
> 

I see, thanks again for your detailed explanation; it's very helpful to us.

Weihang

Patch

diff --git a/drivers/infiniband/hw/hns/hns_roce_device.h b/drivers/infiniband/hw/hns/hns_roce_device.h
index d7dcf6e..974d125 100644
--- a/drivers/infiniband/hw/hns/hns_roce_device.h
+++ b/drivers/infiniband/hw/hns/hns_roce_device.h
@@ -703,6 +703,7 @@  struct hns_roce_qp {
 	struct list_head	node;		/* all qps are on a list */
 	struct list_head	rq_node;	/* all recv qps are on a list */
 	struct list_head	sq_node;	/* all send qps are on a list */
+	u8                      flush_en;
 };
 
 struct hns_roce_ib_iboe {
diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
index 82021fa..369f9d1 100644
--- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
+++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c
@@ -48,6 +48,35 @@ 
 #include "hns_roce_hem.h"
 #include "hns_roce_hw_v2.h"
 
+static bool qp_lock = true;
+static bool cq_lock = true;
+
+static inline void v2_spin_lock_irqsave(bool has_lock, spinlock_t *lock,
+					unsigned long *flags)
+{
+	if (likely(has_lock))
+		spin_lock_irqsave(lock, *flags);
+}
+
+static inline void v2_spin_unlock_irqrestore(bool has_lock, spinlock_t *lock,
+					     unsigned long *flags)
+{
+	if (likely(has_lock))
+		spin_unlock_irqrestore(lock, *flags);
+}
+
+static inline void v2_spin_lock_irq(bool has_lock, spinlock_t *lock)
+{
+	if (likely(has_lock))
+		spin_lock_irq(lock);
+}
+
+static inline void v2_spin_unlock_irq(bool has_lock, spinlock_t *lock)
+{
+	if (likely(has_lock))
+		spin_unlock_irq(lock);
+}
+
 static void set_data_seg_v2(struct hns_roce_v2_wqe_data_seg *dseg,
 			    struct ib_sge *sg)
 {
@@ -257,7 +286,8 @@  static inline void update_sq_db(struct hns_roce_dev *hr_dev,
 	 * now.
 	 */
 	if (qp->state == IB_QPS_ERR) {
-		if (!test_and_set_bit(HNS_ROCE_FLUSH_FLAG, &qp->flush_flag))
+		if (qp_lock &&
+		    !test_and_set_bit(HNS_ROCE_FLUSH_FLAG, &qp->flush_flag))
 			init_flush_work(hr_dev, qp);
 	} else {
 		struct hns_roce_v2_db sq_db = {};
@@ -301,7 +331,7 @@  static int hns_roce_v2_post_send(struct ib_qp *ibqp,
 	int ret;
 	int i;
 
-	spin_lock_irqsave(&qp->sq.lock, flags);
+	v2_spin_lock_irqsave(qp_lock, &qp->sq.lock, &flags);
 
 	ret = check_send_valid(hr_dev, qp);
 	if (ret) {
@@ -617,7 +647,7 @@  static int hns_roce_v2_post_send(struct ib_qp *ibqp,
 		update_sq_db(hr_dev, qp);
 	}
 
-	spin_unlock_irqrestore(&qp->sq.lock, flags);
+	v2_spin_unlock_irqrestore(qp_lock, &qp->sq.lock, &flags);
 
 	return ret;
 }
@@ -642,14 +672,14 @@  static int hns_roce_v2_post_recv(struct ib_qp *ibqp,
 	struct hns_roce_v2_wqe_data_seg *dseg;
 	struct hns_roce_rinl_sge *sge_list;
 	struct device *dev = hr_dev->dev;
-	unsigned long flags;
+	unsigned long flags = 0;
 	void *wqe = NULL;
 	u32 wqe_idx;
 	int nreq;
 	int ret;
 	int i;
 
-	spin_lock_irqsave(&hr_qp->rq.lock, flags);
+	v2_spin_lock_irqsave(qp_lock, &hr_qp->rq.lock, &flags);
 
 	ret = check_recv_valid(hr_dev, hr_qp);
 	if (ret) {
@@ -721,14 +751,15 @@  static int hns_roce_v2_post_recv(struct ib_qp *ibqp,
 		 * now.
 		 */
 		if (hr_qp->state == IB_QPS_ERR) {
-			if (!test_and_set_bit(HNS_ROCE_FLUSH_FLAG,
-					      &hr_qp->flush_flag))
+			if (qp_lock && !test_and_set_bit(HNS_ROCE_FLUSH_FLAG,
+							 &hr_qp->flush_flag))
 				init_flush_work(hr_dev, hr_qp);
 		} else {
 			*hr_qp->rdb.db_record = hr_qp->rq.head & 0xffff;
 		}
 	}
-	spin_unlock_irqrestore(&hr_qp->rq.lock, flags);
+
+	v2_spin_unlock_irqrestore(qp_lock, &hr_qp->rq.lock, &flags);
 
 	return ret;
 }
@@ -2810,9 +2841,9 @@  static void __hns_roce_v2_cq_clean(struct hns_roce_cq *hr_cq, u32 qpn,
 static void hns_roce_v2_cq_clean(struct hns_roce_cq *hr_cq, u32 qpn,
 				 struct hns_roce_srq *srq)
 {
-	spin_lock_irq(&hr_cq->lock);
+	v2_spin_lock_irq(cq_lock, &hr_cq->lock);
 	__hns_roce_v2_cq_clean(hr_cq, qpn, srq);
-	spin_unlock_irq(&hr_cq->lock);
+	v2_spin_unlock_irq(cq_lock, &hr_cq->lock);
 }
 
 static void hns_roce_v2_write_cqc(struct hns_roce_dev *hr_dev,
@@ -3142,7 +3173,8 @@  static int hns_roce_v2_poll_one(struct hns_roce_cq *hr_cq,
 		dev_err(hr_dev->dev, "error cqe status is: 0x%x\n",
 			status & HNS_ROCE_V2_CQE_STATUS_MASK);
 
-		if (!test_and_set_bit(HNS_ROCE_FLUSH_FLAG, &hr_qp->flush_flag))
+		if (qp_lock &&
+		    !test_and_set_bit(HNS_ROCE_FLUSH_FLAG, &hr_qp->flush_flag))
 			init_flush_work(hr_dev, hr_qp);
 
 		return 0;
@@ -3294,10 +3326,10 @@  static int hns_roce_v2_poll_cq(struct ib_cq *ibcq, int num_entries,
 	struct hns_roce_dev *hr_dev = to_hr_dev(ibcq->device);
 	struct hns_roce_cq *hr_cq = to_hr_cq(ibcq);
 	struct hns_roce_qp *cur_qp = NULL;
-	unsigned long flags;
+	unsigned long flags = 0;
 	int npolled;
 
-	spin_lock_irqsave(&hr_cq->lock, flags);
+	v2_spin_lock_irqsave(cq_lock, &hr_cq->lock, &flags);
 
 	/*
 	 * When the device starts to reset, the state is RST_DOWN. At this time,
@@ -3323,7 +3355,7 @@  static int hns_roce_v2_poll_cq(struct ib_cq *ibcq, int num_entries,
 	}
 
 out:
-	spin_unlock_irqrestore(&hr_cq->lock, flags);
+	v2_spin_unlock_irqrestore(cq_lock, &hr_cq->lock, &flags);
 
 	return npolled;
 }
@@ -4753,29 +4785,42 @@  static int hns_roce_v2_modify_qp(struct ib_qp *ibqp,
 	if (ret)
 		goto out;
 
+	/* When locks are used for post verbs, flush cqe should be enabled */
+	if (qp_lock) {
+		hr_qp->flush_en = 1;
+		/* Ensure that the value of flush_en can be read correctly later */
+		rmb();
+	}
+
 	/* When QP state is err, SQ and RQ WQE should be flushed */
 	if (new_state == IB_QPS_ERR) {
-		spin_lock_irqsave(&hr_qp->sq.lock, sq_flag);
-		hr_qp->state = IB_QPS_ERR;
+		v2_spin_lock_irqsave(qp_lock, &hr_qp->sq.lock, &sq_flag);
 		roce_set_field(context->byte_160_sq_ci_pi,
 			       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_M,
 			       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_S,
 			       hr_qp->sq.head);
-		roce_set_field(qpc_mask->byte_160_sq_ci_pi,
-			       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_M,
-			       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_S, 0);
-		spin_unlock_irqrestore(&hr_qp->sq.lock, sq_flag);
+		if (hr_qp->flush_en == 1)
+			roce_set_field(qpc_mask->byte_160_sq_ci_pi,
+				       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_M,
+				       V2_QPC_BYTE_160_SQ_PRODUCER_IDX_S, 0);
+
+		hr_qp->state = IB_QPS_ERR;
+		v2_spin_unlock_irqrestore(qp_lock, &hr_qp->sq.lock, &sq_flag);
 
 		if (!ibqp->srq) {
-			spin_lock_irqsave(&hr_qp->rq.lock, rq_flag);
+			v2_spin_lock_irqsave(qp_lock, &hr_qp->rq.lock,
+					     &rq_flag);
 			roce_set_field(context->byte_84_rq_ci_pi,
-			       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_M,
-			       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_S,
-			       hr_qp->rq.head);
-			roce_set_field(qpc_mask->byte_84_rq_ci_pi,
-			       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_M,
-			       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_S, 0);
-			spin_unlock_irqrestore(&hr_qp->rq.lock, rq_flag);
+				       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_M,
+				       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_S,
+				       hr_qp->rq.head);
+			if (hr_qp->flush_en == 1)
+				roce_set_field(qpc_mask->byte_84_rq_ci_pi,
+					       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_M,
+					       V2_QPC_BYTE_84_RQ_PRODUCER_IDX_S,
+					       0);
+			v2_spin_unlock_irqrestore(qp_lock, &hr_qp->rq.lock,
+						  &rq_flag);
 		}
 	}
 
@@ -5017,7 +5062,8 @@  static int hns_roce_v2_destroy_qp_common(struct hns_roce_dev *hr_dev,
 	recv_cq = hr_qp->ibqp.recv_cq ? to_hr_cq(hr_qp->ibqp.recv_cq) : NULL;
 
 	spin_lock_irqsave(&hr_dev->qp_list_lock, flags);
-	hns_roce_lock_cqs(send_cq, recv_cq);
+	if (cq_lock)
+		hns_roce_lock_cqs(send_cq, recv_cq);
 
 	if (!udata) {
 		if (recv_cq)
@@ -5033,7 +5079,9 @@  static int hns_roce_v2_destroy_qp_common(struct hns_roce_dev *hr_dev,
 
 	hns_roce_qp_remove(hr_dev, hr_qp);
 
-	hns_roce_unlock_cqs(send_cq, recv_cq);
+	if (cq_lock)
+		hns_roce_unlock_cqs(send_cq, recv_cq);
+
 	spin_unlock_irqrestore(&hr_dev->qp_list_lock, flags);
 
 	return ret;
@@ -6617,3 +6665,7 @@  MODULE_AUTHOR("Wei Hu <xavier.huwei@huawei.com>");
 MODULE_AUTHOR("Lijun Ou <oulijun@huawei.com>");
 MODULE_AUTHOR("Shaobo Xu <xushaobo2@huawei.com>");
 MODULE_DESCRIPTION("Hisilicon Hip08 Family RoCE Driver");
+module_param(qp_lock, bool, 0444);
+MODULE_PARM_DESC(qp_lock, "default: true");
+module_param(cq_lock, bool, 0444);
+MODULE_PARM_DESC(cq_lock, "default: true");
diff --git a/drivers/infiniband/hw/hns/hns_roce_qp.c b/drivers/infiniband/hw/hns/hns_roce_qp.c
index 5a28d62..bca66c3 100644
--- a/drivers/infiniband/hw/hns/hns_roce_qp.c
+++ b/drivers/infiniband/hw/hns/hns_roce_qp.c
@@ -56,6 +56,7 @@  static void flush_work_handle(struct work_struct *work)
 
 	attr_mask = IB_QP_STATE;
 	attr.qp_state = IB_QPS_ERR;
+	hr_qp->flush_en = 1;
 
 	if (test_and_clear_bit(HNS_ROCE_FLUSH_FLAG, &hr_qp->flush_flag)) {
 		ret = hns_roce_modify_qp(&hr_qp->ibqp, &attr, attr_mask, NULL);