Message ID | 1615541933-35798-1-git-send-email-liweihang@huawei.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [for-rc] RDMA/hns: Fix bug during CMDQ initialization | expand |
On Fri, Mar 12, 2021 at 05:38:53PM +0800, Weihang Li wrote: > From: Lang Cheng <chenglang@huawei.com> > > When reloading driver, the head/tail pointer of CMDQ may be not at position > 0. Then during initialization of CMDQ, if head is reset first, the firmware > will start to handle CMDQ because the head is not equal to the tail. The > driver can reset tail first since the firmware will be triggerred only by > head. This bug is introduced by changing macros of head/tail register > without changing the order of initialization. > > Besides, the same name represents opposite meanings in new/old driver, it > is hard to maintain, so rename them to PI/CI. Please split this to two patches for the -rc flow > diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c > index c3934ab..c359f09 100644 > +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c > @@ -1194,8 +1194,10 @@ static void hns_roce_cmq_init_regs(struct hns_roce_dev *hr_dev, bool ring_type) > upper_32_bits(dma)); > roce_write(hr_dev, ROCEE_TX_CMQ_DEPTH_REG, > (u32)ring->desc_num >> HNS_ROCE_CMQ_DESC_NUM_S); > - roce_write(hr_dev, ROCEE_TX_CMQ_HEAD_REG, 0); > - roce_write(hr_dev, ROCEE_TX_CMQ_TAIL_REG, 0); > + > + /* Make sure to write CI first and then PI */ > + roce_write(hr_dev, ROCEE_TX_CMQ_CI_REG, 0); > + roce_write(hr_dev, ROCEE_TX_CMQ_PI_REG, 0); Only this hunk should be in -rc Thanks, Jason
On 2021/3/12 21:20, Jason Gunthorpe wrote: > On Fri, Mar 12, 2021 at 05:38:53PM +0800, Weihang Li wrote: >> From: Lang Cheng <chenglang@huawei.com> >> >> When reloading driver, the head/tail pointer of CMDQ may be not at position >> 0. Then during initialization of CMDQ, if head is reset first, the firmware >> will start to handle CMDQ because the head is not equal to the tail. The >> driver can reset tail first since the firmware will be triggerred only by >> head. This bug is introduced by changing macros of head/tail register >> without changing the order of initialization. >> >> Besides, the same name represents opposite meanings in new/old driver, it >> is hard to maintain, so rename them to PI/CI. > > Please split this to two patches for the -rc flow > >> diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c >> index c3934ab..c359f09 100644 >> +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c >> @@ -1194,8 +1194,10 @@ static void hns_roce_cmq_init_regs(struct hns_roce_dev *hr_dev, bool ring_type) >> upper_32_bits(dma)); >> roce_write(hr_dev, ROCEE_TX_CMQ_DEPTH_REG, >> (u32)ring->desc_num >> HNS_ROCE_CMQ_DESC_NUM_S); >> - roce_write(hr_dev, ROCEE_TX_CMQ_HEAD_REG, 0); >> - roce_write(hr_dev, ROCEE_TX_CMQ_TAIL_REG, 0); >> + >> + /* Make sure to write CI first and then PI */ >> + roce_write(hr_dev, ROCEE_TX_CMQ_CI_REG, 0); >> + roce_write(hr_dev, ROCEE_TX_CMQ_PI_REG, 0); > > Only this hunk should be in -rc > > Thanks, > Jason > I see, thank you. Weihang
diff --git a/drivers/infiniband/hw/hns/hns_roce_common.h b/drivers/infiniband/hw/hns/hns_roce_common.h index 23c438c..8a18aa1 100644 --- a/drivers/infiniband/hw/hns/hns_roce_common.h +++ b/drivers/infiniband/hw/hns/hns_roce_common.h @@ -364,8 +364,8 @@ #define ROCEE_TX_CMQ_BASEADDR_L_REG 0x07000 #define ROCEE_TX_CMQ_BASEADDR_H_REG 0x07004 #define ROCEE_TX_CMQ_DEPTH_REG 0x07008 -#define ROCEE_TX_CMQ_HEAD_REG 0x07010 -#define ROCEE_TX_CMQ_TAIL_REG 0x07014 +#define ROCEE_TX_CMQ_PI_REG 0x07010 +#define ROCEE_TX_CMQ_CI_REG 0x07014 #define ROCEE_RX_CMQ_BASEADDR_L_REG 0x07018 #define ROCEE_RX_CMQ_BASEADDR_H_REG 0x0701c diff --git a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c index c3934ab..c359f09 100644 --- a/drivers/infiniband/hw/hns/hns_roce_hw_v2.c +++ b/drivers/infiniband/hw/hns/hns_roce_hw_v2.c @@ -1194,8 +1194,10 @@ static void hns_roce_cmq_init_regs(struct hns_roce_dev *hr_dev, bool ring_type) upper_32_bits(dma)); roce_write(hr_dev, ROCEE_TX_CMQ_DEPTH_REG, (u32)ring->desc_num >> HNS_ROCE_CMQ_DESC_NUM_S); - roce_write(hr_dev, ROCEE_TX_CMQ_HEAD_REG, 0); - roce_write(hr_dev, ROCEE_TX_CMQ_TAIL_REG, 0); + + /* Make sure to write CI first and then PI */ + roce_write(hr_dev, ROCEE_TX_CMQ_CI_REG, 0); + roce_write(hr_dev, ROCEE_TX_CMQ_PI_REG, 0); } else { roce_write(hr_dev, ROCEE_RX_CMQ_BASEADDR_L_REG, (u32)dma); roce_write(hr_dev, ROCEE_RX_CMQ_BASEADDR_H_REG, @@ -1275,7 +1277,7 @@ static void hns_roce_cmq_setup_basic_desc(struct hns_roce_cmq_desc *desc, static int hns_roce_cmq_csq_done(struct hns_roce_dev *hr_dev) { - u32 tail = roce_read(hr_dev, ROCEE_TX_CMQ_TAIL_REG); + u32 tail = roce_read(hr_dev, ROCEE_TX_CMQ_CI_REG); struct hns_roce_v2_priv *priv = hr_dev->priv; return tail == priv->cmq.csq.head; @@ -1303,7 +1305,7 @@ static int __hns_roce_cmq_send(struct hns_roce_dev *hr_dev, } /* Write to hardware */ - roce_write(hr_dev, ROCEE_TX_CMQ_HEAD_REG, csq->head); + roce_write(hr_dev, ROCEE_TX_CMQ_PI_REG, csq->head); /* If the command is sync, wait for the firmware to write back, * if multi descriptors to be sent, use the first one to check @@ -1334,7 +1336,7 @@ static int __hns_roce_cmq_send(struct hns_roce_dev *hr_dev, } } else { /* FW/HW reset or incorrect number of desc */ - tail = roce_read(hr_dev, ROCEE_TX_CMQ_TAIL_REG); + tail = roce_read(hr_dev, ROCEE_TX_CMQ_CI_REG); dev_warn(hr_dev->dev, "CMDQ move tail from %d to %d\n", csq->head, tail); csq->head = tail;