Message ID | 20220210073655.42281-1-guoqing.jiang@linux.dev (mailing list archive) |
---|---|
Headers | show |
Series | patches and bug report for rxe | expand |
On 2/10/22 3:36 PM, Guoqing Jiang wrote: > However, seems rnbd/rtrs over rxe still can't work with 5.17-rc3 kernel, > dmesg reports below. > > 1. server side > > [ 440.723182] rdma_rxe: qp#17 moved to error state > [ 440.725300] rtrs_server L1205: <bla>: remote access error (wr_cqe: 000000003b14397c, type: 0, vendor_err: 0x0, len: 0) > [ 440.845926] rnbd_server L256: RTRS Session bla disconnected > > 2. client side > > [ 997.817536] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0) > [ 998.968810] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues. > [ 999.017988] rtrs_client L610: <bla>: RDMA failed: remote access error > [ 1029.836943] rtrs_client L353: <bla>: Failed IB_WR_LOCAL_INV: WR flushe > > Then I tried 5.16 and 5.15 version, seems 5.15 does work as follows. > > 1. server side > > [ 333.076482] rnbd_server L800: </dev/loop1@bla>: Opened device 'loop1' > > 2. client side > > [ 1584.325825] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0) > [ 1585.268291] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues. > [ 1585.349300] rnbd_client L1607: </dev/loop1@bla> map_device: Device mapped as rnbd0 (nsectors: 0, logical_block_size: 512, physical_block_size: 512, max_write_same_sectors: 0, max_discard_sectors: 0, discard_granularity: 0, discard_alignment: 0, secure_discard: 0, max_segments: 128, max_hw_sectors: 248, rotational: 1, wc: 0, fua: 0) > > I would appreciate if someone shed light on why it doesn't work after 5.15, > And I am happy to test potential patch for it. After investigation, seems the culprit is commit 647bf13ce944 ("RDMA/rxe: Create duplicate mapping tables for FMRs"). The problem is mr_check_range returns -EFAULT after find iova and length are not valid, so connection between two VMs can't be established. Revert the commit manually or apply below temporary change, rxe works again with rnbd/rtrs though I don't think it is the right thing to do. Could experts provide a proper solution? Thanks. diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c b/drivers/infiniband/sw/rxe/rxe_mr.c index 453ef3c9d535..4a2fc4d5809d 100644 --- a/drivers/infiniband/sw/rxe/rxe_mr.c +++ b/drivers/infiniband/sw/rxe/rxe_mr.c @@ -652,7 +652,7 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe) mr->state = RXE_MR_STATE_VALID; set = mr->cur_map_set; - mr->cur_map_set = mr->next_map_set; + //mr->cur_map_set = mr->next_map_set; mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova; mr->next_map_set = set; @@ -662,7 +662,7 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct rxe_send_wqe *wqe) int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr) { struct rxe_mr *mr = to_rmr(ibmr); - struct rxe_map_set *set = mr->next_map_set; + struct rxe_map_set *set = mr->cur_map_set; struct rxe_map *map; struct rxe_phys_buf *buf; diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c index 80df9a8f71a1..e41d2c8612d8 100644 --- a/drivers/infiniband/sw/rxe/rxe_verbs.c +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c @@ -992,7 +992,7 @@ static int rxe_map_mr_sg(struct ib_mr *ibmr, struct scatterlist *sg, int sg_nents, unsigned int *sg_offset) { struct rxe_mr *mr = to_rmr(ibmr); - struct rxe_map_set *set = mr->next_map_set; + struct rxe_map_set *set = mr->cur_map_set; And the test is pretty simple. 1. VM (server) modprobe rdma_rxe rdma link add rxe0 type rxe netdev ens3 modprobe rnbd-server 2. VM (client) modprobe rdma_rxe rdma link add rxe0 type rxe netdev ens3 modprobe rnbd-client echo "sessname=bla path=ip:$serverip device_path=$block_device_in_server" > /sys/devices/virtual/rnbd-client/ctl/map_device BTW, I tried wip/jgg-for-next branch with commit 3810c1a1cbe8f. Guoqing
On Tue, Feb 22, 2022 at 5:50 PM Guoqing Jiang <guoqing.jiang@linux.dev> wrote: > > > > On 2/10/22 3:36 PM, Guoqing Jiang wrote: > > However, seems rnbd/rtrs over rxe still can't work with 5.17-rc3 kernel, > > dmesg reports below. > > > > 1. server side > > > > [ 440.723182] rdma_rxe: qp#17 moved to error state > > [ 440.725300] rtrs_server L1205: <bla>: remote access error (wr_cqe: 000000003b14397c, type: 0, vendor_err: 0x0, len: 0) > > [ 440.845926] rnbd_server L256: RTRS Session bla disconnected > > > > 2. client side > > > > [ 997.817536] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0) > > [ 998.968810] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues. > > [ 999.017988] rtrs_client L610: <bla>: RDMA failed: remote access error > > [ 1029.836943] rtrs_client L353: <bla>: Failed IB_WR_LOCAL_INV: WR flushe > > > > Then I tried 5.16 and 5.15 version, seems 5.15 does work as follows. > > > > 1. server side > > > > [ 333.076482] rnbd_server L800: </dev/loop1@bla>: Opened device 'loop1' > > > > 2. client side > > > > [ 1584.325825] rnbd_client L596: Mapping device /dev/loop1 on session bla, (access_mode: rw, nr_poll_queues: 0) > > [ 1585.268291] rnbd_client L1213: [session=bla] mapped 8/8 default/read queues. > > [ 1585.349300] rnbd_client L1607: </dev/loop1@bla> map_device: Device mapped as rnbd0 (nsectors: 0, logical_block_size: 512, physical_block_size: 512, max_write_same_sectors: 0, max_discard_sectors: 0, discard_granularity: 0, discard_alignment: 0, secure_discard: 0, max_segments: 128, max_hw_sectors: 248, rotational: 1, wc: 0, fua: 0) > > > > I would appreciate if someone shed light on why it doesn't work after 5.15, > > And I am happy to test potential patch for it. > > After investigation, seems the culprit is commit 647bf13ce944 ("RDMA/rxe: > Create duplicate mapping tables for FMRs"). The problem is mr_check_range > returns -EFAULT after find iova and length are not valid, so connection > between > two VMs can't be established. > > Revert the commit manually or apply below temporary change, rxe works again > with rnbd/rtrs though I don't think it is the right thing to do. Could > experts provide > a proper solution? Thanks. > > diff --git a/drivers/infiniband/sw/rxe/rxe_mr.c > b/drivers/infiniband/sw/rxe/rxe_mr.c > index 453ef3c9d535..4a2fc4d5809d 100644 > --- a/drivers/infiniband/sw/rxe/rxe_mr.c > +++ b/drivers/infiniband/sw/rxe/rxe_mr.c > @@ -652,7 +652,7 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct > rxe_send_wqe *wqe) > mr->state = RXE_MR_STATE_VALID; > > set = mr->cur_map_set; > - mr->cur_map_set = mr->next_map_set; > + //mr->cur_map_set = mr->next_map_set; > mr->cur_map_set->iova = wqe->wr.wr.reg.mr->iova; > mr->next_map_set = set; > > @@ -662,7 +662,7 @@ int rxe_reg_fast_mr(struct rxe_qp *qp, struct > rxe_send_wqe *wqe) > int rxe_mr_set_page(struct ib_mr *ibmr, u64 addr) > { > struct rxe_mr *mr = to_rmr(ibmr); > - struct rxe_map_set *set = mr->next_map_set; > + struct rxe_map_set *set = mr->cur_map_set; > struct rxe_map *map; > struct rxe_phys_buf *buf; > > diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c > b/drivers/infiniband/sw/rxe/rxe_verbs.c > index 80df9a8f71a1..e41d2c8612d8 100644 > --- a/drivers/infiniband/sw/rxe/rxe_verbs.c > +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c > @@ -992,7 +992,7 @@ static int rxe_map_mr_sg(struct ib_mr *ibmr, struct > scatterlist *sg, > int sg_nents, unsigned int *sg_offset) > { > struct rxe_mr *mr = to_rmr(ibmr); > - struct rxe_map_set *set = mr->next_map_set; > + struct rxe_map_set *set = mr->cur_map_set; Thanks a lot. Please file a patch for the above changes. Zhu Yanjun > > And the test is pretty simple. > > 1. VM (server) > > modprobe rdma_rxe > rdma link add rxe0 type rxe netdev ens3 > modprobe rnbd-server > > 2. VM (client) > > modprobe rdma_rxe > rdma link add rxe0 type rxe netdev ens3 > modprobe rnbd-client > echo "sessname=bla path=ip:$serverip > device_path=$block_device_in_server" > > /sys/devices/virtual/rnbd-client/ctl/map_device > > BTW, I tried wip/jgg-for-next branch with commit 3810c1a1cbe8f. > > Guoqing
> > After investigation, seems the culprit is commit 647bf13ce944 ("RDMA/rxe: > Create duplicate mapping tables for FMRs"). The problem is > mr_check_range returns -EFAULT after find iova and length are not > valid, so connection between two VMs can't be established. > > Revert the commit manually or apply below temporary change, rxe works > again with rnbd/rtrs though I don't think it is the right thing to do. > Could experts provide a proper solution? Thanks. > This patch fixed failures in blktests and srp which were discussed at length. See e.g. https://lore.kernel.org/linux-rdma/20210907163939.GW1200268@ziepe.ca/ and related messages. The conclusion was that two mappings were required. One owned by the driver and one by the 'hardware', i.e. bottom half in the rxe case, allowing updating a new mapping while the old one is still active and then switching them. If this case has iova and length not valid as indicated is there a problem with the test case? Bob
On 2/23/22 12:58 AM, Pearson, Robert B wrote: >> After investigation, seems the culprit is commit 647bf13ce944 ("RDMA/rxe: >> Create duplicate mapping tables for FMRs"). The problem is >> mr_check_range returns -EFAULT after find iova and length are not >> valid, so connection between two VMs can't be established. >> >> Revert the commit manually or apply below temporary change, rxe works >> again with rnbd/rtrs though I don't think it is the right thing to do. >> Could experts provide a proper solution? Thanks. >> > This patch fixed failures in blktests and srp which were discussed at length. See e.g. > > https://lore.kernel.org/linux-rdma/20210907163939.GW1200268@ziepe.ca/ Thanks for the link, which reminds me the always_invalidate parameter in rtrs_server. > and related messages. The conclusion was that two mappings were required. One owned by the > driver and one by the 'hardware', i.e. bottom half in the rxe case, allowing updating a new mapping > while the old one is still active and then switching them. > > If this case has iova and length not valid as indicated is there a problem with the test case? And after disable always_invalidate (which is enabled by default), rnbd/rtrs over roce works either. So I suppose there might be potential issue for always_invalidate=Y in rtrs server side since invalidate works for srp IIUC, @Jinpu. Thanks, Guoqing
On 2/22/22 22:43, Guoqing Jiang wrote: > > > On 2/23/22 12:58 AM, Pearson, Robert B wrote: >>> After investigation, seems the culprit is commit 647bf13ce944 ("RDMA/rxe: >>> Create duplicate mapping tables for FMRs"). The problem is >>> mr_check_range returns -EFAULT after find iova and length are not >>> valid, so connection between two VMs can't be established. >>> >>> Revert the commit manually or apply below temporary change, rxe works >>> again with rnbd/rtrs though I don't think it is the right thing to do. >>> Could experts provide a proper solution? Thanks. >>> >> This patch fixed failures in blktests and srp which were discussed at length. See e.g. >> >> https://lore.kernel.org/linux-rdma/20210907163939.GW1200268@ziepe.ca/ > > Thanks for the link, which reminds me the always_invalidate parameter in rtrs_server. > >> and related messages. The conclusion was that two mappings were required. One owned by the >> driver and one by the 'hardware', i.e. bottom half in the rxe case, allowing updating a new mapping >> while the old one is still active and then switching them. >> >> If this case has iova and length not valid as indicated is there a problem with the test case? > > And after disable always_invalidate (which is enabled by default), rnbd/rtrs over roce > works either. So I suppose there might be potential issue for always_invalidate=Y in > rtrs server side since invalidate works for srp IIUC, @Jinpu. > > Thanks, > Guoqing It would help to understand what you are running. My main concern is that mr_check_range shouldn't fail unless there is something very wrong.
On 2/23/22 1:01 PM, Bob Pearson wrote: > On 2/22/22 22:43, Guoqing Jiang wrote: >> >> On 2/23/22 12:58 AM, Pearson, Robert B wrote: >>>> After investigation, seems the culprit is commit 647bf13ce944 ("RDMA/rxe: >>>> Create duplicate mapping tables for FMRs"). The problem is >>>> mr_check_range returns -EFAULT after find iova and length are not >>>> valid, so connection between two VMs can't be established. >>>> >>>> Revert the commit manually or apply below temporary change, rxe works >>>> again with rnbd/rtrs though I don't think it is the right thing to do. >>>> Could experts provide a proper solution? Thanks. >>>> >>> This patch fixed failures in blktests and srp which were discussed at length. See e.g. >>> >>> https://lore.kernel.org/linux-rdma/20210907163939.GW1200268@ziepe.ca/ >> Thanks for the link, which reminds me the always_invalidate parameter in rtrs_server. >> >>> and related messages. The conclusion was that two mappings were required. One owned by the >>> driver and one by the 'hardware', i.e. bottom half in the rxe case, allowing updating a new mapping >>> while the old one is still active and then switching them. >>> >>> If this case has iova and length not valid as indicated is there a problem with the test case? >> And after disable always_invalidate (which is enabled by default), rnbd/rtrs over roce >> works either. So I suppose there might be potential issue for always_invalidate=Y in >> rtrs server side since invalidate works for srp IIUC, @Jinpu. >> >> Thanks, >> Guoqing > It would help to understand what you are running. My main concern is that mr_check_range > shouldn't fail unless there is something very wrong. Let me paste it again, I just try to map server block device to client side, specifically the bold line. 1. VM (server) modprobe rdma_rxe rdma link add rxe0 type rxe netdev ens3 modprobe rnbd-server 2. VM (client) modprobe rdma_rxe rdma link add rxe0 type rxe netdev ens3 modprobe rnbd-client *echo "sessname=bla path=ip:$serverip device_path=$block_device_in_server" > /sys/devices/virtual/rnbd-client/ctl/map_device* Thanks, Guoqing