Message ID | 20201030171106.4191-1-rpearson@hpe.com (mailing list archive) |
---|---|
State | Superseded |
Series | [for-next,v2] RDMA/rxe: fix regression caused by recent patch |
On Fri, Oct 30, 2020 at 12:11:07PM -0500, Bob Pearson wrote:
> The commit referenced below performs additional checking on
> devices used for DMA. Specifically it checks that
>
>    device->dma_mask != NULL
>
> Rdma_rxe uses this device when pinning MR memory but did not
> set the value of dma_mask. In fact rdma_rxe does not perform
> any DMA operations so the value is never used but is checked.
>
> This patch gives dma_mask a valid value extracted from the device
> backing the ndev used by rxe.
>
> Without this patch rdma_rxe does not function.
>
> N.B. This patch needs to be applied before the recent fix to add back
> IB_USER_VERBS_CMD_POST_SEND to uverbs_cmd_mask.
>
> Dennis Dallesandro reported that Parav's similar patch did not apply
> cleanly to rxe. This one does to for-next head of tree as of yesterday.
>
> Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
> Signed-off-by: Bob Pearson <rpearson@hpe.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 18 ++++++++++++++++--
>  1 file changed, 16 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 7652d53af2c1..c857e83323ed 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1128,19 +1128,32 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>  	int err;
>  	struct ib_device *dev = &rxe->ib_dev;
>  	struct crypto_shash *tfm;
> +	u64 dma_mask;
>
>  	strlcpy(dev->node_desc, "rxe", sizeof(dev->node_desc));
>
>  	dev->node_type = RDMA_NODE_IB_CA;
>  	dev->phys_port_cnt = 1;
>  	dev->num_comp_vectors = num_possible_cpus();
> -	dev->dev.parent = rxe_dma_device(rxe);
>  	dev->local_dma_lkey = 0;
>  	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
>  			    rxe->ndev->dev_addr);
>  	dev->dev.dma_parms = &rxe->dma_parms;
>  	dma_set_max_seg_size(&dev->dev, UINT_MAX);
> -	dma_set_coherent_mask(&dev->dev, dma_get_required_mask(&dev->dev));
> +
> +	/* rdma_rxe never does real DMA but does rely on
> +	 * pinning user memory in MRs to avoid page faults
> +	 * in responder and completer tasklets. This code
> +	 * supplies a valid dma_mask from the underlying
> +	 * network device. It is never used but is checked.
> +	 */
> +	dev->dev.parent = rxe_dma_device(rxe);

Oh! This is another bug, the parent of an ib_device should never be
set to a net_device!! This is probably why we get all those mysterious
syzkaller faults :| Just leave it NULL

> +	dma_mask = *(dev->dev.parent->dma_mask);
> +	err = dma_coerce_mask_and_coherent(&dev->dev, dma_mask);

Why not use Parav's logic?

Jason
On 10/30/20 12:36 PM, Jason Gunthorpe wrote:
> On Fri, Oct 30, 2020 at 12:11:07PM -0500, Bob Pearson wrote:
>> The commit referenced below performs additional checking on
>> devices used for DMA. Specifically it checks that
>>
>>    device->dma_mask != NULL
>>
>> Rdma_rxe uses this device when pinning MR memory but did not
>> set the value of dma_mask. In fact rdma_rxe does not perform
>> any DMA operations so the value is never used but is checked.
>>
>> This patch gives dma_mask a valid value extracted from the device
>> backing the ndev used by rxe.
>>
>> Without this patch rdma_rxe does not function.
>>
>> N.B. This patch needs to be applied before the recent fix to add back
>> IB_USER_VERBS_CMD_POST_SEND to uverbs_cmd_mask.
>>
>> Dennis Dallesandro reported that Parav's similar patch did not apply
>> cleanly to rxe. This one does to for-next head of tree as of yesterday.
>>
>> Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
>> Signed-off-by: Bob Pearson <rpearson@hpe.com>
>> ---
>>  drivers/infiniband/sw/rxe/rxe_verbs.c | 18 ++++++++++++++++--
>>  1 file changed, 16 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
>> index 7652d53af2c1..c857e83323ed 100644
>> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
>> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
>> @@ -1128,19 +1128,32 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>>  	int err;
>>  	struct ib_device *dev = &rxe->ib_dev;
>>  	struct crypto_shash *tfm;
>> +	u64 dma_mask;
>>
>>  	strlcpy(dev->node_desc, "rxe", sizeof(dev->node_desc));
>>
>>  	dev->node_type = RDMA_NODE_IB_CA;
>>  	dev->phys_port_cnt = 1;
>>  	dev->num_comp_vectors = num_possible_cpus();
>> -	dev->dev.parent = rxe_dma_device(rxe);
>>  	dev->local_dma_lkey = 0;
>>  	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
>>  			    rxe->ndev->dev_addr);
>>  	dev->dev.dma_parms = &rxe->dma_parms;
>>  	dma_set_max_seg_size(&dev->dev, UINT_MAX);
>> -	dma_set_coherent_mask(&dev->dev, dma_get_required_mask(&dev->dev));
>> +
>> +	/* rdma_rxe never does real DMA but does rely on
>> +	 * pinning user memory in MRs to avoid page faults
>> +	 * in responder and completer tasklets. This code
>> +	 * supplies a valid dma_mask from the underlying
>> +	 * network device. It is never used but is checked.
>> +	 */
>> +	dev->dev.parent = rxe_dma_device(rxe);
>
> Oh! This is another bug, the parent of an ib_device should never be
> set to a net_device!! This is probably why we get all those mysterious
> syzkaller faults :| Just leave it NULL
>
>> +	dma_mask = *(dev->dev.parent->dma_mask);
>> +	err = dma_coerce_mask_and_coherent(&dev->dev, dma_mask);
>
> Why not use Parav's logic?
>
> Jason
>

It's not the network device. It is the parent of the network device.
On 64 bit machines it gives 0xffffffffffffffff as dma_mask.

struct device *rxe_dma_device(struct rxe_dev *rxe)
{
	struct net_device *ndev;

	ndev = rxe->ndev;

	if (is_vlan_dev(ndev))
		ndev = vlan_dev_real_dev(ndev);

	return ndev->dev.parent;
}

His should work too. They will behave the same at the end of the day.
I don't really know what the rxe_dma_device() code was trying to do in
the first place so I didn't change it. But it was a handy place to get
a dma_mask
that should work on any architecture. If there is no reason to set
dev.parent I can get rid of rxe_dma_device.

Bob
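
For reference, the all-ones value mentioned in the reply above is simply a mask covering all 64 address bits. The following is a minimal user-space sketch, assuming a local copy of the kernel's DMA_BIT_MASK() macro; the helper name mask_for_bits() is hypothetical and does not appear in the patch or in the kernel.

```c
#include <assert.h>
#include <stdint.h>

/* Local copy of the kernel's DMA_BIT_MASK() so this builds in user space. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical helper: mask covering the low 'bits' address bits (1..64). */
uint64_t mask_for_bits(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return DMA_BIT_MASK(bits);
}
```

Calling mask_for_bits(64) gives the all-ones mask for a device that can address the full 64-bit space; a device limited to 32-bit addressing would use mask_for_bits(32).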
On Fri, Oct 30, 2020 at 12:45:54PM -0500, Bob Pearson wrote:
> >> +
> >> +	/* rdma_rxe never does real DMA but does rely on
> >> +	 * pinning user memory in MRs to avoid page faults
> >> +	 * in responder and completer tasklets. This code
> >> +	 * supplies a valid dma_mask from the underlying
> >> +	 * network device. It is never used but is checked.
> >> +	 */
> >> +	dev->dev.parent = rxe_dma_device(rxe);
> >
> > Oh! This is another bug, the parent of an ib_device should never be
> > set to a net_device!! This is probably why we get all those mysterious
> > syzkaller faults :| Just leave it NULL
> >
> >> +	dma_mask = *(dev->dev.parent->dma_mask);
> >> +	err = dma_coerce_mask_and_coherent(&dev->dev, dma_mask);
> >
> > Why not use Parav's logic?
> >
> > Jason
>
> It's not the network device. It is the parent of the network device.
> On 64 bit machines it gives 0xffffffffffffffff as dma_mask.

No, it is some weird thing because network devices don't always have
physical device parents. There is no relation between the netdevice
RXE is running on and the DMA mask to use for the dummy dma ops,
AFAICT

> that should work on any architecture. If there is no reason to set
> dev.parent I can get rid of rxe_dma_device.

Please, that arrangement is causing bugs.

Jason
diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 7652d53af2c1..c857e83323ed 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -1128,19 +1128,32 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
 	int err;
 	struct ib_device *dev = &rxe->ib_dev;
 	struct crypto_shash *tfm;
+	u64 dma_mask;

 	strlcpy(dev->node_desc, "rxe", sizeof(dev->node_desc));

 	dev->node_type = RDMA_NODE_IB_CA;
 	dev->phys_port_cnt = 1;
 	dev->num_comp_vectors = num_possible_cpus();
-	dev->dev.parent = rxe_dma_device(rxe);
 	dev->local_dma_lkey = 0;
 	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
 			    rxe->ndev->dev_addr);
 	dev->dev.dma_parms = &rxe->dma_parms;
 	dma_set_max_seg_size(&dev->dev, UINT_MAX);
-	dma_set_coherent_mask(&dev->dev, dma_get_required_mask(&dev->dev));
+
+	/* rdma_rxe never does real DMA but does rely on
+	 * pinning user memory in MRs to avoid page faults
+	 * in responder and completer tasklets. This code
+	 * supplies a valid dma_mask from the underlying
+	 * network device. It is never used but is checked.
+	 */
+	dev->dev.parent = rxe_dma_device(rxe);
+	dma_mask = *(dev->dev.parent->dma_mask);
+	err = dma_coerce_mask_and_coherent(&dev->dev, dma_mask);
+	if (err) {
+		pr_warn("dma_mask not supported\n");
+		return err;
+	}

 	dev->uverbs_cmd_mask |= BIT_ULL(IB_USER_VERBS_CMD_REQ_NOTIFY_CQ);
The commit referenced below performs additional checking on
devices used for DMA. Specifically it checks that

   device->dma_mask != NULL

Rdma_rxe uses this device when pinning MR memory but did not
set the value of dma_mask. In fact rdma_rxe does not perform
any DMA operations so the value is never used but is checked.

This patch gives dma_mask a valid value extracted from the device
backing the ndev used by rxe.

Without this patch rdma_rxe does not function.

N.B. This patch needs to be applied before the recent fix to add back
IB_USER_VERBS_CMD_POST_SEND to uverbs_cmd_mask.

Dennis Dallesandro reported that Parav's similar patch did not apply
cleanly to rxe. This one does to for-next head of tree as of yesterday.

Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
Signed-off-by: Bob Pearson <rpearson@hpe.com>
---
 drivers/infiniband/sw/rxe/rxe_verbs.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)
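
The failure mode described in the commit message can be illustrated outside the kernel. The following is a user-space toy model only, not kernel code: struct toy_device and both toy_* functions are hypothetical names, and the -EIO return value is an arbitrary choice for the sketch. It assumes that dma_coerce_mask_and_coherent() points dma_mask at the device's own coherent_dma_mask before setting both masks, which is what makes the NULL check pass for a device that never had a dma_mask assigned.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Toy stand-in for the two struct device fields involved; not the kernel type. */
struct toy_device {
	uint64_t *dma_mask;          /* NULL until someone assigns it */
	uint64_t coherent_dma_mask;
};

/* Models the added check: mapping is refused when dma_mask is NULL. */
int toy_dma_map_check(const struct toy_device *dev)
{
	return dev->dma_mask ? 0 : -EIO; /* error code is arbitrary here */
}

/* Models dma_coerce_mask_and_coherent(): give the device a dma_mask
 * by pointing it at coherent_dma_mask, then store the mask in both. */
int toy_coerce_mask_and_coherent(struct toy_device *dev, uint64_t mask)
{
	assert(dev != NULL);
	dev->dma_mask = &dev->coherent_dma_mask;
	*dev->dma_mask = mask;
	dev->coherent_dma_mask = mask;
	return 0;
}
```

In this model a zero-initialized toy_device fails toy_dma_map_check() until toy_coerce_mask_and_coherent() has been called on it, mirroring why rdma_rxe stopped working once the check was introduced even though the mask value itself is never used.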