[for-next] RDMA/rxe: fix regression caused by recent patch

Message ID 20201029212545.6616-1-rpearson@hpe.com (mailing list archive)
State Superseded

Commit Message

Bob Pearson Oct. 29, 2020, 9:25 p.m. UTC
The commit referenced below performs additional checking on
devices used for DMA. Specifically it checks that

device->dma_mask != NULL

Rdma_rxe uses this device when pinning MR memory but did not
set the value of dma_mask. In fact rdma_rxe does not perform
any DMA operations so the value is never used but is checked.

This patch gives dma_mask a valid value. Without this patch
rdma_rxe does not function at all.

Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
Signed-off-by: Bob Pearson <rpearson@hpe.com>
---
 drivers/infiniband/sw/rxe/rxe_verbs.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

kernel test robot Oct. 30, 2020, 2:39 a.m. UTC | #1
Hi Bob,

Thank you for the patch! Perhaps something to improve:

[auto build test WARNING on rdma/for-next]
[also build test WARNING on v5.10-rc1 next-20201029]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch]

url:    https://github.com/0day-ci/linux/commits/Bob-Pearson/RDMA-rxe-fix-regression-caused-by-recent-patch/20201030-052848
base:   https://git.kernel.org/pub/scm/linux/kernel/git/rdma/rdma.git for-next
config: powerpc-allyesconfig (attached as .config)
compiler: powerpc64-linux-gcc (GCC) 9.3.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/0day-ci/linux/commit/880fe509bd2bdc73c885fd887cb3935000855d49
        git remote add linux-review https://github.com/0day-ci/linux
        git fetch --no-tags linux-review Bob-Pearson/RDMA-rxe-fix-regression-caused-by-recent-patch/20201030-052848
        git checkout 880fe509bd2bdc73c885fd887cb3935000855d49
        # save the attached .config to linux build tree
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-9.3.0 make.cross ARCH=powerpc 

If you fix the issue, kindly add following tag as appropriate
Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   drivers/infiniband/sw/rxe/rxe_verbs.c: In function 'rxe_register_device':
>> drivers/infiniband/sw/rxe/rxe_verbs.c:1143:20: warning: assignment to 'u64 *' {aka 'long long unsigned int *'} from 'long long unsigned int' makes pointer from integer without a cast [-Wint-conversion]
    1143 |  dev->dev.dma_mask = DMA_BIT_MASK(64);
         |                    ^

vim +1143 drivers/infiniband/sw/rxe/rxe_verbs.c

  1125	
  1126	int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
  1127	{
  1128		int err;
  1129		struct ib_device *dev = &rxe->ib_dev;
  1130		struct crypto_shash *tfm;
  1131	
  1132		strlcpy(dev->node_desc, "rxe", sizeof(dev->node_desc));
  1133	
  1134		dev->node_type = RDMA_NODE_IB_CA;
  1135		dev->phys_port_cnt = 1;
  1136		dev->num_comp_vectors = num_possible_cpus();
  1137	
  1138		/* rdma_rxe never does real DMA but does rely on
  1139		 * pinning user memory in MRs to avoid page faults
  1140		 * in responder and completer tasklets
  1141		 */
  1142		dev->dev.parent = rxe_dma_device(rxe);
> 1143		dev->dev.dma_mask = DMA_BIT_MASK(64);

---
0-DAY CI Kernel Test Service, Intel Corporation
https://lists.01.org/hyperkitty/list/kbuild-all@lists.01.org

Zhu Yanjun Oct. 30, 2020, 2:54 a.m. UTC | #2
On Fri, Oct 30, 2020 at 5:27 AM Bob Pearson <rpearsonhpe@gmail.com> wrote:
>
> The commit referenced below performs additional checking on
> devices used for DMA. Specifically it checks that
>
> device->dma_mask != NULL
>
> Rdma_rxe uses this device when pinning MR memory but did not
> set the value of dma_mask. In fact rdma_rxe does not perform
> any DMA operations so the value is never used but is checked.
>
> This patch gives dma_mask a valid value. Without this patch
> rdma_rxe does not function at all.
>
> Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
> Signed-off-by: Bob Pearson <rpearson@hpe.com>

Thanks a lot.

Zhu Yanjun

> ---
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 7652d53af2c1..116a234e92db 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1134,8 +1134,15 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>         dev->node_type = RDMA_NODE_IB_CA;
>         dev->phys_port_cnt = 1;
>         dev->num_comp_vectors = num_possible_cpus();
> +
> +       /* rdma_rxe never does real DMA but does rely on
> +        * pinning user memory in MRs to avoid page faults
> +        * in responder and completer tasklets
> +        */
>         dev->dev.parent = rxe_dma_device(rxe);
> +       dev->dev.dma_mask = DMA_BIT_MASK(64);
>         dev->local_dma_lkey = 0;
> +
>         addrconf_addr_eui48((unsigned char *)&dev->node_guid,
>                             rxe->ndev->dev_addr);
>         dev->dev.dma_parms = &rxe->dma_parms;
> --
> 2.27.0
>

Bob Pearson Oct. 30, 2020, 5:46 a.m. UTC | #3
On 10/29/20 4:25 PM, Bob Pearson wrote:
> The commit referenced below performs additional checking on
> devices used for DMA. Specifically it checks that
> 
> device->dma_mask != NULL
> 
> Rdma_rxe uses this device when pinning MR memory but did not
> set the value of dma_mask. In fact rdma_rxe does not perform
> any DMA operations so the value is never used but is checked.
> 
> This patch gives dma_mask a valid value. Without this patch
> rdma_rxe does not function at all.
> 
> Fixes: f959dcd6ddfd2 ("dma-direct: Fix potential NULL pointer dereference")
> Signed-off-by: Bob Pearson <rpearson@hpe.com>
> ---
>  drivers/infiniband/sw/rxe/rxe_verbs.c | 7 +++++++
>  1 file changed, 7 insertions(+)
> 
> diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
> index 7652d53af2c1..116a234e92db 100644
> --- a/drivers/infiniband/sw/rxe/rxe_verbs.c
> +++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
> @@ -1134,8 +1134,15 @@ int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
>  	dev->node_type = RDMA_NODE_IB_CA;
>  	dev->phys_port_cnt = 1;
>  	dev->num_comp_vectors = num_possible_cpus();
> +
> +	/* rdma_rxe never does real DMA but does rely on
> +	 * pinning user memory in MRs to avoid page faults
> +	 * in responder and completer tasklets
> +	 */
>  	dev->dev.parent = rxe_dma_device(rxe);
> +	dev->dev.dma_mask = DMA_BIT_MASK(64);
>  	dev->local_dma_lkey = 0;
> +
>  	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
>  			    rxe->ndev->dev_addr);
>  	dev->dev.dma_parms = &rxe->dma_parms;
>

Please ignore this patch. It appears to work because any nonzero value
in dma_mask satisfies the check that was failing, and since rxe never
uses DMA the value does not affect anything. However, it does not
compile cleanly: dev->dev.dma_mask is a pointer to the actual mask, not
the mask itself, and somehow I missed the warning. I have a newer
version that uses dma_coerce_mask_and_coherent() and also works. (Here
"works" means it gets as far as the next problem mentioned in the
previous note.)

Bob
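
For readers following the thread, here is a minimal userspace sketch of
why the pointer matters and what the coerce helper does about it. It is
not the follow-up patch, which is not shown in this thread. The names
mock_device, mock_dma_mask_is_set(), mock_dma_coerce_mask_and_coherent()
and demo_fix() are hypothetical stand-ins. The real
dma_coerce_mask_and_coherent() also validates the mask against the
platform, which this model skips.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t u64;

/* Same expansion as the kernel's DMA_BIT_MASK() macro. */
#define DMA_BIT_MASK(n) (((n) == 64) ? ~0ULL : ((1ULL << (n)) - 1))

/* Hypothetical, stripped-down stand-in for struct device. */
struct mock_device {
	u64 *dma_mask;         /* pointer to the streaming DMA mask */
	u64 coherent_dma_mask; /* mask value stored in the device itself */
};

/* Models the "device->dma_mask != NULL" check from the commit message. */
static inline int mock_dma_mask_is_set(const struct mock_device *dev)
{
	return dev->dma_mask != NULL;
}

/*
 * Simplified model of dma_coerce_mask_and_coherent(): point dma_mask at
 * the device's own coherent mask storage, then store the mask value.
 * Assumption: the mask is always accepted, so this always returns 0.
 */
static inline int mock_dma_coerce_mask_and_coherent(struct mock_device *dev,
						    u64 mask)
{
	dev->dma_mask = &dev->coherent_dma_mask;
	dev->coherent_dma_mask = mask;
	return 0;
}

/* Walks through the failing state and the state after the coerce call. */
int demo_fix(void)
{
	struct mock_device dev = { NULL, 0 };

	/* Before: dma_mask is NULL, so the check fails. */
	assert(!mock_dma_mask_is_set(&dev));

	/*
	 * The v1 patch wrote "dev.dma_mask = DMA_BIT_MASK(64);", which
	 * stores an integer into a pointer and triggers -Wint-conversion.
	 */
	if (mock_dma_coerce_mask_and_coherent(&dev, DMA_BIT_MASK(64)))
		return -1;

	/* After: the pointer is valid and refers to a real 64-bit mask. */
	assert(mock_dma_mask_is_set(&dev));
	assert(*dev.dma_mask == DMA_BIT_MASK(64));
	return 0;
}
```

Calling demo_fix() from a test harness should return 0. The point of the
sketch is that the mask gets backing storage inside the device, so the
pointer is both non-NULL and safe to dereference.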
Patch

diff --git a/drivers/infiniband/sw/rxe/rxe_verbs.c b/drivers/infiniband/sw/rxe/rxe_verbs.c
index 7652d53af2c1..116a234e92db 100644
--- a/drivers/infiniband/sw/rxe/rxe_verbs.c
+++ b/drivers/infiniband/sw/rxe/rxe_verbs.c
@@ -1134,8 +1134,15 @@  int rxe_register_device(struct rxe_dev *rxe, const char *ibdev_name)
 	dev->node_type = RDMA_NODE_IB_CA;
 	dev->phys_port_cnt = 1;
 	dev->num_comp_vectors = num_possible_cpus();
+
+	/* rdma_rxe never does real DMA but does rely on
+	 * pinning user memory in MRs to avoid page faults
+	 * in responder and completer tasklets
+	 */
 	dev->dev.parent = rxe_dma_device(rxe);
+	dev->dev.dma_mask = DMA_BIT_MASK(64);
 	dev->local_dma_lkey = 0;
+
 	addrconf_addr_eui48((unsigned char *)&dev->node_guid,
 			    rxe->ndev->dev_addr);
 	dev->dev.dma_parms = &rxe->dma_parms;