
[v2,3/6] pvrdma: check number of pages when creating rings

Message ID 20181212193039.11445-4-ppandit@redhat.com (mailing list archive)
State New, archived
Series: rdma: various issues in rdma/pvrdma backend

Commit Message

Prasad Pandit Dec. 12, 2018, 7:30 p.m. UTC
From: Prasad J Pandit <pjp@fedoraproject.org>

When creating CQ/QP rings, an object can have up to
PVRDMA_MAX_FAST_REG_PAGES=128 pages. Check 'npages' parameter
to avoid excessive memory allocation or a null dereference.

Reported-by: Li Qiang <liq3ea@163.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
---
 hw/rdma/vmw/pvrdma_cmd.c | 11 +++++++++++
 1 file changed, 11 insertions(+)

Update: No change, ack'd v1
  -> https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02786.html

Comments

Yuval Shaia Dec. 16, 2018, 8:30 p.m. UTC | #1
Hi Prasad,
It turns out that this patch causes a regression.

My test plan includes the following steps:
- Start two VMs.
- Run RC and UD traffic between the two.
- Run sanity local test on both which includes:
	- RC traffic on 3 GIDs with various message sizes.
	- UD traffic.
	- RDMA-CM connection with MAD.
	- MPI test.
- Power off the two VMs.

With this patch the last step fails: the guest OS hangs, probably while
trying to unload the pvrdma driver, and finally gives up after 3 minutes.

On its face this patch does not seem related to the problem above, but a
fact is a fact: without this patch the VM goes down with no issues. The
only thing I can think of is that somehow the guest driver does not
capture the error, or does not handle it correctly.

Anyway, with debug turned on I noticed that there is one case where the
device gets 129 nchunks (I think in the MPI test) while your patch limits
it to 128.
From the pvrdma source code we can see that the first page is dedicated
to the ring state, so it may be correct that 128 is the limit for data
pages, but we should then check that nchunks does not exceed 129, not 128.

What do you think?

I.e., to replace this line in create_cq_ring
+    if (!nchunks || nchunks > PVRDMA_MAX_FAST_REG_PAGES) {
with this
+    if (!nchunks || nchunks > PVRDMA_MAX_FAST_REG_PAGES + 1) {
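
To make the intent concrete, here is a minimal sketch of the adjusted
check (the pr_dbg/return lines are just copied from your patch context;
the comment states my reading of the layout, so please verify):

    /* Page 0 of the page directory holds the ring state; pages
     * 1..nchunks-1 hold ring data.  A guest that uses the full
     * PVRDMA_MAX_FAST_REG_PAGES (128) data pages therefore sends
     * nchunks == 129.
     */
    if (!nchunks || nchunks > PVRDMA_MAX_FAST_REG_PAGES + 1) {
        pr_dbg("invalid nchunks: %d\n", nchunks);
        return rc;
    }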

Let me know your opinion.
I can make a quick fix to your patch or send a new patch on top of yours
for review.

Yuval

On Thu, Dec 13, 2018 at 01:00:36AM +0530, P J P wrote:
> From: Prasad J Pandit <pjp@fedoraproject.org>
> 
> When creating CQ/QP rings, an object can have up to
> PVRDMA_MAX_FAST_REG_PAGES=128 pages. Check 'npages' parameter
> to avoid excessive memory allocation or a null dereference.
> 
> Reported-by: Li Qiang <liq3ea@163.com>
> Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
> ---
>  hw/rdma/vmw/pvrdma_cmd.c | 11 +++++++++++
>  1 file changed, 11 insertions(+)
> 
> Update: No change, ack'd v1
>   -> https://lists.gnu.org/archive/html/qemu-devel/2018-12/msg02786.html
> 
> diff --git a/hw/rdma/vmw/pvrdma_cmd.c b/hw/rdma/vmw/pvrdma_cmd.c
> index 4f616d4177..e37fb18280 100644
> --- a/hw/rdma/vmw/pvrdma_cmd.c
> +++ b/hw/rdma/vmw/pvrdma_cmd.c
> @@ -259,6 +259,11 @@ static int create_cq_ring(PCIDevice *pci_dev , PvrdmaRing **ring,
>      int rc = -EINVAL;
>      char ring_name[MAX_RING_NAME_SZ];
>  
> +    if (!nchunks || nchunks > PVRDMA_MAX_FAST_REG_PAGES) {
> +        pr_dbg("invalid nchunks: %d\n", nchunks);
> +        return rc;
> +    }
> +
>      pr_dbg("pdir_dma=0x%llx\n", (long long unsigned int)pdir_dma);
>      dir = rdma_pci_dma_map(pci_dev, pdir_dma, TARGET_PAGE_SIZE);
>      if (!dir) {
> @@ -371,6 +376,12 @@ static int create_qp_rings(PCIDevice *pci_dev, uint64_t pdir_dma,
>      char ring_name[MAX_RING_NAME_SZ];
>      uint32_t wqe_sz;
>  
> +    if (!spages || spages > PVRDMA_MAX_FAST_REG_PAGES
> +        || !rpages || rpages > PVRDMA_MAX_FAST_REG_PAGES) {
> +        pr_dbg("invalid pages: %d, %d\n", spages, rpages);
> +        return rc;
> +    }
> +
>      pr_dbg("pdir_dma=0x%llx\n", (long long unsigned int)pdir_dma);
>      dir = rdma_pci_dma_map(pci_dev, pdir_dma, TARGET_PAGE_SIZE);
>      if (!dir) {
> -- 
> 2.19.2
>
Prasad Pandit Dec. 17, 2018, 6:47 p.m. UTC | #2
Hello Yuval,

+-- On Sun, 16 Dec 2018, Yuval Shaia wrote --+
| With this patch the last step fails: the guest OS hangs, probably while
| trying to unload the pvrdma driver, and finally gives up after 3 minutes.

Strange...
 
| Anyway, with debug turned on I noticed that there is one case where the
| device gets 129 nchunks (I think in the MPI test) while your patch limits
| it to 128.
| From the pvrdma source code we can see that the first page is dedicated
| to the ring state, so it may be correct that 128 is the limit for data
| pages, but we should then check that nchunks does not exceed 129, not 128.
| 
| What do you think?

 -> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c?id=fdf82a7856b32d905c39afc85e34364491e46346#n201

the vmw_pvrdma kernel driver also seems to set PVRDMA_MAX_FAST_REG_PAGES = 128.
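
For reference, the check at that line (pvrdma_alloc_mr) reads roughly
as follows -- paraphrased from the linked tree, so double-check me:

    if (mr_type != IB_MR_TYPE_MEM_REG ||
        max_num_sg > PVRDMA_MAX_FAST_REG_PAGES)
        return ERR_PTR(-EINVAL);

i.e. it bounds the pages of a fast-registration MR, which is not
necessarily the same thing as the number of ring chunks.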


| I.e., to replace this line in create_cq_ring
| +    if (!nchunks || nchunks > PVRDMA_MAX_FAST_REG_PAGES) {
| with this
| +    if (!nchunks || nchunks > PVRDMA_MAX_FAST_REG_PAGES + 1) {
| 
| Let me know your opinion.

While it may help to fix the regression, I'm not sure it's the right fix;
129 seems an odd number to have as a limit.

Is it possible that MPI is erring by requesting 129 chunks?

IMO it's better to confirm the right value for 'PVRDMA_MAX_FAST_REG_PAGES'
before going with > PVRDMA_MAX_FAST_REG_PAGES (=128) + 1.

Thank you.
--
Prasad J Pandit / Red Hat Product Security Team
47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F
Yuval Shaia Dec. 17, 2018, 7 p.m. UTC | #3
On Tue, Dec 18, 2018 at 12:17:59AM +0530, P J P wrote:
>   Hello Yuval,
> 
> +-- On Sun, 16 Dec 2018, Yuval Shaia wrote --+
> | With this patch the last step fails: the guest OS hangs, probably while
> | trying to unload the pvrdma driver, and finally gives up after 3 minutes.
> 
> Strange...
>  
> | Anyway, with debug turned on I noticed that there is one case where the
> | device gets 129 nchunks (I think in the MPI test) while your patch limits
> | it to 128.
> | From the pvrdma source code we can see that the first page is dedicated
> | to the ring state, so it may be correct that 128 is the limit for data
> | pages, but we should then check that nchunks does not exceed 129, not 128.
> | 
> | What do you think?
> 
>  -> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/drivers/infiniband/hw/vmw_pvrdma/pvrdma_mr.c?id=fdf82a7856b32d905c39afc85e34364491e46346#n201
> 
> the vmw_pvrdma kernel driver also seems to set PVRDMA_MAX_FAST_REG_PAGES = 128.

So does the user-space library.
Maybe the mr_type is IB_MR_TYPE_MEM_REG.

> 
> 
> | I.e., to replace this line in create_cq_ring
> | +    if (!nchunks || nchunks > PVRDMA_MAX_FAST_REG_PAGES) {
> | with this
> | +    if (!nchunks || nchunks > PVRDMA_MAX_FAST_REG_PAGES + 1) {
> | 
> | Let me know your opinion.
> 
> While it may help to fix the regression, I'm not sure it's the right fix;
> 129 seems an odd number to have as a limit.

Agree, let's stick with this patch.

> 
> Is it possible that MPI is erring by requesting 129 chunks?

Yeah, but it is still the driver that is holding up the shutdown, not MPI.

Anyway, I found a wrong setting of the response to the driver in the
"Add support for RDMA MAD" patchset v6 and fixed it.
Now the regression is gone, i.e. the VM goes down smoothly.

> 
> IMO it's better to confirm the right value for 'PVRDMA_MAX_FAST_REG_PAGES'
> before going with > PVRDMA_MAX_FAST_REG_PAGES (=128) + 1.

Agree.

> 
> Thank you.
> --
> Prasad J Pandit / Red Hat Product Security Team
> 47AF CE69 3A90 54AA 9045 1053 DD13 3D32 FE5B 041F

Patch

diff --git a/hw/rdma/vmw/pvrdma_cmd.c b/hw/rdma/vmw/pvrdma_cmd.c
index 4f616d4177..e37fb18280 100644
--- a/hw/rdma/vmw/pvrdma_cmd.c
+++ b/hw/rdma/vmw/pvrdma_cmd.c
@@ -259,6 +259,11 @@ static int create_cq_ring(PCIDevice *pci_dev , PvrdmaRing **ring,
     int rc = -EINVAL;
     char ring_name[MAX_RING_NAME_SZ];
 
+    if (!nchunks || nchunks > PVRDMA_MAX_FAST_REG_PAGES) {
+        pr_dbg("invalid nchunks: %d\n", nchunks);
+        return rc;
+    }
+
     pr_dbg("pdir_dma=0x%llx\n", (long long unsigned int)pdir_dma);
     dir = rdma_pci_dma_map(pci_dev, pdir_dma, TARGET_PAGE_SIZE);
     if (!dir) {
@@ -371,6 +376,12 @@ static int create_qp_rings(PCIDevice *pci_dev, uint64_t pdir_dma,
     char ring_name[MAX_RING_NAME_SZ];
     uint32_t wqe_sz;
 
+    if (!spages || spages > PVRDMA_MAX_FAST_REG_PAGES
+        || !rpages || rpages > PVRDMA_MAX_FAST_REG_PAGES) {
+        pr_dbg("invalid pages: %d, %d\n", spages, rpages);
+        return rc;
+    }
+
     pr_dbg("pdir_dma=0x%llx\n", (long long unsigned int)pdir_dma);
     dir = rdma_pci_dma_map(pci_dev, pdir_dma, TARGET_PAGE_SIZE);
     if (!dir) {
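
As a quick illustration of the boundary behaviour discussed in the
thread, here is a standalone snippet (a hypothetical test harness, not
part of the patch) that replicates the check as posted:

    #include <stdio.h>
    #include <stddef.h>
    #include <stdint.h>

    #define PVRDMA_MAX_FAST_REG_PAGES 128

    /* Same bounds check as create_cq_ring in the patch above. */
    static int nchunks_ok(uint32_t nchunks)
    {
        return nchunks && nchunks <= PVRDMA_MAX_FAST_REG_PAGES;
    }

    int main(void)
    {
        uint32_t samples[] = { 0, 1, 128, 129 };

        for (size_t i = 0; i < sizeof(samples) / sizeof(samples[0]); i++) {
            printf("nchunks=%u -> %s\n", samples[i],
                   nchunks_ok(samples[i]) ? "accepted" : "rejected");
        }
        return 0; /* 0 and 129 are rejected; 129 is the case MPI hit. */
    }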