SR-IOV with mlx4 on ConnectX-2 fails with DMAR errors

Message ID 20161213190102.GA15119@obsidianresearch.com (mailing list archive)
State Rejected

Commit Message

Jason Gunthorpe Dec. 13, 2016, 7:01 p.m. UTC
On Tue, Dec 13, 2016 at 01:36:42PM -0500, Joshua McBeth wrote:
> I bisected the kernel between v4.1 and v4.3.1 by booting each build on
> the SR-IOV host and attempting to "ping x.x.x.x" with x.x.x.x being
> the IP address assigned to the Infiniband interface of a remote host
> 
> At 4be90bc's parent the SR-IOV host is able to ping the remote host,
> but at 4be90bc the SR-IOV host is not able to ping the remote host
> (destination host unreachable)

Okay, that makes sense

> The DMAR errors occur in both the kernel built at 4be90bc (not passing
> ping test) and its parent (passing ping test)

Continuing to bisect until you find the commit that introduces the
DMAR errors would also be helpful, I think.

> Reverting only the commit 4be90bc from a later kernel (4.8.x) does not
> enable the SR-IOV host to ping the remote host, which to me suggests
> that another commit after 4be90bc is also causing my test to fail.

Okay, that does not seem too surprising.

Does this make your 4.8 kernel work? If yes, then I suspect mlx4 has
broken IB_DEVICE_LOCAL_DMA_LKEY with SRIOV.. Leon? mlx5 has this
broken, doesn't it?

It would also be very helpful to try and determine what memory the NIC is
trying to read.. If it is the ipoib packet or some mlx4 internal
thing.


Jason
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Joshua McBeth Dec. 14, 2016, 3:06 p.m. UTC | #1
On Tue, Dec 13, 2016 at 2:01 PM, Jason Gunthorpe
<jgunthorpe@obsidianresearch.com> wrote:
>
> On Tue, Dec 13, 2016 at 01:36:42PM -0500, Joshua McBeth wrote:
> > I bisected the kernel between v4.1 and v4.3.1 by booting each build on
> > the SR-IOV host and attempting to "ping x.x.x.x" with x.x.x.x being
> > the IP address assigned to the Infiniband interface of a remote host
> >
> > At 4be90bc's parent the SR-IOV host is able to ping the remote host,
> > but at 4be90bc the SR-IOV host is not able to ping the remote host
> > (destination host unreachable)
>
> Okay, that makes sense
>
> > The DMAR errors occur in both the kernel built at 4be90bc (not passing
> > ping test) and its parent (passing ping test)
>
> Continuing to bisect until you find the commit that introduces the
> DMAR errors would also be helpful, I think.


I will do this when I find some time and report back with the results.
>
>
>
> > Reverting only the commit 4be90bc from a later kernel (4.8.x) does not
> > enable the SR-IOV host to ping the remote host, which to me suggests
> > that another commit after 4be90bc is also causing my test to fail.
>
> Okay, that does not seem too surprising.
>
> Does this make your 4.8 kernel work? If yes, then I suspect mlx4 has
> broken IB_DEVICE_LOCAL_DMA_LKEY with SRIOV.. Leon? mlx5 has this
> broken, doesn't it?
>

With 4.8.1 and the below applied to the SR-IOV host and guest kernels,
SR-IOV functions in both the SR-IOV host and guests and there are no
DMAR errors emitted.  The NFS/RDMA client in the guest does not work
on the SR-IOV virtual function with the NFS/RDMA server of the host on
the SR-IOV physical function, but this may be something else I need to
troubleshoot further, as both IPoIB and synthetic RDMA traffic passes
between the guest, host, and remote node just fine.  The remote node's
NFS/RDMA client is additionally able to function with the host's
NFS/RDMA server on the SR-IOV physical function.

>
> It would also be very helpful to try and determine what memory the NIC is
> trying to read.. If it is the ipoib packet or some mlx4 internal
> thing.


How can I determine this?

> diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
> index 2be4ea0cda9c19..1346924d27691f 100644
> --- a/drivers/infiniband/core/verbs.c
> +++ b/drivers/infiniband/core/verbs.c
> @@ -243,6 +243,8 @@ struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags,
>         atomic_set(&pd->usecnt, 0);
>         pd->flags = flags;
>
> +       device->attrs.device_cap_flags = 0;
> +
>         if (device->attrs.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)
>                 pd->local_dma_lkey = device->local_dma_lkey;
>         else
>
> Jason

Jason Gunthorpe Dec. 14, 2016, 10:38 p.m. UTC | #2
On Wed, Dec 14, 2016 at 10:06:13AM -0500, Joshua McBeth wrote:

> > Does this make your 4.8 kernel work? If yes, then I suspect mlx4 has
> > broken IB_DEVICE_LOCAL_DMA_LKEY with SRIOV.. Leon? mlx5 has this
> > broken, doesn't it?

> With 4.8.1 and the below applied to the SR-IOV host and guest kernels,
> SR-IOV functions in both the SR-IOV host and guests and there are no
> DMAR errors emitted.

So strange.

Looking at your original report you see these errors:

[  107.137484] DMAR: [DMA Read] Request device [05:06.1] fault addr

But I don't see where 05:06.1 is a PCI device. That seems like a big
problem.
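
To check a faulting requester against the devices actually enumerated on
the host, it can help to split the bracketed value from the DMAR line into
bus, device, and function and compute the 16-bit requester ID. The sketch
below does only that. `parse_bdf` and `requester_id` are hypothetical helper
names, and the code assumes the `bus:device.function` hex notation seen in
the log line above, with no PCI domain prefix and no trailing characters.

```c
#include <stdint.h>
#include <stdio.h>

/* Bus/device/function triple as printed in "Request device [bb:dd.f]". */
struct pci_bdf {
	unsigned int bus;	/* 0x00..0xff */
	unsigned int dev;	/* 0x00..0x1f */
	unsigned int fn;	/* 0..7 */
};

/* Parse "bb:dd.f" (hex, no domain prefix). Returns 0 on success, -1 on error. */
static int parse_bdf(const char *s, struct pci_bdf *out)
{
	unsigned int bus, dev, fn;
	char trailing;

	/* Exactly three fields must convert; a fourth means trailing junk. */
	if (sscanf(s, "%x:%x.%x%c", &bus, &dev, &fn, &trailing) != 3)
		return -1;
	if (bus > 0xff || dev > 0x1f || fn > 0x7)
		return -1;

	out->bus = bus;
	out->dev = dev;
	out->fn = fn;
	return 0;
}

/* PCIe requester ID: bus in bits 15:8, device in bits 7:3, function in bits 2:0. */
static uint16_t requester_id(const struct pci_bdf *bdf)
{
	return (uint16_t)((bdf->bus << 8) | (bdf->dev << 3) | bdf->fn);
}
```

For `05:06.1` this yields requester ID 0x0531. If ARI is in use, the low
eight bits are really a single function number, so treat the device/function
split as a display convention only.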

Based on that, this looks like a Mellanox bug where
IB_DEVICE_LOCAL_DMA_LKEY is causing the wrong PCI BDF to be provided
as the requestor. Mellanox will have to help you further. You are
running the latest firmware, right?

> The NFS/RDMA client in the guest does not work on the SR-IOV virtual
> function with the NFS/RDMA server of the host on the SR-IOV physical
> function, but this may be something else I need to troubleshoot
> further, as both IPoIB and synthetic RDMA traffic passes between the
> guest, host, and remote node just fine.  The remote node's NFS/RDMA
> client is additionally able to function with the host's NFS/RDMA
> server on the SR-IOV physical function.

Try removing IB_DEVICE_LOCAL_DMA_LKEY from the mlx4 driver entirely..

> > It would also be very helpful to try and determine what memory the NIC is
> > trying to read.. If it is the ipoib packet or some mlx4 internal
> > thing.

> How can I determine this?

Print out the dma address of the skb when the SEND is submitted in
ipoib and see if it is similar to the DMAR region..
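
Once an address has been printed from ipoib, the comparison against the
faulting address can be done page by page. What follows is a minimal
userspace sketch, assuming 4 KiB IOMMU pages and assuming the DMAR line
reports an address inside the faulting page. `mapping_touches_fault_page`
is a hypothetical helper, and the addresses passed to it would come from
your own logs.

```c
#include <stdbool.h>
#include <stdint.h>

#define IOMMU_PAGE_SHIFT 12u	/* assumption: 4 KiB pages */

/*
 * Return true when the DMA mapping [dma_addr, dma_addr + len) covers any
 * byte of the 4 KiB page that contains fault_addr.
 */
static bool mapping_touches_fault_page(uint64_t dma_addr, uint64_t len,
				       uint64_t fault_addr)
{
	uint64_t last_byte;

	if (len == 0)
		return false;

	/* Clamp instead of wrapping if the range would overflow 64 bits. */
	if (len - 1 > UINT64_MAX - dma_addr)
		last_byte = UINT64_MAX;
	else
		last_byte = dma_addr + len - 1;

	uint64_t first_page = dma_addr >> IOMMU_PAGE_SHIFT;
	uint64_t last_page = last_byte >> IOMMU_PAGE_SHIFT;
	uint64_t fault_page = fault_addr >> IOMMU_PAGE_SHIFT;

	return first_page <= fault_page && fault_page <= last_page;
}
```

A match suggests the NIC was reading the packet buffer itself. No match
points more toward driver-internal or firmware-owned memory, although the
sketch alone cannot tell those apart.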

Jason

Patch

diff --git a/drivers/infiniband/core/verbs.c b/drivers/infiniband/core/verbs.c
index 2be4ea0cda9c19..1346924d27691f 100644
--- a/drivers/infiniband/core/verbs.c
+++ b/drivers/infiniband/core/verbs.c
@@ -243,6 +243,8 @@  struct ib_pd *__ib_alloc_pd(struct ib_device *device, unsigned int flags,
 	atomic_set(&pd->usecnt, 0);
 	pd->flags = flags;
 
+	device->attrs.device_cap_flags = 0;
+
 	if (device->attrs.device_cap_flags & IB_DEVICE_LOCAL_DMA_LKEY)
 		pd->local_dma_lkey = device->local_dma_lkey;
 	else
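
For readers following the test above: the added line clears every bit in
`device_cap_flags`, so the `IB_DEVICE_LOCAL_DMA_LKEY` check that follows can
never be true and the `else` branch is always taken. It also hides every
other capability the driver advertised, which makes it a debugging aid
rather than a fix. The userspace model below contrasts that with masking out
a single bit. The struct, the helper names, and the flag values are
placeholders for illustration, not the kernel's definitions.

```c
#include <stdbool.h>
#include <stdint.h>

/* Placeholder bit values, chosen only for this model. */
#define DEMO_CAP_LOCAL_DMA_LKEY	(1u << 0)
#define DEMO_CAP_OTHER		(1u << 1)

struct demo_device {
	uint32_t cap_flags;
};

/* Mirrors the shape of the flag test in the patched function. */
static bool uses_device_lkey(const struct demo_device *dev)
{
	return (dev->cap_flags & DEMO_CAP_LOCAL_DMA_LKEY) != 0;
}

/* What the debug patch does: drop every advertised capability. */
static void clear_all_caps(struct demo_device *dev)
{
	dev->cap_flags = 0;
}

/* Narrower variant: drop only the local DMA lkey capability. */
static void clear_lkey_cap_only(struct demo_device *dev)
{
	dev->cap_flags &= ~DEMO_CAP_LOCAL_DMA_LKEY;
}
```

Both variants make the flag test fail. Only the second leaves the remaining
capability bits intact, which is closer in spirit to removing the flag from
the driver.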