diff mbox

[rdma-rc,4/7] IB/mlx4: Fix possible vl/sl field mismatch in LRH header in QP1 packets

Message ID 1473696984-6161-5-git-send-email-leon@kernel.org (mailing list archive)
State Accepted
Headers show

Commit Message

Leon Romanovsky Sept. 12, 2016, 4:16 p.m. UTC
From: Jack Morgenstein <jackm@dev.mellanox.co.il>

In MLX qp packets, the LRH (built by the driver) has both a VL field
and an SL field. When building a QP1 packet, the VL field should
reflect the SLtoVL mapping and not arbitrarily contain zero (as is
done now). This bug causes credit problems in IB switches at
high rates of QP1 packets.

The fix is to cache the SL to VL mapping in the driver, and look up
the VL mapped to the SL provided in the send request when sending
QP1 packets.

For FW versions which support generating a port_management_config_change
event with subtype sl-to-vl-table-change, the driver uses that event
to update its sl-to-vl mapping cache.  Otherwise, the driver snoops
incoming SMP mads to update the cache.

There remains the case where the FW is running in secure-host mode
(so no QP0 packets are delivered to the driver), and the FW does not
generate the sl2vl mapping change event. To support this case, the
driver updates (via querying the FW) its sl2vl mapping cache when
running in secure-host mode when it receives either a Port Up event
or a client-reregister event (where the port is still up, but there
may have been an opensm failover).
OpenSM modifies the sl2vl mapping before Port Up and Client-reregister
events occur, so if there is a mapping change the driver's cache will
be properly updated.

Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
Signed-off-by: Leon Romanovsky <leon@kernel.org>
---
 drivers/infiniband/hw/mlx4/mad.c        |  64 ++++++++++++++++++-
 drivers/infiniband/hw/mlx4/main.c       | 110 +++++++++++++++++++++++++++++++-
 drivers/infiniband/hw/mlx4/mlx4_ib.h    |   7 ++
 drivers/infiniband/hw/mlx4/qp.c         |  23 ++++++-
 drivers/net/ethernet/mellanox/mlx4/fw.c |  13 ++--
 include/linux/mlx4/device.h             |  13 +++-
 6 files changed, 220 insertions(+), 10 deletions(-)

Comments

Doug Ledford Sept. 16, 2016, 6:01 p.m. UTC | #1
On 9/12/2016 12:16 PM, Leon Romanovsky wrote:
> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> 
> In MLX qp packets, the LRH (built by the driver) has both a VL field
> and an SL field. When building a QP1 packet, the VL field should
> reflect the SLtoVL mapping and not arbitrarily contain zero (as is
> done now). This bug causes credit problems in IB switches at
> high rates of QP1 packets.
> 
> The fix is to cache the SL to VL mapping in the driver, and look up
> the VL mapped to the SL provided in the send request when sending
> QP1 packets.
> 
> For FW versions which support generating a port_management_config_change
> event with subtype sl-to-vl-table-change, the driver uses that event
> to update its sl-to-vl mapping cache.  Otherwise, the driver snoops
> incoming SMP mads to update the cache.
> 
> There remains the case where the FW is running in secure-host mode
> (so no QP0 packets are delivered to the driver), and the FW does not
> generate the sl2vl mapping change event. To support this case, the
> driver updates (via querying the FW) its sl2vl mapping cache when
> running in secure-host mode when it receives either a Port Up event
> or a client-reregister event (where the port is still up, but there
> may have been an opensm failover).
> OpenSM modifies the sl2vl mapping before Port Up and Client-reregister
> events occur, so if there is a mapping change the driver's cache will
> be properly updated.
> 
> Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> Signed-off-by: Leon Romanovsky <leon@kernel.org>
> ---
>  drivers/infiniband/hw/mlx4/mad.c        |  64 ++++++++++++++++++-
>  drivers/infiniband/hw/mlx4/main.c       | 110 +++++++++++++++++++++++++++++++-
>  drivers/infiniband/hw/mlx4/mlx4_ib.h    |   7 ++
>  drivers/infiniband/hw/mlx4/qp.c         |  23 ++++++-
>  drivers/net/ethernet/mellanox/mlx4/fw.c |  13 ++--
>  include/linux/mlx4/device.h             |  13 +++-
>  6 files changed, 220 insertions(+), 10 deletions(-)

That's a lot more code churn than I would like to see in a late RC.  I'm
going to drop this patch and move it to 4.9 instead.  If this fixed an
oops or something like that, I would be more open to taking it now, but
the problem being resolved is credits on a switch.  That isn't the sort
of showstopper issue that would justify this large of a patch this late
in the cycle.
Leon Romanovsky Sept. 18, 2016, 6:53 a.m. UTC | #2
On Fri, Sep 16, 2016 at 02:01:57PM -0400, Doug Ledford wrote:
> On 9/12/2016 12:16 PM, Leon Romanovsky wrote:
> > From: Jack Morgenstein <jackm@dev.mellanox.co.il>
> >
> > In MLX qp packets, the LRH (built by the driver) has both a VL field
> > and an SL field. When building a QP1 packet, the VL field should
> > reflect the SLtoVL mapping and not arbitrarily contain zero (as is
> > done now). This bug causes credit problems in IB switches at
> > high rates of QP1 packets.
> >
> > The fix is to cache the SL to VL mapping in the driver, and look up
> > the VL mapped to the SL provided in the send request when sending
> > QP1 packets.
> >
> > For FW versions which support generating a port_management_config_change
> > event with subtype sl-to-vl-table-change, the driver uses that event
> > to update its sl-to-vl mapping cache.  Otherwise, the driver snoops
> > incoming SMP mads to update the cache.
> >
> > There remains the case where the FW is running in secure-host mode
> > (so no QP0 packets are delivered to the driver), and the FW does not
> > generate the sl2vl mapping change event. To support this case, the
> > driver updates (via querying the FW) its sl2vl mapping cache when
> > running in secure-host mode when it receives either a Port Up event
> > or a client-reregister event (where the port is still up, but there
> > may have been an opensm failover).
> > OpenSM modifies the sl2vl mapping before Port Up and Client-reregister
> > events occur, so if there is a mapping change the driver's cache will
> > be properly updated.
> >
> > Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
> > Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
> > Signed-off-by: Leon Romanovsky <leon@kernel.org>
> > ---
> >  drivers/infiniband/hw/mlx4/mad.c        |  64 ++++++++++++++++++-
> >  drivers/infiniband/hw/mlx4/main.c       | 110 +++++++++++++++++++++++++++++++-
> >  drivers/infiniband/hw/mlx4/mlx4_ib.h    |   7 ++
> >  drivers/infiniband/hw/mlx4/qp.c         |  23 ++++++-
> >  drivers/net/ethernet/mellanox/mlx4/fw.c |  13 ++--
> >  include/linux/mlx4/device.h             |  13 +++-
> >  6 files changed, 220 insertions(+), 10 deletions(-)
>
> That's a lot more code churn than I would like to see in a late RC.  I'm
> going to drop this patch and move it to 4.9 instead.  If this fixed an
> oops or something like that, I would be more open to taking it now, but
> the problem being resolved is credits on a switch.  That isn't the sort
> of showstopper issue that would justify this large of a patch this late
> in the cycle.

No problem, Thanks.

>
>
> --
> Doug Ledford <dledford@redhat.com>
>     GPG Key ID: 0E572FDD
>
Doug Ledford Sept. 23, 2016, 4:48 p.m. UTC | #3
On 9/18/2016 2:53 AM, Leon Romanovsky wrote:
> On Fri, Sep 16, 2016 at 02:01:57PM -0400, Doug Ledford wrote:
>> On 9/12/2016 12:16 PM, Leon Romanovsky wrote:
>>> From: Jack Morgenstein <jackm@dev.mellanox.co.il>
>>>
>>> In MLX qp packets, the LRH (built by the driver) has both a VL field
>>> and an SL field. When building a QP1 packet, the VL field should
>>> reflect the SLtoVL mapping and not arbitrarily contain zero (as is
>>> done now). This bug causes credit problems in IB switches at
>>> high rates of QP1 packets.
>>>
>>> The fix is to cache the SL to VL mapping in the driver, and look up
>>> the VL mapped to the SL provided in the send request when sending
>>> QP1 packets.
>>>
>>> For FW versions which support generating a port_management_config_change
>>> event with subtype sl-to-vl-table-change, the driver uses that event
>>> to update its sl-to-vl mapping cache.  Otherwise, the driver snoops
>>> incoming SMP mads to update the cache.
>>>
>>> There remains the case where the FW is running in secure-host mode
>>> (so no QP0 packets are delivered to the driver), and the FW does not
>>> generate the sl2vl mapping change event. To support this case, the
>>> driver updates (via querying the FW) its sl2vl mapping cache when
>>> running in secure-host mode when it receives either a Port Up event
>>> or a client-reregister event (where the port is still up, but there
>>> may have been an opensm failover).
>>> OpenSM modifies the sl2vl mapping before Port Up and Client-reregister
>>> events occur, so if there is a mapping change the driver's cache will
>>> be properly updated.
>>>
>>> Fixes: 225c7b1feef1 ("IB/mlx4: Add a driver Mellanox ConnectX InfiniBand adapters")
>>> Signed-off-by: Jack Morgenstein <jackm@dev.mellanox.co.il>
>>> Signed-off-by: Leon Romanovsky <leon@kernel.org>
>>> ---
>>>  drivers/infiniband/hw/mlx4/mad.c        |  64 ++++++++++++++++++-
>>>  drivers/infiniband/hw/mlx4/main.c       | 110 +++++++++++++++++++++++++++++++-
>>>  drivers/infiniband/hw/mlx4/mlx4_ib.h    |   7 ++
>>>  drivers/infiniband/hw/mlx4/qp.c         |  23 ++++++-
>>>  drivers/net/ethernet/mellanox/mlx4/fw.c |  13 ++--
>>>  include/linux/mlx4/device.h             |  13 +++-
>>>  6 files changed, 220 insertions(+), 10 deletions(-)
>>
>> That's a lot more code churn than I would like to see in a late RC.  I'm
>> going to drop this patch and move it to 4.9 instead.  If this fixed an
>> oops or something like that, I would be more open to taking it now, but
>> the problem being resolved is credits on a switch.  That isn't the sort
>> of showstopper issue that would justify this large of a patch this late
>> in the cycle.
> 
> No problem, Thanks.

This is now in my 4.9-misc branch.
diff mbox

Patch

diff --git a/drivers/infiniband/hw/mlx4/mad.c b/drivers/infiniband/hw/mlx4/mad.c
index 0f21c3a..c6271fb 100644
--- a/drivers/infiniband/hw/mlx4/mad.c
+++ b/drivers/infiniband/hw/mlx4/mad.c
@@ -230,6 +230,8 @@  static void smp_snoop(struct ib_device *ibdev, u8 port_num, const struct ib_mad
 	    mad->mad_hdr.method == IB_MGMT_METHOD_SET)
 		switch (mad->mad_hdr.attr_id) {
 		case IB_SMP_ATTR_PORT_INFO:
+			if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV)
+				return;
 			pinfo = (struct ib_port_info *) ((struct ib_smp *) mad)->data;
 			lid = be16_to_cpu(pinfo->lid);
 
@@ -245,6 +247,8 @@  static void smp_snoop(struct ib_device *ibdev, u8 port_num, const struct ib_mad
 			break;
 
 		case IB_SMP_ATTR_PKEY_TABLE:
+			if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV)
+				return;
 			if (!mlx4_is_mfunc(dev->dev)) {
 				mlx4_ib_dispatch_event(dev, port_num,
 						       IB_EVENT_PKEY_CHANGE);
@@ -281,6 +285,8 @@  static void smp_snoop(struct ib_device *ibdev, u8 port_num, const struct ib_mad
 			break;
 
 		case IB_SMP_ATTR_GUID_INFO:
+			if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV)
+				return;
 			/* paravirtualized master's guid is guid 0 -- does not change */
 			if (!mlx4_is_master(dev->dev))
 				mlx4_ib_dispatch_event(dev, port_num,
@@ -296,6 +302,26 @@  static void smp_snoop(struct ib_device *ibdev, u8 port_num, const struct ib_mad
 			}
 			break;
 
+		case IB_SMP_ATTR_SL_TO_VL_TABLE:
+			/* cache sl to vl mapping changes for use in
+			 * filling QP1 LRH VL field when sending packets
+			 */
+			if (dev->dev->caps.flags & MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV &&
+			    dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_SL_TO_VL_CHANGE_EVENT)
+				return;
+			if (!mlx4_is_slave(dev->dev)) {
+				union sl2vl_tbl_to_u64 sl2vl64;
+				int jj;
+
+				for (jj = 0; jj < 8; jj++) {
+					sl2vl64.sl8[jj] = ((struct ib_smp *)mad)->data[jj];
+					pr_debug("port %u, sl2vl[%d] = %02x\n",
+						 port_num, jj, sl2vl64.sl8[jj]);
+				}
+				atomic64_set(&dev->sl2vl[port_num - 1], sl2vl64.sl64);
+			}
+			break;
+
 		default:
 			break;
 		}
@@ -805,8 +831,7 @@  static int ib_process_mad(struct ib_device *ibdev, int mad_flags, u8 port_num,
 		return IB_MAD_RESULT_FAILURE;
 
 	if (!out_mad->mad_hdr.status) {
-		if (!(to_mdev(ibdev)->dev->caps.flags & MLX4_DEV_CAP_FLAG_PORT_MNG_CHG_EV))
-			smp_snoop(ibdev, port_num, in_mad, prev_lid);
+		smp_snoop(ibdev, port_num, in_mad, prev_lid);
 		/* slaves get node desc from FW */
 		if (!mlx4_is_slave(to_mdev(ibdev)->dev))
 			node_desc_override(ibdev, out_mad);
@@ -1037,6 +1062,23 @@  static void handle_client_rereg_event(struct mlx4_ib_dev *dev, u8 port_num)
 						    MLX4_EQ_PORT_INFO_CLIENT_REREG_MASK);
 		}
 	}
+
+	/* Update the sl to vl table from inside client rereg
+	 * only if in secure-host mode (snooping is not possible)
+	 * and the sl-to-vl change event is not generated by FW.
+	 */
+	if (!mlx4_is_slave(dev->dev) &&
+	    dev->dev->flags & MLX4_FLAG_SECURE_HOST &&
+	    !(dev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_SL_TO_VL_CHANGE_EVENT)) {
+		if (mlx4_is_master(dev->dev))
+			/* already in work queue from mlx4_ib_event queueing
+			 * mlx4_handle_port_mgmt_change_event, which calls
+			 * this procedure. Therefore, call sl2vl_update directly.
+			 */
+			mlx4_ib_sl2vl_update(dev, port_num);
+		else
+			mlx4_sched_ib_sl2vl_update_work(dev, port_num);
+	}
 	mlx4_ib_dispatch_event(dev, port_num, IB_EVENT_CLIENT_REREGISTER);
 }
 
@@ -1176,6 +1218,24 @@  void handle_port_mgmt_change_event(struct work_struct *work)
 			handle_slaves_guid_change(dev, port, tbl_block, change_bitmap);
 		}
 		break;
+
+	case MLX4_DEV_PMC_SUBTYPE_SL_TO_VL_MAP:
+		/* cache sl to vl mapping changes for use in
+		 * filling QP1 LRH VL field when sending packets
+		 */
+		if (!mlx4_is_slave(dev->dev)) {
+			union sl2vl_tbl_to_u64 sl2vl64;
+			int jj;
+
+			for (jj = 0; jj < 8; jj++) {
+				sl2vl64.sl8[jj] =
+					eqe->event.port_mgmt_change.params.sl2vl_tbl_change_info.sl2vl_table[jj];
+				pr_debug("port %u, sl2vl[%d] = %02x\n",
+					 port, jj, sl2vl64.sl8[jj]);
+			}
+			atomic64_set(&dev->sl2vl[port - 1], sl2vl64.sl64);
+		}
+		break;
 	default:
 		pr_warn("Unsupported subtype 0x%x for "
 			"Port Management Change event\n", eqe->subtype);
diff --git a/drivers/infiniband/hw/mlx4/main.c b/drivers/infiniband/hw/mlx4/main.c
index 2af44c2..ac235b5 100644
--- a/drivers/infiniband/hw/mlx4/main.c
+++ b/drivers/infiniband/hw/mlx4/main.c
@@ -832,6 +832,66 @@  static int mlx4_ib_query_gid(struct ib_device *ibdev, u8 port, int index,
 	return ret;
 }
 
+static int mlx4_ib_query_sl2vl(struct ib_device *ibdev, u8 port, u64 *sl2vl_tbl)
+{
+	union sl2vl_tbl_to_u64 sl2vl64;
+	struct ib_smp *in_mad  = NULL;
+	struct ib_smp *out_mad = NULL;
+	int mad_ifc_flags = MLX4_MAD_IFC_IGNORE_KEYS;
+	int err = -ENOMEM;
+	int jj;
+
+	if (mlx4_is_slave(to_mdev(ibdev)->dev)) {
+		*sl2vl_tbl = 0;
+		return 0;
+	}
+
+	in_mad  = kzalloc(sizeof(*in_mad), GFP_KERNEL);
+	out_mad = kmalloc(sizeof(*out_mad), GFP_KERNEL);
+	if (!in_mad || !out_mad)
+		goto out;
+
+	init_query_mad(in_mad);
+	in_mad->attr_id  = IB_SMP_ATTR_SL_TO_VL_TABLE;
+	in_mad->attr_mod = 0;
+
+	if (mlx4_is_mfunc(to_mdev(ibdev)->dev))
+		mad_ifc_flags |= MLX4_MAD_IFC_NET_VIEW;
+
+	err = mlx4_MAD_IFC(to_mdev(ibdev), mad_ifc_flags, port, NULL, NULL,
+			   in_mad, out_mad);
+	if (err)
+		goto out;
+
+	for (jj = 0; jj < 8; jj++)
+		sl2vl64.sl8[jj] = ((struct ib_smp *)out_mad)->data[jj];
+	*sl2vl_tbl = sl2vl64.sl64;
+
+out:
+	kfree(in_mad);
+	kfree(out_mad);
+	return err;
+}
+
+static void mlx4_init_sl2vl_tbl(struct mlx4_ib_dev *mdev)
+{
+	u64 sl2vl;
+	int i;
+	int err;
+
+	for (i = 1; i <= mdev->dev->caps.num_ports; i++) {
+		if (mdev->dev->caps.port_type[i] == MLX4_PORT_TYPE_ETH)
+			continue;
+		err = mlx4_ib_query_sl2vl(&mdev->ib_dev, i, &sl2vl);
+		if (err) {
+			pr_err("Unable to get default sl to vl mapping for port %d.  Using all zeroes (%d)\n",
+			       i, err);
+			sl2vl = 0;
+		}
+		atomic64_set(&mdev->sl2vl[i - 1], sl2vl);
+	}
+}
+
 int __mlx4_ib_query_pkey(struct ib_device *ibdev, u8 port, u16 index,
 			 u16 *pkey, int netw_view)
 {
@@ -2650,6 +2710,7 @@  static void *mlx4_ib_add(struct mlx4_dev *dev)
 
 	if (init_node_data(ibdev))
 		goto err_map;
+	mlx4_init_sl2vl_tbl(ibdev);
 
 	for (i = 0; i < ibdev->num_ports; ++i) {
 		mutex_init(&ibdev->counters_table[i].mutex);
@@ -3098,6 +3159,47 @@  static void handle_bonded_port_state_event(struct work_struct *work)
 	ib_dispatch_event(&ibev);
 }
 
+void mlx4_ib_sl2vl_update(struct mlx4_ib_dev *mdev, int port)
+{
+	u64 sl2vl;
+	int err;
+
+	err = mlx4_ib_query_sl2vl(&mdev->ib_dev, port, &sl2vl);
+	if (err) {
+		pr_err("Unable to get current sl to vl mapping for port %d.  Using all zeroes (%d)\n",
+		       port, err);
+		sl2vl = 0;
+	}
+	atomic64_set(&mdev->sl2vl[port - 1], sl2vl);
+}
+
+static void ib_sl2vl_update_work(struct work_struct *work)
+{
+	struct ib_event_work *ew = container_of(work, struct ib_event_work, work);
+	struct mlx4_ib_dev *mdev = ew->ib_dev;
+	int port = ew->port;
+
+	mlx4_ib_sl2vl_update(mdev, port);
+
+	kfree(ew);
+}
+
+void mlx4_sched_ib_sl2vl_update_work(struct mlx4_ib_dev *ibdev,
+				     int port)
+{
+	struct ib_event_work *ew;
+
+	ew = kmalloc(sizeof(*ew), GFP_ATOMIC);
+	if (ew) {
+		INIT_WORK(&ew->work, ib_sl2vl_update_work);
+		ew->port = port;
+		ew->ib_dev = ibdev;
+		queue_work(wq, &ew->work);
+	} else {
+		pr_err("failed to allocate memory for sl2vl update work\n");
+	}
+}
+
 static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
 			  enum mlx4_dev_event event, unsigned long param)
 {
@@ -3128,10 +3230,14 @@  static void mlx4_ib_event(struct mlx4_dev *dev, void *ibdev_ptr,
 	case MLX4_DEV_EVENT_PORT_UP:
 		if (p > ibdev->num_ports)
 			return;
-		if (mlx4_is_master(dev) &&
+		if (!mlx4_is_slave(dev) &&
 		    rdma_port_get_link_layer(&ibdev->ib_dev, p) ==
 			IB_LINK_LAYER_INFINIBAND) {
-			mlx4_ib_invalidate_all_guid_record(ibdev, p);
+			if (mlx4_is_master(dev))
+				mlx4_ib_invalidate_all_guid_record(ibdev, p);
+			if (ibdev->dev->flags & MLX4_FLAG_SECURE_HOST &&
+			    !(ibdev->dev->caps.flags2 & MLX4_DEV_CAP_FLAG2_SL_TO_VL_CHANGE_EVENT))
+				mlx4_sched_ib_sl2vl_update_work(ibdev, p);
 		}
 		ibev.event = IB_EVENT_PORT_ACTIVE;
 		break;
diff --git a/drivers/infiniband/hw/mlx4/mlx4_ib.h b/drivers/infiniband/hw/mlx4/mlx4_ib.h
index 686ab48..35141f4 100644
--- a/drivers/infiniband/hw/mlx4/mlx4_ib.h
+++ b/drivers/infiniband/hw/mlx4/mlx4_ib.h
@@ -570,6 +570,7 @@  struct mlx4_ib_dev {
 	struct ib_mad_agent    *send_agent[MLX4_MAX_PORTS][2];
 	struct ib_ah	       *sm_ah[MLX4_MAX_PORTS];
 	spinlock_t		sm_lock;
+	atomic64_t		sl2vl[MLX4_MAX_PORTS];
 	struct mlx4_ib_sriov	sriov;
 
 	struct mutex		cap_mask_mutex;
@@ -600,6 +601,7 @@  struct ib_event_work {
 	struct work_struct	work;
 	struct mlx4_ib_dev	*ib_dev;
 	struct mlx4_eqe		ib_eqe;
+	int			port;
 };
 
 struct mlx4_ib_qp_tunnel_init_attr {
@@ -883,4 +885,9 @@  int mlx4_ib_rereg_user_mr(struct ib_mr *mr, int flags,
 int mlx4_ib_gid_index_to_real_index(struct mlx4_ib_dev *ibdev,
 				    u8 port_num, int index);
 
+void mlx4_sched_ib_sl2vl_update_work(struct mlx4_ib_dev *ibdev,
+				     int port);
+
+void mlx4_ib_sl2vl_update(struct mlx4_ib_dev *mdev, int port);
+
 #endif /* MLX4_IB_H */
diff --git a/drivers/infiniband/hw/mlx4/qp.c b/drivers/infiniband/hw/mlx4/qp.c
index 7fb9629..19cee23 100644
--- a/drivers/infiniband/hw/mlx4/qp.c
+++ b/drivers/infiniband/hw/mlx4/qp.c
@@ -2405,6 +2405,22 @@  static int build_sriov_qp0_header(struct mlx4_ib_sqp *sqp,
 	return 0;
 }
 
+static u8 sl_to_vl(struct mlx4_ib_dev *dev, u8 sl, int port_num)
+{
+	union sl2vl_tbl_to_u64 tmp_vltab;
+	u8 vl;
+
+	if (sl > 15)
+		return 0xf;
+	tmp_vltab.sl64 = atomic64_read(&dev->sl2vl[port_num - 1]);
+	vl = tmp_vltab.sl8[sl >> 1];
+	if (sl & 1)
+		vl &= 0x0f;
+	else
+		vl >>= 4;
+	return vl;
+}
+
 #define MLX4_ROCEV2_QP1_SPORT 0xC000
 static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
 			    void *wqe, unsigned *mlx_seg_len)
@@ -2590,7 +2606,12 @@  static int build_mlx_header(struct mlx4_ib_sqp *sqp, struct ib_ud_wr *wr,
 			sqp->ud_header.vlan.tag = cpu_to_be16(vlan | pcp);
 		}
 	} else {
-		sqp->ud_header.lrh.virtual_lane    = !sqp->qp.ibqp.qp_num ? 15 : 0;
+		sqp->ud_header.lrh.virtual_lane    = !sqp->qp.ibqp.qp_num ? 15 :
+							sl_to_vl(to_mdev(ib_dev),
+								 sqp->ud_header.lrh.service_level,
+								 sqp->qp.port);
+		if (sqp->qp.ibqp.qp_num && sqp->ud_header.lrh.virtual_lane == 15)
+			return -EINVAL;
 		if (sqp->ud_header.lrh.destination_lid == IB_LID_PERMISSIVE)
 			sqp->ud_header.lrh.source_lid = IB_LID_PERMISSIVE;
 	}
diff --git a/drivers/net/ethernet/mellanox/mlx4/fw.c b/drivers/net/ethernet/mellanox/mlx4/fw.c
index d728704..d87bbe6 100644
--- a/drivers/net/ethernet/mellanox/mlx4/fw.c
+++ b/drivers/net/ethernet/mellanox/mlx4/fw.c
@@ -158,7 +158,8 @@  static void dump_dev_cap_flags2(struct mlx4_dev *dev, u64 flags)
 		[31] = "Modifying loopback source checks using UPDATE_QP support",
 		[32] = "Loopback source checks support",
 		[33] = "RoCEv2 support",
-		[34] = "DMFS Sniffer support (UC & MC)"
+		[34] = "DMFS Sniffer support (UC & MC)",
+		[36] = "sl to vl mapping table change event support"
 	};
 	int i;
 
@@ -703,6 +704,7 @@  int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 #define QUERY_DEV_CAP_FLOW_STEERING_IPOIB_OFFSET	0x74
 #define QUERY_DEV_CAP_FLOW_STEERING_RANGE_EN_OFFSET	0x76
 #define QUERY_DEV_CAP_FLOW_STEERING_MAX_QP_OFFSET	0x77
+#define QUERY_DEV_CAP_SL2VL_EVENT_OFFSET	0x78
 #define QUERY_DEV_CAP_CQ_EQ_CACHE_LINE_STRIDE	0x7a
 #define QUERY_DEV_CAP_ECN_QCN_VER_OFFSET	0x7b
 #define QUERY_DEV_CAP_RDMARC_ENTRY_SZ_OFFSET	0x80
@@ -822,6 +824,9 @@  int mlx4_QUERY_DEV_CAP(struct mlx4_dev *dev, struct mlx4_dev_cap *dev_cap)
 		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_DMFS_IPOIB;
 	MLX4_GET(field, outbox, QUERY_DEV_CAP_FLOW_STEERING_MAX_QP_OFFSET);
 	dev_cap->fs_max_num_qp_per_entry = field;
+	MLX4_GET(field, outbox, QUERY_DEV_CAP_SL2VL_EVENT_OFFSET);
+	if (field & (1 << 5))
+		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_SL_TO_VL_CHANGE_EVENT;
 	MLX4_GET(field, outbox, QUERY_DEV_CAP_ECN_QCN_VER_OFFSET);
 	if (field & 0x1)
 		dev_cap->flags2 |= MLX4_DEV_CAP_FLAG2_QCN;
@@ -2698,7 +2703,6 @@  static int mlx4_check_smp_firewall_active(struct mlx4_dev *dev,
 int mlx4_config_mad_demux(struct mlx4_dev *dev)
 {
 	struct mlx4_cmd_mailbox *mailbox;
-	int secure_host_active;
 	int err;
 
 	/* Check if mad_demux is supported */
@@ -2721,7 +2725,8 @@  int mlx4_config_mad_demux(struct mlx4_dev *dev)
 		goto out;
 	}
 
-	secure_host_active = mlx4_check_smp_firewall_active(dev, mailbox);
+	if (mlx4_check_smp_firewall_active(dev, mailbox))
+		dev->flags |= MLX4_FLAG_SECURE_HOST;
 
 	/* Config mad_demux to handle all MADs returned by the query above */
 	err = mlx4_cmd(dev, mailbox->dma, 0x01 /* subn mgmt class */,
@@ -2732,7 +2737,7 @@  int mlx4_config_mad_demux(struct mlx4_dev *dev)
 		goto out;
 	}
 
-	if (secure_host_active)
+	if (dev->flags & MLX4_FLAG_SECURE_HOST)
 		mlx4_warn(dev, "HCA operating in secure-host mode. SMP firewall activated.\n");
 out:
 	mlx4_free_cmd_mailbox(dev, mailbox);
diff --git a/include/linux/mlx4/device.h b/include/linux/mlx4/device.h
index 42da355..062d10a 100644
--- a/include/linux/mlx4/device.h
+++ b/include/linux/mlx4/device.h
@@ -71,7 +71,8 @@  enum {
 	MLX4_FLAG_SLAVE		= 1 << 3,
 	MLX4_FLAG_SRIOV		= 1 << 4,
 	MLX4_FLAG_OLD_REG_MAC	= 1 << 6,
-	MLX4_FLAG_BONDED	= 1 << 7
+	MLX4_FLAG_BONDED	= 1 << 7,
+	MLX4_FLAG_SECURE_HOST	= 1 << 8,
 };
 
 enum {
@@ -221,6 +222,7 @@  enum {
 	MLX4_DEV_CAP_FLAG2_ROCE_V1_V2		= 1ULL <<  33,
 	MLX4_DEV_CAP_FLAG2_DMFS_UC_MC_SNIFFER   = 1ULL <<  34,
 	MLX4_DEV_CAP_FLAG2_DIAG_PER_PORT	= 1ULL <<  35,
+	MLX4_DEV_CAP_FLAG2_SL_TO_VL_CHANGE_EVENT = 1ULL << 36,
 };
 
 enum {
@@ -448,6 +450,7 @@  enum {
 	MLX4_DEV_PMC_SUBTYPE_GUID_INFO	 = 0x14,
 	MLX4_DEV_PMC_SUBTYPE_PORT_INFO	 = 0x15,
 	MLX4_DEV_PMC_SUBTYPE_PKEY_TABLE	 = 0x16,
+	MLX4_DEV_PMC_SUBTYPE_SL_TO_VL_MAP = 0x17,
 };
 
 /* Port mgmt change event handling */
@@ -459,6 +462,11 @@  enum {
 	MLX4_EQ_PORT_INFO_MSTR_SM_SL_CHANGE_MASK	= 1 << 4,
 };
 
+union sl2vl_tbl_to_u64 {
+	u8	sl8[8];
+	u64	sl64;
+};
+
 enum {
 	MLX4_DEVICE_STATE_UP			= 1 << 0,
 	MLX4_DEVICE_STATE_INTERNAL_ERROR	= 1 << 1,
@@ -945,6 +953,9 @@  struct mlx4_eqe {
 					__be32 block_ptr;
 					__be32 tbl_entries_mask;
 				} __packed tbl_change_info;
+				struct {
+					u8 sl2vl_table[8];
+				} __packed sl2vl_tbl_change_info;
 			} params;
 		} __packed port_mgmt_change;
 		struct {