[net-next,18/19] Documentation: net: mlx5: Add mdev usage documentation
diff mbox series

Message ID 20191107160834.21087-18-parav@mellanox.com
State New
Headers show
  • Mellanox, mlx5 sub function support
Related show

Commit Message

Parav Pandit Nov. 7, 2019, 4:08 p.m. UTC
Reviewed-by: Saeed Mahameed <saeedm@mellanox.com>
Signed-off-by: Parav Pandit <parav@mellanox.com>
 .../device_drivers/mellanox/mlx5.rst          | 122 ++++++++++++++++++
 1 file changed, 122 insertions(+)

diff mbox series

diff --git a/Documentation/networking/device_drivers/mellanox/mlx5.rst b/Documentation/networking/device_drivers/mellanox/mlx5.rst
index d071c6b49e1f..cbdf0a37205b 100644
--- a/Documentation/networking/device_drivers/mellanox/mlx5.rst
+++ b/Documentation/networking/device_drivers/mellanox/mlx5.rst
@@ -14,6 +14,7 @@  Contents
 - `Devlink parameters`_
 - `Devlink health reporters`_
 - `mlx5 tracepoints`_
+- `Mediated devices`_
 Enabling the driver and kconfig options
@@ -97,6 +98,10 @@  Enabling the driver and kconfig options
 |   Provides low-level InfiniBand/RDMA and `RoCE <https://community.mellanox.com/s/article/recommended-network-configuration-examples-for-roce-deployment>`_ support.
+**CONFIG_MLX5_MDEV(y/n)** (module mlx5_core.ko)
+|   Provides support for Sub Functions using mediated devices.
 **External options** ( Choose if the corresponding mlx5 feature is required )
@@ -298,3 +303,120 @@  tc and eswitch offloads tracepoints:
     $ cat /sys/kernel/debug/tracing/trace
     kworker/u48:7-2221  [009] ...1  1475.387435: mlx5e_rep_neigh_update: netdev: ens1f0 MAC: 24:8a:07:9a:17:9a IPv4: IPv6: ::ffff: neigh_connected=1
+Mediated devices
+mlx5 mediated device (mdev) enables users to create multiple netdevices
+and/or RDMA devices from single PCI function.
+Each mdev maps to a mlx5 sub function.
+mlx5 sub function is similar to PCI VF. However it doesn't have its own
+PCI function and MSI-X vectors.
+mlx5 sub function has several less low level device capabilities
+as compare to PCI function.
+Each mlx5 sub function has its own resource namespace for RDMA resources.
+mlx5 mdevs share common PCI resources such as PCI BAR region,
+MSI-X interrupts.
+Each mdev has its own window in the PCI BAR region, which is
+accessible only to that mdev and applications using it.
+mdevs are supported when eswitch mode of the devlink instance
+is in switchdev mode described in 'http://man7.org/linux/man-pages/man8/devlink-dev.8.html'.
+mdev uses mediated device subsystem 'https://www.kernel.org/doc/Documentation/vfio-mediated-device.txt' of the kernel for its life cycle.
+mdev is identified using a UUID defined by RFC 4122.
+Each created mdev has unique 12 letters alias. This alias is used to
+derive phys_port_name attribute of the corresponding representor
+User commands examples
+- Set eswitch mode as switchdev mode::
+    $ devlink dev eswitch set pci/0000:06:00.0 mode switchdev
+- Create a mdev::
+    Generate a UUID
+    $ UUID=$(uuidgen)
+    Create the mdev using UUID
+    $ echo $UUID > /sys/class/net/ens2f0_p0/device/mdev_supported_types/mlx5_core-local/create
+- Unbind a mdev from vfio_mdev driver::
+    $ echo $UUID > /sys/bus/mdev/drivers/vfio_mdev/unbind
+- Bind a mdev to mlx5_core driver::
+    $ echo $UUID > /sys/bus/mdev/drivers/mlx5_core/bind
+- View netdevice and (optionally) RDMA device in sysfs tree::
+    $ ls -l /sys/bus/mdev/devices/$UUID/net/
+    $ ls -l /sys/bus/mdev/devices/$UUID/infiniband/
+- View netdevice and (optionally) RDMA device using iproute2 tools::
+    $ ip link show
+    $ rdma dev show
+- Query maximum number of mdevs that can be created::
+    $ cat /sys/class/net/ens2f0_p0/device/mdev_supported_types/mlx5_core-local/max_mdevs
+- Query remaining number of mdevs that can be created::
+    $ cat /sys/class/net/ens2f0_p0/device/mdev_supported_types/mlx5_core-local/available_instances
+- Query an alias of the mdev::
+    $ cat /sys/bus/mdev/devices/$UUID/alias
+Security model
+This section covers security aspects of mlx5 mediated devices at
+host level and at network level.
+Host side:
+- At present mlx5 mdev is meant to be used only in a host.
+It is not meant to be mapped to a VM or access by userspace application
+using VFIO framework.
+Hence, mlx5_core driver doesn't implement any of the VFIO device specific
+callback routines.
+Hence, mlx5 mediated device cannot be mapped to a VM or to a userspace
+application via VFIO framework.
+- At present an mlx5 mdev can be accessed by an application through
+its netdevice and/or RDMA device.
+- mlx5 mdev does not share PCI BAR with its parent PCI function.
+- All mlx5 mdevs of a given parent device share a single PCI BAR.
+However each mdev device has a small dedicated window of the PCI BAR.
+Hence, one mdev device cannot access PCI BAR or any of the resources
+of another mdev device.
+- Each mlx5 mdev has its own dedicated event queue through which interrupt
+notifications are delivered. Hence, one mlx5 mdev cannot enable/disable
+interrupts of other mlx5 mdev. mlx5 mdev cannot enable/disable interrupts
+of the parent PCI function.
+Network side:
+- By default the netdevice and the rdma device of mlx5 mdev cannot send or
+receive any packets over the network or to any other mlx5 mdev.
+- mlx5 mdev follows devlink eswitch and vport model of PCI SR-IOV PF and VFs.
+All traffic is dropped by default in this eswitch model.
+- Each mlx5 mdev has one eswitch vport representor netdevice and rdma port.
+The user must do necessary configuration through such representor to enable
+mlx5 mdev to send and/or receive packets.