
[Xilinx,Alveo,1/8] Documentation: fpga: Add a document describing Alveo XRT drivers

Message ID 20201129000040.24777-2-sonals@xilinx.com (mailing list archive)
State Superseded, archived
Series Xilinx Alveo/XRT patch overview

Commit Message

Sonal Santan Nov. 29, 2020, midnight UTC
From: Sonal Santan <sonal.santan@xilinx.com>

Describe Alveo XRT driver architecture and provide basic overview
of Xilinx Alveo platform.

Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
---
 Documentation/fpga/index.rst |   1 +
 Documentation/fpga/xrt.rst   | 588 +++++++++++++++++++++++++++++++++++
 2 files changed, 589 insertions(+)
 create mode 100644 Documentation/fpga/xrt.rst

2.17.1

Comments

Moritz Fischer Dec. 1, 2020, 4:54 a.m. UTC | #1
On Sat, Nov 28, 2020 at 04:00:33PM -0800, Sonal Santan wrote:
> From: Sonal Santan <sonal.santan@xilinx.com>
> 
> Describe Alveo XRT driver architecture and provide basic overview
> of Xilinx Alveo platform.
> 
> Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
> ---
>  Documentation/fpga/index.rst |   1 +
>  Documentation/fpga/xrt.rst   | 588 +++++++++++++++++++++++++++++++++++
>  2 files changed, 589 insertions(+)
>  create mode 100644 Documentation/fpga/xrt.rst
> 
> diff --git a/Documentation/fpga/index.rst b/Documentation/fpga/index.rst
> index f80f95667ca2..30134357b70d 100644
> --- a/Documentation/fpga/index.rst
> +++ b/Documentation/fpga/index.rst
> @@ -8,6 +8,7 @@ fpga
>      :maxdepth: 1
> 
>      dfl
> +    xrt
> 
>  .. only::  subproject and html
> 
> diff --git a/Documentation/fpga/xrt.rst b/Documentation/fpga/xrt.rst
> new file mode 100644
> index 000000000000..9f37d46459b0
> --- /dev/null
> +++ b/Documentation/fpga/xrt.rst
> @@ -0,0 +1,588 @@
> +==================================
> +XRTV2 Linux Kernel Driver Overview
> +==================================
> +
> +XRTV2 drivers are second generation `XRT <https://github.com/Xilinx/XRT>`_ drivers which
> +support `Alveo <https://www.xilinx.com/products/boards-and-kits/alveo.html>`_ PCIe platforms
> +from Xilinx.
> +
> +XRTV2 drivers support *subsystem*-style, data-driven platforms where the driver's configuration
> +and behavior are determined by metadata provided by the platform (in *device tree* format).
> +The primary management physical function (MPF) driver is called **xmgmt**. The primary user
> +physical function (UPF) driver is called **xuser**, and HW subsystem drivers are packaged into a
> +library module called **xrt-lib**, which is shared by **xmgmt** and **xuser** (WIP).
WIP?
> +
> +Alveo Platform Overview
> +=======================
> +
> +Alveo platforms are architected as two physical FPGA partitions: *Shell* and *User*. Shell
Nit: The Shell provides ...
> +provides basic infrastructure for the Alveo platform like PCIe connectivity, board management,
> +Dynamic Function Exchange (DFX), sensors, clocking, reset, and security. User partition contains
> +user compiled binary which is loaded by a process called DFX also known as partial reconfiguration.
> +
> +Physical partitions require strict HW compatibility with each other for DFX to work properly.
> +Every physical partition has two interface UUIDs: a *parent* UUID and a *child* UUID. For simple
> +single-stage platforms, Shell → User forms the parent-child relationship. For complex two-stage
> +platforms, Base → Shell → User forms the parent-child relationship chain.
> +
> +.. note::
> +   Partition compatibility matching is a key design component of Alveo platforms and XRT. Partitions
> +   have a child-parent relationship. A loaded partition exposes a child partition UUID to advertise
> +   its compatibility requirement for the child partition. When loading a child partition, the xmgmt
> +   management driver matches the parent UUID of the child partition against the child UUID exported
> +   by the parent. Parent and child partition UUIDs are stored in the *xclbin* (for user) or *xsabin*
> +   (for base and shell). Except for the root UUID, which is exposed through VSEC, the hardware itself
> +   does not know about UUIDs; all other UUIDs are stored in the xsabin and xclbin.
> +
> +
> +The physical partitions and their loading is illustrated below::
> +
> +            SHELL                               USER
> +        +-----------+                  +-------------------+
> +        |           |                  |                   |
> +        | VSEC UUID | CHILD     PARENT |    LOGIC UUID     |
> +        |           o------->|<--------o                   |
> +        |           | UUID       UUID  |                   |
> +        +-----+-----+                  +--------+----------+
> +              |                                 |
> +              .                                 .
> +              |                                 |
> +          +---+---+                      +------+--------+
> +          |  POR  |                      | USER COMPILED |
> +          | FLASH |                      |    XCLBIN     |
> +          +-------+                      +---------------+
> +
> +
> +Loading Sequence
> +----------------
> +
> +Shell partition is loaded from flash at system boot time. It establishes the PCIe link and exposes
Nit: The Shell
> +two physical functions to the BIOS. After OS boot, the xmgmt driver attaches to PCIe physical
> +function 0 exposed by the Shell and then looks for VSEC in the PCIe extended configuration space.
> +Using VSEC it determines the logic UUID of the Shell and uses the UUID to load the matching *xsabin*
> +file from the Linux firmware directory. The xsabin file contains metadata to discover the peripherals
> +that are part of the Shell and firmware(s) for any embedded soft processors in the Shell.

Neat.
> +
> +Shell exports child interface UUID which is used for compatibility check when loading user compiled
Nit: The Shell
> +xclbin over the User partition as part of DFX. When a user requests loading of a specific xclbin,
> +the xmgmt management driver reads the parent interface UUID specified in the xclbin and matches it
> +with the child interface UUID exported by the Shell to determine if the xclbin is compatible with
> +the Shell. If the match fails, loading of the xclbin is denied.
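> +
> +Conceptually the compatibility check looks like the following (illustrative
> +pseudo-code only; the variable names below are not part of any driver API)::
> +
> +  /* deny DFX if the xclbin does not target the currently loaded Shell */
> +  if (!uuid_equal(&xclbin_parent_intf_uuid, &shell_child_intf_uuid))
> +          return -EINVAL;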
> +
> +xclbin loading is requested using the ICAP_DOWNLOAD_AXLF ioctl command. When loading an xclbin, the
> +xmgmt driver performs the following operations:
> +
> +1. Sanity check the xclbin contents
> +2. Isolate the User partition
> +3. Download the bitstream using the FPGA config engine (ICAP)
> +4. De-isolate the User partition
Is this modelled as bridges and regions?

> +5. Program the clocks (ClockWiz) driving the User partition
> +6. Wait for memory controller (MIG) calibration
> +
> +`Platform Loading Overview <https://xilinx.github.io/XRT/master/html/platforms_partitions.html>`_
> +provides more detailed information on platform loading.
> +
> +xsabin
> +------
> +
> +Each Alveo platform comes packaged with its own xsabin. The xsabin is a trusted component of the
> +platform. For format details refer to :ref:`xsabin/xclbin Container Format`. The xsabin contains
> +basic information like UUIDs and the platform name, plus metadata in the form of a device tree. See
> +:ref:`Device Tree Usage` for details and an example.
> +
> +xclbin
> +------
> +
> +The xclbin is compiled by the end user using the
> +`Vitis <https://www.xilinx.com/products/design-tools/vitis/vitis-platform.html>`_ tool set from
> +Xilinx. The xclbin contains sections describing user compiled acceleration engines/kernels, memory
> +subsystems, clocking information, etc. It also contains the bitstream for the User partition, UUIDs,
> +the platform name, etc. xclbin uses the same container format as xsabin, which is described below.
> +
> +
> +xsabin/xclbin Container Format
> +------------------------------
> +
> +xclbin/xsabin is an ELF-like binary container format. It is structured as a series of sections.
> +There is a file header followed by several section headers, which are followed by the sections.
> +Each section header points to an actual section. There is an optional signature at the end.
> +The format is defined by header file ``xclbin.h``. The following figure illustrates a
> +typical xclbin::
> +
> +
> +          +---------------------+
> +          |                     |
> +          |       HEADER        |
> +          +---------------------+
> +          |   SECTION  HEADER   |
> +          |                     |
> +          +---------------------+
> +          |         ...         |
> +          |                     |
> +          +---------------------+
> +          |   SECTION  HEADER   |
> +          |                     |
> +          +---------------------+
> +          |       SECTION       |
> +          |                     |
> +          +---------------------+
> +          |         ...         |
> +          |                     |
> +          +---------------------+
> +          |       SECTION       |
> +          |                     |
> +          +---------------------+
> +          |      SIGNATURE      |
> +          |      (OPTIONAL)     |
> +          +---------------------+
> +
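> +In C terms the layout is roughly as follows (simplified and illustrative only;
> +``xclbin.h`` is the authoritative definition and uses different field names)::
> +
> +  struct axlf_section_header {
> +          uint32_t section_kind;     /* type of the section */
> +          uint64_t section_offset;   /* offset from the start of the file */
> +          uint64_t section_size;     /* size of the section in bytes */
> +  };
> +
> +  struct axlf {
> +          char magic[8];             /* "xclbin2\0" */
> +          /* ... UUIDs, platform name, timestamps ... */
> +          uint32_t num_sections;
> +          struct axlf_section_header sections[];
> +  };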
> +
> +xclbin/xsabin files can be packaged, un-packaged and inspected using an XRT utility called
> +**xclbinutil**. xclbinutil is part of the XRT open source software stack. The source code for
> +xclbinutil can be found at https://github.com/Xilinx/XRT/tree/master/src/runtime_src/tools/xclbinutil
> +
> +For example, to enumerate the contents of an xclbin/xsabin, use the *--info* switch as shown
> +below::
> +
> +  xclbinutil --info --input /opt/xilinx/firmware/u50/gen3x16-xdma/blp/test/bandwidth.xclbin
> +  xclbinutil --info --input /lib/firmware/xilinx/862c7020a250293e32036f19956669e5/partition.xsabin
> +
> +
> +Device Tree Usage
> +-----------------
> +
> +As mentioned previously, the xsabin stores metadata which advertises the HW subsystems present in a
> +partition. The metadata is stored in device tree format with a well-defined schema. Subsystem
> +instantiations are captured as children of the ``addressable_endpoints`` node. Subsystem nodes have
> +standard attributes like ``reg``, ``interrupts`` etc. Additionally the nodes also have PCIe specific
> +attributes: ``pcie_physical_function`` and ``pcie_bar_mapping``. These identify which PCIe physical
> +function and which BAR space in that physical function the subsystem resides in. The XRT management
> +driver uses this information to bind *platform drivers* to the subsystem instantiations. The platform
> +drivers are found in the **xrt-lib.ko** kernel module described later. Below is an example of the
> +device tree for the Alveo U50
> +platform::

I might be missing something, but couldn't you structure the addressable
endpoints in a way that encode the physical function as a parent / child
relation?

What are the regs relative to?
> +
> +  /dts-v1/;
> +
> +  /{
> +	logic_uuid = "f465b0a3ae8c64f619bc150384ace69b";
> +
> +	schema_version {
> +		major = <0x01>;
> +		minor = <0x00>;
> +	};
> +
> +	interfaces {
> +
> +		@0 {
> +			interface_uuid = "862c7020a250293e32036f19956669e5";
> +		};
> +	};
> +
> +	addressable_endpoints {
> +
> +		ep_blp_rom_00 {
> +			reg = <0x00 0x1f04000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> +		};
> +
> +		ep_card_flash_program_00 {
> +			reg = <0x00 0x1f06000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_quad_spi-1.0\0axi_quad_spi";
> +			interrupts = <0x03 0x03>;
> +		};
> +
> +		ep_cmc_firmware_mem_00 {
> +			reg = <0x00 0x1e20000 0x00 0x20000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> +
> +			firmware {
> +				firmware_product_name = "cmc";
> +				firmware_branch_name = "u50";
> +				firmware_version_major = <0x01>;
> +				firmware_version_minor = <0x00>;
> +			};
> +		};
> +
> +		ep_cmc_intc_00 {
> +			reg = <0x00 0x1e03000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_intc-1.0\0axi_intc";
> +			interrupts = <0x04 0x04>;
> +		};
> +
> +		ep_cmc_mutex_00 {
> +			reg = <0x00 0x1e02000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> +		};
> +
> +		ep_cmc_regmap_00 {
> +			reg = <0x00 0x1e08000 0x00 0x2000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> +
> +			firmware {
> +				firmware_product_name = "sc-fw";
> +				firmware_branch_name = "u50";
> +				firmware_version_major = <0x05>;
> +			};
> +		};
> +
> +		ep_cmc_reset_00 {
> +			reg = <0x00 0x1e01000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> +		};
> +
> +		ep_ddr_mem_calib_00 {
> +			reg = <0x00 0x63000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> +		};
> +
> +		ep_debug_bscan_mgmt_00 {
> +			reg = <0x00 0x1e90000 0x00 0x10000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-debug_bridge-1.0\0debug_bridge";
> +		};
> +
> +		ep_ert_base_address_00 {
> +			reg = <0x00 0x21000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> +		};
> +
> +		ep_ert_command_queue_mgmt_00 {
> +			reg = <0x00 0x40000 0x00 0x10000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-ert_command_queue-1.0\0ert_command_queue";
> +		};
> +
> +		ep_ert_command_queue_user_00 {
> +			reg = <0x00 0x40000 0x00 0x10000>;
> +			pcie_physical_function = <0x01>;
> +			compatible = "xilinx.com,reg_abs-ert_command_queue-1.0\0ert_command_queue";
> +		};
> +
> +		ep_ert_firmware_mem_00 {
> +			reg = <0x00 0x30000 0x00 0x8000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> +
> +			firmware {
> +				firmware_product_name = "ert";
> +				firmware_branch_name = "v20";
> +				firmware_version_major = <0x01>;
> +			};
> +		};
> +
> +		ep_ert_intc_00 {
> +			reg = <0x00 0x23000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_intc-1.0\0axi_intc";
> +			interrupts = <0x05 0x05>;
> +		};
> +
> +		ep_ert_reset_00 {
> +			reg = <0x00 0x22000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> +		};
> +
> +		ep_ert_sched_00 {
> +			reg = <0x00 0x50000 0x00 0x1000>;
> +			pcie_physical_function = <0x01>;
> +			compatible = "xilinx.com,reg_abs-ert_sched-1.0\0ert_sched";
> +			interrupts = <0x09 0x0c>;
> +		};
> +
> +		ep_fpga_configuration_00 {
> +			reg = <0x00 0x1e88000 0x00 0x8000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_hwicap-1.0\0axi_hwicap";
> +			interrupts = <0x02 0x02>;
> +		};
> +
> +		ep_icap_reset_00 {
> +			reg = <0x00 0x1f07000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> +		};
> +
> +		ep_mailbox_mgmt_00 {
> +			reg = <0x00 0x1f10000 0x00 0x10000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-mailbox-1.0\0mailbox";
> +			interrupts = <0x00 0x00>;
> +		};
> +
> +		ep_mailbox_user_00 {
> +			reg = <0x00 0x1f00000 0x00 0x10000>;
> +			pcie_physical_function = <0x01>;
> +			compatible = "xilinx.com,reg_abs-mailbox-1.0\0mailbox";
> +			interrupts = <0x08 0x08>;
> +		};
> +
> +		ep_msix_00 {
> +			reg = <0x00 0x00 0x00 0x20000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-msix-1.0\0msix";
> +			pcie_bar_mapping = <0x02>;
> +		};
> +
> +		ep_pcie_link_mon_00 {
> +			reg = <0x00 0x1f05000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> +		};
> +
> +		ep_pr_isolate_plp_00 {
> +			reg = <0x00 0x1f01000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> +		};
> +
> +		ep_pr_isolate_ulp_00 {
> +			reg = <0x00 0x1000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> +		};
> +
> +		ep_uuid_rom_00 {
> +			reg = <0x00 0x64000 0x00 0x1000>;
> +			pcie_physical_function = <0x00>;
> +			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> +		};
> +
> +		ep_xdma_00 {
> +			reg = <0x00 0x00 0x00 0x10000>;
> +			pcie_physical_function = <0x01>;
> +			compatible = "xilinx.com,reg_abs-xdma-1.0\0xdma";
> +			pcie_bar_mapping = <0x02>;
> +		};
> +	};
> +
> +  }
> +
> +
> +
> +Deployment Models
> +=================
> +
> +Baremetal
> +---------
> +
> +In bare-metal deployments both MPF and UPF are visible and accessible. The xmgmt driver binds to
> +the MPF. xmgmt driver operations are privileged and available to the system administrator. The full
> +stack is illustrated below::
> +
> +
> +                            HOST
> +
> +                 [XMGMT]            [XUSER]
> +                    |                  |
> +                    |                  |
> +                 +-----+            +-----+
> +                 | MPF |            | UPF |
> +                 |     |            |     |
> +                 | PF0 |            | PF1 |
> +                 +--+--+            +--+--+
> +          ......... ^................. ^..........
> +                    |                  |
> +                    |   PCIe DEVICE    |
> +                    |                  |
> +                 +--+------------------+--+
> +                 |         SHELL          |
> +                 |                        |
> +                 +------------------------+
> +                 |         USER           |
> +                 |                        |
> +                 |                        |
> +                 |                        |
> +                 |                        |
> +                 +------------------------+
> +
> +
> +
> +Virtualized
> +-----------
> +
> +In virtualized deployments the privileged MPF is assigned to the host while the unprivileged UPF
> +is assigned to a guest VM via PCIe pass-through. The xmgmt driver in the host binds to the MPF.
> +xmgmt driver operations are privileged and only accessible to the hosting service provider.
> +The full stack is illustrated below::
> +
> +
> +                                 .............
> +                  HOST           .    VM     .
> +                                 .           .
> +                 [XMGMT]         .  [XUSER]  .
> +                    |            .     |     .
> +                    |            .     |     .
> +                 +-----+         .  +-----+  .
> +                 | MPF |         .  | UPF |  .
> +                 |     |         .  |     |  .
> +                 | PF0 |         .  | PF1 |  .
> +                 +--+--+         .  +--+--+  .
> +          ......... ^................. ^..........
> +                    |                  |
> +                    |   PCIe DEVICE    |
> +                    |                  |
> +                 +--+------------------+--+
> +                 |         SHELL          |
> +                 |                        |
> +                 +------------------------+
> +                 |         USER           |
> +                 |                        |
> +                 |                        |
> +                 |                        |
> +                 |                        |
> +                 +------------------------+
> +
> +
> +
> +Driver Modules
> +==============
> +
> +xrt-lib.ko
> +----------
> +
> +A repository of all subsystem drivers and pure software modules that can potentially
> +be shared between xmgmt and xuser. All these drivers are structured as Linux
> +*platform drivers* and are instantiated by xmgmt (or xuser in the future) based on
> +metadata associated with the hardware. The metadata is in the form of a device tree
> +as explained before.
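> +
> +As an illustration of the pattern (the names below are hypothetical, not the
> +actual driver sources), a subsystem driver in xrt-lib and its instantiation by
> +xmgmt follow the standard Linux platform bus model::
> +
> +  /* in xrt-lib.ko: one platform driver per subsystem type */
> +  static struct platform_driver xrt_gpio_driver = {
> +          .driver = { .name = "xrt_gpio" },
> +          .probe  = xrt_gpio_probe,
> +          .remove = xrt_gpio_remove,
> +  };
> +
> +  /* in xmgmt: instantiate an endpoint discovered in the device tree metadata */
> +  pdev = platform_device_register_data(parent, "xrt_gpio", PLATFORM_DEVID_AUTO,
> +                                        &ep_info, sizeof(ep_info));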
> +
> +xmgmt.ko
> +--------
> +
> +The xmgmt driver is a PCIe device driver driving the MPF found on Xilinx Alveo
> +PCIe devices. It consists of one *root* driver, one or more *partition* drivers
> +and one or more *leaf* drivers. The root and MPF specific leaf drivers are in
> +xmgmt.ko. The partition driver and other leaf drivers are in xrt-lib.ko.
> +
> +The instantiation of a specific partition driver or leaf driver is completely data
> +driven, based on metadata (mostly in device tree format) found through the VSEC
> +capability and inside firmware files, such as the xsabin or xclbin file. The root
> +driver manages the life cycle of multiple partition drivers, which, in turn, manage
> +multiple leaf drivers. This allows a single set of driver code to support all
> +kinds of subsystems exposed by different shells. The differences among all
> +these subsystems are handled in leaf drivers, with the root and partition drivers
> +being part of the infrastructure and providing common services for all leaves found
> +on all platforms.
> +
> +
> +xmgmt-root
> +^^^^^^^^^^
> +
> +The xmgmt-root driver is a PCIe device driver attaches to MPF. It's part of the
Nit: s/attaches/attached ?
> +infrastructure of the MPF driver and resides in xmgmt.ko. This driver
> +
> +* manages one or more partition drivers
> +* provides access to functionality that requires pci_dev, such as PCIe config
> +  space access, to other leaf drivers through parent calls
> +* together with the partition driver, facilitates event callbacks for other leaf drivers
> +* together with the partition driver, facilitates inter-leaf driver calls for other leaf
> +  drivers
> +
> +When the root driver starts, it will explicitly create an initial partition instance,
> +which contains leaf drivers that will trigger the creation of other partition
> +instances. The root driver will wait for all partitions and leaves to be created
> +before it returns from its probe routine and claims success of the initialization
> +of the entire xmgmt driver.
> +
> +partition
> +^^^^^^^^^
> +
> +The partition driver is a platform device driver whose life cycle is managed by
> +root and does not have real IO mem or IRQ resources. It's part of the
> +infrastructure of the MPF driver and resides in xrt-lib.ko. This driver
> +
> +* manages one or more leaf drivers so that multiple leaves can be managed as a group
> +* provides access to root from leaves, so that parent calls, event notifications
> +  and inter-leaf calls can happen
> +
> +In xmgmt, an initial partition driver instance will be created by root, which
> +contains leaves that will trigger partition instances to be created to manage
> +groups of leaves found on different partitions on hardware, such as VSEC, Shell,
> +and User.
> +
> +leaves
> +^^^^^^
> +
> +A leaf driver is a platform device driver whose life cycle is managed by
> +a partition driver and may or may not have real IO mem or IRQ resources. Leaf
> +drivers are the real meat of xmgmt and contain platform specific code for the
> +Shell and User found on an MPF.
> +
> +A leaf driver may not have real hardware resources when it merely acts as a driver
> +that manages certain in-memory states for xmgmt. These in-memory states could be
> +shared by multiple other leaves.
> +
> +Leaf drivers assigned to specific hardware resources drive a specific subsystem in
> +the device. To manipulate the subsystem or carry out a task, a leaf driver may ask
> +for help from the root via parent calls and/or from other leaves via inter-leaf calls.
> +
> +A leaf can also broadcast events through infrastructure code for other leaves
> +to process. It can also receive event notification from infrastructure about certain
> +events, such as post-creation or pre-exit of a particular leaf.
> +
> +
> +Driver Interfaces
> +=================
> +
> +xmgmt Driver Ioctls
> +-------------------
> +
> +Ioctls exposed by the xmgmt driver to user space are enumerated in the following table:
> +
> +== ===================== ============================= ===========================
> +#  Functionality         ioctl request code            data format
> +== ===================== ============================= ===========================
> +1  FPGA image download   XMGMT_IOCICAPDOWNLOAD_AXLF    xmgmt_ioc_bitstream_axlf
> +2  CL frequency scaling  XMGMT_IOCFREQSCALE            xmgmt_ioc_freqscaling
> +== ===================== ============================= ===========================
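> +
> +A minimal user-space sketch of the FPGA image download ioctl (illustrative only;
> +the device node path is hypothetical and ``xmgmt_ioc_bitstream_axlf`` is assumed
> +to carry a pointer to an xclbin image already read into memory)::
> +
> +  #include <fcntl.h>
> +  #include <sys/ioctl.h>
> +
> +  int fd = open("/dev/xmgmt_example0", O_RDWR);   /* hypothetical node name */
> +  struct xmgmt_ioc_bitstream_axlf req = { .xclbin = axlf_image };
> +
> +  if (ioctl(fd, XMGMT_IOCICAPDOWNLOAD_AXLF, &req) == -1)
> +          perror("xclbin download");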
> +
> +xmgmt Driver Sysfs
> +------------------
> +
> +The xmgmt driver exposes a rich set of sysfs interfaces. Subsystem platform drivers
> +export a sysfs node for every platform instance.
> +
> +Every partition also exports its UUIDs. See below for examples::
> +
> +  /sys/bus/pci/devices/0000:06:00.0/xmgmt_main.0/interface_uuids
> +  /sys/bus/pci/devices/0000:06:00.0/xmgmt_main.0/logic_uuids
> +
> +
> +hwmon
> +-----
> +
> +The xmgmt driver exposes the standard hwmon interface to report voltage, current, temperature,
> +power, etc. These can easily be viewed using the *sensors* command line utility.
> +
> +
> +mailbox
> +-------
> +
> +xmgmt communicates with the user physical function driver via a HW mailbox. Mailbox opcodes
> +are defined in ``mailbox_proto.h``. `Mailbox Inter-domain Communication Protocol
> +<https://xilinx.github.io/XRT/master/html/mailbox.proto.html>`_ defines the full
> +specification. xmgmt implements a subset of the specification. It provides the following
> +services to the UPF driver:
> +
> +1.  Respond to *are you there* requests, including determining if the two drivers are
> +    running in the same OS domain
> +2.  Provide sensor readings, loaded xclbin UUID, clock frequency, shell information, etc.
> +3.  Perform PCIe hot reset
> +4.  Download user compiled xclbin

Is this gonna use the mailbox framework?

> +
> +
> +Platform Security Considerations
> +================================
> +
> +`Security of Alveo Platform <https://xilinx.github.io/XRT/master/html/security.html>`_
> +discusses the deployment options and security implications in great detail.
> --
> 2.17.1

That's a lot of text, I'll have to read it again most likely,

- Moritz
Max Zhen Dec. 2, 2020, 9:24 p.m. UTC | #2
Hi Moritz,

Thanks for your feedback. Please see my reply inline.

Thanks,
-Max

> -----Original Message-----
> From: Moritz Fischer <mdf@kernel.org>
> Sent: Monday, November 30, 2020 20:55
> To: Sonal Santan <sonals@xilinx.com>
> Cc: linux-kernel@vger.kernel.org; linux-fpga@vger.kernel.org; Max Zhen
> <maxz@xilinx.com>; Lizhi Hou <lizhih@xilinx.com>; Michal Simek
> <michals@xilinx.com>; Stefano Stabellini <stefanos@xilinx.com>;
> devicetree@vger.kernel.org
> Subject: Re: [PATCH Xilinx Alveo 1/8] Documentation: fpga: Add a document
> describing Alveo XRT drivers
> 
> 
> On Sat, Nov 28, 2020 at 04:00:33PM -0800, Sonal Santan wrote:
> > From: Sonal Santan <sonal.santan@xilinx.com>
> >
> > Describe Alveo XRT driver architecture and provide basic overview of
> > Xilinx Alveo platform.
> >
> > Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
> > ---
> >  Documentation/fpga/index.rst |   1 +
> >  Documentation/fpga/xrt.rst   | 588
> +++++++++++++++++++++++++++++++++++
> >  2 files changed, 589 insertions(+)
> >  create mode 100644 Documentation/fpga/xrt.rst
> >
> > diff --git a/Documentation/fpga/index.rst
> > b/Documentation/fpga/index.rst index f80f95667ca2..30134357b70d
> 100644
> > --- a/Documentation/fpga/index.rst
> > +++ b/Documentation/fpga/index.rst
> > @@ -8,6 +8,7 @@ fpga
> >      :maxdepth: 1
> >
> >      dfl
> > +    xrt
> >
> >  .. only::  subproject and html
> >
> > diff --git a/Documentation/fpga/xrt.rst b/Documentation/fpga/xrt.rst
> > new file mode 100644 index 000000000000..9f37d46459b0
> > --- /dev/null
> > +++ b/Documentation/fpga/xrt.rst
> > @@ -0,0 +1,588 @@
> > +==================================
> > +XRTV2 Linux Kernel Driver Overview
> > +==================================
> > +
> > +XRTV2 drivers are second generation `XRT
> > +<https://github.com/Xilinx/XRT>`_ drivers which support `Alveo
> > +<https://www.xilinx.com/products/boards-and-kits/alveo.html>`_ PCIe
> platforms from Xilinx.
> > +
> > +XRTV2 drivers support *subsystem* style data driven platforms where
> > +driver's configuration and behavior is determined by meta data provided
> by platform (in *device tree* format).
> > +Primary management physical function (MPF) driver is called
> > +**xmgmt**. Primary user physical function (UPF) driver is called
> > +**xuser** and HW subsystem drivers are packaged into a library module
> called **xrt-lib**, which is shared by **xmgmt** and **xuser** (WIP).
> WIP?

Working in progress. I'll expand it in the doc.

> > +
> > +Alveo Platform Overview
> > +=======================
> > +
> > +Alveo platforms are architected as two physical FPGA partitions:
> > +*Shell* and *User*. Shell
> Nit: The Shell provides ...

Sure. Will fix.

> > +provides basic infrastructure for the Alveo platform like PCIe
> > +connectivity, board management, Dynamic Function Exchange (DFX),
> > +sensors, clocking, reset, and security. User partition contains user
> compiled binary which is loaded by a process called DFX also known as partial
> reconfiguration.
> > +
> > +Physical partitions require strict HW compatibility with each other for DFX
> to work properly.
> > +Every physical partition has two interface UUIDs: *parent* UUID and
> > +*child* UUID. For simple single stage platforms Shell → User forms
> > +parent child relationship. For complex two stage platforms Base → Shell
> → User forms the parent child relationship chain.
> > +
> > +.. note::
> > +   Partition compatibility matching is key design component of Alveo
> platforms and XRT. Partitions
> > +   have child and parent relationship. A loaded partition exposes child
> partition UUID to advertise
> > +   its compatibility requirement for child partition. When loading a child
> partition the xmgmt
> > +   management driver matches parent UUID of the child partition against
> child UUID exported by the
> > +   parent. Parent and child partition UUIDs are stored in the *xclbin* (for
> user) or *xsabin* (for
> > +   base and shell). Except for root UUID, VSEC, hardware itself does not
> know about UUIDs. UUIDs are
> > +   stored in xsabin and xclbin.
> > +
> > +
> > +The physical partitions and their loading is illustrated below::
> > +
> > +            SHELL                               USER
> > +        +-----------+                  +-------------------+
> > +        |           |                  |                   |
> > +        | VSEC UUID | CHILD     PARENT |    LOGIC UUID     |
> > +        |           o------->|<--------o                   |
> > +        |           | UUID       UUID  |                   |
> > +        +-----+-----+                  +--------+----------+
> > +              |                                 |
> > +              .                                 .
> > +              |                                 |
> > +          +---+---+                      +------+--------+
> > +          |  POR  |                      | USER COMPILED |
> > +          | FLASH |                      |    XCLBIN     |
> > +          +-------+                      +---------------+
> > +
> > +
> > +Loading Sequence
> > +----------------
> > +
> > +Shell partition is loaded from flash at system boot time. It
> > +establishes the PCIe link and exposes
> Nit: The Shell

Will fix.

> > +two physical functions to the BIOS. After OS boot, xmgmt driver
> > +attaches to PCIe physical function
> > +0 exposed by the Shell and then looks for VSEC in PCIe extended
> > +configuration space. Using VSEC it determines the logic UUID of Shell
> > +and uses the UUID to load matching *xsabin* file from Linux firmware
> > +directory. The xsabin file contains metadata to discover peripherals that
> are part of Shell and firmware(s) for any embedded soft processors in Shell.
> 
> Neat.

Thanks :-).

> > +
> > +Shell exports child interface UUID which is used for compatibility
> > +check when loading user compiled
> Nit: The Shell

Sure.

> > +xclbin over the User partition as part of DFX. When a user requests
> > +loading of a specific xclbin the xmgmt management driver reads the
> > +parent interface UUID specified in the xclbin and matches it with
> > +child interface UUID exported by Shell to determine if xclbin is compatible
> with the Shell. If match fails loading of xclbin is denied.
> > +
> > +xclbin loading is requested using ICAP_DOWNLOAD_AXLF ioctl command.
> > +When loading xclbin xmgmt driver performs the following operations:
> > +
> > +1. Sanity check the xclbin contents
> > +2. Isolate the User partition
> > +3. Download the bitstream using the FPGA config engine (ICAP)
> > +4. De-isolate the User partition
> Is this modelled as bridges and regions?

Alveo drivers as written today do not use the fpga bridge and region framework. It seems that if we add support for that framework, it would be possible to receive a PR program request from the kernel outside of the xmgmt driver? Currently, we can’t support this and PR programming can only be initiated using XRT’s runtime API in user space.

Or maybe we have missed some points about the use case for this framework?
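
To make it concrete: modelling this on the in-kernel framework would roughly mean
the ICAP backs an fpga_manager, the PR isolation gates (ep_pr_isolate_ulp_00) back
an fpga_bridge and the User partition becomes an fpga_region. Something along
these lines (sketch only, the xmgmt_icap_* helpers are hypothetical):

    static const struct fpga_manager_ops xmgmt_icap_ops = {
            .write_init     = xmgmt_icap_write_init,     /* sanity check, isolate  */
            .write          = xmgmt_icap_write,          /* stream bits into ICAP  */
            .write_complete = xmgmt_icap_write_complete, /* de-isolate, clocks, MIG */
    };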

> 
> > +5. Program the clocks (ClockWiz) driving the User partition
> > +6. Wait for memory controller (MIG) calibration
> > +
> > +`Platform Loading Overview
> > +<https://xilinx.github.io/XRT/master/html/platforms_partitions.html>`
> > +_ provides more detailed information on platform loading.
> > +
> > +xsabin
> > +------
> > +
> > +Each Alveo platform comes packaged with its own xsabin. The xsabin is
> > +trusted component of the platform. For format details refer to
> > +:ref:`xsabin/xclbin Container Format`. xsabin contains basic
> > +information like UUIDs, platform name and metadata in the form of
> device tree. See :ref:`Device Tree Usage` for details and example.
> > +
> > +xclbin
> > +------
> > +
> > +xclbin is compiled by end user using
> > +`Vitis
> > +<https://www.xilinx.com/products/design-tools/vitis/vitis-platform.ht
> > +ml>`_ tool set from Xilinx. The xclbin contains sections describing
> > +user compiled acceleration engines/kernels, memory subsystems,
> clocking information etc. It also contains bitstream for the user partition,
> UUIDs, platform name, etc. xclbin uses the same container format as xsabin
> which is described below.
> > +
> > +
> > +xsabin/xclbin Container Format
> > +------------------------------
> > +
> > +xclbin/xsabin is ELF-like binary container format. It is structured as series
> of sections.
> > +There is a file header followed by several section headers which is
> followed by sections.
> > +A section header points to an actual section. There is an optional signature
> at the end.
> > +The format is defined by header file ``xclbin.h``. The following
> > +figure illustrates a typical xclbin::
> > +
> > +
> > +          +---------------------+
> > +          |                     |
> > +          |       HEADER        |
> > +          +---------------------+
> > +          |   SECTION  HEADER   |
> > +          |                     |
> > +          +---------------------+
> > +          |         ...         |
> > +          |                     |
> > +          +---------------------+
> > +          |   SECTION  HEADER   |
> > +          |                     |
> > +          +---------------------+
> > +          |       SECTION       |
> > +          |                     |
> > +          +---------------------+
> > +          |         ...         |
> > +          |                     |
> > +          +---------------------+
> > +          |       SECTION       |
> > +          |                     |
> > +          +---------------------+
> > +          |      SIGNATURE      |
> > +          |      (OPTIONAL)     |
> > +          +---------------------+
> > +
> > +
> > +xclbin/xsabin files can be packaged, un-packaged and inspected using
> > +XRT utility called **xclbinutil**. xclbinutil is part of XRT open
> > +source software stack. The source code for xclbinutil can be found at
> > +https://github.com/Xilinx/XRT/tree/master/src/runtime_src/tools/xclbi
> > +nutil
> > +
> > +For example to enumerate the contents of a xclbin/xsabin use the
> > +*--info* switch as shown
> > +below::
> > +
> > +  xclbinutil --info --input
> > + /opt/xilinx/firmware/u50/gen3x16-xdma/blp/test/bandwidth.xclbin
> > +  xclbinutil --info --input
> > + /lib/firmware/xilinx/862c7020a250293e32036f19956669e5/partition.xsab
> > + in
> > +
> > +
> > +Device Tree Usage
> > +-----------------
> > +
> > +As mentioned previously xsabin stores metadata which advertise HW
> subsystems present in a partition.
> > +The metadata is stored in device tree format with well defined
> > +schema. Subsystem instantiations are captured as children of
> > +``addressable_endpoints`` node. Subsystem nodes have standard
> attributes like ``reg``, ``interrupts`` etc. Additionally the nodes also have PCIe
> specific attributes:
> > +``pcie_physical_function`` and ``pcie_bar_mapping``. These identify
> > +which PCIe physical function and which BAR space in that physical
> > +function the subsystem resides. XRT management driver uses this
> > +information to bind *platform drivers* to the subsystem
> > +instantiations. The platform drivers are found in **xrt-lib.ko**
> > +kernel module defined later. Below is an example of device tree for
> > +Alveo U50
> > +platform::
> 
> I might be missing something, but couldn't you structure the addressable
> endpoints in a way that encode the physical function as a parent / child
> relation?

Alveo driver does not generate the metadata. The metadata is formatted and generated by HW tools when the Alveo HW platform is built. 

> 
> What are the regs relative to?

Regs indicate the offset of the registers on the PCIe BAR of the Alveo device.
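
For example (illustrative only, assuming an endpoint without a pcie_bar_mapping
property defaults to BAR 0), ep_blp_rom_00 with reg = <0x00 0x1f04000 0x00 0x1000>
is a 0x1000-byte window at offset 0x1f04000 of the PF0 BAR, i.e. something a leaf
driver could map with:

    void __iomem *regs = pci_iomap_range(pdev, 0, 0x1f04000, 0x1000);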

> > +
> > +  /dts-v1/;
> > +
> > +  /{
> > +     logic_uuid = "f465b0a3ae8c64f619bc150384ace69b";
> > +
> > +     schema_version {
> > +             major = <0x01>;
> > +             minor = <0x00>;
> > +     };
> > +
> > +     interfaces {
> > +
> > +             @0 {
> > +                     interface_uuid = "862c7020a250293e32036f19956669e5";
> > +             };
> > +     };
> > +
> > +     addressable_endpoints {
> > +
> > +             ep_blp_rom_00 {
> > +                     reg = <0x00 0x1f04000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_bram_ctrl-
> 1.0\0axi_bram_ctrl";
> > +             };
> > +
> > +             ep_card_flash_program_00 {
> > +                     reg = <0x00 0x1f06000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_quad_spi-
> 1.0\0axi_quad_spi";
> > +                     interrupts = <0x03 0x03>;
> > +             };
> > +
> > +             ep_cmc_firmware_mem_00 {
> > +                     reg = <0x00 0x1e20000 0x00 0x20000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible =
> > + "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> > +
> > +                     firmware {
> > +                             firmware_product_name = "cmc";
> > +                             firmware_branch_name = "u50";
> > +                             firmware_version_major = <0x01>;
> > +                             firmware_version_minor = <0x00>;
> > +                     };
> > +             };
> > +
> > +             ep_cmc_intc_00 {
> > +                     reg = <0x00 0x1e03000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_intc-1.0\0axi_intc";
> > +                     interrupts = <0x04 0x04>;
> > +             };
> > +
> > +             ep_cmc_mutex_00 {
> > +                     reg = <0x00 0x1e02000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > +             };
> > +
> > +             ep_cmc_regmap_00 {
> > +                     reg = <0x00 0x1e08000 0x00 0x2000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible =
> > + "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> > +
> > +                     firmware {
> > +                             firmware_product_name = "sc-fw";
> > +                             firmware_branch_name = "u50";
> > +                             firmware_version_major = <0x05>;
> > +                     };
> > +             };
> > +
> > +             ep_cmc_reset_00 {
> > +                     reg = <0x00 0x1e01000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > +             };
> > +
> > +             ep_ddr_mem_calib_00 {
> > +                     reg = <0x00 0x63000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > +             };
> > +
> > +             ep_debug_bscan_mgmt_00 {
> > +                     reg = <0x00 0x1e90000 0x00 0x10000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-debug_bridge-
> 1.0\0debug_bridge";
> > +             };
> > +
> > +             ep_ert_base_address_00 {
> > +                     reg = <0x00 0x21000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > +             };
> > +
> > +             ep_ert_command_queue_mgmt_00 {
> > +                     reg = <0x00 0x40000 0x00 0x10000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-ert_command_queue-
> 1.0\0ert_command_queue";
> > +             };
> > +
> > +             ep_ert_command_queue_user_00 {
> > +                     reg = <0x00 0x40000 0x00 0x10000>;
> > +                     pcie_physical_function = <0x01>;
> > +                     compatible = "xilinx.com,reg_abs-ert_command_queue-
> 1.0\0ert_command_queue";
> > +             };
> > +
> > +             ep_ert_firmware_mem_00 {
> > +                     reg = <0x00 0x30000 0x00 0x8000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible =
> > + "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> > +
> > +                     firmware {
> > +                             firmware_product_name = "ert";
> > +                             firmware_branch_name = "v20";
> > +                             firmware_version_major = <0x01>;
> > +                     };
> > +             };
> > +
> > +             ep_ert_intc_00 {
> > +                     reg = <0x00 0x23000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_intc-1.0\0axi_intc";
> > +                     interrupts = <0x05 0x05>;
> > +             };
> > +
> > +             ep_ert_reset_00 {
> > +                     reg = <0x00 0x22000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > +             };
> > +
> > +             ep_ert_sched_00 {
> > +                     reg = <0x00 0x50000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x01>;
> > +                     compatible = "xilinx.com,reg_abs-ert_sched-1.0\0ert_sched";
> > +                     interrupts = <0x09 0x0c>;
> > +             };
> > +
> > +             ep_fpga_configuration_00 {
> > +                     reg = <0x00 0x1e88000 0x00 0x8000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_hwicap-1.0\0axi_hwicap";
> > +                     interrupts = <0x02 0x02>;
> > +             };
> > +
> > +             ep_icap_reset_00 {
> > +                     reg = <0x00 0x1f07000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > +             };
> > +
> > +             ep_mailbox_mgmt_00 {
> > +                     reg = <0x00 0x1f10000 0x00 0x10000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-mailbox-1.0\0mailbox";
> > +                     interrupts = <0x00 0x00>;
> > +             };
> > +
> > +             ep_mailbox_user_00 {
> > +                     reg = <0x00 0x1f00000 0x00 0x10000>;
> > +                     pcie_physical_function = <0x01>;
> > +                     compatible = "xilinx.com,reg_abs-mailbox-1.0\0mailbox";
> > +                     interrupts = <0x08 0x08>;
> > +             };
> > +
> > +             ep_msix_00 {
> > +                     reg = <0x00 0x00 0x00 0x20000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-msix-1.0\0msix";
> > +                     pcie_bar_mapping = <0x02>;
> > +             };
> > +
> > +             ep_pcie_link_mon_00 {
> > +                     reg = <0x00 0x1f05000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > +             };
> > +
> > +             ep_pr_isolate_plp_00 {
> > +                     reg = <0x00 0x1f01000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > +             };
> > +
> > +             ep_pr_isolate_ulp_00 {
> > +                     reg = <0x00 0x1000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > +             };
> > +
> > +             ep_uuid_rom_00 {
> > +                     reg = <0x00 0x64000 0x00 0x1000>;
> > +                     pcie_physical_function = <0x00>;
> > +                     compatible = "xilinx.com,reg_abs-axi_bram_ctrl-
> 1.0\0axi_bram_ctrl";
> > +             };
> > +
> > +             ep_xdma_00 {
> > +                     reg = <0x00 0x00 0x00 0x10000>;
> > +                     pcie_physical_function = <0x01>;
> > +                     compatible = "xilinx.com,reg_abs-xdma-1.0\0xdma";
> > +                     pcie_bar_mapping = <0x02>;
> > +             };
> > +     };
> > +
> > +  }
> > +
> > +
> > +
> > +Deployment Models
> > +=================
> > +
> > +Baremetal
> > +---------
> > +
> > +In bare-metal deployments both MPF and UPF are visible and
> > +accessible. xmgmt driver binds to MPF. xmgmt driver operations are
> > +privileged and available to system administrator. The full stack is
> illustrated below::
> > +
> > +
> > +                            HOST
> > +
> > +                 [XMGMT]            [XUSER]
> > +                    |                  |
> > +                    |                  |
> > +                 +-----+            +-----+
> > +                 | MPF |            | UPF |
> > +                 |     |            |     |
> > +                 | PF0 |            | PF1 |
> > +                 +--+--+            +--+--+
> > +          ......... ^................. ^..........
> > +                    |                  |
> > +                    |   PCIe DEVICE    |
> > +                    |                  |
> > +                 +--+------------------+--+
> > +                 |         SHELL          |
> > +                 |                        |
> > +                 +------------------------+
> > +                 |         USER           |
> > +                 |                        |
> > +                 |                        |
> > +                 |                        |
> > +                 |                        |
> > +                 +------------------------+
> > +
> > +
> > +
> > +Virtualized
> > +-----------
> > +
> > +In virtualized deployments privileged MPF is assigned to host but
> > +unprivileged UPF is assigned to guest VM via PCIe pass-through. xmgmt
> driver in host binds to MPF.
> > +xmgmt driver operations are privileged and only accessible by hosting
> service provider.
> > +The full stack is illustrated below::
> > +
> > +
> > +                                 .............
> > +                  HOST           .    VM     .
> > +                                 .           .
> > +                 [XMGMT]         .  [XUSER]  .
> > +                    |            .     |     .
> > +                    |            .     |     .
> > +                 +-----+         .  +-----+  .
> > +                 | MPF |         .  | UPF |  .
> > +                 |     |         .  |     |  .
> > +                 | PF0 |         .  | PF1 |  .
> > +                 +--+--+         .  +--+--+  .
> > +          ......... ^................. ^..........
> > +                    |                  |
> > +                    |   PCIe DEVICE    |
> > +                    |                  |
> > +                 +--+------------------+--+
> > +                 |         SHELL          |
> > +                 |                        |
> > +                 +------------------------+
> > +                 |         USER           |
> > +                 |                        |
> > +                 |                        |
> > +                 |                        |
> > +                 |                        |
> > +                 +------------------------+
> > +
> > +
> > +
> > +Driver Modules
> > +==============
> > +
> > +xrt-lib.ko
> > +----------
> > +
> > +Repository of all subsystem drivers and pure software modules that
> > +can potentially be shared between xmgmt and xuser. All these drivers
> > +are structured as Linux *platform driver* and are instantiated by
> > +xmgmt (or xuser in future) based on meta data associated with
> > +hardware. The metadata is in the form of device tree as explained before.
> > +
> > +xmgmt.ko
> > +--------
> > +
> > +The xmgmt driver is a PCIe device driver driving MPF found on
> > +Xilinx's Alveo PCIE device. It consists of one *root* driver, one or
> > +more *partition* drivers and one or more *leaf* drivers. The root and
> > +MPF specific leaf drivers are in xmgmt.ko. The partition driver and other
> leaf drivers are in xrt-lib.ko.
> > +
> > +The instantiation of specific partition driver or leaf driver is
> > +completely data driven based on meta data (mostly in device tree
> > +format) found through VSEC capability and inside firmware files, such
> > +as xsabin or xclbin file. The root driver manages life cycle of
> > +multiple partition drivers, which, in turn, manages multiple leaf
> > +drivers. This allows a single set of driver code to support all kinds
> > +of subsystems exposed by different shells. The difference among all
> > +these subsystems will be handled in leaf drivers with root and
> > +partition drivers being part of the infrastructure and provide common
> services for all leaves found on all platforms.
> > +
> > +
> > +xmgmt-root
> > +^^^^^^^^^^
> > +
> > +The xmgmt-root driver is a PCIe device driver attaches to MPF. It's
> > +part of the
> Nit: s/attaches/attached ?

Yes, sure.

> > > +infrastructure of the MPF driver and resides in xmgmt.ko.
> > [...]
> > > +
> > +
> > +mailbox
> > +-------
> > +
> > +xmgmt communicates with user physical function driver via HW mailbox.
> > +Mailbox opcodes are defined in ``mailbox_proto.h``. `Mailbox
> > +Inter-domain Communication Protocol
> > +<https://xilinx.github.io/XRT/master/html/mailbox.proto.html>`_
> > +defines the full specification. xmgmt implements subset of the
> specification. It provides the following services to the UPF driver:
> > +
> > +1.  Responding to *are you there* request including determining if the
> two drivers are
> > +    running in the same OS domain
> > +2.  Provide sensor readings, loaded xclbin UUID, clock frequency, shell
> information, etc.
> > +3.  Perform PCIe hot reset
> > +4.  Download user compiled xclbin
> 
> Is this gonna use the mailbox framework?

The xclbin can be downloaded via IOCTL interface of xmgmt driver.
Or the download request can come from user pf driver via mailbox, yes.

Thanks,
Max

> 
> > +
> > +
> > +Platform Security Considerations
> > +================================
> > +
> > +`Security of Alveo Platform
> > +<https://xilinx.github.io/XRT/master/html/security.html>`_
> > +discusses the deployment options and security implications in great detail.
> > --
> > 2.17.1
> 
> That's a lot of text, I'll have to read it again most likely,
> 
> - Moritz
Moritz Fischer Dec. 2, 2020, 11:10 p.m. UTC | #3
Hi Max,

On Wed, Dec 02, 2020 at 09:24:29PM +0000, Max Zhen wrote:
> Hi Moritz,
> 
> Thanks for your feedback. Please see my reply inline.
> 
> Thanks,
> -Max
> 
> > -----Original Message-----
> > From: Moritz Fischer <mdf@kernel.org>
> > Sent: Monday, November 30, 2020 20:55
> > To: Sonal Santan <sonals@xilinx.com>
> > Cc: linux-kernel@vger.kernel.org; linux-fpga@vger.kernel.org; Max Zhen
> > <maxz@xilinx.com>; Lizhi Hou <lizhih@xilinx.com>; Michal Simek
> > <michals@xilinx.com>; Stefano Stabellini <stefanos@xilinx.com>;
> > devicetree@vger.kernel.org
> > Subject: Re: [PATCH Xilinx Alveo 1/8] Documentation: fpga: Add a document
> > describing Alveo XRT drivers
> > 
> > 
> > On Sat, Nov 28, 2020 at 04:00:33PM -0800, Sonal Santan wrote:
> > > From: Sonal Santan <sonal.santan@xilinx.com>
> > >
> > > Describe Alveo XRT driver architecture and provide basic overview of
> > > Xilinx Alveo platform.
> > >
> > > Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
> > > ---
> > >  Documentation/fpga/index.rst |   1 +
> > >  Documentation/fpga/xrt.rst   | 588
> > +++++++++++++++++++++++++++++++++++
> > >  2 files changed, 589 insertions(+)
> > >  create mode 100644 Documentation/fpga/xrt.rst
> > >
> > > diff --git a/Documentation/fpga/index.rst
> > > b/Documentation/fpga/index.rst index f80f95667ca2..30134357b70d
> > 100644
> > > --- a/Documentation/fpga/index.rst
> > > +++ b/Documentation/fpga/index.rst
> > > @@ -8,6 +8,7 @@ fpga
> > >      :maxdepth: 1
> > >
> > >      dfl
> > > +    xrt
> > >
> > >  .. only::  subproject and html
> > >
> > > diff --git a/Documentation/fpga/xrt.rst b/Documentation/fpga/xrt.rst
> > > new file mode 100644 index 000000000000..9f37d46459b0
> > > --- /dev/null
> > > +++ b/Documentation/fpga/xrt.rst
> > > @@ -0,1 +1,588 @@
> > > +==================================
> > > +XRTV2 Linux Kernel Driver Overview
> > > +==================================
> > > +
> > > +XRTV2 drivers are second generation `XRT
> > > +<https://github.com/Xilinx/XRT>`_ drivers which support `Alveo
> > > +<https://www.xilinx.com/products/boards-and-kits/alveo.html>`_ PCIe
> > platforms from Xilinx.
> > > +
> > > +XRTV2 drivers support *subsystem* style data driven platforms where
> > > +driver's configuration and behavior is determined by meta data provided
> > by platform (in *device tree* format).
> > > +Primary management physical function (MPF) driver is called
> > > +**xmgmt**. Primary user physical function (UPF) driver is called
> > > +**xuser** and HW subsystem drivers are packaged into a library module
> > called **xrt-lib**, which is shared by **xmgmt** and **xuser** (WIP).
> > WIP?
> 
> Working in progress. I'll expand it in the doc.
> 
> > > +
> > > +Alveo Platform Overview
> > > +=======================
> > > +
> > > +Alveo platforms are architected as two physical FPGA partitions:
> > > +*Shell* and *User*. Shell
> > Nit: The Shell provides ...
> 
> Sure. Will fix.
> 
> > > +provides basic infrastructure for the Alveo platform like PCIe
> > > +connectivity, board management, Dynamic Function Exchange (DFX),
> > > +sensors, clocking, reset, and security. User partition contains user
> > compiled binary which is loaded by a process called DFX also known as partial
> > reconfiguration.
> > > +
> > > +Physical partitions require strict HW compatibility with each other for DFX
> > to work properly.
> > > +Every physical partition has two interface UUIDs: *parent* UUID and
> > > +*child* UUID. For simple single stage platforms Shell → User forms
> > > +parent child relationship. For complex two stage platforms Base → Shell
> > → User forms the parent child relationship chain.
> > > +
> > > +.. note::
> > > +   Partition compatibility matching is key design component of Alveo
> > platforms and XRT. Partitions
> > > +   have child and parent relationship. A loaded partition exposes child
> > partition UUID to advertise
> > > +   its compatibility requirement for child partition. When loading a child
> > partition the xmgmt
> > > +   management driver matches parent UUID of the child partition against
> > child UUID exported by the
> > > +   parent. Parent and child partition UUIDs are stored in the *xclbin* (for
> > user) or *xsabin* (for
> > > +   base and shell). Except for root UUID, VSEC, hardware itself does not
> > know about UUIDs. UUIDs are
> > > +   stored in xsabin and xclbin.
> > > +
> > > +
> > > +The physical partitions and their loading is illustrated below::
> > > +
> > > +            SHELL                               USER
> > > +        +-----------+                  +-------------------+
> > > +        |           |                  |                   |
> > > +        | VSEC UUID | CHILD     PARENT |    LOGIC UUID     |
> > > +        |           o------->|<--------o                   |
> > > +        |           | UUID       UUID  |                   |
> > > +        +-----+-----+                  +--------+----------+
> > > +              |                                 |
> > > +              .                                 .
> > > +              |                                 |
> > > +          +---+---+                      +------+--------+
> > > +          |  POR  |                      | USER COMPILED |
> > > +          | FLASH |                      |    XCLBIN     |
> > > +          +-------+                      +---------------+
> > > +
> > > +
> > > +Loading Sequence
> > > +----------------
> > > +
> > > +Shell partition is loaded from flash at system boot time. It
> > > +establishes the PCIe link and exposes
> > Nit: The Shell
> 
> Will fix.
> 
> > > +two physical functions to the BIOS. After OS boot, xmgmt driver
> > > +attaches to PCIe physical function
> > > +0 exposed by the Shell and then looks for VSEC in PCIe extended
> > > +configuration space. Using VSEC it determines the logic UUID of Shell
> > > +and uses the UUID to load matching *xsabin* file from Linux firmware
> > > +directory. The xsabin file contains metadata to discover peripherals that
> > are part of Shell and firmware(s) for any embedded soft processors in Shell.
> > 
> > Neat.
> 
> Thanks :-).
> 
> > > +
> > > +Shell exports child interface UUID which is used for compatibility
> > > +check when loading user compiled
> > Nit: The Shell
> 
> Sure.
> 
> > > +xclbin over the User partition as part of DFX. When a user requests
> > > +loading of a specific xclbin the xmgmt management driver reads the
> > > +parent interface UUID specified in the xclbin and matches it with
> > > +child interface UUID exported by Shell to determine if xclbin is compatible
> > with the Shell. If match fails loading of xclbin is denied.
> > > +
> > > +xclbin loading is requested using ICAP_DOWNLOAD_AXLF ioctl command.
> > > +When loading xclbin xmgmt driver performs the following operations:
> > > +
> > > +1. Sanity check the xclbin contents
> > > +2. Isolate the User partition
> > > +3. Download the bitstream using the FPGA config engine (ICAP) 4.
> > > +De-isolate the User partition
> > Is this modelled as bridges and regions?
> 
> Alveo drivers as written today do not use fpga bridge and region framework. It seems that if we add support for that framework, it’s possible to receive PR program request from kernel outside of xmgmt driver? Currently, we can’t support this and PR program can only be initiated using XRT’s runtime API in user space.

I'm not 100% sure I understand the concern here, let me reply to what I
think I understand:

You're worried that if you use FPGA region as interface to accept PR
requests something else could attempt to reconfigure the region from
within the kernel using the FPGA Region API?

Assuming I got this right, I don't think this is a big deal. When you
create the regions you control who gets the references to it. 

From what I've seen so far Regions seem to be roughly equivalent to
Partitions, hence my surprise to see a new structure bypassing them.
> 
> Or maybe we have missed some points about the use case for this framework?
> 
> > 
> > > +5. Program the clocks (ClockWiz) driving the User partition 6. Wait
> > > +for memory controller (MIG) calibration
> > > +
> > > +`Platform Loading Overview
> > > +<https://xilinx.github.io/XRT/master/html/platforms_partitions.html>`
> > > +_ provides more detailed information on platform loading.
> > > +
> > > +xsabin
> > > +------
> > > +
> > > +Each Alveo platform comes packaged with its own xsabin. The xsabin is
> > > +trusted component of the platform. For format details refer to
> > > +:ref:`xsabin/xclbin Container Format`. xsabin contains basic
> > > +information like UUIDs, platform name and metadata in the form of
> > device tree. See :ref:`Device Tree Usage` for details and example.
> > > +
> > > +xclbin
> > > +------
> > > +
> > > +xclbin is compiled by end user using
> > > +`Vitis
> > > +<https://www.xilinx.com/products/design-tools/vitis/vitis-platform.ht
> > > +ml>`_ tool set from Xilinx. The xclbin contains sections describing
> > > +user compiled acceleration engines/kernels, memory subsystems,
> > clocking information etc. It also contains bitstream for the user partition,
> > UUIDs, platform name, etc. xclbin uses the same container format as xsabin
> > which is described below.
> > > +
> > > +
> > > +xsabin/xclbin Container Format
> > > +------------------------------
> > > +
> > > +xclbin/xsabin is ELF-like binary container format. It is structured as series
> > of sections.
> > > +There is a file header followed by several section headers which is
> > followed by sections.
> > > +A section header points to an actual section. There is an optional signature
> > at the end.
> > > +The format is defined by header file ``xclbin.h``. The following
> > > +figure illustrates a typical xclbin::
> > > +
> > > +
> > > +          +---------------------+
> > > +          |                     |
> > > +          |       HEADER        |
> > > +          +---------------------+
> > > +          |   SECTION  HEADER   |
> > > +          |                     |
> > > +          +---------------------+
> > > +          |         ...         |
> > > +          |                     |
> > > +          +---------------------+
> > > +          |   SECTION  HEADER   |
> > > +          |                     |
> > > +          +---------------------+
> > > +          |       SECTION       |
> > > +          |                     |
> > > +          +---------------------+
> > > +          |         ...         |
> > > +          |                     |
> > > +          +---------------------+
> > > +          |       SECTION       |
> > > +          |                     |
> > > +          +---------------------+
> > > +          |      SIGNATURE      |
> > > +          |      (OPTIONAL)     |
> > > +          +---------------------+
> > > +
> > > +
> > > +xclbin/xsabin files can be packaged, un-packaged and inspected using
> > > +XRT utility called **xclbinutil**. xclbinutil is part of XRT open
> > > +source software stack. The source code for xclbinutil can be found at
> > > +https://github.com/Xilinx/XRT/tree/master/src/runtime_src/tools/xclbi
> > > +nutil
> > > +
> > > +For example to enumerate the contents of a xclbin/xsabin use the
> > > +*--info* switch as shown
> > > +below::
> > > +
> > > +  xclbinutil --info --input
> > > + /opt/xilinx/firmware/u50/gen3x16-xdma/blp/test/bandwidth.xclbin
> > > +  xclbinutil --info --input
> > > + /lib/firmware/xilinx/862c7020a250293e32036f19956669e5/partition.xsab
> > > + in
> > > +
> > > +
> > > +Device Tree Usage
> > > +-----------------
> > > +
> > > +As mentioned previously xsabin stores metadata which advertise HW
> > subsystems present in a partition.
> > > +The metadata is stored in device tree format with well defined
> > > +schema. Subsystem instantiations are captured as children of
> > > +``addressable_endpoints`` node. Subsystem nodes have standard
> > attributes like ``reg``, ``interrupts`` etc. Additionally the nodes also have PCIe
> > specific attributes:
> > > +``pcie_physical_function`` and ``pcie_bar_mapping``. These identify
> > > +which PCIe physical function and which BAR space in that physical
> > > +function the subsystem resides. XRT management driver uses this
> > > +information to bind *platform drivers* to the subsystem
> > > +instantiations. The platform drivers are found in **xrt-lib.ko**
> > > +kernel module defined later. Below is an example of device tree for
> > > +Alveo U50
> > > +platform::
> > 
> > I might be missing something, but couldn't you structure the addressable
> > endpoints in a way that encode the physical function as a parent / child
> > relation?
> 
> Alveo driver does not generate the metadata. The metadata is formatted and generated by HW tools when the Alveo HW platform is built. 

Sure, but you control the tools that generate the metadata :) Your
userland can structure / process it however it wants / needs?
> 
> > 
> > What are the regs relative to?
> 
> Regs indicates offset of the register on the PCIE BAR of the Alveo device.
> 
> > > +
> > > +  /dts-v1/;
> > > +
> > > +  /{
> > > +     logic_uuid = "f465b0a3ae8c64f619bc150384ace69b";
> > > +
> > > +     schema_version {
> > > +             major = <0x01>;
> > > +             minor = <0x00>;
> > > +     };
> > > +
> > > +     interfaces {
> > > +
> > > +             @0 {
> > > +                     interface_uuid = "862c7020a250293e32036f19956669e5";
> > > +             };
> > > +     };
> > > +
> > > +     addressable_endpoints {
> > > +
> > > +             ep_blp_rom_00 {
> > > +                     reg = <0x00 0x1f04000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_bram_ctrl-
> > 1.0\0axi_bram_ctrl";
> > > +             };
> > > +
> > > +             ep_card_flash_program_00 {
> > > +                     reg = <0x00 0x1f06000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_quad_spi-
> > 1.0\0axi_quad_spi";
> > > +                     interrupts = <0x03 0x03>;
> > > +             };
> > > +
> > > +             ep_cmc_firmware_mem_00 {
> > > +                     reg = <0x00 0x1e20000 0x00 0x20000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible =
> > > + "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> > > +
> > > +                     firmware {
> > > +                             firmware_product_name = "cmc";
> > > +                             firmware_branch_name = "u50";
> > > +                             firmware_version_major = <0x01>;
> > > +                             firmware_version_minor = <0x00>;
> > > +                     };
> > > +             };
> > > +
> > > +             ep_cmc_intc_00 {
> > > +                     reg = <0x00 0x1e03000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_intc-1.0\0axi_intc";
> > > +                     interrupts = <0x04 0x04>;
> > > +             };
> > > +
> > > +             ep_cmc_mutex_00 {
> > > +                     reg = <0x00 0x1e02000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > > +             };
> > > +
> > > +             ep_cmc_regmap_00 {
> > > +                     reg = <0x00 0x1e08000 0x00 0x2000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible =
> > > + "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> > > +
> > > +                     firmware {
> > > +                             firmware_product_name = "sc-fw";
> > > +                             firmware_branch_name = "u50";
> > > +                             firmware_version_major = <0x05>;
> > > +                     };
> > > +             };
> > > +
> > > +             ep_cmc_reset_00 {
> > > +                     reg = <0x00 0x1e01000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > > +             };
> > > +
> > > +             ep_ddr_mem_calib_00 {
> > > +                     reg = <0x00 0x63000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > > +             };
> > > +
> > > +             ep_debug_bscan_mgmt_00 {
> > > +                     reg = <0x00 0x1e90000 0x00 0x10000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-debug_bridge-
> > 1.0\0debug_bridge";
> > > +             };
> > > +
> > > +             ep_ert_base_address_00 {
> > > +                     reg = <0x00 0x21000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > > +             };
> > > +
> > > +             ep_ert_command_queue_mgmt_00 {
> > > +                     reg = <0x00 0x40000 0x00 0x10000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-ert_command_queue-
> > 1.0\0ert_command_queue";
> > > +             };
> > > +
> > > +             ep_ert_command_queue_user_00 {
> > > +                     reg = <0x00 0x40000 0x00 0x10000>;
> > > +                     pcie_physical_function = <0x01>;
> > > +                     compatible = "xilinx.com,reg_abs-ert_command_queue-
> > 1.0\0ert_command_queue";
> > > +             };
> > > +
> > > +             ep_ert_firmware_mem_00 {
> > > +                     reg = <0x00 0x30000 0x00 0x8000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible =
> > > + "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
> > > +
> > > +                     firmware {
> > > +                             firmware_product_name = "ert";
> > > +                             firmware_branch_name = "v20";
> > > +                             firmware_version_major = <0x01>;
> > > +                     };
> > > +             };
> > > +
> > > +             ep_ert_intc_00 {
> > > +                     reg = <0x00 0x23000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_intc-1.0\0axi_intc";
> > > +                     interrupts = <0x05 0x05>;
> > > +             };
> > > +
> > > +             ep_ert_reset_00 {
> > > +                     reg = <0x00 0x22000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > > +             };
> > > +
> > > +             ep_ert_sched_00 {
> > > +                     reg = <0x00 0x50000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x01>;
> > > +                     compatible = "xilinx.com,reg_abs-ert_sched-1.0\0ert_sched";
> > > +                     interrupts = <0x09 0x0c>;
> > > +             };
> > > +
> > > +             ep_fpga_configuration_00 {
> > > +                     reg = <0x00 0x1e88000 0x00 0x8000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_hwicap-1.0\0axi_hwicap";
> > > +                     interrupts = <0x02 0x02>;
> > > +             };
> > > +
> > > +             ep_icap_reset_00 {
> > > +                     reg = <0x00 0x1f07000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > > +             };
> > > +
> > > +             ep_mailbox_mgmt_00 {
> > > +                     reg = <0x00 0x1f10000 0x00 0x10000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-mailbox-1.0\0mailbox";
> > > +                     interrupts = <0x00 0x00>;
> > > +             };
> > > +
> > > +             ep_mailbox_user_00 {
> > > +                     reg = <0x00 0x1f00000 0x00 0x10000>;
> > > +                     pcie_physical_function = <0x01>;
> > > +                     compatible = "xilinx.com,reg_abs-mailbox-1.0\0mailbox";
> > > +                     interrupts = <0x08 0x08>;
> > > +             };
> > > +
> > > +             ep_msix_00 {
> > > +                     reg = <0x00 0x00 0x00 0x20000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-msix-1.0\0msix";
> > > +                     pcie_bar_mapping = <0x02>;
> > > +             };
> > > +
> > > +             ep_pcie_link_mon_00 {
> > > +                     reg = <0x00 0x1f05000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > > +             };
> > > +
> > > +             ep_pr_isolate_plp_00 {
> > > +                     reg = <0x00 0x1f01000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > > +             };
> > > +
> > > +             ep_pr_isolate_ulp_00 {
> > > +                     reg = <0x00 0x1000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
> > > +             };
> > > +
> > > +             ep_uuid_rom_00 {
> > > +                     reg = <0x00 0x64000 0x00 0x1000>;
> > > +                     pcie_physical_function = <0x00>;
> > > +                     compatible = "xilinx.com,reg_abs-axi_bram_ctrl-
> > 1.0\0axi_bram_ctrl";
> > > +             };
> > > +
> > > +             ep_xdma_00 {
> > > +                     reg = <0x00 0x00 0x00 0x10000>;
> > > +                     pcie_physical_function = <0x01>;
> > > +                     compatible = "xilinx.com,reg_abs-xdma-1.0\0xdma";
> > > +                     pcie_bar_mapping = <0x02>;
> > > +             };
> > > +     };
> > > +
> > > +  }
> > > +
> > > +
> > > +
> > > +Deployment Models
> > > +=================
> > > +
> > > +Baremetal
> > > +---------
> > > +
> > > +In bare-metal deployments both MPF and UPF are visible and
> > > +accessible. xmgmt driver binds to MPF. xmgmt driver operations are
> > > +privileged and available to system administrator. The full stack is
> > illustrated below::
> > > +
> > > +
> > > +                            HOST
> > > +
> > > +                 [XMGMT]            [XUSER]
> > > +                    |                  |
> > > +                    |                  |
> > > +                 +-----+            +-----+
> > > +                 | MPF |            | UPF |
> > > +                 |     |            |     |
> > > +                 | PF0 |            | PF1 |
> > > +                 +--+--+            +--+--+
> > > +          ......... ^................. ^..........
> > > +                    |                  |
> > > +                    |   PCIe DEVICE    |
> > > +                    |                  |
> > > +                 +--+------------------+--+
> > > +                 |         SHELL          |
> > > +                 |                        |
> > > +                 +------------------------+
> > > +                 |         USER           |
> > > +                 |                        |
> > > +                 |                        |
> > > +                 |                        |
> > > +                 |                        |
> > > +                 +------------------------+
> > > +
> > > +
> > > +
> > > +Virtualized
> > > +-----------
> > > +
> > > +In virtualized deployments privileged MPF is assigned to host but
> > > +unprivileged UPF is assigned to guest VM via PCIe pass-through. xmgmt
> > driver in host binds to MPF.
> > > +xmgmt driver operations are privileged and only accessible by hosting
> > service provider.
> > > +The full stack is illustrated below::
> > > +
> > > +
> > > +                                 .............
> > > +                  HOST           .    VM     .
> > > +                                 .           .
> > > +                 [XMGMT]         .  [XUSER]  .
> > > +                    |            .     |     .
> > > +                    |            .     |     .
> > > +                 +-----+         .  +-----+  .
> > > +                 | MPF |         .  | UPF |  .
> > > +                 |     |         .  |     |  .
> > > +                 | PF0 |         .  | PF1 |  .
> > > +                 +--+--+         .  +--+--+  .
> > > +          ......... ^................. ^..........
> > > +                    |                  |
> > > +                    |   PCIe DEVICE    |
> > > +                    |                  |
> > > +                 +--+------------------+--+
> > > +                 |         SHELL          |
> > > +                 |                        |
> > > +                 +------------------------+
> > > +                 |         USER           |
> > > +                 |                        |
> > > +                 |                        |
> > > +                 |                        |
> > > +                 |                        |
> > > +                 +------------------------+
> > > +
> > > +
> > > +
> > > +Driver Modules
> > > +==============
> > > +
> > > +xrt-lib.ko
> > > +----------
> > > +
> > > +Repository of all subsystem drivers and pure software modules that
> > > +can potentially be shared between xmgmt and xuser. All these drivers
> > > +are structured as Linux *platform driver* and are instantiated by
> > > +xmgmt (or xuser in future) based on meta data associated with
> > > +hardware. The metadata is in the form of device tree as explained before.
> > > +
> > > +xmgmt.ko
> > > +--------
> > > +
> > > +The xmgmt driver is a PCIe device driver driving MPF found on
> > > +Xilinx's Alveo PCIE device. It consists of one *root* driver, one or
> > > +more *partition* drivers and one or more *leaf* drivers. The root and
> > > +MPF specific leaf drivers are in xmgmt.ko. The partition driver and other
> > leaf drivers are in xrt-lib.ko.
> > > +
> > > +The instantiation of specific partition driver or leaf driver is
> > > +completely data driven based on meta data (mostly in device tree
> > > +format) found through VSEC capability and inside firmware files, such
> > > +as xsabin or xclbin file. The root driver manages life cycle of
> > > +multiple partition drivers, which, in turn, manages multiple leaf
> > > +drivers. This allows a single set of driver code to support all kinds
> > > +of subsystems exposed by different shells. The difference among all
> > > +these subsystems will be handled in leaf drivers with root and
> > > +partition drivers being part of the infrastructure and provide common
> > services for all leaves found on all platforms.
> > > +
> > > +
> > > +xmgmt-root
> > > +^^^^^^^^^^
> > > +
> > > +The xmgmt-root driver is a PCIe device driver attaches to MPF. It's
> > > +part of the
> > Nit: s/attaches/attached ?
> 
> Yes, sure.
> 
> > > +infrastructure of the MPF driver and resides in xmgmt.ko. This driver
> > > +
> > > +* manages one or more partition drivers
> > > +* provides access to functionalities that requires pci_dev, such as
> > > +PCIE config
> > > +  space access, to other leaf drivers through parent calls
> > > +* together with partition driver, facilities event callbacks for
> > > +other leaf drivers
> > > +* together with partition driver, facilities inter-leaf driver calls
> > > +for other leaf
> > > +  drivers
> > > +
> > > +When root driver starts, it will explicitly create an initial
> > > +partition instance, which contains leaf drivers that will trigger the
> > > +creation of other partition instances. The root driver will wait for
> > > +all partitions and leaves to be created before it returns from it's
> > > +probe routine and claim success of the initialization of the entire xmgmt
> > driver.
> > > +
> > > +partition
> > > +^^^^^^^^^
> > > +
> > > +The partition driver is a platform device driver whose life cycle is
> > > +managed by root and does not have real IO mem or IRQ resources. It's
> > > +part of the infrastructure of the MPF driver and resides in
> > > +xrt-lib.ko. This driver
> > > +
> > > +* manages one or more leaf drivers so that multiple leaves can be
> > > +managed as a group
> > > +* provides access to root from leaves, so that parent calls, event
> > > +notifications
> > > +  and inter-leaf calls can happen
> > > +
> > > +In xmgmt, an initial partition driver instance will be created by
> > > +root, which contains leaves that will trigger partition instances to
> > > +be created to manage groups of leaves found on different partitions
> > > +on hardware, such as VSEC, Shell, and User.
> > > +
> > > +leaves
> > > +^^^^^^
> > > +
> > > +The leaf driver is a platform device driver whose life cycle is
> > > +managed by a partition driver and may or may not have real IO mem or
> > > +IRQ resources. They are the real meat of xmgmt and contains platform
> > > +specific code to Shell and User found on a MPF.
> > > +
> > > +A leaf driver may not have real hardware resources when it merely
> > > +acts as a driver that manages certain in-memory states for xmgmt.
> > > +These in-memory states could be shared by multiple other leaves.
> > > +
> > > +Leaf drivers assigned to specific hardware resources drive specific
> > > +subsystem in the device. To manipulate the subsystem or carry out a
> > > +task, a leaf driver may ask help from root via parent calls and/or from
> > other leaves via inter-leaf calls.
> > > +
> > > +A leaf can also broadcast events through infrastructure code for
> > > +other leaves to process. It can also receive event notification from
> > > +infrastructure about certain events, such as post-creation or pre-exit of a
> > particular leaf.
> > > +
> > > +
> > > +Driver Interfaces
> > > +=================
> > > +
> > > +xmgmt Driver Ioctls
> > > +-------------------
> > > +
> > > +Ioctls exposed by xmgmt driver to user space are enumerated in the
> > > +following table:
> > > +
> > > +== ===================== ============================= ===========================
> > > +#  Functionality         ioctl request code            data format
> > > +== ===================== ============================= ===========================
> > > +1  FPGA image download   XMGMT_IOCICAPDOWNLOAD_AXLF    xmgmt_ioc_bitstream_axlf
> > > +2  CL frequency scaling  XMGMT_IOCFREQSCALE            xmgmt_ioc_freqscaling
> > > +== ===================== ============================= ===========================
> > > +
> > > +xmgmt Driver Sysfs
> > > +------------------
> > > +
> > > +xmgmt driver exposes a rich set of sysfs interfaces. Subsystem
> > > +platform drivers export sysfs node for every platform instance.
> > > +
> > > +Every partition also exports its UUIDs. See below for examples::
> > > +
> > > +  /sys/bus/pci/devices/0000:06:00.0/xmgmt_main.0/interface_uuids
> > > +  /sys/bus/pci/devices/0000:06:00.0/xmgmt_main.0/logic_uuids
> > > +
> > > +
> > > +hwmon
> > > +-----
> > > +
> > > +xmgmt driver exposes standard hwmon interface to report voltage,
> > > +current, temperature, power, etc. These can easily be viewed using
> > *sensors* command line utility.
> > > +
> > > +
> > > +mailbox
> > > +-------
> > > +
> > > +xmgmt communicates with user physical function driver via HW mailbox.
> > > +Mailbox opcodes are defined in ``mailbox_proto.h``. `Mailbox
> > > +Inter-domain Communication Protocol
> > > +<https://xilinx.github.io/XRT/master/html/mailbox.proto.html>`_
> > > +defines the full specification. xmgmt implements subset of the
> > specification. It provides the following services to the UPF driver:
> > > +
> > > +1.  Responding to *are you there* request including determining if the
> > two drivers are
> > > +    running in the same OS domain
> > > +2.  Provide sensor readings, loaded xclbin UUID, clock frequency, shell
> > information, etc.
> > > +3.  Perform PCIe hot reset
> > > +4.  Download user compiled xclbin
> > 
> > Is this gonna use the mailbox framework?
> 
> The xclbin can be downloaded via IOCTL interface of xmgmt driver.
> Or the download request can come from user pf driver via mailbox, yes.
> 
> Thanks,
> Max
> 
> > 
> > > +
> > > +
> > > +Platform Security Considerations
> > > +================================
> > > +
> > > +`Security of Alveo Platform
> > > +<https://xilinx.github.io/XRT/master/html/security.html>`_
> > > +discusses the deployment options and security implications in great detail.
> > > --
> > > 2.17.1
> > 
> > That's a lot of text, I'll have to read it again most likely,
> > 
> > - Moritz

Thanks,
Moritz
Max Zhen Dec. 3, 2020, 3:38 a.m. UTC | #4
Hi Moritz,

Please see my reply below.

Thanks,
-Max

> -----Original Message-----
> From: Moritz Fischer <mdf@kernel.org>
> Sent: Wednesday, December 2, 2020 15:10
> To: Max Zhen <maxz@xilinx.com>
> Cc: Moritz Fischer <mdf@kernel.org>; Sonal Santan <sonals@xilinx.com>; 
> linux-kernel@vger.kernel.org; linux-fpga@vger.kernel.org; Lizhi Hou 
> <lizhih@xilinx.com>; Michal Simek <michals@xilinx.com>; Stefano 
> Stabellini <stefanos@xilinx.com>; devicetree@vger.kernel.org
> Subject: Re: [PATCH Xilinx Alveo 1/8] Documentation: fpga: Add a 
> document describing Alveo XRT drivers
> 
> 
> Hi Max,
> 
> On Wed, Dec 02, 2020 at 09:24:29PM +0000, Max Zhen wrote:
> > Hi Moritz,
> >
> > Thanks for your feedback. Please see my reply inline.
> >
> > Thanks,
> > -Max
> >
> > > -----Original Message-----
> > > From: Moritz Fischer <mdf@kernel.org>
> > > Sent: Monday, November 30, 2020 20:55
> > > To: Sonal Santan <sonals@xilinx.com>
> > > Cc: linux-kernel@vger.kernel.org; linux-fpga@vger.kernel.org; Max 
> > > Zhen <maxz@xilinx.com>; Lizhi Hou <lizhih@xilinx.com>; Michal 
> > > Simek <michals@xilinx.com>; Stefano Stabellini 
> > > <stefanos@xilinx.com>; devicetree@vger.kernel.org
> > > Subject: Re: [PATCH Xilinx Alveo 1/8] Documentation: fpga: Add a 
> > > document describing Alveo XRT drivers
> > >
> > >
> > > On Sat, Nov 28, 2020 at 04:00:33PM -0800, Sonal Santan wrote:
> > > > From: Sonal Santan <sonal.santan@xilinx.com>
> > > >
> > > > Describe Alveo XRT driver architecture and provide basic 
> > > > overview of Xilinx Alveo platform.
> > > >
> > > > Signed-off-by: Sonal Santan <sonal.santan@xilinx.com>
> > > > ---
> > > >  Documentation/fpga/index.rst |   1 +
> > > >  Documentation/fpga/xrt.rst   | 588
> > > +++++++++++++++++++++++++++++++++++
> > > >  2 files changed, 589 insertions(+)  create mode 100644 
> > > > Documentation/fpga/xrt.rst
> > > >

[...cut...]

> > > > +xclbin over the User partition as part of DFX. When a user 
> > > > +requests loading of a specific xclbin the xmgmt management 
> > > > +driver reads the parent interface UUID specified in the xclbin 
> > > > +and matches it with child interface UUID exported by Shell to 
> > > > +determine if xclbin is compatible
> > > with the Shell. If match fails loading of xclbin is denied.
> > > > +
> > > > +xclbin loading is requested using ICAP_DOWNLOAD_AXLF ioctl
> command.
> > > > +When loading xclbin xmgmt driver performs the following operations:
> > > > +
> > > > +1. Sanity check the xclbin contents 2. Isolate the User 
> > > > +partition 3. Download the bitstream using the FPGA config engine (ICAP) 4.
> > > > +De-isolate the User partition
> > > Is this modelled as bridges and regions?
> >
> > Alveo drivers as written today do not use fpga bridge and region
> framework. It seems that if we add support for that framework, it’s 
> possible to receive PR program request from kernel outside of xmgmt driver?
> Currently, we can’t support this and PR program can only be initiated 
> using XRT’s runtime API in user space.
> 
> I'm not 100% sure I understand the concern here, let me reply to what 
> I think I understand:
> 
> You're worried that if you use FPGA region as interface to accept PR 
> requests something else could attempt to reconfigure the region from 
> within the kernel using the FPGA Region API?
> 
> Assuming I got this right, I don't think this is a big deal. When you 
> create the regions you control who gets the references to it.

Thanks for explaining. Yes, I think you got my point :-).

> 
> From what I've seen so far Regions seem to be roughly equivalent to 
> Partitions, hence my surprise to see a new structure bypassing them.

I see where the gap is.

Regions in Linux are very different from the "partitions" we have defined in xmgmt. A region seems to be a software data structure representing an area on the FPGA that can be reprogrammed. This area is protected by the concept of a "bridge", which can be disabled before programming and re-enabled after it, and you go through the region when you need to reprogram this area.

The "partition" is part of the main infrastructure of xmgmt driver, which represents a group of subdev drivers for each individual IP (HW subcomponents). Basically, xmgmt root driver is parent of several partitions who is, in turn, the parent of several subdev drivers. The parent manages the life cycle of its children here.

We do have a partition to represent the group of subdevs/IPs in the reprogrammable area. We also have partitions representing other areas which cannot be reprogrammed. So it is difficult to use "Region" to implement "partition".

From what you have explained, it seems that even if I use region / bridge in xmgmt, we can still keep it private to xmgmt instead of exposing the interface to the outside world, which we can't support anyway? This means that the region will be used as an internal data structure for xmgmt. Since we can't simply replace partition with region, we might as well just use partition throughout the driver, instead of introducing two data structures and using them both in different places.

However, if using region/bridge can bring in other benefits, please let us know and we could see if we can also add this to xmgmt.

> >
> > Or maybe we have missed some points about the use case for this
> framework?
> >

[...cut...]

> > > > +-----------------
> > > > +
> > > > +As mentioned previously xsabin stores metadata which advertise 
> > > > +HW
> > > subsystems present in a partition.
> > > > +The metadata is stored in device tree format with well defined 
> > > > +schema. Subsystem instantiations are captured as children of 
> > > > +``addressable_endpoints`` node. Subsystem nodes have standard
> > > attributes like ``reg``, ``interrupts`` etc. Additionally the 
> > > nodes also have PCIe specific attributes:
> > > > +``pcie_physical_function`` and ``pcie_bar_mapping``. These 
> > > > +identify which PCIe physical function and which BAR space in 
> > > > +that physical function the subsystem resides. XRT management 
> > > > +driver uses this information to bind *platform drivers* to the 
> > > > +subsystem instantiations. The platform drivers are found in 
> > > > +**xrt-lib.ko** kernel module defined later. Below is an example 
> > > > +of device tree for Alveo U50
> > > > +platform::
> > >
> > > I might be missing something, but couldn't you structure the 
> > > addressable endpoints in a way that encode the physical function 
> > > as a parent / child relation?
> >
> > Alveo driver does not generate the metadata. The metadata is 
> > formatted
> and generated by HW tools when the Alveo HW platform is built.
> 
> Sure, but you control the tools that generate the metadata :) Your 
> userland can structure / process it however it wants / needs?

XRT is a runtime software stack; it is not responsible for generating HW metadata. It is one of the consumers of these data. The shell design is generated by a sophisticated tool framework which is difficult to change.

However, we will take this as a feedback for future revision of the tool.

Thanks,
Max
Moritz Fischer Dec. 3, 2020, 4:36 a.m. UTC | #5
Max,

On Thu, Dec 03, 2020 at 03:38:26AM +0000, Max Zhen wrote:
> [...cut...]
> 
> > > > > +xclbin over the User partition as part of DFX. When a user 
> > > > > +requests loading of a specific xclbin the xmgmt management 
> > > > > +driver reads the parent interface UUID specified in the xclbin 
> > > > > +and matches it with child interface UUID exported by Shell to 
> > > > > +determine if xclbin is compatible
> > > > with the Shell. If match fails loading of xclbin is denied.
> > > > > +
> > > > > +xclbin loading is requested using ICAP_DOWNLOAD_AXLF ioctl
> > command.
> > > > > +When loading xclbin xmgmt driver performs the following operations:
> > > > > +
> > > > > +1. Sanity check the xclbin contents 2. Isolate the User 
> > > > > +partition 3. Download the bitstream using the FPGA config engine (ICAP) 4.
> > > > > +De-isolate the User partition
> > > > Is this modelled as bridges and regions?
> > >
> > > Alveo drivers as written today do not use fpga bridge and region
> > framework. It seems that if we add support for that framework, it’s 
> > possible to receive PR program request from kernel outside of xmgmt driver?
> > Currently, we can’t support this and PR program can only be initiated 
> > using XRT’s runtime API in user space.
> > 
> > I'm not 100% sure I understand the concern here, let me reply to what 
> > I think I understand:
> > 
> > You're worried that if you use FPGA region as interface to accept PR 
> > requests something else could attempt to reconfigure the region from 
> > within the kernel using the FPGA Region API?
> > 
> > Assuming I got this right, I don't think this is a big deal. When you 
> > create the regions you control who gets the references to it.
> 
> Thanks for explaining. Yes, I think you got my point :-).

We can add code to make a region 'static' or 'one-time' or 'fixed'.
> 
> > 
> > From what I've seen so far Regions seem to be roughly equivalent to 
> > Partitions, hence my surprise to see a new structure bypassing them.
> 
> I see where the gap is.
> 
> Regions in Linux is very different than "partitions" we have defined in xmgmt. Regions seem to be a software data structure representing an area on the FPGA that can be reprogrammed. This area is protected by the concept of "bridge" which can be disabled before program and reenabled after it. And you go through region when you need to reprogram this area.

Your central management driver can create / destroy regions at will. It
can keep them in a list, array or tree.

Regions can but don't have to have bridges.

If you need to go through the central driver to reprogram a region,
you can use that to figure out which region to program.
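
Roughly something like this, just to sketch the idea (names made up, not
tested):

/*
 * Sketch only: the management driver keeps the fpga_regions it created
 * in a list and looks one up by partition interface UUID when a
 * reprogram request comes in.
 */
#include <linux/fpga/fpga-region.h>
#include <linux/list.h>
#include <linux/uuid.h>

struct xmgmt_region {
	struct list_head node;
	uuid_t uuid;			/* interface UUID of the partition */
	struct fpga_region *region;	/* from devm_fpga_region_create() */
};

static struct fpga_region *xmgmt_find_region(struct list_head *regions,
					     const uuid_t *uuid)
{
	struct xmgmt_region *r;

	list_for_each_entry(r, regions, node)
		if (uuid_equal(&r->uuid, uuid))
			return r->region;

	return NULL;
}
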
> 
> The "partition" is part of the main infrastructure of xmgmt driver, which represents a group of subdev drivers for each individual IP (HW subcomponents). Basically, xmgmt root driver is parent of several partitions who is, in turn, the parent of several subdev drivers. The parent manages the life cycle of its children here.

I don't see how this is conceptually different from what DFL does, and
they managed to use Regions and Bridges.

If things are missing in the framework, please add them instead of
rewriting an entire parallel framework.

> 
> We do have a partition to represent the group of subdevs/IPs in the reprogrammable area. And we also have partitions representing other areas which cannot be reprogrammed. So, it is difficult to use "Region" to implement "partition".

You implement your region's callbacks; you can return -EINVAL / -ENOTTY
if you want to fail a reprogramming request to a static partition /
region.
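
For example, something along these lines in the manager ops (sketch only,
the priv layout is made up):

/*
 * Sketch: refuse reprogramming of a static partition from the
 * fpga_manager's write_init callback.
 */
#include <linux/errno.h>
#include <linux/fpga/fpga-mgr.h>

struct xmgmt_fpga_priv {
	bool reprogrammable;	/* false for static/base partitions */
};

static int xmgmt_fpga_write_init(struct fpga_manager *mgr,
				 struct fpga_image_info *info,
				 const char *buf, size_t count)
{
	struct xmgmt_fpga_priv *priv = mgr->priv;

	if (!priv->reprogrammable)
		return -ENOTTY;	/* static partition, no PR allowed */

	/* otherwise prepare the config engine for the download */
	return 0;
}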

> From what you have explained, it seems that even if I use region / bridge in xmgmt, we can still keep it private to xmgmt instead of exposing the interface to outside world, which we can't support anyway? This means that region will be used as an internal data structure for xmgmt. Since we can't simply replace partition with region, we might as well just use partition throughout the driver, instead of introducing two data structures and use them both in different places.

Think about your partition as an extension to a region that implements
what you need to do for your case of enumerating and reprogramming that
particular piece of your chip.

> However, if using region/bridge can bring in other benefits, please let us know and we could see if we can also add this to xmgmt.

As maintainer I can say it brings the benefit of looking like existing
infrastructure we have. We can add features to the framework as needed
but blanket replacing the entire thing is always a hard sell.
> 
> > >
> > > Or maybe we have missed some points about the use case for this
> > framework?
> > >
> 
> [...cut...]
> 
> > > > > +-----------------
> > > > > +
> > > > > +As mentioned previously xsabin stores metadata which advertise 
> > > > > +HW
> > > > subsystems present in a partition.
> > > > > +The metadata is stored in device tree format with well defined 
> > > > > +schema. Subsystem instantiations are captured as children of 
> > > > > +``addressable_endpoints`` node. Subsystem nodes have standard
> > > > attributes like ``reg``, ``interrupts`` etc. Additionally the 
> > > > nodes also have PCIe specific attributes:
> > > > > +``pcie_physical_function`` and ``pcie_bar_mapping``. These 
> > > > > +identify which PCIe physical function and which BAR space in 
> > > > > +that physical function the subsystem resides. XRT management 
> > > > > +driver uses this information to bind *platform drivers* to the 
> > > > > +subsystem instantiations. The platform drivers are found in 
> > > > > +**xrt-lib.ko** kernel module defined later. Below is an example 
> > > > > +of device tree for Alveo U50
> > > > > +platform::
> > > >
> > > > I might be missing something, but couldn't you structure the 
> > > > addressable endpoints in a way that encode the physical function 
> > > > as a parent / child relation?
> > >
> > > Alveo driver does not generate the metadata. The metadata is 
> > > formatted
> > and generated by HW tools when the Alveo HW platform is built.
> > 
> > Sure, but you control the tools that generate the metadata :) Your 
> > userland can structure / process it however it wants / needs?
> 
> XRT is a runtime software stack, it is not responsible for generating HW metadata. It is one of the consumers of these data. The shell design is generated by a sophisticated tool framework which is difficult to change.

The kernel/userspace ABI is not going to change once it is merged, which
is why we need to get it right. You can change your userspace code a long
time after it is merged into the kernel. The other way round does not
work.

If you're going to do device-tree you'll need device-tree maintainers to
be ok with your bindings.

> However, we will take this as a feedback for future revision of the tool.
> 
> Thanks,
> Max

Btw: Can you fix your line-breaks :)

- Moritz
Max Zhen Dec. 4, 2020, 1:17 a.m. UTC | #6
Hi Moritz,

I manually fixed some line breaks. Not sure why outlook is not doing it properly.
Let me know if it still looks bad to you.

Please see my reply below.

> 
> 
> Max,
> 
> On Thu, Dec 03, 2020 at 03:38:26AM +0000, Max Zhen wrote:
> > [...cut...]
> >
> > > > > > +xclbin over the User partition as part of DFX. When a user
> > > > > > +requests loading of a specific xclbin the xmgmt management
> > > > > > +driver reads the parent interface UUID specified in the xclbin
> > > > > > +and matches it with child interface UUID exported by Shell to
> > > > > > +determine if xclbin is compatible with the Shell. If match fails loading of xclbin is denied.
> > > > > > +
> > > > > > +xclbin loading is requested using ICAP_DOWNLOAD_AXLF ioctl command.
> > > > > > +When loading xclbin xmgmt driver performs the following operations:
> > > > > > +
> > > > > > +1. Sanity check the xclbin contents 2. Isolate the User
> > > > > > +partition 3. Download the bitstream using the FPGA config engine (ICAP) 4.
> > > > > > +De-isolate the User partition
> > > > > Is this modelled as bridges and regions?
> > > >
> > > > Alveo drivers as written today do not use fpga bridge and region
> > > > framework. It seems that if we add support for that framework, it’s
> > > > possible to receive PR program request from kernel outside of xmgmt driver?
> > > > Currently, we can’t support this and PR program can only be initiated
> > > > using XRT’s runtime API in user space.
> > >
> > > I'm not 100% sure I understand the concern here, let me reply to what
> > > I think I understand:
> > >
> > > You're worried that if you use FPGA region as interface to accept PR
> > > requests something else could attempt to reconfigure the region from
> > > within the kernel using the FPGA Region API?
> > >
> > > Assuming I got this right, I don't think this is a big deal. When you
> > > create the regions you control who gets the references to it.
> >
> > Thanks for explaining. Yes, I think you got my point :-).
> 
> We can add code to make a region 'static' or 'one-time' or 'fixed'.
> >
> > >
> > > From what I've seen so far Regions seem to be roughly equivalent to
> > > Partitions, hence my surprise to see a new structure bypassing them.
> >
> > I see where the gap is.
> >
> > Regions in Linux is very different than "partitions" we have defined in xmgmt. Regions seem to be a software data structure
> > representing an area on the FPGA that can be reprogrammed. This area is protected by the concept of "bridge" which can be disabled
> > before program and reenabled after it. And you go through region when you need to reprogram this area.
> 
> Your central management driver can create / destroy regions at will. It
> can keep them in a list, array or tree.
> 
> Regions can but don't have to have bridges.
> 
> If you need to go through the central driver to reprogram a region,
> you can use that to figure out which region to program.

That sounds fine. I can create a region and call into it from xmgmt for
PR programming. The region will then call xmgmt's fpga manager to
program it.
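
Something roughly like this is what I have in mind (untested sketch
against the current fpga-mgr / fpga-region API; all the xmgmt_* names
are placeholders):

/*
 * Untested sketch: register an fpga_manager backed by the ICAP leaf and
 * hang an fpga_region for the PR (ULP) area off it. All xmgmt_* names
 * are placeholders and error unwinding is omitted for brevity.
 */
#include <linux/device.h>
#include <linux/errno.h>
#include <linux/fpga/fpga-bridge.h>
#include <linux/fpga/fpga-mgr.h>
#include <linux/fpga/fpga-region.h>

static int xmgmt_fpga_write_init(struct fpga_manager *mgr,
				 struct fpga_image_info *info,
				 const char *buf, size_t count)
{
	return 0;	/* prepare the ICAP engine for the download */
}

static int xmgmt_fpga_write(struct fpga_manager *mgr, const char *buf,
			    size_t count)
{
	return 0;	/* stream bitstream words into ICAP */
}

static int xmgmt_fpga_write_complete(struct fpga_manager *mgr,
				     struct fpga_image_info *info)
{
	return 0;	/* wait for ICAP to report done */
}

static const struct fpga_manager_ops xmgmt_fpga_ops = {
	.write_init	= xmgmt_fpga_write_init,
	.write		= xmgmt_fpga_write,
	.write_complete	= xmgmt_fpga_write_complete,
};

/* collect the ULP isolation gate bridge registered under the same parent */
static int xmgmt_get_bridges(struct fpga_region *region)
{
	return fpga_bridge_get_to_list(region->dev.parent, region->info,
				       &region->bridge_list);
}

static int xmgmt_register_pr_region(struct device *dev, void *icap)
{
	struct fpga_manager *mgr;
	struct fpga_region *region;
	int ret;

	mgr = devm_fpga_mgr_create(dev, "xilinx-alveo-icap",
				   &xmgmt_fpga_ops, icap);
	if (!mgr)
		return -ENOMEM;

	ret = fpga_mgr_register(mgr);
	if (ret)
		return ret;

	region = devm_fpga_region_create(dev, mgr, xmgmt_get_bridges);
	if (!region)
		return -ENOMEM;

	return fpga_region_register(region);
}

A reprogram request coming in through the ioctl (or the mailbox) would
then end up in fpga_region_program_fpga() for that region, which disables
the bridges, runs the manager ops and re-enables the bridges.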

> >
> > The "partition" is part of the main infrastructure of xmgmt driver, which represents a group of subdev drivers for each individual IP
> > (HW subcomponents). Basically, xmgmt root driver is parent of several partitions who is, in turn, the parent of several subdev drivers.
> > The parent manages the life cycle of its children here.
> 
> I don't see how this is conceptually different from what DFL does, and
> they managed to use Regions and Bridges.
> 
> If things are missing in the framework, please add them instead of
> rewriting an entire parallel framework.
> 
> >
> > We do have a partition to represent the group of subdevs/IPs in the reprogrammable area. And we also have partitions
> > representing other areas which cannot be reprogrammed. So, it is difficult to use "Region" to implement "partition".
> 
> You implement your regions callbacks, you can return -EINVAL / -ENOTTY
> if you want to fail a reprogramming request to a static partion /
> region.
> 
> > From what you have explained, it seems that even if I use region / bridge in xmgmt, we can still keep it private to xmgmt instead of
> > exposing the interface to outside world, which we can't support anyway? This means that region will be used as an internal data
> > structure for xmgmt. Since we can't simply replace partition with region, we might as well just use partition throughout the driver,
> > instead of introducing two data structures and use them both in different places.
> 
> Think about your partition as an extension to a region that implements
> what you need to do for your case of enumerating and reprogramming that
> particular piece of your chip.

Yes, we can add region / bridges to represent the PR area and use them in
our code path for reprogramming it. I think what we will do is to
instantiate a region instance for the PR area and associate it with the
FPGA manager in xmgmt for reprogramming it. We can also instantiate
bridges and map the "ULP gate" subdev driver to them in xmgmt. Thus, we
could incorporate the region and bridge data structures in xmgmt for PR
reprogramming, roughly as sketched below.
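
A minimal sketch of the ULP gate wrapped as an fpga_bridge could look like
this (the single register write is a placeholder for the real isolation
sequence):

/*
 * Sketch: wrap the ULP isolation gate leaf as an fpga_bridge. The
 * register write below is a placeholder, not the real gate sequence.
 */
#include <linux/device.h>
#include <linux/errno.h>
#include <linux/fpga/fpga-bridge.h>
#include <linux/io.h>

struct ulp_gate {
	void __iomem *base;	/* ep_pr_isolate_ulp_00 register window */
};

static int ulp_gate_enable_set(struct fpga_bridge *bridge, bool enable)
{
	struct ulp_gate *gate = bridge->priv;

	/* placeholder: non-zero lets traffic through, zero isolates ULP */
	iowrite32(enable ? 1 : 0, gate->base);
	return 0;
}

static const struct fpga_bridge_ops ulp_gate_ops = {
	.enable_set = ulp_gate_enable_set,
};

static int ulp_gate_register(struct device *dev, struct ulp_gate *gate)
{
	struct fpga_bridge *br;

	br = devm_fpga_bridge_create(dev, "xrt-ulp-gate", &ulp_gate_ops, gate);
	if (!br)
		return -ENOMEM;

	return fpga_bridge_register(br);
}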

This will be a non-trivial change for us. I'd like to confirm that this is what
you are looking for before we start working on the change. Let us know :-).

> 
> > However, if using region/bridge can bring in other benefits, please let us know and we could see if we can also add this to xmgmt.
> 
> As maintainer I can say it brings the benefit of looking like existing
> infrastructure we have. We can add features to the framework as needed
> but blanket replacing the entire thing is always a hard sell.
> >
> > > >
> > > > Or maybe we have missed some points about the use case for this
> > > > framework?
> > > >
> >
> > [...cut...]
> >
> > > > > > +-----------------
> > > > > > +
> > > > > > +As mentioned previously xsabin stores metadata which advertise
> > > > > > +HW
> > > > > > subsystems present in a partition.
> > > > > > +The metadata is stored in device tree format with well defined
> > > > > > +schema. Subsystem instantiations are captured as children of
> > > > > > +``addressable_endpoints`` node. Subsystem nodes have standard
> > > > > > attributes like ``reg``, ``interrupts`` etc. Additionally the
> > > > > > nodes also have PCIe specific attributes:
> > > > > > +``pcie_physical_function`` and ``pcie_bar_mapping``. These
> > > > > > +identify which PCIe physical function and which BAR space in
> > > > > > +that physical function the subsystem resides. XRT management
> > > > > > +driver uses this information to bind *platform drivers* to the
> > > > > > +subsystem instantiations. The platform drivers are found in
> > > > > > +**xrt-lib.ko** kernel module defined later. Below is an example
> > > > > > +of device tree for Alveo U50
> > > > > > +platform::
> > > > >
> > > > > I might be missing something, but couldn't you structure the
> > > > > addressable endpoints in a way that encode the physical function
> > > > > as a parent / child relation?
> > > >
> > > > Alveo driver does not generate the metadata. The metadata is
> > > > formatted
> > > > and generated by HW tools when the Alveo HW platform is built.
> > >
> > > Sure, but you control the tools that generate the metadata :) Your
> > > userland can structure / process it however it wants / needs?
> >
> > XRT is a runtime software stack, it is not responsible for generating HW metadata. It is one of the consumers of these data. The shell
> > design is generated by a sophisticated tool framework which is difficult to change.
> 
> The Kernel userspace ABI is not going to change once it is merged, which
> is why we need to get it right. You can change your userspace code long
> time after it is merged into the kernel. The otherway round does not
> work.
> 
> If you're going to do device-tree you'll need device-tree maintainers to
> be ok with your bindings.
> 


Yes, we'll wait for the device-tree maintainers to chime in here :-).

Thanks,
Max

> > However, we will take this as a feedback for future revision of the tool.
> >
> > Thanks,
> > Max
> 
> Btw: Can you fix your line-breaks :)
> 
> - Moritz
Moritz Fischer Dec. 4, 2020, 4:18 a.m. UTC | #7
On Fri, Dec 04, 2020 at 01:17:37AM +0000, Max Zhen wrote:
> Hi Moritz,
> 
> I manually fixed some line breaks. Not sure why outlook is not doing it properly.
> Let me know if it still looks bad to you.

That might just be outlook :)
> 
> Please see my reply below.
> 
> > 
> > 
> > Max,
> > 
> > On Thu, Dec 03, 2020 at 03:38:26AM +0000, Max Zhen wrote:
> > > [...cut...]
> > >
> > > > > > > +xclbin over the User partition as part of DFX. When a user
> > > > > > > +requests loading of a specific xclbin the xmgmt management
> > > > > > > +driver reads the parent interface UUID specified in the xclbin
> > > > > > > +and matches it with child interface UUID exported by Shell to
> > > > > > > +determine if xclbin is compatible with the Shell. If match fails loading of xclbin is denied.
> > > > > > > +
> > > > > > > +xclbin loading is requested using ICAP_DOWNLOAD_AXLF ioctl command.
> > > > > > > +When loading xclbin xmgmt driver performs the following operations:
> > > > > > > +
> > > > > > > +1. Sanity check the xclbin contents 2. Isolate the User
> > > > > > > +partition 3. Download the bitstream using the FPGA config engine (ICAP) 4.
> > > > > > > +De-isolate the User partition
> > > > > > Is this modelled as bridges and regions?
> > > > >
> > > > > Alveo drivers as written today do not use the fpga bridge and region
> > > > > framework. It seems that if we add support for that framework, it would be
> > > > > possible to receive a PR program request from kernel code outside of the
> > > > > xmgmt driver? Currently, we can't support this and PR programming can only
> > > > > be initiated using XRT's runtime API in user space.
> > > >
> > > > I'm not 100% sure I understand the concern here, let me reply to what
> > > > I think I understand:
> > > >
> > > > You're worried that if you use FPGA region as interface to accept PR
> > > > requests something else could attempt to reconfigure the region from
> > > > within the kernel using the FPGA Region API?
> > > >
> > > > Assuming I got this right, I don't think this is a big deal. When you
> > > > create the regions you control who gets the references to it.
> > >
> > > Thanks for explaining. Yes, I think you got my point :-).
> > 
> > We can add code to make a region 'static' or 'one-time' or 'fixed'.
> > >
> > > >
> > > > From what I've seen so far Regions seem to be roughly equivalent to
> > > > Partitions, hence my surprise to see a new structure bypassing them.
> > >
> > > I see where the gap is.
> > >
> > > Regions in Linux are very different from the "partitions" we have defined in xmgmt. A region seems to be a
> > > software data structure representing an area on the FPGA that can be reprogrammed. This area is protected by the
> > > concept of a "bridge", which can be disabled before programming and re-enabled afterwards. And you go through the
> > > region when you need to reprogram this area.
> > 
> > Your central management driver can create / destroy regions at will. It
> > can keep them in a list, array or tree.
> > 
> > Regions can but don't have to have bridges.
> > 
> > If you need to go through the central driver to reprogram a region,
> > you can use that to figure out which region to program.
> 
> That sounds fine. I can create a region and call into it from xmgmt for
> PR programming. The region will then call xmgmt's fpga manager
> to program it.

It sounds closer than what I'd expect.
> 
> > >
> > > The "partition" is part of the main infrastructure of xmgmt driver, which represents a group of subdev drivers for each individual IP
> > > (HW subcomponents). Basically, xmgmt root driver is parent of several partitions who is, in turn, the parent of several subdev drivers.
> > > The parent manages the life cycle of its children here.
> > 
> > I don't see how this is conceptually different from what DFL does, and
> > they managed to use Regions and Bridges.
> > 
> > If things are missing in the framework, please add them instead of
> > rewriting an entire parallel framework.
> > 
> > >
> > > We do have a partition to represent the group of subdevs/IPs in the reprogrammable area. And we also have partitions
> > > representing other areas which cannot be reprogrammed. So, it is difficult to use "Region" to implement "partition".
> > 
> > You implement your regions callbacks, you can return -EINVAL / -ENOTTY
> > if you want to fail a reprogramming request to a static partion /
> > region.
> > 
> > > From what you have explained, it seems that even if I use region / bridge in xmgmt, we can still keep it private
> > > to xmgmt instead of exposing the interface to the outside world, which we can't support anyway? This means that
> > > the region would be used as an internal data structure for xmgmt. Since we can't simply replace partition with
> > > region, we might as well just use partition throughout the driver, instead of introducing two data structures and
> > > using them both in different places.
> > 
> > Think about your partition as an extension to a region that implements
> > what you need to do for your case of enumerating and reprogramming that
> > particular piece of your chip.
> 
> Yes, we can add a region / bridges to represent the PR area and use them in our
> code path for reprogramming the PR area. I think what we will do is to
> instantiate a region instance for the PR area and associate it with the
> FPGA manager in xmgmt for reprogramming it. We can also instantiate
> bridges and map the "ULP gate" subdev driver to them in xmgmt. Thus, we
> could incorporate the region and bridge data structures in xmgmt for PR
> reprogramming.

I'd need to take another look, but the ULP gate sounds like a bridge (or
close to it).
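
Roughly along these lines -- untested sketch, just to illustrate the
direction, and all of the xmgmt_* names below are made up:

#include <linux/fpga/fpga-bridge.h>
#include <linux/fpga/fpga-mgr.h>
#include <linux/fpga/fpga-region.h>

/* The ULP isolation gate modelled as a bridge. */
static int xmgmt_ulp_gate_enable_set(struct fpga_bridge *bridge, bool enable)
{
        /* toggle the ep_pr_isolate_ulp_00 gate here */
        return 0;
}

static const struct fpga_bridge_ops xmgmt_ulp_gate_ops = {
        .enable_set = xmgmt_ulp_gate_enable_set,
};

static int xmgmt_get_bridges(struct fpga_region *region)
{
        /* collect bridges registered under the same parent device */
        return fpga_bridge_get_to_list(region->dev.parent, region->info,
                                       &region->bridge_list);
}

/* The PR area modelled as a region bound to xmgmt's FPGA manager. */
int xmgmt_register_region(struct device *dev, struct fpga_manager *mgr)
{
        struct fpga_bridge *br;
        struct fpga_region *region;
        int ret;

        br = devm_fpga_bridge_create(dev, "xmgmt-ulp-gate",
                                     &xmgmt_ulp_gate_ops, NULL);
        if (!br)
                return -ENOMEM;

        ret = fpga_bridge_register(br);
        if (ret)
                return ret;

        region = devm_fpga_region_create(dev, mgr, xmgmt_get_bridges);
        if (!region)
                return -ENOMEM;

        return fpga_region_register(region);
}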
 
> This will be a non-trivial change for us. I'd like to confirm that this is what
> you are looking for before we start working on the change. Let us know :-).

I understand. It looks like the right direction. Let's discuss code when
we have code to look at.

It may take a couple of iterations to get it all sorted.

That's normal when you show up with that much code all at once :)

Cheers,
Moritz
diff mbox series

Patch

diff --git a/Documentation/fpga/index.rst b/Documentation/fpga/index.rst
index f80f95667ca2..30134357b70d 100644
--- a/Documentation/fpga/index.rst
+++ b/Documentation/fpga/index.rst
@@ -8,6 +8,7 @@  fpga
     :maxdepth: 1

     dfl
+    xrt

 .. only::  subproject and html

diff --git a/Documentation/fpga/xrt.rst b/Documentation/fpga/xrt.rst
new file mode 100644
index 000000000000..9f37d46459b0
--- /dev/null
+++ b/Documentation/fpga/xrt.rst
@@ -0,1 +1,588 @@ 
+==================================
+XRTV2 Linux Kernel Driver Overview
+==================================
+
+XRTV2 drivers are second-generation `XRT <https://github.com/Xilinx/XRT>`_ drivers that
+support `Alveo <https://www.xilinx.com/products/boards-and-kits/alveo.html>`_ PCIe platforms
+from Xilinx.
+
+XRTV2 drivers support *subsystem*-style data-driven platforms where the driver's configuration
+and behavior are determined by metadata provided by the platform (in *device tree* format).
+The primary management physical function (MPF) driver is called **xmgmt**. The primary user
+physical function (UPF) driver is called **xuser**, and the HW subsystem drivers are packaged into
+a library module called **xrt-lib**, which is shared by **xmgmt** and **xuser** (the latter is
+still work in progress).
+
+Alveo Platform Overview
+=======================
+
+Alveo platforms are architected as two physical FPGA partitions: *Shell* and *User*. The Shell
+provides basic infrastructure for the Alveo platform such as PCIe connectivity, board management,
+Dynamic Function Exchange (DFX), sensors, clocking, reset, and security. The User partition contains
+the user-compiled binary, which is loaded by a process called DFX, also known as partial
+reconfiguration.
+
+Physical partitions require strict HW compatibility with each other for DFX to work properly.
+Every physical partition has two interface UUIDs: a *parent* UUID and a *child* UUID. For simple
+single-stage platforms, Shell → User forms the parent-child relationship. For complex two-stage
+platforms, Base → Shell → User forms the parent-child relationship chain.
+
+.. note::
+   Partition compatibility matching is key design component of Alveo platforms and XRT. Partitions
+   have child and parent relationship. A loaded partition exposes child partition UUID to advertise
+   its compatibility requirement for child partition. When loading a child partition the xmgmt
+   management driver matches parent UUID of the child partition against child UUID exported by the
+   parent. Parent and child partition UUIDs are stored in the *xclbin* (for user) or *xsabin* (for
+   base and shell). Except for root UUID, VSEC, hardware itself does not know about UUIDs. UUIDs are
+   stored in xsabin and xclbin.
+
+
+The physical partitions and their loading are illustrated below::
+
+            SHELL                               USER
+        +-----------+                  +-------------------+
+        |           |                  |                   |
+        | VSEC UUID | CHILD     PARENT |    LOGIC UUID     |
+        |           o------->|<--------o                   |
+        |           | UUID       UUID  |                   |
+        +-----+-----+                  +--------+----------+
+              |                                 |
+              .                                 .
+              |                                 |
+          +---+---+                      +------+--------+
+          |  POR  |                      | USER COMPILED |
+          | FLASH |                      |    XCLBIN     |
+          +-------+                      +---------------+
+
+
+Loading Sequence
+----------------
+
+The Shell partition is loaded from flash at system boot time. It establishes the PCIe link and
+exposes two physical functions to the BIOS. After the OS boots, the xmgmt driver attaches to PCIe
+physical function 0 exposed by the Shell and then looks for the VSEC in PCIe extended configuration
+space. Using the VSEC it determines the logic UUID of the Shell and uses that UUID to load the
+matching *xsabin* file from the Linux firmware directory. The xsabin file contains the metadata
+needed to discover the peripherals that are part of the Shell and the firmware(s) for any embedded
+soft processors in the Shell.
+
+The Shell exports a child interface UUID which is used for a compatibility check when loading a
+user-compiled xclbin over the User partition as part of DFX. When a user requests loading of a
+specific xclbin, the xmgmt management driver reads the parent interface UUID specified in the xclbin
+and matches it with the child interface UUID exported by the Shell to determine whether the xclbin
+is compatible with the Shell. If the match fails, loading of the xclbin is denied.
+
+xclbin loading is requested using the ICAP_DOWNLOAD_AXLF ioctl command. When loading an xclbin,
+the xmgmt driver performs the following operations:
+
+1. Sanity check the xclbin contents
+2. Isolate the User partition
+3. Download the bitstream using the FPGA config engine (ICAP)
+4. De-isolate the User partition
+5. Program the clocks (ClockWiz) driving the User partition
+6. Wait for memory controller (MIG) calibration
+
+`Platform Loading Overview <https://xilinx.github.io/XRT/master/html/platforms_partitions.html>`_
+provides more detailed information on platform loading.
+
+xsabin
+------
+
+Each Alveo platform comes packaged with its own xsabin. The xsabin is a trusted component of the
+platform. For format details refer to :ref:`xsabin/xclbin Container Format`. The xsabin contains
+basic information like UUIDs, the platform name, and metadata in the form of a device tree. See
+:ref:`Device Tree Usage` for details and an example.
+
+xclbin
+------
+
+An xclbin is compiled by the end user using the
+`Vitis <https://www.xilinx.com/products/design-tools/vitis/vitis-platform.html>`_ tool set from
+Xilinx. The xclbin contains sections describing the user-compiled acceleration engines/kernels,
+memory subsystems, clocking information, etc. It also contains the bitstream for the User partition,
+UUIDs, the platform name, etc. xclbin uses the same container format as xsabin, which is described
+below.
+
+
+xsabin/xclbin Container Format
+------------------------------
+
+xclbin/xsabin is an ELF-like binary container format. It is structured as a series of sections:
+a file header followed by several section headers, which are in turn followed by the sections
+themselves. Each section header points to an actual section. There is an optional signature at the
+end. The format is defined by the header file ``xclbin.h``. The following figure illustrates a
+typical xclbin::
+
+
+          +---------------------+
+          |                     |
+          |       HEADER        |
+          +---------------------+
+          |   SECTION  HEADER   |
+          |                     |
+          +---------------------+
+          |         ...         |
+          |                     |
+          +---------------------+
+          |   SECTION  HEADER   |
+          |                     |
+          +---------------------+
+          |       SECTION       |
+          |                     |
+          +---------------------+
+          |         ...         |
+          |                     |
+          +---------------------+
+          |       SECTION       |
+          |                     |
+          +---------------------+
+          |      SIGNATURE      |
+          |      (OPTIONAL)     |
+          +---------------------+
+
+
+xclbin/xsabin files can be packaged, un-packaged and inspected using an XRT utility called
+**xclbinutil**. xclbinutil is part of the XRT open source software stack. The source code for
+xclbinutil can be found at https://github.com/Xilinx/XRT/tree/master/src/runtime_src/tools/xclbinutil
+
+For example, to enumerate the contents of an xclbin/xsabin, use the *--info* switch as shown
+below::
+
+  xclbinutil --info --input /opt/xilinx/firmware/u50/gen3x16-xdma/blp/test/bandwidth.xclbin
+  xclbinutil --info --input /lib/firmware/xilinx/862c7020a250293e32036f19956669e5/partition.xsabin
+
+
+Device Tree Usage
+-----------------
+
+As mentioned previously, the xsabin stores metadata which advertises the HW subsystems present in a
+partition. The metadata is stored in device tree format with a well defined schema. Subsystem
+instantiations are captured as children of the ``addressable_endpoints`` node. Subsystem nodes have
+standard attributes like ``reg``, ``interrupts``, etc. Additionally, the nodes have PCIe specific
+attributes, ``pcie_physical_function`` and ``pcie_bar_mapping``, which identify the PCIe physical
+function, and the BAR space within that function, where the subsystem resides. The XRT management
+driver uses this information to bind *platform drivers* to the subsystem instantiations. The
+platform drivers are found in the **xrt-lib.ko** kernel module described later. Below is an example
+device tree for the Alveo U50 platform, followed by a sketch of how these attributes can be read::
+
+  /dts-v1/;
+
+  /{
+	logic_uuid = "f465b0a3ae8c64f619bc150384ace69b";
+
+	schema_version {
+		major = <0x01>;
+		minor = <0x00>;
+	};
+
+	interfaces {
+
+		@0 {
+			interface_uuid = "862c7020a250293e32036f19956669e5";
+		};
+	};
+
+	addressable_endpoints {
+
+		ep_blp_rom_00 {
+			reg = <0x00 0x1f04000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
+		};
+
+		ep_card_flash_program_00 {
+			reg = <0x00 0x1f06000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_quad_spi-1.0\0axi_quad_spi";
+			interrupts = <0x03 0x03>;
+		};
+
+		ep_cmc_firmware_mem_00 {
+			reg = <0x00 0x1e20000 0x00 0x20000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
+
+			firmware {
+				firmware_product_name = "cmc";
+				firmware_branch_name = "u50";
+				firmware_version_major = <0x01>;
+				firmware_version_minor = <0x00>;
+			};
+		};
+
+		ep_cmc_intc_00 {
+			reg = <0x00 0x1e03000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_intc-1.0\0axi_intc";
+			interrupts = <0x04 0x04>;
+		};
+
+		ep_cmc_mutex_00 {
+			reg = <0x00 0x1e02000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
+		};
+
+		ep_cmc_regmap_00 {
+			reg = <0x00 0x1e08000 0x00 0x2000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
+
+			firmware {
+				firmware_product_name = "sc-fw";
+				firmware_branch_name = "u50";
+				firmware_version_major = <0x05>;
+			};
+		};
+
+		ep_cmc_reset_00 {
+			reg = <0x00 0x1e01000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
+		};
+
+		ep_ddr_mem_calib_00 {
+			reg = <0x00 0x63000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
+		};
+
+		ep_debug_bscan_mgmt_00 {
+			reg = <0x00 0x1e90000 0x00 0x10000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-debug_bridge-1.0\0debug_bridge";
+		};
+
+		ep_ert_base_address_00 {
+			reg = <0x00 0x21000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
+		};
+
+		ep_ert_command_queue_mgmt_00 {
+			reg = <0x00 0x40000 0x00 0x10000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-ert_command_queue-1.0\0ert_command_queue";
+		};
+
+		ep_ert_command_queue_user_00 {
+			reg = <0x00 0x40000 0x00 0x10000>;
+			pcie_physical_function = <0x01>;
+			compatible = "xilinx.com,reg_abs-ert_command_queue-1.0\0ert_command_queue";
+		};
+
+		ep_ert_firmware_mem_00 {
+			reg = <0x00 0x30000 0x00 0x8000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
+
+			firmware {
+				firmware_product_name = "ert";
+				firmware_branch_name = "v20";
+				firmware_version_major = <0x01>;
+			};
+		};
+
+		ep_ert_intc_00 {
+			reg = <0x00 0x23000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_intc-1.0\0axi_intc";
+			interrupts = <0x05 0x05>;
+		};
+
+		ep_ert_reset_00 {
+			reg = <0x00 0x22000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
+		};
+
+		ep_ert_sched_00 {
+			reg = <0x00 0x50000 0x00 0x1000>;
+			pcie_physical_function = <0x01>;
+			compatible = "xilinx.com,reg_abs-ert_sched-1.0\0ert_sched";
+			interrupts = <0x09 0x0c>;
+		};
+
+		ep_fpga_configuration_00 {
+			reg = <0x00 0x1e88000 0x00 0x8000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_hwicap-1.0\0axi_hwicap";
+			interrupts = <0x02 0x02>;
+		};
+
+		ep_icap_reset_00 {
+			reg = <0x00 0x1f07000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
+		};
+
+		ep_mailbox_mgmt_00 {
+			reg = <0x00 0x1f10000 0x00 0x10000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-mailbox-1.0\0mailbox";
+			interrupts = <0x00 0x00>;
+		};
+
+		ep_mailbox_user_00 {
+			reg = <0x00 0x1f00000 0x00 0x10000>;
+			pcie_physical_function = <0x01>;
+			compatible = "xilinx.com,reg_abs-mailbox-1.0\0mailbox";
+			interrupts = <0x08 0x08>;
+		};
+
+		ep_msix_00 {
+			reg = <0x00 0x00 0x00 0x20000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-msix-1.0\0msix";
+			pcie_bar_mapping = <0x02>;
+		};
+
+		ep_pcie_link_mon_00 {
+			reg = <0x00 0x1f05000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
+		};
+
+		ep_pr_isolate_plp_00 {
+			reg = <0x00 0x1f01000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
+		};
+
+		ep_pr_isolate_ulp_00 {
+			reg = <0x00 0x1000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_gpio-1.0\0axi_gpio";
+		};
+
+		ep_uuid_rom_00 {
+			reg = <0x00 0x64000 0x00 0x1000>;
+			pcie_physical_function = <0x00>;
+			compatible = "xilinx.com,reg_abs-axi_bram_ctrl-1.0\0axi_bram_ctrl";
+		};
+
+		ep_xdma_00 {
+			reg = <0x00 0x00 0x00 0x10000>;
+			pcie_physical_function = <0x01>;
+			compatible = "xilinx.com,reg_abs-xdma-1.0\0xdma";
+			pcie_bar_mapping = <0x02>;
+		};
+	};
+
+  }
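+
+For illustration, the ``addressable_endpoints`` attributes above can be walked from user space with
+stock libfdt. The sketch below only shows how the metadata is consumed; it is not the driver's
+implementation, and it assumes the device tree blob has already been extracted from the xsabin
+(for example with xclbinutil)::
+
+  /*
+   * Userspace illustration only: list the addressable_endpoints of a
+   * metadata blob, printing the PCIe physical function and BAR each
+   * subsystem lives in.  Build with: gcc endpoints.c -lfdt
+   */
+  #include <stdio.h>
+  #include <stdlib.h>
+  #include <libfdt.h>
+
+  int main(int argc, char **argv)
+  {
+          FILE *f;
+          long size;
+          void *fdt;
+          int parent, node;
+
+          if (argc != 2)
+                  return 1;
+          f = fopen(argv[1], "rb");
+          if (!f)
+                  return 1;
+          fseek(f, 0, SEEK_END);
+          size = ftell(f);
+          rewind(f);
+          fdt = malloc(size);
+          if (!fdt || fread(fdt, 1, size, f) != (size_t)size)
+                  return 1;
+          fclose(f);
+
+          if (fdt_check_header(fdt))
+                  return 1;
+
+          parent = fdt_path_offset(fdt, "/addressable_endpoints");
+          if (parent < 0)
+                  return 1;
+
+          fdt_for_each_subnode(node, fdt, parent) {
+                  const fdt32_t *pf = fdt_getprop(fdt, node,
+                                          "pcie_physical_function", NULL);
+                  const fdt32_t *bar = fdt_getprop(fdt, node,
+                                          "pcie_bar_mapping", NULL);
+
+                  /* pcie_bar_mapping is absent for most nodes above */
+                  printf("%-32s PF%u BAR%u\n",
+                         fdt_get_name(fdt, node, NULL),
+                         pf ? fdt32_to_cpu(*pf) : 0,
+                         bar ? fdt32_to_cpu(*bar) : 0);
+          }
+
+          free(fdt);
+          return 0;
+  }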
+
+
+
+Deployment Models
+=================
+
+Baremetal
+---------
+
+In bare-metal deployments both MPF and UPF are visible and accessible. The xmgmt driver binds to
+the MPF. xmgmt driver operations are privileged and available only to the system administrator. The
+full stack is illustrated below::
+
+
+                            HOST
+
+                 [XMGMT]            [XUSER]
+                    |                  |
+                    |                  |
+                 +-----+            +-----+
+                 | MPF |            | UPF |
+                 |     |            |     |
+                 | PF0 |            | PF1 |
+                 +--+--+            +--+--+
+          ......... ^................. ^..........
+                    |                  |
+                    |   PCIe DEVICE    |
+                    |                  |
+                 +--+------------------+--+
+                 |         SHELL          |
+                 |                        |
+                 +------------------------+
+                 |         USER           |
+                 |                        |
+                 |                        |
+                 |                        |
+                 |                        |
+                 +------------------------+
+
+
+
+Virtualized
+-----------
+
+In virtualized deployments the privileged MPF is assigned to the host while the unprivileged UPF
+is assigned to a guest VM via PCIe pass-through. The xmgmt driver in the host binds to the MPF.
+xmgmt driver operations are privileged and accessible only to the hosting service provider.
+The full stack is illustrated below::
+
+
+                                 .............
+                  HOST           .    VM     .
+                                 .           .
+                 [XMGMT]         .  [XUSER]  .
+                    |            .     |     .
+                    |            .     |     .
+                 +-----+         .  +-----+  .
+                 | MPF |         .  | UPF |  .
+                 |     |         .  |     |  .
+                 | PF0 |         .  | PF1 |  .
+                 +--+--+         .  +--+--+  .
+          ......... ^................. ^..........
+                    |                  |
+                    |   PCIe DEVICE    |
+                    |                  |
+                 +--+------------------+--+
+                 |         SHELL          |
+                 |                        |
+                 +------------------------+
+                 |         USER           |
+                 |                        |
+                 |                        |
+                 |                        |
+                 |                        |
+                 +------------------------+
+
+
+
+Driver Modules
+==============
+
+xrt-lib.ko
+----------
+
+A repository of all subsystem drivers and pure software modules that can potentially be shared
+between xmgmt and xuser. All of these drivers are structured as Linux *platform drivers* and are
+instantiated by xmgmt (or xuser in the future) based on metadata associated with the hardware. The
+metadata is in the form of a device tree, as explained before.
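+
+Conceptually the instantiation looks like the sketch below. This is not the actual xrt-lib/xmgmt
+code; the ``xrt_example_create_leaf`` helper and its parameters are made up for illustration and
+only show how a leaf described by the metadata could be turned into a platform device that a
+platform driver from xrt-lib.ko binds to::
+
+  /* Conceptual sketch only -- not the actual xrt-lib/xmgmt code. */
+  #include <linux/ioport.h>
+  #include <linux/platform_device.h>
+
+  static struct platform_device *
+  xrt_example_create_leaf(struct device *parent, int instance,
+                          const char *drv_name,
+                          resource_size_t start, resource_size_t len)
+  {
+          /* 'start'/'len' would come from the endpoint's 'reg' property,
+           * translated into the BAR named by 'pcie_bar_mapping'. */
+          struct resource res = DEFINE_RES_MEM(start, len);
+
+          return platform_device_register_resndata(parent, drv_name,
+                                                    instance, &res, 1,
+                                                    NULL, 0);
+  }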
+
+xmgmt.ko
+--------
+
+The xmgmt driver is a PCIe device driver driving the MPF found on Xilinx's Alveo PCIe devices. It
+consists of one *root* driver, one or more *partition* drivers and one or more *leaf* drivers. The
+root and the MPF-specific leaf drivers are in xmgmt.ko. The partition driver and the other leaf
+drivers are in xrt-lib.ko.
+
+The instantiation of a specific partition driver or leaf driver is completely data driven, based on
+metadata (mostly in device tree format) found through the VSEC capability and inside firmware files
+such as the xsabin or xclbin file. The root driver manages the life cycle of multiple partition
+drivers, which, in turn, manage multiple leaf drivers. This allows a single set of driver code to
+support all kinds of subsystems exposed by different shells. The differences among these subsystems
+are handled in the leaf drivers, with the root and partition drivers being part of the
+infrastructure that provides common services for all leaves found on all platforms.
+
+
+xmgmt-root
+^^^^^^^^^^
+
+The xmgmt-root driver is a PCIe device driver that attaches to the MPF. It's part of the
+infrastructure of the MPF driver and resides in xmgmt.ko. This driver
+
+* manages one or more partition drivers
+* provides access to functionality that requires the pci_dev, such as PCIe config
+  space access, to other leaf drivers through parent calls
+* together with the partition driver, facilitates event callbacks for other leaf drivers
+* together with the partition driver, facilitates inter-leaf driver calls for other leaf
+  drivers
+
+When the root driver starts, it explicitly creates an initial partition instance, which contains
+leaf drivers that trigger the creation of other partition instances. The root driver waits for all
+partitions and leaves to be created before it returns from its probe routine and claims success
+for the initialization of the entire xmgmt driver.
+
+partition
+^^^^^^^^^
+
+The partition driver is a platform device driver whose life cycle is managed by the root and which
+does not have real IO memory or IRQ resources. It's part of the infrastructure of the MPF driver
+and resides in xrt-lib.ko. This driver
+
+* manages one or more leaf drivers so that multiple leaves can be managed as a group
+* provides access to the root from the leaves, so that parent calls, event notifications
+  and inter-leaf calls can happen
+
+In xmgmt, an initial partition driver instance is created by the root; it contains leaves that
+trigger the creation of further partition instances to manage the groups of leaves found in the
+different partitions of the hardware, such as VSEC, Shell, and User.
+
+leaves
+^^^^^^
+
+The leaf driver is a platform device driver whose life cycle is managed by a partition driver and
+which may or may not have real IO memory or IRQ resources. Leaf drivers are the real meat of xmgmt
+and contain the platform specific code for the Shell and User partitions found on an MPF.
+
+A leaf driver may not have real hardware resources when it merely acts as a driver that manages
+certain in-memory state for xmgmt. This in-memory state can be shared by multiple other leaves.
+
+Leaf drivers assigned to specific hardware resources drive a specific subsystem in the device. To
+manipulate the subsystem or carry out a task, a leaf driver may ask for help from the root via
+parent calls and/or from other leaves via inter-leaf calls.
+
+A leaf can also broadcast events through infrastructure code for other leaves to process. It can
+also receive event notifications from the infrastructure about certain events, such as the
+post-creation or pre-exit of a particular leaf.
+
+
+Driver Interfaces
+=================
+
+xmgmt Driver Ioctls
+-------------------
+
+Ioctls exposed by the xmgmt driver to user space are enumerated in the following table:
+
+== ===================== ============================= ===========================
+#  Functionality         ioctl request code            data format
+== ===================== ============================= ===========================
+1  FPGA image download   XMGMT_IOCICAPDOWNLOAD_AXLF    xmgmt_ioc_bitstream_axlf
+2  CL frequency scaling  XMGMT_IOCFREQSCALE            xmgmt_ioc_freqscaling
+== ===================== ============================= ===========================
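+
+A user-space request could look roughly like the sketch below. The UAPI header name and the
+``xclbin`` member of ``struct xmgmt_ioc_bitstream_axlf`` are assumptions made for illustration;
+consult the driver's UAPI header for the authoritative definitions::
+
+  #include <fcntl.h>
+  #include <stdio.h>
+  #include <stdlib.h>
+  #include <sys/ioctl.h>
+  #include <unistd.h>
+
+  #include "xmgmt-ioctl.h"        /* assumed UAPI header name */
+
+  int main(int argc, char **argv)
+  {
+          struct xmgmt_ioc_bitstream_axlf req = { 0 };
+          FILE *f;
+          long sz;
+          int fd, ret;
+
+          if (argc != 3) {
+                  fprintf(stderr, "usage: %s <mgmt-node> <xclbin>\n", argv[0]);
+                  return 1;
+          }
+
+          /* slurp the xclbin file into memory */
+          f = fopen(argv[2], "rb");
+          if (!f)
+                  return 1;
+          fseek(f, 0, SEEK_END);
+          sz = ftell(f);
+          rewind(f);
+          req.xclbin = malloc(sz);        /* 'xclbin' member name assumed */
+          if (!req.xclbin || fread(req.xclbin, 1, sz, f) != (size_t)sz)
+                  return 1;
+          fclose(f);
+
+          fd = open(argv[1], O_RDWR);     /* the xmgmt character device node */
+          if (fd < 0)
+                  return 1;
+
+          ret = ioctl(fd, XMGMT_IOCICAPDOWNLOAD_AXLF, &req);
+          if (ret)
+                  perror("XMGMT_IOCICAPDOWNLOAD_AXLF");
+
+          close(fd);
+          free(req.xclbin);
+          return 0;
+  }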
+
+xmgmt Driver Sysfs
+------------------
+
+The xmgmt driver exposes a rich set of sysfs interfaces. Subsystem platform drivers export a sysfs
+node for every platform instance.
+
+Every partition also exports its UUIDs. See below for examples::
+
+  /sys/bus/pci/devices/0000:06:00.0/xmgmt_main.0/interface_uuids
+  /sys/bus/pci/devices/0000:06:00.0/xmgmt_main.0/logic_uuids
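+
+These sysfs nodes are plain text; for example, a minimal C reader for the interface UUIDs at the
+path shown above could look like::
+
+  #include <stdio.h>
+
+  int main(void)
+  {
+          char line[128];
+          FILE *f = fopen("/sys/bus/pci/devices/0000:06:00.0/"
+                          "xmgmt_main.0/interface_uuids", "r");
+
+          if (!f)
+                  return 1;
+          while (fgets(line, sizeof(line), f))
+                  fputs(line, stdout);
+          fclose(f);
+          return 0;
+  }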
+
+
+hwmon
+-----
+
+The xmgmt driver exposes the standard hwmon interface to report voltage, current, temperature,
+power, etc. These can easily be viewed using the *sensors* command line utility.
+
+
+mailbox
+-------
+
+xmgmt communicates with the user physical function driver via a HW mailbox. Mailbox opcodes are
+defined in ``mailbox_proto.h``. The `Mailbox Inter-domain Communication Protocol
+<https://xilinx.github.io/XRT/master/html/mailbox.proto.html>`_ defines the full specification.
+xmgmt implements a subset of the specification. It provides the following services to the UPF
+driver:
+
+1.  Responding to *are you there* requests, including determining whether the two drivers are
+    running in the same OS domain
+2.  Providing sensor readings, the loaded xclbin UUID, clock frequencies, shell information, etc.
+3.  Performing PCIe hot reset
+4.  Downloading the user compiled xclbin
+
+
+Platform Security Considerations
+================================
+
+`Security of Alveo Platform <https://xilinx.github.io/XRT/master/html/security.html>`_
+discusses the deployment options and security implications in great detail.
--