Message ID | 1664919839-27149-1-git-send-email-lizhi.hou@amd.com (mailing list archive) |
---|---|
Series | xilinx XDMA driver |
On 04. 10. 22 23:43, Lizhi Hou wrote:
> Hello,
>
> This V6 of patch series is to provide the platform driver to support the
> Xilinx XDMA subsystem. The XDMA subsystem is used in conjunction with the
> PCI Express IP block to provide high performance data transfer between host
> memory and the card's DMA subsystem. It also provides up to 16 user
> interrupt wires to user logic that generate interrupts to the host.
>
>             +-------+       +-------+       +-----------+
>    PCIe     |       |       |       |       |           |
>    Tx/Rx    |       |       |       |  AXI  |           |
>  <=======>  | PCIE  | <===> | XDMA  | <====>| User Logic|
>             |       |       |       |       |           |
>             +-------+       +-------+       +-----------+
>
> The XDMA has been used for Xilinx Alveo PCIe devices.
> And it is also integrated into Versal ACAP DMA and Bridge Subsystem.
>     https://www.xilinx.com/products/boards-and-kits/alveo.html
>     https://docs.xilinx.com/r/en-US/pg344-pcie-dma-versal/Introduction-to-the-DMA-and-Bridge-Subsystems
>
> The device driver for any FPGA based PCIe device which leverages XDMA can
> call the standard dmaengine APIs to discover and use the XDMA subsystem
> without duplicating the XDMA driver code in its own driver.
>
> Changes since v5:
> - Modified user logic interrupt APIs to handle user logic IP which does not
>   have its own register to enable/disable interrupt.
> - Clean up code based on review comments.
>
> Changes since v4:
> - Modified user logic interrupt APIs.
>
> Changes since v3:
> - Added one patch to support user logic interrupt.
>
> Changes since v2:
> - Removed tasklet.
> - Fixed regression bug introduced to V2.
> - Test Robot warning.
>
> Changes since v1:
> - Moved filling hardware descriptor to xdma_prep_device_sg().
> - Changed hardware descriptor enum to "struct xdma_hw_desc".
> - Minor changes from code review comments.
>
> Lizhi Hou (2):
>   dmaengine: xilinx: xdma: Add xilinx xdma driver
>   dmaengine: xilinx: xdma: Add user logic interrupt support
>
>  MAINTAINERS                            |   11 +
>  drivers/dma/Kconfig                    |   13 +
>  drivers/dma/xilinx/Makefile            |    1 +
>  drivers/dma/xilinx/xdma-regs.h         |  171 ++++
>  drivers/dma/xilinx/xdma.c              | 1034 ++++++++++++++++++++++++
>  include/linux/dma/amd_xdma.h           |   16 +
>  include/linux/platform_data/amd_xdma.h |   34 +
>  7 files changed, 1280 insertions(+)
>  create mode 100644 drivers/dma/xilinx/xdma-regs.h
>  create mode 100644 drivers/dma/xilinx/xdma.c
>  create mode 100644 include/linux/dma/amd_xdma.h
>  create mode 100644 include/linux/platform_data/amd_xdma.h
>

Hi,

I have rewritten our V4L2 driver to use this new XDMA driver, but it
does not work on our HW (where the previous Xilinx XDMA driver derived
from the Xilinx sample code worked fine). The driver is successfully
loaded and 4 (2+2) DMA channels are successfully created. But when a
DMA transfer is initiated, I get an error from my PC's DMA chip:

AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000a address=0x36a00000 flags=0x0000]

and no error from XDMA.

Does the driver expect some special FPGA IP core configuration? Or is
there something else I'm missing? My code is quite similar to what you
use in your XRT repo on GitHub (there is, by the way, a bug in the XRT
code: you do not clear the dma_slave_config struct before using it), but
in my case the DMA transfer triggers the AMD-Vi error and times out.

The code of our driver is attached; the relevant parts are in mgb4_dma.c
and mgb4_core.c.

M.
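The remark above about not clearing the dma_slave_config struct can be illustrated with a minimal userspace sketch. The struct and function below are hypothetical stand-ins (the real struct dma_slave_config lives in the kernel and has more fields); the sketch only shows why a config struct on the stack should be zeroed before selected fields are set, so that a driver reading the other fields does not see leftover stack contents.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a few fields of the kernel's struct dma_slave_config. */
enum demo_direction { DEMO_MEM_TO_DEV = 1, DEMO_DEV_TO_MEM = 2 };

struct demo_slave_config {
	enum demo_direction direction;
	uint64_t src_addr;
	uint64_t dst_addr;
	uint32_t src_maxburst;
	uint32_t dst_maxburst;
};

/* Fill a config for a device-to-memory transfer (illustrative helper, not a kernel API). */
void demo_fill_read_config(struct demo_slave_config *cfg, uint64_t src_addr)
{
	/* Clear everything first so fields we do not set are zero, not garbage. */
	memset(cfg, 0, sizeof(*cfg));
	cfg->direction = DEMO_DEV_TO_MEM;
	cfg->src_addr = src_addr;
}
```

In kernel code the same effect is usually obtained with an empty initializer or a memset() on the real struct before it is passed to dmaengine_slave_config().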
On 10/6/22 09:37, Martin Tůma wrote:
> Hi,
> I have rewritten our V4L2 driver to use this new XDMA driver, but it
> does not work on our HW (where the previous Xilinx XDMA driver derived
> from the Xilinx sample code worked fine). The driver is successfully
> loaded and 4 (2+2) DMA channels are successfully created. But when a
> DMA transfer is initiated, I get an error from my PC's DMA chip:
>
> AMD-Vi: Event logged [IO_PAGE_FAULT domain=0x000a address=0x36a00000 flags=0x0000]
>
> and no error from XDMA.
>
> Does the driver expect some special FPGA IP core configuration? Or is
> there something else I'm missing? My code is quite similar to what you
> use in your XRT repo on GitHub (there is, by the way, a bug in the XRT
> code: you do not clear the dma_slave_config struct before using it), but
> in my case the DMA transfer triggers the AMD-Vi error and times out.
>
> The code of our driver is attached; the relevant parts are in mgb4_dma.c
> and mgb4_core.c.
>
> M.

Hi Martin,

Thanks for trying this and getting a lot of things working. I have read
your driver. Could you call pci_map_sg() before calling prep_sg() (and
pci_unmap_sg() after the transfer completes)?

example:
https://github.com/houlz0507/XRT-1/blob/xdma_v4_usage/src/runtime_src/core/pcie/driver/linux/xocl/subdev/xdma.c#L103

Thanks,

Lizhi
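The question about mapping the scatterlist before the transfer relates to how an IOMMU treats device accesses: an address the device uses must be covered by a live mapping, otherwise the access is rejected and logged (on AMD systems as an IO_PAGE_FAULT event like the one reported above). The following is a toy userspace model of that rule, not kernel code; every name in it is hypothetical, and it makes no claim about the actual cause of the fault in this thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAX_MAPPINGS 8

/* Hypothetical toy types; not the kernel's IOMMU or DMA API. */
struct demo_mapping {
	uint64_t base;
	uint64_t len;
	bool used;
};

struct demo_iommu {
	struct demo_mapping map[DEMO_MAX_MAPPINGS];
};

/* Make [base, base + len) visible to the device; false if len is 0 or the table is full. */
bool demo_map(struct demo_iommu *mmu, uint64_t base, uint64_t len)
{
	if (len == 0)
		return false;
	for (size_t i = 0; i < DEMO_MAX_MAPPINGS; i++) {
		if (!mmu->map[i].used) {
			mmu->map[i] = (struct demo_mapping){ base, len, true };
			return true;
		}
	}
	return false;
}

/* Remove any mapping that starts at base. */
void demo_unmap(struct demo_iommu *mmu, uint64_t base)
{
	for (size_t i = 0; i < DEMO_MAX_MAPPINGS; i++) {
		if (mmu->map[i].used && mmu->map[i].base == base)
			mmu->map[i].used = false;
	}
}

/* True only if the whole access lies inside one live mapping. */
bool demo_device_access(const struct demo_iommu *mmu, uint64_t addr, uint64_t len)
{
	for (size_t i = 0; i < DEMO_MAX_MAPPINGS; i++) {
		const struct demo_mapping *m = &mmu->map[i];

		if (m->used && addr >= m->base && len <= m->len &&
		    addr - m->base <= m->len - len)
			return true;
	}
	return false; /* the toy equivalent of an IO_PAGE_FAULT */
}
```

In a real driver the mapping step is done through the kernel DMA API (for example dma_map_sg() and dma_unmap_sg()), and the device must be programmed with the DMA addresses that the mapping returns rather than with CPU physical addresses.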
On 06. 10. 22 19:42, Lizhi Hou wrote:
> Hi Martin,
>
> Thanks for trying this and getting a lot of things working. I have read
> your driver. Could you call pci_map_sg() before calling prep_sg() (and
> pci_unmap_sg() after the transfer completes)?
>
> example:
> https://github.com/houlz0507/XRT-1/blob/xdma_v4_usage/src/runtime_src/core/pcie/driver/linux/xocl/subdev/xdma.c#L103
>
> Thanks,
>
> Lizhi
>

Hi,

That's not the problem; the sg is already mapped by V4L2 videobuf-dma-sg
(https://elixir.bootlin.com/linux/v6.0/source/drivers/media/v4l2-core/videobuf-dma-sg.c#L285).
With the original XDMA driver I was also not mapping the sg and it
worked fine. And yes, I have tried it anyway and it didn't help ;-)

So there must be some other problem in your XDMA driver.

M.
On 10/7/22 06:21, Martin Tůma wrote:
> Hi,
> That's not the problem; the sg is already mapped by V4L2 videobuf-dma-sg
> (https://elixir.bootlin.com/linux/v6.0/source/drivers/media/v4l2-core/videobuf-dma-sg.c#L285).
> With the original XDMA driver I was also not mapping the sg and it
> worked fine. And yes, I have tried it anyway and it didn't help ;-)
>
> So there must be some other problem in your XDMA driver.
>
> M.

Hi Martin,

I got an AMD server and debugged this issue, and there is indeed an XDMA
driver issue. My DMA test passed on this server after fixing it. I will
post v7 patches, which will hopefully fix the issue you have seen as
well.

Thanks for trying the driver.

Lizhi
On 11. 10. 22 17:27, Lizhi Hou wrote:
> Hi Martin,
>
> I got an AMD server and debugged this issue, and there is indeed an XDMA
> driver issue. My DMA test passed on this server after fixing it. I will
> post v7 patches, which will hopefully fix the issue you have seen as
> well.
>
> Thanks for trying the driver.
>
> Lizhi
>

Hi,

Even the newest version (v7) still does not work with our card. The
error is the same.

M.