[v9,0/7] iommu/arm-smmu: Enable split pagetable support
mbox series

Message ID 20200626200042.13713-1-jcrouse@codeaurora.org
Headers show
Series
  • iommu/arm-smmu: Enable split pagetable support
Related show

Message

Jordan Crouse June 26, 2020, 8 p.m. UTC
Another iteration of the split-pagetable support for arm-smmu and the Adreno GPU
SMMU. After email discussions [1] we opted to make a arm-smmu implementation for
specifically for the Adreno GPU and use that to enable split pagetable support
and later other implementation specific bits that we need.

On the hardware side this is very close to the same code from before [2] only
the TTBR1 quirk is turned on by the implementation and not a domain attribute.
In drm/msm we use the returned size of the aperture as a clue to let us know
which virtual address space we should use for global memory objects.

There are two open items that you should be aware of. First, in the
implementation specific code we have to check the compatible string of the
device so that we only enable TTBR1 for the GPU (SID 0) and not the GMU (SID 4).
I went back and forth trying to decide if I wanted to use the compatible string
or the SID as the filter and settled on the compatible string but I could be
talked out of it.

The other open item is that in drm/msm the hardware only uses 49 bits of the
address space but arm-smmu expects the address to be sign extended all the way
to 64 bits. This isn't a problem normally unless you look at the hardware
registers that contain a IOVA and then the upper bits will be zero. I opted to
restrict the internal drm/msm IOVA range to only 49 bits and then sign extend
right before calling iommu_map / iommu_unmap. This is a bit wonky but I thought
that matching the hardware would be less confusing when debugging a hang.

v9: Fix bot-detected merge conflict
v7: Add attached device to smmu_domain to pass to implementation specific
functions

[1] https://lists.linuxfoundation.org/pipermail/iommu/2020-May/044537.html
[2] https://patchwork.kernel.org/patch/11482591/


Jordan Crouse (7):
  iommu/arm-smmu: Pass io-pgtable config to implementation specific
    function
  iommu/arm-smmu: Add support for split pagetables
  dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU
  iommu/arm-smmu: Add a pointer to the attached device to smmu_domain
  iommu/arm-smmu: Add implementation for the adreno GPU SMMU
  drm/msm: Set the global virtual address range from the IOMMU domain
  arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU

 .../devicetree/bindings/iommu/arm,smmu.yaml   |  4 ++
 arch/arm64/boot/dts/qcom/sdm845.dtsi          |  2 +-
 drivers/gpu/drm/msm/adreno/adreno_gpu.c       | 13 +++++-
 drivers/gpu/drm/msm/msm_iommu.c               |  7 +++
 drivers/iommu/arm-smmu-impl.c                 |  6 ++-
 drivers/iommu/arm-smmu-qcom.c                 | 45 ++++++++++++++++++-
 drivers/iommu/arm-smmu.c                      | 38 +++++++++++-----
 drivers/iommu/arm-smmu.h                      | 30 ++++++++++---
 8 files changed, 120 insertions(+), 25 deletions(-)

Comments

Sai Prakash Ranjan July 1, 2020, 10:11 a.m. UTC | #1
Hi Will, Robin,

On 2020-06-27 01:30, Jordan Crouse wrote:
> Another iteration of the split-pagetable support for arm-smmu and the 
> Adreno GPU
> SMMU. After email discussions [1] we opted to make a arm-smmu 
> implementation for
> specifically for the Adreno GPU and use that to enable split pagetable 
> support
> and later other implementation specific bits that we need.
> 
> On the hardware side this is very close to the same code from before 
> [2] only
> the TTBR1 quirk is turned on by the implementation and not a domain 
> attribute.
> In drm/msm we use the returned size of the aperture as a clue to let us 
> know
> which virtual address space we should use for global memory objects.
> 
> There are two open items that you should be aware of. First, in the
> implementation specific code we have to check the compatible string of 
> the
> device so that we only enable TTBR1 for the GPU (SID 0) and not the GMU 
> (SID 4).
> I went back and forth trying to decide if I wanted to use the 
> compatible string
> or the SID as the filter and settled on the compatible string but I 
> could be
> talked out of it.
> 
> The other open item is that in drm/msm the hardware only uses 49 bits 
> of the
> address space but arm-smmu expects the address to be sign extended all 
> the way
> to 64 bits. This isn't a problem normally unless you look at the 
> hardware
> registers that contain a IOVA and then the upper bits will be zero. I 
> opted to
> restrict the internal drm/msm IOVA range to only 49 bits and then sign 
> extend
> right before calling iommu_map / iommu_unmap. This is a bit wonky but I 
> thought
> that matching the hardware would be less confusing when debugging a 
> hang.
> 
> v9: Fix bot-detected merge conflict
> v7: Add attached device to smmu_domain to pass to implementation 
> specific
> functions
> 
> [1] 
> https://lists.linuxfoundation.org/pipermail/iommu/2020-May/044537.html
> [2] https://patchwork.kernel.org/patch/11482591/
> 
> 
> Jordan Crouse (7):
>   iommu/arm-smmu: Pass io-pgtable config to implementation specific
>     function
>   iommu/arm-smmu: Add support for split pagetables
>   dt-bindings: arm-smmu: Add compatible string for Adreno GPU SMMU
>   iommu/arm-smmu: Add a pointer to the attached device to smmu_domain
>   iommu/arm-smmu: Add implementation for the adreno GPU SMMU
>   drm/msm: Set the global virtual address range from the IOMMU domain
>   arm: dts: qcom: sm845: Set the compatible string for the GPU SMMU
> 
>  .../devicetree/bindings/iommu/arm,smmu.yaml   |  4 ++
>  arch/arm64/boot/dts/qcom/sdm845.dtsi          |  2 +-
>  drivers/gpu/drm/msm/adreno/adreno_gpu.c       | 13 +++++-
>  drivers/gpu/drm/msm/msm_iommu.c               |  7 +++
>  drivers/iommu/arm-smmu-impl.c                 |  6 ++-
>  drivers/iommu/arm-smmu-qcom.c                 | 45 ++++++++++++++++++-
>  drivers/iommu/arm-smmu.c                      | 38 +++++++++++-----
>  drivers/iommu/arm-smmu.h                      | 30 ++++++++++---
>  8 files changed, 120 insertions(+), 25 deletions(-)

Any chance reviewing this?

Thanks,
Sai