[v2,00/13] drm/tegra: Fix IOVA space on Tegra186 and later

Message ID 20190124180254.20080-1-thierry.reding@gmail.com (mailing list archive)

Message

Thierry Reding Jan. 24, 2019, 6:02 p.m. UTC
From: Thierry Reding <treding@nvidia.com>

Tegra186 and later are different from earlier generations in that they
use an ARM SMMU rather than the Tegra SMMU. The ARM SMMU driver behaves
slightly differently in that the geometry of an IOMMU domain is set only
after a device has been attached to it. This is to make sure that the SMMU
instance that the domain belongs to is known, because each instance can
have a different input address space (i.e. geometry).

Work around this by moving all IOVA allocations to a point where the
geometry of the domain is properly initialized.
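
For illustration, here is a minimal sketch of the ordering this implies
(the helper name is made up and not part of the series; error handling
is abbreviated): the aperture can only be read once iommu_attach_device()
has succeeded, and the IOVA space is set up from it afterwards.

/*
 * Editorial sketch (not from the series): with the ARM SMMU, the domain
 * geometry is only known once a device has been attached, so the IOVA
 * range must be derived after iommu_attach_device().
 */
#include <linux/bitops.h>
#include <linux/device.h>
#include <linux/iommu.h>
#include <linux/iova.h>

static int example_setup_iova(struct device *dev, struct iommu_domain *domain,
			      struct iova_domain *iovad)
{
	dma_addr_t start, end;
	unsigned long order;
	int err;

	err = iommu_attach_device(domain, dev);
	if (err < 0)
		return err;

	/* the geometry is only valid after the attach with the ARM SMMU */
	start = domain->geometry.aperture_start;
	end = domain->geometry.aperture_end;

	/* track IOVA allocations at the domain's page granularity */
	order = __ffs(domain->pgsize_bitmap);
	init_iova_domain(iovad, 1UL << order, start >> order);

	dev_dbg(dev, "IOVA space: %pad-%pad\n", &start, &end);
	return 0;
}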

This second version of the series addresses all review comments and adds
a number of patches that will actually allow host1x to work with an SMMU
enabled on Tegra186. The patches also add the programming required to
address the full 40 bits of the address space.

This supersedes the following patch:

    https://patchwork.kernel.org/patch/10775579/

Thierry

Thierry Reding (13):
  gpu: host1x: Set up stream ID table
  gpu: host1x: Program the channel stream ID
  gpu: host1x: Support 40-bit addressing
  gpu: host1x: Use direct DMA with IOMMU API usage
  gpu: host1x: Restrict IOVA space to DMA mask
  gpu: host1x: Support 40-bit addressing on Tegra186
  drm/tegra: Store parent pointer in Tegra DRM clients
  drm/tegra: vic: Load firmware on demand
  drm/tegra: Setup shared IOMMU domain after initialization
  drm/tegra: Restrict IOVA space to DMA mask
  drm/tegra: vic: Do not clear driver data
  drm/tegra: vic: Support stream ID register programming
  arm64: tegra: Enable SMMU for VIC on Tegra186

 arch/arm64/boot/dts/nvidia/tegra186.dtsi    |  1 +
 drivers/gpu/drm/tegra/drm.c                 | 57 +++++++++-------
 drivers/gpu/drm/tegra/drm.h                 |  1 +
 drivers/gpu/drm/tegra/vic.c                 | 75 ++++++++++++++++-----
 drivers/gpu/drm/tegra/vic.h                 |  9 +++
 drivers/gpu/host1x/cdma.c                   | 11 ++-
 drivers/gpu/host1x/dev.c                    | 49 ++++++++++++--
 drivers/gpu/host1x/dev.h                    |  8 +++
 drivers/gpu/host1x/hw/cdma_hw.c             | 13 +++-
 drivers/gpu/host1x/hw/channel_hw.c          | 40 +++++++++--
 drivers/gpu/host1x/hw/host1x06_hardware.h   |  6 ++
 drivers/gpu/host1x/hw/host1x07_hardware.h   |  6 ++
 drivers/gpu/host1x/hw/hw_host1x06_channel.h | 11 +++
 drivers/gpu/host1x/hw/hw_host1x07_channel.h | 11 +++
 14 files changed, 241 insertions(+), 57 deletions(-)
 create mode 100644 drivers/gpu/host1x/hw/hw_host1x06_channel.h
 create mode 100644 drivers/gpu/host1x/hw/hw_host1x07_channel.h

Comments

Dmitry Osipenko Jan. 24, 2019, 9:38 p.m. UTC | #1
24.01.2019 21:02, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Tegra186 and later are different from earlier generations in that they
> use an ARM SMMU rather than the Tegra SMMU. The ARM SMMU driver behaves
> slightly differently in that the geometry for IOMMU domains is set only
> after a device was attached to it. This is to make sure that the SMMU
> instance that the domain belongs to is known, because each instance can
> have a different input address space (i.e. geometry).
> 
> Work around this by moving all IOVA allocations to a point where the
> geometry of the domain is properly initialized.
> 
> This second version of the series addresses all review comments and adds
> a number of patches that will actually allow host1x to work with an SMMU
> enabled on Tegra186. The patches also add programming required to
> address the full 40 bits of address space.
> 
> This supersedes the following patch:
> 
>     https://patchwork.kernel.org/patch/10775579/

My understanding is that the Falcon won't boot because the source DMA address of the firmware isn't set up correctly if the address is wider than 32 bits. Please correct me if I'm wrong; otherwise this needs to be addressed in this series as well for completeness.
Dmitry Osipenko Jan. 24, 2019, 9:53 p.m. UTC | #2
24.01.2019 21:02, Thierry Reding wrote:
> From: Thierry Reding <treding@nvidia.com>
> 
> Tegra186 and later are different from earlier generations in that they
> use an ARM SMMU rather than the Tegra SMMU. The ARM SMMU driver behaves
> slightly differently in that the geometry for IOMMU domains is set only
> after a device was attached to it. This is to make sure that the SMMU
> instance that the domain belongs to is known, because each instance can
> have a different input address space (i.e. geometry).
> 
> Work around this by moving all IOVA allocations to a point where the
> geometry of the domain is properly initialized.
> 
> This second version of the series addresses all review comments and adds
> a number of patches that will actually allow host1x to work with an SMMU
> enabled on Tegra186. The patches also add programming required to
> address the full 40 bits of address space.
> 
> This supersedes the following patch:
> 
>     https://patchwork.kernel.org/patch/10775579/

Secondly, it seems there are additional restrictions for host1x jobs on T186; at least the T186 TRM suggests so. In particular, it looks like each client is hardwired to a specific syncpoint and to a specific channel. Or maybe the assumption is that the upstream kernel can only work in hypervisor mode or with all protections disabled. Could you please clarify?
Mikko Perttunen Jan. 25, 2019, 8:57 a.m. UTC | #3
On 24.1.2019 23.53, Dmitry Osipenko wrote:
> 24.01.2019 21:02, Thierry Reding wrote:
>> From: Thierry Reding <treding@nvidia.com>
>>
>> Tegra186 and later are different from earlier generations in that they
>> use an ARM SMMU rather than the Tegra SMMU. The ARM SMMU driver behaves
>> slightly differently in that the geometry for IOMMU domains is set only
>> after a device was attached to it. This is to make sure that the SMMU
>> instance that the domain belongs to is known, because each instance can
>> have a different input address space (i.e. geometry).
>>
>> Work around this by moving all IOVA allocations to a point where the
>> geometry of the domain is properly initialized.
>>
>> This second version of the series addresses all review comments and adds
>> a number of patches that will actually allow host1x to work with an SMMU
>> enabled on Tegra186. The patches also add programming required to
>> address the full 40 bits of address space.
>>
>> This supersedes the following patch:
>>
>>      https://patchwork.kernel.org/patch/10775579/
> 
> Secondly, seems there are additional restrictions for the host1x jobs on T186, at least T186 TRM suggests so. In particular looks like each client is hardwired to a specific sync point and to a specific channel. Or maybe there is assumption that upstream kernel could work only in a hypervisor mode or with all protections disable. Could you please clarify?
> 

There are no such syncpoint/channel restrictions. The upstream driver 
indeed currently only supports the case where there is no "hypervisor" 
(that is, a server process that allocates host1x resources) running and 
the kernel has access to the Host1x COMMON/"hypervisor" register 
aperture.

Adding support for the situation where this is not the case shouldn't be 
very difficult, but we currently don't have any upstream platforms where 
the Host1x server exists (it's only there on automotive platforms).

Mikko
Thierry Reding Jan. 25, 2019, 9:23 a.m. UTC | #4
On Fri, Jan 25, 2019 at 12:38:01AM +0300, Dmitry Osipenko wrote:
> 24.01.2019 21:02, Thierry Reding wrote:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > Tegra186 and later are different from earlier generations in that they
> > use an ARM SMMU rather than the Tegra SMMU. The ARM SMMU driver behaves
> > slightly differently in that the geometry for IOMMU domains is set only
> > after a device was attached to it. This is to make sure that the SMMU
> > instance that the domain belongs to is known, because each instance can
> > have a different input address space (i.e. geometry).
> > 
> > Work around this by moving all IOVA allocations to a point where the
> > geometry of the domain is properly initialized.
> > 
> > This second version of the series addresses all review comments and adds
> > a number of patches that will actually allow host1x to work with an SMMU
> > enabled on Tegra186. The patches also add programming required to
> > address the full 40 bits of address space.
> > 
> > This supersedes the following patch:
> > 
> >     https://patchwork.kernel.org/patch/10775579/
> 
> My understanding is that falcon won't boot because source DMA address
> of the firmware isn't set up correctly if the address is >32bit.
> Please correct me if I'm wrong, otherwise that need to be addressed in
> this series as well for completeness.

What makes you say so? I was runtime testing this series as I was
developing these patches and this works properly on Tegra186 with a
40-bit address space. Since the carveout is allocated from the top of
the IOMMU aperture, the addresses that the Falcon sees are always from
the top of the 40-bit IOVA space and this works flawlessly.
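
For readers following along, a rough editorial sketch of the layout
being described (the carveout size and the struct/function names are
assumptions made for illustration, not taken from the series):

#include <linux/iommu.h>
#include <linux/sizes.h>
#include <linux/types.h>

/* assumed size, for illustration only */
#define EXAMPLE_CARVEOUT_SZ	SZ_4M

struct example_layout {
	dma_addr_t gem_start, gem_end;
	dma_addr_t carveout_start, carveout_end;
};

/*
 * GEM allocations get the bottom of the domain's aperture, while the
 * firmware carveout is cut from the very top, so on a 40-bit aperture
 * the Falcon is always handed IOVAs just below 1 << 40.
 */
static void example_split_aperture(struct iommu_domain *domain,
				   struct example_layout *l)
{
	dma_addr_t start = domain->geometry.aperture_start;
	dma_addr_t end = domain->geometry.aperture_end;

	l->gem_start = start;
	l->gem_end = end - EXAMPLE_CARVEOUT_SZ;
	l->carveout_start = l->gem_end + 1;
	l->carveout_end = end;
}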

Thierry
Dmitry Osipenko Jan. 25, 2019, 1:14 p.m. UTC | #5
25.01.2019 12:23, Thierry Reding wrote:
> On Fri, Jan 25, 2019 at 12:38:01AM +0300, Dmitry Osipenko wrote:
>> 24.01.2019 21:02, Thierry Reding wrote:
>>> From: Thierry Reding <treding@nvidia.com>
>>>
>>> Tegra186 and later are different from earlier generations in that they
>>> use an ARM SMMU rather than the Tegra SMMU. The ARM SMMU driver behaves
>>> slightly differently in that the geometry for IOMMU domains is set only
>>> after a device was attached to it. This is to make sure that the SMMU
>>> instance that the domain belongs to is known, because each instance can
>>> have a different input address space (i.e. geometry).
>>>
>>> Work around this by moving all IOVA allocations to a point where the
>>> geometry of the domain is properly initialized.
>>>
>>> This second version of the series addresses all review comments and adds
>>> a number of patches that will actually allow host1x to work with an SMMU
>>> enabled on Tegra186. The patches also add programming required to
>>> address the full 40 bits of address space.
>>>
>>> This supersedes the following patch:
>>>
>>>     https://patchwork.kernel.org/patch/10775579/
>>
>> My understanding is that falcon won't boot because source DMA address
>> of the firmware isn't set up correctly if the address is >32bit.
>> Please correct me if I'm wrong, otherwise that need to be addressed in
>> this series as well for completeness.
> 
> What makes you say so? I was runtime testing this series as I was
> developing these patches and this works properly on Tegra186 with a 40
> bit address space. Since the carveout is allocated from the top of the
> IOMMU aperture, the addresses that the Falcon sees are always from the
> top of the 40 bit IOVA space and this works flawlessly.

I looked at the code again and this time noticed that the address is programmed *shifted*, i.e. divided by 256, which is where the 256-byte alignment requirement comes from. Mikko mentioned the alignment requirement yesterday, but I just wasn't seeing the shift in the code. Looks good then!
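
As an editorial illustration of the shift in question (the helper name
is made up; only the 256-byte alignment and the divide-by-256
relationship are taken from the discussion above):

#include <linux/bug.h>
#include <linux/types.h>

static u32 example_falcon_dma_base(dma_addr_t iova)
{
	/* the firmware IOVA must be 256-byte aligned */
	WARN_ON(iova & 0xff);

	/* e.g. 0xffffffff00 >> 8 == 0xffffffff */
	return (u32)(iova >> 8);
}

With a 40-bit address space the highest 256-byte-aligned base,
0xffffffff00, thus becomes 0xffffffff after the shift and still fits
into a 32-bit register value.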
Dmitry Osipenko Jan. 25, 2019, 1:18 p.m. UTC | #6
25.01.2019 11:57, Mikko Perttunen wrote:
> On 24.1.2019 23.53, Dmitry Osipenko wrote:
>> 24.01.2019 21:02, Thierry Reding wrote:
>>> From: Thierry Reding <treding@nvidia.com>
>>>
>>> Tegra186 and later are different from earlier generations in that they
>>> use an ARM SMMU rather than the Tegra SMMU. The ARM SMMU driver behaves
>>> slightly differently in that the geometry for IOMMU domains is set only
>>> after a device was attached to it. This is to make sure that the SMMU
>>> instance that the domain belongs to is known, because each instance can
>>> have a different input address space (i.e. geometry).
>>>
>>> Work around this by moving all IOVA allocations to a point where the
>>> geometry of the domain is properly initialized.
>>>
>>> This second version of the series addresses all review comments and adds
>>> a number of patches that will actually allow host1x to work with an SMMU
>>> enabled on Tegra186. The patches also add programming required to
>>> address the full 40 bits of address space.
>>>
>>> This supersedes the following patch:
>>>
>>>      https://patchwork.kernel.org/patch/10775579/
>>
>> Secondly, seems there are additional restrictions for the host1x jobs on T186, at least T186 TRM suggests so. In particular looks like each client is hardwired to a specific sync point and to a specific channel. Or maybe there is assumption that upstream kernel could work only in a hypervisor mode or with all protections disable. Could you please clarify?
>>
> 
> There are no such syncpoint/channel restrictions. The upstream driver indeed currently only supports the case where there is no "hypervisor" (that is, server process that allocates host1x resources) running and the kernel has access to the Host1x COMMON/"hypervisor" register aperture.
> 
> Adding support for the situation where this is not the case shouldn't be very difficult, but we currently don't have any upstream platforms where the Host1x server exists (it's only there on automotive platforms).

Thank you very much for the clarification!