mbox series

[v8,00/14] drm/tegra: Introduce a modern UABI

Message ID 20210709193146.2859516-1-thierry.reding@gmail.com (mailing list archive)
Headers show
Series drm/tegra: Introduce a modern UABI | expand

Message

Thierry Reding July 9, 2021, 7:31 p.m. UTC
From: Thierry Reding <treding@nvidia.com>

Hi all,

Mikko has been away for a few weeks, so I've been testing and revising
the new UABI patches in the meantime. There are very minor changes to
the naming of some of the UABI fields, but other than that it's mostly
unchanged from v7.

One notable change is that mappings can now be read-only, write-only,
read-write or none of them (rather than just read-only or read-write),
since those combinations are all supported by the IOMMUs and it might
be useful to make some mappings write-only.

For a full list of changes in v8, see the changelog in patch 6.

I've also updated the libdrm_tegra library to work against this version
of the UABI. A branch can be found here:

  https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8

That contains helper APIs for the concepts introduced in this series and
shows how they can be used in various tests that can be run for sanity
checking.

In addition, Mikko has made updates to the following projects, though
they may need to be updated for the minor changes in v8:

* vaapi-tegra-driver - https://github.com/cyndis/vaapi-tegra-driver
  Experimental support for MPEG2 and H264 decoding on T210, T186
  and T194.

* xf86-video-opentegra - https://github.com/grate-driver/xf86-video-opentegra
  X11 userspace acceleration driver for Tegra20, Tegra30, and Tegra114.

* grate - https://github.com/grate-driver/grate
  3D rendering testbed for Tegra20, Tegra30, and Tegra114

I plan on putting this into linux-next soon after v5.14-rc1 so that this
can get some soak time.

Thierry

Mikko Perttunen (14):
  gpu: host1x: Add DMA fence implementation
  gpu: host1x: Add no-recovery mode
  gpu: host1x: Add job release callback
  gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer
  drm/tegra: Extract tegra_gem_lookup
  drm/tegra: Add new UAPI to header
  drm/tegra: Boot VIC during runtime PM resume
  drm/tegra: Allocate per-engine channel in core code
  drm/tegra: Implement new UAPI
  drm/tegra: Implement syncpoint management UAPI
  drm/tegra: Implement syncpoint wait UAPI
  drm/tegra: Implement job submission part of new UAPI
  drm/tegra: Add job firewall
  drm/tegra: Bump driver version

 drivers/gpu/drm/tegra/Makefile             |   4 +
 drivers/gpu/drm/tegra/drm.c                |  82 ++--
 drivers/gpu/drm/tegra/drm.h                |  12 +
 drivers/gpu/drm/tegra/firewall.c           | 254 ++++++++++
 drivers/gpu/drm/tegra/gather_bo.c          |  81 ++++
 drivers/gpu/drm/tegra/gather_bo.h          |  22 +
 drivers/gpu/drm/tegra/gem.c                |  13 +
 drivers/gpu/drm/tegra/gem.h                |   2 +
 drivers/gpu/drm/tegra/submit.c             | 527 +++++++++++++++++++++
 drivers/gpu/drm/tegra/submit.h             |  21 +
 drivers/gpu/drm/tegra/uapi.c               | 387 +++++++++++++++
 drivers/gpu/drm/tegra/uapi.h               |  58 +++
 drivers/gpu/drm/tegra/vic.c                | 112 ++---
 drivers/gpu/host1x/Makefile                |   1 +
 drivers/gpu/host1x/cdma.c                  |  58 ++-
 drivers/gpu/host1x/fence.c                 | 209 ++++++++
 drivers/gpu/host1x/fence.h                 |  13 +
 drivers/gpu/host1x/hw/channel_hw.c         |  87 +++-
 drivers/gpu/host1x/hw/debug_hw.c           |   9 +-
 drivers/gpu/host1x/hw/hw_host1x02_uclass.h |  12 +
 drivers/gpu/host1x/hw/hw_host1x04_uclass.h |  12 +
 drivers/gpu/host1x/hw/hw_host1x05_uclass.h |  12 +
 drivers/gpu/host1x/hw/hw_host1x06_uclass.h |  12 +
 drivers/gpu/host1x/hw/hw_host1x07_uclass.h |  12 +
 drivers/gpu/host1x/intr.c                  |   9 +
 drivers/gpu/host1x/intr.h                  |   2 +
 drivers/gpu/host1x/job.c                   |  77 ++-
 drivers/gpu/host1x/job.h                   |  16 +
 drivers/gpu/host1x/syncpt.c                |   2 +
 drivers/gpu/host1x/syncpt.h                |  12 +
 include/linux/host1x.h                     |  22 +-
 include/uapi/drm/tegra_drm.h               | 425 ++++++++++++++++-
 32 files changed, 2408 insertions(+), 169 deletions(-)
 create mode 100644 drivers/gpu/drm/tegra/firewall.c
 create mode 100644 drivers/gpu/drm/tegra/gather_bo.c
 create mode 100644 drivers/gpu/drm/tegra/gather_bo.h
 create mode 100644 drivers/gpu/drm/tegra/submit.c
 create mode 100644 drivers/gpu/drm/tegra/submit.h
 create mode 100644 drivers/gpu/drm/tegra/uapi.c
 create mode 100644 drivers/gpu/drm/tegra/uapi.h
 create mode 100644 drivers/gpu/host1x/fence.c
 create mode 100644 drivers/gpu/host1x/fence.h

Comments

Dmitry Osipenko July 9, 2021, 9:16 p.m. UTC | #1
Hello Thierry,

09.07.2021 22:31, Thierry Reding пишет:
> From: Thierry Reding <treding@nvidia.com>
> 
> Hi all,
> 
> Mikko has been away for a few weeks, so I've been testing and revising
> the new UABI patches in the meantime. There are very minor changes to
> the naming of some of the UABI fields, but other than that it's mostly
> unchanged from v7.

Why you haven't addressed any of the previous review comments? There
were some obvious problems in v7 and v8 still has them.

> One notable change is that mappings can now be read-only, write-only,
> read-write or none of them (rather than just read-only or read-write),
> since those combinations are all supported by the IOMMUs and it might
> be useful to make some mappings write-only.
> 
> For a full list of changes in v8, see the changelog in patch 6.
> 
> I've also updated the libdrm_tegra library to work against this version
> of the UABI. A branch can be found here:
> 
>   https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8
> 
> That contains helper APIs for the concepts introduced in this series and
> shows how they can be used in various tests that can be run for sanity
> checking.
> 
> In addition, Mikko has made updates to the following projects, though
> they may need to be updated for the minor changes in v8:
> 
> * vaapi-tegra-driver - https://github.com/cyndis/vaapi-tegra-driver
>   Experimental support for MPEG2 and H264 decoding on T210, T186
>   and T194.
> 
> * xf86-video-opentegra - https://github.com/grate-driver/xf86-video-opentegra
>   X11 userspace acceleration driver for Tegra20, Tegra30, and Tegra114.
> 
> * grate - https://github.com/grate-driver/grate
>   3D rendering testbed for Tegra20, Tegra30, and Tegra114
> 
> I plan on putting this into linux-next soon after v5.14-rc1 so that this
> can get some soak time.

It should be a bit too early to push it into kernel. The UAPI is not
ready because it's missing essential features. We can't call this a
'modern UABI' until it's fully implemented. The design decisions are
still questionable because this UAPI is built around the proprietary
firmware (and based on UAPI of downstream driver) which doesn't fit well
into DRM world. I haven't got all the answers to my previous questions,
should I repeat them?

UAPI is not the only problem that we have. The performance and stability
of the driver are in a very bad shape too. The modern UAPI can't be
built on top of the old code. It's clear now that this is a very serious
problem that must be addressed along with the UAPI work and I'm getting
silence from you guys.
Thierry Reding July 14, 2021, 8:30 a.m. UTC | #2
On Sat, Jul 10, 2021 at 12:16:28AM +0300, Dmitry Osipenko wrote:
> Hello Thierry,
> 
> 09.07.2021 22:31, Thierry Reding пишет:
> > From: Thierry Reding <treding@nvidia.com>
> > 
> > Hi all,
> > 
> > Mikko has been away for a few weeks, so I've been testing and revising
> > the new UABI patches in the meantime. There are very minor changes to
> > the naming of some of the UABI fields, but other than that it's mostly
> > unchanged from v7.
> 
> Why you haven't addressed any of the previous review comments? There
> were some obvious problems in v7 and v8 still has them.
> 
> > One notable change is that mappings can now be read-only, write-only,
> > read-write or none of them (rather than just read-only or read-write),
> > since those combinations are all supported by the IOMMUs and it might
> > be useful to make some mappings write-only.
> > 
> > For a full list of changes in v8, see the changelog in patch 6.
> > 
> > I've also updated the libdrm_tegra library to work against this version
> > of the UABI. A branch can be found here:
> > 
> >   https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8
> > 
> > That contains helper APIs for the concepts introduced in this series and
> > shows how they can be used in various tests that can be run for sanity
> > checking.
> > 
> > In addition, Mikko has made updates to the following projects, though
> > they may need to be updated for the minor changes in v8:
> > 
> > * vaapi-tegra-driver - https://github.com/cyndis/vaapi-tegra-driver
> >   Experimental support for MPEG2 and H264 decoding on T210, T186
> >   and T194.
> > 
> > * xf86-video-opentegra - https://github.com/grate-driver/xf86-video-opentegra
> >   X11 userspace acceleration driver for Tegra20, Tegra30, and Tegra114.
> > 
> > * grate - https://github.com/grate-driver/grate
> >   3D rendering testbed for Tegra20, Tegra30, and Tegra114
> > 
> > I plan on putting this into linux-next soon after v5.14-rc1 so that this
> > can get some soak time.
> 
> It should be a bit too early to push it into kernel. The UAPI is not
> ready because it's missing essential features. We can't call this a
> 'modern UABI' until it's fully implemented. The design decisions are
> still questionable because this UAPI is built around the proprietary
> firmware (and based on UAPI of downstream driver) which doesn't fit well
> into DRM world. I haven't got all the answers to my previous questions,
> should I repeat them?

I don't know what you means by "built around the proprietary firmware".
Yes, this ends up using proprietary firmware for some of the hardware
engines that host1x drives, but that's completely orthogonal to the
UABI. No matter what UABI we'd be introducing, we'd be using that same
firmware.

And yes, this is based on the UABI of the downstream drivers. The design
is guided by what we've learned over the last decade working with this
hardware in use-cases that customers need. It'd be dumb not to use that
knowledge to our advantage. This is the only way to ensure we can
deliver an upstream driver that's on par with our downstream drivers and
therefore make it possible to eventually adopt the upstream driver.

And frankly, you did get answers to previous questions, though perhaps
not all, but I'm out of patience. We've been going in circles and at
some point we have to make a decision so we can make progress.

I made several attempts over the years to get something usable merged
upstream so that we can finally make use of this hardware and get it
supported upstream and each time I made the mistake of trying to make it
perfect and accomodate all wishlist items. The result is that I wasted a
lot of time and have nothing to show for it.

I've also been very hard Mikko with his work on this and I think we've
stretched this as far as we can without compromising too much on what we
are going to need from this UABI in the future.

We've gone through the process of making sure all existing userspace can
and does work with this new UABI and even left the old UABI in place in
case we need it.

I'm reasonably satisfied with what we have now and I don't see any
reason to hold this back any further. We always have the option of
adding UABI if we need it for something, or extend functionality of
existing UABI where it makes sense. But we also do have to start
somewhere, otherwise we're just not going to get anywhere, as the last
10 years have shown.

> UAPI is not the only problem that we have. The performance and stability
> of the driver are in a very bad shape too. The modern UAPI can't be
> built on top of the old code. It's clear now that this is a very serious
> problem that must be addressed along with the UAPI work and I'm getting
> silence from you guys.

We've been over this multiple times before, though perhaps never over
email. So let me make this clear here again and for future reference: we
will *not* be rewriting the driver from scratch.

If there are any serious performance and stability issues, then we'll
find them and address them incrementally, like we always do in the
kernel.

Thierry
Dmitry Osipenko July 14, 2021, 2:50 p.m. UTC | #3
14.07.2021 11:30, Thierry Reding пишет:
> On Sat, Jul 10, 2021 at 12:16:28AM +0300, Dmitry Osipenko wrote:
>> Hello Thierry,
>>
>> 09.07.2021 22:31, Thierry Reding пишет:
>>> From: Thierry Reding <treding@nvidia.com>
>>>
>>> Hi all,
>>>
>>> Mikko has been away for a few weeks, so I've been testing and revising
>>> the new UABI patches in the meantime. There are very minor changes to
>>> the naming of some of the UABI fields, but other than that it's mostly
>>> unchanged from v7.
>>
>> Why you haven't addressed any of the previous review comments? There
>> were some obvious problems in v7 and v8 still has them.
>>
>>> One notable change is that mappings can now be read-only, write-only,
>>> read-write or none of them (rather than just read-only or read-write),
>>> since those combinations are all supported by the IOMMUs and it might
>>> be useful to make some mappings write-only.
>>>
>>> For a full list of changes in v8, see the changelog in patch 6.
>>>
>>> I've also updated the libdrm_tegra library to work against this version
>>> of the UABI. A branch can be found here:
>>>
>>>   https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8
>>>
>>> That contains helper APIs for the concepts introduced in this series and
>>> shows how they can be used in various tests that can be run for sanity
>>> checking.
>>>
>>> In addition, Mikko has made updates to the following projects, though
>>> they may need to be updated for the minor changes in v8:
>>>
>>> * vaapi-tegra-driver - https://github.com/cyndis/vaapi-tegra-driver
>>>   Experimental support for MPEG2 and H264 decoding on T210, T186
>>>   and T194.
>>>
>>> * xf86-video-opentegra - https://github.com/grate-driver/xf86-video-opentegra
>>>   X11 userspace acceleration driver for Tegra20, Tegra30, and Tegra114.
>>>
>>> * grate - https://github.com/grate-driver/grate
>>>   3D rendering testbed for Tegra20, Tegra30, and Tegra114
>>>
>>> I plan on putting this into linux-next soon after v5.14-rc1 so that this
>>> can get some soak time.
>>
>> It should be a bit too early to push it into kernel. The UAPI is not
>> ready because it's missing essential features. We can't call this a
>> 'modern UABI' until it's fully implemented. The design decisions are
>> still questionable because this UAPI is built around the proprietary
>> firmware (and based on UAPI of downstream driver) which doesn't fit well
>> into DRM world. I haven't got all the answers to my previous questions,
>> should I repeat them?
> 
> I don't know what you means by "built around the proprietary firmware".
> Yes, this ends up using proprietary firmware for some of the hardware
> engines that host1x drives, but that's completely orthogonal to the
> UABI. No matter what UABI we'd be introducing, we'd be using that same
> firmware.
> 
> And yes, this is based on the UABI of the downstream drivers. The design
> is guided by what we've learned over the last decade working with this
> hardware in use-cases that customers need. It'd be dumb not to use that
> knowledge to our advantage. This is the only way to ensure we can
> deliver an upstream driver that's on par with our downstream drivers and
> therefore make it possible to eventually adopt the upstream driver.
> 
> And frankly, you did get answers to previous questions, though perhaps
> not all, but I'm out of patience. We've been going in circles and at
> some point we have to make a decision so we can make progress.

By firmware I was referring to the supervisor OS and inter-VM
integration, sorry for not making it clear. My rough understanding is
that it's all software defined and technically it's possible to avoid
going though the trouble of supporting the firmware convention defined
by downstream, and thus, making driver less optimal than it could be.
It's still not clear to me how much that firmware is relevant to
upstream in practice.

> I made several attempts over the years to get something usable merged
> upstream so that we can finally make use of this hardware and get it
> supported upstream and each time I made the mistake of trying to make it
> perfect and accomodate all wishlist items. The result is that I wasted a
> lot of time and have nothing to show for it.

It's a problem that you try to do everything on your own and not
collaborating as much as you could. Writing code isn't a problem, the
problem is that there is no clear understanding of what needs to be
done, IMO. I have a vision of whats need to be done from a perspective
of older SoCs, but I never could start implementing it for upstream
because it requires yours feedback and preliminary agreement since
you're the only maintainer of the driver who could merge patches I don't
want to waste my time too.

> I've also been very hard Mikko with his work on this and I think we've
> stretched this as far as we can without compromising too much on what we
> are going to need from this UABI in the future.
> 
> We've gone through the process of making sure all existing userspace can
> and does work with this new UABI and even left the old UABI in place in
> case we need it.

That is great, but it is not enough. So far we are enabling a minimal
support for newer hardware here, but we also need to solve the older
problems that are relevant to all SoCs and may well affect the UABI
decisions.

It became apparent to me now that yours only goal here is to enable
newer hardware, which is fine. It's also apparent to me that you still
don't have a clear vision of what needs to be done overall, but this is
fine too since nothing is removed here and it's in a staging phase yet.

> I'm reasonably satisfied with what we have now and I don't see any
> reason to hold this back any further. We always have the option of
> adding UABI if we need it for something, or extend functionality of
> existing UABI where it makes sense. But we also do have to start
> somewhere, otherwise we're just not going to get anywhere, as the last
> 10 years have shown.

We're starting in a bit wrong direction by extending the old code base,
adding unnecessary burden to userspace of older SoCs and adding sync
point UAPI that may not integrate with the proper jobs scheduling. It
probably shouldn't be a big problem to rework it all later on if will be
needed, but we could avoid this extra work beforehand if we could put
more effort into implementing more features from the start.

>> UAPI is not the only problem that we have. The performance and stability
>> of the driver are in a very bad shape too. The modern UAPI can't be
>> built on top of the old code. It's clear now that this is a very serious
>> problem that must be addressed along with the UAPI work and I'm getting
>> silence from you guys.
> 
> We've been over this multiple times before, though perhaps never over
> email. So let me make this clear here again and for future reference: we
> will *not* be rewriting the driver from scratch.

This was unnecessary, we agreed on the incremental updates long time
ago. My point is that we shouldn't be building the new UAPI on top of
the old code, instead we should prioritize the upgrading to a modern
APIs and fixing the lame abstractions of the legacy driver.

Mikko assures that this effort won't stop after merging this first bits
and we will start upgrading driver soon. I trust Mikko. You guys want to
light up the newer hardware so much that I don't know how to stop you :)
Even if this will become a mistake later on, likely it won't be a
critical mistake, so let's not hold it.
Mikko Perttunen July 14, 2021, 3:26 p.m. UTC | #4
On 7/14/21 5:50 PM, Dmitry Osipenko wrote:
> 14.07.2021 11:30, Thierry Reding пишет:
>> On Sat, Jul 10, 2021 at 12:16:28AM +0300, Dmitry Osipenko wrote:
>>> Hello Thierry,
>>>
>>> 09.07.2021 22:31, Thierry Reding пишет:
>>>> From: Thierry Reding <treding@nvidia.com>
>>>>
>>>> Hi all,
>>>>
>>>> Mikko has been away for a few weeks, so I've been testing and revising
>>>> the new UABI patches in the meantime. There are very minor changes to
>>>> the naming of some of the UABI fields, but other than that it's mostly
>>>> unchanged from v7.
>>>
>>> Why you haven't addressed any of the previous review comments? There
>>> were some obvious problems in v7 and v8 still has them.
>>>
>>>> One notable change is that mappings can now be read-only, write-only,
>>>> read-write or none of them (rather than just read-only or read-write),
>>>> since those combinations are all supported by the IOMMUs and it might
>>>> be useful to make some mappings write-only.
>>>>
>>>> For a full list of changes in v8, see the changelog in patch 6.
>>>>
>>>> I've also updated the libdrm_tegra library to work against this version
>>>> of the UABI. A branch can be found here:
>>>>
>>>>    https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8
>>>>
>>>> That contains helper APIs for the concepts introduced in this series and
>>>> shows how they can be used in various tests that can be run for sanity
>>>> checking.
>>>>
>>>> In addition, Mikko has made updates to the following projects, though
>>>> they may need to be updated for the minor changes in v8:
>>>>
>>>> * vaapi-tegra-driver - https://github.com/cyndis/vaapi-tegra-driver
>>>>    Experimental support for MPEG2 and H264 decoding on T210, T186
>>>>    and T194.
>>>>
>>>> * xf86-video-opentegra - https://github.com/grate-driver/xf86-video-opentegra
>>>>    X11 userspace acceleration driver for Tegra20, Tegra30, and Tegra114.
>>>>
>>>> * grate - https://github.com/grate-driver/grate
>>>>    3D rendering testbed for Tegra20, Tegra30, and Tegra114
>>>>
>>>> I plan on putting this into linux-next soon after v5.14-rc1 so that this
>>>> can get some soak time.
>>>
>>> It should be a bit too early to push it into kernel. The UAPI is not
>>> ready because it's missing essential features. We can't call this a
>>> 'modern UABI' until it's fully implemented. The design decisions are
>>> still questionable because this UAPI is built around the proprietary
>>> firmware (and based on UAPI of downstream driver) which doesn't fit well
>>> into DRM world. I haven't got all the answers to my previous questions,
>>> should I repeat them?
>>
>> I don't know what you means by "built around the proprietary firmware".
>> Yes, this ends up using proprietary firmware for some of the hardware
>> engines that host1x drives, but that's completely orthogonal to the
>> UABI. No matter what UABI we'd be introducing, we'd be using that same
>> firmware.
>>
>> And yes, this is based on the UABI of the downstream drivers. The design
>> is guided by what we've learned over the last decade working with this
>> hardware in use-cases that customers need. It'd be dumb not to use that
>> knowledge to our advantage. This is the only way to ensure we can
>> deliver an upstream driver that's on par with our downstream drivers and
>> therefore make it possible to eventually adopt the upstream driver.
>>
>> And frankly, you did get answers to previous questions, though perhaps
>> not all, but I'm out of patience. We've been going in circles and at
>> some point we have to make a decision so we can make progress.
> 
> By firmware I was referring to the supervisor OS and inter-VM
> integration, sorry for not making it clear. My rough understanding is
> that it's all software defined and technically it's possible to avoid
> going though the trouble of supporting the firmware convention defined
> by downstream, and thus, making driver less optimal than it could be.
> It's still not clear to me how much that firmware is relevant to
> upstream in practice.

As mentioned in discussion elsewhere, there is no 'firmware convention'. 
The view I've formed so far is that the model of ephemeral syncpoint 
allocations and value resets, which I believe you are talking about 
here, is fundamentally opposed to the design intent of the hardware, and 
would result in a less efficient system regardless of inter-VM 
integration convention.

> 
>> I made several attempts over the years to get something usable merged
>> upstream so that we can finally make use of this hardware and get it
>> supported upstream and each time I made the mistake of trying to make it
>> perfect and accomodate all wishlist items. The result is that I wasted a
>> lot of time and have nothing to show for it.
> 
> It's a problem that you try to do everything on your own and not
> collaborating as much as you could. Writing code isn't a problem, the
> problem is that there is no clear understanding of what needs to be
> done, IMO. I have a vision of whats need to be done from a perspective
> of older SoCs, but I never could start implementing it for upstream
> because it requires yours feedback and preliminary agreement since
> you're the only maintainer of the driver who could merge patches I don't
> want to waste my time too.
> 
>> I've also been very hard Mikko with his work on this and I think we've
>> stretched this as far as we can without compromising too much on what we
>> are going to need from this UABI in the future.
>>
>> We've gone through the process of making sure all existing userspace can
>> and does work with this new UABI and even left the old UABI in place in
>> case we need it.
> 
> That is great, but it is not enough. So far we are enabling a minimal
> support for newer hardware here, but we also need to solve the older
> problems that are relevant to all SoCs and may well affect the UABI
> decisions.
> 
> It became apparent to me now that yours only goal here is to enable
> newer hardware, which is fine. It's also apparent to me that you still
> don't have a clear vision of what needs to be done overall, but this is
> fine too since nothing is removed here and it's in a staging phase yet.

While my goal of course is to enable proper use of Host1x on the newer 
SoCs, there is absolutely no intention to forget about the older SoCs. 
Observably, to me at least, GR2D and GR3D are working -- the test suites 
are passing (though I did not port/try mesa). We are also not regressing 
anything, and I do not think after this series we are worse (at least in 
any fundamental manner) than the downstream software stack that these 
chips were originally productized with. As such I have a hard time 
understanding the doom and gloom about the driver's state and needing 
grand overarching re-architectures.

> 
>> I'm reasonably satisfied with what we have now and I don't see any
>> reason to hold this back any further. We always have the option of
>> adding UABI if we need it for something, or extend functionality of
>> existing UABI where it makes sense. But we also do have to start
>> somewhere, otherwise we're just not going to get anywhere, as the last
>> 10 years have shown.
> 
> We're starting in a bit wrong direction by extending the old code base,
> adding unnecessary burden to userspace of older SoCs and adding sync
> point UAPI that may not integrate with the proper jobs scheduling. It
> probably shouldn't be a big problem to rework it all later on if will be
> needed, but we could avoid this extra work beforehand if we could put
> more effort into implementing more features from the start.
> 

As mentioned before, in my opinion, redoing the internals while 
supporting the legacy UAPI would be more difficult. Having the new UAPI 
in place will also guide the improvements to be made - and at least on 
new SoCs where hardware scheduling works fine we want to take advantage 
of that in the UAPI. And I don't know of any issue in the UAPI that 
would prevent use of a software scheduler (depending on implementation 
some fields could be unused in some cases but it's already the case that 
you don't always use every field).

>>> UAPI is not the only problem that we have. The performance and stability
>>> of the driver are in a very bad shape too. The modern UAPI can't be
>>> built on top of the old code. It's clear now that this is a very serious
>>> problem that must be addressed along with the UAPI work and I'm getting
>>> silence from you guys.
>>
>> We've been over this multiple times before, though perhaps never over
>> email. So let me make this clear here again and for future reference: we
>> will *not* be rewriting the driver from scratch.
> 
> This was unnecessary, we agreed on the incremental updates long time
> ago. My point is that we shouldn't be building the new UAPI on top of
> the old code, instead we should prioritize the upgrading to a modern
> APIs and fixing the lame abstractions of the legacy driver.
> 
> Mikko assures that this effort won't stop after merging this first bits
> and we will start upgrading driver soon. I trust Mikko. You guys want to
> light up the newer hardware so much that I don't know how to stop you :)
> Even if this will become a mistake later on, likely it won't be a
> critical mistake, so let's not hold it.
> 

I have a long TODO list of improvements to work on. Admittedly, I won't 
have as much time to work on it as I would have before since I need to 
start working on other projects in parallel as well. And things will 
need to be agreed on (e.g. as alluded to earlier I still don't know of 
any concrete reason why we would need to add a software scheduler. I can 
only make guesses. It probably makes sense for the old SoCs, but I don't 
know why we now need one and downstream never did)

Mikko
Dmitry Osipenko July 15, 2021, 1:53 p.m. UTC | #5
14.07.2021 18:26, Mikko Perttunen пишет:
...
> While my goal of course is to enable proper use of Host1x on the newer
> SoCs, there is absolutely no intention to forget about the older SoCs.
> Observably, to me at least, GR2D and GR3D are working -- the test suites
> are passing (though I did not port/try mesa). We are also not regressing
> anything, and I do not think after this series we are worse (at least in
> any fundamental manner) than the downstream software stack that these
> chips were originally productized with.

> As such I have a hard time
> understanding the doom and gloom about the driver's state and needing
> grand overarching re-architectures.

You need to work with a real consumer device and use it everyday to feel
all those problems. But problems always exist, we will fix them
eventually and then have new problems. We also have an alternative
experimental driver that solves the problems for the time being, so it's
not really that bad.

I may sound a bit too negative, but that's because you're are already
doing this work and I want to get your attention to a core problems that
in my opinion should be solved first to avoid a need to re-do the work
later on.

...
> I have a long TODO list of improvements to work on. Admittedly, I won't
> have as much time to work on it as I would have before since I need to
> start working on other projects in parallel as well. And things will
> need to be agreed on

A slow pace parallel work is a normal thing, that's exactly what me and
everyone else are doing in regards to having fun with Tegra development.
We already managed to achieve a lot this way and it's only getting
better every day.

> (e.g. as alluded to earlier I still don't know of
> any concrete reason why we would need to add a software scheduler. I can
> only make guesses. It probably makes sense for the old SoCs, but I don't
> know why we now need one and downstream never did)

Scheduler is needed for running jobs in a correct order. You don't feel
like it's needed because you use only a single h/w engine and haven't
seen places where it's needed.

DRM scheduler also has a much cleaner code, has more features, it's
optimized, well thought, uses modern upstream concepts and successfully
used by multiple drivers in comparison to what we have in the Host1x driver.

We will discuss it in a more details.
Dmitry Osipenko July 30, 2021, 3:26 p.m. UTC | #6
09.07.2021 22:31, Thierry Reding пишет:
> From: Thierry Reding <treding@nvidia.com>
> 
> Hi all,
> 
> Mikko has been away for a few weeks, so I've been testing and revising
> the new UABI patches in the meantime. There are very minor changes to
> the naming of some of the UABI fields, but other than that it's mostly
> unchanged from v7.
> 
> One notable change is that mappings can now be read-only, write-only,
> read-write or none of them (rather than just read-only or read-write),
> since those combinations are all supported by the IOMMUs and it might
> be useful to make some mappings write-only.
> 
> For a full list of changes in v8, see the changelog in patch 6.
> 
> I've also updated the libdrm_tegra library to work against this version
> of the UABI. A branch can be found here:
> 
>   https://gitlab.freedesktop.org/tagr/drm/-/commits/drm-tegra-uabi-v8
> 
> That contains helper APIs for the concepts introduced in this series and
> shows how they can be used in various tests that can be run for sanity
> checking.
> 
> In addition, Mikko has made updates to the following projects, though
> they may need to be updated for the minor changes in v8:
> 
> * vaapi-tegra-driver - https://github.com/cyndis/vaapi-tegra-driver
>   Experimental support for MPEG2 and H264 decoding on T210, T186
>   and T194.
> 
> * xf86-video-opentegra - https://github.com/grate-driver/xf86-video-opentegra
>   X11 userspace acceleration driver for Tegra20, Tegra30, and Tegra114.
> 
> * grate - https://github.com/grate-driver/grate
>   3D rendering testbed for Tegra20, Tegra30, and Tegra114
> 
> I plan on putting this into linux-next soon after v5.14-rc1 so that this
> can get some soak time.
> 
> Thierry
> 
> Mikko Perttunen (14):
>   gpu: host1x: Add DMA fence implementation
>   gpu: host1x: Add no-recovery mode
>   gpu: host1x: Add job release callback
>   gpu: host1x: Add support for syncpoint waits in CDMA pushbuffer
>   drm/tegra: Extract tegra_gem_lookup
>   drm/tegra: Add new UAPI to header
>   drm/tegra: Boot VIC during runtime PM resume
>   drm/tegra: Allocate per-engine channel in core code
>   drm/tegra: Implement new UAPI
>   drm/tegra: Implement syncpoint management UAPI
>   drm/tegra: Implement syncpoint wait UAPI
>   drm/tegra: Implement job submission part of new UAPI
>   drm/tegra: Add job firewall
>   drm/tegra: Bump driver version

The "gpu: host1x: Add option to skip firewall for a job" of v7 is not here.

The tegra_drm_ioctl_channel_map() uses wrong BO size again, which breaks
firewall, it was fixed in v7.

The DRM_TEGRA_CHANNEL_MAP_WRITE and DRM_TEGRA_CHANNEL_MAP_READ values
should be swapped to preserve compatibility with the current userspace.

Please fix all the coding style problems reported by
"./scripts/checkpatch.pl --strict" and other things reported previously.
Re-submit v9.