diff mbox series

[v2,1/2] drm/file: Rehabilitate the firstopen hook for non-legacy drivers

Message ID 20190320154809.14823-2-paul.kocialkowski@bootlin.com (mailing list archive)
State New, archived
Headers show
Series drm/vc4: Binner BO management improvements | expand

Commit Message

Paul Kocialkowski March 20, 2019, 3:48 p.m. UTC
The firstopen DRM driver hook was initially used to perform hardware
initialization, which is now considered legacy. Only a single user of
firstopen remains at this point (savage).

In some specific cases, non-legacy drivers may also need to implement
these hooks. For instance on VC4, we need to allocate a 16 MiB buffer
for the GPU. Because it's not required for fbcon, it's a waste to
allocate it before userspace starts using the DRM device.

Using firstopen and lastclose for this allocation seems like the best
fit, so re-habilitate the hook to allow it to be called for non-legacy
drivers.

Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
---
 drivers/gpu/drm/drm_file.c | 3 +--
 include/drm/drm_drv.h      | 2 +-
 2 files changed, 2 insertions(+), 3 deletions(-)

Comments

Eric Anholt March 20, 2019, 4:56 p.m. UTC | #1
Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> The firstopen DRM driver hook was initially used to perform hardware
> initialization, which is now considered legacy. Only a single user of
> firstopen remains at this point (savage).
>
> In some specific cases, non-legacy drivers may also need to implement
> these hooks. For instance on VC4, we need to allocate a 16 MiB buffer
> for the GPU. Because it's not required for fbcon, it's a waste to
> allocate it before userspace starts using the DRM device.
>
> Using firstopen and lastclose for this allocation seems like the best
> fit, so re-habilitate the hook to allow it to be called for non-legacy
> drivers.
>
> Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> ---
>  drivers/gpu/drm/drm_file.c | 3 +--
>  include/drm/drm_drv.h      | 2 +-
>  2 files changed, 2 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> index b1838a41ad43..c011b5cbfb6b 100644
> --- a/drivers/gpu/drm/drm_file.c
> +++ b/drivers/gpu/drm/drm_file.c
> @@ -266,8 +266,7 @@ static int drm_setup(struct drm_device * dev)
>  {
>  	int ret;
>  
> -	if (dev->driver->firstopen &&
> -	    drm_core_check_feature(dev, DRIVER_LEGACY)) {
> +	if (dev->driver->firstopen) {
>  		ret = dev->driver->firstopen(dev);
>  		if (ret != 0)
>  			return ret;
> diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> index ca46a45a9cce..aa14607e54d4 100644
> --- a/include/drm/drm_drv.h
> +++ b/include/drm/drm_drv.h
> @@ -236,7 +236,7 @@ struct drm_driver {
>  	 * to set/unset the VT into raw mode.
>  	 *
>  	 * Legacy drivers initialize the hardware in the @firstopen callback,
> -	 * which isn't even called for modern drivers.
> +	 * modern drivers can use it for other purposes only.
>  	 */
>  	void (*lastclose) (struct drm_device *);

Our usage in vc4 is not very different from what we called "hardware
initialization" in other devices.  I would rather just delete this
sentence entirely.

The only alternative I can think of to using a firstopen/lastclose-style
allocation for this would be to allocate the bin bo on the first
(non-dumb?) V3D BO allocation and refcount those to free the binner.
Paul Kocialkowski March 21, 2019, 3:27 p.m. UTC | #2
Hi,

Le mercredi 20 mars 2019 à 09:56 -0700, Eric Anholt a écrit :
> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
> 
> > The firstopen DRM driver hook was initially used to perform hardware
> > initialization, which is now considered legacy. Only a single user of
> > firstopen remains at this point (savage).
> > 
> > In some specific cases, non-legacy drivers may also need to implement
> > these hooks. For instance on VC4, we need to allocate a 16 MiB buffer
> > for the GPU. Because it's not required for fbcon, it's a waste to
> > allocate it before userspace starts using the DRM device.
> > 
> > Using firstopen and lastclose for this allocation seems like the best
> > fit, so re-habilitate the hook to allow it to be called for non-legacy
> > drivers.
> > 
> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > ---
> >  drivers/gpu/drm/drm_file.c | 3 +--
> >  include/drm/drm_drv.h      | 2 +-
> >  2 files changed, 2 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> > index b1838a41ad43..c011b5cbfb6b 100644
> > --- a/drivers/gpu/drm/drm_file.c
> > +++ b/drivers/gpu/drm/drm_file.c
> > @@ -266,8 +266,7 @@ static int drm_setup(struct drm_device * dev)
> >  {
> >  	int ret;
> >  
> > -	if (dev->driver->firstopen &&
> > -	    drm_core_check_feature(dev, DRIVER_LEGACY)) {
> > +	if (dev->driver->firstopen) {
> >  		ret = dev->driver->firstopen(dev);
> >  		if (ret != 0)
> >  			return ret;
> > diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> > index ca46a45a9cce..aa14607e54d4 100644
> > --- a/include/drm/drm_drv.h
> > +++ b/include/drm/drm_drv.h
> > @@ -236,7 +236,7 @@ struct drm_driver {
> >  	 * to set/unset the VT into raw mode.
> >  	 *
> >  	 * Legacy drivers initialize the hardware in the @firstopen callback,
> > -	 * which isn't even called for modern drivers.
> > +	 * modern drivers can use it for other purposes only.
> >  	 */
> >  	void (*lastclose) (struct drm_device *);
> 
> Our usage in vc4 is not very different from what we called "hardware
> initialization" in other devices.  I would rather just delete this
> sentence entirely.

Sounds good to me!

> The only alternative I can think of to using a firstopen/lastclose-style
> allocation for this would be to allocate the bin bo on the first
> (non-dumb?) V3D BO allocation and refcount those to free the binner.

I don't see other options either, and using firstclose/lastopen feels
overall more readable in the driver code.

I'm not sure there is such a big overhead associated with allocating
the binner BO (it seems that the current implementation tries to alloc
until the specific memory constraints for the buffer are met, so
perhaps that can take time). But if there is, I suppose it's best to
have that when starting up rather than delaying the first render
operation.

Cheers,

Paul
Eric Anholt March 21, 2019, 11:12 p.m. UTC | #3
Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> Hi,
>
> Le mercredi 20 mars 2019 à 09:56 -0700, Eric Anholt a écrit :
>> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
>> 
>> > The firstopen DRM driver hook was initially used to perform hardware
>> > initialization, which is now considered legacy. Only a single user of
>> > firstopen remains at this point (savage).
>> > 
>> > In some specific cases, non-legacy drivers may also need to implement
>> > these hooks. For instance on VC4, we need to allocate a 16 MiB buffer
>> > for the GPU. Because it's not required for fbcon, it's a waste to
>> > allocate it before userspace starts using the DRM device.
>> > 
>> > Using firstopen and lastclose for this allocation seems like the best
>> > fit, so re-habilitate the hook to allow it to be called for non-legacy
>> > drivers.
>> > 
>> > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
>> > ---
>> >  drivers/gpu/drm/drm_file.c | 3 +--
>> >  include/drm/drm_drv.h      | 2 +-
>> >  2 files changed, 2 insertions(+), 3 deletions(-)
>> > 
>> > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
>> > index b1838a41ad43..c011b5cbfb6b 100644
>> > --- a/drivers/gpu/drm/drm_file.c
>> > +++ b/drivers/gpu/drm/drm_file.c
>> > @@ -266,8 +266,7 @@ static int drm_setup(struct drm_device * dev)
>> >  {
>> >  	int ret;
>> >  
>> > -	if (dev->driver->firstopen &&
>> > -	    drm_core_check_feature(dev, DRIVER_LEGACY)) {
>> > +	if (dev->driver->firstopen) {
>> >  		ret = dev->driver->firstopen(dev);
>> >  		if (ret != 0)
>> >  			return ret;
>> > diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
>> > index ca46a45a9cce..aa14607e54d4 100644
>> > --- a/include/drm/drm_drv.h
>> > +++ b/include/drm/drm_drv.h
>> > @@ -236,7 +236,7 @@ struct drm_driver {
>> >  	 * to set/unset the VT into raw mode.
>> >  	 *
>> >  	 * Legacy drivers initialize the hardware in the @firstopen callback,
>> > -	 * which isn't even called for modern drivers.
>> > +	 * modern drivers can use it for other purposes only.
>> >  	 */
>> >  	void (*lastclose) (struct drm_device *);
>> 
>> Our usage in vc4 is not very different from what we called "hardware
>> initialization" in other devices.  I would rather just delete this
>> sentence entirely.
>
> Sounds good to me!

With the delete, the series is:

Reviewed-by: Eric Anholt <eric@anholt.net>

but hopefully someone other than a vc4 developer (danvet?) can make the
call on whether undeprecating firstopen/lastclose for this is
reasonable.
Daniel Vetter March 28, 2019, 6:53 p.m. UTC | #4
On Thu, Mar 21, 2019 at 04:27:06PM +0100, Paul Kocialkowski wrote:
> Hi,
> 
> Le mercredi 20 mars 2019 à 09:56 -0700, Eric Anholt a écrit :
> > Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
> > 
> > > The firstopen DRM driver hook was initially used to perform hardware
> > > initialization, which is now considered legacy. Only a single user of
> > > firstopen remains at this point (savage).
> > > 
> > > In some specific cases, non-legacy drivers may also need to implement
> > > these hooks. For instance on VC4, we need to allocate a 16 MiB buffer
> > > for the GPU. Because it's not required for fbcon, it's a waste to
> > > allocate it before userspace starts using the DRM device.
> > > 
> > > Using firstopen and lastclose for this allocation seems like the best
> > > fit, so re-habilitate the hook to allow it to be called for non-legacy
> > > drivers.
> > > 
> > > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > ---
> > >  drivers/gpu/drm/drm_file.c | 3 +--
> > >  include/drm/drm_drv.h      | 2 +-
> > >  2 files changed, 2 insertions(+), 3 deletions(-)
> > > 
> > > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> > > index b1838a41ad43..c011b5cbfb6b 100644
> > > --- a/drivers/gpu/drm/drm_file.c
> > > +++ b/drivers/gpu/drm/drm_file.c
> > > @@ -266,8 +266,7 @@ static int drm_setup(struct drm_device * dev)
> > >  {
> > >  	int ret;
> > >  
> > > -	if (dev->driver->firstopen &&
> > > -	    drm_core_check_feature(dev, DRIVER_LEGACY)) {
> > > +	if (dev->driver->firstopen) {
> > >  		ret = dev->driver->firstopen(dev);
> > >  		if (ret != 0)
> > >  			return ret;
> > > diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> > > index ca46a45a9cce..aa14607e54d4 100644
> > > --- a/include/drm/drm_drv.h
> > > +++ b/include/drm/drm_drv.h
> > > @@ -236,7 +236,7 @@ struct drm_driver {
> > >  	 * to set/unset the VT into raw mode.
> > >  	 *
> > >  	 * Legacy drivers initialize the hardware in the @firstopen callback,
> > > -	 * which isn't even called for modern drivers.
> > > +	 * modern drivers can use it for other purposes only.
> > >  	 */
> > >  	void (*lastclose) (struct drm_device *);
> > 
> > Our usage in vc4 is not very different from what we called "hardware
> > initialization" in other devices.  I would rather just delete this
> > sentence entirely.
> 
> Sounds good to me!
> 
> > The only alternative I can think of to using a firstopen/lastclose-style
> > allocation for this would be to allocate the bin bo on the first
> > (non-dumb?) V3D BO allocation and refcount those to free the binner.
> 
> I don't see other options either, and using firstclose/lastopen feels
> overall more readable in the driver code.
> 
> I'm not sure there is such a big overhead associated with allocating
> the binner BO (it seems that the current implementation tries to alloc
> until the specific memory constraints for the buffer are met, so
> perhaps that can take time). But if there is, I suppose it's best to
> have that when starting up rather than delaying the first render
> operation.

I'm not entirely buying the "we don't need this for fbcon only" argument -
there's plenty of dumb kms clients too (boot splash and whatever else
there might be). If you don't want to keep this around I think allocating
on first non-dumb bo allocation and dropping it when the last such fd
closes sounds like a much better idea. Needs a bit more state, you need to
track per drm_file whether you've already allocated a non-dumb bo, and a
drm_device refcount, but that's not much. Firstopen feels like the wrong
thing.

Another option would be first_renderopen or something like that, except
you can also render on the legacy node and I'm not sure how much there's a
demand for this in other drivers. In the end you have open/close
callbacks, you can do all the driver specific things you want to do in
there.

Aside: I kinda also want to ditch lastclose usage. There's fbdev (we have a
better solution with Noralf's drm_client for those) and runtime pm (which
frankly is just a cheap hack, I want my gpu to susepend also when it's not
in use when all the screens are off, not only when I killed X and
everything).
-Daniel
Daniel Stone March 29, 2019, 9:09 a.m. UTC | #5
Hi,

On Thu, 28 Mar 2019 at 18:53, Daniel Vetter <daniel@ffwll.ch> wrote:
> On Thu, Mar 21, 2019 at 04:27:06PM +0100, Paul Kocialkowski wrote:
> > I don't see other options either, and using firstclose/lastopen feels
> > overall more readable in the driver code.
> >
> > I'm not sure there is such a big overhead associated with allocating
> > the binner BO (it seems that the current implementation tries to alloc
> > until the specific memory constraints for the buffer are met, so
> > perhaps that can take time). But if there is, I suppose it's best to
> > have that when starting up rather than delaying the first render
> > operation.
>
> I'm not entirely buying the "we don't need this for fbcon only" argument -
> there's plenty of dumb kms clients too (boot splash and whatever else
> there might be). If you don't want to keep this around I think allocating
> on first non-dumb bo allocation and dropping it when the last such fd
> closes sounds like a much better idea. Needs a bit more state, you need to
> track per drm_file whether you've already allocated a non-dumb bo, and a
> drm_device refcount, but that's not much. Firstopen feels like the wrong
> thing.
>
> Another option would be first_renderopen or something like that, except
> you can also render on the legacy node and I'm not sure how much there's a
> demand for this in other drivers. In the end you have open/close
> callbacks, you can do all the driver specific things you want to do in
> there.

I'd like to avoid doing it in open where possible, since that hurts
device enumeration from userspace.

Cheers,
Daniel
Paul Kocialkowski March 29, 2019, 3:02 p.m. UTC | #6
Hi,

On Fri, 2019-03-29 at 09:09 +0000, Daniel Stone wrote:
> Hi,
> 
> On Thu, 28 Mar 2019 at 18:53, Daniel Vetter <daniel@ffwll.ch> wrote:
> > On Thu, Mar 21, 2019 at 04:27:06PM +0100, Paul Kocialkowski wrote:
> > > I don't see other options either, and using firstclose/lastopen feels
> > > overall more readable in the driver code.
> > > 
> > > I'm not sure there is such a big overhead associated with allocating
> > > the binner BO (it seems that the current implementation tries to alloc
> > > until the specific memory constraints for the buffer are met, so
> > > perhaps that can take time). But if there is, I suppose it's best to
> > > have that when starting up rather than delaying the first render
> > > operation.
> > 
> > I'm not entirely buying the "we don't need this for fbcon only" argument -
> > there's plenty of dumb kms clients too (boot splash and whatever else
> > there might be). If you don't want to keep this around I think allocating
> > on first non-dumb bo allocation and dropping it when the last such fd
> > closes sounds like a much better idea. Needs a bit more state, you need to
> > track per drm_file whether you've already allocated a non-dumb bo, and a
> > drm_device refcount, but that's not much. Firstopen feels like the wrong
> > thing.
> > 
> > Another option would be first_renderopen or something like that, except
> > you can also render on the legacy node and I'm not sure how much there's a
> > demand for this in other drivers. In the end you have open/close
> > callbacks, you can do all the driver specific things you want to do in
> > there.
> 
> I'd like to avoid doing it in open where possible, since that hurts
> device enumeration from userspace.

I've noticed the same issue with firstopen, where our buffer is
allocated/liberated a couple of times during enumeration, before the
final open that stays alive during use.

I'm not sure what is preferable between that and allocating when the
GPU is first used. Slowing down the first GPU operation with the
allocation does not sound too great either and it feels like the buffer
should have been allocated earlier.

To me, it feels I think it's better to have delays due to allocation at
enumeration / startup rather than later on, but I might be missing some
elements to have a clear idea.

What do you think?

Cheers,

Paul
Paul Kocialkowski March 29, 2019, 3:07 p.m. UTC | #7
Hi,

Le jeudi 28 mars 2019 à 19:53 +0100, Daniel Vetter a écrit :
> On Thu, Mar 21, 2019 at 04:27:06PM +0100, Paul Kocialkowski wrote:
> > Hi,
> > 
> > Le mercredi 20 mars 2019 à 09:56 -0700, Eric Anholt a écrit :
> > > Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
> > > 
> > > > The firstopen DRM driver hook was initially used to perform hardware
> > > > initialization, which is now considered legacy. Only a single user of
> > > > firstopen remains at this point (savage).
> > > > 
> > > > In some specific cases, non-legacy drivers may also need to implement
> > > > these hooks. For instance on VC4, we need to allocate a 16 MiB buffer
> > > > for the GPU. Because it's not required for fbcon, it's a waste to
> > > > allocate it before userspace starts using the DRM device.
> > > > 
> > > > Using firstopen and lastclose for this allocation seems like the best
> > > > fit, so re-habilitate the hook to allow it to be called for non-legacy
> > > > drivers.
> > > > 
> > > > Signed-off-by: Paul Kocialkowski <paul.kocialkowski@bootlin.com>
> > > > ---
> > > >  drivers/gpu/drm/drm_file.c | 3 +--
> > > >  include/drm/drm_drv.h      | 2 +-
> > > >  2 files changed, 2 insertions(+), 3 deletions(-)
> > > > 
> > > > diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
> > > > index b1838a41ad43..c011b5cbfb6b 100644
> > > > --- a/drivers/gpu/drm/drm_file.c
> > > > +++ b/drivers/gpu/drm/drm_file.c
> > > > @@ -266,8 +266,7 @@ static int drm_setup(struct drm_device * dev)
> > > >  {
> > > >  	int ret;
> > > >  
> > > > -	if (dev->driver->firstopen &&
> > > > -	    drm_core_check_feature(dev, DRIVER_LEGACY)) {
> > > > +	if (dev->driver->firstopen) {
> > > >  		ret = dev->driver->firstopen(dev);
> > > >  		if (ret != 0)
> > > >  			return ret;
> > > > diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
> > > > index ca46a45a9cce..aa14607e54d4 100644
> > > > --- a/include/drm/drm_drv.h
> > > > +++ b/include/drm/drm_drv.h
> > > > @@ -236,7 +236,7 @@ struct drm_driver {
> > > >  	 * to set/unset the VT into raw mode.
> > > >  	 *
> > > >  	 * Legacy drivers initialize the hardware in the @firstopen callback,
> > > > -	 * which isn't even called for modern drivers.
> > > > +	 * modern drivers can use it for other purposes only.
> > > >  	 */
> > > >  	void (*lastclose) (struct drm_device *);
> > > 
> > > Our usage in vc4 is not very different from what we called "hardware
> > > initialization" in other devices.  I would rather just delete this
> > > sentence entirely.
> > 
> > Sounds good to me!
> > 
> > > The only alternative I can think of to using a firstopen/lastclose-style
> > > allocation for this would be to allocate the bin bo on the first
> > > (non-dumb?) V3D BO allocation and refcount those to free the binner.
> > 
> > I don't see other options either, and using firstclose/lastopen feels
> > overall more readable in the driver code.
> > 
> > I'm not sure there is such a big overhead associated with allocating
> > the binner BO (it seems that the current implementation tries to alloc
> > until the specific memory constraints for the buffer are met, so
> > perhaps that can take time). But if there is, I suppose it's best to
> > have that when starting up rather than delaying the first render
> > operation.
> 
> I'm not entirely buying the "we don't need this for fbcon only" argument -
> there's plenty of dumb kms clients too (boot splash and whatever else
> there might be). If you don't want to keep this around I think allocating
> on first non-dumb bo allocation and dropping it when the last such fd
> closes sounds like a much better idea. Needs a bit more state, you need to
> track per drm_file whether you've already allocated a non-dumb bo, and a
> drm_device refcount, but that's not much. Firstopen feels like the wrong
> thing.

This is quite a valid point, but we decided to go for a middle-ground
between always allocating and allocating at the first GPU op, so that
the operation is not significantly delayed by allocating the buffer.

> Another option would be first_renderopen or something like that, except
> you can also render on the legacy node and I'm not sure how much there's a
> demand for this in other drivers. In the end you have open/close
> callbacks, you can do all the driver specific things you want to do in
> there.

Yes I've initially tried playing with that and reached the same
conclusion: there can be render ops on the legacy node too so this is
just not reliable for what we're trying to do.

> Aside: I kinda also want to ditch lastclose usage. There's fbdev (we have a
> better solution with Noralf's drm_client for those) and runtime pm (which
> frankly is just a cheap hack, I want my gpu to susepend also when it's not
> in use when all the screens are off, not only when I killed X and
> everything).

I understand that, yes. I don't see any issue with a implementation
tracking the state in open if we really want firstopen/lastclose to
disappear.

Cheers,

Paul
Daniel Vetter March 29, 2019, 3:25 p.m. UTC | #8
On Fri, Mar 29, 2019 at 04:02:23PM +0100, Paul Kocialkowski wrote:
> Hi,
> 
> On Fri, 2019-03-29 at 09:09 +0000, Daniel Stone wrote:
> > Hi,
> > 
> > On Thu, 28 Mar 2019 at 18:53, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > On Thu, Mar 21, 2019 at 04:27:06PM +0100, Paul Kocialkowski wrote:
> > > > I don't see other options either, and using firstclose/lastopen feels
> > > > overall more readable in the driver code.
> > > > 
> > > > I'm not sure there is such a big overhead associated with allocating
> > > > the binner BO (it seems that the current implementation tries to alloc
> > > > until the specific memory constraints for the buffer are met, so
> > > > perhaps that can take time). But if there is, I suppose it's best to
> > > > have that when starting up rather than delaying the first render
> > > > operation.
> > > 
> > > I'm not entirely buying the "we don't need this for fbcon only" argument -
> > > there's plenty of dumb kms clients too (boot splash and whatever else
> > > there might be). If you don't want to keep this around I think allocating
> > > on first non-dumb bo allocation and dropping it when the last such fd
> > > closes sounds like a much better idea. Needs a bit more state, you need to
> > > track per drm_file whether you've already allocated a non-dumb bo, and a
> > > drm_device refcount, but that's not much. Firstopen feels like the wrong
> > > thing.
> > > 
> > > Another option would be first_renderopen or something like that, except
> > > you can also render on the legacy node and I'm not sure how much there's a
> > > demand for this in other drivers. In the end you have open/close
> > > callbacks, you can do all the driver specific things you want to do in
> > > there.
> > 
> > I'd like to avoid doing it in open where possible, since that hurts
> > device enumeration from userspace.
> 
> I've noticed the same issue with firstopen, where our buffer is
> allocated/liberated a couple of times during enumeration, before the
> final open that stays alive during use.
> 
> I'm not sure what is preferable between that and allocating when the
> GPU is first used. Slowing down the first GPU operation with the
> allocation does not sound too great either and it feels like the buffer
> should have been allocated earlier.
> 
> To me, it feels I think it's better to have delays due to allocation at
> enumeration / startup rather than later on, but I might be missing some
> elements to have a clear idea.
> 
> What do you think?

We'll have the delay somewhere on driver load. Better to have it only once
(when the driver starts using gem for real), than a bunch of time, at
least once while enumerating and then once more while actually
initializing. I think if you allocat this on first non-dumb gem_create,
and on first command submission (just so evil userspace can't screw up the
hw too badly), that should be perfectly fine.

Only way to avoid that is to allocate at driver load and pin it, but
that's what we're trying to avoid here.
-Daniel
Paul Kocialkowski March 29, 2019, 3:49 p.m. UTC | #9
Hi,

Le vendredi 29 mars 2019 à 16:25 +0100, Daniel Vetter a écrit :
> On Fri, Mar 29, 2019 at 04:02:23PM +0100, Paul Kocialkowski wrote:
> > Hi,
> > 
> > On Fri, 2019-03-29 at 09:09 +0000, Daniel Stone wrote:
> > > Hi,
> > > 
> > > On Thu, 28 Mar 2019 at 18:53, Daniel Vetter <daniel@ffwll.ch> wrote:
> > > > On Thu, Mar 21, 2019 at 04:27:06PM +0100, Paul Kocialkowski wrote:
> > > > > I don't see other options either, and using firstclose/lastopen feels
> > > > > overall more readable in the driver code.
> > > > > 
> > > > > I'm not sure there is such a big overhead associated with allocating
> > > > > the binner BO (it seems that the current implementation tries to alloc
> > > > > until the specific memory constraints for the buffer are met, so
> > > > > perhaps that can take time). But if there is, I suppose it's best to
> > > > > have that when starting up rather than delaying the first render
> > > > > operation.
> > > > 
> > > > I'm not entirely buying the "we don't need this for fbcon only" argument -
> > > > there's plenty of dumb kms clients too (boot splash and whatever else
> > > > there might be). If you don't want to keep this around I think allocating
> > > > on first non-dumb bo allocation and dropping it when the last such fd
> > > > closes sounds like a much better idea. Needs a bit more state, you need to
> > > > track per drm_file whether you've already allocated a non-dumb bo, and a
> > > > drm_device refcount, but that's not much. Firstopen feels like the wrong
> > > > thing.
> > > > 
> > > > Another option would be first_renderopen or something like that, except
> > > > you can also render on the legacy node and I'm not sure how much there's a
> > > > demand for this in other drivers. In the end you have open/close
> > > > callbacks, you can do all the driver specific things you want to do in
> > > > there.
> > > 
> > > I'd like to avoid doing it in open where possible, since that hurts
> > > device enumeration from userspace.
> > 
> > I've noticed the same issue with firstopen, where our buffer is
> > allocated/liberated a couple of times during enumeration, before the
> > final open that stays alive during use.
> > 
> > I'm not sure what is preferable between that and allocating when the
> > GPU is first used. Slowing down the first GPU operation with the
> > allocation does not sound too great either and it feels like the buffer
> > should have been allocated earlier.
> > 
> > To me, it feels I think it's better to have delays due to allocation at
> > enumeration / startup rather than later on, but I might be missing some
> > elements to have a clear idea.
> > 
> > What do you think?
> 
> We'll have the delay somewhere on driver load. Better to have it only once
> (when the driver starts using gem for real), than a bunch of time, at
> least once while enumerating and then once more while actually
> initializing. I think if you allocat this on first non-dumb gem_create,
> and on first command submission (just so evil userspace can't screw up the
> hw too badly), that should be perfectly fine.

I'm not totally convinced that it's okay to have a delay outside of
init/enumeration, even if it's a smaller delay.

Eric, do you have an opinion on what is the preference with VC4?

> Only way to avoid that is to allocate at driver load and pin it, but
> that's what we're trying to avoid here.

Exactly!

Cheers,

Paul
Eric Anholt March 29, 2019, 6:14 p.m. UTC | #10
Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:

> Hi,
>
> Le vendredi 29 mars 2019 à 16:25 +0100, Daniel Vetter a écrit :
>> On Fri, Mar 29, 2019 at 04:02:23PM +0100, Paul Kocialkowski wrote:
>> > Hi,
>> > 
>> > On Fri, 2019-03-29 at 09:09 +0000, Daniel Stone wrote:
>> > > Hi,
>> > > 
>> > > On Thu, 28 Mar 2019 at 18:53, Daniel Vetter <daniel@ffwll.ch> wrote:
>> > > > On Thu, Mar 21, 2019 at 04:27:06PM +0100, Paul Kocialkowski wrote:
>> > > > > I don't see other options either, and using firstclose/lastopen feels
>> > > > > overall more readable in the driver code.
>> > > > > 
>> > > > > I'm not sure there is such a big overhead associated with allocating
>> > > > > the binner BO (it seems that the current implementation tries to alloc
>> > > > > until the specific memory constraints for the buffer are met, so
>> > > > > perhaps that can take time). But if there is, I suppose it's best to
>> > > > > have that when starting up rather than delaying the first render
>> > > > > operation.
>> > > > 
>> > > > I'm not entirely buying the "we don't need this for fbcon only" argument -
>> > > > there's plenty of dumb kms clients too (boot splash and whatever else
>> > > > there might be). If you don't want to keep this around I think allocating
>> > > > on first non-dumb bo allocation and dropping it when the last such fd
>> > > > closes sounds like a much better idea. Needs a bit more state, you need to
>> > > > track per drm_file whether you've already allocated a non-dumb bo, and a
>> > > > drm_device refcount, but that's not much. Firstopen feels like the wrong
>> > > > thing.
>> > > > 
>> > > > Another option would be first_renderopen or something like that, except
>> > > > you can also render on the legacy node and I'm not sure how much there's a
>> > > > demand for this in other drivers. In the end you have open/close
>> > > > callbacks, you can do all the driver specific things you want to do in
>> > > > there.
>> > > 
>> > > I'd like to avoid doing it in open where possible, since that hurts
>> > > device enumeration from userspace.
>> > 
>> > I've noticed the same issue with firstopen, where our buffer is
>> > allocated/liberated a couple of times during enumeration, before the
>> > final open that stays alive during use.
>> > 
>> > I'm not sure what is preferable between that and allocating when the
>> > GPU is first used. Slowing down the first GPU operation with the
>> > allocation does not sound too great either and it feels like the buffer
>> > should have been allocated earlier.
>> > 
>> > To me, it feels I think it's better to have delays due to allocation at
>> > enumeration / startup rather than later on, but I might be missing some
>> > elements to have a clear idea.
>> > 
>> > What do you think?
>> 
>> We'll have the delay somewhere on driver load. Better to have it only once
>> (when the driver starts using gem for real), than a bunch of time, at
>> least once while enumerating and then once more while actually
>> initializing. I think if you allocat this on first non-dumb gem_create,
>> and on first command submission (just so evil userspace can't screw up the
>> hw too badly), that should be perfectly fine.
>
> I'm not totally convinced that it's okay to have a delay outside of
> init/enumeration, even if it's a smaller delay.

You'll have non-dumb buffers created during GL context creation, so
early in xserver or other KMS-and-GL-using application init anyway.
Seems like a perfectly fine plan to me.
Daniel Stone March 29, 2019, 6:42 p.m. UTC | #11
Hi,

On Fri, 29 Mar 2019 at 18:14, Eric Anholt <eric@anholt.net> wrote:
> Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
> > I'm not totally convinced that it's okay to have a delay outside of
> > init/enumeration, even if it's a smaller delay.
>
> You'll have non-dumb buffers created during GL context creation, so
> early in xserver or other KMS-and-GL-using application init anyway.
> Seems like a perfectly fine plan to me.

Yeah. The alternative is doing it once when Plymouth starts, and then
maybe again when Weston/GNOME/Xorg/... starts, which isn't really
ideal (or maybe even udev helpers). Doing it on probe also complicates
profiling startup for those: if GL context or surface creation takes a
long time, that's easy to reason about. If opening an FD takes ages,
that makes figuring out why stuff is slow a lot more complicated. This
used to happen with RPM resume for PCI devices to read the device ID &
revision, which is why we now have an API that persists that to avoid
the delay.

Sorry this feedback is coming quite late into development.

Cheers,
Daniel
Paul Kocialkowski March 29, 2019, 8:21 p.m. UTC | #12
Hi,

On Fri, 2019-03-29 at 18:42 +0000, Daniel Stone wrote:
> Hi,
> 
> On Fri, 29 Mar 2019 at 18:14, Eric Anholt <eric@anholt.net> wrote:
> > Paul Kocialkowski <paul.kocialkowski@bootlin.com> writes:
> > > I'm not totally convinced that it's okay to have a delay outside of
> > > init/enumeration, even if it's a smaller delay.
> > 
> > You'll have non-dumb buffers created during GL context creation, so
> > early in xserver or other KMS-and-GL-using application init anyway.
> > Seems like a perfectly fine plan to me.
> 
> Yeah. The alternative is doing it once when Plymouth starts, and then
> maybe again when Weston/GNOME/Xorg/... starts, which isn't really
> ideal (or maybe even udev helpers). Doing it on probe also complicates
> profiling startup for those: if GL context or surface creation takes a
> long time, that's easy to reason about. If opening an FD takes ages,
> that makes figuring out why stuff is slow a lot more complicated. This
> used to happen with RPM resume for PCI devices to read the device ID &
> revision, which is why we now have an API that persists that to avoid
> the delay.

Sounds like we have a plan then, I'll spin up a new series allocating
the binner BO at the first non-dumb BO alloc. Thanks for your input!

> Sorry this feedback is coming quite late into development.

The feedback is definitely quite welcome! I tried a few things that
didn't pan out before using firstopen/lastclose and it's really
interesting to refine the implementation based on the shortcomings of
previous ideas.

Cheers,

Paul
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_file.c b/drivers/gpu/drm/drm_file.c
index b1838a41ad43..c011b5cbfb6b 100644
--- a/drivers/gpu/drm/drm_file.c
+++ b/drivers/gpu/drm/drm_file.c
@@ -266,8 +266,7 @@  static int drm_setup(struct drm_device * dev)
 {
 	int ret;
 
-	if (dev->driver->firstopen &&
-	    drm_core_check_feature(dev, DRIVER_LEGACY)) {
+	if (dev->driver->firstopen) {
 		ret = dev->driver->firstopen(dev);
 		if (ret != 0)
 			return ret;
diff --git a/include/drm/drm_drv.h b/include/drm/drm_drv.h
index ca46a45a9cce..aa14607e54d4 100644
--- a/include/drm/drm_drv.h
+++ b/include/drm/drm_drv.h
@@ -236,7 +236,7 @@  struct drm_driver {
 	 * to set/unset the VT into raw mode.
 	 *
 	 * Legacy drivers initialize the hardware in the @firstopen callback,
-	 * which isn't even called for modern drivers.
+	 * modern drivers can use it for other purposes only.
 	 */
 	void (*lastclose) (struct drm_device *);