diff mbox series

[v6,3/4] drm: Expand max DRM device number to full MINORBITS

Message ID 20230724211428.3831636-4-michal.winiarski@intel.com (mailing list archive)
State New, archived
Headers show
Series drm: Use full allocated minor range for DRM | expand

Commit Message

Michał Winiarski July 24, 2023, 9:14 p.m. UTC
Having a limit of 64 DRM devices is not good enough for modern world
where we have multi-GPU servers, SR-IOV virtual functions and virtual
devices used for testing.
Let's utilize full minor range for DRM devices.
To avoid regressing the existing userspace, we're still maintaining the
numbering scheme where 0-63 is used for primary, 64-127 is reserved
(formerly for control) and 128-191 is used for render.
For minors >= 192, we're allocating minors dynamically on a first-come,
first-served basis.

Signed-off-by: Michał Winiarski <michal.winiarski@intel.com>
---
 drivers/gpu/drm/drm_drv.c | 12 ++++++++++++
 1 file changed, 12 insertions(+)

Comments

James Zhu July 24, 2023, 10:29 p.m. UTC | #1
On 2023-07-24 17:14, Michał Winiarski wrote:
> Having a limit of 64 DRM devices is not good enough for modern world
> where we have multi-GPU servers, SR-IOV virtual functions and virtual
> devices used for testing.
> Let's utilize full minor range for DRM devices.
> To avoid regressing the existing userspace, we're still maintaining the
> numbering scheme where 0-63 is used for primary, 64-127 is reserved
> (formerly for control) and 128-191 is used for render.
> For minors >= 192, we're allocating minors dynamically on a first-come,
> first-served basis.

[JZ] Hello Michal,

Do you have libdrm patches for review together with this change?

Especially to support static int drmGetMinorType(int major, int minor).

Thanks and Best Regards!

James Zhu

>
> Signed-off-by: Michał Winiarski<michal.winiarski@intel.com>
> ---
>   drivers/gpu/drm/drm_drv.c | 12 ++++++++++++
>   1 file changed, 12 insertions(+)
>
> diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
> index 34b60196c443..c2c6e80e6b31 100644
> --- a/drivers/gpu/drm/drm_drv.c
> +++ b/drivers/gpu/drm/drm_drv.c
> @@ -121,10 +121,19 @@ static void drm_minor_alloc_release(struct drm_device *dev, void *data)
>   	xa_erase(drm_minor_get_xa(minor->type), minor->index);
>   }
>   
> +/*
> + * DRM used to support 64 devices, for backwards compatibility we need to maintain the
> + * minor allocation scheme where minors 0-63 are primary nodes, 64-127 are control nodes,
> + * and 128-191 are render nodes.
> + * After reaching the limit, we're allocating minors dynamically - first-come, first-serve.
> + * Accel nodes are using a distinct major, so the minors are allocated in continuous 0-MAX
> + * range.
> + */
>   #define DRM_MINOR_LIMIT(t) ({ \
>   	typeof(t) _t = (t); \
>   	_t == DRM_MINOR_ACCEL ? XA_LIMIT(0, ACCEL_MAX_MINORS) : XA_LIMIT(64 * _t, 64 * _t + 63); \
>   })
> +#define DRM_EXTENDED_MINOR_LIMIT XA_LIMIT(192, (1 << MINORBITS) - 1)
>   
>   static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type)
>   {
> @@ -140,6 +149,9 @@ static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type)
>   
>   	r = xa_alloc(drm_minor_get_xa(type), &minor->index,
>   		     NULL, DRM_MINOR_LIMIT(type), GFP_KERNEL);
> +	if (r == -EBUSY && (type == DRM_MINOR_PRIMARY || type == DRM_MINOR_RENDER))
> +		r = xa_alloc(&drm_minors_xa, &minor->index,
> +			     NULL, DRM_EXTENDED_MINOR_LIMIT, GFP_KERNEL);
>   	if (r < 0)
>   		return r;
>
Simon Ser July 26, 2023, 6:15 p.m. UTC | #2
On Monday, July 24th, 2023 at 23:14, Michał Winiarski <michal.winiarski@intel.com> wrote:

> Having a limit of 64 DRM devices is not good enough for modern world
> where we have multi-GPU servers, SR-IOV virtual functions and virtual
> devices used for testing.
> Let's utilize full minor range for DRM devices.
> To avoid regressing the existing userspace, we're still maintaining the
> numbering scheme where 0-63 is used for primary, 64-127 is reserved
> (formerly for control) and 128-191 is used for render.
> For minors >= 192, we're allocating minors dynamically on a first-come,
> first-served basis.

In general the approach looks good to me. Old libdrm will see the new
nodes as nodes with an unknown type when it tries to infer the nod type
from the minor, which is as good as it gets.

We do need patches to stop trying to infer the node type from the minor
in libdrm, though. Emil has suggested using sysfs, which we already do
in a few places in libdrm.
Christian König July 27, 2023, 12:01 p.m. UTC | #3
Am 26.07.23 um 20:15 schrieb Simon Ser:
> On Monday, July 24th, 2023 at 23:14, Michał Winiarski <michal.winiarski@intel.com> wrote:
>
>> Having a limit of 64 DRM devices is not good enough for modern world
>> where we have multi-GPU servers, SR-IOV virtual functions and virtual
>> devices used for testing.
>> Let's utilize full minor range for DRM devices.
>> To avoid regressing the existing userspace, we're still maintaining the
>> numbering scheme where 0-63 is used for primary, 64-127 is reserved
>> (formerly for control) and 128-191 is used for render.
>> For minors >= 192, we're allocating minors dynamically on a first-come,
>> first-served basis.
> In general the approach looks good to me. Old libdrm will see the new
> nodes as nodes with an unknown type when it tries to infer the nod type
> from the minor, which is as good as it gets.

Yeah, agree. I wouldn't upstream patch #4, but apart from that it looks 
like it shouldn't break anything which wasn't broken before.

> We do need patches to stop trying to infer the node type from the minor
> in libdrm, though. Emil has suggested using sysfs, which we already do
> in a few places in libdrm.

That sounds like a really good idea to me as well.

But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps 
should use drmGetDevices2() like Emil suggested?

Regards,
Christian.
Simon Ser July 28, 2023, 2:22 p.m. UTC | #4
On Thursday, July 27th, 2023 at 14:01, Christian König <christian.koenig@amd.com> wrote:

> > We do need patches to stop trying to infer the node type from the minor
> > in libdrm, though. Emil has suggested using sysfs, which we already do
> > in a few places in libdrm.
> 
> That sounds like a really good idea to me as well.
> 
> But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps
> should use drmGetDevices2() like Emil suggested?

DRM_MAX_MINOR has been bumped to 64 now.

With the new minor allocation scheme, DRM_MAX_MINOR is meaningless
because there is no "max minor per type" concept anymore: the minor no
longer contains the type.

So I'd suggest leaving it as-is (so that old apps still continue to
work on systems with < 64 devices like they do today) and mark it as
deprecated.
Christian König Aug. 8, 2023, 1:55 p.m. UTC | #5
Am 28.07.23 um 16:22 schrieb Simon Ser:
> On Thursday, July 27th, 2023 at 14:01, Christian König <christian.koenig@amd.com> wrote:
>
>>> We do need patches to stop trying to infer the node type from the minor
>>> in libdrm, though. Emil has suggested using sysfs, which we already do
>>> in a few places in libdrm.
>> That sounds like a really good idea to me as well.
>>
>> But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps
>> should use drmGetDevices2() like Emil suggested?
> DRM_MAX_MINOR has been bumped to 64 now.
>
> With the new minor allocation scheme, DRM_MAX_MINOR is meaningless
> because there is no "max minor per type" concept anymore: the minor no
> longer contains the type.
>
> So I'd suggest leaving it as-is (so that old apps still continue to
> work on systems with < 64 devices like they do today) and mark it as
> deprecated.

Sounds like a plan to me.

Regards,
Christian.
James Zhu Aug. 8, 2023, 3:04 p.m. UTC | #6
I would like if these kernel patches are accepted by everyone, If yes, 
when they can be upstream.

I have a MR for libdrm to support drm nodes type up to 2^MINORBITS  
nodes which can work with these patches,

https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305

Thanks!

James

On 2023-08-08 09:55, Christian König wrote:
> Am 28.07.23 um 16:22 schrieb Simon Ser:
>> On Thursday, July 27th, 2023 at 14:01, Christian König 
>> <christian.koenig@amd.com> wrote:
>>
>>>> We do need patches to stop trying to infer the node type from the 
>>>> minor
>>>> in libdrm, though. Emil has suggested using sysfs, which we already do
>>>> in a few places in libdrm.
>>> That sounds like a really good idea to me as well.
>>>
>>> But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps
>>> should use drmGetDevices2() like Emil suggested?
>> DRM_MAX_MINOR has been bumped to 64 now.
>>
>> With the new minor allocation scheme, DRM_MAX_MINOR is meaningless
>> because there is no "max minor per type" concept anymore: the minor no
>> longer contains the type.
>>
>> So I'd suggest leaving it as-is (so that old apps still continue to
>> work on systems with < 64 devices like they do today) and mark it as
>> deprecated.
>
> Sounds like a plan to me.
>
> Regards,
> Christian.
Simon Ser Aug. 23, 2023, 10:53 a.m. UTC | #7
On Tuesday, August 8th, 2023 at 17:04, James Zhu <jamesz@amd.com> wrote:

> I have a MR for libdrm to support drm nodes type up to 2^MINORBITS
> nodes which can work with these patches,
> 
> https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305

FWIW, this MR has been merged, so in theory this kernel patch should
work fine with latest libdrm.
Simon Ser Aug. 23, 2023, 10:58 a.m. UTC | #8
On Wednesday, August 23rd, 2023 at 12:53, Simon Ser <contact@emersion.fr> wrote:

> On Tuesday, August 8th, 2023 at 17:04, James Zhu jamesz@amd.com wrote:
> 
> > I have a MR for libdrm to support drm nodes type up to 2^MINORBITS
> > nodes which can work with these patches,
> > 
> > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305
> 
> FWIW, this MR has been merged, so in theory this kernel patch should
> work fine with latest libdrm.

Hm, we might want to adjust MAX_DRM_NODES still. It's set to 256
currently, which should be enough for 128 DRM devices, but not more.
Not bumping this value will make drmGetDevices2() print a warning and
not return all devices on systems with more than 128 DRM devices.
James Zhu Aug. 23, 2023, 2:06 p.m. UTC | #9
Hi Simon,

Thanks! Yes, this kernel patch should work with latest libdrm.

Best regards!

James Zhu

On 2023-08-23 06:53, Simon Ser wrote:
> On Tuesday, August 8th, 2023 at 17:04, James Zhu <jamesz@amd.com> wrote:
>
>> I have a MR for libdrm to support drm nodes type up to 2^MINORBITS
>> nodes which can work with these patches,
>>
>> https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305
> FWIW, this MR has been merged, so in theory this kernel patch should
> work fine with latest libdrm.
diff mbox series

Patch

diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c
index 34b60196c443..c2c6e80e6b31 100644
--- a/drivers/gpu/drm/drm_drv.c
+++ b/drivers/gpu/drm/drm_drv.c
@@ -121,10 +121,19 @@  static void drm_minor_alloc_release(struct drm_device *dev, void *data)
 	xa_erase(drm_minor_get_xa(minor->type), minor->index);
 }
 
+/*
+ * DRM used to support 64 devices, for backwards compatibility we need to maintain the
+ * minor allocation scheme where minors 0-63 are primary nodes, 64-127 are control nodes,
+ * and 128-191 are render nodes.
+ * After reaching the limit, we're allocating minors dynamically - first-come, first-serve.
+ * Accel nodes are using a distinct major, so the minors are allocated in continuous 0-MAX
+ * range.
+ */
 #define DRM_MINOR_LIMIT(t) ({ \
 	typeof(t) _t = (t); \
 	_t == DRM_MINOR_ACCEL ? XA_LIMIT(0, ACCEL_MAX_MINORS) : XA_LIMIT(64 * _t, 64 * _t + 63); \
 })
+#define DRM_EXTENDED_MINOR_LIMIT XA_LIMIT(192, (1 << MINORBITS) - 1)
 
 static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type)
 {
@@ -140,6 +149,9 @@  static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type)
 
 	r = xa_alloc(drm_minor_get_xa(type), &minor->index,
 		     NULL, DRM_MINOR_LIMIT(type), GFP_KERNEL);
+	if (r == -EBUSY && (type == DRM_MINOR_PRIMARY || type == DRM_MINOR_RENDER))
+		r = xa_alloc(&drm_minors_xa, &minor->index,
+			     NULL, DRM_EXTENDED_MINOR_LIMIT, GFP_KERNEL);
 	if (r < 0)
 		return r;