Message ID | 20230724211428.3831636-4-michal.winiarski@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | drm: Use full allocated minor range for DRM | expand |
On 2023-07-24 17:14, Michał Winiarski wrote: > Having a limit of 64 DRM devices is not good enough for modern world > where we have multi-GPU servers, SR-IOV virtual functions and virtual > devices used for testing. > Let's utilize full minor range for DRM devices. > To avoid regressing the existing userspace, we're still maintaining the > numbering scheme where 0-63 is used for primary, 64-127 is reserved > (formerly for control) and 128-191 is used for render. > For minors >= 192, we're allocating minors dynamically on a first-come, > first-served basis. [JZ] Hello Michal, Do you have libdrm patches for review together with this change? Especially to support static int drmGetMinorType(int major, int minor). Thanks and Best Regards! James Zhu > > Signed-off-by: Michał Winiarski<michal.winiarski@intel.com> > --- > drivers/gpu/drm/drm_drv.c | 12 ++++++++++++ > 1 file changed, 12 insertions(+) > > diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c > index 34b60196c443..c2c6e80e6b31 100644 > --- a/drivers/gpu/drm/drm_drv.c > +++ b/drivers/gpu/drm/drm_drv.c > @@ -121,10 +121,19 @@ static void drm_minor_alloc_release(struct drm_device *dev, void *data) > xa_erase(drm_minor_get_xa(minor->type), minor->index); > } > > +/* > + * DRM used to support 64 devices, for backwards compatibility we need to maintain the > + * minor allocation scheme where minors 0-63 are primary nodes, 64-127 are control nodes, > + * and 128-191 are render nodes. > + * After reaching the limit, we're allocating minors dynamically - first-come, first-serve. > + * Accel nodes are using a distinct major, so the minors are allocated in continuous 0-MAX > + * range. > + */ > #define DRM_MINOR_LIMIT(t) ({ \ > typeof(t) _t = (t); \ > _t == DRM_MINOR_ACCEL ? XA_LIMIT(0, ACCEL_MAX_MINORS) : XA_LIMIT(64 * _t, 64 * _t + 63); \ > }) > +#define DRM_EXTENDED_MINOR_LIMIT XA_LIMIT(192, (1 << MINORBITS) - 1) > > static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type) > { > @@ -140,6 +149,9 @@ static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type) > > r = xa_alloc(drm_minor_get_xa(type), &minor->index, > NULL, DRM_MINOR_LIMIT(type), GFP_KERNEL); > + if (r == -EBUSY && (type == DRM_MINOR_PRIMARY || type == DRM_MINOR_RENDER)) > + r = xa_alloc(&drm_minors_xa, &minor->index, > + NULL, DRM_EXTENDED_MINOR_LIMIT, GFP_KERNEL); > if (r < 0) > return r; >
On Monday, July 24th, 2023 at 23:14, Michał Winiarski <michal.winiarski@intel.com> wrote: > Having a limit of 64 DRM devices is not good enough for modern world > where we have multi-GPU servers, SR-IOV virtual functions and virtual > devices used for testing. > Let's utilize full minor range for DRM devices. > To avoid regressing the existing userspace, we're still maintaining the > numbering scheme where 0-63 is used for primary, 64-127 is reserved > (formerly for control) and 128-191 is used for render. > For minors >= 192, we're allocating minors dynamically on a first-come, > first-served basis. In general the approach looks good to me. Old libdrm will see the new nodes as nodes with an unknown type when it tries to infer the nod type from the minor, which is as good as it gets. We do need patches to stop trying to infer the node type from the minor in libdrm, though. Emil has suggested using sysfs, which we already do in a few places in libdrm.
Am 26.07.23 um 20:15 schrieb Simon Ser: > On Monday, July 24th, 2023 at 23:14, Michał Winiarski <michal.winiarski@intel.com> wrote: > >> Having a limit of 64 DRM devices is not good enough for modern world >> where we have multi-GPU servers, SR-IOV virtual functions and virtual >> devices used for testing. >> Let's utilize full minor range for DRM devices. >> To avoid regressing the existing userspace, we're still maintaining the >> numbering scheme where 0-63 is used for primary, 64-127 is reserved >> (formerly for control) and 128-191 is used for render. >> For minors >= 192, we're allocating minors dynamically on a first-come, >> first-served basis. > In general the approach looks good to me. Old libdrm will see the new > nodes as nodes with an unknown type when it tries to infer the nod type > from the minor, which is as good as it gets. Yeah, agree. I wouldn't upstream patch #4, but apart from that it looks like it shouldn't break anything which wasn't broken before. > We do need patches to stop trying to infer the node type from the minor > in libdrm, though. Emil has suggested using sysfs, which we already do > in a few places in libdrm. That sounds like a really good idea to me as well. But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps should use drmGetDevices2() like Emil suggested? Regards, Christian.
On Thursday, July 27th, 2023 at 14:01, Christian König <christian.koenig@amd.com> wrote: > > We do need patches to stop trying to infer the node type from the minor > > in libdrm, though. Emil has suggested using sysfs, which we already do > > in a few places in libdrm. > > That sounds like a really good idea to me as well. > > But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps > should use drmGetDevices2() like Emil suggested? DRM_MAX_MINOR has been bumped to 64 now. With the new minor allocation scheme, DRM_MAX_MINOR is meaningless because there is no "max minor per type" concept anymore: the minor no longer contains the type. So I'd suggest leaving it as-is (so that old apps still continue to work on systems with < 64 devices like they do today) and mark it as deprecated.
Am 28.07.23 um 16:22 schrieb Simon Ser: > On Thursday, July 27th, 2023 at 14:01, Christian König <christian.koenig@amd.com> wrote: > >>> We do need patches to stop trying to infer the node type from the minor >>> in libdrm, though. Emil has suggested using sysfs, which we already do >>> in a few places in libdrm. >> That sounds like a really good idea to me as well. >> >> But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps >> should use drmGetDevices2() like Emil suggested? > DRM_MAX_MINOR has been bumped to 64 now. > > With the new minor allocation scheme, DRM_MAX_MINOR is meaningless > because there is no "max minor per type" concept anymore: the minor no > longer contains the type. > > So I'd suggest leaving it as-is (so that old apps still continue to > work on systems with < 64 devices like they do today) and mark it as > deprecated. Sounds like a plan to me. Regards, Christian.
I would like if these kernel patches are accepted by everyone, If yes, when they can be upstream. I have a MR for libdrm to support drm nodes type up to 2^MINORBITS nodes which can work with these patches, https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305 Thanks! James On 2023-08-08 09:55, Christian König wrote: > Am 28.07.23 um 16:22 schrieb Simon Ser: >> On Thursday, July 27th, 2023 at 14:01, Christian König >> <christian.koenig@amd.com> wrote: >> >>>> We do need patches to stop trying to infer the node type from the >>>> minor >>>> in libdrm, though. Emil has suggested using sysfs, which we already do >>>> in a few places in libdrm. >>> That sounds like a really good idea to me as well. >>> >>> But what do we do with DRM_MAX_MINOR? Change it or keep it and say apps >>> should use drmGetDevices2() like Emil suggested? >> DRM_MAX_MINOR has been bumped to 64 now. >> >> With the new minor allocation scheme, DRM_MAX_MINOR is meaningless >> because there is no "max minor per type" concept anymore: the minor no >> longer contains the type. >> >> So I'd suggest leaving it as-is (so that old apps still continue to >> work on systems with < 64 devices like they do today) and mark it as >> deprecated. > > Sounds like a plan to me. > > Regards, > Christian.
On Tuesday, August 8th, 2023 at 17:04, James Zhu <jamesz@amd.com> wrote: > I have a MR for libdrm to support drm nodes type up to 2^MINORBITS > nodes which can work with these patches, > > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305 FWIW, this MR has been merged, so in theory this kernel patch should work fine with latest libdrm.
On Wednesday, August 23rd, 2023 at 12:53, Simon Ser <contact@emersion.fr> wrote: > On Tuesday, August 8th, 2023 at 17:04, James Zhu jamesz@amd.com wrote: > > > I have a MR for libdrm to support drm nodes type up to 2^MINORBITS > > nodes which can work with these patches, > > > > https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305 > > FWIW, this MR has been merged, so in theory this kernel patch should > work fine with latest libdrm. Hm, we might want to adjust MAX_DRM_NODES still. It's set to 256 currently, which should be enough for 128 DRM devices, but not more. Not bumping this value will make drmGetDevices2() print a warning and not return all devices on systems with more than 128 DRM devices.
Hi Simon, Thanks! Yes, this kernel patch should work with latest libdrm. Best regards! James Zhu On 2023-08-23 06:53, Simon Ser wrote: > On Tuesday, August 8th, 2023 at 17:04, James Zhu <jamesz@amd.com> wrote: > >> I have a MR for libdrm to support drm nodes type up to 2^MINORBITS >> nodes which can work with these patches, >> >> https://gitlab.freedesktop.org/mesa/drm/-/merge_requests/305 > FWIW, this MR has been merged, so in theory this kernel patch should > work fine with latest libdrm.
diff --git a/drivers/gpu/drm/drm_drv.c b/drivers/gpu/drm/drm_drv.c index 34b60196c443..c2c6e80e6b31 100644 --- a/drivers/gpu/drm/drm_drv.c +++ b/drivers/gpu/drm/drm_drv.c @@ -121,10 +121,19 @@ static void drm_minor_alloc_release(struct drm_device *dev, void *data) xa_erase(drm_minor_get_xa(minor->type), minor->index); } +/* + * DRM used to support 64 devices, for backwards compatibility we need to maintain the + * minor allocation scheme where minors 0-63 are primary nodes, 64-127 are control nodes, + * and 128-191 are render nodes. + * After reaching the limit, we're allocating minors dynamically - first-come, first-serve. + * Accel nodes are using a distinct major, so the minors are allocated in continuous 0-MAX + * range. + */ #define DRM_MINOR_LIMIT(t) ({ \ typeof(t) _t = (t); \ _t == DRM_MINOR_ACCEL ? XA_LIMIT(0, ACCEL_MAX_MINORS) : XA_LIMIT(64 * _t, 64 * _t + 63); \ }) +#define DRM_EXTENDED_MINOR_LIMIT XA_LIMIT(192, (1 << MINORBITS) - 1) static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type) { @@ -140,6 +149,9 @@ static int drm_minor_alloc(struct drm_device *dev, enum drm_minor_type type) r = xa_alloc(drm_minor_get_xa(type), &minor->index, NULL, DRM_MINOR_LIMIT(type), GFP_KERNEL); + if (r == -EBUSY && (type == DRM_MINOR_PRIMARY || type == DRM_MINOR_RENDER)) + r = xa_alloc(&drm_minors_xa, &minor->index, + NULL, DRM_EXTENDED_MINOR_LIMIT, GFP_KERNEL); if (r < 0) return r;
Having a limit of 64 DRM devices is not good enough for modern world where we have multi-GPU servers, SR-IOV virtual functions and virtual devices used for testing. Let's utilize full minor range for DRM devices. To avoid regressing the existing userspace, we're still maintaining the numbering scheme where 0-63 is used for primary, 64-127 is reserved (formerly for control) and 128-191 is used for render. For minors >= 192, we're allocating minors dynamically on a first-come, first-served basis. Signed-off-by: Michał Winiarski <michal.winiarski@intel.com> --- drivers/gpu/drm/drm_drv.c | 12 ++++++++++++ 1 file changed, 12 insertions(+)