Message ID | 20221106210225.2065371-1-ogabbay@kernel.org (mailing list archive) |
---|---|
Series | new subsystem for compute accelerator devices |
On 11/6/2022 2:02 PM, Oded Gabbay wrote:
> This is the third version of the RFC following the comments given on the
> second version, but more importantly, following testing done by the VPU
> driver people and myself. We found out that there is a circular dependency
> between DRM and accel. DRM calls accel exported symbols during init and when
> accel devices are registering (all the minor handling), then accel calls DRM
> exported symbols. Therefore, if the two components are compiled as modules,
> there is a circular dependency.
>
> To overcome this, I have decided to compile the accel core code as part of
> the DRM kernel module (drm.ko). IMO, this is inline with the spirit of the
> design choice to have accel reuse the DRM core code and avoid code
> duplication.
>
> Another important change is that I have reverted back to use IDR for minor
> handling instead of xarray. This is because I have found that xarray doesn't
> handle well the scenario where you allocate a NULL entry and then exchange it
> with a real pointer. It appears xarray still considers that entry a "zero"
> entry. This is unfortunate because DRM works that way (first allocates a NULL
> entry and then replaces the entry with a real pointer).
>
> I decided to revert to IDR because I don't want to hold up these patches,
> as many people are blocked until the support for accel is merged. The xarray
> issue should be fixed as a separate patch by either fixing the xarray code or
> changing how DRM + ACCEL do minor id handling.

This sounds sane to me. However, this appears to be something that Matthew
Wilcox should be aware of (added for visibility). Perhaps he has a very
quick solution. If not, at-least he might have ideas on how to best address
in the future.

> The patches are in the following repo:
> https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v3
>
> As in v2, The HEAD of that branch is a commit adding a dummy driver that
> registers an accel device using the new framework. This can be served
> as a simple reference. I have checked inserting and removing the dummy driver,
> and opening and closing /dev/accel/accel0 and nothing got broken :)
>
> v1 cover letter:
> https://lkml.org/lkml/2022/10/22/544
>
> v2 cover letter:
> https://lore.kernel.org/lkml/20221102203405.1797491-1-ogabbay@kernel.org/T/
>
> Thanks,
> Oded.
>
> Oded Gabbay (3):
>   drivers/accel: define kconfig and register a new major
>   accel: add dedicated minor for accelerator devices
>   drm: initialize accel framework
>
>  Documentation/admin-guide/devices.txt |   5 +
>  MAINTAINERS                           |   8 +
>  drivers/Kconfig                       |   2 +
>  drivers/accel/Kconfig                 |  24 ++
>  drivers/accel/drm_accel.c             | 322 ++++++++++++++++++++++++++
>  drivers/gpu/drm/Makefile              |   1 +
>  drivers/gpu/drm/drm_drv.c             | 102 +++++---
>  drivers/gpu/drm/drm_file.c            |   2 +-
>  drivers/gpu/drm/drm_sysfs.c           |  24 +-
>  include/drm/drm_accel.h               |  97 ++++++++
>  include/drm/drm_device.h              |   3 +
>  include/drm/drm_drv.h                 |   8 +
>  include/drm/drm_file.h                |  21 +-
>  13 files changed, 582 insertions(+), 37 deletions(-)
>  create mode 100644 drivers/accel/Kconfig
>  create mode 100644 drivers/accel/drm_accel.c
>  create mode 100644 include/drm/drm_accel.h
>
> --
> 2.25.1
>
On Sun, Nov 06, 2022 at 11:02:22PM +0200, Oded Gabbay wrote:

> Another important change is that I have reverted back to use IDR for minor
> handling instead of xarray. This is because I have found that xarray doesn't
> handle well the scenario where you allocate a NULL entry and then exchange it
> with a real pointer. It appears xarray still considers that entry a "zero"
> entry. This is unfortunate because DRM works that way (first allocates a NULL
> entry and then replaces the entry with a real pointer).

This is what XA_ZERO_ENTRY is for. Some APIs, like xa_alloc automatically
promote NULL to XA_ZERO_ENTRY, others require it to be explicit.

If you use the usual pattern of xa_alloc(NULL), xa_store(!NULL) then
you should be fine, as far as I know. So long as the xarray was tagged as
allocating.

Jason
On Mon, Nov 07, 2022 at 09:07:28AM -0700, Jeffrey Hugo wrote:
> > Another important change is that I have reverted back to use IDR for minor
> > handling instead of xarray. This is because I have found that xarray doesn't
> > handle well the scenario where you allocate a NULL entry and then exchange it
> > with a real pointer. It appears xarray still considers that entry a "zero"
> > entry. This is unfortunate because DRM works that way (first allocates a NULL
> > entry and then replaces the entry with a real pointer).
> >
> > I decided to revert to IDR because I don't want to hold up these patches,
> > as many people are blocked until the support for accel is merged. The xarray
> > issue should be fixed as a separate patch by either fixing the xarray code or
> > changing how DRM + ACCEL do minor id handling.
>
> This sounds sane to me. However, this appears to be something that Matthew
> Wilcox should be aware of (added for visibility). Perhaps he has a very
> quick solution. If not, at-least he might have ideas on how to best address
> in the future.

Thanks for cc'ing me. I wasn't aware of this problem because I hadn't
seen Oded's email yet. The "problem" is simply a mis-use of the API.
Hi Oded,

On Sun, Nov 6, 2022 at 4:03 PM Oded Gabbay <ogabbay@kernel.org> wrote:
> The patches are in the following repo:
> https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v3
>
> As in v2, The HEAD of that branch is a commit adding a dummy driver that
> registers an accel device using the new framework. This can be served
> as a simple reference. I have checked inserting and removing the dummy driver,
> and opening and closing /dev/accel/accel0 and nothing got broken :)
>
> v1 cover letter:
> https://lkml.org/lkml/2022/10/22/544
>
> v2 cover letter:
> https://lore.kernel.org/lkml/20221102203405.1797491-1-ogabbay@kernel.org/T/

I was in the room at Plumbers when a lot of this was discussed (in
2022 and also 2019), but I haven't really had an opportunity to
provide feedback until now. In general, I think it's great and thanks
for pushing it forward and getting feedback.

The v1 cover letter mentioned RAS (reliability, availability,
serviceability) and Dave also mentioned it here [1]. There was a
suggestion to use Netlink. It's an area that I'm fairly interested in
because I do a lot of development on the firmware side (and
specifically, with Zephyr).

Personally, I think Netlink could be one option for serializing and
deserializing RAS information but it would be helpful for that
interface to be somewhat flexible, like a void * and length, and to
provide userspace the capability of querying which RAS formats are
supported.

For example, AntMicro used OpenAMP + rpmsg in their NVMe accelerator,
and gave a talk on it at ZDS and Plumbers this year [2][3].

In Zephyr, the LGPL license for Netlink might be a non-starter
(although I'm no lawyer). However, Zephyr does already support
OpenAMP, protobufs, json, and will soon support Thrift.

Some companies might prefer to use Netlink. Others might prefer to use
ASN.1. Some companies might prefer to use key-value pairs and limit
the parameters and messages to uint32s. Some might handle all of the
RAS details in-kernel, while others might want the kernel to act more
like a transport to firmware.

Companies already producing accelerators may have a particular
preference for serialization / deserialization in their own
datacenters.

With that, it would be helpful to be able to query RAS capabilities via ioctl.

#define ACCEL_CAP_RAS_KEY_VAL_32 BIT(0)
#define ACCEL_CAP_RAS_NETLINK BIT(1)
#define ACCEL_CAP_RAS_JSON BIT(2)
#define ACCEL_CAP_RAS_PROTOBUF BIT(3)
#define ACCEL_CAP_RAS_GRPC BIT(4)
#define ACCEL_CAP_RAS_THRIFT BIT(5)
#define ACCEL_CAP_RAS_JSON BIT(6)
#define ACCEL_CAP_RAS_ASN1 BIT(7)

or something along those lines. Anyway, just putting the idea out there.

I'm sure there are a lot of opinions on this topic and that there are
a lot of implications of using this or that serialization format.
Obviously there can be security implications as well.

Apologies if I've already missed some of this discussion.

Cheers,

C

[1] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
[2] https://zephyr2022.sched.com/event/10CFD/open-source-nvme-ai-accelerator-platform-with-zephyr-karol-gugala-antmicro
[3] https://lpc.events/event/16/contributions/1245/
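As a rough illustration of the capability query suggested above, here is a minimal userspace sketch of how a caller might test and choose among advertised RAS formats. Nothing here exists in the accel patches: `struct accel_ras_caps`, `accel_ras_supports()` and `accel_ras_pick()` are hypothetical names, `BIT()` is redefined locally as a stand-in for the kernel macro, and the ioctl that would fill in the mask is left out entirely. The list in the email assigns ACCEL_CAP_RAS_JSON twice, so this sketch simply leaves bit 6 unassigned rather than guess what was intended.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Userspace stand-in for the kernel's BIT() macro. */
#define BIT(n) (UINT32_C(1) << (n))

/* Flag names follow the list in the email; the values are illustrative. */
#define ACCEL_CAP_RAS_KEY_VAL_32 BIT(0)
#define ACCEL_CAP_RAS_NETLINK    BIT(1)
#define ACCEL_CAP_RAS_JSON       BIT(2)
#define ACCEL_CAP_RAS_PROTOBUF   BIT(3)
#define ACCEL_CAP_RAS_GRPC       BIT(4)
#define ACCEL_CAP_RAS_THRIFT     BIT(5)
/* BIT(6) is left unassigned: the email lists JSON a second time there. */
#define ACCEL_CAP_RAS_ASN1       BIT(7)

/* Hypothetical reply that a capability ioctl might fill in. */
struct accel_ras_caps {
	uint32_t formats; /* bitwise OR of ACCEL_CAP_RAS_* flags */
};

/* Return nonzero if every bit in fmt is advertised by the device. */
static int accel_ras_supports(const struct accel_ras_caps *caps, uint32_t fmt)
{
	assert(caps != NULL);
	return (caps->formats & fmt) == fmt;
}

/*
 * Walk a caller-ordered preference list and return the first format the
 * device advertises, or 0 if none of them is supported.
 */
static uint32_t accel_ras_pick(const struct accel_ras_caps *caps,
			       const uint32_t *prefs, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (accel_ras_supports(caps, prefs[i]))
			return prefs[i];
	return 0;
}
```

A userspace tool would call `accel_ras_pick()` with its own ordered list (say protobuf first, then ASN.1, then Netlink) and fall back to reporting "no common format" when it returns 0. A single 32-bit mask keeps the query cheap, at the cost of capping the number of formats at 32.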
On Sat, Nov 12, 2022 at 12:04 AM Christopher Friedt
<chrisfriedt@gmail.com> wrote:
>
> Hi Oded,
>
> On Sun, Nov 6, 2022 at 4:03 PM Oded Gabbay <ogabbay@kernel.org> wrote:
> > The patches are in the following repo:
> > https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/accel.git/log/?h=accel_v3
> >
> > As in v2, The HEAD of that branch is a commit adding a dummy driver that
> > registers an accel device using the new framework. This can be served
> > as a simple reference. I have checked inserting and removing the dummy driver,
> > and opening and closing /dev/accel/accel0 and nothing got broken :)
> >
> > v1 cover letter:
> > https://lkml.org/lkml/2022/10/22/544
> >
> > v2 cover letter:
> > https://lore.kernel.org/lkml/20221102203405.1797491-1-ogabbay@kernel.org/T/
>
> I was in the room at Plumbers when a lot of this was discussed (in
> 2022 and also 2019), but I haven't really had an opportunity to
> provide feedback until now. In general, I think it's great and thanks
> for pushing it forward and getting feedback.
>
> The v1 cover letter mentioned RAS (reliability, availability,
> serviceability) and Dave also mentioned it here [1]. There was a
> suggestion to use Netlink. It's an area that I'm fairly interested in
> because I do a lot of development on the firmware side (and
> specifically, with Zephyr).
>
> Personally, I think Netlink could be one option for serializing and
> deserializing RAS information but it would be helpful for that
> interface to be somewhat flexible, like a void * and length, and to
> provide userspace the capability of querying which RAS formats are
> supported.
>
> For example, AntMicro used OpenAMP + rpmsg in their NVMe accelerator,
> and gave a talk on it at ZDS and Plumbers this year [2][3].
>
> In Zephyr, the LGPL license for Netlink might be a non-starter
> (although I'm no lawyer). However, Zephyr does already support
> OpenAMP, protobufs, json, and will soon support Thrift.
>
> Some companies might prefer to use Netlink. Others might prefer to use
> ASN.1. Some companies might prefer to use key-value pairs and limit
> the parameters and messages to uint32s. Some might handle all of the
> RAS details in-kernel, while others might want the kernel to act more
> like a transport to firmware.
>
> Companies already producing accelerators may have a particular
> preference for serialization / deserialization in their own
> datacenters.
>
> With that, it would be helpful to be able to query RAS capabilities via ioctl.
>
> #define ACCEL_CAP_RAS_KEY_VAL_32 BIT(0)
> #define ACCEL_CAP_RAS_NETLINK BIT(1)
> #define ACCEL_CAP_RAS_JSON BIT(2)
> #define ACCEL_CAP_RAS_PROTOBUF BIT(3)
> #define ACCEL_CAP_RAS_GRPC BIT(4)
> #define ACCEL_CAP_RAS_THRIFT BIT(5)
> #define ACCEL_CAP_RAS_JSON BIT(6)
> #define ACCEL_CAP_RAS_ASN1 BIT(7)
>
> or something along those lines. Anyway, just putting the idea out there.
>
> I'm sure there are a lot of opinions on this topic and that there are
> a lot of implications of using this or that serialization format.
> Obviously there can be security implications as well.
>
> Apologies if I've already missed some of this discussion.
>
> Cheers,
>
> C
>
> [1] https://airlied.blogspot.com/2022/09/accelerators-bof-outcomes-summary.html
> [2] https://zephyr2022.sched.com/event/10CFD/open-source-nvme-ai-accelerator-platform-with-zephyr-karol-gugala-antmicro
> [3] https://lpc.events/event/16/contributions/1245/

Hi Christopher,
Thanks for all this information.
At this stage, I'm mainly trying to gather information on RAS current
status in the OCP (Open Compute Project) and Linux kernel, so your
email was on point :)

It seems to me that this topic is broader than just accelerators or
GPUs, because there are other device types that are implementing some
kind of RAS (e.g. NIC).
My gut feeling is that the end solution would be some kind of generic
kernel driver/framework that will expose RAS to userspace for any
device type, but it's too early to tell.
I'll update once I have the full picture.

Thanks,
Oded